peer reviewed article

 

REFLECTION BEFORE ACTION:

THE STATISTICAL CONSULTANT CONFRONTS ETHICAL ISSUES

 by S. Andrew Ostapski and Claude R. Superville


S. Andrew Ostapski (sostapsk@valdosta.edu) is an associate professor of management at Valdosta State University. Claude R. Superville is a professor in and the chair of the Department of Business and Economics at Benedict College. Ostapski specializes in areas dealing with business ethics. Superville teaches business statistics and management science.


Statisticians have become integral members of research and consulting teams that conduct projects for industry and government.  They face a number of ethical issues that are somewhat unique to their profession.  They attempt to derive sense out of data and develop strategies for the proper interpretation of such information.  The article explores ethical issues faced by statisticians pertaining to their use and ownership of data, the search for significance in data that may be environmentally induced, their conduct as members of a profession and the scientific community, and their roles in consultant-client relationships.  Without reflecting before taking action, statisticians' interpretations of data may not be defensible if they are later challenged on ethical grounds. 

INTRODUCTION

Typically, research projects germinate as an idea in the mind of a researcher.  The scientific method provides a useful basis for the progression of such projects. [Levin, Rubin, Stinson and Gardner, 1989] As a first step, the problem in question is carefully observed, analyzed and clearly defined. A model is initially developed that takes into account the use, objectives and limitations of the project.  Data are collected, inspected and used as input to test the validity of the model.  A solution is proposed, tested for reasonableness, and implemented if it is feasible.

Statisticians are nimble problem solvers whose work is often interdisciplinary and proceeds into previously uncharted areas. [Kettenring, 1995] However, the ability to be creative in building interdisciplinary bridges can be risky, especially when the parties that are served do not understand the statistical process.  The statistician must not only make sense out of the data but also develop the means to ensure the proper interpretation of such information by all relevant parties. [Hunter, 1994]  Determining what is proper is not always easy.  The American Statistical Association's Ethical Guidelines for Statistical Practice [1] offer guidance in making that determination.

Statisticians establish professional standards for their continuing education and accountability to ensure the advancement and integrity of their art.  Certification is one means to promote professional values, to cultivate a greater sense of responsibility [Imrey, 1994] and to generally enhance professionalism by distinguishing well-trained statisticians from those who may possess lesser qualifications or abilities. Certification by itself does not resolve how any individual will respond when faced with conflicting pressures to arrive at one of several critically different determinations.  Nor does it guarantee that the individual will reflect upon the ethical implications of a statistical task before taking action.

Professionalism also may be defined appropriately by an association's ability to self-regulate through an established code of ethics. [DeGeorge, 1999]  The American Statistical Association has recently revised its ethical guidelines after nearly four years of study and debate.  The revised guidelines have extensively supplemented the principles of 1989 [2] to more than three times their original length.  However, more words do not necessarily ensure more clarity.

A code of ethics, which represents the collective moral responsibility of the profession, provides essential guidance toward reaching a solution to a vexing statistical problem.  However, ethical statements, no matter how well developed, are still no more than general guidelines. As such, they may or may not address a given situation, but the ideals that serve as the foundation for the principles can be extrapolated and applied.  In practice, the professional and ethical goal is to avoid the misuse of data, which necessarily harms the people the data represent, the clients, and others as well. [Eberstadt, 1994]

ETHICAL ANALYSIS

When confronted with an ethical problem, statisticians should address the issue directly and honestly without the false hope that the problem will resolve itself or simply go away.  Philosophers, since the dawn of civilization, have developed approaches to ethical analysis.  In place of a detailed history and analysis of these ethical systems, the following general questions may be useful for statisticians to consider as a basis for ethical inquiry:

(1)  Is there a duty to act in a certain way that is recognized and also followed by others? [Ross 1930]

(2)  What are the consequences of the proposed action and, most of the time, will it benefit the most people? [Mill 1971]

(3)  Would my decision toward someone else, if it were made concerning me, also be acceptable to me? [Shaw 1991]

(4)  Would the decision be the same if it were published on the front page of  The Wall Street Journal? [Murphy 1993]

Philosophers may argue for generations as to which approach is best, but the statistician's ultimate objective is to develop a process to deal with ethical issues as they arise.  Unfortunately, in divisive situations, the statistician still may not avoid a controversial result but, at least, has the comfort of having faced the issue squarely and with integrity. The ethical inquiry is not a detour from the statistical assignment, but a necessary part of the journey where the ultimate destination has greater meaning than the particular assignment.  Ethical landscapes are littered with many concerns on the road to right conduct. Two scenarios, familiar to statisticians, are examined below.

NON-OWNER USE OF DATA

In the implementation of research projects, the statistician may be asked to collect and/or analyze data and to report on the results.  Ownership of data, rights to the data and co-authorship are some questions for which there appear to be no definitive answers, because they are so dependent upon their individual factual circumstances.

If clients collect the data, ownership clearly belongs to them.  However, should the statistician expect ownership of data collected while in the client's employ on an entity that is client-owned?  The equities of the situation seem to favor the client, but this is not a conclusive result.  If it was agreed contractually that the client owns the data, is the statistician owed a right to use the data for publication purposes?  If the statistician is allowed to use the data in this way, obviously there is an obligation to mask the client's and company's identities to protect their rights to anonymity in accordance with this right of limited use.  In any event, the client should be informed of such intended use of data and supplied with a copy of the final written article. To eliminate doubt from the outset of the statistical engagement, a memorandum of understanding between client and statistician certainly would clarify each party's respective expectations and rights.

SEARCH FOR SIGNIFICANCE AND OUTLIERS

Another issue that the statistical consultant may encounter in performing his duties is known as "the search for significance." [Derr, 1994] The client may have a preconceived notion of what a particular study should conclude. For example, the client may wish to show that employees' salary levels can best be predicted by their years of experience in their chosen field. However, once a regression analysis is performed on the data, it may be revealed that the employees' education level is a more significant (useful) predictor of their salary level. While it is ethical for statisticians to report all results, both those in favor of and those against the client's views, they may have no control over how the results are used and whether selective reporting of the results will occur. In a more extreme example, should the statistician be an employee of the client and not just a consultant, his future employment with the company may be in jeopardy.
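The salary example can be made concrete with a minimal sketch. The figures below are hypothetical values invented purely for illustration (they come from no study), and the helper name r_squared is likewise an assumption. The sketch fits salary against each candidate predictor separately and reports both results, rather than only the one the client hoped to see.

```python
# Hypothetical data invented for illustration only; not taken from any study.
def r_squared(x, y):
    """Share of variance in y explained by a simple linear regression on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)

years_experience = [1, 3, 5, 7, 9, 11]       # assumed values
years_education = [12, 16, 12, 18, 14, 20]   # assumed values
salary_thousands = [30, 42, 32, 50, 38, 56]  # assumed values

# Report every fit, whether or not it supports the client's expectation.
results = {
    "experience": r_squared(years_experience, salary_thousands),
    "education": r_squared(years_education, salary_thousands),
}
for predictor, r2 in results.items():
    print(f"{predictor}: R^2 = {r2:.2f}")
```

In this invented data set, education explains more of the variation in salary than experience does. A report that showed only the experience fit would be the kind of selective reporting described above.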

In the analysis of a data set, the statistical consultant may observe the occurrence of unusually large or small observations, called outliers, in the data set.  For example, the average yearly salary of production employees at a certain factory is $36,000, excluding the salary of the plant manager, which is $80,000, an outlier. With the inclusion of the plant manager's salary, the average salary increases to $40,000. The client may direct the statistician to include or exclude the outlier in his final report. Mann (1993) surveyed the responses to such a problem.  The results showed that, while a final analysis with and without outliers is considered ethical, at a minimum, mention of any unusual values is necessary to prevent misrepresentation of findings.  The extent of disclosure can be the subject of a moment of ethical reflection prior to the final release of information.
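The arithmetic of the salary example can be checked with a short sketch. The text gives only the averages, so two assumptions are made here: that there are ten production employees (the head count implied by the stated figures, since (10 x $36,000 + $80,000) / 11 = $40,000), and that each earns exactly the stated average. The variable names are illustrative.

```python
from statistics import mean

# Assumption: ten production employees, each at the stated $36,000 average.
production_salaries = [36_000] * 10
manager_salary = 80_000  # the outlier named in the text

mean_without_outlier = mean(production_salaries)
mean_with_outlier = mean(production_salaries + [manager_salary])

# Report both figures and name the unusual value, so that neither
# average is presented without context.
print(f"Average excluding plant manager: ${mean_without_outlier:,.0f}")
print(f"Average including plant manager: ${mean_with_outlier:,.0f}")
print(f"Unusual value disclosed: ${manager_salary:,.0f} (plant manager)")
```

Presenting both averages together with the flagged value follows the minimum standard of disclosure described above, whichever figure the client prefers to emphasize.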

ETHICS PROVIDES A DEFENSIBLE RESPONSE

Each of the two previous scenarios may be subject to general legal and/or contractual requirements, which may dictate the outcome of the situation. The law, as established by society, condemns fraud.  A contract, the private law between the parties, may state the terms of the statistical engagement, for example, that data belongs exclusively to the client. If the contract gives the client the right to change statistical findings, however, it is illegal because it is against public policy.  But even following the law is no guarantee of making the morally right choice. The outlier, if disclosed, is legally defensible, but ethically it may not provide a complete account of what is being measured.  Particularly in the absence of control factors, such as the established bounds of law, only ethics remains as the persuasive means toward right conduct despite opposing pressures to do otherwise.  Right conduct consists of pursuing "the good," which may be defined as those moral goals and objectives which we choose to pursue and which serve to define who we are. [Corley, Reed, Shedd and Morehead, 1999]

Relevant professional organizations can be sources of ethical guidance.  The American Statistical Association's Ethical Guidelines for Statistical Practice contain ethical considerations that include the following major principles:  [3]

  1. maintain integrity by being honest and objective,
  2. collect only the data required for the inquiry's purpose,
  3. provide information about the general nature of the inquiry and the intended use of the data,
  4. protect the confidentiality of information,
  5. delineate the boundaries of the inquiry,
  6. inform a client or employer of anything that may affect or conflict with impartiality,
  7. fulfill all commitments in any inquiry undertaken,
  8. apply statistical procedures without concern for a favorable result,
  9. disclose no private information about or belonging to any client without the client's consent,
  10. submit issues to the ASA Committee on Professional Ethics. [4]  

Each of these fundamental principles of good conduct governs the statistician's behavior in whatever role he finds himself and provides a basis for evaluating statistical validity.

GUIDELINES FOR EVALUATING STATISTICAL VALIDITY

Statistics are useful ways of abbreviating large amounts of information into simpler summary numbers. Best (2001) suggests that no statistic is " … perfect. Inevitably, some information, some of the complexity, is lost whenever we use statistics." He suggests a critical, thoughtful approach to evaluating any statistic that consists of the following:

  1. Be prepared to ask questions about numbers.
  2. What might be the source for this number?
  3. How can one go about producing the figure?
  4. Who produced the number, and what interest might they have?
  5. What are the different ways key terms might be defined, and which definitions have been chosen?
  6. What sort of sample was gathered, and how might that sample affect the result?
  7. Is the statistic being properly interpreted?

THE ROLES OF A STATISTICIAN

The statistician's role as an individual, professional, part of an institution, and member of society may affect her duty as she reflects upon her response to the "Non-Owner Use" and "Search for Significance and Outlier" situations described earlier. [5]

The Statistician as an Individual

The individual may be a paid advocate for a particular faction or a non-partisan with no particular allegiance or pecuniary interest.  A non-partisan may face fewer economic pressures that could skew results in the "Search for Significance and Outlier" situations.  Still, a non-partisan may be tempted to use data to which he is not entitled for publication purposes.  Ethically, he must inform his client of his intended use and protect his client's confidentiality, especially if given permission to publish.

As to the scenarios described in the "Search for Significance and Outlier" section above, full disclosure of statistical issues is warranted whether the individual is a paid advocate or not. Honest findings must take precedence over the client's interest in validating a certain expected outcome. Integrity in each situation cannot be compromised; otherwise, the entire basis for the soundness of the statistical process may be brought into disrepute.

The statistician must be concerned not just with behaving ethically but also with avoiding the appearance of impropriety.  For example, a statistician should not accept gifts that observers may construe as an attempt to influence results, even if the gesture was absolutely innocent.

As Members of a Professional Society and the Scientific Community

Professionals hold themselves to a higher standard of conduct than the rest of society. [DeGeorge, 1999]  Scientific communities pride themselves on the level of quality achieved through the process of self-regulation.  Rivalry exists when one community claims to be superior to another because of its strict adherence to standards, which define the ideal norm.  Industry, academia, and government present the consultant with both rules of behavior and pressures that may corrupt the integrity of statistical practice as established by a professional society or a scientific community.  Industry and government may desire to skew numbers to project a more favorable image than objectively warranted, as in the situations described in the "Search for Significance and Outlier" section above. The pressure to publish or obtain funding in academia may drive the researcher in the "Search for Significance and Outlier" scenarios to redirect data to the desired path.  A researcher may pursue non-owner use of data, regardless of its moral implications, just to advance professional aspirations.  In the long run, unethical conduct will be condemned as it is discovered by members of society, which may then revoke the profession's or scientific community's self-regulation and impose harsher standards of accountability.

As a Participant in an Institution

The statistician, unless independently wealthy, may find himself dependent on the given institutional framework within which he must work.  Industry, government, or academia may bring to him an outcome-driven agenda.  To maintain the dignity of the profession, the statistician must preserve his independence when dealing with statistical issues. The standards of ethical behavior remain unchanged even when one's definition of good scientific practice, driven perhaps by outside pressure, falls below them.  Institutions as well as individuals will be held accountable for their ethical breaches.  The ethical statistician, if unable to change the institution, may need to abandon it. Absent breach of law or the shame of public disclosure, there is no enforcement mechanism that could make employers accountable.  For this reason, statisticians must be prepared to assume ethical responsibility for their work and not simply declare that the matter is exclusively their employer's concern.

As a Member of Society-at-Large

The statistician is a member of society and should be concerned with its greater good.  Utilizing his professional background, he has an obligation to be a conduit, through the use of data methodology or other means, to promote a better understanding of societal issues. Ethical awareness produced by deep reflection can preserve the underlying integrity of the statistical process for the greater good of society.

CONCLUSIONS

This article has examined the issues of data use and ownership, data manipulation, and the search for significance that the statistical consultant faces. It also provides a template for evaluating the validity of statistics and the motives of those who provide them. Due to the complexity of environmental pressures, these concerns are ethical issues that defy easy resolution, but nevertheless must be resolved through ethical reflection before action is taken.  Objective and ethically defensible answers are certainly possible if ethical responsibility is assumed not merely in words but as a matter of practice.


SOURCES

Best, J. (2001), Telling the Truth About Damned Lies and Statistics, The Chronicle of Higher Education, Section 2, May 4, 2001, B7-B9.

Corley, R. N., Reed, O. L., Shedd, P.J. and Morehead, J.W. (1999), The Legal and Regulatory Environment of Business, McGraw-Hill, New York (11th ed.).

DeGeorge, R. (1999), Business Ethics, Prentice Hall, New Jersey (5th ed.).

Derr, J. (1994), Teaching Ethical Practice in Statistics, Amstat News, No. 209, 13-14.

Eberstadt, N. (1994), The Tyranny of Numbers, The American Enterprise, No. 6, 35-42.

Hunter, J. (1994), Statistics as a Profession, Journal of the American Statistical Association, No. 425, 1-6.

Imrey, P. (1994), Statistical Values, Quality, and Certification, The American Statistician, No. 2, 65-70.

Kettenring, J. (1995), What Industry Needs, The American Statistician, No. 1, 2-4.

Levin, R. I., Rubin, D. S., Stinson, J. P. and Gardner, E. S. Jr. (1989), Quantitative Approaches to Management, McGraw-Hill, New York (7th ed.).

Mann, C. R. (1993), Distinguishing Unethical Behavior from Technical Error, Amstat News, No. 203, 18.

Mill, J.S. (1971), Utilitarianism, Bobbs-Merrill, Indianapolis, IN.

Murphy, K.R. (1993), Honesty in the Workplace, Brooks/Cole Publishing Co., Pacific Grove, CA.                 

Ross, W.D. (1930), The Right and the Good, Oxford University Press, Oxford.

Shaw, W.H. (1991), Business Ethics, Wadsworth Publishing Co., Belmont, CA.


FOOTNOTES

[1] Approved on August 7, 1999 by the Board of Directors of the American Statistical Association and may be found at http://www.amstat.org/profession/ethicalstatistics.html.

[2] The ASA Ethical Guidelines for Statistical Practice, 1989, may be found at http://www.tenj.edu/~asaethic/asagui.html.

[3] These principles were clearly stated in the ASA Ethical Guidelines for Statistical Practice published in 1989.  These considerations are still present but buried within the expanded expression of the 1999 revision.

[4] Further guidance may be obtained from the Office of Scientific and Public Affairs, American Statistical Association, 1429 Duke Street, Alexandria, Virginia  22314-3402 (703) 684-1221/fax (703) 684-3410 as well as from its web-site at http://www.amstat.org.

[5] These three categories were identified in the Summary Report of the Steering Committee pertaining to the American Statistical Association's Workshop on Ethical Issues in Statistical Expert Testimony, held on January 14-15, 1994 at Washington, D.C.