Case Studies as Proof of General Causation
By Bill Masters
Wallace, Klor & Mann, P.C.
This article is the first in a two-part series about proof of “general causation.” Specifically, it concerns the evidential value of a “case study” to prove general causation. The second article, to appear next month in this Newsletter, concerns the use of the process of a “differential diagnosis” to prove general causation in Oregon.
A “case study” has two forms. In the first, the patient’s clinical signs and symptoms correlate with an established disease or syndrome; this form generally stirs little academic interest in medical circles. In the second, the signs and symptoms fail to correlate with any established disease or syndrome. Unlike the first, this second form is often of interest in academic medical circles because it may stimulate medical investigation into whether a new disease or syndrome exists. The value of this second form of a case study as proof of general causation is the topic of this article.
- Definitions and Distinctions
This analysis is prefaced by the following important distinctions.
- Causation v Association
“Causation” differs from “association.” An “association” is, at the very least, merely the co-occurrence of two or more events by chance. It is, at the very most, the degree of statistical dependence between two or more events such that they occur together more or less frequently than one would expect by chance.
Coincident Association =df as an association between variables owing to chance alone or, less prosaically, “a surprising occurrence of events, perceived as meaningfully related, with no apparent causal connection.”
Statistical Association =df as an association between variables not owing to chance.
An association does not necessarily imply causation. That is, that an exposure is statistically associated with an effect is a necessary, but not a sufficient, condition for inferring causation. K.E. Stanovich. How to Think Straight About Psychology, 75-86 (2001); P. Skrabanek. The Emptiness of the Black Box. Epidemiology, 5: 553-555 (1994). If it cannot be demonstrated that the co-occurrence of an exposure and an effect is a statistical association, it cannot be demonstrated that the exposure causes the effect.
Causation is “an event, condition, characteristic, or agent that is a necessary element of a set of other events that produce an outcome, such as disease.” Epidemiology, in Reference Manual on Scientific Evidence; UCJI No. 23.01; K.J. Rothman & S. Greenland. Modern Epidemiology, 7-28 (1998); L. Gordis. Epidemiology, pp. 191-201 (2000); E. Sousa & M. Tooley (eds.) Causation (1998); D. Lewis. Causation as Influence. The Journal of Philosophy, XCVII: 182-197 (2000). As Wittgenstein remarked, “cause isn’t just temporal coincidence but influence.” Wittgenstein, Philosophical Occasions 1912-51, p. 407 (1993).
- General Causation v Specific Causation
“Specific causation” differs from “general causation.” “General causation” concerns whether an exposure causes an event (such as disease) in the population. That is, general causation concerns sets of things.
General Causation =df as “X is capable of causing Y in the population Z.”
“Specific causation” concerns whether a particular exposure causes a particular event (or disease) in a particular individual. Specific causation, that is, concerns whether something is an element of a particular set of things.
Specific Causation =df as “X did cause Y in z = Z.”
E.g., Casey v Ohio Medical Products, 877 F Supp 1380, 1383 (ND Cal 1995); Cavallo v Star Enterprise, 892 F Supp 756 (ED Va 1995); Zarecki v National RR Passenger Corp., 914 F Supp 1566 (ND Ill 1996); Merrell Dow Pharmaceuticals, Inc. v Havner, 953 SW 2d 706, 714-16 (Tex 1997).
III. Case Studies
- What Is a Case Study?
A case study is a medical analysis of the clinical profile of a single patient whose signs and symptoms do not fit within any established disease category or syndrome. It includes a description of the patient’s signs and symptoms, an effort to rule out all other established disease categories, and an hypothesis about what may have caused that clinical profile. The analysis is sometimes published to bring attention to what the clinician considers to be an “interesting clinical presentation.” M.H.M. Dykes. Uncritical Thinking in Medicine. JAMA, 227 (11): 1275-1277 (1974); D.L. Sackett et al. Clinical Epidemiology, pp. 360-361 (1991).
- The Generally Accepted Opinion in the Scientific Community About the Probative Value of Case Studies
The generally accepted opinion in the scientific community about the power of a case study to confirm an hypothesis about general causation is that—except in rare cases—it has none. A case study is often referred to as “anecdotal evidence,” a term of derision in scientific circles. Ordinarily, a case study or case series serves as a basis for hypotheses about general causation. But it does not have the power to confirm such an hypothesis; standing alone, it is mere speculation. K.E. Stanovich. How to Think Straight About Psychology, 55-74 (2001); R.H. Fletcher et al. Clinical Epidemiology: The Essentials, pp. 208-209 (1996); D.L. Sackett et al. Clinical Epidemiology, pp. 290-291 (1991); L. Gordis. Epidemiology, p. 102 (2000); S. Greenland et al. The Value of Risk-Factor (“Black Box”) Epidemiology. Epidemiology, 15: 529-535 (2004); Venning, G.R. Validity of Anecdotal Reports of Suspected Adverse Drug Reactions: The Problem of False Alarms. BMJ, 284: 249-254 (1982); Cook, D.J. et al. Rules of Evidence & Clinical Recommendations on the Use of Antithrombotic Agents. Chest, 305S-311S (1992); Dykes, M.H.M. Uncritical Thinking in Medicine. JAMA, 227: 1275-1277 (1974); Simpson, R.J. & Griggs, T.R. Case Reports and Medical Progress. Perspectives in Biology & Medicine, 28: 402-406 (1985); Muzzey v. Kerr-McGee Chemical Corp., 921 F. Supp. 511 (N.D. Ill. 1996) (anecdotal case studies are insufficient to establish general causation); Casey v. Ohio Med. Prods., 877 F. Supp. 1380 (N.D. Cal. 1995) (such case reports are not reliable scientific evidence of causation, because they simply describe reported phenomena without comparison to the rate at which the phenomena occur in the general population or in a defined control group; do not isolate and exclude potentially alternative causes; and do not investigate or explain the mechanism of causation); Haggerty v. Upjohn Co., 950 F. Supp. 1160 (S.D. Fla. 1996), aff’d, 158 F3d 588 (11th Cir. 1998).
- The Argument Against a Case Study Confirming a Hypothesis About General Causation.
A case study has inherent weaknesses, particularly in design, that limit or nullify its power to confirm an hypothesis about general causation. Simply put, case studies, unlike study designs with the power to confirm causal relationships, do not account for: (1) random error, (2) the various well-known biases that surreptitiously enter the analytical space of the study, and (3) confounding.
- Random Variation or Error
Random variation is that part of experience that cannot be predicted, that part of experience owing to chance. The major constituents of random variation are “measurement error” and “sampling error.”
- Measurement Error
Measurement error is the difference between the observed values of a variable recorded in similar conditions and some fixed true value of that variable. For instance, on 11.14.94, Dr X tested his patient’s Achilles reflex and found it to be absent. That measurement could have been in error; that is, the measured value of the Achilles reflex on that date may not have reflected the true value of that reflex. But, in isolation, absent further measurements by Dr X and other doctors, it would be difficult to determine whether it was in error.
(Ankle Jerk or Achilles Reflex)

| Examiner (Date) | Finding (R/L) |
|-----------------|---------------|
| Dr X (11.14.94) | Abs / Abs     |
The potential error in this measurement could be seen in relief had Dr X provided a chart of all his observations of this variable over the course of his treatment of this patient, as shown below.
(Ankle Jerk or Achilles Reflex)

| Examiner (Date) | Finding (R/L) |
|-----------------|---------------|
| Dr X (11.01.93) | 2+ / 2+       |
| Dr X (03.09.94) | 2+ / 2+       |
| Dr X (11.14.94) | Abs / Abs     |
| Dr X (03.29.95) | 2+ / 2+       |
Given this series of measurements, it is probable that the measurement on 11.14.94 was an error.
- Sampling Error
Sampling error is the difference between the value of the sample statistic (the value of a measured variable from the sample) and the value of the population parameter (the value of the measured variable from the population). To establish a context for assessing sampling error, epidemiologists and clinicians need to identify the characteristics of the sample that they are studying and consider the population from which the sample was drawn. [Was the sample just one person while the population was 250 million people?] When epidemiologists study the characteristics of the sample, they typically calculate the basic measures of association—relative risk, odds ratio, and risk difference—using data drawn from a “2 x 2 table” (see, for example, the table in the next paragraph). These data help epidemiologists discern whether an association exists–that is, whether the co-occurrence of exposure and disease is anything more than random variation.
When an expert such as Dr X relies on a single case study (a minimum subset of the likely sample) to confirm an hypothesis that an event is associated with disease, the expert essentially does so without considering the data in the appropriate 2 x 2 table. For instance, this is how the 2 x 2 table (involving “categorical” data) of such an expert would appear:
|           | Disease | No Disease |
|-----------|---------|------------|
| Exposed   | 1       | ?          |
| Unexposed | ?       | ?          |
What is the “odds ratio”? It cannot be computed owing to the lack of data in the other cells of the 2 x 2 table.
Odds Ratio =df as (the odds that an exposed person develops disease) ÷ (the odds that an unexposed person develops disease)
Now suppose that, after the court rules that a single case study or case series is adequate data upon which to base a conclusion of general causation, the expert discloses the data for the complete sample in the 2 x 2 table, as follows:
|           | Disease | No Disease |
|-----------|---------|------------|
| Exposed   | 1       | 200        |
| Unexposed | 100     | 1,500      |
What is the odds ratio? It is .075, indicating no harmful association. An odds ratio less than one indicates that the exposure is beneficial. A.J. Silman. Epidemiological Studies: A Practical Guide, pp. 117-119 (1995). A harmful association would be suspected if the odds ratio significantly exceeded one. Marcia Angell, The Interpretation of Epidemiologic Studies. NEJM, 323: 823 (1990); Taubes, Epidemiology Faces Its Limits. Science, 269: 164 (1995).
What these data reveal is that the exposure is not significantly associated with disease and that the individual in the case study who had the disease most probably did not contract the disease from the exposure, given that there were 100 people with the disease who were not exposed, and 200 people who were exposed who did not develop the disease.
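For readers who wish to check the arithmetic, the odds ratio can be computed directly from the four cells of a 2 x 2 table. The sketch below (Python, for illustration only) uses the figures stated above; note that the 1,500 unexposed, non-diseased count is back-calculated from the stated odds ratio of .075 and is therefore an assumption, not a figure given in the text.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio from a 2 x 2 table:
         a = exposed with disease      b = exposed without disease
         c = unexposed with disease    d = unexposed without disease
    OR = (a/b) / (c/d) = (a*d) / (b*c)."""
    return (a * d) / (b * c)

# 1 exposed case, 200 exposed without disease, 100 unexposed cases,
# and (back-calculated) 1,500 unexposed without disease:
print(odds_ratio(a=1, b=200, c=100, d=1500))  # 0.075
```

An odds ratio of .075 is well below one, which is why the text concludes that the data reveal no harmful association.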
The findings in the population may or may not be consistent with the findings from the sample. Statistics are used to assess whether or not the sample statistic is likely to mirror the population parameter.
The data for the population likewise reflect that no significant association exists between the exposure and the disease.
What is the odds ratio for the population? It is .00000075, indicating that there is no harmful association.
- Bias (Systematic Error)
Bias is defined as “systematic deviation from the truth.” K.J. Rothman & S. Greenland. Modern Epidemiology, 118-120 (1998); D.L. Sackett. Bias in Analytic Research. J. Chronic Diseases, 32: 51-63 (1979); C.V. Phillips. Quantifying & Reporting Uncertainty from Systematic Errors. Epidemiology, 14: 459-466 (2003). Ideally, the “truth” would be established by reference to a gold standard, the randomized, controlled, double-blinded clinical trial. Absent that, the “reliability” of the data or observations provides a convenient surrogate for a gold standard. Reliability concerns both “intra-observer reliability” (the extent to which the same rater or observer judges the phenomena the same way upon multiple observations or ratings) and “inter-observer reliability” (the extent to which different raters or observers judge phenomena the same way upon each individual observation or rating). That is, bias can often be discerned from the lack of reliability of the measurements of the variables in question.
Here is an example of intra-observer reliability. Dr X made four clinical observations of a single patient over 15 months. With the exception of the first observation, his findings are consistent from one observation to the next. (The observation on 11.01.93 may be a measurement error.)
(Ankle Jerk or Achilles Reflex)

| Examiner (Date) | Finding (R/L) |
|-----------------|---------------|
| Dr X (11.01.93) | 1+ / 2+       |
| Dr X (03.09.94) | Abs / Abs     |
| Dr X (11.14.94) | Abs / Abs     |
| Dr X (03.29.95) | Abs / Abs     |
- Confirmatory Bias
But the nature of systematic bias is that the observer consistently reports values for a variable that deviate from the true value of that variable. One source of such deviation is “confirmatory” or “investigator” bias: a mind predisposed to see and accept data that confirm the favored hypothesis and to reject, or fail to see, data that disconfirm it. Determining whether that kind of bias exists requires placing the data in a wider context. Consider the preceding data in the wider context of the findings of the other treating physicians, with a focus on comparing Dr X’s findings with theirs:
(Ankle Jerk or Achilles Reflex)
(These data are from an actual case)

| Examiner (Date)                | Finding (R/L)    |
|--------------------------------|------------------|
| Dr X (11.01.93)                | 1+ / 2+          |
| Dr A (11.15.93)                | 2+ / 2+          |
| Dr B (11.23.93)                | Normal [2+ / 2+] |
| Dr C (01.04.94)                | 2+ / 2+          |
| Dr C (01.25.94)                | 2+ / 2+          |
| BAF (01.12.94)                 | Normal [2+ / 2+] |
| Urgency Care Clinic (01.19.94) | Normal [2+ / 2+] |
| Dr A (01.25.94)                | 2+ / 2+          |
| Dr X (03.09.94)                | Abs / Abs        |
| Dr B (05.03.94)                | 2+ / 2+          |
| Dr X (11.14.94)                | Abs / Abs        |
| Dr C (11.30.94)                | 2+ / 2+          |
| Dr A (02.21.95)                | 2+ / 2+          |
| Dr X (03.29.95)                | Abs / Abs        |
Obviously, Dr X’s findings, considered in isolation from the other physicians’ findings, demonstrate a degree of intra-observer reliability. That is, after his exam in November 1993, Dr X consistently found the same result. When presented to a jury, those data would be impressive and would tend to cause the jury to believe in Dr X’s findings of impairment.
But when Dr X’s findings are compared with the findings of the many other physicians, it becomes immediately apparent that Dr X’s findings, while demonstrating intra-observer consistency, lack inter-observer reliability. The other physicians’ findings, without those of Dr X, demonstrate strong inter-observer reliability, thereby placing Dr X’s findings in an even more damning light.
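The contrast between intra- and inter-observer reliability in the chart above can be quantified. The sketch below uses a simple "agreement with the modal finding" fraction as an informal stand-in for a formal statistic such as Cohen's kappa; the counts are transcribed from the chart (ten non-X observations, all "2+/2+", counting "Normal" as "2+/2+").

```python
from collections import Counter

# Findings transcribed from the article's chart (right/left ankle jerk).
findings = {
    "Dr X": ["1+/2+", "Abs/Abs", "Abs/Abs", "Abs/Abs"],
    "others": ["2+/2+"] * 10,  # Drs A, B, C, BAF, and the clinic
}

def modal_agreement(obs):
    """Fraction of observations matching the most common finding."""
    modal, count = Counter(obs).most_common(1)[0]
    return count / len(obs)

# Dr X is internally consistent after his first exam (3 of 4 agree):
print(modal_agreement(findings["Dr X"]))    # 0.75
# The other examiners agree with one another perfectly:
print(modal_agreement(findings["others"]))  # 1.0

# Pooling everyone exposes Dr X's "Abs" findings as the outliers:
pooled = findings["Dr X"] + findings["others"]
print(modal_agreement(pooled))  # 10/14, roughly 0.71
```

A proper reliability analysis would use a chance-corrected statistic, but even this crude measure shows why Dr X's internally consistent findings fail the inter-observer test.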
- Selection Bias
Considering the likelihood of random variation is obviously imperative when confirming hypotheses about general causation. At a minimum, the sample being studied must be randomly selected from the population. A random selection is one in which each member of the population has an equal chance of being selected. When the sample is not randomly selected, it is likely to be biased (selection bias). For instance, suppose an investigator has a sample of 60 women with silicone breast implants. In each of these 60 women, he finds sensory deficits. That information, by itself, seems persuasive that the implants are causing the deficits. But further suppose that the 60 women were not randomly selected from the population of women with implants, and that they are not a representative sample of that population at all.
What is the odds ratio for the population? It is .00045, indicating no harmful association.
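The random-selection requirement described above (each member of the population having an equal chance of being chosen) is exactly what standard sampling routines implement. A minimal Python sketch, using the article's hypothetical figures of a 60-person sample drawn from a 250-million-person population:

```python
import random

# random.sample draws without replacement, uniformly at random, so every
# member of the population has an equal chance of selection.
population = range(250_000_000)           # hypothetical population
sample = random.sample(population, k=60)  # hypothetical 60-person sample

print(len(sample), len(set(sample)))  # 60 distinct members
```

A self-selected or litigation-assembled group of 60 offers no such guarantee, which is why it invites selection bias.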
- Confounding

A confounder is a variable that is associated with both the exposure and the effect but is not an intermediate step in the causal pathway between them. K.J. Rothman & S. Greenland. Modern Epidemiology, 62; 120-125 (1998). It is, in short, an alternate explanation for the effect or result of the study, whose identity may reveal that the purported exposure is not in fact the cause of the effect.
- The Argument for a Case Study as Confirming a Hypothesis about General Causation?
“To prove P, assume not P; if the assumption of not P leads to an absurdity, then not P must be false and P must be true.”
Is there an argument, however threadbare, that a case study has the power to confirm an hypothesis about general causation? For it to have that power, it should have, at a minimum based on the preceding discussion, three qualities: (1) controlling for random variation (including selection bias), (2) controlling for systematic biases such as confirmatory bias, and (3) ruling out confounding.
- Controlling for Random Variation
To control for random variation, the investigator must ensure that the subject of the case study reflects the characteristics of the population. To do that, the subject of the case study must be either the population itself or a random sample of the population. (Most case and epidemiological studies are not randomly selected from the population.) S. Greenland. Randomization, Statistics, & Causal Inference. Epidemiology, 1: 421-429 (1990); P. Brennan & P. Croft. Interpreting the Results of Observational Research: Chance Is Not Such a Fine Thing. BMJ, 309: 727-730 (1994). The case study cannot have any evidential power to confirm an hypothesis about general causation if it is offered merely as a coincidental instance of the co-occurrence of the exposure and the disease or syndrome.
The Golden Rule: If it cannot be demonstrated that the co-occurrence of an exposure and an effect is a statistical association, it cannot be demonstrated that the exposure causes the effect.
The case study must be placed in a wider context. First, it must be nested in the wider data needed to establish a measure of association, preferably in the form of a 2 x 2 table. Second, it must be established that the sample whose data are in that 2 x 2 table reasonably reflects the characteristics of the population. That is, the sample must have been randomly selected (not the result of the process of selection bias—or of cooking the books, so to speak).
The following two situations are often considered adequate surrogates for a random sample:
- Rare Exposure and Rare Effect
Where the exposure is rare and the effect is rare, use of case studies to prove general causation is based on the so-called “principle of common cause”: when apparent coincidences occur that are too improbable to be attributed to chance, they can be explained by reference to a common cause. W.C. Salmon. Scientific Explanation and the Causal Structure of the World, pp. 158-183 (1984); R.H. Fletcher et al. Clinical Epidemiology: The Essentials, pp. 208-211 (1996).
- Slam Bang Effects
With “slam-bang” effects, what convinces the scientific community that the exposure was the cause, and that the effect was not due to chance, is the immediacy and magnitude of the ensuing putative effect. See Marcum v. Adventist Health System/West, 345 Or 237, 193 P3d 1 (2008).
- Controlling for Systematic Bias
The systematic bias that poses the most serious threat to the value of a case study as confirmation of an hypothesis about general causation is “confirmatory bias.” To control for confirmatory bias, the case study or its hypothesized results must have been subjected to a test of the hypothesis about general causation—a test posing a real risk of producing data rejecting the hypothesis. That is, (i) the hypothesis under consideration must be falsifiable by some conceivable evidence, and (ii) the case study must be capable of providing that evidence. More to the point, the results of the case study–the association of an exposure with a disease or syndrome–must not have been the result of a process of ad hoc data dredging by the clinician to support the litigant’s theory of recovery. R.L. Graham & J.H. Spencer. Ramsey Theory. Scientific American, p. 112 (July 1990). [Ramsey Theory: Frank Ramsey proved a variety of theorems on the inevitability of some kind of order (relations between or among variables) in large sets.]
Data Dredging =df as “collecting data without any test hypothesis being stipulated in advance, or collecting data for other purposes and then analyzing those data for correlations between this or that exposure and these or those diseases.”
The test hypothesis must be subject to a real risk of rejection. This requires that the investigator proceed in a way that makes rejection of the test hypothesis likely if it is false. Assume that, given prior background knowledge, the probability that the predicted event or outcome will occur is very low. If the outcome fails to occur, the rival hypothesis is favored. But if the outcome does occur, its occurrence is strong support for the test hypothesis.
The influence of confirmatory bias is reduced when the clinician is unaware of the patient’s exposure status: the clinician examines the patient, finds signs and symptoms consistent with a kind of exposure, forms a hypothesis (a prediction) that the patient was exposed to that kind of agent, and the patient then confirms that prediction. Unfortunately, it would be exceedingly rare for the clinician to be blinded to the patient’s exposure before the clinical examination.
This requirement–of subjecting the hypothesis to a real risk of not being confirmed–has been the focus of debate in the area in which the case study has found its most ardent support as having the power to confirm hypotheses of general causation: psychoanalytic theory. There, psychoanalytic theorists, most prominently Sigmund Freud and his disciples, have depended on the case study to support a thick network of hypotheses about general causation. In this regard, it has been proposed that the “convergence” or “bootstrap” argument, when nested in a case study, provides a form of confirmation of an hypothesis of general causation.
- The Convergence or Bootstrap Argument
The convergence or bootstrap method of confirming an hypothesis entails the following characteristics. There are multiple “independent” deductions from the same hypothesis about what observations should be expected in the process of a scientific experiment or investigation. These deductions are “independent” if and only if (a) the different scientific observations are obtained from different sources, by different methods, or in different settings and, when possible, are obtained by different unrelated observers, and (b) no available knowledge entails that these different kinds of scientific observations are actually of the same sort (have the same cause) and should co-exist. C. Glymour. Theory and Evidence, pp. 48-62; 263-277 (1980); M. Edelson. Hypothesis in Psychoanalysis, pp. 147-153 (1984).
Because each deduction independently exposes the hypothesis to the possibility of rejection, making multiple independent deductions multiplies the opportunities for rejection. As a result, when the test data nonetheless support the test hypothesis, that hypothesis is considered more probably true than false.
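The probabilistic intuition behind the convergence argument can be made concrete. If each independent deduction, tested on its own, would be confirmed by mere chance with probability p, then the probability that all n of them are confirmed by chance alone is p raised to the nth power. A minimal sketch; the probabilities used are illustrative assumptions, not figures from the sources cited above:

```python
def chance_all_confirmed(p: float, n: int) -> float:
    """Probability that n independent tests all succeed by chance alone,
    when each succeeds by chance with probability p."""
    return p ** n

# Five independent risky predictions, each with an assumed 1-in-10 chance
# of being confirmed coincidentally: the joint coincidence is about 1e-05,
# so joint confirmation strongly favors the hypothesis over chance.
print(chance_all_confirmed(0.1, 5))
```

The force of the argument thus depends entirely on the deductions being genuinely independent, which is condition (a) and (b) in the preceding paragraph.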
- Ruling Out Confounding
To account for the possible presence of confounding, the investigator must endeavor to rule out alternate plausible explanations for the result of the case study. The process of doing so is in the nature of a differential diagnosis. The investigator should do more than merely say, “I have no idea what other factor may be causing this outcome or event.” Instead, the investigator should identify the possible risk factors and explain why each is not a plausible candidate for the outcome. If even one alternate risk factor that the investigator failed to consider can be identified, that investigator’s forensic opinion is nullified.
- Conclusion

Except in rare instances, courts should not allow juries to consider case studies as proof of general causation. Those rare instances require that the clinician has done the following: (1) established good grounds to believe that the association is statistical and not merely coincidental; (2) effectively controlled for systematic biases; (3) identified potential confounders and ruled them out; and (4) discussed the specific mechanism of cause and effect.
A case study should never be allowed as evidence of general causation when it has not been published or subjected to peer review. The single most telling feature of unreliable “scientific evidence” is its insulation from meaningful critique, particularly from those most knowledgeable about the relevant scientific literature.