[Letter on meta-analysis in parapsychology]
(Original publication and copyright: The Journal of Parapsychology, 2006, Volume 70, pp. 410-413)
To The Editor:
Caroline Watt's (2005) thoughtful and stimulating presidential address to the Parapsychological Association raised several points that could help parapsychology. For example, she pointed out that the proposals in my recent article, "A Proposal and Challenge for Proponents and Skeptics of Psi" (Kennedy, 2004a), would enhance the evidential value of meta-analysis. That point is true, but it may be useful to extend the discussion beyond enhancements to meta-analysis.
In particular, effective implementation of the proposals I suggested could virtually eliminate the need for meta-analysis. If appropriate power analyses were incorporated into the design of studies as recommended in my paper, 80% of clearly identified confirmatory or pivotal studies would be expected to be statistically significant, assuming that psi experiments conform to the assumptions for standard statistical research. A few such studies would provide strong evidence for psi without the need for meta-analysis and the associated controversies over alternative methods, criteria, and outcomes.
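To make the arithmetic behind this point concrete, the following sketch (my illustration, not part of the letter's original argument; the four-study scenario is hypothetical) compares the probability of, say, three of four studies reaching significance under the null hypothesis versus under designs with 80% power:

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability that at least k of n independent studies reach significance,
    when each study is significant with probability p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Under the null hypothesis each study is "significant" with probability .05;
# for studies designed with 80% power, each is significant with probability .80.
p_null = prob_at_least(3, 4, 0.05)     # about 0.0005
p_powered = prob_at_least(3, 4, 0.80)  # about 0.82
print(f"{p_null:.5f} {p_powered:.4f}")
```

Under these assumptions a pattern of three or four significant results in four preplanned confirmatory studies would be expected if the effect is real, and would be very unlikely to arise by chance.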
In the areas of medical research in which I currently work, large well-designed studies are given greater weight than meta-analyses. The situation was well summarized in a recent book on statistical methods in cancer research:
Our inclusion of [meta-analysis] in a chapter on exploratory analyses is an indication of our belief that the importance of meta-analysis lies mainly in exploration, not confirmation. In settling therapeutic issues, a meta-analysis is a poor substitute for one large well-conducted trial. In particular, the expectation that a meta-analysis will be done does not justify designing studies that are too small to detect realistic differences with adequate power. (Green, Benedetti, & Crowley, 2003, p. 231)
Among the medical researchers I have worked with in recent years, the conclusions of a meta-analysis are typically accepted only to the extent that they are supported by statistically significant results from large well-designed studies. In making the transition from parapsychology to other areas of research, the greatest adjustment for me was to start recognizing the fundamental importance of power analysis and the value of large well-designed studies.
To put the matter in concrete terms, I know of no well-designed ganzfeld studies. As noted in the previous article (Kennedy, 2004a), a reasonable power analysis for ganzfeld studies indicates a sample size of at least 192, yet I know of no ganzfeld study with a preplanned sample size of at least 192. There are cases in which a series of studies was combined to reach such a sample size, but these combinations appear to have been made in a post hoc manner, and they often pooled exploratory studies that varied in methodology or design. For example, the widely cited Bem and Honorton (1994) article is a meta-analysis of a series of studies from one laboratory. Judged against a power analysis based on previous research, each of those studies was severely underpowered and therefore would be considered poorly designed by the standards I work with now. I do not believe that post hoc meta-analysis can compensate for poor designs. In addition, the variability among the studies, combined with the negative correlation between effect size and sample size, raises doubts about the meta-analysis. A negative correlation between effect size and sample size is normally diagnostic of bias in a meta-analysis (Egger, Smith, Schneider, & Minder, 1997).
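For readers who wish to check the figure of 192, the standard sample-size approximation for testing a proportion reproduces it. The inputs are my reconstruction of the conventional parameters for ganzfeld work (a 25% chance hit rate, a hypothesized hit rate of about 33%, one-sided alpha of .05, and 80% power), not values quoted in this letter:

```python
from math import sqrt, ceil

# Assumed design parameters (reconstructed, not quoted from Kennedy, 2004a)
p0, p1 = 0.25, 0.33   # chance hit rate vs. hypothesized ganzfeld hit rate
z_alpha = 1.645       # one-sided alpha = .05
z_beta = 0.8416       # power = .80

# Standard normal-approximation sample-size formula for a one-sample
# test of a binomial proportion
n = ((z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p1 * (1 - p1)))
     / (p1 - p0)) ** 2
print(ceil(n))  # 192 trials
```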
Of course, these arguments and proposals assume that psi conforms to the properties of standard statistical research. As noted in the previous article (Kennedy, 2004a), I have come to expect that the results of large confirmatory psi studies will not be more reliably significant than the results of small exploratory studies, which is contrary to the basic assumptions of statistical research, including meta-analysis. In addition to the references in my previous article, a more recent meta-analysis of PK studies with electronic random number generators found that the z scores (significance levels) did not increase with sample size and that effect size was negatively related to sample size (Bosch, Steinkamp, & Boller, 2006).
The authors of the meta-analysis proposed that the pattern of results was due to publication bias rather than PK, but admitted that they could not provide convincing evidence for that hypothesis. Such controversies are common in meta-analyses in parapsychology. It is also noteworthy that a similar pattern of results occurred in the Bem and Honorton (1994) ganzfeld meta-analysis when publication bias presumably could not have been a factor because it included all studies of a certain type from one laboratory.
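The statistical expectation being violated here can be stated simply: if each trial carries a fixed effect, a study's expected z score grows in proportion to the square root of its sample size. A minimal sketch (with the same assumed hit rates as above) shows why a flat z-versus-n profile across studies is anomalous:

```python
from math import sqrt

p0, p1 = 0.25, 0.33  # assumed chance and hypothesized hit rates

def expected_z(n):
    """Expected z score of an n-trial binomial study with a fixed per-trial effect."""
    return (p1 - p0) * sqrt(n) / sqrt(p0 * (1 - p0))

# Quadrupling the sample size should double the expected z score,
# so z scores that stay level as n grows contradict the fixed-effect model.
for n in (50, 200, 800):
    print(n, round(expected_z(n), 2))  # 1.31, 2.61, 5.23
```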
I suspect that the nearly universal disregard in parapsychology for power analysis and for the value of large studies reflects the fact that most psi researchers implicitly (perhaps unconsciously) recognize that psi does not conform to the assumptions for standard statistical research. However, efforts to provide convincing evidence for psi will fail if the experimental results have unexplained properties that are inconsistent with the statistical foundations for the claimed evidence. Cautious scientists will continue to favor methodological problems as the most likely explanation, particularly if the results are unpredictable and appear to be associated with certain experimenters.
The finding that z scores do not increase with sample size implies that the standard methods of data analysis, including binomial tests, t tests, and analysis of variance, do not have their usual meaning and applicability in psi research. The experiment as a whole may be the appropriate unit of analysis, rather than the individual trials or subjects as assumed for those tests. The hypothesis of goal-oriented psi experimenter effects is logically consistent with the basic assumptions for psi research (Kennedy, 1995) and now has strong empirical support from the finding that the outcomes of psi experiments are typically unrelated to sample size. Appropriate statistical methods for this type of phenomenon remain to be developed. Using statistical assumptions that do not fit the phenomena will inevitably result in failure to make scientific progress.
A two-stage statistical strategy may be needed. The first stage would be based on normal statistical methods to provide evidence that something anomalous occurred. The second stage would utilize more novel statistical assumptions appropriate for the phenomena. The concepts of goal-oriented psi (Kennedy, 1995) and evasive psi (Kennedy, 2004b) may be useful starting points for developing relevant methods.
References

Bem, D. J., & Honorton, C. (1994). Does psi exist? Replicable evidence for an anomalous process of information transfer. Psychological Bulletin, 115, 4-18.
Bosch, H., Steinkamp, F., & Boller, E. (2006). In the eye of the beholder: Reply to Wilson and Shadish (2006) and Radin, Nelson, Dobyns, and Houtkooper (2006). Psychological Bulletin, 132, 533-537.
Egger, M., Smith, G. D., Schneider, M., & Minder, C. (1997). Bias in meta-analysis detected by a simple graphical test. British Medical Journal, 315, 629-634.
Green, S., Benedetti, J., & Crowley, J. (2003). Clinical Trials in Oncology (2nd ed.). New York: Chapman & Hall/CRC.
Kennedy, J. E. (1995). Methods for investigating goal-oriented psi. Journal of Parapsychology, 59, 47-62.
Kennedy, J. E. (2004a). A proposal and challenge for proponents and skeptics of psi. Journal of Parapsychology, 68, 157-167.
Kennedy, J. E. (2004b). What is the purpose of psi? Journal of the American Society for Psychical Research, 98, 1-27 (also available at http://jeksite.org/psi.htm).
Watt, C. (2005). 2005 Presidential address: Parapsychology's contribution to psychology: A view from the front line. Journal of Parapsychology, 69, 215-232.
J. E. Kennedy
Broomfield, CO USA