Statistical Inference (and What is Wrong With Classical Statistics)

Scope

This page concerns statistical inference as described by the most prominent and mainstream school of thought, which is variously described as ‘classical statistics’, ‘conventional statistics’, ‘frequentist statistics’, ‘orthodox statistics’ or ‘sampling theory’. Oddly, statistical inference (drawing conclusions from the data) is never defined within the paradigm.

The practice of statistical inference as described here includes estimation (both point estimation and interval estimation using confidence intervals) and significance tests (testing a null hypothesis and calculating p-values).

The important point is that all of these methods involve pretending that our sample came from an imaginary experiment that involved considering all possible samples of the same size from the population.

History

The first formal significance test (Arbuthnott, 1710) correctly demonstrated that the excess of male births is statistically significant, but erroneously concluded that this was due to Divine Providence (intelligent design, rather than chance). Modern hypothesis testing is an anonymous hybrid of the tests proposed by Ronald Fisher (1922, 1925) on the one hand, and Jerzy Neyman and Egon Pearson (1933) on the other. Since Berkson (1938), people have questioned the use of hypothesis testing in the sciences. For a historical account of significance testing, see Huberty (1993).

The frequentist interpretation of probability is very limited

A frequentist subscribes to the long-run relative frequency interpretation of probability. The probability of an outcome is defined as the limiting relative frequency with which that outcome appears in a long series of similar events. Dice, coins and shuffled playing cards can be used to generate random variables; these have a frequency distribution, so the frequency definition of probability can be applied. Unfortunately, the frequency interpretation can only be used in cases such as these. The Bayesian interpretation of probability can be used in any situation.
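The limiting relative frequency can be illustrated with a simulated coin. The following is a minimal sketch, not part of the original argument: the function name, the fair-coin probability of 0.5 and the fixed seed are all assumptions chosen for illustration.

```python
import random


def running_relative_frequency(n_tosses, p_heads=0.5, seed=0):
    """Return the relative frequency of heads after each toss of a simulated coin.

    p_heads and seed are illustrative assumptions, not values from the text.
    """
    rng = random.Random(seed)
    heads = 0
    freqs = []
    for i in range(1, n_tosses + 1):
        if rng.random() < p_heads:
            heads += 1
        # Relative frequency of heads over the first i tosses.
        freqs.append(heads / i)
    return freqs


if __name__ == "__main__":
    freqs = running_relative_frequency(100_000)
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(f"after {n:>7} tosses: relative frequency of heads = {freqs[n - 1]:.4f}")
```

The relative frequency wanders for small numbers of tosses and settles near 0.5 as the series grows. The definition only makes sense because the toss can be repeated indefinitely, which is exactly the restriction the paragraph above complains about.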

The nature of the null hypothesis test

Why should we choose between just two hypotheses, and why can't we put a probability on a hypothesis? A typical null hypothesis, that two population means are equal, is daft: they will almost never be exactly equal. What does it mean to accept and reject a hypothesis? If a significance level is used to decide whether a null hypothesis is true or not, note that the level, such as 0.05, is totally arbitrary (the level effectively acts as a prior, but classical statisticians fail to appreciate this).
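One consequence of a null hypothesis that is almost never exactly true is that a large enough sample will eventually reject it, however trivial the difference. The sketch below assumes a two-sample z-test with a common, known standard deviation and, for simplicity, treats the observed difference as fixed at a hypothetical value of 0.01 standard deviations. The function name and all numbers are illustrative assumptions.

```python
import math


def two_sided_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value of a z-test for a difference in means between two
    groups of equal size that share a known standard deviation."""
    standard_error = sd * math.sqrt(2.0 / n_per_group)
    z = mean_diff / standard_error
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    # Hypothetical tiny difference: 0.01 standard deviations.
    for n in (100, 10_000, 1_000_000, 100_000_000):
        p = two_sided_p_value(mean_diff=0.01, sd=1.0, n_per_group=n)
        print(f"n per group = {n:>11,}: p = {p:.3g}")
```

The same negligible difference moves from 'not significant' to 'highly significant' purely because the sample size grows, which is why a significant result by itself says little about the size or importance of an effect.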

Prior information is ignored

Almost all prior information is ignored and no opportunity is given to incorporate what we already know.

Assumptions are swept under the carpet

The subjective elements of classical statistics, such as the choice of null hypothesis, determining the outcome space, the appropriate significance level and the dependence of significance tests on the stopping rule are all swept under the carpet. Bayesian methods put them where we can see them: in the prior.
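The dependence on the stopping rule can be made concrete with a standard textbook illustration, which is not taken from this page: a coin shows 9 heads and 3 tails, and the null hypothesis is that the coin is fair. The one-sided p-value depends on whether the experimenter planned to toss 12 times or planned to toss until the third tail. The function names below are hypothetical.

```python
from math import comb


def p_value_fixed_n(heads, tails):
    """One-sided p-value for a fair coin when the total number of tosses
    (heads + tails) was fixed in advance (binomial sampling)."""
    n = heads + tails
    # P(at least `heads` heads in n tosses).
    return sum(comb(n, k) for k in range(heads, n + 1)) / 2 ** n


def p_value_stop_at_tails(heads, tails):
    """One-sided p-value for a fair coin when tossing continued until
    `tails` tails had appeared (negative binomial sampling)."""
    # Seeing at least `heads` heads before the final tail is the same event
    # as seeing at most tails - 1 tails in the first heads + tails - 1 tosses.
    m = heads + tails - 1
    return sum(comb(m, k) for k in range(tails)) / 2 ** m


if __name__ == "__main__":
    print(f"fixed 12 tosses:      p = {p_value_fixed_n(9, 3):.4f}")
    print(f"stop at third tail:   p = {p_value_stop_at_tails(9, 3):.4f}")
```

The data are identical in both cases, yet the first p-value is 299/4096 (about 0.073) and the second is 67/2048 (about 0.033), so the result is 'significant at 0.05' under one stopping rule and not under the other.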

p-values are irrelevant (which leads to incoherence) and misleading

With little loss of generality, let us consider a simple problem of inference. Assume that we have a large population with known mean and one sample. All of this makes up our evidence, E. Our hypothesis, H, is that the sample came from a different population (one with a different mean).

The frequentist theory of probability is only capable of dealing with random variables which generate a frequency distribution ‘in the long run’. We have one fixed population and one fixed sample. There is nothing random about this problem and the experiment is conducted only once, so there is no ‘long run’. So, versed in frequentist probability, what is our hapless orthodox statistician to do?

We pretend that the experiment was not conducted once, but an infinite number of times (that is, we consider all possible samples of the same size). Incredibly, all samples are considered equal, that is, our actual sample is not given any privileges over any other (imaginary) sample. We assume that each sample mean includes an ‘error’, which is independently and normally distributed about zero. Optimistically, we now claim that our sample was ‘random’. Voila! The sample mean now becomes our random variable, which we call our ‘statistic’. We can now apply the frequentist interpretation of probability.

We are now able to determine the (frequentist) probability of a (randomly chosen) sample mean having a value at least as extreme as our original sample mean. Note that this calculation implicitly assumes the null hypothesis, namely that the sample came from the population with the known mean. This probability is our p-value which, incredibly, is assumed to apply to the original problem.
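The procedure just described can be sketched in code. The example assumes a normally distributed population with a known mean and standard deviation; the particular numbers (mean 100, standard deviation 15, sample size 25, observed sample mean 106) and the function names are hypothetical. The simulation draws many imaginary samples of the same size and counts how often their mean is at least as far from the population mean as the observed one.

```python
import math
import random


def analytic_p_value(sample_mean, pop_mean, pop_sd, n):
    """Two-sided p-value from the normal sampling distribution of the mean."""
    z = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulated_p_value(sample_mean, pop_mean, pop_sd, n, n_repeats=20_000, seed=1):
    """Approximate the same p-value by drawing imaginary samples of size n
    from the population assumed under the null hypothesis."""
    rng = random.Random(seed)
    observed_gap = abs(sample_mean - pop_mean)
    extreme = 0
    for _ in range(n_repeats):
        imagined_mean = sum(rng.gauss(pop_mean, pop_sd) for _ in range(n)) / n
        if abs(imagined_mean - pop_mean) >= observed_gap:
            extreme += 1
    return extreme / n_repeats


if __name__ == "__main__":
    args = dict(sample_mean=106.0, pop_mean=100.0, pop_sd=15.0, n=25)
    print(f"analytic p-value:  {analytic_p_value(**args):.4f}")
    print(f"simulated p-value: {simulated_p_value(**args):.4f}")
```

Every quantity in the calculation except the observed mean refers to samples that were never drawn, which is the point of the criticism: the p-value is a statement about the imaginary repetitions, not about the hypothesis.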

A method similar to that outlined above is common to all Fisher-Neyman-Pearson inference. The p-value also suffers from being an incoherent measure of support, in the sense that we can reject a hypothesis that is a superset of a second hypothesis without rejecting the second. P-values are not just irrelevant; they are dangerous, because they are often misunderstood to be probabilities about the hypothesis, given the data (which would be far more intuitive). As the prominent Bayesian Harold Jeffreys observed, ‘What the use of P implies, therefore, is that a hypothesis that may be true may be rejected because it has not predicted observable results that have not occurred’ (Jeffreys, 1961).
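The gap between the probability of the data given the hypothesis and the probability of the hypothesis given the data can be shown with Bayes' theorem. This is a simplified sketch that conditions on a 'significant' result rather than on an exact p-value; the prior probabilities, the significance level of 0.05 and the power of 0.8 are illustrative assumptions, and the function name is hypothetical.

```python
def posterior_null_given_significant(prior_null, alpha, power):
    """P(H0 | significant result) by Bayes' theorem.

    prior_null: prior probability that the null hypothesis H0 is true
    alpha:      P(significant result | H0 true), the significance level
    power:      P(significant result | H0 false)
    """
    significant_and_null = alpha * prior_null
    significant_and_alt = power * (1.0 - prior_null)
    return significant_and_null / (significant_and_null + significant_and_alt)


if __name__ == "__main__":
    for prior in (0.5, 0.9):
        post = posterior_null_given_significant(prior, alpha=0.05, power=0.8)
        print(f"prior P(H0) = {prior:.1f}: P(H0 | significant) = {post:.3f}")
```

With a prior of 0.9 the posterior probability of the null after a significant result is 0.36, far from the 0.05 that a naive reading of the significance level would suggest. The answer cannot be computed at all without a prior, which the classical procedure does not supply.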

In summary:

the meaning of accepting or rejecting a hypothesis is unclear
prior information is ignored
assumptions are swept under the carpet
p-values are irrelevant (which leads to incoherence) and misleading

Important Publications

ANDERSON, D.R., K.P. BURNHAM and W.L. THOMPSON, 2000. Null hypothesis testing: Problems, prevalence, and an alternative, The Journal of wildlife management 64, 912-923. [Cited by 340] (56.98/year)
Abstract: "This paper presents a review and critique of statistical null hypothesis testing in ecological studies in general, and wildlife studies in particular, and describes an alternative. Our review of Ecology and the Journal of Wildlife Management found the use of null hypothesis testing to be pervasive. The estimated number of P-values appearing within articles of Ecology exceeded 8,000 in 1991 and has exceeded 3,000 in each year since 1984, whereas the estimated number of P-values in the Journal of Wildlife Management exceeded 8,000 in 1997 and has exceeded 3,000 in each year since 1994. We estimated that 47% (SE = 3.9%) of the P-values in the Journal of Wildlife Management lacked estimates of means or effect sizes or even the sign of the difference in means or other parameters. We find that null hypothesis testing is uninformative when no estimates of means or effect size and their precision are given. Contrary to common dogma, tests of statistical null hypotheses have relatively little utility in science and are not a fundamental aspect of the scientific method. We recommend their use be reduced in favor of more informative approaches. Towards this objective, we describe a relatively new paradigm of data analysis based on Kullback-Leibler information. This paradigm is an extension of likelihood theory and, when used correctly, avoids many of the fundamental limitations and common misuses of null hypothesis testing. Information-theoretic methods focus on providing a strength of evidence for an a priori set of alternative hypotheses, rather than a statistical test of a null hypothesis. This paradigm allows the following types of evidence for the alternative hypotheses: the rank of each hypothesis, expressed as a model; an estimate of the formal likelihood of each model, given the data; a measure of precision that incorporates model selection uncertainty; and simple methods to allow the use of the set of alternative models in making formal inference.
We provide an example of the information-theoretic approach using data on the effect of lead on survival in spectacled eider ducks (Somateria fischeri). Regardless of the analysis paradigm used, we strongly recommend inferences based on a priori considerations be clearly separated from those resulting from some form of data dredging."
WILKINSON, Leland and the Task Force on Statistical Inference, 1999. Statistical methods in psychology journals: Guidelines and explanations, American Psychologist Volume 54(8), August 1999, p 594-604. [Cited by 358] (51.38/year)
"Hypothesis tests. It is hard to imagine a situation in which a dichotomous accept–reject decision is better than reporting an actual p value or, better still, a confidence interval. Never use the unfortunate expression “accept the null hypothesis.” Always provide some effect-size estimate when reporting a p value. Cohen (1994) has written on this subject in this journal. All psychologists would benefit from reading his insightful article."
Part of Conclusions: "Some had hoped that this task force would vote to recommend an outright ban on the use of significance tests in psychology journals. Although this might eliminate some abuses, the committee thought that there were enough counterexamples (e.g., Abelson, 1997) to justify forbearance. Furthermore, the committee believed that the problems raised in its charge went beyond the simple question of whether to ban significance tests."
COHEN, J., 1994. The earth is round (p < .05), American Psychologist. [Cited by 515] (43.03/year)
"After 4 decades of severe criticism, the ritual of null hypothesis significance testing—mechanical dichotomous decisions around a sacred .05 criterion—still persists. This article reviews the problems with this practice, including near universal misinterpretation of p as the probability that H₀ is false, the misinterpretation that its complement is the probability of successful replication, and the mistaken assumption that if one rejects H₀ one thereby affirms the theory that led to the test. Exploratory data analysis and the use of graphic methods, a steady improvement in and a movement toward standardization in measurement, an emphasis on estimating effect sizes using confidence intervals, and the informed use of available statistical methods are suggested. For generalization, psychologists must finally rely, as has been done in all the older sciences, on replication."
EFRON, B., 2004. Large-Scale Simultaneous Hypothesis Testing: The Choice of a Null Hypothesis. Journal of the American Statistical Association Vol. 99. [Cited by 69] (35.08/year)
Abstract: "Current scientific techniques in genomics and image processing routinely produce hypothesis testing problems with hundreds or thousands of cases to consider simultaneously. This poses new difficulties for the statistician, but also opens new opportunities. In particular it allows empirical estimation of an appropriate null hypothesis. The empirical null may be considerably more dispersed than the usual theoretical null distribution that would be used for any one case considered separately. An empirical Bayes analysis plan for this situation is developed, using a local version of the false discovery rate to examine the inference issues. Two genomics problems are used as examples to show the importance of correctly choosing the null hypothesis."
JOHNSON, Douglas H., 1999. The insignificance of statistical significance testing, Journal of Wildlife Management 63(3):763-772. [Cited by 216] (30.99/year)
Abstract: "Despite their wide use in scientific journals such as The Journal of Wildlife Management, statistical hypothesis tests add very little value to the products of research. Indeed, they frequently confuse the interpretation of data. This paper describes how statistical hypothesis tests are often viewed, and then contrasts that interpretation with the correct one. I discuss the arbitrariness of P-values, conclusions that the null hypothesis is true, power analysis, and distinctions between statistical and biological significance. Statistical hypothesis testing, in which the null hypothesis about the properties of a population is almost always known a priori to be false, is contrasted with scientific hypothesis testing, which examines a credible null hypothesis about phenomena in nature. More meaningful alternatives are briefly outlined, including estimation and confidence intervals for determining the importance of factors, decision theory for guiding actions in the face of uncertainty, and Bayesian approaches to hypothesis testing and other statistical practices."
Conclusions: "Editors of scientific journals, along with the referees they rely on, are really the arbiters of scientific practice. They need to understand how statistical methods can be used to reach sound conclusions from data that have been gathered. It is not sufficient to insist that authors use statistical methods—the methods must be appropriate to the application. The most common and flagrant misuse of statistics, in my view, is the testing of hypotheses, especially the vast majority of them known beforehand to be false.
With the hundreds of articles already published that decry the use of statistical hypothesis testing, I was somewhat hesitant about writing another. It contains nothing new. But still, reading The Journal of Wildlife Management makes me realize that the message has not really reached the audience of wildlife biologists. Our work is important, so we should use the best tools we have available. Rarely, however, is that tool statistical hypothesis testing."
KILLEEN, P.R., 2005. An Alternative to Null-Hypothesis Significance Tests, Psychological Science, Volume 16, Number 5, May 2005, pp. 345-353. [Cited by 29] (29.95/year)
Abstract: "The statistic p_rep estimates the probability of replicating an effect. It captures traditional publication criteria for signal-to-noise ratio, while avoiding parametric inference and the resulting Bayesian dilemma. In concert with effect size and replication intervals, p_rep provides all of the information now used in evaluating research, while avoiding many of the pitfalls of traditional statistical inference."
IOANNIDIS, J.P., 2005. Why most published research findings are false. PLoS Med. [Cited by 29] (29.65/year)
Summary: "There is increasing concern that most current published research findings are false. The probability that a research claim is true may depend on study power and bias, the number of other studies on the same question, and, importantly, the ratio of true to no relationships among the relationships probed in each scientific field. In this framework, a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; where there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there is greater financial and other interest and prejudice; and when more teams are involved in a scientific field in chase of statistical significance. Simulations show that for most study designs and settings, it is more likely for a research claim to be false than true. Moreover, for many current scientific fields, claimed research findings may often be simply accurate measures of the prevailing bias. In this essay, I discuss the implications of these problems for the conduct and interpretation of research."
SCHMIDT, F.L., 1996. Statistical significance testing and cumulative knowledge in psychology: Implications for training of researchers, Psychological Methods 1(2), 115-129. [Cited by 280] (28.09/year)
Abstract: "Data analysis methods in psychology still emphasize statistical significance testing, despite numerous articles demonstrating its severe deficiencies. It is now possible to use meta-analysis to show that reliance on significance testing retards the development of cumulative knowledge. But reform of teaching and practice will also require that researchers learn that the benefits that they believe flow from use of significance testing are illusory. Teachers must revamp their courses to bring students to understand that (a) reliance on significance testing retards the growth of cumulative research knowledge; (b) benefits widely believed to flow from significance testing do not in fact exist; and (c) significance testing methods must be replaced with point estimates and confidence intervals in individual studies and with meta-analyses in the integration of multiple studies. This reform is essential to the future progress of cumulative knowledge in psychological research."
NEWCOMBE, R.G., 1998. Two-sided confidence intervals for the single proportion: comparison of seven methods, Statistics in Medicine, Volume 17, Issue 8, Pages 857 - 872. [Cited by 221] (27.74/year)
Abstract: "Simple interval estimate methods for proportions exhibit poor coverage and can produce evidently inappropriate intervals. Criteria appropriate to the evaluation of various proposed methods include: closeness of the achieved coverage probability to its nominal value; whether intervals are located too close to or too distant from the middle of the scale; expected interval width; avoidance of aberrations such as limits outside [0,1] or zero width intervals; and ease of use, whether by tables, software or formulae. Seven methods for the single proportion are evaluated on 96,000 parameter space points. Intervals based on tail areas and the simpler score methods are recommended for use. In each case, methods are available that aim to align either the minimum or the mean coverage with the nominal 1 - α."
FARRIS, J.S., et al., 1995. Constructing a Significance Test for Incongruence. Systematic Biology, Vol. 44, No. 4. (Dec., 1995), pp. 570-572. [Cited by 296] (26.99/year)
WESTFALL, P.H. and S.S. YOUNG, 1993. Resampling-based multiple testing: examples and methods for p-value adjustment. John Wiley & Sons. [Cited by 346] (26.67/year)
GIGERENZER, G., S. KRAUSS and O. VITOUCH, 2004. The null ritual: What you always wanted to know about significance testing but were afraid to ask. The SAGE handbook of quantitative methodology for the social …. [Cited by 44] (22.37/year)
KLEIN, D.F., 2005. Beyond Significance Testing: Reforming Data Analysis Methods in Behavioral Research. American Journal of Psychiatry. [Cited by 21] (21.69/year)
KLAYMAN, J. and Y.W. HA, 1987. Confirmation, Disconfirmation, and Information in Hypothesis Testing, Psychological Review, 94, 211-228. [Cited by 408] (21.51/year)
Abstract: "Strategies for hypothesis testing in scientific investigation and everyday reasoning have interested both psychologists and philosophers. A number of these scholars stress the importance of disconfirmation in reasoning and suggest that people are instead prone to a general deleterious “confirmation bias.” In particular, it is suggested that people tend to test those cases that have the best chance of verifying current beliefs rather than those that have the best chance of falsifying them. We show, however, that many phenomena labeled “confirmation bias” are better understood in terms of a general positive test strategy. With this strategy, there is a tendency to test cases that are expected (or known) to have the property of interest rather than those expected (or known) to lack that property. This strategy is not equivalent to confirmation bias in the first sense; we show that the positive test strategy can be a very good heuristic for determining the truth or falsity of a hypothesis under realistic conditions. It can, however, lead to systematic errors or inefficiencies. The appropriateness of human hypothesis-testing strategies and prescriptions about optimal strategies must be understood in terms of the interaction between the strategy and the task at hand."
DAVIES, R.B., 1977. Hypothesis testing when a nuisance parameter is present only under the alternative, Biometrika, Vol. 64, No. 2. (Aug., 1977), pp. 247-254. [Cited by 376] (19.82/year)
Abstract: "Suppose that the distribution of a random variable representing the outcome of an experiment depends on two parameters ξ and θ and that we wish to test the hypothesis ξ = 0 against the alternative ξ > 0. If the distribution does not depend on θ when ξ = 0, standard asymptotic methods such as likelihood ratio testing or C(α) testing are not directly applicable. However, these methods may, under appropriate conditions, be used to reduce the problem to one involving inference from a Gaussian process. This simplified problem is examined and a test which may be derived as a likelihood ratio test or from the union-intersection principle is introduced. Approximate expressions for the significance level and power are obtained."
CHOW, S.L., 2000. Précis of Statistical significance: Rationale, validity, and utility, Behavioral and Brain Sciences 1998 Apr;21(2):169-94; discussion 194-239. [Cited by 111] (18.60/year)
Abstract: "The null-hypothesis significance-test procedure (NHSTP) is defended in the context of the theory-corroboration experiment, as well as the following contrasts: (a) substantive hypotheses versus statistical hypotheses, (b) theory corroboration versus statistical hypothesis testing, (c) theoretical inference versus statistical decision, (d) experiments versus nonexperimental studies, and (e) theory corroboration versus treatment assessment. The null hypothesis can be true because it is the hypothesis that errors are randomly distributed in data. Moreover, the null hypothesis is never used as a categorical proposition. Statistical significance means only that chance influences can be excluded as an explanation of data; it does not identify the nonchance factor responsible. The experimental conclusion is drawn with the inductive principle underlying the experimental design. A chain of deductive arguments gives rise to the theoretical conclusion via the experimental conclusion. The anomalous relationship between statistical significance and the effect size often used to criticize NHSTP is more apparent than real. The absolute size of the effect is not an index of evidential support for the substantive hypothesis. Nor is the effect size, by itself, informative as to the practical importance of the research result. Being a conditional probability, statistical power cannot be the a priori probability of statistical significance. The validity of statistical power is debatable because statistical significance is determined with a single sampling distribution of the test statistic based on H₀, whereas it takes two distributions to represent statistical power or effect size. Sample size should not be determined in the mechanical manner envisaged in power analysis. It is inappropriate to criticize NHSTP for nonstatistical reasons. 
At the same time, neither effect size, nor confidence interval estimate, nor posterior probability can be used to exclude chance as an explanation of data. Neither can any of them fulfill the nonstatistical functions expected of them by critics."
NICKERSON, R.S., 2000. Null hypothesis significance testing: A review of an old and continuing controversy, Psychological Methods. [Cited by 111] (18.60/year)
Abstract: "Null hypothesis significance testing (NHST) is arguably the most widely used approach to hypothesis evaluation among behavioral and social scientists. It is also very controversial. A major concern expressed by critics is that such testing is misunderstood by many of those who use it. Several other objections to its use have also been raised. In this article the author reviews and comments on the claimed misunderstandings as well as on other criticisms of the approach, and he notes arguments that have been advanced in support of NHST. Alternatives and supplements to NHST are considered, as are several related recommendations regarding the interpretation of experimental data. The concluding opinion is that NHST is easily misunderstood and misused but that when applied with good judgment it can be an effective aid to the interpretation of experimental data."
GOODMAN, S.N., 1999. Toward evidence-based medical statistics. 1: The P value fallacy. Ann Intern Med. [Cited by 121] (17.35/year)
Abstract: "An important problem exists in the interpretation of modern medical research data: Biological understanding and previous research play little formal role in the interpretation of quantitative results. This phenomenon is manifest in the discussion sections of research articles and ultimately can affect the reliability of conclusions. The standard statistical approach has created this situation by promoting the illusion that conclusions can be produced with certain “error rates,” without consideration of information from outside the experiment. This statistical approach, the key components of which are P values and hypothesis tests, is widely perceived as a mathematically coherent approach to inference. There is little appreciation in the medical community that the methodology is an amalgam of incompatible elements, whose utility for scientific inference has been the subject of intense debate among statisticians for almost 70 years. This article introduces some of the key elements of that debate and traces the appeal and adverse impact of this methodology to the P value fallacy, the mistaken idea that a single number can capture both the long-run outcomes of an experiment and the evidential meaning of a single result. This argument is made as a prelude to the suggestion that another measure of evidence should be used—the Bayes factor, which properly separates issues of long-run behavior from evidential strength and allows the integration of background knowledge with statistical findings."
HARLOW, L.L., S.A. MULAIK and J.H. STEIGER, 1997. What If There Were No Significance Tests? erlbaum.com. [Cited by 151] (16.84/year)
THOMPSON, B., 2002. What future quantitative social science research could look like: Confidence intervals for effect sizes, Educational Researcher v31 n3 p25-32 Apr 2002. [Cited by 66] (16.63/year)
WILCOX, R.R., 1997. Introduction to robust estimation and hypothesis testing. Academic Press San Diego, CA. [Cited by 145] (16.17/year)
BERKSON, J., 2003. Tests of significance considered as evidence. International Journal of Epidemiology. [Cited by 48] (16.16/year)
GARDNER, M.J. and D.G. ALTMAN, 1986. Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med J (Clin Res Ed). [Cited by 306] (15.32/year)
"Overemphasis on hypothesis testing--and the use of P values to dichotomise significant or non-significant results--has detracted from more useful approaches to interpreting study results, such as estimation and confidence intervals. In medical studies investigators are usually interested in determining the size of difference of a measured outcome between groups, rather than a simple indication of whether or not it is statistically significant. Confidence intervals present a range of values, on the basis of the sample data, in which the population value for such a difference may lie. Some methods of calculating confidence intervals for means and differences between means are given, with similar information for proportions. The paper also gives suggestions for graphical display. Confidence intervals, if appropriate to the type of study, should be used for major findings in both the main text of a paper and its abstract."
BRAUMOELLER, B.F., 2004. Hypothesis Testing and Multiplicative Interaction Terms. International Organization. [Cited by 28] (14.23/year)
Abstract: "When a statistical equation incorporates a multiplicative term in an attempt to model interaction effects, the statistical significance of the lower-order coefficients is largely useless for the typical purposes of hypothesis testing. This fact remains largely unappreciated in political science, however. This brief article explains this point, provides examples, and offers some suggestions for more meaningful interpretation."
ROSENTHAL, R., 1979. The “file drawer problem” and tolerance for null results. Psychological Bulletin 86, 638-641. [Cited by 380] (14.09/year)
HANSEN, B.E., 1997. Approximate Asymptotic P Values for Structural-Change Tests. Journal of Business & Economic Statistics. [Cited by 124] (13.82/year)
TRAFIMOW, D., 2003. Hypothesis testing and theory evaluation at the boundaries: Surprising insights from Bayes' s …. Psychological Review. [Cited by 41] (13.81/year)
Abstract: "Because the probability of obtaining an experimental finding given that the null hypothesis is true [p(F|H₀)] is not the same as the probability that the null hypothesis is true given a finding [p(H₀|F)], calculating the former probability does not justify conclusions about the latter one. As the standard null-hypothesis significance-testing procedure does just that, it is logically invalid (J. Cohen, 1994). Theoretically, Bayes's theorem yields p(H₀|F), but in practice, researchers rarely know the correct values for 2 of the variables in the theorem. Nevertheless, by considering a wide range of possible values for the unknown variables, it is possible to calculate a range of theoretical values for p(H₀|F) and to draw conclusions about both hypothesis testing and theory evaluation."
MASSON, M.E.J. and G.R. LOFTUS, 2003. Using confidence intervals for graphically based data interpretation. Canadian Journal of Experimental Psychology. [Cited by 40] (13.48/year)
CUMMING, G. and S. FINCH, 2001. A Primer on the Understanding, Use, and Calculation of Confidence Intervals That Are Based on …. Educational and Psychological Measurement. [Cited by 63] (12.68/year)
LENHARD, J., 2006. Models and Statistical Inference: The Controversy between Fisher and Neyman-Pearson, The British Journal for the Philosophy of Science 57(1):69-91. [Cited by 3] (12.55/year)
WRIGHT, S.P., 1992. Adjusted P-Values for Simultaneous Inference. Biometrics. [Cited by 163] (11.66/year)
KOCH, K.R., 1988. Parameter estimation and hypothesis testing in linear models. Springer-Verlag New York, Inc. New York, NY, USA. [Cited by 195] (10.85/year)
TVERSKY, A. and D. KAHNEMAN, 1971. Belief in the law of small numbers. Psychological Bulletin, 76, 105-110. [Cited by 371] (10.61/year)
THOMPSON, Bruce, 1996. AERA Editorial Policies regarding Statistical Significance Testing: Three Suggested Reforms, Educational Researcher, Vol. 25, No. 2. (Mar., 1996), pp. 26-30. [Cited by 105] (10.53/year)
Abstract: "The present comment reviews practices revolving around tests of statistical significance. First, the logic of statistical significance testing is presented in an accessible manner; many people who use statistical tests might not place such a premium on the tests if these individuals understood what the tests really do, and what the tests do not do. Second, the etiology of decades of misuse of statistical tests is briefly explored; we must understand the bad implicit logic of persons who misuse statistical tests if we are to have any hope of persuading them to alter their practices. Third, three revised editorial policies that would improve conventional practice are highlighted."
BELIA, S., et al., 2005. Researchers misunderstand confidence intervals and standard error bars. Psychological Methods. [Cited by 10] (10.33/year)
STEPHENS, P.A., et al., 2005. Information theory and hypothesis testing: a call for pluralism. Journal of Applied Ecology. [Cited by 10] (10.33/year)
THEILER, J. and D. PRICHARD, 1996. Constrained-realization Monte-Carlo method for hypothesis testing. Physica D. [Cited by 100] (10.03/year)
Abstract: "We compare two theoretically distinct approaches to generating artificial (or “surrogate”) data for testing hypotheses about a given data set. The first and more straightforward approach is to fit a single “best” model to the original data, and then to generate surrogate data sets that are “typical realizations” of that model. The second approach concentrates not on the model but directly on the original data; it attempts to constrain the surrogate data sets so that they exactly agree with the original data for a specified set of sample statistics. Examples of these two approaches are provided for two simple cases: a test for deviations from a gaussian distribution, and a test for serial dependence in a time series. Additionally, we consider tests for nonlinearity in time series based on a Fourier transform (FT) method and on more conventional autoregressive moving-average (ARMA) fits to the data. The comparative performance of hypothesis testing schemes based on these two approaches is found to depend on whether or not the discriminating statistic is pivotal. A statistic is “pivotal” if its distribution is the same for all processes consistent with the null hypothesis. The typical-realization method requires that the discriminating statistic satisfy this property. The constrained-realization approach, on the other hand, does not share this requirement, and can provide an accurate and powerful test without having to sacrifice flexibility in the choice of discriminating statistic."

Bibliography

ABELSON, R.P., 1997. A retrospective on the significance test ban of 1999 (If there were no significance tests, they …. What if there were no significance tests. [Cited by 39] (4.35/year)
ABELSON, R.P., 1997. On the surprising longevity of flogged horses: Why there is a case for the significance test. Psychological Science. [Cited by 31] (3.46/year)
ABRAHAMOWICZ, M., T. MACKENZIE and J.M. ESDAILE, 1996. Time-Dependent Hazard Ratio: Modeling and Hypothesis Testing with Application in Lupus Nephritis. Journal of the American Statistical Association. [Cited by 45] (4.51/year)
ABT, K., 1981. Problems of repeated significance testing. Control Clin Trials. [Cited by 14] (0.56/year)
ABT, K., 1983. Significance testing of many variables. Problems and solutions. Neuropsychobiology. [Cited by 15] (0.65/year)
ADOLPHS, R., 2003. Investigating the cognitive neuroscience of social behavior. Neuropsychologia. [Cited by 18] (6.06/year)
AGUINIS, H., 2004. Regression analysis for categorical moderators. ecommerce.tandf.co.uk. [Cited by 13] (6.60/year)
AHLSWEDE, R. and I. CSISZAR, 1986. Hypothesis testing with communication constraints. Information Theory, IEEE Transactions on. [Cited by 25] (1.25/year)
AITKIN, M. and D.B. RUBIN, 1985. Estimation and Hypothesis Testing in Finite Mixture Models. Journal of the Royal Statistical Society. Series B ( …. [Cited by 81] (3.86/year)
AITKIN, M.O., 1997. The calibration of P-values, posterior Bayes factors and the AIC from the posterior distribution of …. Statistics and Computing. [Cited by 9] (1.00/year)
ALI, M.W., 1990. … of tumor prevalence in tumorigenicity experiments: A comparison of P-values for small frequency of …. Drug Information Journal. [Cited by 6] (0.38/year)
ALTHAM, P.M.E., 1969. Exact Bayesian Analysis of a 2*2 Contingency Table, and Fisher's "Exact" Significance Test. Journal of the Royal Statistical Society. Series B ( …. [Cited by 22] (0.60/year)
Altman, D. G. (1985). Discussion of Dr Chatfield's paper. J. R. Statist. Soc. A 148, Part 3 : 242.
ANDERSON, D.R., K.P. BURNHAM and W.L. THOMPSON, 2000. Null hypothesis testing: Problems, prevalence, and an alternative. The Journal of wildlife management. [Cited by 340] (56.98/year)
ANDERSON, G.J. and R.W. BLUNDELL, 1982. Estimation and Hypothesis Testing in Dynamic Singular Equation Systems. Econometrica. [Cited by 43] (1.79/year)
ANSCOMBE, F.J., 1956. Discussion on Dr. David's and Dr. Johnson's paper. Journal of the Royal Statistical Society, Series B 18 : 24-27. [Cited by 2] (0.04/year)
ANSLEY, C.F. and T.S. SHIVELY, 1991. Computing P-values for the generalized Durbin-Watson and other invariant test statistics. ideas.repec.org. [Cited by 12] (0.80/year)
ARBUTHNOTT, John, 1710. An Argument for Divine Providence, Taken from the Constant Regularity Observ'd in the Births of Both Sexes. By Dr. John Arbuthnott, Physitian in Ordinary to Her Majesty, and Fellow of the College of Physitians and the Royal Society, Philosophical Transactions (1683-1775), Vol. 27. (1710 - 1712), pp. 186-190. [Cited by 25] (0.04/year)
AZAR, B., 1999. APA statistics task force prepares to release recommendations for public comment. APA Monitor Online, 30 (5). [Cited by 4] (0.57/year)
BAGOZZI, R.P. and H. BAUMGARTNER, 1994. The evaluation of structural equation models and hypothesis testing. Principles of Marketing Research. [Cited by 89] (7.44/year)
BAILEY, T.L. and M. GRIBSKOV, Combining evidence using p-values: application to sequence homology searches. Bioinformatics. [Cited by 212] (?/year)
BAILEY, T.L. and W.N. GRUNDY, 1999. Classifying proteins by family using the product of correlated p-values. Proceedings of the third annual international conference on …. [Cited by 16] (2.30/year)
BAKAN, D., 1966. The test of significance in psychological research. Psychol Bull. [Cited by 110] (2.75/year)
BAKEMAN, R., B.F. ROBINSON and V. QUERA, 1996. Testing sequential association: Estimating exact p values using sampled permutations. Psychological Methods. [Cited by 12] (1.20/year)
BANASIEWICZ, A.D., 2005. Marketing pitfalls of statistical significance testing. Marketing Intelligence & Planning. [not cited] (0/year)
BARNARD, G.A., 1989. On alleged gains in power from lower P-values. Stat Med. [Cited by 16] (0.94/year)
Barndorff-Nielsen, O. (1977). Discussion of D. R. Cox's paper. Scand. J. Statist. 4 : 67-69.
BATANERO, C., 2000. Controversies Around the Role of Statistical Tests in Experimental Research. Mathematical Thinking and Learning. [Cited by 5] (0.84/year)
BATTERHAM, A.M. and W.G. HOPKINS, 2005. Making meaningful inferences about magnitudes. Sportscience. [Cited by 3] (3.10/year)
BAUGH, F., 2002. Correcting Effect Sizes for Score Reliability: A Reminder That Measurement and Substantive Issues …. Educational and Psychological Measurement. [Cited by 11] (2.77/year)
BAYARRI, M.J. and J.O. BERGER, 2000. P Values for Composite Null Models. Journal of the American Statistical Association. [Cited by 52] (8.71/year)
Beaven, E. S. (1935). Discussion on Dr. Neyman's Paper. Journal of the Royal Statistical Society, Supplement 2 : 159-161.
BECK-BORNHOLDT, H.P. and H.H. DUBBEN, 1994. Potential pitfalls in the use of p-values and in interpretation of significance levels. Radiother Oncol. [Cited by 17] (1.42/year)
BEHAVIOUR, A., 1999. Are significance thresholds appropriate for the study of animal behaviour? Animal Behaviour. [Cited by 10] (1.44/year)
BELIA, S., et al., 2005. Researchers misunderstand confidence intervals and standard error bars. Psychological Methods. [Cited by 10] (10.33/year)
BENGIO, S. and J. MARIETHOZ, A Statistical Significance Test for Person Authentication. The Speaker and Language Recognition Workshop (Odyssey). [Cited by 8] (?/year)
BERGER, J.O. and M. DELAMPADY, 1987. Testing Precise Hypotheses. Statistical Science. [Cited by 109] (5.75/year)
BERGER, J.O. and T. SELLKE, 1987. Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence. Journal of the American Statistical Association. [Cited by 114] (6.01/year)
BERGER, J.O. and T. SELLKE, 1987. Testing a point null hypothesis: The irreconcilability of P values and evidence (with discussion). Journal of the American Statistical Association. [Cited by 42] (2.21/year)
BERGER, R.L. and D.D. BOOS, 1994. P Values Maximized over a Confidence Set for the Nuisance Parameter. Journal of the American Statistical Association. [Cited by 46] (3.84/year)
BERGER, R.L., 1982. Multiparameter Hypothesis Testing and Acceptance Sampling. Technometrics. [Cited by 42] (1.75/year)
BERGER, R.L., 1996. More Powerful Tests from Confidence Interval P Values. The American Statistician. [Cited by 22] (2.21/year)
BERKSON, J., 2003. Tests of significance considered as evidence. International Journal of Epidemiology. [Cited by 48] (16.16/year)
BERKSON, Joseph, 1938. Some Difficulties of Interpretation Encountered in the Application of the Chi-Square Test, Journal of the American Statistical Association, Vol. 33, No. 203. (Sep., 1938), pp. 526-536. [Cited by 38] (0.56/year)
BERNDT, E.R. and N.E. SAVIN, 1975. Estimation and Hypothesis Testing in Singular Equation Systems with Autoregressive Disturbances. Econometrica. [Cited by 85] (2.74/year)
BEROZA, M. and M.C. BOWMAN, 1965. Identification of Pesticides at Nanogram Level by Extraction p-Values. Analytical Chemistry. [Cited by 13] (0.32/year)
BESAG, J. and P. CLIFFORD, 1991. Sequential Monte Carlo p-values. Biometrika. [Cited by 27] (1.80/year)
BEZEAU, S.A. and R.A. GRAVES, 2001. Statistical Power and Effect Sizes of Clinical Neuropsychology Research. Journal of Clinical and Experimental Neuropsychology. [Cited by 11] (2.21/year)
BIAGI, G.L., et al., 1975. Rm values of phenols. Their relation with log P values and activity. Journal of Medicinal Chemistry. [Cited by 8] (0.26/year)
BIRNBAUM, A., 1961. Confidence Curves: An Omnibus Technique for Estimation and Testing Statistical Hypotheses. Journal of the American Statistical Association 56, 246-249. [Cited by 6] (0.13/year)
BLAHUT, R., 1974. Hypothesis testing and information theory. Information Theory, IEEE Transactions on. [Cited by 46] (1.44/year)
BLAICH, C.F., 2000. The null-hypothesis significance-test procedure: Can't live with it, can't live without it. Behavioral and Brain Sciences. [Cited by 2] (0.34/year)
BLAIR, R.C. and W. KARNISKI, 1993. An alternative method for significance testing of waveform difference potentials. Psychophysiology. [Cited by 55] (4.24/year)
BLOSTEIN, S.D. and T.S. HUANG, 1991. Detecting small, moving objects in image sequences using sequential hypothesis testing. Signal Processing, IEEE Transactions on [see also Acoustics, …. [Cited by 84] (5.61/year)
BOARDMAN, T.J., 1994. The Statistician Who Changed the World: W. Edwards Deming, 1900-1993, The American Statistician 48(3) : 179-187. [Cited by 2] (0.17/year)
BOBKOSKI, M.J., 1983. Hypothesis testing in nonstationary time series. Dissertation Abstracts International Part B: Science and …. [Cited by 24] (1.04/year)
BORENSTEIN, M., COHEN, J., & ROTHSTEIN, H. (in press). Confidence intervals, effect size, and power [Computer program]. Hillsdale, NJ: Erlbaum.
BOWMAN, M.C. and M. BEROZA, 1966. Identification of Compounds by Extraction p-Values Using Gas Chromatography. Analytical Chemistry. [Cited by 7] (0.18/year)
BOX, G.E.P., 1976. Science and Statistics. Journal of the American Statistical Association 71 : 791-799. [Cited by 67] (2.24/year)
BOX, G.E.P., 1982. An apology for ecumenism in statistics In Scientific Inference, Data Analysis, and Robustness, G. E. P. Box, T. Leonard and C. F. Wu (eds.), Academic Press, Inc. : 51- 84. [Cited by 16] (0.67/year)
BRAITHWAITE, R.B., 1946. Scientific Explanation: A Study of the Function of Theory. Probability and Law in Science Based upon the Turner …. [Cited by 2] (0.03/year)
BRANDSTÄTTER, Eduard, 1999. Confidence Intervals as an Alternative to Significance Testing. mpr-online.de. [Cited by 5] (0.71/year)
BRANDSTÄTTER, E., 1999. Konfidenzintervalle als Alternative zu Signifikanztests. Methods of Psychological Research Online, 4 (2). [Cited by 2] (0.29/year)
BRAUMOELLER, B.F., 2004. Hypothesis Testing and Multiplicative Interaction Terms. International Organization. [Cited by 28] (14.23/year)
BRYAN-JONES, J. and D.J. FINNEY, 1983. On an error in “Instructions to Authors”. HortScience 18(3) : 279-282. [Cited by 4] (0.17/year)
BUCHANAN-WOLLASTON, H., 1935. The philosophic basis of statistical analysis. Journal of the International Council for the Exploration of the Sea 10 : 249-263. [Cited by 1] (0.01/year)
BURTON, P.R., L.C. GURRIN and M.J. CAMPBELL, 1998. Clinical significance not statistical significance: a simple Bayesian alternative to p values. Journal of Epidemiology & Community Health. [Cited by 22] (2.76/year)
BUSHWAY, S.D., G. SWEETEN and D.B. WILSON, 2006. Size matters: Standard errors in the application of null hypothesis significance testing in …. Journal of Experimental Criminology. [not cited] (0/year)
CAHAN, S., 2000. Statistical significance is not a “kosher certificate” for observed effects: A critical analysis …. Educational Researcher. [Cited by 4] (0.67/year)
CAMILLERI, S.F., 1962. Theory, Probability, and Induction in Social Research. American Sociological Review 27 : 170-178. Reprinted in The Significance Test Controversy - A Reader, Eds. D. E. Morrison and R. E. Henkel, 1970, Aldine Publishing Company (Butterworth Group). [Cited by 8] (0.18/year)
CAPRARO, R.M. and M.M. CAPRARO, 2002. Treatments of Effect Sizes and Statistical Significance Tests in Textbooks. Educational and Psychological Measurement. [Cited by 4] (1.01/year)
CARVER, R.P., 1978. The Case against Statistical Significance Testing. Harvard Educational Review. [Cited by 127] (4.54/year)
CARVER, R.P., 1993. The Case Against Statistical Significance Testing, Revisited. Journal of Experimental Education. [Cited by 74] (5.71/year)
CASELLA, G. and R.L. BERGER, 1987. Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence: Rejoinder. Journal of the American Statistical Association 82(397) : 133-135. [not cited] (0/year)
CASTILLO-DAVIS, C.I. and D.L. HARTL, 2003. GeneMerge--post-genomic analysis, data mining, and hypothesis testing. Bioinformatics. [Cited by 38] (12.81/year)
Chatfield, C. (1989). Comments on the paper by McPherson. Journal of the Royal Statistical Society, Series A, 152 : 234-238.
CHATFIELD, C., 1985. The initial examination of data. Discussion of Dr Chatfield's paper. Journal of the Royal Statistical Society A 148, Part 3 : 214-253. [Cited by 12] (0.57/year)
CHERNOFF, H., 1986. [Why Isn't Everyone a Bayesian?]: Comment The American Statistician 40(1) : 5-6. [Cited by 2] (0.10/year)
CHEW, V., 1976. Comparing treatment means: a compendium, HortScience 11(4) : 348-357. [Cited by 51] (1.70/year)
CHEW, V., 1980. Testing differences among means: Correct interpretation and some alternatives. HortScience 15(4) : 467-470. [Cited by 3] (0.12/year)
CHOW, S.L., 1988. Significance test or effect size. Psychological Bulletin. [Cited by 38] (2.11/year)
CHOW, S.L., 1999. In Defence of Significance Tests. Psycoloquy. [Cited by 4] (0.57/year)
CHOW, S.L., 2000. Précis of Statistical significance: Rationale, validity, and utility. Behavioral and Brain Sciences. [Cited by 111] (18.60/year)
CHOW, S.L., 2000. The null-hypothesis significance-test procedure is still warranted, Behavioral and Brain Sciences. [Cited by 1] (0.17/year)
CICCHETTI, D.V.D., 1998. Role of Null Hypothesis Significance Testing (NHST) in the Design of Neuropsychologic Research. Journal of Clinical and Experimental Neuropsychology. [Cited by 6] (0.75/year)
CLARK, C.A., 1963. Hypothesis Testing in Relation to Statistical Methodology. Review of Educational Research. [Cited by 6] (0.14/year)
COCHRAN, W.G. and G.M. COX, 1992. Experimental designs. Wiley New York. [Cited by 1389] (99.43/year)
COHEN, J., 1962. The statistical power of abnormal-social psychological research: a review. Journal of Abnormal and Social Psychology 65 : 145-153. [Cited by 164] (3.73/year)
COHEN, J., 1988. Statistical Power Analysis for the Behavioral Sciences. Hillsdale, NJ: Erlbaum. [Cited by 8611] (479.24/year)
COHEN, J., 1990. Things I have learned (so far). American Psychologist. [Cited by 299] (18.73/year)
COHEN, J., 1994. The earth is round (p < .05). American Psychologist. [Cited by 515] (43.03/year)
COHEN, J., 1995. The earth is round (p < .05): Rejoinder. American Psychologist. [Cited by 6] (0.55/year)
COHEN, J., 1997. The earth is round (p < .05). What if there were no significance tests. [Cited by 1] (0.11/year)
Cormack, R. M. (1985). Discussion of Dr Chatfield's paper. J. R. Statist. Soc. A 148, Part 3 : 231-233.
CORTINA, J.M. and W.P. DUNLAP, 1997. On the logic and purpose of significance testing. Psychological Methods. [Cited by 40] (4.46/year)
COX, D.R. and C. COX, 1981. Applied Statistics Principles and Examples, Chapman and Hall. [Cited by 59] (2.36/year)
COX, D.R., 1958. Some Problems Connected with Statistical Inference, The Annals of Mathematical Statistics 29 : 357-372. [Cited by 60] (1.25/year)
COX, D.R., 1977. The role of significance tests. Scand. J. Statist 4 : 49-70. [Cited by 37] (1.28/year)
COX, D.R., 1982. Statistical significance tests, Br J Clin Pharmacol 14 : 325-331. [Cited by 12] (0.50/year)
CUMMING, G. and S. FINCH, 2001. A Primer on the Understanding, Use, and Calculation of Confidence Intervals That Are Based on …. Educational and Psychological Measurement. [Cited by 63] (12.68/year)
DALES, L.G. and H.K. URY, 1978. An improper use of statistical significance testing in studying covariables. Int J Epidemiol. [Cited by 26] (0.93/year)
DANIEL, L.G., 1998. The statistical significance controversy is definitely not over: A rejoinder to responses by …. Research in the Schools. [Cited by 6] (0.75/year)
DANIEL, L.G., 1998. Statistical significance testing: A historical overview of misuse and misinterpretation with …. Research in the Schools. [Cited by 20] (2.51/year)
DASS, Sarat C. and Jaeyong LEE, A Note on the Consistency of Bayes Factors for Testing Point Null versus Nonparametric Alternatives [not cited]
DAVID, H. and A. EDWARDS, 2001. The First Formal Significance Test-Comments on Arbuthnott 1710. Annotated Readings from the History of Statistics. [Cited by 3] (0.60/year)
DAVIES, R.B., 1987. Hypothesis testing when a nuisance parameter is present only under the alternative. Biometrika. [Cited by 376] (19.82/year)
DE, C.A. and J.M. STERN, 1999. Evidence and Credibility: Full Bayesian Significance Test for Precise Hypotheses. Entropy Journal. [Cited by 16] (2.30/year)
DELGADO, M.A. and W.G. MANTEIGA, 2001. Significance Testing in Nonparametric Regression Based on the Bootstrap. The Annals of Statistics. [Cited by 21] (4.23/year)
DEMPSTER, A.P., 1958. A High Dimensional Two Sample Significance Test. The Annals of Mathematical Statistics. [Cited by 7] (0.15/year)
DEMPSTER, A.P.B.N., 1997. The direct use of likelihood for significance testing. Statistics and Computing. [Cited by 33] (3.68/year)
DENIS, D.J., 2003. Alternatives to null hypothesis significance testing. Theory and Science. [Cited by 3] (1.01/year)
DI, J., F. FIDLER and G. CUMMING, Effect size estimates and confidence intervals: An alternative focus for the presentation and …. latrobe.edu.au. [not cited] (?/year)
DICKEY, D.A. and R.J. ROSSANA, 1994. Cointegrated Time Series: A Guide to Estimation and Hypothesis Testing. Oxford Bulletin of Economics and Statistics. [Cited by 25] (2.09/year)
DICKEY, D.A., 1976. Estimation and hypothesis testing in nonstationary time series. [Cited by 40] (1.33/year)
DONNER, A. and M. ELIASZIW, 1992. … for the kappa statistic: confidence interval construction, significance-testing and sample size …. Stat Med. [Cited by 45] (3.22/year)
DOUGLAS, J.A., L.A. ROUSSOS and W. STOUT, 1996. Item-Bundle DIF Hypothesis Testing: Identifying Suspect Bundles and Assessing Their Differential …. Journal of Educational Measurement. [Cited by 29] (2.91/year)
DOUGLAS, M.E. and W.J. MATTHEWS, 1992. Does morphology predict ecology? hypothesis testing within a freshwater stream fish assemblage. Oikos. [Cited by 33] (2.36/year)
DOUTHWAITE, W.A., et al., 1999. The EyeSys videokeratoscopic assessment of apical radius and p-value in the normal human cornea. Ophthalmic and Physiological Optics. [Cited by 11] (1.58/year)
DOW, G.S., 2003. Effect of sample size and P-value filtering techniques on the detection of transcriptional changes …. Malaria Journal. [Cited by 7] (2.36/year)
DUDBRIDGE, F. and B.P.C. KOELEMAN, 2003. Rank truncated product of P-values, with application to genomewide association scans. Genetic Epidemiology. [Cited by 16] (5.39/year)
DUDOIT, S., J.P. SHAFFER and J.C. BOLDRICK, 2003. Multiple hypothesis testing in microarray experiments. Statistical Science. [Cited by 182] (61.34/year)
DUNCAN, D.B., 1951. A Significance Test for Differences Between Ranked Treatments in an Analysis of Variance. Virginia Polytechnic Institute. [Cited by 39] (0.71/year)
DUNNETT, C.W. and M. GENT, 1977. Significance Testing to Establish Equivalence between Treatments, with Special Reference to Data in …. Biometrics. [Cited by 98] (3.38/year)
DZHAPARIDZE, K.O., 1985. Parameter Estimation and Hypothesis Testing in Spectral Analysis of Stationary Time Series. books.google.com. [Cited by 43] (2.05/year)
EDWARDS, W., H. LINDMAN and L.J. SAVAGE, 1963. Bayesian statistical inference for psychological research. Psychological Review 70 : 193-242. [Cited by 195] (4.54/year)
EFRON, B., 2004. Large-Scale Simultaneous Hypothesis Testing: The Choice of a Null Hypothesis. Journal of the American Statistical Association. [Cited by 69] (35.08/year)
EGGERS, Andrew C., 2005. Notes on Inference: Misconceptions about Bayesian and Frequentist Inference
ELLIOTT, R. and R.J. DOLAN, 1998. … of different anterior cingulate foci in association with hypothesis testing and response selection. Neuroimage. [Cited by 78] (9.79/year)
ELSTON, R.C., 1994. P values, power, and pitfalls in the linkage analysis of psychiatric disorders. Genetic approaches to mental disorders. Proceedings of the …. [Cited by 12] (1.00/year)
EMERSON, S.S. and T.R. FLEMING, 1990. Parameter estimation following group sequential hypothesis testing. Biometrika. [Cited by 49] (3.07/year)
EPSTEIN, B.R., et al., 1993. … in linear integrated circuits: an application of discrimination analysis and hypothesis testing. Computer-Aided Design of Integrated Circuits and Systems, …. [Cited by 38] (2.93/year)
ERWIN, E., 2000. The logic of null hypothesis testing. Behavioral and Brain Sciences. [Cited by 14] (2.35/year)
EVANS, S.J., P. MILLS and J. DAWSON, 1988. The end of the p value? Br Heart J. [Cited by 16] (0.89/year)
FAIRBANKS, K. and R. MADSEN, 1982. P values for tests using a repeated significance test design. Biometrika. [Cited by 16] (0.67/year)
FALK, R. and C.W. GREENBAUM, 1995. Significance tests die hard: The amazing persistence of a probabilistic misconception, Theory and Psychology. [Cited by 53] (4.83/year)
FARRIS, J.S., M. KÄLLERSJÖ, A.G. KLUGE and C. BULT, 1995a. Constructing a significance test for …. Syst. Biol. [Cited by 10] (?/year)
FARRIS, J.S., et al., 1995. Constructing a Significance Test for Incongruence. Systematic Biology. [Cited by 296] (26.99/year)
FEARON, J.D., 1991. Counterfactuals and Hypothesis Testing in Political Science. World Politics. [Cited by 100] (6.68/year)
FEINSTEIN, A.R., 1998. P-values and confidence intervals: two sides of the same unsatisfactory coin. J Clin Epidemiol. [Cited by 15] (1.88/year)
FEISE, R.J., 2002. Do multiple outcome measures require p-value adjustment. BMC Med Res Methodol. [Cited by 15] (3.78/year)
FERNANDEZ-DUQUE, E., 1997. Comparing and combining data across studies: Alternatives to significance testing. Oikos. [Cited by 6] (0.67/year)
FIDLER, F. and B. THOMPSON, 2001. COMPUTING CORRECT CONFIDENCE INTERVALS FOR ANOVA FIXED-AND RANDOM-EFFECTS EFFECT SIZES. Educational and Psychological Measurement. [Cited by 23] (4.63/year)
FIDLER, F. and G. CUMMING, Teaching Confidence Intervals: Problems and Potential Solutions. stat.auckland.ac.nz. [not cited] (?/year)
FIDLER, F., 2002. The Fifth Edition of the APA Publication Manual: Why Its Statistics Recommendations Are So …. Educational and Psychological Measurement. [Cited by 19] (4.79/year)
FIDLER, F., et al., 2004. Editors can lead researchers to confidence intervals, but can't make them think. Psychological Science. [Cited by 3] (1.52/year)
FINCH, S., N. THOMASON and G. CUMMING, 2002. Past and Future American Psychological Association Guidelines for Statistical Practice. Theory & Psychology. [Cited by 17] (4.28/year)
FINDINGS, P., Why Most Published Research Findings Are False. evogen.bio.uci.edu. [not cited] (?/year)
FINNEY, D. J. (1988). Was this in your statistics textbook? III. Design and analysis. Expl Agric. 24 :421-432.
FINNEY, D. J. (1989b). Is the statistician still necessary? Biom. Praxim. 29 : 135-146.
FINNEY, D.J., 1989. Was this in your statistics text book? VI Regression and covariance, Experimental Agriculture 25 : 291-311. [Cited by 2] (0.12/year)
FISHER, R.A., 1922. On the Mathematical Foundations of Theoretical Statistics, Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, Vol. 222. (1922), pp. 309-368. [Cited by 347] (4.13/year)
FISHER, R.A., 1925. Statistical Methods for Research Workers, Oliver and Boyd (London). [Cited by 1820] (22.47/year)
FISHER, R.A., 1951. The Design of Experiments. Oliver and Boyd, Edinburgh, London. [Cited by 8] (0.15/year)
FISHER, R.A., 1959. Statistical methods and scientific inference. Hafner Publishing. [Cited by 305] (9.25/year)
FLORES-VILLELA, O., 2000. Multiple Data Sets, Congruence, and Hypothesis Testing for the Phylogeny of Basal Groups of the …. Systematic Biology. [Cited by 32] (5.36/year)
FLYNN, P.J. and A.K. JAIN, 1988. Surface classification: hypothesis testing and parameter estimation. Computer Vision and Pattern Recognition, 1988. Proceedings …. [Cited by 40] (2.23/year)
FORSTER, Malcolm R. and Ann B. WOLFE, Whewell’s Theory of Hypothesis Testing and a Relational View of Evidence [not cited]
FORSTER, Malcolm R., Optional Stopping, 1998. [not cited]
FRALEY, R. Chris, The Statistical Significance Testing Debate: A Critical Analysis of the Controversy and Solutions
FRASER, D.A.S., A. WONG and J. WU, 1999. Regression Analysis, Nonlinear or Nonnormal: Simple and Accurate P Values from Likelihood Analysis. Journal of the American Statistical Association. [Cited by 9] (1.29/year)
FREEDMAN, D.A. and D. LANE, 1983. Significance testing in a nonstochastic setting. A Festschrift for Erich L. Lehmann. [Cited by 9] (0.39/year)
FREEMAN, P.R., 1915. The role of p-values in analysing trial results. Stat Med. [Cited by 16] (0.18/year)
FRICK, R.W., 1996. The appropriate use of null hypothesis testing. Psychological Methods. [Cited by 46] (4.62/year)
FRISTON, K.J., et al., 1995. Statistical parametric maps: Confidence intervals on p-values. Human Brain Mapping. [Cited by 8] (0.73/year)
GÓMEZ-VILLEGAS, M.A. and L. SANZ, 1998. Reconciling Bayesian and frequentist evidence in the point null testing problem. Test. [Cited by 6] (0.75/year)
GABOR, George, 2004. Classical Statistics: Smoke and Mirrors
GADRE, S.R. and S.J. CHAKRAVORTY, 1986. Some rigorous inequalities among the Weizsacker correction and atomic ⟨r^n⟩ and ⟨p^n⟩ values. The Journal of Chemical Physics. [Cited by 9] (0.45/year)
GARCIA-BERTHOU, E. and C. ALCARAZ, 2004. Incongruence between test statistics and P values in medical papers. BMC Medical Research Methodology. [Cited by 8] (4.07/year)
GARDNER, M.J. and D.G. ALTMAN, 1986. Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med J (Clin Res Ed). [Cited by 306] (15.33/year)
GARDNER, M.J. and D.G. ALTMAN, 1989. Estimation rather than hypothesis testing: confidence intervals rather than P values. Statistics with confidence-Confidence intervals and …. [Cited by 13] (0.77/year)
GARDNER, M.J. and D.G. ALTMAN, 2001. Confidence intervals rather than P values. Statistics with Confidence. 2nd ed. Bristol: Br Med J. [Cited by 13] (2.62/year)
GAUCH, H.G., 1988. Model Selection and Validation for Yield Trials with Interaction, Biometrics 44 : 705-715. [Cited by 102] (5.68/year)
GEARY, R.C., 1947. Testing for Normality, Biometrika 34 : 209-242. [Cited by 37] (0.63/year)
GIBBONS, J.D. and J.W. PRATT, 1975. P-Values: Interpretation and Methodology. The American Statistician. [Cited by 26] (0.84/year)
GIGERENZER, G., 2000. We need statistical thinking, not statistical rituals. Behavioral and Brain Sciences. [Cited by 12] (2.01/year)
GIGERENZER, G., 2004. Mindless statistics. The Journal of Socio-Economics. [Cited by 5] (2.54/year)
GIGERENZER, G., S. KRAUSS and O. VITOUCH, 2004. The null ritual: What you always wanted to know about significance testing but were afraid to ask. The SAGE handbook of quantitative methodology for the social …. [Cited by 44] (22.37/year)
GIGERENZER, Gerd, 1993. The superego, the ego, and the id in statistical reasoning. In: Keren G, Lewis C, eds. A Handbook for Data Analysis in the Behavioral Sciences, Hillsdale, NJ: Lawrence Erlbaum; 1993:311-339. [Cited by 75] (5.78/year)
GILL, J., 1999. The Insignificance of Null Hypothesis Significance Testing. Political Research Quarterly. [Cited by 35] (5.02/year)
GILL, Jeff, How Do We Do Hypothesis Testing?
GLASER, D.N., 1999. The controversy of significance testing: Misconceptions and alternatives. American Journal of Critical Care. [Cited by 2] (0.29/year)
GLINER, J.A., 2001. Null Hypothesis Significance Testing: Effect Size Matters. Human Dimensions of Wildlife. [Cited by 12] (2.42/year)
GLINER, J.A., et al., 2001. Problems with null hypothesis significance testing. J Am Acad Child Adolesc Psychiatry. [Cited by 5] (1.01/year)
GLINER, J.A., N.L. LEECH and G.A. MORGAN, 2002. Problems with null hypothesis significance testing (NHST): What do the textbooks say. The Journal of Experimental Education. [Cited by 5] (1.26/year)
GLOVER, S.B. and P.B. DIXON, 2001. Motor adaptation to an optical illusion. Experimental Brain Research. [Cited by 26] (5.23/year)
GLOVER, S.F. and P.F. DIXON, 2002. Semantics affect the planning but not control of grasping. Experimental Brain Research. [Cited by 10] (2.52/year)
GODINO, J.D., C. BATANERO and R.G. JAIMEZ, 2001. The statistical consultancy workshop as a pedagogical tool. Training Researchers in the Use of Statistics. [Cited by 8] (1.61/year)
GOLDIN, L.R., G.A. CHASE and A.F. WILSON, 1999. Regional inference with averaged P values increases the power to detect linkage. Genetic Epidemiology. [Cited by 12] (1.72/year)
Good, I. J., 1983. Good thinking: the foundations of probability and its applications. University of Minnesota Press. [Cited by 99] (4.31/year)
GOOD, I.J., 1967. A Bayesian Significance Test for Multinomial Distributions. Journal of the Royal Statistical Society. Series B ( …. [Cited by 9] (0.23/year)
GOOD, I.J., 1967. A Bayesian significance test for multinomial distributions (with discussion). J. Roy. Statist. Soc. Ser. B. [Cited by 11] (0.28/year)
GOODMAN, S.N., 1992. A comment on replication, p-values and evidence. Stat Med. [Cited by 20] (1.43/year)
GOODMAN, S.N., 1993. P values, hypothesis tests, and likelihood: implications for epidemiology of a neglected historical …. American Journal of Epidemiology. [Cited by 25] (1.93/year)
GOODMAN, S.N., 1999. Toward evidence-based medical statistics. 1: The P value fallacy. Ann Intern Med. [Cited by 121] (17.37/year)
GOODMAN, S.N., 2001. Of P-values and Bayes: a modest proposal. Epidemiology. [Cited by 7] (1.41/year)
GORARD, S., 2003. Understanding Probabilities and Re-Considering Traditional Research Training. Sociological Research Online. [Cited by 10] (3.37/year)
GOULD, A.L., 2001. Two cheers for P-values? by S Senn. Journal of Epidemiology and Biostatistics. [not cited] (0/year)
GOUREVITCH, V.V. and E.V. GALANTER, 1967. A significance test for one parameter isosensitivity functions. Psychometrika. [Cited by 17] (0.44/year)
GRAYBILL, F.A., 1976. Theory and application of the linear model. duxbury.com. [Cited by 291] (9.71/year)
GREEN, C.D., 2002. Comment on Chow's "Issues in Statistical Inference". History. [not cited] (0/year)
GREENWALD AG, GONZALEZ R, HARRIS RJ, Guthrie D, Effect sizes and p values: what should be reported and what should be replicated? Psychophysiology. 1996 Mar;33(2):175-83. [Cited by 34]
GREENWALD, A.G., 1975. Consequences of prejudice against the null hypothesis. Psychological Bulletin 82, 1-20. [Cited by 112] (3.62/year)
GREENWALD, A.G., et al., 1996. Effect sizes and p values: what should be reported and what should be replicated? Psychophysiology. [Cited by 85] (8.53/year)
GRIES, S.T., 2005. Null-hypothesis significance testing of word frequencies: a follow-up on Kilgarriff. Corpus Linguistics and Linguistic Theory. [Cited by 1] (1.03/year)
GROGAN, P., 2005. The Use of Hypotheses in Ecology. British Ecological Society Bulletin. [not cited] (0/year)
GUERRA, R., et al., 1999. Meta-analysis by combining p-values: simulated linkage studies. Genet Epidemiol. [Cited by 17] (2.44/year)
GUTHERY, F.S., J.J. LUSK and M.J. PETERSON, 2001. The fall of the null hypothesis: Liabilities and opportunities. The Journal of wildlife management. [Cited by 22] (4.43/year)
GUTHERY, F.S., J.J. LUSK and M.J. PETERSON, 2004. In my opinion: hypotheses in wildlife science. Wildlife Society Bulletin. [Cited by 2] (1.02/year)
GUTHRIE, D. and J.S. BUCHWALD, 1991. Significance testing of difference potentials. Psychophysiology. [Cited by 67] (4.48/year)
Hacking, I. (1965). Logic of Statistical Inference. Cambridge University Press.
HAGEN, R.L., 1997. In Praise of the Null Hypothesis Statistical Test, American Psychologist, 52, 15-24. [not cited] (0/year)
HAGER, W., 2000. About some misconceptions and the discontent with statistical tests in psychology. Methods of Psychological Research Online, 5. [Cited by 4] (0.67/year)
HAHN, G.J., 1990. Commentary:[Communications between Statisticians and Engineers/Physical Scientists], Technometrics 32(3) : 257-258. [Cited by 2] (0.13/year)
HAKSTIAN, A.R. and T.E. WHALEN, 1976. A k-sample significance test for independent alpha coefficients. Psychometrika. [Cited by 34] (1.13/year)
HALANYCH, K.M., 1998. Considerations for Reconstructing Metazoan History: Signal, Resolution, and Hypothesis Testing. Integrative and Comparative Biology. [Cited by 25] (3.14/year)
HALL, P. and S.R. WILSON, 1991. Two Guidelines for Bootstrap Hypothesis Testing. Biometrics. [Cited by 82] (5.48/year)
HALLAHAN, M., 1999. The Hazards of Mechanical Hypothesis Testing. Psycoloquy. [Cited by 2] (0.29/year)
HALLER, H. and S. KRAUSS, 2002. Misinterpretations of significance: A problem students share with their teachers. Methods of Psychological Research Online. [Cited by 16] (4.02/year)
HAMILTON, W.C., 1964. Statistics in physical science. Estimation, hypothesis testing, and least squares. New York: Ronald Press, 1964. [Cited by 259] (6.17/year)
HAMM, R.M., 1998. Characterizing Individual Strategies Illuminates Nonoptimal Behavior. Psycoloquy. [Cited by 4] (0.50/year)
HANSEN, B.E., 1997. Approximate Asymptotic P Values for Structural-Change Tests. Journal of Business & Economic Statistics. [Cited by 124] (13.83/year)
HARLOW, L.L., 1997. Significance Testing Introduction and Overview. What if there were no significance tests. [Cited by 12] (1.34/year)
HARLOW, L.L., S.A. MULAIK and J.H. STEIGER, 1997. What If There Were No Significance Tests?. erlbaum.com. [Cited by 151] (16.84/year)
HARLOW, LL, SA Mulaik, JH Steiger, LL Harlow, "What If There Were No Significance Tests?" [Cited by 82] [book]
HARRIS, E.K., 1993. On P values and confidence intervals (why can't we P with more confidence?). Clin Chem. [Cited by 8] (0.62/year)
HARRIS, R.J., 1997. Reforming significance testing via three-valued logic. What if there were no significance tests. [Cited by 10] (1.12/year)
HARTMANN, C., et al., 1995. Reappraisal of Hypothesis Testing for Method Validation: Detection of Systematic Error by Comparing …. Analytical Chemistry. [Cited by 45] (4.10/year)
Healy, M. J. R. (1989). Comments on the paper by McPherson. Journal of the Royal Statistical Society, Series A, 152 : 232-234.
HEALY, M.J.R., 1978. Is Statistics a Science?. Journal of the Royal Statistical Society. Series A (General) 141, Part 3 : 385-393. [Cited by 10] (0.36/year)
HEDGES, S.B., 1992. … of replications needed for accurate estimation of the bootstrap P value in phylogenetic studies. Mol Biol Evol. [Cited by 157] (11.24/year)
HENKEL, R.E. and D.E. MORRISON, 1971. On the Nonutility of Significance Tests: A Clarification. The Pacific Sociological Review. [not cited] (0/year)
HERTWIG, R. and P.M. TODD, 2000. … to the Left, Fallacies to the Right: Stuck in the Middle With Null Hypothesis Significance Testing. Psycoloquy. [Cited by 5] (0.84/year)
HERTWIG, R. and P.M. TODD, 2000. … TO THE RIGHT: STUCK IN THE MIDDLE WITH NULL HYPOTHESIS SIGNIFICANCE TESTING Commentary on Krueger …. Psycoloquy, 11 (28). [Cited by 1] (0.17/year)
HILSENBECK, S.G. and G.M. CLARK, 1996. Practical p-value adjustment for optimally selected cutpoints. Stat Med. [Cited by 32] (3.21/year)
HINKLEY, D.V., 1987. Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence: Comment. Journal of the American Statistical Association 82(397) : 128-129. [not cited] (0/year)
HOBBS, N.T. and R. HILBORN. Alternatives to statistical hypothesis testing in ecology: a guide to self teaching. Ecological Applications. [Cited by 1] (?/year)
HOCHBERG, Y. and Y. BENJAMINI, 1990. More powerful procedures for multiple significance testing. Stat Med. [Cited by 134] (8.39/year)
HOCHBERG, Y., 1988. A sharper Bonferroni procedure for multiple significance testing. Biometrika. [Cited by 21] (1.17/year)
HODGES, J.L. and E.L. LEHMANN, 1954. Testing the Approximate Validity of Statistical Hypotheses. Journal of the Royal Statistical Society. Series B ( …., 16 : 261-268. [Cited by 13] (0.25/year)
HOEL, P.G., 1937. A Significance Test for Component Analysis. The Annals of Mathematical Statistics. [Cited by 8] (0.12/year)
HOFFMAN, P.F. and D.P. SCHRAG, 2002. The snowball Earth hypothesis: testing the limits of global change. Terra Nova. [Cited by 129] (32.52/year)
Hogben, L. (1957a). The contemporary crisis or the uncertainties of uncertain inference. Statistical Theory, W. W. Norton & Co., Inc. Reprinted in The Significance Test Controversy - A Reader, Eds. D. E. Morrison and R. E. Henkel, 1970, Aldine Publishing Company (Butterworth Group).
Hogben, L. (1957b). Statistical prudence and statistical inference. Statistical Theory, W. W. Norton & Co., Inc. Reprinted in The Significance Test Controversy - A Reader, Eds. D. E. Morrison and R. E. Henkel, 1970, Aldine Publishing Company (Butterworth Group).
HOIJTINK, H., 1998. … ANALYSIS USING THE GIBBS SAMPLER AND POSTERIOR PREDICTIVE P-VALUES: APPLICATIONS TO EDUCATIONAL …. Statistica Sinica. [Cited by 10] (1.26/year)
HOLMES, A. and I. FORD, 1993. A Bayesian approach to significance testing for statistic images from PET. Quantification of Brain Function: Tracer Kinetics and Image …. [Cited by 10] (0.77/year)
HONG, Y., 1999. Hypothesis Testing in Time Series Via the Empirical Characteristic Function: A Generalized Spectral …. Journal of the American Statistical Association. [Cited by 30] (4.31/year)
HOPE, A.C.A., 1968. A Simplified Monte Carlo Significance Test Procedure. Journal of the Royal Statistical Society. Series B ( …. [Cited by 104] (2.74/year)
HOWARD, G.S., S.E. MAXWELL and K.J. FLEMING, 2000. The proof of the pudding: An illustration of the relative strengths of null hypothesis, meta- …. Psychological Methods. [Cited by 13] (2.18/year)
HOWSON, Colin and Peter URBACH, 1993. Scientific Reasoning: The Bayesian Approach, second edition.
HUBBARD, R. and J.S. ARMSTRONG, 1997. Publication Bias Against Null Results. Psychological Reports. [Cited by 5] (0.56/year)
HUBBARD, R. and P.A. RYAN, 2000. The Historical Growth of Statistical Significance Testing in Psychology-and Its Future Prospects. Educational and Psychological Measurement. [Cited by 10] (1.68/year)
HUBBARD, Raymond and J. Scott ARMSTRONG, 2005. Why We Don't Really Know What “Statistical Significance” Means: A Major Educational Failure. hops.wharton.upenn.edu. [not cited] (?/year)
HUBERTY, C.J., 2002. A History of Effect Size Indices. Educational and Psychological Measurement. [Cited by 12] (3.02/year)
HUBERTY, Carl J., 1993. Historical Origins of Statistical Testing Practices: the Treatment of Fisher Versus Neyman-Pearson views in textbooks, Journal of Experimental Education, 61, 317-333. [Cited by 16] (1.23/year)
HUELSENBECK, J.P. and K.A. CRANDALL, 1997. PHYLOGENY ESTIMATION AND HYPOTHESIS TESTING USING MAXIMUM LIKELIHOOD. Annual Review of Ecology and Systematics. [Cited by 468] (52.19/year)
HUNG, H.M.J., et al., 1997. The Behavior of the P-Value When the Alternative Hypothesis is True. Biometrics. [Cited by 17] (1.90/year)
HUNTER, J.E. and F.L. SCHMIDT, 1996. Cumulative research knowledge and social policy formulation: The critical role of meta-analysis. Psychology, Public Policy, and Law. [Cited by 11] (1.10/year)
HUNTER, J.E., 1997. Needed: A ban on the significance test. Psychological Science. [Cited by 62] (6.91/year)
HUNTER, J.E., 2000. Testing significance testing: A flawed defense. Behavioral and Brain Sciences. [Cited by 11] (1.84/year)
HUNTER, J.S., 1990. Commentary:[Communications between Statisticians and Engineers/Physical Scientists], Technometrics 32(3) : 261. [Cited by 2] (0.13/year)
INGSTER, Y.I., 1993. Asymptotically minimax hypothesis testing for nonparametric alternatives, I. Math. Methods Statist. [Cited by 85] (6.56/year)
INMAN, H.F., 1994. Karl Pearson and RA Fisher on Statistical Tests: A 1935 Exchange from 'Nature'. The American Statistician. 48(1) : 2-11. [Cited by 2] (0.17/year)
IOANNIDIS, J.P., 2005. Why most published research findings are false. PLoS Med. [Cited by 29] (29.65/year)
IVNIK, R.J., et al., 2000. Diagnostic accuracy of four approaches to interpreting neuropsychological test data. Neuropsychology. [Cited by 15] (2.51/year)
JEFFREYS, Sir Harold, 1961. Theory of Probability. Clarendon Press Oxford. [Cited by 966] (21.48/year)
JOBSON, J.D. and B.M. KORKIE, 1981. Performance Hypothesis Testing with the Sharpe and Treynor Measures. The Journal of Finance. [Cited by 73] (2.92/year)
JOHANSEN, S., 1991. Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models. Econometrica. [Cited by 1632] (109.04/year)
JOHANSEN, S., 1991. Estimation and hypothesis testing of cointegration vectors in Gaussian autoregressive models. Econometrica. [Cited by 75] (5.01/year)
JOHNSON, D.H., 1998. Hypothesis testing: statistics as pseudoscience. Fifth Annual Conference of the Wildlife Society, Buffalo, …. [Cited by 3] (0.38/year)
JOHNSON, D.H., 1999. The insignificance of statistical significance testing. Journal of Wildlife Management. [Cited by 216] (30.99/year)
JOHNSON, D.H., 2004. What hypothesis tests are not: a response to Colegrave and Ruxton. Behavioral Ecology. [Cited by 1] (0.51/year)
JOHNSON, D.H., The insignificance of statistical significance testing, Journal of Wildlife Management, 1999 [Cited by 138]
JONES, D. and N. MATLOFF, 1986. Statistical hypothesis testing in biology: a contradiction in terms. J Econ Entomol. [Cited by 9] (0.45/year)
JONES, D., 1984. Use, misuse, and role of multiple-comparison procedures in ecological and agricultural entomology, Environmental entomology 13(3) : 635-649. [Cited by 57] (2.59/year)
JONES, L.V. and J.W. TUKEY, 2000. A sensible formulation of the significance test. Psychological Methods. [Cited by 5] (0.84/year)
JOURNALS, O., 2004. Reply to: Evaluating the role of the cerebellum in temporal processing: beware of the null …. Brain. [Cited by 2] (1.02/year)
JOURNALS, O., 2005. What hypothesis tests are not: a response to Colegrave and Ruxton. Behavioral Ecology. [not cited] (0/year)
JOURNALS, O., Behavioral Ecology. Confidence intervals are a more useful complement to nonsignificant tests than are power …. [Cited by 30] (?/year)
JUNG, J., et al., 2004. Fast portscan detection using sequential hypothesis testing. Security and Privacy, 2004. Proceedings. 2004 IEEE Symposium …. [Cited by 73] (37.11/year)
KAMADA, Y., et al., 1994. Non-inductively current driven H mode with high beta N and high beta p values in JT-60U. Nucl. Fusion. [Cited by 20] (1.67/year)
KASS, G.V., 1975. Significance Testing in Automatic Interaction Detection (AID). Applied Statistics. [Cited by 15] (0.48/year)
KASSIRER, J.P., 1983. Teaching clinical medicine by iterative hypothesis testing. Let's preach what we practice. N Engl J Med. [Cited by 26] (1.13/year)
KEMPTHORNE, O., 1966. Some Aspects of Experimental Inference. Journal of the American Statistical Association 61(313) : 11-34. [Cited by 7] (0.18/year)
KEMPTHORNE, O., 1976. Of what use are tests of significance and tests of hypotheses. Communications in Statistics: Theory and Methods, A. A5 (8) : 763-777. [Cited by 7] (0.23/year)
KENNY, D.A., 1995. The effect of nonindependence on significance testing in dyadic research. Personal Relationships. [Cited by 31] (2.83/year)
KIDA, T., 1984. The Impact of Hypothesis-Testing Strategies on Auditors' Use of Judgment Data. Journal of Accounting Research. [Cited by 35] (1.59/year)
KIESEPPÄ, I.A. and M. FORSTER, How To Remove the Ad Hoc Features of Statistical Inference within a Frequentist Paradigm. philosophy.wisc.edu. [not cited] (?/year)
KILGARRIFF, A., 2005. Language is never, ever, ever, random. Corpus Linguistics and Linguistic Theory. [Cited by 1] (1.03/year)
KILLEEN, P.R., 2005. An Alternative to Null-Hypothesis Significance Tests. Psychological Science. [Cited by 29] (29.95/year)
KILLEEN, Peter R., An Alternative to Null-Hypothesis Significance Tests
KIRK, R.E., 2001. PROMOTING GOOD STATISTICAL PRACTICES: SOME SUGGESTIONS. Educational and Psychological Measurement. [Cited by 30] (6.04/year)
KISH, L., 1959. Some Statistical Problems in Research Design. American Sociological Review 24 :328-338. Reprinted in The Significance Test Controversy - A Reader, Eds. D. E. Morrison and R. E. Henkel, 1970, Aldine Publishing Company (Butterworth Group). [Cited by 23] (0.49/year)
KLAYMAN, J. and Y.W. HA, 1987. Confirmation, Disconfirmation, and Information in Hypothesis Testing. Psychological Review. [Cited by 408] (21.51/year)
KLAYMAN, J. and Y.W.O.N. HA, 1989. Hypothesis testing in rule discovery: strategy, structure, and content. Journal of experimental psychology. Learning, memory, and …. [Cited by 48] (2.83/year)
KLEIN, D.F., 2005. Beyond Significance Testing: Reforming Data Analysis Methods in Behavioral Research. American Journal of Psychiatry. [Cited by 21] (21.69/year)
KNAPP, M., 1999. Using exact P values to compare the power between the reconstruction-combined transmission/ …. Am J Hum Genet. [Cited by 12] (1.72/year)
KNAPP, T.R., 1978. Canonical correlation analysis: A general parametric significance testing system. Psychological Bulletin. [Cited by 32] (1.14/year)
KNAPP, T.R., 1998. Comments on the statistical significance testing articles. Research in the Schools. [Cited by 10] (1.25/year)
KOCH, G.G., 1991. One-sided and two-sided tests and p values. J Biopharm Stat. [Cited by 6] (0.40/year)
KOCH, K.R., 1988. Parameter estimation and hypothesis testing in linear models. Springer-Verlag New York, Inc. New York, NY, USA. [Cited by 195] (10.85/year)
KORICHEVA, J., 2003. Non-significant results in ecology: a burden or a blessing in disguise?. Oikos. [Cited by 6] (2.02/year)
KORICHEVA, J., et al., 1998. … of woody plant secondary metabolism by resource availability: hypothesis testing by means of meta- …. Oikos. [Cited by 133] (16.69/year)
KOTRLIK, J.W. and H.A. WILLIAMS, 2003. The Incorporation of Effect Size in Information Technology, Learning, and Performance Research. Information Technology, Learning, and Performance Journal. [Cited by 3] (1.01/year)
KRANTZ, D.H., 1999. The Null Hypothesis Testing Controversy in Psychology. Journal of the American Statistical Association. [Cited by 27] (3.87/year)
KRAUSS, Stefan and Christoph WASSNER, How Significance Tests Should Be Presented To Avoid The Typical Misinterpretations
KRISHNAMOORTHY, K. and T. MATHEW, 2003. … on the means of lognormal distributions using generalized p-values and generalized confidence …. Journal of Statistical Planning and Inference. [Cited by 12] (4.04/year)
KRUEGER, J., 1998. The Bet on Bias: a Foregone Conclusion?. Psycoloquy. [Cited by 55] (6.90/year)
KRUEGER, J., 1998. Getting to the Core of the Data by Testing Against Alternative Hypotheses. Psycoloquy. [Cited by 2] (0.25/year)
KRUEGER, J., 1999. Significance Testing Does Not Solve the Problem of Induction. Psycoloquy. [Cited by 2] (0.29/year)
KRUEGER, J., 2001. Null hypothesis significance testing. On the survival of a flawed method. Am Psychol. [Cited by 21] (4.23/year)
KRUEGER, Joachim, Theoretical Progress Requires Refined Methods and Then Some
KRUSKAL, W. and R. MAJORS, 1989. Concepts of Relative Importance in Recent Scientific Literature. The American Statistician 43(1) : 2-6. [Cited by 15] (0.88/year)
Kruskal, W. H. (1978). Significance, Tests of. In International Encyclopedia of Statistics , eds. W. H. Kruskal and J. M. Tanur, Free Press (New York) : 944-958.
KRUSKAL, W., 1980. The significance of Fisher: a review of RA Fisher: the life of a scientist, Journal of the American Statistical Association 75(372) : 1019-1030. [Cited by 2] (0.08/year)
KURTZ, J., 1999. The immunocompetence handicap hypothesis: testing the genetic predictions. Proceedings: Biological Sciences. [Cited by 31] (4.45/year)
LABOVITZ, S., 1970. The Nonutility of Significance Tests: The Significance of Tests of Significance Reconsidered. The Pacific Sociological Review. [not cited] (0/year)
LANDY, F.J., 1986. Stamp collecting versus science: Validation as hypothesis testing. American Psychologist. [Cited by 38] (1.90/year)
LANG, J.M., K.J. ROTHMAN and C.I. CANN, 1998. That confounded P-value. Epidemiology. [Cited by 28] (3.51/year)
LAVERGNE, P. and Q. VUONG, 2000. NONPARAMETRIC SIGNIFICANCE TESTING. Econometric Theory. [Cited by 20] (3.35/year)
LECOUTRE, B., 1999. Beyond the significance test controversy: Prime time for Bayes. Bulletin of the International Statistical Institute: …. [Cited by 5] (0.72/year)
LECOUTRE, B., P. LECOUTRE and J.M. GROUIN, 2001. A Challenge for Statistical Instructors: Teaching Bayesian Inference Without Discarding the “ …. Bayesian methods with applications to science, policy and …. [Cited by 1] (0.20/year)
LECOUTRE, Bruno, ERIS
LECOUTRE, Bruno, Beyond the significance test controversy: Prime time for Bayes? [Cited by 3]
LECOUTRE, M.P.T.S., J.T.S. POITEVINEAU and B.T.S. LECOUTRE, 2003. Even statisticians are not immune to misinterpretations of Null Hypothesis Significance Tests. International Journal of Psychology. [Cited by 8] (2.70/year)
LEE, M.D. and E.J. WAGENMAKERS, 2005. Bayesian statistical inference in psychology: Comment on Trafimow (2003). Psychological Review. [Cited by 8] (8.26/year)
LEE, S.J., K. KIM and A.A. TSIATIS, 1996. Repeated Significance Testing in Longitudinal Clinical Trials. Biometrika. [Cited by 12] (1.20/year)
LEHMANN, E.L. and Joseph P. ROMANO, Testing Statistical Hypotheses, 2005. [book]
LEPSKI, O.V. and V.G. SPOKOINY, 1999. Minimax nonparametric hypothesis testing: the case of an inhomogeneous alternative. Bernoulli. [Cited by 33] (4.74/year)
LEPSKI, O.V.K. and A.B.K. TSYBAKOV, 2000. Asymptotically exact nonparametric hypothesis testing in sup-norm and at a fixed point. Probability Theory and Related Fields. [Cited by 25] (4.19/year)
LEVIN, J.R. and D.H. ROBINSON, 2000. Rejoinder: Statistical hypothesis testing, effect-size estimation, and the conclusion coherence of …. Educational Researcher. [Cited by 7] (1.17/year)
LEVIN, J.R., 1993. Statistical significance testing from three perspectives. Journal of Experimental Education. [Cited by 17] (1.31/year)
LEVIN, J.R., 1998. What if there were no more bickering about statistical significance tests. Research in the Schools. [Cited by 13] (1.63/year)
LEVIN, J.R.K. and D.H.K. ROBINSON, 1999. Further Reflections on Hypothesis Testing and Editorial Policy for Primary Research Journals. Educational Psychology Review. [Cited by 12] (1.72/year)
LEVINE, M., 1975. A Cognitive Theory of Learning: Research on Hypothesis Testing. John Wiley. [Cited by 40] (1.29/year)
LI, K.H., et al., 1991. SIGNIFICANCE LEVELS FROM REPEATED P-VALUES WITH MULTIPLY-IMPUTED DATA. Statistica Sinica. [Cited by 12] (0.80/year)
LIBERMAN, N. and Y. KLAR, 1996. Hypothesis testing in Wason's selection task: Social exchange cheating detection or task …. Cognition. [Cited by 31] (3.11/year)
Lindley, D. V. (1986). Discussion. The Statistician 35 : 502-504.
LITTLE, T.M., 1981. Interpretation and presentation of results. Hortscience 16(5) : 637-640. [Cited by 14] (0.56/year)
LIU, R.Y. and K. SINGH, 1997. Notions of Limiting P Values Based on Data Depth and Bootstrap. Journal of the American Statistical Association. [Cited by 18] (2.01/year)
LOEHLE, C., 1987. Hypothesis testing in ecology: psychological aspects and the importance of theory maturation. Q Rev Biol. [Cited by 33] (1.74/year)
LOEHLE, C., 1997. A hypothesis testing framework for evaluating ecosystem model performance. Ecological Modelling. [Cited by 27] (3.01/year)
LOFTUS, G.R., 1991. On the tyranny of hypothesis testing in the social sciences. Contemporary Psychology. [Cited by 26] (1.74/year)
LOFTUS, G.R., 1993. … picture is worth a thousand p values: On the irrelevance of hypothesis testing in the microcomputer …. Behavior Research Methods, Instruments & Computers. [Cited by 38] (2.93/year)
LOFTUS, G.R., 1996. Psychology Will Be a Much Better Science When We Change the Way We Analyze Data. Current Directions in Psychological Science. [Cited by 89] (8.93/year)
LONGO, M., T.D. LOOKABAUGH and R.M. GRAY, 1990. Quantization for decentralized hypothesis testing under communication constraints. Information Theory, IEEE Transactions on. [Cited by 58] (3.63/year)
LORD, F.M., 1957. A significance test for the hypothesis that two variables measure the same trait except for errors …. Psychometrika. [Cited by 10] (0.20/year)
LUNT, P., The significance of the significance test controversy: comments on 'Size Matters' [not cited]
LYKKEN, D.T., 1968. Statistical significance in psychological research. Psychological Bulletin 70 :151-159. Reprinted in The Significance Test Controversy - A Reader, Eds. D. E. Morrison and R. E. Henkel, 1970, Aldine Publishing Company (Butterworth Group). [Cited by 108] (2.84/year)
MAGUIRE, H.C., et al., 1995. An outbreak of cryptosporidiosis in south London: what value the p value?. Epidemiol Infect. [Cited by 7] (0.64/year)
MAHARAJ, E.A., 1994. A significance test for classifying ARMA models. ideas.repec.org. [Cited by 11] (0.92/year)
MALONEY, C.J. and S.C. RASTOGI, 1970. Significance Test for Grubbs's Estimators. Biometrics. [Cited by 12] (0.33/year)
MARASCUILO, L.A., 1970. Extensions of the significance test for one-parameter signal detection hypotheses. Psychometrika. [Cited by 8] (0.22/year)
MARDEN, J.I., 2000. Hypothesis Testing: From P Values to Bayes Factors. Journal of the American Statistical Association. [Cited by 4] (0.67/year)
MARGOLIS, H., 1998. Logic, Intuition, and Einstein. Psycoloquy. [Cited by 4] (0.50/year)
MARTIEN, K.K. and B.L. TAYLOR, 2003. Limitations of hypothesis-testing in defining management units for continuously distributed species. Journal of cetacean research and management. [Cited by 1] (0.34/year)
MARTINOVICH, Z., S. SAUNDERS and K.I. HOWARD, 1996. Some comments on “assessing clinical significance.”. Psychotherapy Research. [Cited by 15] (1.50/year)
MASSON, M.E.J. and G.R. LOFTUS, 2003. Using confidence intervals for graphically based data interpretation. Canadian Journal of Experimental Psychology. [Cited by 40] (13.48/year)
Matloff, N. S. (1991). Statistical hypothesis testing: problems and alternatives. Environmental Entomology 20(5) : 1246-1250.
MATTHEWS, J.N.S. and D.G. ALTMAN. Statistics Notes: Interaction 2: compare effect sizes not P values. British Medical Journal. [Cited by 13] (?/year)
MAUCHLY, J.W., 1940. Significance Test for Sphericity of a Normal n-Variate Distribution. The Annals of Mathematical Statistics. [Cited by 73] (1.11/year)
MAY, K., 2003. A Note on the Use of Confidence Intervals. Understanding Statistics. [Cited by 1] (0.34/year)
MCCARTHY, M.A. and K.M. PARRIS, 2004. Clarifying the effect of toe clipping on frogs with Bayesian statistics. Journal of Applied Ecology. [Cited by 9] (4.57/year)
MCCAULEY, C., 1998. The Bet on Bias is Cockeyed Optimism. Psycoloquy. [Cited by 5] (0.63/year)
MCCLOSKEY, D.N., 1995. The Insignificance of Statistical Significance. Scientific American. 272(4) :104-105. [Cited by 7] (0.64/year)
MCLEAN, J.E. and J.M. ERNEST, 1998. The role of statistical significance testing in educational research. Research in the Schools. [Cited by 25] (3.14/year)
MCNEMAR, Q., 1960. At random: Sense and nonsense. American Psychologist 15 : 295-300. [Cited by 17] (0.37/year)
MCPHERSON, G., 1989. The Scientists' View of Statistics--A Neglected Area. Journal of the Royal Statistical Society. Series A ( …. [Cited by 5] (0.29/year)
MEEHL, P.E., 1967. Theory-Testing in Psychology and Physics: A Methodological Paradox. Philosophy of Science 34 : 103-115. Reprinted in The Significance Test Controversy - A Reader, Eds. D. E. Morrison and R. E. Henkel, 1970, Aldine Publishing Company (Butterworth Group). [Cited by 128] (3.28/year)
MEEHL, P.E., 1978. Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft …. Journal of Consulting and Clinical Psychology 46, 806-834. [Cited by 240] (8.58/year)
MEEHL, P.E., 1986. What social scientists don't understand. Metatheory in social science. [Cited by 16] (0.80/year)
MEEHL, P.E., 1990. Appraising and Amending Theories: The Strategy of Lakatosian Defense and Two Principles That Warrant …. Psychological Inquiry, 1, 108-141. [Cited by 45] (2.82/year)
MEEHL, P.E., 1990. Why summaries of research on psychological theories are often uninterpretable, Psychological Reports 66(Monograph Suppl. 1-V66), 195-244. [Cited by 56] (3.51/year)
MEHTA, C.R., N.R. PATEL and A.A. TSIATIS, 1984. Exact Significance Testing to Establish Treatment Equivalence with Ordered Categorical Data. Biometrics. [Cited by 81] (3.69/year)
MENDOZA, J.L., 1980. A Significance Test for Multisample Sphericity. Psychometrika. [Cited by 9] (0.35/year)
MENG, C.Y.K. and A.P. DEMPSTER, 1987. A Bayesian Approach to the Multiplicity Problem for Significance Testing with Binomial Data. Biometrics. [Cited by 12] (0.63/year)
MENG, X.L., 1994. Posterior Predictive p-Values. The Annals of Statistics. [Cited by 100] (8.36/year)
MENZIES, T. and P. COMPTON, 1997. Applications of abduction: Hypothesis testing of neuroendocrinological qualitative compartmental …. Artificial Intelligence in Medicine. [Cited by 30] (3.35/year)
MEYER, S.M., 1992. … Economic Prosperity: Testing the Environmental Impact Hypothesis: Testing the Environmental Impact …. Massachusetts Institute of Technology, Project on Environmental …. [Cited by 28] (2.00/year)
MILLER, J., Issues in applying Statistical Significance Testing to Software Engineering Experiments. ee.ualberta.ca. [not cited] (?/year)
MILLER, J., 2004. Statistical significance testing: a panacea for software technology experiments?. Journal of Systems and Software. [Cited by 5] (2.54/year)
MITCHELL, P.D., 2000. The impact of educational technology: a radical reappraisal of research methods. The Changing Face of Learning Technology. University of …. [Cited by 10] (1.68/year)
MITTAG, K.C. and B. THOMPSON, 2000. A national survey of AERA members' perceptions of statistical significance tests and other …. Educational Researcher. [Cited by 18] (3.02/year)
MOGIE, M.G., 2004. In support of null hypothesis significance testing. Proceedings: Biological Sciences. [Cited by 1] (0.51/year)
MOHR, L.B., 1990. Understanding Significance Testing. books.google.com. [Cited by 25] (1.57/year)
MOORE, D.S. and G.P. MCCABE, 1989. Introduction to the practice of statistics. WH Freeman. [Cited by 516] (30.41/year)
MORGAN, D.L. and R.K. MORGAN, 2001. Single-participant research design. Bringing science to managed care. Am Psychol. [Cited by 11] (2.21/year)
MORGAN, P.L., 2003. Null Hypothesis Significance Testing: Philosophical and Practical Considerations of a Statistical …. Exceptionality. [Cited by 1] (0.34/year)
MORGAN, Paul L., Null Hypothesis Significance Testing: Philosophical and Practical Considerations of a Statistical Controversy [not cited]
MORRISON, D.E. and R.E. HENKEL, 1969. Significance tests reconsidered. American Sociologist 4 : 131-140. Reprinted in The Significance Test Controversy - A Reader, Eds. D. E. Morrison and R. E. Henkel, 1970, Aldine Publishing Company (Butterworth Group). [Cited by 8] (0.22/year)
MORRISON, D.E. and R.E. HENKEL, 1970. The significance test controversy: a reader. Aldine Pub. Co. [Cited by 97] (2.70/year)
MOTT, R., 2000. Accurate formula for P-values of gapped local sequence and profile alignments. J. Mol. Biol. [Cited by 66] (11.06/year)
MOYE, L.A., 1998. P-value interpretation and alpha allocation in clinical trials. Ann Epidemiol. [Cited by 13] (1.63/year)
MOYE, L.A., 2000. Statistical Reasoning in Medicine: The Intuitive P-Value Primer. books.google.com. [Cited by 11] (1.84/year)
MULAIK, S.A., N.S. RAJU and R.A. HARSHMAN, 1997. There is a time and place for significance testing - What if there were no significance tests. [Cited by 30] (3.35/year)
MUNDRY, R. and J. FISCHER, 1998. … for nonparametric tests of small samples often leads to incorrect P values: examples from Animal …. Animal Behaviour. [Cited by 68] (8.54/year)
NAIMAN, D.Q. and C.E. PRIEBE, 2001. Computing Scan Statistic p Values Using Importance Sampling, With Applications to Genetics and …. Journal of Computational & Graphical Statistics. [Cited by 8] (1.61/year)
NEALE, M.C.S., et al., 1989. Fitting genetic models with LISREL: Hypothesis testing. Behavior Genetics. [Cited by 30] (1.77/year)
Nelder, J. A. (1985). Discussion of Dr Chatfield's paper. J. R. Statist. Soc. A 148, Part 3 : 238.
NELDER, J.A., 1971. Discussion on the papers by Wynn and Bloomfield, and O'Neill and Wetherill. J. R. Statist. Soc. B, 33 : 244-246. [Cited by 2] (0.06/year)
NELDER, J.A., 1999. From Statistics to Statistical Science. The Statistician. [Cited by 17] (2.44/year)
NESTER, Marks R., "A Myopic View and History of Hypothesis Testing".
NESTER, Marks R., 1996. An Applied Statistician's Creed, Applied statistics Vol. 45, No. 4. (1996), pp. 401-410. [Cited by 35] (3.51/year)
NEUMANN, C.J., M.B. LAWRENCE and E.L. CASO, 1977. Monte Carlo significance testing as applied to statistical tropical cyclone prediction models. Journal of Applied Meteorology. [Cited by 7] (0.24/year)
NEWCOMBE, R.G., 1998. Two-sided confidence intervals for the single proportion: comparison of seven methods. Statistics in Medicine. [Cited by 221] (27.74/year)
NEWEY W.K. and D McFADDEN, Large Sample Estimation and Hypothesis Testing, Handbook of Econometrics, 1994. [Cited by 346]
NEWEY, W.K. and D. MCFADDEN, 1994. Large Sample Estimation and Hypothesis Testing. Handbook of Econometrics. [Cited by 429] (35.85/year)
NEWEY, W.K. and K.D. WEST, 1987. Hypothesis Testing with Efficient Method of Moments Estimation. International Economic Review. [Cited by 168] (8.86/year)
NEYMAN, J. and E.S. PEARSON, 1933. On the Problem of the Most Efficient Tests of Statistical Hypotheses, Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, Vol. 231. (1933), pp. 289-337. [Cited by 212] (2.91/year)
NEYMAN, J., 1958. The Use of the Concept of Power in Agricultural Experimentation. Journal of the Indian Society of Agricultural Statistics 9(1) : 9-17. [Cited by 1] (0.02/year)
NICHOLLS, N., 2000. Commentary and analysis: The insignificance of significance testing. Bulletin of the American Meteorological Society. [Cited by 40] (6.70/year)
NICKERSON, R.S., 2000. Null hypothesis significance testing: A review of an old and continuing controversy. Psychological Methods. [Cited by 111] (18.60/year)
NIELSEN, R. and J.P. HUELSENBECK, 2002. Detecting positively selected amino acid sites using posterior predictive P-values. Pacific Symposium on Biocomputing, proceedings (RB Altman, …. [Cited by 13] (3.28/year)
NIX, T.W. and J.J. BARNETTE, 1998. A review of hypothesis testing revisited: Rejoinder to Thompson, Knapp, and Levin. Research in the Schools. [Cited by 5] (0.63/year)
NIX, T.W. and J.J. BARNETTE, 1998. The data analysis dilemma: Ban or abandon. A review of null hypothesis significance testing. Research in the Schools. [Cited by 15] (1.88/year)
NORTH, B.V., D. CURTIS and P.C. SHAM, 2002. A note on the calculation of empirical P values from Monte Carlo procedures. Am J Hum Genet. [Cited by 29] (7.31/year)
Northern Prairie Wildlife Research Center, The Insignificance of Statistical Significance Testing
NUNNALLY, J., 1960. The place of statistics in psychology. Educational and Psychological Measurement XX(4) : 641-650. [Cited by 20] (0.44/year)
O'SHEA, T.M., et al., 1998. Intrauterine infection and the risk of cerebral palsy in very low-birthweight infants. Paediatr Perinat Epidemiol. [Cited by 40] (5.02/year)
OAKES, M., 1986. Statistical Inference: A Commentary for the Social and Behavioural Sciences. Wiley. [Cited by 122] (6.11/year)
OGAWA, T. and H. NAGAOKA, 2000. Strong converse and Stein's lemma in quantum hypothesis testing. Information Theory, IEEE Transactions on. [Cited by 21] (3.52/year)
OVERALL, J.E. and H.M. RHOADES, 1987. Adjusting p values for multiple tests of significance. Psychopharmacology. The Third Generation of Progress. [Cited by 7] (0.37/year)
OVERLAND, J.E. and R.W. PREISENDORFER, 1982. A significance test for principal components applied to a cyclone climatology. Mon. Wea. Rev. [Cited by 99] (4.13/year)
PAGE, E.B., 1963. Ordered Hypotheses for Multiple Treatments: A Significance Test for Linear Ranks. Journal of the American Statistical Association. [Cited by 75] (1.75/year)
PALMER, P.L., J. KITTLER and M. PETROU, …. A Hough transform algorithm with a 2D hypothesis testing kernel. Pattern Recognition, 1992. Vol. III. Conference C: Image, …. [Cited by 22] (?/year)
PARADIS, E., 1997. … temporal variations in diversification rates from phylogenies: estimation and hypothesis testing. Proceedings: Biological Sciences. [Cited by 32] (3.57/year)
PARKHURST, D.F., 2001. Statistical significance tests: equivalence and reverse tests should reduce misinterpretation. BioScience. [Cited by 12] (2.42/year)
PEARCE, S.C., 1992. Data analysis in agricultural experimentation. II. Some standard contrasts. Expl Agric. 28 : 375-383.
PEARN, W.L. and P.C. LIN, 2002. Computer program for calculating the p-value in testing process capability index C pmk. Quality and Reliability Engineering International. [Cited by 7] (1.76/year)
PEARSON, K., 1900. On the Criterion that a Given System of Deviations from the Probable in the Case of a Correlated System of Variables is such that it can be Reasonably Supposed to have Arisen from Random Sampling. London. Philosophical Magazine, Series V, 1 : 157-175. [Cited by 177] (1.67/year)
PEARSON, K., 1935. Statistical tests. Nature 136 [Cited by 2] (0.03/year)
PEREIRA, C.A.B. and S. WECHSLER, 1993. On the concept of p-value. Braz. J. Prob. Statist. [Cited by 12] (0.93/year)
PERRY, J.N., 1986. Multiple-comparison procedures: a dissenting view, Journal of Economic Entomology 79(5) : 1149-1155. [Cited by 29] (1.45/year)
PHILLIPS, O. and A.H. GENTRY, 1993. The useful plants of Tambopata, Peru: II. Additional hypothesis testing in quantitative ethnobotany. Economic Botany. [Cited by 32] (2.47/year)
PITT, Mark A. and In Jae MYUNG, NHST: Can Psychology Do Better?
PLETCHER, S.D., Model fitting and hypothesis testing for age-specific mortality data. ingentaconnect.com. [Cited by 48] (?/year)
POLETIEK, F.H., 2000. Hypothesis Testing Behaviour. ecommerce.tandf.co.uk. [Cited by 20] (3.35/year)
POLINE, J.B., et al., 1995. Estimating smoothness in statistical parametric maps: variability of p values. J Comput Assist Tomogr. [Cited by 35] (3.19/year)
POLLARD, P. and J.T.E. RICHARDSON, 1987. On the probability of making Type I errors. Psychological Bulletin, 102, 159-163.
POND, S.L.K., S.D.W. FROST and S.V. MUSE, 2005. HyPhy: hypothesis testing using phylogenies. Bioinformatics. [Cited by 27] (27.92/year)
POOLE, C., 2001. Low P-values or narrow confidence intervals: which are more durable. Epidemiology. [Cited by 25] (5.03/year)
POSAVAC, E.J., 2002. Using p Values to Estimate the Probability of a Statistically Significant Replication. Understanding Statistics. [Cited by 8] (2.02/year)
POUNDS, S. and S.W. MORRIS, 2003. … in microarray studies by approximating and partitioning the empirical distribution of p-values. Bioinformatics. [Cited by 40] (13.48/year)
POWER, D.O.F., 2000. Power and Effect Size: Research Considerations for the Clinical Nurse Specialist. Clinical Nurse Specialist. [Cited by 2] (0.34/year)
PRATT, J.W., 1976. A discussion of the question: for what use are tests of hypotheses and tests of significance. Commun. Statist.-Theor. Meth. A5(8) : 779-787.
PREECE, D.A., 1990. R. A. Fisher and experimental design: a review. Biometrics 46 : 925-935.
PREECE, D.A., 1982. The design and analysis of experiments: what has gone wrong. Utilitas Mathematica 21A : 201-244. [Cited by 4] (0.17/year)
PREECE, D.A., 1984. Biometry in the Third World: Science not Ritual, Biometrics 40 : 519-523. [Cited by 1] (0.05/year)
PRINCEN, J., J. ILLINGWORTH and J. KITTLER, 1994. Hypothesis testing: a framework for analyzing and optimizing Hough transform performance. Pattern Analysis and Machine Intelligence, IEEE Transactions …. [Cited by 24] (2.01/year)
PYSZCZYNSKI, T. and J. GREENBERG, 1987. … of cognitive and motivational perspectives on social inference: a biased hypothesis-testing model. Advances in experimental social psychology. [Cited by 125] (6.59/year)
QUINN, J.F. and A.E. DUNHAM, 1983. On Hypothesis Testing in Ecology and Evolution. The American Naturalist. [Cited by 79] (3.44/year)
RACINE, J., 1997. Consistent Significance Testing for Nonparametric Regression. Journal of Business & Economic Statistics. [Cited by 9] (1.00/year)
RAFTERY, A.E., 1996. Hypothesis testing and model selection via posterior simulation. Practical Markov Chain Monte Carlo(WR Gilks, DJ …. [Cited by 34] (3.41/year)
RAFTERY, A.E., 1996. Hypothesis testing and model selection. Markov Chain Monte Carlo in Practice. [Cited by 97] (9.73/year)
RAYNER, R.K., 1990. Bootstrapping p Values and Power in the First-Order Autoregression: A Monte Carlo Investigation. Journal of Business & Economic Statistics. [Cited by 11] (0.69/year)
RICE, W.R., 1988. A New Probability Model for Determining Exact P-Values for 2 x 2 Contingency Tables When Comparing …. Biometrics. [Cited by 26] (1.45/year)
RICE, W.R., 1990. A Consensus Combined P-Value Test and the Family-Wide Significance of Component Tests. Biometrics. [Cited by 66] (4.13/year)
RICKERT, N.W., 1998. Intelligence is Not Rational. Psycoloquy. [Cited by 11] (1.38/year)
RIGBY, A.S., 1999. Getting past the statistical referee: moving away from P-values and towards interval estimation. Health Education Research. [Cited by 1] (0.14/year)
ROBINS, J.M., et al., 2000. Asymptotic Distribution of P Values in Composite Null Models. Journal of the American Statistical Association. [Cited by 38] (6.37/year)
ROBINSON, D.H. and H. WAINER, 2002. On the past and future of null hypothesis significance testing. The Journal of Wildlife Management. [Cited by 17] (4.28/year)
ROBINSON, D.H. and J.R. LEVIN, 1997. Reflections on Statistical and Substantive Significance, with a Slice of Replication. Educational Researcher. [Cited by 52] (5.80/year)
ROBINSON, P.M., 1989. Hypothesis Testing in Semiparametric and Nonparametric Models for Econometric Time Series. The Review of Economic Studies. [Cited by 29] (1.71/year)
RODGERS, J.L. and D.C. ROWE, 2002. Theory development should begin (but not end) with good empirical fits: A comment on Roberts and …. Psychological Review. [Cited by 8] (2.02/year)
ROGER, J.H., 1977. A significance test for cyclic trends in incidence data. Biometrika. [Cited by 42] (1.45/year)
ROSENTHAL, R., 1979. The “file drawer problem” and tolerance for null results. Psychological Bulletin 86, 638-641. [Cited by 380] (14.09/year)
ROSENTHAL, R., 1993. Cumulating evidence. A handbook for data analysis in the behavioral sciences: Methodological issues [Cited by 17] (1.31/year)
ROSNER, B. and R.C. MILTON, 1988. Significance Testing for Correlated Binary Outcome Data. Biometrics. [Cited by 12] (0.67/year)
ROSNER, B. and W.C. WILLETT, 1988. … corrected for within-person variation: implications for study design and hypothesis testing. Am J Epidemiol. [Cited by 77] (4.29/year)
ROSNER, B., 1995. Hypothesis testing: categorical data. Fundamentals of Biostatistics. [Cited by 53] (4.83/year)
ROSNER, B., A. DONNER and C.H. HENNEKENS, 1979. Significance Testing of Interclass Correlations from Familial Data. Biometrics. [Cited by 6] (0.22/year)
ROSNOW, R.L. and R. ROSENTHAL, 1996. Computing contrasts, effect sizes, and counternulls on other people's published data: General …. Psychological Methods. [Cited by 69] (6.92/year)
ROSNOW, R.L., 2003. Effect sizes for experimenting psychologists. Can J Exp Psychol. [Cited by 10] (3.37/year)
ROTHSTEIN, H. and M.C. TONGES, 2000. Beyond the Significance Test In Administrative Research and Policy Decisions. Journal of Nursing Scholarship. [Cited by 4] (0.67/year)
ROTNITZKY, A. and N.P. JEWELL, 1990. Hypothesis testing of regression parameters in semiparametric generalized linear models for cluster …. Biometrika. [Cited by 71] (4.45/year)
ROULSTON, M.S., 1997. Significance testing of information theoretic functionals. Physica D. [Cited by 10] (1.12/year)
ROZEBOOM, W.W., 1960. The fallacy of the null-hypothesis significance test, Psychological Bulletin 57 : 416-428. Reprinted in The Significance Test Controversy - A Reader, Eds. D. E. Morrison and R. E. Henkel, 1970, Aldine Publishing Company (Butterworth Group). [Cited by 95] (2.07/year)
RUBIN, D.B., 1998. More powerful randomization-based p-values in double-blind trials with non-compliance. Statistics in Medicine. [Cited by 23] (2.89/year)
RUSCIO, J., 1998. Applying What We Have Learned: Understanding and Correcting Biased Judgment. Psycoloquy. [Cited by 7] (0.88/year)
RUTLEDGE, T. and C. LOH, 2004. Effect Sizes and Statistical Testing in the Determination of Clinical Significance in Behavioral …. Annals of Behavioral Medicine. [Cited by 11] (5.59/year)
RUVOLO, M., 1996. A New Approach to Studying Modern Human Origins: Hypothesis Testing with Coalescence Time …. Molecular Phylogenetics and Evolution. [Cited by 28] (2.81/year)
SACKROWITZ, H. and E. SAMUEL-CAHN, 1999. P Values as Random Variables-Expected P Values. The American Statistician. [Cited by 12] (1.72/year)
SARKAR, S.K. and C.K. CHANG, 1997. The Simes Method for Multiple Hypothesis Testing with Positively Dependent Test Statistics. Journal of the American Statistical Association. [Cited by 39] (4.35/year)
SAVAGE, I.R., 1957. Nonparametric Statistics. Journal of the American Statistical Association. 52 : 331-344. [Cited by 13] (0.27/year)
SAVIN, N.E., 1984. Multiple Hypothesis Testing. Handbook of Econometrics. [Cited by 65] (2.96/year)
SAVITZ, D.A., 1993. Is statistical significance testing useful in interpreting data?. Reprod Toxicol. [Cited by 11] (0.85/year)
SAVITZ, D.A., K.A. TOLO and C. POOLE, 1994. Statistical significance testing in the American Journal of Epidemiology, 1970-1990. American Journal of Epidemiology. [Cited by 12] (1.00/year)
SCHERVISH, M.J., 1996. P Values: What They Are and What They Are Not. The American Statistician. [Cited by 29] (2.91/year)
SCHMIDT, F. and J.E. HUNTER, 1995. … on cumulative research knowledge: statistical significance testing, confidence intervals, and meta- …. Eval Health Prof. [Cited by 7] (0.64/year)
SCHMIDT, F.L. and J.E. HUNTER, 1997. Eight common but false objections to the discontinuation of significance testing in the analysis of …. What if there were no significance tests. [Cited by 46] (5.13/year)
SCHMIDT, F.L., 1992. What do data really mean? Research findings, meta-analysis, and cumulative knowledge in psychology. American Psychologist 47, 1173-1181. [Cited by 127] (9.09/year)
SCHMIDT, F.L., 1996. Statistical significance testing and cumulative knowledge in psychology: Implications for training …. Psychological Methods. [Cited by 280] (28.09/year)
SCHWARTZ, J., Beyond LOEL's, p values, and vote counting: methods for looking at the shapes and strengths of …. ncbi.nlm.nih.gov. [Cited by 18] (?/year)
SCHWEDER, T. and E. SPJØTVOLL, 1982. Plots of P-values to evaluate many tests simultaneously. Biometrika. [Cited by 71] (2.96/year)
SEDLMEIER, P. and G. GIGERENZER, 1989. Do studies of statistical power have an effect on the power of studies?. Psychological Bulletin, 105, 309-316. [Cited by 105] (6.19/year)
SELLKE, T., M.J. BAYARRI and J.O. BERGER, 1999. Calibration of P-values for Testing Precise Null Hypotheses. stat.duke.edu. [Cited by 37] (5.31/year)
SENCHAUDHURI, P., C.R. MEHTA and N.R. PATEL, 1995. Estimating Exact P Values by the Method of Control Variates or Monte Carlo Rescue. Journal of the American Statistical Association. [Cited by 9] (0.82/year)
SENN, S., 2001. Two cheers for P-values?. Journal of Epidemiology and Biostatistics. [Cited by 2] (0.40/year)
SEO, J., et al., 2004. … profiling: project-specific algorithm selection and detection p-value weighting in Affymetrix …. Bioinformatics. [Cited by 27] (13.73/year)
SERLIN, R.C. and D.K. LAPSLEY, 1993. Rational appraisal of psychological research and the good-enough principle. A handbook for data analysis in the behavioral sciences: …. [Cited by 13] (1.00/year)
SHAFFER, J.P., 1995. Multiple Hypothesis Testing. Annual Review of Psychology. [Cited by 107] (9.76/year)
SHAVER, J.P., 1993. What Statistical Significance Testing Is, and What It is Not. Journal of Experimental Education. [Cited by 43] (3.32/year)
SHEA, C., 1996. Psychologists debate accuracy of ‘significance test’. The Chronicle of Higher Education. [Cited by 8] (0.80/year)
SHOESMITH, E., 1987. The Continental Controversy over Arbuthnot's Argument for Divine Providence. Historia Mathematica 14 (2) (1987), 133-146. [Cited by 5] (0.26/year)
SIEGMUND, D. and B. YAKIR, 2000. Approximate p-Values for Local Sequence Alignments. The Annals of Statistics. [Cited by 47] (7.88/year)
SIMBERLOFF, D., 1983. Competition Theory, Hypothesis-Testing, and Other Community Ecological Buzzwords. The American Naturalist. [Cited by 57] (2.48/year)
SIVIA, D. S., 1996. Data Analysis: A Bayesian Tutorial.
SKIPPER, J.K., A.L. GUENTER and G. NASS, 1970. The sacredness of .05: A note concerning the uses of statistical levels of significance in social science. The American Sociologist 2 : 16-18. Reprinted in The Significance Test Controversy - A Reader, Eds. D. E. Morrison and R. E. Henkel, 1970, Aldine Publishing Company (Butterworth Group). [Cited by 16] (0.44/year)
SKOV, T., et al., Prevalence proportion ratios: estimation and hypothesis testing. International Journal of Epidemiology. [Cited by 91] (?/year)
SMITH, C. and G.M. ROSE, 2001. … the relationship between REM and memory consolidation: A need for scholarship and hypothesis testing. Behavioral and Brain Sciences. [Cited by 28] (5.64/year)
SMITH, C.A.B., 1960. Statistical Methods in Biology [book review]. Applied Statistics 9 : 64-66. [not cited] (0/year)
SMITH, C.J., et al., 2003. IARC carcinogens reported in cigarette mainstream smoke and their calculated log P values. Food Chem Toxicol. [Cited by 8] (2.70/year)
SMITH, L.D., et al., 2000. Psychology without p values. Data analysis at the turn of the 19th century. Am Psychol. [Cited by 8] (1.34/year)
SMITHSON, M., 2001. Correct Confidence Intervals for Various Regression Effect Sizes and Parameters: The Importance of …. Educational and Psychological Measurement. [Cited by 33] (6.64/year)
SNYDER, M. and W.B. SWANN, 1978. Hypothesis-testing processes in social interaction. Journal of Personality and Social Psychology. [Cited by 128] (4.58/year)
SOBER, E., Sex Ratio Theory, Ancient and Modern - An 18th Century Debate about Intelligent Design and the Development of Models in Evolutionary Biology. philosophy.wisc.edu. [not cited] (?/year)
SPOKOINY, V.G., 1996. Adaptive Hypothesis Testing Using Wavelets. The Annals of Statistics. [Cited by 67] (6.72/year)
STAM, H.J. and G.A. PASAY, 2000. The historical case against null-hypothesis significance testing. Behavioral and Brain Sciences. [Cited by 2] (0.34/year)
STANOVICH, K.E., 1998. Individual Differences in Cognitive Biases. Psycoloquy. [Cited by 4] (0.50/year)
STEEL, M., P.J. LOCKHART and D. PENNY, 1995. A frequency-dependent significance test for parsimony. Mol Phylogenet Evol. [Cited by 9] (0.82/year)
STEPHENS, P.A., et al., 2005. Information theory and hypothesis testing: a call for pluralism. Journal of Applied Ecology. [Cited by 10] (10.33/year)
STERNE, J.A.C., 2002. Teaching hypothesis tests-time for significant change?. Statistics in Medicine. [Cited by 7] (1.76/year)
STERNE, Jonathan, Commentary: Null points—has interpretation of significance tests improved? [Cited by 1]
STONE, M., 1969. The role of significance testing: some data with a message. Biometrika. [Cited by 8] (0.22/year)
STOOVE, M.A. and M.B. ANDERSEN, 2003. What are we looking at, and how big is it. Physical Therapy in Sport. [Cited by 2] (0.67/year)
STOREY, J.D. and D. SIEGMUND, 2001. Approximate P-Values for Local Sequence Alignments: Numerical Studies. Journal of Computational Biology. [Cited by 13] (2.62/year)
STREET, D.J., 1990. Fisher's contributions to agricultural statistics. Biometrics 46 : 937-945.
"STUDENT", 1908. The probable error of a mean. Biometrika 6 : 1-25.
STREINER, D.L., 2003. Unicorns Do Exist: A Tutorial on “Proving” the Null Hypothesis. Can J Psychiatry. [Cited by 3] (1.01/year)
SUTER, G.W., 1996. Abuse of hypothesis testing statistics in ecological risk assessment. Human and Ecological Risk Assessment. [Cited by 27] (2.71/year)
THEILER, J. and D. PRICHARD, 1996. Constrained-realization Monte-Carlo method for hypothesis testing. Physica D. [Cited by 100] (10.03/year)
THOMPSON, B. and P.A. SNYDER, 1997. Statistical Significance Testing Practices in "The Journal of Experimental Education". Journal of Experimental Education. [Cited by 22] (2.45/year)
THOMPSON, B., 1994. The concept of statistical significance testing. Practical Assessment, Research & Evaluation. [Cited by 9] (0.75/year)
THOMPSON, B., 1996. AERA Editorial Policies regarding Statistical Significance Testing: Three Suggested Reforms. Educational Researcher. [Cited by 105] (10.53/year)
THOMPSON, B., 1997. Rejoinder: Editorial Policies regarding Statistical Significance Tests: Further Comments. Educational Researcher. [Cited by 25] (2.79/year)
THOMPSON, B., 1998. Statistical significance and effect size reporting: Portrait of a possible future. Research in the Schools. [Cited by 12] (1.51/year)
THOMPSON, B., 1999. If Statistical Significance Tests Are Broken/Misused, What Practices Should Supplement or Replace …. Theory & Psychology. [Cited by 31] (4.45/year)
THOMPSON, B., 1999. Statistical Significance Tests, Effect Size Reporting and the Vain Pursuit of Pseudo-objectivity. Theory & Psychology. [Cited by 9] (1.29/year)
THOMPSON, B., 2002. What future quantitative social science research could look like: Confidence intervals for effect …. Educational Researcher. [Cited by 66] (16.63/year)
THOMPSON, B.R., 1999. Journal Editorial Policies Regarding Statistical Significance Tests: Heat Is to Fire as p Is to …. Educational Psychology Review. [Cited by 1] (0.14/year)
THOMPSON, William L., Bill Thompson's References on Hypothesis Testing
THORBURN, Daniel, Significance testing, interval estimation or Bayesian inference: Comments to "Extracting a maximum of useful information from statistical research data" by S. Sohlberg and G. Andersson
TRAFIMOW, D., 2003. Hypothesis testing and theory evaluation at the boundaries: Surprising insights from Bayes's …. Psychological Review. [Cited by 41] (13.81/year)
TROPE, Y. and A. LIBERMAN, 1996. Social hypothesis testing: Cognitive and motivational mechanisms. Social psychology: Handbook of basic principles. [Cited by 91] (9.13/year)
TSIATIS, A.A., 1982. Repeated Significance Testing for a General Class of Statistics Used in Censored Survival Analysis. Journal of the American Statistical Association. [Cited by 33] (1.38/year)
TSUI, K.W. and S. WEERAHANDI, 1989. Generalized P-Values in Significance Testing of Hypotheses in the Presence of Nuisance Parameters. Journal of the American Statistical Association. [Cited by 55] (3.24/year)
TUKEY, J.W., 1962. The Future of Data Analysis. The Annals of Mathematical Statistics, 33, 1-67. [Cited by 141] (3.21/year)
TUKEY, J.W., 1969. Analyzing data: Sanctification or detective work. American Psychologist, 24, 83-91. [Cited by 31] (0.84/year)
TUKEY, J.W., 1991. The Philosophy of Multiple Comparisons. Statistical Science, 6, 100-116. [Cited by 85] (5.68/year)
TVERSKY, A. and D. KAHNEMAN, 1971. Belief in the law of small numbers. Psychological Bulletin, 76, 105-110. [Cited by 371] (10.61/year)
UPPSALA, S. and L.L.C. PROTEOMETRICS, 2002. A model of random mass-matching and its use for automated significance testing in mass spectrometric …. Proteomics. [Cited by 19] (4.79/year)
UPTON, G.J.G., 1992. Fisher's Exact Test. Journal of the Royal Statistical Society. Series A ( …. 155(3) : 395-402. [Cited by 15] (1.07/year)
UTSU, T., 1966. A statistical significance test of the difference in b-value between two earthquake groups. J. Phys. Earth. [Cited by 12] (0.30/year)
VACHA-HAASE, T., 2001. STATISTICAL SIGNIFICANCE SHOULD NOT BE CONSIDERED ONE OF LIFE'S GUARANTEES: EFFECT SIZES ARE …. Educational and Psychological Measurement. [Cited by 7] (1.41/year)
VACHA-HAASE, T., et al., 2000. Reporting Practices and APA Editorial Policies Regarding Statistical Significance and Effect Size. Theory & Psychology. [Cited by 32] (5.36/year)
VARDEMAN, S.B., 1987. Testing a Point Null Hypothesis: The Irreconcilability of P Values and Evidence: Comment. Journal of the American Statistical Association. [not cited] (0/year)
VARDEMAN, S.B., 1987. Comments on Testing a Point Null Hypothesis. J. Amer. Statis. Assoc. [Cited by 1] (0.05/year)
VASKE, J.J., 2002. Communicating Judgments About Practical Significance: Effect Size, Confidence Intervals and Odds …. Human Dimensions of Wildlife. [Cited by 24] (6.05/year)
VASKE, J.J., et al., 2006. Abstract View. Wildlife Society Bulletin. [not cited] (0/year)
VAUTARD, R., K. MO and M. GHIL, 1990. Statistical significance test for transition matrices of atmospheric Markov chains. Journal of the Atmospheric Sciences. [Cited by 25] (1.57/year)
VEKLEROV, E. and J. LLACER, 1987. Stopping rule for the MLE algorithm based on statistical hypothesis testing. IEEE TRANS. MED. IMAG. [Cited by 66] (3.48/year)
VENN, J., 1889. Cambridge Anthropometry. The Journal of the Anthropological Institute of Great …. 18 : 140-154. [Cited by 2] (0.02/year)
VICENTE, K.J. and G.L. TORENVLIET, 2000. The Earth is spherical (p < 0.05): alternative methods of statistical inference. Theoretical Issues in Ergonomics Science. [Cited by 6] (1.01/year)
VOGELSANG, T.J., 1998. Trend Function Hypothesis Testing in the Presence of Serial Correlation. Econometrica. [Cited by 54] (6.78/year)
VOKEY, J.R., 2000. Statistics without probability: Significance testing as typicality and exchangeability in data …. Behavioral and Brain Sciences. [Cited by 4] (0.67/year)
WAINER, H. and D.H. ROBINSON, 2003. Shaping Up the Practice of Null Hypothesis Significance Testing. Educational Researcher. [Cited by 8] (2.70/year)
WAINER, H., "One cheer for null hypothesis significance testing", Psychological Methods, 1999 [Cited by 15]
WAKELING, I.N., M.M. RAATS and H.J.H. MACFIE, 1992. A new significance test for consensus in generalized procrustes analysis. Journal of Sensory Studies. [Cited by 10] (0.72/year)
WAKKER, Peter P., 1999. Justifying Bayesianism by Dynamic Decision Principles. plenary paper presented at FUR IX, Marrakesh. [Cited by 4] (0.57/year)
WALKER, D.A., 2004. The Importance of Drawing Meaningful Conclusions from Data: A Review of the Literature with Meta- …. NASPA Journal. [Cited by 1] (0.51/year)
WALLIS, W.A. and G.H. MOORE, 1941. A Significance Test for Time Series Analysis. Journal of the American Statistical Association. [Cited by 10] (0.15/year)
WANG, C., 1992. Sense and Nonsense of Statistical Inference: Controversy, Misuse, and Subtlety. books.google.com. [Cited by 13] (0.93/year)
WARE, J.H., et al., 1992. P values. Bailar JC II, Mosteller F (eds): Medical Uses of Statistics. …. [Cited by 15] (1.07/year)
WARREN, W.G., 1986. On the presentation of statistical analysis: reason or ritual. Canadian Journal of Forest Research (Print). 16 : 1185-1191. [Cited by 15] (0.75/year)
WEBBER, C. and G.J. BARTON, 2001. Estimation of P-values for global alignments of protein sequences. Bioinformatics. [Cited by 11] (2.21/year)
WEERAHANDI, S., 1991. Testing Variance Components in Mixed Models With Generalized p Values. Journal of the American Statistical Association. [Cited by 16] (1.07/year)
WEINBERG, C.R., 2001. It's time to rehabilitate the P-value. Epidemiology. [Cited by 12] (2.42/year)
WESTAD, F. and H. MARTENS, 2000. Variable selection in near infrared spectroscopy based on significance testing in partial least …. J. Near Infrared Spectrosc. [Cited by 31] (5.20/year)
WESTFALL, P.H. and S.S. YOUNG, 1989. p Value Adjustments for Multiple Tests in Multivariate Binomial Models. Journal of the American Statistical Association. [Cited by 35] (2.06/year)
WESTFALL, P.H. and S.S. YOUNG, 1993. Resampling-based multiple testing: examples and methods for p-value adjustment. John Wiley & Sons. [Cited by 346] (26.68/year)
WESTFALL, P.H., S.S. YOUNG and S.P. WRIGHT, 1993. On Adjusting P-Values for Multiplicity. Biometrics. [Cited by 10] (0.77/year)
WHITTLE, P., 1951. Hypothesis testing in time series analysis. Uppsala, Almqvist & Wiksells boktr. [Cited by 71] (1.29/year)
WILCOX, R.R., 1997. Introduction to robust estimation and hypothesis testing. Academic Press San Diego, CA. [Cited by 145] (16.17/year)
WILKINSON…, L., 1999. Statistical methods in psychology journals: Guidelines and explanations. American Psychologist. [Cited by 358] (51.38/year)
WILLIAMS, L.J. and B.K. BROWN, 1994. … and human resources research: Effects on correlations, path coefficients, and hypothesis testing. Organizational Behavior and Human Decision Processes. [Cited by 26] (2.17/year)
WINDIG, J.J., 1997. The calculation and significance testing of genetic correlations across environments. J. Evol. Biol. [Cited by 27] (3.01/year)
WOODS, S.P.V., M.V. WEINBORN and D.W.V. LOVEJOY, 2003. Are Classification Accuracy Statistics Underused in Neuropsychological Research?. Journal of Clinical and Experimental Neuropsychology. [Cited by 6] (2.02/year)
WRIGHT, D.B., 2002. First Steps in Statistics. books.google.com. [Cited by 7] (1.76/year)
WRIGHT, D.B., 2003. Making friends with your data: Improving how statistics are conducted and reported. British Journal of Educational Psychology. [Cited by 15] (5.05/year)
WRIGHT, S.P., 1992. Adjusted P-Values for Simultaneous Inference. Biometrics. [Cited by 163] (11.67/year)
YATES, F., 1951. The Influence of Statistical Methods for Research Workers on the Development of the Science of …. Journal of the American Statistical Association 46 : 19-34. [Cited by 21] (0.38/year)
YATES, F., 1964. Sir Ronald Fisher and the Design of Experiments. Biometrics 20 : 307-321. [Cited by 4] (0.10/year)
ZAKZANIS, K.K.V., 1998. Brain is Related to Behavior (p < .05). Journal of Clinical and Experimental Neuropsychology. [Cited by 15] (1.88/year)
ZAR, J.H., 1972. Significance Testing of the Spearman Rank Correlation Coefficient. Journal of the American Statistical Association. [Cited by 17] (0.50/year)
ZEISEL, H., 1955. The Significance of Insignificant Differences. Public Opinion Quarterly 17 : 319-321. [Cited by 1] (0.02/year)
ZELLNER, Arnold, 1999. Bayesian and Non-bayesian Approaches to Scientific Modeling and Inference in Economics and …. ideas.repec.org. [Cited by 4] (0.57/year)
ZHOU, L. and T. MATHEW, 1994. Some Tests for Variance Components Using Generalized p Values. Technometrics. [Cited by 27] (2.26/year)