Refine
Has Fulltext
- no (3)
Document Type
- Article (3)
Language
- English (3) (remove)
Is part of the Bibliography
- yes (3)
Keywords
- Bayes factors (1)
- Bayesian inference (1)
- Bayesian model comparison (1)
- Null hypothesis significance testing (1)
- a priori (1)
- contrasts (1)
- hypotheses (1)
- linear models (1)
- null hypothesis significance testing (1)
- posterior (1)
Institute
- Department Linguistik (3) (remove)
Factorial experiments in research on memory, language, and in other areas are often analyzed using analysis of variance (ANOVA). However, for effects with more than one numerator degrees of freedom, e.g., for experimental factors with more than two levels, the ANOVA omnibus F-test is not informative about the source of a main effect or interaction. Because researchers typically have specific hypotheses about which condition means differ from each other, a priori contrasts (i.e., comparisons planned before the sample means are known) between specific conditions or combinations of conditions are the appropriate way to represent such hypotheses in the statistical model. Many researchers have pointed out that contrasts should be "tested instead of, rather than as a supplement to, the ordinary 'omnibus' F test" (Hays, 1973, p. 601). In this tutorial, we explain the mathematics underlying different kinds of contrasts (i.e., treatment, sum, repeated, polynomial, custom, nested, interaction contrasts), discuss their properties, and demonstrate how they are applied in the R System for Statistical Computing (R Core Team, 2018). In this context, we explain the generalized inverse which is needed to compute the coefficients for contrasts that test hypotheses that are not covered by the default set of contrasts. A detailed understanding of contrast coding is crucial for successful and correct specification in linear models (including linear mixed models). Contrasts defined a priori yield far more useful confirmatory tests of experimental hypotheses than standard omnibus F-tests. Reproducible code is available from https://osf.io/7ukf6/.
When researchers carry out a null hypothesis significance test, it is tempting to assume that a statistically significant result lowers Prob(H0), the probability of the null hypothesis being true. Technically, such a statement is meaningless for various reasons: e.g., the null hypothesis does not have a probability associated with it. However, it is possible to relax certain assumptions to compute the posterior probability Prob(H0) under repeated sampling. We show in a step-by-step guide that the intuitively appealing belief, that Prob(H0) is low when significant results have been obtained under repeated sampling, is in general incorrect and depends greatly on: (a) the prior probability of the null being true; (b) type-I error rate, (c) type-II error rate, and (d) replication of a result. Through step-by-step simulations using open-source code in the R System of Statistical Computing, we show that uncertainty about the null hypothesis being true often remains high despite a significant result. To help the reader develop intuitions about this common misconception, we provide a Shiny app (https://danielschad.shinyapps.io/probnull/). We expect that this tutorial will help researchers better understand and judge results from null hypothesis significance tests.
Inferences about hypotheses are ubiquitous in the cognitive sciences. Bayes factors provide one general way to compare different hypotheses by their compatibility with the observed data. Those quantifications can then also be used to choose between hypotheses. While Bayes factors provide an immediate approach to hypothesis testing, they are highly sensitive to details of the data/model assumptions and it's unclear whether the details of the computational implementation (such as bridge sampling) are unbiased for complex analyses. Hem, we study how Bayes factors misbehave under different conditions. This includes a study of errors in the estimation of Bayes factors; the first-ever use of simulation-based calibration to test the accuracy and bias of Bayes factor estimates using bridge sampling; a study of the stability of Bayes factors against different MCMC draws and sampling variation in the data; and a look at the variability of decisions based on Bayes factors using a utility function. We outline a Bayes factor workflow that researchers can use to study whether Bayes factors are robust for their individual analysis. Reproducible code is available from haps://osf.io/y354c/. <br /> Translational Abstract <br /> In psychology and related areas, scientific hypotheses are commonly tested by asking questions like "is [some] effect present or absent." Such hypothesis testing is most often carried out using frequentist null hypothesis significance testing (NIIST). The NHST procedure is very simple: It usually returns a p-value, which is then used to make binary decisions like "the effect is present/abscnt." For example, it is common to see studies in the media that draw simplistic conclusions like "coffee causes cancer," or "coffee reduces the chances of geuing cancer." However, a powerful and more nuanced alternative approach exists: Bayes factors. Bayes factors have many advantages over NHST. However, for the complex statistical models that arc commonly used for data analysis today, computing Bayes factors is not at all a simple matter. In this article, we discuss the main complexities associated with computing Bayes factors. This is the first article to provide a detailed workflow for understanding and computing Bayes factors in complex statistical models. The article provides a statistically more nuanced way to think about hypothesis testing than the overly simplistic tendency to declare effects as being "present" or "absent".