Conclusions and Recommendations

The final section presents the conclusions of the Task Force. Those conclusions are summarized below. Great advances of the most successful sciences (astronomy, physics, chemistry) were, and are, achieved without probability sampling. Statistical inference in these fields is based on subjective judgment about the presence of adequate, automatic, and natural randomization in the population.

No clear rule exists for deciding exactly when probability sampling is necessary, and what price should be paid for it. Probability sampling for randomization is not a dogma, but a strategy, especially for large numbers.

For example, although psychologists sometimes use data from nationally representative probability samples, it is far more common for their studies to be based on convenience samples of college students. In the mid-1980s, David Sears raised concerns about this. The picture had not changed when he examined papers in the same journals published five years later. Although this reliance on unrepresentative samples may have its weaknesses, as Sears argued, psychologists have continued to use samples of convenience for most of their research.

Sears was concerned more about the population from which the subjects in psychology experiments were drawn than about the method for selecting them, but it is clear that even the population of undergraduates is likely not represented well in psychology experiments. The participants in psychology experiments are self-selected in various ways, ranging from their decision to go to particular colleges, to enroll in specific courses, and to volunteer and show up for a given experiment.

For example, the vast majority of psychological experiments are not used to estimate a mean or proportion for some particular finite population — which is the usual situation with surveys based on probability samples — but are used to determine whether the differences across two or more experimental groups are reliably different from zero. In still other cases, no quantitative conclusions are drawn.
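To make the contrast concrete, the following is a minimal sketch of how a difference between two experimental groups might be tested against zero without any appeal to a sampled finite population. It uses a permutation test, which is one common choice and not a method the report itself prescribes. The function name `permutation_test` and its parameters are hypothetical, chosen for illustration only.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns (observed_difference, p_value). Illustrative sketch only.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    extreme = 0
    for _ in range(n_permutations):
        # Reshuffle group labels, as if the treatment had no effect.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

The inference here rests on the random assignment of subjects to groups, not on how the subjects were selected, which is why such experiments can proceed with convenience samples. A small p-value says the groups differ reliably among these subjects; it does not by itself say the effect generalizes to any wider population.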

In general, they differ from the other methods described in this report in two important ways. Second, to avoid data collection costs, they tend not to survey or directly question respondents about their attitudes and behaviors, and instead try to infer these attributes in other ways.

The typical goal of evaluation research is to establish a causal relationship between a treatment and an outcome. The gold standard of evaluation research is the randomized controlled trial, which is characterized by random assignment of subjects to a treatment group, to one of multiple treatment groups, or to a control group (Shadish et al.).
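Random assignment of this kind can be sketched in a few lines. The example below assumes simple balanced randomization, in which subjects are shuffled and dealt out to the arms in turn; real trials often use blocked or stratified designs instead. The function `randomly_assign` and the arm labels are hypothetical names used only for illustration.

```python
import random


def randomly_assign(subjects, arms=("treatment", "control"), seed=None):
    """Randomly assign each subject to one arm, keeping arm sizes balanced."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)

    assignment = {arm: [] for arm in arms}
    # Deal the shuffled subjects out in turn so arm sizes differ by at most one.
    for i, subject in enumerate(shuffled):
        assignment[arms[i % len(arms)]].append(subject)
    return assignment
```

Passing more than two labels in `arms` covers the case of multiple treatment groups plus a control group. Fixing `seed` makes an assignment reproducible for auditing.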

