Sampling

Why is probability relevant to inferential statistics? Statistics are, in one sense, all about probabilities. Inferential statistics deal with establishing whether differences or associations exist between sets of data. The data comes from the sample we use, and the sample is taken from a population.

So we need to think about whether the sample represents the population from which it has been taken. For a sample drawn at random, the larger the sample, the greater the probability that it is representative of the population. If we took the whole population for our study, that probability would be 1, since the sample would be the population.
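The effect of sample size can be sketched with a small simulation. This is only an illustration under stated assumptions: the population is a made-up list of scores, "representative" is reduced to how close the sample mean is to the population mean, and the function name mean_sampling_error is hypothetical.

```python
import random
import statistics


def mean_sampling_error(population, sample_size, trials, rng):
    """Average absolute gap between a sample mean and the population mean."""
    population_mean = statistics.mean(population)
    gaps = []
    for _ in range(trials):
        # Draw without replacement, so a full-size sample is the population itself.
        sample = rng.sample(population, sample_size)
        gaps.append(abs(statistics.mean(sample) - population_mean))
    return statistics.mean(gaps)


rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.randint(0, 100) for _ in range(10_000)]  # invented scores

for size in (10, 100, 1_000, 10_000):
    error = mean_sampling_error(population, size, trials=100, rng=rng)
    print(f"sample size {size:>6}: average error {error:.3f}")
```

The average error should shrink as the sample grows, and it is zero when the sample size equals the population size, because the sample then contains every member of the population.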

A sample smaller than the whole population means that we cannot guarantee that it is similar to the population. There is a probability that it is not. We want to keep this probability of sampling error as small as possible, so researchers often set the acceptable probability (p) of a sampling error at no more than 0.05, a chance of one in twenty. Some studies are more stringent and set the chance of a sampling error at 0.01, or one in a hundred. In very important studies where you want to be reasonably certain there is little chance of error, such as testing new drugs, some researchers use a very small probability of error indeed, 0.001, meaning that the chance of an error is one in a thousand.
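The three limits above can be restated as "one in N" chances and compared against a reported probability. The sketch below is illustrative only: the labels, the helper names one_in and meets, and the value hypothetical_p are all invented for the example and do not come from any study.

```python
from fractions import Fraction

# Limits named in the text; the labels are only for illustration.
THRESHOLDS = {
    "common": Fraction(5, 100),
    "more stringent": Fraction(1, 100),
    "very stringent": Fraction(1, 1000),
}


def one_in(threshold):
    """Express a probability as the N in 'one in N'."""
    return 1 / threshold


def meets(p_value, threshold):
    """True when the probability of sampling error is no more than the limit."""
    return p_value <= threshold


for label, threshold in THRESHOLDS.items():
    print(f"{label}: p <= {float(threshold)} (one in {one_in(threshold)})")

hypothetical_p = Fraction(3, 100)  # invented value, not from any study
satisfied = [label for label, t in THRESHOLDS.items() if meets(hypothetical_p, t)]
print("limits satisfied:", satisfied)
```

Exact fractions are used here so that the "one in N" figures come out as whole numbers rather than floating-point approximations.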