Samples, Statistics and Parameters

The totality of observations with which we are concerned, whether finite or infinite, constitutes what we call a population. The word population once referred to observations on people, but today it applies to measurements of any entities of interest, whether they are people, animals, plants or objects. The number of entities that comprise the population is called the size of the population, which in many circumstances may be regarded as infinite.

A sample is a subset of the population. A random sample is a sample taken in such a way that each element of the population has the same probability of being selected. We take random samples to ensure that our samples are representative of the population from which they are taken, so that what we learn from studying the samples will be more or less true of the populations themselves. Statistics calculated from random samples provide unbiased estimates of the corresponding true values for the population.
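As a minimal sketch of this idea, assume a small made-up population of 1,000 numbered entities. The population, the seed and the function name `draw_random_sample` are illustrative assumptions, not part of these notes. Python's standard `random.sample` draws without replacement, so every element has the same chance of being selected:

```python
import random

def draw_random_sample(population, n, seed=None):
    """Return a simple random sample of n elements, drawn without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)

# Hypothetical population: entities labelled 0..999.
population = list(range(1000))
sample = draw_random_sample(population, 100, seed=1)

print(len(sample))        # number of elements selected
print(len(set(sample)))   # same number: no element is selected twice
```

Fixing the seed only makes the illustration repeatable; in practice each new draw would give a different subset.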

Samples should be considered fuzzy snapshots of the populations from which they are drawn, with the degree of fuzziness diminishing as the intensity of sampling or the sample size increases.

Were we to examine an entire population, the average that we calculate would be called the population mean. The population mean is a fixed figure characteristic of the population. It is not subject to variation: no matter who calculates such a mean or how many times it is calculated, barring mistakes, the same value will be obtained in each case. As such, the population mean is called a parameter.

Alternatively, time or resources may not permit an examination of the entire population, and we might choose instead to select and examine a random sample of, say, 100 items. We could then measure them and calculate the sample mean. The difference between the sample mean and the population mean is that the sample mean is subject to natural variation or, as it is called, sampling error. If we were to repeat our sampling procedure by selecting another 100 items, we would most likely obtain a second value for the mean that differed from the first one, and if we repeated the procedure once more, a third value would most likely result. For this reason, the sample mean is called a statistic. The sample mean is said to estimate the population mean. Statistics, calculated from samples, are estimates of true population parameters. The estimates generally become more precise as the sample size increases.
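The contrast between a fixed parameter and a varying statistic can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population of 10,000 normally distributed values, the seed, the sample sizes and the helper `sample_means` are all invented for the illustration.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical finite population of 10,000 measurements (illustrative values only).
population = [rng.gauss(50.0, 10.0) for _ in range(10_000)]

# The parameter: computed from the whole population, so it is a fixed number.
population_mean = statistics.fmean(population)

def sample_means(n, repeats):
    """Draw `repeats` random samples of size n and return the mean of each."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(repeats)]

# The statistic: it changes from sample to sample, less so for larger samples.
spreads = {}
for n in (10, 100, 1000):
    means = sample_means(n, repeats=500)
    spreads[n] = statistics.stdev(means)  # variation of the sample mean between samples
    print(f"n={n:5d}  spread of sample means={spreads[n]:.3f}")

print(f"population mean={population_mean:.3f}")
```

Recomputing `population_mean` always gives the same value, whereas the spread of the sample means should shrink as the sample size grows from 10 to 1,000.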

When we study samples, we are seldom directly interested in them per se. We study them to learn something of the population from which the samples were drawn. We infer properties of the entire population, which we have not studied in its entirety, from our detailed knowledge of a sample of observations. The convenience of studying finite samples rather than the population as a whole comes at a cost. Samples, because they are finite and often relatively small, are somewhat akin to a fuzzy snapshot: the general impression of the population is evident, but the sample is an inexact representation. No matter how intensively we study the sample, there will be a level of uncertainty in what we discover if we try to extrapolate our findings to the entire population. This uncertainty is often referred to as sampling error. Sampling error has two important practical consequences, namely:

Sample statistics will typically differ somewhat from the corresponding true values for the entire population. Estimating by how much they differ is a problem addressed under the heading of parameter estimation.

Any two samples, even if taken from identical populations, will typically differ in their statistics. Determining whether the observed difference in sample statistics is great enough to conclude that the true population values differ is a problem addressed under the heading of hypothesis testing.

Because they deal with making inferences about population parameters on the basis of sample statistics, parameter estimation and hypothesis testing are grouped under the broader heading of statistical inference.
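Both problems can be sketched in a few lines. The example below assumes a large-sample normal approximation, which is only one of several possible approaches and not necessarily the one used elsewhere in this course. The function names `mean_confidence_interval` and `two_sample_z_test` and the simulated data are hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist

def mean_confidence_interval(sample, level=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(sample)
    mean = statistics.fmean(sample)
    std_error = statistics.stdev(sample) / math.sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return mean - z * std_error, mean + z * std_error

def two_sample_z_test(a, b):
    """Return (z, two-sided p-value) for the difference between two sample means."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.fmean(a) - statistics.fmean(b)) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

rng = random.Random(7)
# Two hypothetical samples of 100 items drawn from the same made-up population.
sample_a = [rng.gauss(50.0, 10.0) for _ in range(100)]
sample_b = [rng.gauss(50.0, 10.0) for _ in range(100)]

low, high = mean_confidence_interval(sample_a)
z, p = two_sample_z_test(sample_a, sample_b)
print(f"95% interval for the mean: ({low:.2f}, {high:.2f})")
print(f"z={z:.2f}, p={p:.3f}")
```

The interval addresses parameter estimation: it gives a range of plausible values for the population mean. The p-value addresses hypothesis testing: a small value would suggest that the two samples come from populations with different means.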




Last modified Sunday May 26, 2002