8.4 Sampling Distribution

Consider a finite population of 4000 students at a certain college. Suppose 200 of these students are selected at random and their mean weight is computed as x = 135 lb. Now suppose we draw 20 different samples of 200 students, instead of just one as before. We cannot expect all 20 samples to have the same mean weight. This is due to sampling variability: each sample drawn randomly from the same population differs from the others in its computed 'statistic'.

The 20 computed sample means are:

130, 137, 140, 131, 132, 133, 128, 125, 129, 138, 133, 134, 142, 145, 136, 134, 128, 143, 137, 136

These 20 values form part of the sampling distribution of the statistic 'mean weight'. The sampling distribution, then, is the distribution of a statistic computed for a large number of random samples drawn from the same population.

Alternatively, consider all possible samples of size 'n' that can be drawn randomly from the same population, and compute the desired statistic for each one. The distribution of these values is the sampling distribution of that statistic.
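The procedure described above can be sketched in code. This is a minimal simulation, not part of the original example: the population weights are generated here by assuming a normal distribution with the stated population mean of 134.21 lb and an assumed spread of 15 lb.

```python
import random

random.seed(1)

# Hypothetical population: 4000 student weights.
# The 15 lb standard deviation is an assumption for illustration.
population = [random.gauss(134.21, 15) for _ in range(4000)]

# Draw 20 random samples of 200 students each and record each sample mean.
sample_means = []
for _ in range(20):
    sample = random.sample(population, 200)   # sampling without replacement
    sample_means.append(sum(sample) / len(sample))

# These 20 sample means form (part of) the sampling distribution of the mean.
print(sample_means)
```

Each run with a different seed yields a different set of 20 means, which is exactly the sampling variability the text describes.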

Now we compute the mean of this sampling distribution, which is 134.55 lb. (It is obtained by averaging the 20 sample means: (130 + 137 + ... + 136) / 20 = 2691 / 20 = 134.55 lb.)
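This averaging step can be checked directly with the 20 listed values:

```python
# The 20 sample means listed above
means = [130, 137, 140, 131, 132, 133, 128, 125, 129, 138,
         133, 134, 142, 145, 136, 134, 128, 143, 137, 136]

# Mean of the sampling distribution = (sum of the sample means) / 20
grand_mean = sum(means) / len(means)
print(grand_mean)   # 134.55
```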

Now we are told that the population mean ( m ), i.e. the mean weight of all 4000 students, is 134.21 lb. (For an infinite population, this would not be known.) Therefore 135 lb is not a bad estimate of the population mean, but 134.55 lb is a still better estimate. We thus conclude that the more sample means (statistics) the sampling distribution includes, the better the mean of the sampling distribution estimates the population mean (parameter). The following graphical representation supports this argument.


Now recall the list of means computed for the 20 random samples. We observe that the 20 sample means differ from the true population mean m = 134.21 lb. The means 137, 140, 138, 142, 145, 136, 143, 137 and 136 are clearly above the true mean, while 130, 131, 132, 133, 128, 125, 129, 133, 134, 134 and 128 are below it. These deviations from the true value are due to the variability of samples. This is known as 'random error' in statistical estimation. Note that the mean of the 20 random sample means (134.55 lb) is slightly greater than the true population mean. If we took 20 more samples of 200 students, the mean of this new sampling distribution might be lower than the true population mean.
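The classification of sample means into those above and those below the true mean, and the random error of each, can be computed as follows:

```python
mu = 134.21   # true population mean (given in the text)
means = [130, 137, 140, 131, 132, 133, 128, 125, 129, 138,
         133, 134, 142, 145, 136, 134, 128, 143, 137, 136]

above = [m for m in means if m > mu]   # sample means above the true mean
below = [m for m in means if m < mu]   # sample means below the true mean
errors = [m - mu for m in means]       # random error of each sample mean

print(len(above), len(below))                 # 9 above, 11 below
print(round(sum(errors) / len(errors), 2))    # average error: +0.34 lb
```

The average error (+0.34 lb) is exactly the gap between the mean of the sampling distribution (134.55 lb) and the true mean (134.21 lb).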


Besides random error, error can also arise from a tendency to consistently underestimate or overestimate the true value. This is known as systematic error, or bias. Suppose, in the example discussed above, an investigator weighs only those students who are healthy and fit. He would then most likely overestimate the population mean, since healthy bodies generally have reasonable weights. Clearly the investigator's choice is biased.

This is an example of one kind of bias that occurs in estimation.
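The effect of such a biased selection can be sketched with the same simulated population. The cutoff used to model "healthy and fit" students (above 130 lb) is purely an assumption for illustration, as is the normal population model:

```python
import random

random.seed(2)

# Hypothetical population of 4000 weights (mean 134.21 lb; the 15 lb
# spread and the normal model are assumptions for illustration).
population = [random.gauss(134.21, 15) for _ in range(4000)]

# A biased investigator samples only "healthy and fit" students,
# modeled here (as an assumption) as those weighing over 130 lb.
fit_only = [w for w in population if w > 130]
biased_sample = random.sample(fit_only, 200)
biased_mean = sum(biased_sample) / 200

# An unbiased investigator samples from the whole population.
unbiased_sample = random.sample(population, 200)
unbiased_mean = sum(unbiased_sample) / 200

print(biased_mean, unbiased_mean)
```

Unlike random error, which averages out over repeated samples, this systematic error pushes the estimate in the same direction every time the biased selection rule is applied.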





