It is important to understand when to use the central limit theorem (CLT). If you are asked to find the probability of a mean, use the CLT for the mean. If you are asked to find the probability of a sum or total, use the CLT for sums. The same rule applies to percentiles for means and sums.
If you are asked to find the probability of an individual value, do not use the CLT. Use the distribution of its random variable.
The law of large numbers says that if you take samples of larger and larger size from any population, then the mean $\bar{x}$ of the sample tends to get closer and closer to μ. From the central limit theorem, we know that as n gets larger and larger, the sample means follow a normal distribution. The larger n gets, the smaller the standard deviation gets. (Remember that the standard deviation for $\bar{X}$ is $\frac{\sigma}{\sqrt{n}}$.) This means that the sample mean $\bar{x}$ must be close to the population mean μ. We can say that μ is the value that the sample means approach as n gets larger. The central limit theorem illustrates the law of large numbers.
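The shrinking distance between the sample mean and μ can be checked with a small simulation. The following is a minimal sketch, not part of the original text: it assumes a uniform population on [1, 5] (so μ = 3), and the function name `sample_mean` and the seed value are illustrative choices.

```python
import random
import statistics

random.seed(0)  # arbitrary fixed seed so repeated runs give the same numbers

MU = 3.0  # mean of the assumed Uniform(1, 5) population


def sample_mean(n):
    """Return the mean of n independent draws from Uniform(1, 5)."""
    draws = [random.uniform(1, 5) for _ in range(n)]
    return statistics.fmean(draws)


for n in (10, 100, 10_000, 100_000):
    x_bar = sample_mean(n)
    print(f"n = {n:>7}: x_bar = {x_bar:.4f}, |x_bar - mu| = {abs(x_bar - MU):.4f}")
```

The distances typically shrink as n grows, although any single run can fluctuate.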
A study involving stress is conducted among the students on a college campus. The stress scores follow a **uniform distribution** with the lowest stress score equal to one and the highest equal to five. Using a sample of 75 students, find:
Let X = one stress score.
Problems a and b ask you to find a probability or a percentile for a mean. Problems c and d ask you to find a probability or a percentile for a total or sum. The sample size, n, is equal to 75.
Since the individual stress scores follow a uniform distribution, X ~ U(1, 5) where a = 1 and b = 5 (see Continuous Random Variables for an explanation of the uniform distribution).
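$\mu_X = \frac{a + b}{2} = \frac{1 + 5}{2} = 3$
$\sigma_X = \sqrt{\frac{(b - a)^2}{12}} = \sqrt{\frac{(5 - 1)^2}{12}} = 1.15$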
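The two formulas for a uniform distribution are easy to check numerically. This is a short sketch using only Python's standard library; the variable names are illustrative.

```python
from math import sqrt

a, b = 1, 5  # lowest and highest stress scores

mu_x = (a + b) / 2                 # mean of U(a, b)
sigma_x = sqrt((b - a) ** 2 / 12)  # standard deviation of U(a, b)

print(mu_x, round(sigma_x, 2))
```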
For problems a and b, let $\bar{X}$ = the mean stress score for the 75 students. Then, $\bar{X} \sim N\left(3, \frac{1.15}{\sqrt{75}}\right)$
a. Find P($\bar{x}$ < 2). Draw the graph.
a. P($\bar{x}$ < 2) = 0
The probability that the mean stress score is less than two is about zero.
normalcdf(1, 2, 3, 1.15/√75) = 0
The smallest stress score is one.
b. Find the 90th percentile for the mean of 75 stress scores. Draw a graph.
b. Let k = the 90th percentile.
Find k, where P($\bar{x}$ < k) = 0.90.
k = 3.2
The 90th percentile for the mean of 75 scores is about 3.2. This tells us that 90% of all the means of 75 stress scores are at most 3.2, and that 10% are at least 3.2.
invNorm(0.90, 3, 1.15/√75) = 3.2
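Parts a and b can also be reproduced without a calculator. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` provides `cdf` and `inv_cdf`; these play the roles of normalcdf and invNorm here. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 75
mu, sigma = 3, 1.15  # mean and standard deviation of one stress score

# Sampling distribution of the mean: N(mu, sigma / sqrt(n))
x_bar = NormalDist(mu, sigma / sqrt(n))

p_a = x_bar.cdf(2) - x_bar.cdf(1)  # plays the role of normalcdf(1, 2, 3, 1.15/sqrt(75))
k_b = x_bar.inv_cdf(0.90)          # plays the role of invNorm(0.90, 3, 1.15/sqrt(75))

print(f"P(x_bar < 2) = {p_a:.4f}")
print(f"90th percentile = {k_b:.1f}")
```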
For problems c and d, let ΣX = the sum of the 75 stress scores. Then, ΣX ~ N[(75)(3), (√75)(1.15)]
c. Find P(Σx < 200). Draw the graph.
c. The mean of the sum of 75 stress scores is (75)(3) = 225
The standard deviation of the sum of 75 stress scores is (√75)(1.15) = 9.96
P(Σx < 200) = 0
The probability that the total of 75 scores is less than 200 is about zero.
normalcdf(75, 200, (75)(3), (√75)(1.15))
The smallest total of 75 stress scores is 75, because the smallest single score is one.
d. Find the 90th percentile for the total of 75 stress scores. Draw a graph.
d. Let k = the 90th percentile.
Find k where P(Σx < k) = 0.90.
k = 237.8
The 90th percentile for the sum of 75 scores is about 237.8. This tells us that 90% of all the sums of 75 scores are no more than 237.8 and 10% are no less than 237.8.
invNorm(0.90, (75)(3), (√75)(1.15)) = 237.8
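The same kind of check works for the sum in parts c and d. This is a sketch assuming Python's `statistics.NormalDist`; the only change from the mean is that the distribution is N(nμ, √n·σ). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 75
mu, sigma = 3, 1.15  # mean and standard deviation of one stress score

# Distribution of the sum: N(n * mu, sqrt(n) * sigma)
total = NormalDist(n * mu, sqrt(n) * sigma)

p_c = total.cdf(200) - total.cdf(75)  # smallest possible total is 75
k_d = total.inv_cdf(0.90)

print(f"P(sum < 200) = {p_c:.4f}")
print(f"90th percentile of the sum = {k_d:.1f}")
```

The probability comes out at roughly 0.006, which the text rounds to zero.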
Use the information in [link], but use a sample size of 55 to answer the following questions.
< 7).
Suppose that a market research analyst for a cell phone company conducts a study of their customers who exceed the time allowance included on their basic cell phone contract; the analyst finds that for those people who exceed the time included in their basic contract, the excess time used follows an exponential distribution with a mean of 22 minutes.
Consider a random sample of 80 customers who exceed the time allowance included in their basic cell phone contract.
Let X = the excess time used by one INDIVIDUAL cell phone customer who exceeds his contracted time allowance.
X ∼ Exp$\left(\frac{1}{22}\right)$. From previous chapters, we know that μ = 22 and σ = 22.
Let $\bar{X}$ = the mean excess time used by a sample of n = 80 customers who exceed their contracted time allowance.
$\bar{X} \sim N\left(22, \frac{22}{\sqrt{80}}\right)$ by the central limit theorem for sample means
Find P($\bar{x}$ > 20). Draw the graph.
P($\bar{x}$ > 20) = 0.7919 using normalcdf(20, 1E99, 22, 22/√80)
The probability is 0.7919 that the mean excess time used is more than 20 minutes, for a sample of 80 customers who exceed their contracted time allowance.
1E99 = 10^99 and –1E99 = –10^99. Press the EE key for E. Or just use 10^99 instead of 1E99.
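Find P(x > 20). Remember to use the exponential distribution for an individual: X ~ Exp$\left(\frac{1}{22}\right)$.
P(x > 20) = $e^{-\left(\frac{1}{22}\right)(20)}$ or $e^{-0.04545(20)}$ = 0.4029
P(x > 20) = 0.4029, but P($\bar{x}$ > 20) = 0.7919. The two probabilities differ because the first uses the exponential distribution of an individual value, while the second uses the normal distribution of the sample mean.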
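The contrast between an individual value and a sample mean can be seen side by side in code. This is a sketch assuming Python's standard library; the variable names are illustrative, and the exponential tail probability uses the closed form $e^{-x/\mu}$.

```python
from math import exp, sqrt
from statistics import NormalDist

mean_excess = 22  # minutes; for an exponential distribution sigma equals the mean
n = 80

# Individual customer: exponential distribution, P(x > 20) = e^(-(1/22)(20))
p_individual = exp(-20 / mean_excess)

# Sample mean of 80 customers: normal by the central limit theorem
x_bar = NormalDist(mean_excess, mean_excess / sqrt(n))
p_mean = 1 - x_bar.cdf(20)

print(f"P(x > 20)     = {p_individual:.4f}")
print(f"P(x_bar > 20) = {p_mean:.4f}")
```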
Using the CLT to find percentiles
Find the 95th percentile for the sample mean excess time for samples of 80 customers who exceed their basic contract time allowances. Draw a graph.
Let k = the 95th percentile. Find k where P($\bar{x}$ < k) = 0.95
k = 26.0 using invNorm(0.95, 22, 22/√80) = 26.0
The 95th percentile for the sample mean excess time used is about 26.0 minutes for random samples of 80 customers who exceed their contractual allowed time.
Ninety-five percent of such samples would have means under 26 minutes; only five percent of such samples would have means above 26 minutes.
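A percentile for a sample mean can be wrapped in a small helper. The function name `percentile_of_mean` is hypothetical, and the sketch assumes Python's `statistics.NormalDist.inv_cdf` as a stand-in for invNorm.

```python
from math import sqrt
from statistics import NormalDist


def percentile_of_mean(p, mu, sigma, n):
    """Return k such that P(x_bar < k) = p when x_bar ~ N(mu, sigma / sqrt(n))."""
    return NormalDist(mu, sigma / sqrt(n)).inv_cdf(p)


# Excess-time example: mu = sigma = 22 minutes, n = 80 customers
k = percentile_of_mean(0.95, 22, 22, 80)
print(f"95th percentile of the sample mean = {k:.1f} minutes")
```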
Use the information in [link], but change the sample size to 144.
< 30).
In the United States, someone is sexually assaulted every two minutes, on average, according to a number of studies. Suppose the standard deviation is 0.5 minutes and the sample size is 100.
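$\mu_{\bar{x}} = \mu = 2$ and $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.5}{\sqrt{100}} = 0.05$. Therefore: $\bar{X} \sim N(2, 0.05)$
$\mu_{\Sigma x} = (n)(\mu_x) = 100(2) = 200$ and $\sigma_{\Sigma x} = \sqrt{n}(\sigma_x) = 10(0.5) = 5$. Therefore $\Sigma X \sim N(200, 5)$
P(1.75 < $\bar{x}$ < 1.85) = normalcdf(1.75,1.85,2,0.05) = 0.0013
Using the z-score equation, $z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$, with z = 2 and solving for $\bar{x}$, we have $\bar{x}$ = 2(0.05) + 2 = 2.1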
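The numbers in this solution can be verified in a few lines. This is a sketch assuming Python's `statistics.NormalDist`, with the mean of two minutes, standard deviation of 0.5 minutes, and sample size of 100 taken from the problem statement; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 2, 0.5, 100

se = sigma / sqrt(n)        # standard deviation of the sample mean
x_bar = NormalDist(mu, se)  # distribution of the sample mean

p_between = x_bar.cdf(1.85) - x_bar.cdf(1.75)  # P(1.75 < x_bar < 1.85)
two_sd_above = mu + 2 * se                     # value with z = 2

print(round(se, 2), round(p_between, 4), round(two_sd_above, 2))
```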
Based on data from the National Health Survey, women between the ages of 18 and 24 have an average systolic blood pressure (in mm Hg) of 114.8 with a standard deviation of 13.1. Systolic blood pressure for women between the ages of 18 and 24 follows a normal distribution.
A study was done about violence against prostitutes and the symptoms of the posttraumatic stress that they developed. The age range of the prostitutes was 14 to 61. The mean age was 30.9 years with a standard deviation of nine years.
P($\bar{x}$ < 35) = normalcdf(-E99,35,30.9,1.8) = 0.9886
P($\bar{x}$ > 50) = normalcdf(50,E99,30.9,1.8) ≈ 0. For this sample group, it is almost impossible for the group’s average age to be more than 50. However, it is still possible for an individual in this group to have an age greater than 50.
normalcdf(1600,E99,1514.10,63) = 0.0864
normalcdf(-E99,1595,1514.10,63) = 0.9005. This means that there is a 90% chance that the sum of the ages for the sample group n = 49 is at most 1595.
invNorm(0.95,30.9,1.1) = 32.7. This indicates that, for samples of 65, 95% of the sample mean ages are below 32.7 years.
invNorm(0.90,2008.5,72.56) = 2101.5. This indicates that, for samples of 65, 90% of the sums of ages are below 2,101.5 years.
According to Boeing data, the 757 airliner carries 200 passengers and has doors with a height of 72 inches. Assume for a certain population of men we have a mean height of 69.0 inches and a standard deviation of 2.8 inches.
Normal Approximation to the Binomial
Historically, being able to compute binomial probabilities was one of the most important applications of the central limit theorem. Binomial probabilities with a small value for n (say, 20) were displayed in a table in a book. To calculate the probabilities with large values of n, you had to use the binomial formula, which could be very complicated. Using the normal approximation to the binomial distribution simplified the process. To compute the normal approximation to the binomial distribution, take a simple random sample from a population. You must meet the conditions for a binomial distribution: there is a fixed number n of independent trials, each trial has only two outcomes (success or failure), and each trial has the same probability p of a success.
Recall that if X is the binomial random variable, then X ~ B(n, p). The shape of the binomial distribution needs to be similar to the shape of the normal distribution. To ensure this, the quantities np and nq must both be greater than five (np > 5 and nq > 5; the approximation is better if they are both greater than or equal to 10). Then the binomial can be approximated by the normal distribution with mean μ = np and standard deviation σ = $\sqrt{npq}$. Remember that q = 1 – p. In order to get the best approximation, add 0.5 to x or subtract 0.5 from x (use x + 0.5 or x – 0.5). The number 0.5 is called the continuity correction factor and is used in the following example.
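The conditions and the continuity correction can be put together in a short helper. This is a sketch under stated assumptions: the function name `normal_approximation` and the B(100, 0.5) example are hypothetical, and Python's `statistics.NormalDist` stands in for the calculator.

```python
from math import sqrt
from statistics import NormalDist


def normal_approximation(n, p):
    """Return the approximating normal distribution for X ~ B(n, p).

    Raises ValueError when np > 5 and nq > 5 do not both hold.
    """
    q = 1 - p
    if n * p <= 5 or n * q <= 5:
        raise ValueError("np and nq must both be greater than five")
    return NormalDist(n * p, sqrt(n * p * q))


# Hypothetical example: X ~ B(100, 0.5), so mu = 50 and sigma = 5.
y = normal_approximation(100, 0.5)

# P(X <= 55) includes 55, so the continuity correction adds 0.5.
p_at_most_55 = y.cdf(55 + 0.5)
print(round(p_at_most_55, 4))
```

Raising an error when the conditions fail is one possible design; a gentler version could return a warning instead.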
Suppose in a local Kindergarten through 12th grade (K - 12) school district, 53 percent of the population favor a charter school for grades K through 5. A simple random sample of 300 is surveyed.
Let X = the number that favor a charter school for grades K through 5. X ~ B(n, p) where n = 300 and p = 0.53. Since np > 5 and nq > 5, use the normal approximation to the binomial. The formulas for the mean and standard deviation are μ = np and σ = $\sqrt{npq}$. The mean is 159 and the standard deviation is 8.6447. The random variable for the normal distribution is Y. Y ~ N(159, 8.6447). See The Normal Distribution for help with calculator instructions.
For part a, you include 150 so P(X ≥ 150) has normal approximation P(Y ≥ 149.5) = 0.8641.
normalcdf(149.5,10^99,159,8.6447) = 0.8641
For part b, you include 160 so P(X ≤ 160) has normal approximation P(Y ≤ 160.5) = 0.5689.
normalcdf(0,160.5,159,8.6447) = 0.5689
For part c, you exclude 155 so P(X > 155) has normal approximation P(Y > 155.5) = 0.6572.
normalcdf(155.5,10^99,159,8.6447) = 0.6572
For part d, you exclude 147 so P(X < 147) has normal approximation P(Y < 146.5) = 0.0741.
normalcdf(0,146.5,159,8.6447) = 0.0741
For part e, P(X = 175) has normal approximation P(174.5 < Y < 175.5) = 0.0083.
normalcdf(174.5,175.5,159,8.6447) = 0.0083
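The five normal approximations can be reproduced in one place. This sketch assumes Python's `statistics.NormalDist` in place of normalcdf; the dictionary keys are only labels.

```python
from statistics import NormalDist

# Y ~ N(159, 8.6447) approximates X ~ B(300, 0.53)
y = NormalDist(159, 8.6447)

approx = {
    "P(X >= 150)": 1 - y.cdf(149.5),               # include 150: start at 149.5
    "P(X <= 160)": y.cdf(160.5),                   # include 160: stop at 160.5
    "P(X > 155)": 1 - y.cdf(155.5),                # exclude 155: start at 155.5
    "P(X < 147)": y.cdf(146.5),                    # exclude 147: stop at 146.5
    "P(X = 175)": y.cdf(175.5) - y.cdf(174.5),     # one value: a band of width 1
}

for label, value in approx.items():
    print(f"{label} is approximately {value:.4f}")
```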
Because calculators and computer software let you calculate binomial probabilities for large values of n easily, it is not necessary to use the normal approximation to the binomial distribution, provided that you have access to these technology tools. Most school labs have Microsoft Excel, an example of computer software that calculates binomial probabilities. Many students have access to the TI-83 or 84 series calculators, and they easily calculate probabilities for the binomial distribution. If you type in "binomial probability distribution calculation" in an Internet browser, you can find at least one online calculator for the binomial.
For [link], the probabilities are calculated using the following binomial distribution: (n = 300 and p = 0.53). Compare the binomial and normal distribution answers. See Discrete Random Variables for help with calculator instructions for the binomial.
P(X ≥ 150): 1 - binomialcdf(300,0.53,149) = 0.8641
P(X ≤ 160): binomialcdf(300,0.53,160) = 0.5684
P(X > 155): 1 - binomialcdf(300,0.53,155) = 0.6576
P(X < 147): binomialcdf(300,0.53,146) = 0.0742
P(X = 175): (You use the binomial pdf.) binomialpdf(300,0.53,175) = 0.0083
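Exact binomial probabilities can also be computed directly from the binomial formula. The sketch below assumes Python 3.8 or later for `math.comb`; the helper names `binom_pmf` and `binom_cdf` are hypothetical and are not the calculator functions.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p), from the binomial formula."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summing the formula term by term."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


n, p = 300, 0.53
exact = {
    "P(X >= 150)": 1 - binom_cdf(149, n, p),
    "P(X <= 160)": binom_cdf(160, n, p),
    "P(X > 155)": 1 - binom_cdf(155, n, p),
    "P(X < 147)": binom_cdf(146, n, p),
    "P(X = 175)": binom_pmf(175, n, p),
}

for label, value in exact.items():
    print(f"{label} = {value:.4f}")
```

Summing term by term is fine for n = 300; for much larger n a statistics library would be the more practical choice.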
In a city, 46 percent of the population favor the incumbent, Dawn Morgan, for mayor. A simple random sample of 500 is taken. Using the continuity correction factor, find the probability that at least 250 favor Dawn Morgan for mayor.
Data from the Wall Street Journal.
“National Health and Nutrition Examination Survey.” Centers for Disease Control and Prevention. Available online at http://www.cdc.gov/nchs/nhanes.htm (accessed May 17, 2013).
The central limit theorem can be used to illustrate the law of large numbers. The law of large numbers states that the larger the sample size you take from a population, the closer the sample mean $\bar{x}$ gets to μ.
Use the following information to answer the next ten exercises: A manufacturer produces 25-pound lifting weights. The lowest actual weight is 24 pounds, and the highest is 26 pounds. Each weight is equally likely so the distribution of weights is uniform. A sample of 100 weights is taken.
Draw the graph from [link]
Find the probability that the mean actual weight for the 100 weights is greater than 25.2.
0.0003
Draw the graph from [link]
Find the 90th percentile for the mean weight for the 100 weights.
25.07
Draw the graph from [link]
Draw the graph from [link]
Find the 90th percentile for the total weight of the 100 weights.
2,507.40
Draw the graph from [link]
Use the following information to answer the next five exercises: The length of time a particular smartphone's battery lasts follows an exponential distribution with a mean of ten months. A sample of 64 of these smartphones is taken.
What is the distribution for the length of time one battery lasts?
What is the distribution for the mean length of time 64 batteries last?
$N\left(10, \frac{10}{\sqrt{64}}\right)$
What is the distribution for the total length of time 64 batteries last?
Find the probability that the sample mean is between seven and 11.
0.7799
Find the 80th percentile for the total length of time 64 batteries last.
Find the IQR for the mean amount of time 64 batteries last.
1.69
Find the middle 80% for the total amount of time 64 batteries last.
Use the following information to answer the next eight exercises: A uniform distribution has a minimum of six and a maximum of ten. A sample of 50 is taken.
Find P(Σx > 420).
0.0072
Find the 90th percentile for the sums.
Find the 15th percentile for the sums.
391.54
Find the first quartile for the sums.
Find the third quartile for the sums.
405.51
Find the 80th percentile for the sums.
The attention span of a two-year-old is exponentially distributed with a mean of about eight minutes. Suppose we randomly survey 60 two-year-olds.
= \_\_\_\_\_\_\_\_\_\_\_\_
~ \_\_\_\_\_(\_\_\_\_\_,\_\_\_\_\_)
is not exponential.
The closing stock prices of 35 U.S. semiconductor manufacturers are given as follows.
8.625 30.25 27.625 46.75 32.875 18.25 5 0.125 2.9375 6.875 28.25 24.25 21 1.5 30.25 71 43.5 49.25 2.5625 31 16.5 9.5 18.5 18 9 10.5 16.625 1.25 18 12.87 7 12.875 2.875 60.25 29.25
= _____
= _____
~ _____(_____,____)
Use the following information to answer the next three exercises: Richard’s Furniture Company delivers furniture from 10 A.M. to 2 P.M. continuously and uniformly. We are interested in how long (in hours) past the 10 A.M. start time that individuals wait for their delivery.
X ~ \_\_\_\_\_(\_\_\_\_\_,\_\_\_\_\_)
The average wait time is:
b
Suppose that it is now past noon on a delivery day. The probability that a person must wait at least one and a half more hours is:
Use the following information to answer the next two exercises: The time to wait for a particular rural bus is distributed uniformly from zero to 75 minutes. One hundred riders are randomly sampled to learn how long they waited.
The 90th percentile sample average wait time (in minutes) for a sample of 100 riders is:
b
Would you be surprised, based upon numerical calculations, if the sample average wait time (in minutes) for 100 riders was less than 30 minutes?
Use the following to answer the next two exercises: The cost of unleaded gasoline in the Bay Area once followed an unknown distribution with a mean of $4.59 and a standard deviation of $0.10. Sixteen gas stations from the Bay Area are randomly chosen. We are interested in the average cost of gasoline for the 16 gas stations.
What’s the approximate probability that the average price for 16 gas stations is over $4.69?
a
Find the probability that the average price for 30 gas stations is less than $4.55.
Suppose in a local Kindergarten through 12th grade (K - 12) school district, 53 percent of the population favor a charter school for grades K through five. A simple random sample of 300 is surveyed. Calculate the following using the normal approximation to the binomial distribution.
If you have access to an appropriate calculator or computer software, try calculating these probabilities using the technology.
Four friends, Janice, Barbara, Kathy and Roberta, decided to carpool together to get to school. Each day the driver would be chosen by randomly selecting one of the four names. They carpool to school for 96 days. Use the normal approximation to the binomial to calculate the following probabilities. Round the standard deviation to four decimal places.
X ~ N(60, 9). Suppose that you form random samples of 25 from this distribution. Let $\bar{X}$ be the random variable of averages. Let ΣX be the random variable of sums. For parts c through f, sketch the graph, shade the region, label and scale the horizontal axis for $\bar{X}$, and find the probability.
on the same graph.
~ _____(_____,_____)
< 60) = _____
< 62) = _____
< 58) = _____
~ N
Suppose that the length of research papers is uniformly distributed from ten to 25 pages. We survey a class in which 55 research papers were turned in to a professor. The 55 research papers are considered a random collection of all papers. We are interested in the average length of the research papers.
= \_\_\_\_\_\_\_\_\_\_\_\_\_\_
~ \_\_\_\_\_(\_\_\_\_\_,\_\_\_\_\_)
Salaries for teachers in a particular elementary school district are normally distributed with a mean of $44,000 and a standard deviation of $6,500. We randomly survey ten teachers from that district.
The average length of a maternity stay in a U.S. hospital is said to be 2.4 days with a standard deviation of 0.9 days. We randomly survey 80 women who recently bore children in a U.S. hospital.
= \_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_\_
~ \_\_\_\_\_(\_\_\_\_\_,\_\_\_\_\_)
For each problem, wherever possible, provide graphs and use the calculator.
NeverReady batteries has engineered a newer, longer lasting AAA battery. The company claims this battery has an average life span of 17 hours with a standard deviation of 0.8 hours. Your statistics class questions this claim. As a class, you randomly select 30 batteries and find that the sample mean life span is 16.7 hours. If the process is working properly, what is the probability of getting a random sample of 30 batteries in which the sample mean lifetime is 16.7 hours or less? Is the company’s claim reasonable?
We have μ = 17, σ = 0.8, $\bar{x}$ = 16.7, and n = 30. To calculate the probability, we use normalcdf(lower, upper, μ, σ/√n) = normalcdf(-E99, 16.7, 17, 0.8/√30) = 0.0200.
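The battery probability can be checked with a left-tail calculation. This sketch assumes Python's `statistics.NormalDist`; because `cdf` already integrates from negative infinity, no stand-in for -E99 is needed. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 17, 0.8, 30  # claimed mean life, standard deviation, sample size

x_bar = NormalDist(mu, sigma / sqrt(n))  # distribution of the sample mean
p_low = x_bar.cdf(16.7)                  # P(x_bar <= 16.7)

print(f"P(x_bar <= 16.7) = {p_low:.4f}")
```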
Men have an average weight of 172 pounds with a standard deviation of 29 pounds.
M&M candies large candy bags have a claimed net weight of 396.9 g. The standard deviation for the weight of the individual candies is 0.017 g. The following table is from a stats experiment conducted by a statistics class.
Red | Orange | Yellow | Brown | Blue | Green |
---|---|---|---|---|---|
0.751 | 0.735 | 0.883 | 0.696 | 0.881 | 0.925 |
0.841 | 0.895 | 0.769 | 0.876 | 0.863 | 0.914 |
0.856 | 0.865 | 0.859 | 0.855 | 0.775 | 0.881 |
0.799 | 0.864 | 0.784 | 0.806 | 0.854 | 0.865 |
0.966 | 0.852 | 0.824 | 0.840 | 0.810 | 0.865 |
0.859 | 0.866 | 0.858 | 0.868 | 0.858 | 1.015 |
0.857 | 0.859 | 0.848 | 0.859 | 0.818 | 0.876 |
0.942 | 0.838 | 0.851 | 0.982 | 0.868 | 0.809 |
0.873 | 0.863 | 0.803 | 0.865 | ||
0.809 | 0.888 | 0.932 | 0.848 | ||
0.890 | 0.925 | 0.842 | 0.940 | ||
0.878 | 0.793 | 0.832 | 0.833 | ||
0.905 | 0.977 | 0.807 | 0.845 | ||
0.850 | 0.841 | 0.852 | |||
0.830 | 0.932 | 0.778 | |||
0.856 | 0.833 | 0.814 | |||
0.842 | 0.881 | 0.791 | |||
0.778 | 0.818 | 0.810 | |||
0.786 | 0.864 | 0.881 | |||
0.853 | 0.825 | ||||
0.864 | 0.855 | ||||
0.873 | 0.942 | ||||
0.880 | 0.825 | ||||
0.882 | 0.869 | ||||
0.931 | 0.912 | ||||
0.887 |
The bag contained 465 candies, and the listed weights in the table came from randomly selected candies. Count the weights.
$\bar{x}$ = 0.862, s = 0.05
= 85.65, Σs = 5.18
normalcdf(396.9,E99,(465)(0.8565),(0.05)(√465)) ≈ 1
The Screw Right Company claims their $\frac{3}{4}$ inch screws are within ±0.23 of the claimed mean diameter of 0.750 inches with a standard deviation of 0.115 inches. The following data were recorded.
0.757 | 0.723 | 0.754 | 0.737 | 0.757 | 0.741 | 0.722 | 0.741 | 0.743 | 0.742 |
0.740 | 0.758 | 0.724 | 0.739 | 0.736 | 0.735 | 0.760 | 0.750 | 0.759 | 0.754 |
0.744 | 0.758 | 0.765 | 0.756 | 0.738 | 0.742 | 0.758 | 0.757 | 0.724 | 0.757 |
0.744 | 0.738 | 0.763 | 0.756 | 0.760 | 0.768 | 0.761 | 0.742 | 0.734 | 0.754 |
0.758 | 0.735 | 0.740 | 0.743 | 0.737 | 0.737 | 0.725 | 0.761 | 0.758 | 0.756 |
The screws were randomly selected from the local home repair store.
Your company has a contract to perform preventive maintenance on thousands of air-conditioners in a large city. Based on service records from previous years, the time that a technician spends servicing a unit averages one hour with a standard deviation of one hour. In the coming week, your company will service a simple random sample of 70 units in the city. You plan to budget an average of 1.1 hours per technician to complete the work. Will this be enough time?
Use normalcdf(-E99, 1.1, 1, 1/√70) = 0.7986. This means that there is an 80% chance that the service time will be less than 1.1 hours. It could be wise to schedule more time since there is an associated 20% chance that the maintenance time will be greater than 1.1 hours.
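The service-time probability follows the same pattern. This is a sketch assuming Python's `statistics.NormalDist`, with the mean of one hour, standard deviation of one hour, and sample of 70 units taken from the problem statement.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1, 1, 70  # hours per unit, and number of units sampled

x_bar = NormalDist(mu, sigma / sqrt(n))  # distribution of the mean service time
p_enough = x_bar.cdf(1.1)                # P(x_bar < 1.1)

print(f"P(mean service time < 1.1 h) = {p_enough:.4f}")
print(f"P(mean service time > 1.1 h) = {1 - p_enough:.4f}")
```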
A typical adult has an average IQ score of 105 with a standard deviation of 20. If 20 randomly selected adults are given an IQ test, what is the probability that the sample mean score will be between 85 and 125 points?
Certain coins have an average weight of 5.201 grams with a standard deviation of 0.065 g. If a vending machine is designed to accept coins whose weights range from 5.111 g to 5.291 g, what is the expected number of rejected coins when 280 randomly selected coins are inserted into the machine?
We assume that the weights of coins are normally distributed in the population. Since we have normalcdf(5.111, 5.291, 5.201, 0.065) ≈ 0.8338, we expect (1 – 0.8338)(280) ≈ 47 coins to be rejected.
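This problem concerns individual coins, so the standard deviation is not divided by √n. The sketch below assumes Python's `statistics.NormalDist` and normally distributed weights, as the solution does; the variable names are illustrative.

```python
from statistics import NormalDist

coin = NormalDist(5.201, 0.065)  # weight of ONE coin, in grams

p_accept = coin.cdf(5.291) - coin.cdf(5.111)  # weight inside the accepted range
expected_rejected = (1 - p_accept) * 280      # expected count among 280 coins

print(f"P(accepted) = {p_accept:.4f}")
print(f"Expected rejected coins = {expected_rejected:.1f}")
```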
Exponential distribution: the mean is $\mu = \frac{1}{m}$ and the standard deviation is $\sigma = \frac{1}{m}$. The probability density function is $f(x) = me^{-mx}$, x ≥ 0 and the cumulative distribution function is $P(X \le x) = 1 - e^{-mx}$.
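Mean: the mean for a sample (denoted by $\bar{x}$) is $\bar{x} = \frac{\text{sum of all values in the sample}}{\text{number of values in the sample}}$, and the mean for a population (denoted by μ) is $\mu = \frac{\text{sum of all values in the population}}{\text{number of values in the population}}$.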
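Normal distribution: a continuous random variable (RV) with pdf $f(x) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(x - \mu)^2}{2\sigma^2}}$, where μ is the mean of the distribution and σ is the standard deviation; notation: X ~ N(μ, σ). If μ = 0 and σ = 1, the RV is called the standard normal distribution.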
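Uniform distribution: the mean is $\mu = \frac{a + b}{2}$ and the standard deviation is $\sigma = \sqrt{\frac{(b - a)^2}{12}}$. The probability density function is $f(x) = \frac{1}{b - a}$ for a < x < b or a ≤ x ≤ b. The cumulative distribution is $P(X \le x) = \frac{x - a}{b - a}$.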
You can also download for free at http://cnx.org/contents/30189442-6998-4686-ac05-ed152b91b9de@21.1