Central limit theorem

165 questions · 25 question types identified

Distribution of sample mean

A question is this type if and only if it asks to state or derive the distribution (including parameters) of a sample mean, possibly requiring CLT.

11
6.7% of questions
Sampling distribution theory

A question is this type if and only if it asks for definitions or explanations of theoretical concepts like statistic, sampling distribution, population, or sampling frame.

9
5.5% of questions
Estimator properties and bias

A question is this type if and only if it asks to prove an estimator is unbiased, find its bias, or compare properties of different estimators.

9
5.5% of questions
Sampling method explanation

A question is this type if and only if it asks to describe, justify, or critique a sampling method (systematic, stratified, quota, simple random, etc.).

8
4.8% of questions
Proportion confidence interval

A question is this type if and only if it requires constructing an approximate confidence interval for a population proportion from sample proportion data.

7
4.2% of questions
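
The interval this type calls for is usually the normal-based one, p̂ ± z·sqrt(p̂(1 − p̂)/n). A minimal sketch follows, assuming that form is the intended method; the function name `proportion_interval` and the data are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, level):
    """Approximate (normal-based) confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Made-up data: 40 successes in a sample of 100, 95% level.
low, high = proportion_interval(40, 100, 0.95)
print(round(low, 3), round(high, 3))
```

The approximation relies on n being large enough for the sample proportion to be roughly normal.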
Hypothesis test for mean

A question is this type if and only if it involves conducting a formal hypothesis test about a population mean using sample data.

6
3.6% of questions
Sample size determination

A question is this type if and only if it asks for the minimum sample size needed to achieve a specified confidence interval width or probability condition.

6
3.6% of questions
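
A common form of this calculation requires the half-width z·σ/sqrt(n) of a known-variance interval to be at most some margin E, which gives n ≥ (z·σ/E)². The sketch below assumes that setting; `min_sample_size` and the numbers are illustrative only. If a question specifies the full width of the interval, E is half of it.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(sigma, margin, level):
    """Smallest n for which z * sigma / sqrt(n) <= margin (known sigma assumed)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    # Round up: n must be a whole number and the condition must still hold.
    return ceil((z * sigma / margin) ** 2)

# Made-up figures: sigma = 10, half-width at most 2, 95% level.
n_required = min_sample_size(10, 2, 0.95)
print(n_required)
```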
Deriving sampling distribution

A question is this type if and only if it requires listing all possible samples and constructing the complete sampling distribution of a statistic from a small finite population.

5
3.0% of questions
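
For a very small population the listing can be done mechanically. The sketch below enumerates every unordered sample drawn without replacement and tabulates the sample mean. Both the sampling scheme and the function name `sampling_distribution_of_mean` are assumptions made for illustration, and a question that samples with replacement would need `itertools.product` instead.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def sampling_distribution_of_mean(population, size):
    """Map each possible sample mean to its probability (equally likely samples)."""
    samples = list(combinations(population, size))  # unordered, without replacement
    counts = Counter(Fraction(sum(s), size) for s in samples)
    return {mean: Fraction(c, len(samples)) for mean, c in sorted(counts.items())}

# Made-up population {1, 2, 3}, samples of size 2.
dist = sampling_distribution_of_mean([1, 2, 3], 2)
for mean, prob in dist.items():
    print(mean, prob)
```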
Finding n from sample mean distribution

A question is this type if and only if it requires finding the sample size n given probability conditions about the sample mean.

5
3.0% of questions
Paired sample confidence interval

A question is this type if and only if it involves constructing or interpreting a confidence interval for the mean difference in paired/matched samples.

4
2.4% of questions
Confidence interval interpretation

A question is this type if and only if it asks for an explanation of what a confidence interval means in context or to comment on a claim using the interval.

2
1.2% of questions
Confidence interval from software output

A question is this type if and only if it provides software output and asks to extract, complete, or interpret confidence interval information from it.

1
0.6% of questions
Assumptions for inference

A question is this type if and only if it asks what assumptions are needed (normality, randomness, independence) to perform a specific inference procedure.

1
0.6% of questions
Known variance confidence intervals

Questions where the population variance or standard deviation is given or assumed known, requiring use of the normal distribution (z-values) for the confidence interval.

0
0.0% of questions
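
With σ known, the interval is x̄ ± z·σ/sqrt(n). A minimal sketch, in which the name `z_interval` and the figures are made up for illustration:

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, sigma, n, level):
    """Confidence interval for the population mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    half_width = z * sigma / sqrt(n)           # z times the standard error
    return sample_mean - half_width, sample_mean + half_width

# Made-up figures: sample mean 50, sigma 10, n = 100, 95% level.
low, high = z_interval(50, 10, 100, 0.95)
print(round(low, 2), round(high, 2))
```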
Unknown variance confidence intervals

Questions where the population variance is unknown and must be estimated from sample data, typically requiring calculation of sample variance or standard deviation before constructing the interval.

0
0.0% of questions
Normal population, known parameters

Questions where the population is stated to be normally distributed (or the variable itself is normal) and both mean and standard deviation are given, requiring direct application of sampling distribution without CLT justification.

0
0.0% of questions
Unknown distribution, CLT applied

Questions where the population distribution is not specified as normal (or is explicitly non-normal like binomial, Poisson, geometric) and the Central Limit Theorem must be invoked to justify the normal approximation for the sample mean.

0
0.0% of questions
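
Under the CLT, for large n the sample mean is approximately N(μ, σ²/n) whatever the population distribution. The sketch below computes a tail probability on that basis; `prob_sample_mean_above` and the figures are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_above(mu, sigma, n, a):
    """Approximate P(sample mean > a) using the CLT normal approximation."""
    xbar = NormalDist(mu, sigma / sqrt(n))  # second argument is the standard error
    return 1 - xbar.cdf(a)

# Made-up figures: population mean 10, sd 2, sample of 100, threshold 10.2.
p = prob_sample_mean_above(10, 2, 100, 10.2)
print(round(p, 4))
```

If the population is itself normal, the same calculation is exact and the CLT is not needed.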
Variance estimation from probability

Questions that work backwards from a given probability about the sample mean to estimate the population variance or standard deviation.

0
0.0% of questions
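
Working backwards means standardising: if P(X̄ < a) = p then (a − μ)/(σ/sqrt(n)) = z_p, so σ = (a − μ)·sqrt(n)/z_p. A sketch under the assumption that X̄ is treated as normal; the function name and the figures are made up.

```python
from math import sqrt
from statistics import NormalDist

def variance_from_probability(mu, n, a, p):
    """Estimate Var(X) given P(sample mean < a) = p, treating the mean as normal."""
    z = NormalDist().inv_cdf(p)  # z is 0 when p = 0.5, in which case sigma is not determined
    sigma = (a - mu) * sqrt(n) / z
    return sigma ** 2

# Made-up figures: mu = 50, n = 25, P(sample mean < 49) = 0.2.
var_estimate = variance_from_probability(50, 25, 49, 0.2)
print(round(var_estimate, 2))
```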
Unbiased estimator from summary statistics

Questions that provide summary statistics (n, Σx, Σx²) and require calculating unbiased estimates of population mean and/or variance using standard formulas.

0
0.0% of questions
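
The standard formulas are x̄ = Σx/n and s² = (Σx² − (Σx)²/n)/(n − 1). A minimal sketch, with a hypothetical function name and made-up totals:

```python
def unbiased_estimates(n, sum_x, sum_x2):
    """Return unbiased estimates (mean, variance) from n, sum of x and sum of x squared."""
    mean = sum_x / n
    # Divide by n - 1 rather than n so that the variance estimate is unbiased.
    variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return mean, variance

# Made-up summary statistics: n = 10, sum of x = 50, sum of x squared = 340.
mean, variance = unbiased_estimates(10, 50, 340)
print(mean, variance)
```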
Unbiased estimator from raw data

Questions that provide raw data values and require calculating unbiased estimates of population mean and/or variance by first computing the necessary summary statistics.

0
0.0% of questions
Justifying CLT for confidence intervals

A question is this sub-type if and only if it asks whether CLT was necessary when constructing a confidence interval, typically because the population distribution is unknown but sample size is large.

0
0.0% of questions
Justifying CLT for sampling distribution

A question is this sub-type if and only if it asks whether CLT was necessary when calculating probabilities involving sample means, typically asking whether normality of the population needed to be assumed given the sample size.

0
0.0% of questions
Justifying CLT for hypothesis testing

A question is this sub-type if and only if it asks whether CLT was necessary (or requires stating assumptions) when performing a hypothesis test about a population mean with unknown population distribution.

0
0.0% of questions
Discrete uniform distribution sample mean

Questions involving the sample mean of observations from a discrete uniform distribution U(n), where the CLT is applied to find probabilities about the sample mean.

0
0.0% of questions
Custom discrete distribution sample mean

Questions involving the sample mean of observations from a given discrete distribution (spinner, die, or other) with specified probabilities, where the CLT is applied using the given mean and variance.

0
0.0% of questions
Unclassified

Questions not yet assigned to a type.

91
55.2% of questions
4 The score on one spin of a 5 -sided spinner is denoted by the random variable \(X\) with probability distribution as shown in the table.
$$\begin{array} { c | c c c c c } x & 0 & 1 & 2 & 3 & 4 \\ \mathrm { P } ( X = x ) & 0.1 & 0.2 & 0.4 & 0.2 & 0.1 \end{array}$$
  1. Show that \(\operatorname { Var } ( X ) = 1.2\).
    The spinner is spun 200 times. The score on each spin is noted and the mean, \(\bar { X }\), of the 200 scores is found.
  2. Given that \(\mathrm { P } ( \bar { X } > a ) = 0.1\), find the value of \(a\).
  3. Explain whether it was necessary to use the Central Limit theorem in your answer to part (b).
  4. Johann has another, similar, spinner. He suspects that it is biased so that the mean score is less than 2 . He spins his spinner 200 times and finds that the mean of the 200 scores is 1.86 . Given that the variance of the score on one spin of this spinner is also 1.2 , test Johann's suspicion at the 5\% significance level.
4 The masses, \(m\) kilograms, of flour in a random sample of 90 sacks of flour are summarised as follows. $$n = 90 \quad \Sigma m = 4509 \quad \Sigma m ^ { 2 } = 225950$$
  1. Find unbiased estimates of the population mean and variance.
  2. Calculate a \(98 \%\) confidence interval for the population mean.
  3. Explain why it was necessary to use the Central Limit theorem in answering part (b).
  4. Find the probability that the confidence interval found in part (b) is wholly above the true value of the population mean.
2 The widths, \(w \mathrm {~cm}\), of a random sample of 150 leaves of a certain kind were measured. The sample mean of \(w\) was found to be 3.12 cm . Using this sample, an approximate \(95 \%\) confidence interval for the population mean of the widths in centimetres was found to be [3.01, 3.23].
  1. Calculate an estimate of the population standard deviation.
  2. Explain whether it was necessary to use the Central Limit theorem in your answer to part (a).
1 A construction company notes the time, \(t\) days, that it takes to build each house of a certain design. The results for a random sample of 60 such houses are summarised as follows. $$\Sigma t = 4820 \quad \Sigma t ^ { 2 } = 392050$$
  1. Calculate a 98\% confidence interval for the population mean time.
  2. Explain why it was necessary to use the Central Limit theorem in part (a).
1 The length of time, in minutes, taken by people to complete a task has mean 53.0 and standard deviation 6.2. Find the probability that the mean time taken to complete the task by a random sample of 50 people is more than 51 minutes.
7 Previous records have shown that the number of cars entering Bampor on any day has mean 352 and variance 121.
  1. Find the probability that the mean number of cars entering Bampor during a random sample of 200 days is more than 354 .
  2. State, with a reason, whether it was necessary to assume that the number of cars entering Bampor on any day has a normal distribution in order to find the probability in part (i).
  3. It is thought that the population mean may recently have changed. The number of cars entering Bampor during the day was recorded for each of a random sample of 50 days and the sample mean was found to be 356 . Assuming that the variance is unchanged, test at the \(5 \%\) significance level whether the population mean is still 352 .
3 The lengths, \(x \mathrm {~mm}\), of a random sample of 150 insects of a certain kind were found. The results are summarised by \(\Sigma x = 7520\) and \(\Sigma x ^ { 2 } = 413540\).
  1. Calculate unbiased estimates of the population mean and variance of the lengths of insects of this kind.
  2. Using the values found in part (i), calculate an estimate of the probability that the mean length of a further random sample of 80 insects of this kind is greater than 53 mm .
4 The lengths, \(x \mathrm {~m}\), of a random sample of 200 balls of string are found and the results are summarised by \(\Sigma x = 2005\) and \(\Sigma x ^ { 2 } = 20175\).
  1. Calculate unbiased estimates of the population mean and variance of the lengths.
  2. Use the values from part (i) to estimate the probability that the mean length of a random sample of 50 balls of string is less than 10 m .
  3. Explain whether or not it was necessary to use the Central Limit theorem in your calculation in part (ii).
7 In the past the weekly profit at a store had mean \(\$ 34600\) and standard deviation \(\$ 4500\). Following a change of ownership, the mean weekly profit for 90 randomly chosen weeks was \(\$ 35400\).
  1. Stating a necessary assumption, test at the \(5 \%\) significance level whether the mean weekly profit has increased.
  2. State, with a reason, whether it was necessary to use the Central Limit theorem in part (i). The mean weekly profit for another random sample of 90 weeks is found and the same test is carried out at the 5\% significance level.
  3. State the probability of a Type I error.
  4. Given that the population mean weekly profit is now \(\$ 36500\), calculate the probability of a Type II error.
5 The score on one throw of a 4 -sided die is denoted by the random variable \(X\) with probability distribution as shown in the table.
$$\begin{array} { c | c c c c } x & 0 & 1 & 2 & 3 \\ \mathrm { P } ( X = x ) & 0.25 & 0.25 & 0.25 & 0.25 \end{array}$$
  1. Show that \(\operatorname { Var } ( X ) = 1.25\). The die is thrown 300 times. The score on each throw is noted and the mean, \(\bar { X }\), of the 300 scores is found.
  2. Use a normal distribution to find \(\mathrm { P } ( \bar { X } < 1.4 )\).
  3. Justify the use of the normal distribution in part (ii).
5 The mass, in kilograms, of rocks in a certain area has mean 14.2 and standard deviation 3.1.
  1. Find the probability that the mean mass of a random sample of 50 of these rocks is less than 14.0 kg .
  2. Explain whether it was necessary to assume that the population of the masses of these rocks is normally distributed.
  3. A geologist suspects that rocks in another area have a mean mass which is less than 14.2 kg . A random sample of 100 rocks in this area has sample mean 13.5 kg . Assuming that the standard deviation for rocks in this area is also 3.1 kg , test at the \(2 \%\) significance level whether the geologist is correct.
3 The length, in centimetres, of a certain type of snake is modelled by the random variable \(X\) with mean 52 and standard deviation 6.1. A random sample of 75 snakes is selected, and the sample mean, \(\bar { X }\), is found.
  1. Find \(\mathrm { P } ( 51 < \bar { X } < 53 )\).
  2. Explain why it was necessary to use the Central Limit theorem in the solution to part (i).
6 The time, in minutes, for Anjan's journey to work on Mondays has mean 38.4 and standard deviation 6.9.
  1. Find the probability that Anjan's mean journey time for a random sample of 30 Mondays is between 38 and 40 minutes.
    Anjan wishes to test whether his mean journey time is different on Tuesdays. He chooses a random sample of 30 Tuesdays and finds that his mean journey time for these 30 Tuesdays is 40.2 minutes. Assume that the standard deviation for his journey time on Tuesdays is 6.9 minutes.
    1. State, with a reason, whether Anjan should use a one-tail or a two-tail test.
    2. Carry out the test at the \(10 \%\) significance level.
    3. Explain whether it was necessary to use the Central Limit theorem in part (b)(ii).
2 The standard deviation of the volume of drink in cans of Koola is 4.8 centilitres. A random sample of 180 cans is taken and the mean volume of drink in these 180 cans is found to be 330.1 centilitres.
  1. Calculate a \(95 \%\) confidence interval for the mean volume of drink in all cans of Koola. Give the end-points of your interval correct to 1 decimal place.
  2. Explain whether it was necessary to use the Central Limit theorem in your answer to part (i).
2 The mean and standard deviation of the time spent by people in a certain library are 29 minutes and 6 minutes respectively.
  1. Find the probability that the mean time spent in the library by a random sample of 120 people is more than 30 minutes.
  2. Explain whether it was necessary to assume that the time spent by people in the library is normally distributed in the solution to part (i).
1 The masses of a certain variety of plums are known to have standard deviation 13.2 g . A random sample of 200 of these plums is taken and the mean mass of the plums in the sample is found to be 62.3 g .
  1. Calculate a \(98 \%\) confidence interval for the population mean mass.
  2. State with a reason whether it was necessary to use the Central Limit theorem in the calculation in part (i).
2 Over a long period of time it is found that the amount of sunshine on any day in a particular town in Spain has mean 6.7 hours and standard deviation 3.1 hours.
  1. Find the probability that the mean amount of sunshine over a random sample of 300 days is between 6.5 and 6.8 hours.
  2. Give a reason why it is not necessary to assume that the daily amount of sunshine is normally distributed in order to carry out the calculation in part (i).
5 The number of hours that Mrs Hughes spends on her business in a week is normally distributed with mean \(\mu\) and standard deviation 4.8. In the past the value of \(\mu\) has been 49.5.
  1. Assuming that \(\mu\) is still equal to 49.5 , find the probability that in a random sample of 40 weeks the mean time spent on her business in a week is more than 50.3 hours. Following a change in her arrangements, Mrs Hughes wishes to test whether \(\mu\) has decreased. She chooses a random sample of 40 weeks and notes that the total number of hours she spent on her business during these weeks is 1920.
  2. (a) Explain why a one-tail test is appropriate.
    (b) Carry out the test at the 6\% significance level.
    (c) Explain whether it was necessary to use the Central Limit theorem in part (ii) (b).
4 The discrete random variable \(H\) takes values 1, 2, 3 and 4. It is given that \(\mathrm { E } ( H ) = 2.5\) and \(\operatorname { Var } ( H ) = 1.25\). The mean of a random sample of 50 observations of \(H\) is denoted by \(\bar { H }\).
Use a suitable approximation to find \(\mathrm { P } ( \bar { H } < 2.6 )\).
1 A random sample of observations of a random variable \(X\) is summarised by $$n = 100 , \quad \Sigma x = 4830.0 , \quad \Sigma x ^ { 2 } = 249\,509.16 .$$
  1. Obtain unbiased estimates of the mean and variance of \(X\).
  2. The sample mean of 100 observations of \(X\) is denoted by \(\bar { X }\). Explain whether you would need any further information about the distribution of \(X\) in order to estimate \(\mathrm { P } ( \bar { X } > 60 )\). [You should not attempt to carry out the calculation.]
1 The continuous random variable \(X\) has probability density function $$f ( x ) = k ( 1 - x ) \quad \text { for } 0 \leqslant x \leqslant 1$$ where \(k\) is a constant.
  1. Show that \(k = 2\). Sketch the graph of the probability density function.
  2. Find \(\mathrm { E } ( X )\) and show that \(\operatorname { Var } ( X ) = \frac { 1 } { 18 }\).
  3. Derive the cumulative distribution function of \(X\). Hence find the probability that \(X\) is greater than the mean.
  4. Verify that the median of \(X\) is \(1 - \frac { 1 } { \sqrt { 2 } }\).
  5. \(\bar { X }\) is the mean of a random sample of 100 observations of \(X\). Write down the approximate distribution of \(\bar { X }\).
1 A manufacturer of fireworks is investigating the lengths of time for which the fireworks burn. For a particular type of firework this length of time, in minutes, is modelled by the random variable \(T\) with probability density function $$\mathrm { f } ( t ) = k t ^ { 3 } ( 2 - t ) \quad \text { for } 0 < t \leqslant 2$$ where \(k\) is a constant.
  1. Show that \(k = \frac { 5 } { 8 }\).
  2. Find the modal time.
  3. Find \(\mathrm { E } ( T )\) and show that \(\operatorname { Var } ( T ) = \frac { 8 } { 63 }\).
  4. A large random sample of \(n\) fireworks of this type is tested. Write down in terms of \(n\) the approximate distribution of \(\bar { T }\), the sample mean time.
  5. For a random sample of 100 such fireworks the times are summarised as follows. $$\Sigma t = 145.2 \quad \Sigma t ^ { 2 } = 223.41$$ Find a 95\% confidence interval for the mean time for this type of firework and hence comment on the appropriateness of the model.
1 In a certain country, any baby born is equally likely to be a boy or a girl, independently for all births. The birthweight of a baby boy is given by the continuous random variable \(X _ { B }\) with probability density function (pdf) \(\mathrm { f } _ { B } ( x )\) and cumulative distribution function (cdf) \(\mathrm { F } _ { B } ( x )\). The birthweight of a baby girl is given by the continuous random variable \(X _ { G }\) with pdf \(\mathrm { f } _ { G } ( x )\) and cdf \(\mathrm { F } _ { G } ( x )\). The continuous random variable \(X\) denotes the birthweight of a baby selected at random.
  1. By considering $$\mathrm { P } ( X \leqslant x ) = \mathrm { P } ( X \leqslant x \mid \text { boy } ) \mathrm { P } ( \text { boy } ) + \mathrm { P } ( X \leqslant x \mid \text { girl } ) \mathrm { P } ( \text { girl } ) ,$$ find the cdf of \(X\) in terms of \(\mathrm { F } _ { B } ( x )\) and \(\mathrm { F } _ { G } ( x )\), and deduce that the pdf of \(X\) is $$\mathrm { f } ( x ) = \frac { 1 } { 2 } \left\{ \mathrm { f } _ { B } ( x ) + \mathrm { f } _ { G } ( x ) \right\} .$$
  2. The birthweights of baby boys and girls have means \(\mu _ { B }\) and \(\mu _ { G }\) respectively. Deduce that $$\mathrm { E } ( X ) = \frac { 1 } { 2 } \left( \mu _ { B } + \mu _ { G } \right) .$$
  3. The birthweights of baby boys and girls have common variance \(\sigma ^ { 2 }\). Find an expression for \(\mathrm { E } \left( X ^ { 2 } \right)\) in terms of \(\mu _ { B } , \mu _ { G }\) and \(\sigma ^ { 2 }\), and deduce that $$\operatorname { Var } ( X ) = \sigma ^ { 2 } + \frac { 1 } { 4 } \left( \mu _ { B } - \mu _ { G } \right) ^ { 2 } .$$
  4. A random sample of size \(2 n\) is taken from all the babies born in a certain period. The mean birthweight of the babies in this sample is \(\bar { X }\). Write down an approximation to the sampling distribution of \(\bar { X }\) if \(n\) is large.
  5. Suppose instead that a stratified sample of size \(2 n\) is taken by selecting \(n\) baby boys at random and, independently, \(n\) baby girls at random. The mean birthweight of the \(2 n\) babies in this sample is \(\bar { X } _ { s t }\). Write down the expected value of \(\bar { X } _ { s t }\) and find the variance of \(\bar { X } _ { s t }\).
  6. Deduce that both \(\bar { X }\) and \(\bar { X } _ { s t }\) are unbiased estimators of the population mean birthweight. Find which is the more efficient.
6 Gordon is a cricketer. Over a long period he knows that his population mean score, in number of runs per innings, is 28 , and the population standard deviation is 12 . In a new season he adopts a different batting style and he finds that in 30 innings using this style his mean score is 28.98 .
  1. Stating a necessary assumption, test at the \(5 \%\) significance level whether his population mean score has increased.
  2. Explain whether it was necessary to use the Central Limit Theorem in part (i).
6 The continuous random variable \(R\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\). The results of 100 observations of \(R\) are summarised by $$\Sigma r = 3360.0 , \quad \Sigma r ^ { 2 } = 115782.84 .$$
  1. Calculate an unbiased estimate of \(\mu\) and an unbiased estimate of \(\sigma ^ { 2 }\).
  2. The mean of 9 observations of \(R\) is denoted by \(\bar { R }\). Calculate an estimate of \(\mathrm { P } ( \bar { R } > 32.0 )\).
  3. Explain whether you need to use the Central Limit Theorem in your answer to part (ii).
7 The continuous random variable \(X\) has probability density function given by $$f ( x ) = \begin{cases} \frac { 2 } { 9 } x ( 3 - x ) & 0 \leqslant x \leqslant 3 , \\ 0 & \text { otherwise } . \end{cases}$$
  1. Find the variance of \(X\).
  2. Show that the probability that a single observation of \(X\) lies between 0.0 and 0.5 is \(\frac { 2 } { 27 }\).
  3. 108 observations of \(X\) are obtained. Using a suitable approximation, find the probability that at least 10 of the observations lie between 0.0 and 0.5 .
  4. The mean of 108 observations of \(X\) is denoted by \(\bar { X }\). Write down the approximate distribution of \(\bar { X }\), giving the value(s) of any parameter(s).
2
  1. For the continuous random variable \(V\), it is known that \(\mathrm { E } ( V ) = 72.0\). The mean of a random sample of 40 observations of \(V\) is denoted by \(\bar { V }\). Given that \(\mathrm { P } ( \bar { V } < 71.2 ) = 0.35\), estimate the value of \(\operatorname { Var } ( V )\).
  2. Explain why you need to use the Central Limit Theorem in part (i), and why its use is justified.
4 Part of an ecological study involved measuring the heights of trees in a young forest. In order to obtain an estimate of the mean height of all the trees in the forest, a random sample of 70 trees was selected and their heights measured. These heights, \(x\) metres, are summarised by \(\Sigma x = 246.6\) and \(\Sigma x ^ { 2 } = 1183.65\). The mean height of all trees in the forest is denoted by \(\mu\) metres.
  1. Calculate a symmetric \(90 \%\) confidence interval for \(\mu\).
  2. A student was asked to interpret the interval and said,
    "If 100 independent \(90 \%\) confidence intervals were calculated then 90 of them would contain \(\mu\)." Explain briefly what is wrong with this statement.
  3. Four independent \(90 \%\) confidence intervals for \(\mu\) are obtained. Calculate the probability that at least three of the intervals contain \(\mu\).
4 A new computer was bought by a local council to search council records and was tested by an employee. She searched a random sample of 500 records and the sample mean search time was found to be 2.18 milliseconds and an unbiased estimate of variance was \(1.58 ^ { 2 }\) milliseconds \({ } ^ { 2 }\).
  1. Calculate a \(98 \%\) confidence interval for the population mean search time \(\mu\) milliseconds.
  2. It is required to obtain a sample mean time that differs from \(\mu\) by less than 0.05 milliseconds with probability 0.95 . Estimate the sample size required.
  3. State why it is unnecessary for the validity of your calculations that search time has a normal distribution.
1 The continuous random variable \(X\) has the distribution \(\mathrm { N } ( \mu , 30 )\). The mean of a random sample of 8 observations of \(X\) is 53.1. Determine a \(95 \%\) confidence interval for \(\mu\). You should give the end points of the interval correct to 4 significant figures.
4 A random sample of 160 observations of a random variable \(X\) is selected. The sample can be summarised as follows.
\(n = 160 \quad \sum x = 2688 \quad \sum x ^ { 2 } = 48398\)
  1. Calculate unbiased estimates of the following.
    1. \(\mathrm { E } ( X )\)
    2. \(\operatorname { Var } ( X )\)
  2. Find a 99\% confidence interval for \(\mathrm { E } ( X )\), giving the end-points of the interval correct to 4 significant figures.
  3. Explain whether it was necessary to use the Central Limit Theorem in answering
    1. part (a),
    2. part (b).
  1. A fair six-sided die is labelled with the numbers \(1,2,3,4,5\) and 6. The die is rolled 40 times and the score, \(S\), for each roll is recorded.
    1. Find the mean and the variance of \(S\).
    2. Find an approximation for the probability that the mean of the 40 scores is less than 3.
    3. A factory produces steel sheets whose weights, \(X \mathrm {~kg}\), are such that \(X \sim \mathrm {~N} \left( \mu , \sigma ^ { 2 } \right)\).
    A random sample of these sheets is taken and a \(95 \%\) confidence interval for \(\mu\) is found to be (29.74, 31.86).
  2. Find, to 2 decimal places, the standard error of the mean.
  3. Hence, or otherwise, find a \(90 \%\) confidence interval for \(\mu\) based on the same sample of sheets. Using four different random samples, four \(90 \%\) confidence intervals for \(\mu\) are to be found.
  4. Calculate the probability that at least 3 of these intervals will contain \(\mu\).
8. A six-sided die is labelled with the numbers \(1,2,3,4,5\) and 6. A group of 50 students want to test whether or not the die is fair for the number six.
The 50 students each roll the die 30 times and record the number of sixes they each obtain.
Given that \(\bar { X }\) denotes the mean number of sixes obtained by the 50 students, and using $$\mathrm { H } _ { 0 } : p = \frac { 1 } { 6 } \text { and } \mathrm { H } _ { 1 } : p \neq \frac { 1 } { 6 }$$ where \(p\) is the probability of rolling a 6 ,
  1. use the Central Limit Theorem to find an approximate distribution for \(\bar { X }\), if \(\mathrm { H } _ { 0 }\) is true.
  2. Hence find, in terms of \(\bar { X }\), the critical region for this test. Use a \(5 \%\) level of significance.
6. A company produces a certain type of mug. The masses of these mugs are normally distributed with mean \(\mu\) and standard deviation 1.2 grams. A random sample of 5 mugs is taken and the mass, in grams, of each mug is measured. The results are given below. $$\begin{array} { l l l l l } 229.1 & 229.6 & 230.9 & 231.2 & 231.7 \end{array}$$
  1. Find a \(95 \%\) confidence interval for \(\mu\), giving your limits correct to 1 decimal place. Sonia plans to take 20 random samples, each of 5 mugs. A 95\% confidence interval for \(\mu\) is to be determined for each sample.
  2. Find the probability that more than 3 of these intervals will not contain \(\mu\).
6. The continuous random variable \(Y\) is uniformly distributed over the interval $$[ a - 3 , a + 6 ]$$ where \(a\) is a constant. A random sample of 60 observations of \(Y\) is taken.
Given that \(\bar { Y } = \frac { \sum _ { i = 1 } ^ { 60 } Y _ { i } } { 60 }\)
  1. use the Central Limit Theorem to find an approximate distribution for \(\bar { Y }\). Given that the 60 observations of \(Y\) have a sample mean of 13.4,
  2. find a \(98 \%\) confidence interval for the maximum value that \(Y\) can take.
3. A woodwork teacher measures the width, \(w \mathrm {~mm}\), of a board. The measured width, \(X \mathrm {~mm}\), is normally distributed with mean \(w \mathrm {~mm}\) and standard deviation 0.5 mm .
  1. Find the probability that \(X\) is within 0.6 mm of \(w\). The same board is measured 16 times and the results are recorded.
  2. Find the probability that the mean of these results is within 0.3 mm of \(w\). Given that the mean of these 16 measurements is 35.6 mm ,
  3. find a 98\% confidence interval for \(w\).
4. Kylie regularly travels from home to visit a friend. On 10 randomly selected occasions the journey time \(x\) minutes was recorded. The results are summarised as follows. $$\Sigma x = 753 , \quad \Sigma x ^ { 2 } = 57455 .$$
  1. Calculate unbiased estimates of the mean and the variance of the population of journey times. After many journeys, a random sample of 100 journeys gave a mean of 74.8 minutes and a variance of 84.6 minutes \({ } ^ { 2 }\).
  2. Calculate a 95\% confidence interval for the mean of the population of journey times.
  3. Write down two assumptions you made in part (b).
6. A computer company repairs large numbers of PCs and wants to estimate the mean time to repair a particular fault. Five repairs are chosen at random from the company's records and the times taken, in seconds, are $$\begin{array} { l l l l l } 205 & 310 & 405 & 195 & 320 \end{array} .$$
  1. Calculate unbiased estimates of the mean and the variance of the population of repair times from which this sample has been taken. It is known from previous results that the standard deviation of the repair time for this fault is 100 seconds. The company manager wants to ensure that there is a probability of at least 0.95 that the estimate of the population mean lies within 20 seconds of its true value.
  2. Find the minimum sample size required.
  1. Some biologists were studying a large group of wading birds. A random sample of 36 were measured and the wing length, \(x \mathrm {~mm}\) of each wading bird was recorded. The results are summarised as follows
$$\sum x = 6046 \quad \sum x ^ { 2 } = 1016338$$
  1. Calculate unbiased estimates of the mean and the variance of the wing lengths of these birds. Given that the standard deviation of the wing lengths of this particular type of bird is actually 5.1 mm ,
  2. find a \(99 \%\) confidence interval for the mean wing length of the birds from this group.
  1. A sociologist is studying how much junk food teenagers eat. A random sample of 100 female teenagers and an independent random sample of 200 male teenagers were asked to estimate what their weekly expenditure on junk food was. The results are summarised below.
$$\begin{array} { l c c c } & n & \text { mean } & \text { s.d. } \\ \text { Female teenagers } & 100 & \pounds 5.48 & \pounds 3.62 \\ \text { Male teenagers } & 200 & \pounds 6.86 & \pounds 4.51 \end{array}$$
  1. Using a 5\% significance level, test whether or not there is a difference in the mean amounts spent on junk food by male teenagers and female teenagers. State your hypotheses clearly.
  2. Explain briefly the importance of the central limit theorem in this problem.
4. A sample of size 8 is to be taken from a population that is normally distributed with mean 55 and standard deviation 3. Find the probability that the sample mean will be greater than 57.
3. A woodwork teacher measures the width, \(w \mathrm {~mm}\), of a board. The measured width, \(X \mathrm {~mm}\), is normally distributed with mean \(w \mathrm {~mm}\) and standard deviation 0.5 mm .
  1. Find the probability that \(X\) is within 0.6 mm of \(w\).
The same board is measured 16 times and the results are recorded.
  2. Find the probability that the mean of these results is within 0.3 mm of \(w\).
Given that the mean of these 16 measurements is 35.6 mm,
  3. find a \(98 \%\) confidence interval for \(w\).
3. (a) Explain what you understand by the Central Limit Theorem.
A garage services hire cars on behalf of a hire company. The garage knows that the lifetime of the brake pads has a standard deviation of 5000 miles. The garage records the lifetimes, \(x\) miles, of the brake pads it has replaced. The garage takes a random sample of 100 brake pads and finds that \(\sum x = 1740000\).
(b) Find a 95\% confidence interval for the mean lifetime of a brake pad.
(c) Explain the relevance of the Central Limit Theorem in part (b).
Brake pads are made to be changed every 20000 miles on average. The hire car company complain that the garage is changing the brake pads too soon.
(d) Comment on the hire company's complaint. Give a reason for your answer.
6. The continuous random variable \(X\) is uniformly distributed over the interval $$[ a - 1 , a + 5 ]$$ where \(a\) is a constant.
Fifty observations of \(X\) are taken, giving a sample mean of 17.2
  1. Use the Central Limit Theorem to find an approximate distribution for \(\bar { X }\).
  2. Hence find a 95\% confidence interval for \(a\).
  1. Lambs are born in a shed on Mill Farm. The birth weights, \(x \mathrm {~kg}\), of a random sample of 8 newborn lambs are given below.
$$\begin{array} { l l l l l l l l } 4.12 & 5.12 & 4.84 & 4.65 & 3.55 & 3.65 & 3.96 & 3.40 \end{array}$$
  1. Calculate unbiased estimates of the mean and variance of the birth weight of lambs born on Mill Farm.
A further random sample of 32 lambs is chosen and the unbiased estimates of the mean and variance of the birth weight of lambs from this sample are 4.55 and 0.25 respectively.
  2. Treating the combined sample of 40 lambs as a single sample, estimate the standard error of the mean.
The owner of Mill Farm researches the breed of lamb and discovers that the population of birth weights is normally distributed with standard deviation 0.67 kg.
  3. Calculate a \(95 \%\) confidence interval for the mean birth weight of this breed of lamb using your combined sample mean.
3. A nursery has 16 staff and 40 children on its records. In preparation for an outing the manager needs an estimate of the mean weight of the people on its records and decides to take a stratified sample of size 14.
  1. Describe how this stratified sample should be taken.
The weights, \(x \mathrm {~kg}\), of each of the 14 people selected are summarised as $$\sum x = 437 \text { and } \sum x ^ { 2 } = 26983$$
  2. Find unbiased estimates of the mean and the variance of the weights of all the people on the nursery's records.
  3. Estimate the standard error of the mean.
The estimates of the standard error of the mean for the staff and for the children are 5.11 and 1.10 respectively.
  4. Comment on these values with reference to your answer to part (c) and give a reason for any differences.
7. A restaurant states that its hamburgers contain \(20 \%\) fat. Paul claims that the mean fat content of their hamburgers is less than \(20 \%\). Paul takes a random sample of 50 hamburgers from the restaurant and finds that they contain a mean fat content of 19.5\% with a standard deviation of 1.5\%. You may assume that the fat content of hamburgers is normally distributed.
  1. Find the \(90 \%\) confidence interval for the mean fat content of hamburgers from the restaurant.
  2. State, with a reason, what action Paul should recommend the restaurant takes over the stated fat content of their hamburgers.
The restaurant changes the mean fat content of their hamburgers to \(\mu \%\) and adjusts the standard deviation to \(2 \%\). Paul takes a sample of size \(n\) from this new batch of hamburgers. He uses the sample mean \(\bar { X }\) as an estimator of \(\mu\).
  3. Find the minimum value of \(n\) such that \(\mathrm { P } ( | \bar { X } - \mu | < 0.5 ) \geqslant 0.9\)
4 The time, \(x\) seconds, spent by each of a random sample of 100 customers at an automatic teller machine (ATM) is recorded. The times are summarised in the table.
| Time (seconds) | Number of customers |
|---|---|
| \(20 < x \leqslant 30\) | 2 |
| \(30 < x \leqslant 40\) | 7 |
| \(40 < x \leqslant 60\) | 18 |
| \(60 < x \leqslant 80\) | 27 |
| \(80 < x \leqslant 100\) | 23 |
| \(100 < x \leqslant 120\) | 13 |
| \(120 < x \leqslant 150\) | 7 |
| \(150 < x \leqslant 180\) | 3 |
| Total | 100 |
  1. Calculate estimates for the mean and standard deviation of the time spent at the ATM by a customer.
  2. The mean time spent at the ATM by a random sample of \(\mathbf { 3 6 }\) customers is denoted by \(\bar { Y }\).
    1. State why the distribution of \(\bar { Y }\) is approximately normal.
    2. Write down estimated values for the mean and standard error of \(\bar { Y }\).
    3. Hence estimate the probability that \(\bar { Y }\) is less than \(1 \frac { 1 } { 2 }\) minutes.
5 The times taken by new recruits to complete an assault course may be modelled by a normal distribution with a standard deviation of 8 minutes. A group of 30 new recruits takes a total time of 1620 minutes to complete the course.
  1. Calculate the mean time taken by these 30 new recruits.
  2. Assuming that the 30 recruits may be considered to be a random sample, construct a \(98 \%\) confidence interval for the mean time taken by new recruits to complete the course.
  3. Construct an interval within which approximately \(98 \%\) of the times taken by individual new recruits to complete the course will lie.
  4. State where, if at all, in this question you made use of the Central Limit Theorem.
7 A random sample of 50 full-time university employees was selected as part of a higher education salary survey. The annual salary in thousands of pounds, \(x\), of each employee was recorded, with the following summarised results. $$\sum x = 2290.0 \quad \text { and } \quad \sum ( x - \bar { x } ) ^ { 2 } = 28225.50$$ Also recorded was the fact that 6 of the 50 salaries exceeded \(\pounds 60000\).
    1. Calculate values for \(\bar { x }\) and \(s\), where \(s ^ { 2 }\) denotes the unbiased estimate of \(\sigma ^ { 2 }\).
    2. Hence show why the annual salary, \(X\), of a full-time university employee is unlikely to be normally distributed. Give numerical support for your answer.
    1. Indicate why the mean annual salary, \(\bar { X }\), of a random sample of 50 full-time university employees may be assumed to be normally distributed.
    2. Hence construct a \(99 \%\) confidence interval for the mean annual salary of full-time university employees.
  1. It is claimed that the annual salaries of full-time university employees have an average which exceeds \(\pounds 55000\) and that more than \(25 \%\) of such salaries exceed £60000. Comment on each of these two claims.
6
  1. The length of one-metre galvanised-steel straps used in house building may be modelled by a normal distribution with a mean of 1005 mm and a standard deviation of 15 mm . The straps are supplied to house builders in packs of 12, and the straps in a pack may be assumed to be a random sample. Determine the probability that the mean length of straps in a pack is less than one metre.
  2. Tania, a purchasing officer for a nationwide house builder, measures the thickness, \(x\) millimetres, of each of a random sample of 24 galvanised-steel straps supplied by a manufacturer. She then calculates correctly that the value of \(\bar { x }\) is 4.65 mm .
    1. Assuming that the thickness, \(X \mathrm {~mm}\), of such a strap may be modelled by the distribution \(\mathrm { N } \left( \mu , 0.15 ^ { 2 } \right)\), construct a \(99 \%\) confidence interval for \(\mu\).
    2. Hence comment on the manufacturer's specification that the mean thickness of such straps is greater than 4.5 mm .
3
  1. A sample of 50 washed baking potatoes was selected at random from a large batch.
    The weights of the 50 potatoes were found to have a mean of 234 grams and a standard deviation of 25.1 grams. Construct a \(95 \%\) confidence interval for the mean weight of potatoes in the batch.
    (4 marks)
  2. The batch of potatoes is purchased by a market stallholder. He sells them to his customers by allowing them to choose any 5 potatoes for \(\pounds 1\). Give a reason why such chosen potatoes are unlikely to represent a random sample from the batch.
6
  1. The time taken, in minutes, by Domesat to install a domestic satellite system may be modelled by a normal distribution with unknown mean, \(\mu\), and standard deviation 15 . The times taken, in minutes, for a random sample of 10 installations are as follows.
    \(\begin{array} { l l l l l l l l l l } 47 & 39 & 25 & 51 & 47 & 36 & 63 & 41 & 78 & 43 \end{array}\)
    Construct a \(98 \%\) confidence interval for \(\mu\).
  2. The time taken, \(Y\) minutes, by Teleair to erect a TV aerial and then connect it to a TV is known to have a mean of 108 and a standard deviation of 28. Estimate the probability that the mean of a random sample of 40 observations of \(Y\) is more than 120 .
  3. Indicate, with a reason, where, if at all, in this question you made use of the Central Limit Theorem.
    (2 marks)
7 The volume of bleach in a 5-litre bottle may be modelled by a random variable with a standard deviation of 75 millilitres. The volume, in litres, of bleach in each of a random sample of 36 such bottles was measured. The 36 measurements resulted in a total volume of 181.80 litres and exactly 8 bottles contained less than 5 litres.
  1. Construct a 98\% confidence interval for the mean volume of bleach in a 5-litre bottle.
  2. It is claimed that the mean volume of bleach in a 5-litre bottle exceeds 5 litres and also that fewer than 10 per cent of such bottles contain less than 5 litres. Comment, with numerical justification, on each of these two claims.
  3. State, with justification, whether you made use of the Central Limit Theorem in constructing the confidence interval in part (a).
6 The weight, \(X\) kilograms, of sand in a bag can be modelled by a normal random variable with unknown mean \(\mu\) and known standard deviation 0.4 .
  1. The sand in each of a random sample of 25 bags from a large batch is weighed. The total weight of sand in these 25 bags is found to be 497.5 kg .
    1. Construct a 98\% confidence interval for the mean weight of sand in bags in the batch.
    2. Hence comment on the claim that bags in the batch contain an average of 20 kg of sand.
    3. State why use of the Central Limit Theorem is not required in answering part (a)(i).
  2. The weight, \(Y\) kilograms, of cement in a bag can be modelled by a normal random variable with mean 25.25 and standard deviation 0.35. A firm purchases 10 such bags. These bags may be considered to be a random sample.
    1. Determine the probability that the mean weight of cement in the 10 bags is less than 25 kg .
    2. Calculate the probability that the weight of cement in each of the 10 bags is more than 25 kg .
2 The number of emergency calls received by a fire station may be modelled by a Poisson distribution. During a given period of 13 weeks, the station received a total of 108 emergency calls.
  1. Construct an approximate \(98 \%\) confidence interval for the average weekly number of emergency calls received by the station.
  2. Hence comment on the station officer's claim that the station receives an average of one emergency call per day.
    (2 marks)
7. (a) Briefly state the central limit theorem.
A student throws ten dice and records the number of sixes showing. The dice are fair, numbered 1 to 6 on the faces.
(b) Write down the distribution of the number of sixes obtained when the ten dice are thrown.
(c) Find the mean and variance of this distribution.
The student throws the ten dice 100 times, recording the number of sixes showing each time.
(d) Find the probability that the mean number of sixes obtained is more than 1.8
7. A telephone company believes that, for young people, the average length of a telephone call on a land line is longer than on a mobile, due to the difference in price. The company collected data on the time, \(t\) minutes, of 500 calls made by young people on mobiles and the data is summarised by $$\Sigma t = 7335 , \quad \Sigma t ^ { 2 } = 172040 .$$
  1. Calculate unbiased estimates of the mean and variance of \(t\).
For 200 calls made on land lines by the same young people, unbiased estimates of the mean and variance of the call length were 15.9 minutes and 108.5 minutes\({ } ^ { 2 }\) respectively.
  2. Stating your hypotheses clearly, test at the \(5 \%\) level whether or not there is evidence that longer calls are made on land lines than on mobiles.
    (9 marks)
  3. Explain the importance of the central limit theorem in carrying out the test in part (b).
1 It is known that the red blood cell count of adults in a particular country, measured in suitable units, has mean 4.96 and variance 0.15.
  1. Find the probability that the mean red blood cell count of a random sample of 50 adults from this country is at least 5.00.
  2. Explain how you can find the probability in part (a) despite the fact that you do not know the distribution of red blood cell counts.
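A short sketch of the Central Limit Theorem approximation the red blood cell question relies on, under the assumption that a sample of 50 is large enough for the sample mean to be treated as approximately normal with mean 4.96 and variance \(0.15 / 50\). The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, var, n = 4.96, 0.15, 50  # population mean, population variance, sample size

# CLT approximation: sample mean ~ N(mu, var / n), whatever the parent distribution.
standard_error = math.sqrt(var / n)
approx_dist = NormalDist(mu, standard_error)

# P(sample mean >= 5.00); for a continuous approximation >= and > coincide.
p = 1 - approx_dist.cdf(5.00)

print(round(p, 3))
```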
6 The table below shows the mean and variance of the test scores of a random sample of 70 girls who are starting an A level Mathematics course.
| Sample mean | Sample variance |
|---|---|
| 118.86 | 86.57 |
  1. Showing your working, find a \(95 \%\) confidence interval for the population mean.
  2. Explain why you can construct the interval in part (i) despite no information about the distribution of the parent population being given.
  3. The same random sample of girls repeats the test. The mean improvement in score is 0.9 . The \(95 \%\) confidence interval for the improvement is \([ - 1.5,3.3 ]\). What is the sample variance for the improvement in score?
1 When babies are born, their head circumferences are measured. A random sample of 50 newborn female babies is selected. The sample mean head circumference is 34.711 cm . The sample standard deviation head circumference is 1.530 cm .
  1. Determine a 95\% confidence interval for the population mean head circumference of newborn female babies.
  2. Explain why you can calculate this interval even though the distribution of the population of head circumferences of newborn female babies is unknown.
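The interval asked for in the head circumference question can be sketched as below. This assumes the quoted sample standard deviation of 1.530 cm is used directly as the estimate of the population standard deviation, and that with 50 observations a normal critical value is acceptable; both are assumptions of this sketch, not statements from the question.

```python
import math
from statistics import NormalDist

n = 50
sample_mean = 34.711  # cm
sample_sd = 1.530     # cm, assumed to estimate the population s.d.
confidence = 0.95

# Two-sided normal critical value, about 1.96 for 95% confidence.
z = NormalDist().inv_cdf((1 + confidence) / 2)

# Half-width of the interval: z * s / sqrt(n).
half_width = z * sample_sd / math.sqrt(n)
lower, upper = sample_mean - half_width, sample_mean + half_width

print(round(lower, 3), round(upper, 3))
```

The same pattern applies to the other large-sample intervals in this list by changing `n`, `sample_mean`, `sample_sd` and `confidence`.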
4 David, a zoologist, is investigating a particular species of monitor lizard. He measures the lengths, in centimetres, of a random sample of this particular species of lizard. His measured lengths are $$\begin{array} { l l l l l l l l l l } 53.2 & 57.8 & 55.3 & 58.9 & 59.0 & 60.2 & 61.8 & 62.3 & 65.4 & 66.5 \end{array}$$ The lengths may be assumed to be normally distributed.
David correctly constructed a 90\% confidence interval for the mean length of lizard using the measured lengths given and the formula \(\bar { x } \pm \left( b \times \frac { s } { \sqrt { n } } \right)\) This interval had limits of 57.63 and 62.45, correct to two decimal places.
  1. State the value for \(b\) used in David's formula.
  2. David interprets his interval and states,
    "My confidence interval indicates that exactly 90\% of the population of lizard lengths for this particular species lies between 57.63 cm and 62.45 cm".
    Do you think David's statement is true? Explain your reasoning.
  3. David's assistant, Amina, correctly constructs a \(\beta \%\) confidence interval from David's random sample of measured lengths. Amina informs David that the width of her confidence interval is 8.54.
    Find the value of \(\beta\).
    [3 marks]
  1. A biased spinner can land on the numbers \(1,2,3,4\) or 5 with the following probabilities.
| Number on spinner | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Probability | 0.3 | 0.1 | 0.2 | 0.1 | 0.3 |
The spinner will be spun 80 times and the mean of the numbers it lands on will be calculated. Find an estimate of the probability that this mean will be greater than 3.25
(6)
  1. A six-sided die has sides labelled \(1,2,3,4,5\) and 6
The random variable \(S\) represents the score when the die is rolled.
Alicia rolls the die 45 times and the mean score, \(\bar { S }\), is calculated.
Assuming the die is fair and using a suitable approximation,
  1. find, to 3 significant figures, the value of \(k\) such that \(\mathrm { P } ( \bar { S } < k ) = 0.05\)
  2. Explain the relevance of the Central Limit Theorem in part (a).
    Alicia considers the following hypotheses:
    \(\mathrm { H } _ { 0 }\) : The die is fair
    \(\mathrm { H } _ { 1 }\) : The die is not fair
    If \(\bar { S } < 3.1\) or \(\bar { S } > 3.9\), then \(\mathrm { H } _ { 0 }\) will be rejected.
    Given that the true distribution of \(S\) has mean 4 and variance 3
  3. find the power of this test.
  4. Describe what would happen to the power of this test if Alicia were to increase the number of rolls of the die.
    Give a reason for your answer.
  1. A courier delivers parcels. The random variable \(X\) represents the number of parcels delivered successfully each day by the courier where \(X \sim \mathrm {~B} ( 400,0.64 )\)
A random sample \(X _ { 1 } , X _ { 2 } , \ldots X _ { 100 }\) is taken.
Estimate the probability that the mean number of parcels delivered each day by the courier is greater than 257
  1. A random sample of 150 observations is taken from a geometric distribution with parameter 0.3
Estimate the probability that the mean of the sample is less than 3.45
  1. There are 32 students in a class.
Each student rolls a fair die repeatedly, stopping when their total number of sixes is 4. Each student records the total number of times they rolled the die. Estimate the probability that the mean number of rolls for the class is less than 27.2
  1. A random sample of 100 observations is taken from a Poisson distribution with mean 2.3
Estimate the probability that the mean of the sample is greater than 2.5
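A hedged sketch of the estimate requested in the Poisson question above. It uses the fact that a Poisson distribution has variance equal to its mean, and assumes 100 observations is enough for the Central Limit Theorem approximation; no continuity correction is applied in this sketch.

```python
import math
from statistics import NormalDist

lam, n = 2.3, 100  # Poisson mean (also its variance) and sample size

# CLT approximation: sample mean ~ N(lam, lam / n).
approx_dist = NormalDist(lam, math.sqrt(lam / n))

# P(sample mean > 2.5) under the normal approximation.
p = 1 - approx_dist.cdf(2.5)

print(round(p, 3))
```

The binomial and geometric questions just before this one follow the same pattern once the mean and variance of a single observation are substituted.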
1 A machine is set to fill pots with yoghurt such that the mean weight of yoghurt in a pot is 505 grams. To check that the machine is working properly, a random sample of 8 pots is selected. The weight of yoghurt, in grams, in each pot is as follows $$\begin{array} { l l l l l l l l } 508 & 510 & 500 & 500 & 498 & 503 & 508 & 505 \end{array}$$ Given that the weights of the yoghurt delivered by the machine follow a normal distribution with standard deviation 5.4 grams,
  1. find a \(95 \%\) confidence interval for the mean weight, \(\mu\) grams, of yoghurt in a pot. Give your answers to 2 decimal places.
  2. Comment on whether or not the machine is working properly, giving a reason for your answer.
  3. State the probability that a \(95 \%\) confidence interval for \(\mu\) will not contain \(\mu\) grams.
  4. Without carrying out any further calculations, explain the changes, if any, that would need to be made in calculating the confidence interval in part (a) if the standard deviation was unknown. Give a reason for your answer.
    You may assume that the weights of the yoghurt delivered by the machine still follow a normal distribution.
2 Jemima makes jam to sell in a local shop. The jam is sold in jars and the weight of jam in a jar is normally distributed. Jemima takes a random sample of 8 of her jars of jam and weighs the contents of each jar, \(x\) grams. Her results are summarised as follows $$\sum x = 3552 \quad \sum x ^ { 2 } = 1577314$$
  1. Calculate a 95\% confidence interval for the mean weight of jam in a jar.
The labels on the jars state that the average contents weigh 440 grams.
  2. State, giving a reason, whether or not Jemima should be concerned about the labels on her jars of jam.
  1. Write down the approximate distribution of the sample mean height. Give a reason for your answer.
  2. Hence find the probability that the sample mean height is at least 91 cm.
A biologist investigated whether or not the diet of chickens influenced the amount of cholesterol in their eggs. The cholesterol content of 70 eggs selected at random from chickens fed diet \(A\) had a mean value of 198 mg and a standard deviation of 47 mg. A random sample of 90 eggs from chickens fed diet \(B\) had a mean cholesterol content of 201 mg and a standard deviation of 23 mg.
  1. Stating your hypotheses clearly and using a \(5 \%\) level of significance, test whether or not there is a difference between the mean cholesterol content of eggs laid by chickens fed on these two diets.
  2. State, in the context of this question, an assumption you have made in carrying out the test in part (a).
The table below shows the price of an ice cream and the distance of the shop where it was purchased from a particular tourist attraction.
| Shop | Distance from tourist attraction (m) | Price (£) |
|---|---|---|
| A | 50 | 1.75 |
| B | 175 | 1.20 |
| C | 270 | 2.00 |
| D | 375 | 1.05 |
| E | 425 | 0.95 |
| F | 580 | 1.25 |
| G | 710 | 0.80 |
| H | 790 | 0.75 |
| I | 890 | 1.00 |
| J | 980 | 0.85 |
  1. Find, to 3 decimal places, the Spearman rank correlation coefficient between the distance of the shop from the tourist attraction and the price of an ice cream.
  2. Stating your hypotheses clearly and using a \(5 \%\) one-tailed test, interpret your rank correlation coefficient.
  1. Determine a 95\% confidence interval for the mean weight of liquid paraffin in a tub.
  2. Explain whether the confidence interval supports the researcher's belief.
  3. Explain why the sample has to be random in order to construct the confidence interval.
  4. A 95\% confidence interval for the mean weight in grams of another ingredient in the skin cream is [1.202, 1.398]. This confidence interval is based on a large sample and the unbiased estimate of the population variance calculated from the sample is 0.25 . Find each of the following.
    • The mean of the sample
    • The size of the sample
3. A discrete random variable \(X\) has the distribution \(\mathrm { U } ( 11 )\). The mean of 50 observations of \(X\) is denoted by \(\bar { X }\).
Use an approximate method (continuity correction is not required), which should be justified, to find \(P ( \bar { X } \leq 6.10 )\).
1. Alan's journey time to work can be modelled by a normal distribution with standard deviation 6 minutes. Alan measures the journey time to work for a random sample of 5 journeys. The mean of the 5 journey times is 36 minutes.
  1. Construct a \(95 \%\) confidence interval for Alan's mean journey time to work, giving your values to one decimal place.
  2. Alan claims that his mean journey time to work is 30 minutes. State, with a reason, whether or not the confidence interval found in part (a) supports Alan's claim.
    [1 mark]
2. Indre works on reception in an office and deals with all the telephone calls that arrive. Calls arrive randomly and, in a 4-hour morning shift, there are on average 80 calls.
  1. Using a suitable model, find the probability of more than 4 calls arriving in a particular 20-minute period one morning.
Indre is allowed 20 minutes of break time during each 4-hour morning shift, which she can take in 5-minute periods. When she takes a break, a machine records details of any call in the office that Indre has missed. One morning Indre took her break time in 4 periods of 5 minutes each.
  2. Find the probability that in exactly 3 of these periods there were no calls.
On another occasion Indre took 1 break of 5 minutes and 1 break of 15 minutes.
  3. Find the probability that Indre missed exactly 1 call in each of these 2 breaks.
2. When babies are born, their head circumferences are measured. A random sample of 50 newborn female babies is selected. The sample mean head circumference is 34.711 cm . The sample standard deviation head circumference is 1.530 cm .
  1. Determine a \(95 \%\) confidence interval for the population mean head circumference of newborn female babies.
  2. Explain why you can calculate this interval even though the distribution of the population of head circumferences of newborn female babies is unknown.
5. In a large population of hens, the weight of a hen is normally distributed with mean \(\mu \mathrm { kg }\) and standard deviation \(\sigma \mathrm { kg }\). A random sample of 100 hens is taken from the population. The mean weight for the sample is denoted \(\bar { X }\).
a. State the distribution of \(\bar { X }\) giving its mean and variance.
The sample values are summarised by \(\sum x = 199.8\) and \(\sum x ^ { 2 } = 407.8\) where \(x \mathrm {~kg}\) is the weight of a hen.
b. Find an unbiased estimate for \(\mu\).
c. Find an unbiased estimate for \(\sigma ^ { 2 }\).
d. Find a \(90 \%\) confidence interval for \(\mu\).
It is found that \(\sigma = 0.27\). It is decided to test, at the \(1 \%\) level of significance, the null hypothesis \(\mu = 1.95\) against the alternative hypothesis \(\mu > 1.95\).
e. Find the \(p\)-value for the test.
f. Write down the conclusion reached.
g. Explain whether or not the central limit theorem was required in part e.
2. A machine is set to fill pots with yoghurt such that the mean weight of yoghurt in a pot is 505 grams. To check that the machine is working properly, a random sample of 8 pots is selected. The weight of yoghurt, in grams, in each pot is as follows $$\begin{array} { l l l l l l l l } 508 & 510 & 500 & 500 & 498 & 503 & 508 & 505 \end{array}$$ Given that the weights of the yoghurt delivered by the machine follow a normal distribution with standard deviation 5.4 grams,
  1. find a \(95 \%\) confidence interval for the mean weight, \(\mu\) grams, of yoghurt in a pot. Give your answers to 2 decimal places.
5. The random variable \(X\) has the distribution \(\mathrm { N } \left( \mu , 3 ^ { 2 } \right)\). A random sample of 9 observations of \(X\) produced the following values. $$\begin{array} { l l l l l l l l l } 6 & 2 & 3 & 6 & 8 & 11 & 12 & 5 & 10 \end{array}$$
  1. Find a \(90 \%\) confidence interval for \(\mu\).
  2. Explain what is meant by a \(90 \%\) confidence interval in this context.
3 A discrete random variable \(X\) has the distribution \(\mathrm { U } ( 11 )\).
The mean of 50 observations of \(X\) is denoted by \(\bar { X }\).
Use an approximate method, which should be justified, to find \(\mathrm { P } ( \bar { X } \leqslant 6.10 )\).
4 A very popular play has been performed at a London theatre on each of 6 evenings per week for about a year. Over the past 13 weeks ( 78 performances), records have been kept of the proceeds from the sales of programmes at each performance. An analysis of these records has found that the mean was \(\pounds 184\) and the standard deviation was \(\pounds 32\).
  1. Assuming that the 78 performances may be considered to be a random sample, construct a \(90 \%\) confidence interval for the mean proceeds from the sales of programmes at an evening performance of this play.
  2. Comment on the likely validity of the assumption in part (a) when constructing a confidence interval for the mean proceeds from the sales of programmes at an evening performance of:
    1. this particular play;
    2. any play.
5 In a random sample of 12 bags of flour, the weight, in grams, of flour in each bag was recorded as follows.
\(\begin{array} { l l l l l l l l l l l l } 1011 & 995 & 1018 & 1022 & 1014 & 1005 & 1017 & 1015 & 993 & 1018 & 992 & 1020 \end{array}\)
  1. It may be assumed that the weight of flour in a bag is normally distributed with a standard deviation of 10.5 grams.
    1. Construct a \(98 \%\) confidence interval for the mean weight, \(\mu\) grams, of flour in a bag, giving the limits to four significant figures.
    2. State why, in constructing your confidence interval, use of the Central Limit Theorem was not necessary.
    3. If the distribution of the weight of flour in a bag was unknown, indicate a minimum number of weights that you would consider necessary for a confidence interval for \(\mu\) to be valid.
  2. The statement ' 1 kg ' is printed on each bag. Comment on this statement using both the confidence interval that you constructed in part (a)(i) and the weights of the given sample of 12 bags.
  3. Given that \(\mu = 1000\), state the probability that a \(98 \%\) confidence interval for \(\mu\) will not contain 1000.
    (1 mark)
6 On arrival at a business centre, all visitors are required to register at the reception desk. An analysis of the register, for a random sample of 100 days, results in the following information on the number, \(X\), of visitors per day.
| Number of visitors per day | Number of days |
|---|---|
| 1-10 | 13 |
| 11-20 | 33 |
| 21-25 | 17 |
| 26-30 | 12 |
| 31-35 | 8 |
| 36-40 | 5 |
| 41-50 | 5 |
| 51-100 | 7 |
| Total | 100 |
  1. Calculate an estimate of:
    1. \(\mu\), the mean number of visitors per day;
    2. \(\sigma\), the standard deviation of the number of visitors per day.
  2. Give a reason, based upon the data provided, why \(X\) is unlikely to be normally distributed.
    1. Give a reason why \(\bar { X }\), the mean of a random sample of 100 observations on \(X\), may be assumed to be normally distributed.
    2. State, in terms of \(\mu\) and \(\sigma\), the mean and variance of \(\bar { X }\).
  3. Hence construct a \(99 \%\) confidence interval for \(\mu\).
  4. The receptionist claims that she registers on average more than 30 visitors per day, and frequently registers more than 50 visitors on any one day. Comment on each of these two claims.
3 Fiona is studying the heights of corn plants on a farm. She measures the height, \(x \mathrm {~cm}\), of a random sample of 200 corn plants on the farm.
The summarised results are as follows: $$\sum x = 60255 \quad \text { and } \quad \sum ( x - \bar { x } ) ^ { 2 } = 995$$ Calculate a \(96 \%\) confidence interval for the population mean of heights of corn plants on the farm, giving your values to one decimal place.
4 The continuous random variable \(X\) has probability density function
$$f ( x ) = \begin{cases} \frac { 4 } { 99 } \left( 12 x - x ^ { 2 } - x ^ { 3 } \right) & 0 \leq x \leq 3 \\ 0 & \text { otherwise } \end{cases}$$
4 Murni is investigating the annual salary of people from a particular town. She takes a random sample of 200 people from the town and records their annual salary. The mean annual salary is \(\pounds 28500\) and the standard deviation is \(\pounds 5100\)
Calculate a \(97 \%\) confidence interval for the population mean of annual salaries for the people who live in the town, giving your values to the nearest pound.
5 Rebekah is investigating the distances, \(X\) light years, between the Earth and visible stars in the night sky. She determines the distance between the Earth and a star for a random sample of 100 visible stars. The summarised results are as follows: $$\sum x = 35522 \quad \text { and } \quad \sum x ^ { 2 } = 32902257$$
  1. Calculate a 97\% confidence interval for the population mean of \(X\), giving your values to the nearest light year.
  2. Mike claims that the population mean is 267 light years. Rebekah says that the confidence interval supports Mike's claim. State, with a reason, whether Rebekah is correct.
3 The random variable \(X\) has a normal distribution with known variance 15.7. A random sample of size 120 is taken from \(X\). The sample mean is 68.2. Find a 94\% confidence interval for the population mean of \(X\). Give your limits to three significant figures.
3 The mass of male giraffes is assumed to have a normal distribution. Duncan takes a random sample of 600 male giraffes.
The mean mass of the sample is 1196 kilograms.
The standard deviation of the sample is 98 kilograms.
  1. Construct a 94\% confidence interval for the mean mass of male giraffes, giving your values to one decimal place.
  2. Explain whether or not your answer to part 1 would change if a sample of size 5 was taken with the same mean and standard deviation.
4 Oscar is studying the daily maximum temperature in \({ } ^ { \circ } \mathrm { C }\) in a village during the month of June. He constructs a \(95 \%\) confidence interval of width \(0.8 ^ { \circ } \mathrm { C }\) using a random sample of 150 days. He assumes that the daily maximum temperature has a normal distribution.
  1. Find the standard deviation of Oscar's sample, giving your answer to three significant figures.
  2. Oscar calculates the mean of his sample to be \(25.3 ^ { \circ } \mathrm { C }\).
    He claims that the population mean is \(26.0 ^ { \circ } \mathrm { C }\).
    Explain whether or not his confidence interval supports his claim.
  3. Explain how Oscar could reduce the width of his 95\% confidence interval.
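The width of a normal-based interval is \(2 z s / \sqrt{n}\), so a known width can be inverted to recover the standard deviation. The sketch below does this for the figures in the question; the helper name `sd_from_ci_width` is hypothetical, and it assumes the interval was built with a normal critical value (reasonable for n = 150).

```python
from math import sqrt
from statistics import NormalDist


def sd_from_ci_width(width, n, level):
    """Invert width = 2 * z * s / sqrt(n) to recover s.

    Hypothetical helper for illustration only.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided critical value
    return width * sqrt(n) / (2 * z)


# Width, sample size and level quoted in the question above.
s = sd_from_ci_width(0.8, 150, 0.95)

# The interval is the sample mean plus or minus half the width.
low, high = 25.3 - 0.8 / 2, 25.3 + 0.8 / 2
supports_claim = low <= 26.0 <= high
print(f"s = {s:.3g}, interval = ({low:.1f}, {high:.1f}), supports claim: {supports_claim}")
```

Because the width is proportional to \(1 / \sqrt{n}\), taking a larger sample narrows the interval at the same confidence level.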
10 The label on a particular size of milk carton states that it contains 1.5 litres of milk. In an investigation at the packaging plant the contents, \(x\) litres, of each of 60 randomly selected cartons are measured. The data are summarised as follows. $$\Sigma x = 89.758 \quad \Sigma x ^ { 2 } = 134.280$$
  1. Estimate the variance of the underlying population.
  2. Find a 95\% confidence interval for the mean of the underlying population.
  3. What does the confidence interval which you have calculated suggest about the statement on the carton?
    Each day for 300 days a random sample of 60 cartons is selected and for each sample a \(95 \%\) confidence interval is constructed.
  4. Explain why the confidence intervals will not be identical.
  5. What is the expected number of confidence intervals to contain the population mean?
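The parts of this question chain together, which a short script can make explicit. This is a sketch under the assumption that a normal-based interval is acceptable for n = 60; the variable names are illustrative and not taken from the question.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics quoted in the question above.
n, sum_x, sum_x2 = 60, 89.758, 134.280

xbar = sum_x / n
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)    # unbiased variance estimate

z = NormalDist().inv_cdf(0.975)             # two-sided 95% critical value
half_width = z * sqrt(s2 / n)
low, high = xbar - half_width, xbar + half_width

# Does the labelled content of 1.5 litres lie inside the interval?
label_inside = low <= 1.5 <= high

# Each 95% interval covers the population mean with probability 0.95,
# so the expected count over 300 independent samples is 300 * 0.95.
expected_hits = 300 * 0.95

print(f"variance estimate = {s2:.3g}")
print(f"interval = ({low:.4f}, {high:.4f}), contains 1.5: {label_inside}")
print(f"expected intervals containing the mean = {expected_hits:.0f}")
```

The intervals differ from day to day because each is centred on its own sample mean and scaled by its own sample variance, both of which vary between samples.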