5.05a Sample mean distribution: central limit theorem

222 questions

CAIE S2 2015 November Q1
5 marks Easy -1.2
1 It is known that the number, \(N\), of words contained in the leading article each day in a certain newspaper can be modelled by a normal distribution with mean 352 and variance 29. A researcher takes a random sample of 10 leading articles and finds the sample mean, \(\bar { N }\), of \(N\).
  1. State the distribution of \(\bar { N }\), giving the values of any parameters.
  2. Find \(\mathrm { P } ( \bar { N } > 354 )\).
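A quick numerical check of this question, as a sketch rather than a model answer: for a normal population the sample mean is exactly normal with variance \(\sigma^2 / n\), so no approximation is involved. The variable names are illustrative only.

```python
from statistics import NormalDist

mu, var, n = 352, 29, 10                      # values given in the question
n_bar = NormalDist(mu, (var / n) ** 0.5)      # N-bar ~ N(352, 29/10)
p = 1 - n_bar.cdf(354)                        # P(N-bar > 354)
print(round(p, 3))
```

The result is close to 0.12.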
CAIE S2 2016 November Q3
5 marks Standard +0.3
3 A men's triathlon consists of three parts: swimming, cycling and running. Competitors' times, in minutes, for the three parts can be modelled by three independent normal variables with means 34.0, 87.1 and 56.9, and standard deviations 3.2, 4.1 and 3.8, respectively. For each competitor, the total of his three times is called the race time. Find the probability that the mean race time of a random sample of 15 competitors is less than 175 minutes.
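One way to check the arithmetic numerically is sketched below. It relies only on the stated independence: means add, variances add, and the mean of 15 race times has variance equal to the race-time variance divided by 15. Names are illustrative.

```python
from statistics import NormalDist

means = [34.0, 87.1, 56.9]
sds = [3.2, 4.1, 3.8]
n = 15

race_mean = sum(means)                        # mean of a single race time
race_var = sum(s ** 2 for s in sds)           # variances add for independent parts
sample_mean = NormalDist(race_mean, (race_var / n) ** 0.5)
p = sample_mean.cdf(175)                      # P(mean race time < 175)
print(round(race_mean, 1), round(race_var, 2), round(p, 4))
```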
OCR S2 2005 June Q7
13 marks Standard +0.3
7 The continuous random variable \(X\) has the probability density function shown in the diagram. \includegraphics[max width=\textwidth, alt={}, center]{b69b1fe8-790d-4727-a892-8ab2ade08962-3_364_766_1229_699}
  1. Find the value of the constant \(k\).
  2. Write down the mean of \(X\), and use integration to find the variance of \(X\).
  3. Three observations of \(X\) are made. Find the probability that \(X < 9\) for all three observations.
  4. The mean of 32 observations of \(X\) is denoted by \(\bar { X }\). State the approximate distribution of \(\bar { X }\), giving its mean and variance.
OCR S2 2007 June Q1
6 marks Moderate -0.8
1 A random sample of observations of a random variable \(X\) is summarised by $$n = 100 , \quad \Sigma x = 4830.0 , \quad \Sigma x ^ { 2 } = 249509.16 .$$
  1. Obtain unbiased estimates of the mean and variance of \(X\).
  2. The sample mean of 100 observations of \(X\) is denoted by \(\bar { X }\). Explain whether you would need any further information about the distribution of \(X\) in order to estimate \(\mathrm { P } ( \bar { X } > 60 )\). [You should not attempt to carry out the calculation.]
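The unbiased estimates in the first part can be checked with a short sketch using the usual formulas \(\bar{x} = \Sigma x / n\) and \(s^2 = (\Sigma x^2 - (\Sigma x)^2 / n) / (n - 1)\). Variable names are illustrative.

```python
n = 100
sum_x = 4830.0
sum_x2 = 249509.16

mean_hat = sum_x / n                               # unbiased estimate of the mean
var_hat = (sum_x2 - sum_x ** 2 / n) / (n - 1)      # unbiased estimate of the variance
print(round(mean_hat, 2), round(var_hat, 2))
```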
OCR S2 Specimen Q1
5 marks Moderate -0.8
1 The standard deviation of a random variable \(F\) is 12.0. The mean of \(n\) independent observations of \(F\) is denoted by \(\bar { F }\).
  1. Given that the standard deviation of \(\bar { F }\) is 1.50, find the value of \(n\).
  2. For this value of \(n\), state, with justification, what can be said about the distribution of \(\bar { F }\).
OCR MEI S3 2007 January Q1
18 marks Standard +0.3
1 The continuous random variable \(X\) has probability density function $$f ( x ) = k ( 1 - x ) \quad \text { for } 0 \leqslant x \leqslant 1$$ where \(k\) is a constant.
  1. Show that \(k = 2\). Sketch the graph of the probability density function.
  2. Find \(\mathrm { E } ( X )\) and show that \(\operatorname { Var } ( X ) = \frac { 1 } { 18 }\).
  3. Derive the cumulative distribution function of \(X\). Hence find the probability that \(X\) is greater than the mean.
  4. Verify that the median of \(X\) is \(1 - \frac { 1 } { \sqrt { 2 } }\).
  5. \(\bar { X }\) is the mean of a random sample of 100 observations of \(X\). Write down the approximate distribution of \(\bar { X }\).
OCR MEI S4 2006 June Q2
24 marks Standard +0.8
2 [In this question, you may use the result \(\int _ { 0 } ^ { \infty } u ^ { m } \mathrm { e } ^ { - u } \mathrm {~d} u = m !\) for any non-negative integer \(m\).]
The random variable \(X\) has probability density function $$\mathrm { f } ( x ) = \begin{cases} \frac { \lambda ^ { k + 1 } x ^ { k } \mathrm { e } ^ { - \lambda x } } { k ! } , & x > 0 \\ 0 , & \text { elsewhere } \end{cases}$$ where \(\lambda > 0\) and \(k\) is a non-negative integer.
  1. Show that the moment generating function of \(X\) is \(\left( \frac { \lambda } { \lambda - \theta } \right) ^ { k + 1 }\).
  2. The random variable \(Y\) is the sum of \(n\) independent random variables each distributed as \(X\). Find the moment generating function of \(Y\) and hence obtain the mean and variance of \(Y\). [8]
  3. State the probability density function of \(Y\).
  4. For the case \(\lambda = 1 , k = 2\) and \(n = 5\), it may be shown that the definite integral of the probability density function of \(Y\) between limits 10 and \(\infty\) is 0.9165. Calculate the corresponding probability that would be given by a Normal approximation and comment briefly.
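The quoted value 0.9165 and the Normal approximation in the last part can be compared with the sketch below. It assumes two standard facts not stated in the question: the sum of \(n\) such variables is a gamma variable with integer shape \(m = n(k+1)\) and rate \(\lambda\), and for integer shape the upper tail equals a Poisson cumulative probability, \(\mathrm{P}(Y > y) = \mathrm{P}(\mathrm{Po}(\lambda y) \leqslant m - 1)\).

```python
from math import exp, factorial
from statistics import NormalDist

lam, k, n = 1, 2, 5
shape = n * (k + 1)                     # assumed gamma shape of Y (15 here)
y = 10

# Exact upper tail via the gamma/Poisson identity (integer shape only)
exact = sum(exp(-lam * y) * (lam * y) ** j / factorial(j) for j in range(shape))

# Normal approximation with matching mean shape/lam and variance shape/lam^2
approx = 1 - NormalDist(shape / lam, (shape / lam ** 2) ** 0.5).cdf(y)
print(round(exact, 4), round(approx, 4))
```

The two values differ in the second decimal place, which is consistent with a sum of only five skewed variables.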
OCR MEI S4 2007 June Q2
24 marks Challenging +1.2
2 The random variable \(X\) has the binomial distribution with parameters \(n\) and \(p\), i.e. \(X \sim \mathrm {~B} ( n , p )\).
  1. Show that the probability generating function of \(X\) is \(\mathrm { G } ( t ) = ( q + p t ) ^ { n }\), where \(q = 1 - p\).
  2. Hence obtain the mean \(\mu\) and variance \(\sigma ^ { 2 }\) of \(X\).
  3. Write down the mean and variance of the random variable \(Z = \frac { X - \mu } { \sigma }\).
  4. Write down the moment generating function of \(X\) and use the linear transformation result to show that the moment generating function of \(Z\) is $$\mathrm { M } _ { Z } ( \theta ) = \left( q \mathrm { e } ^ { - \frac { p \theta } { \sqrt { n p q } } } + p \mathrm { e } ^ { \frac { q \theta } { \sqrt { n p q } } } \right) ^ { n } .$$
  5. By expanding the exponential terms in \(\mathrm { M } _ { Z } ( \theta )\), show that the limit of \(\mathrm { M } _ { Z } ( \theta )\) as \(n \rightarrow \infty\) is \(\mathrm { e } ^ { \theta ^ { 2 } / 2 }\). You may use the result \(\lim _ { n \rightarrow \infty } \left( 1 + \frac { y + \mathrm { f } ( n ) } { n } \right) ^ { n } = \mathrm { e } ^ { y }\) provided \(\mathrm { f } ( n ) \rightarrow 0\) as \(n \rightarrow \infty\).
  6. What does the result in part (v) imply about the distribution of \(Z\) as \(n \rightarrow \infty\)? Explain your reasoning briefly.
  7. What does the result in part (vi) imply about the distribution of \(X\) as \(n \rightarrow \infty\)?
OCR MEI S4 2008 June Q2
24 marks Standard +0.8
2 Independent trials, on each of which the probability of a 'success' is \(p ( 0 < p < 1 )\), are being carried out. The random variable \(X\) counts the number of trials up to and including that on which the first success is obtained. The random variable \(Y\) counts the number of trials up to and including that on which the \(n\)th success is obtained.
  1. Write down an expression for \(\mathrm { P } ( X = x )\) for \(x = 1,2 , \ldots\). Show that the probability generating function of \(X\) is $$\mathrm { G } ( t ) = p t ( 1 - q t ) ^ { - 1 }$$ where \(q = 1 - p\), and hence that the mean and variance of \(X\) are $$\mu = \frac { 1 } { p } \quad \text { and } \quad \sigma ^ { 2 } = \frac { q } { p ^ { 2 } }$$ respectively.
  2. Explain why the random variable \(Y\) can be written as $$Y = X _ { 1 } + X _ { 2 } + \ldots + X _ { n }$$ where the \(X _ { i }\) are independent random variables each distributed as \(X\). Hence write down the probability generating function, the mean and the variance of \(Y\).
  3. State an approximation to the distribution of \(Y\) for large \(n\).
  4. The aeroplane used on a certain flight seats 140 passengers. The airline seeks to fill the plane, but its experience is that not all the passengers who buy tickets will turn up for the flight. It uses the random variable \(Y\) to model the situation, with \(p = 0.8\) as the probability that a passenger turns up. Find the probability that it needs to sell at least 160 tickets to get 140 passengers who turn up. Suggest a reason why the model might not be appropriate.
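A sketch of the airline calculation follows, under one reading of the question: the required probability is \(\mathrm{P}(Y \geqslant 160)\) with \(n = 140\) and \(p = 0.8\). The Normal approximation uses mean \(n/p\) and variance \(nq/p^2\) with a continuity correction; the exact figure uses the assumed equivalence that \(Y \geqslant 160\) happens exactly when the first 159 tickets yield at most 139 passengers.

```python
from math import comb
from statistics import NormalDist

n_success, p = 140, 0.8
q = 1 - p

mean_y = n_success / p                  # n/p
var_y = n_success * q / p ** 2          # nq/p^2
approx = 1 - NormalDist(mean_y, var_y ** 0.5).cdf(159.5)   # continuity-corrected

# Exact: at most 139 successes in the first 159 independent trials
exact = sum(comb(159, j) * p ** j * q ** (159 - j) for j in range(n_success))
print(round(approx, 4), round(exact, 4))
```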
OCR MEI S4 2010 June Q2
24 marks Standard +0.8
2 The random variable \(X\) has the Poisson distribution with parameter \(\lambda\).
  1. Show that the probability generating function of \(X\) is \(\mathrm { G } ( t ) = \mathrm { e } ^ { \lambda ( t - 1 ) }\).
  2. Hence obtain the mean \(\mu\) and variance \(\sigma ^ { 2 }\) of \(X\).
  3. Write down the mean and variance of the random variable \(Z = \frac { X - \mu } { \sigma }\).
  4. Write down the moment generating function of \(X\). State the linear transformation result for moment generating functions and use it to show that the moment generating function of \(Z\) is $$\mathrm { M } _ { Z } ( \theta ) = \mathrm { e } ^ { \mathrm { f } ( \theta ) } \quad \text { where } \mathrm { f } ( \theta ) = \lambda \left( \mathrm { e } ^ { \theta / \sqrt { \lambda } } - \frac { \theta } { \sqrt { \lambda } } - 1 \right)$$
  5. Show that the limit of \(\mathrm { M } _ { Z } ( \theta )\) as \(\lambda \rightarrow \infty\) is \(\mathrm { e } ^ { \theta ^ { 2 } / 2 }\).
  6. Explain briefly why this implies that the distribution of \(Z\) tends to \(\mathrm { N } ( 0,1 )\) as \(\lambda \rightarrow \infty\). What does this imply about the distribution of \(X\) as \(\lambda \rightarrow \infty\)?
OCR MEI S4 2014 June Q3
24 marks Challenging +1.8
3
  1. Explain the meaning of the following terms in the context of hypothesis testing: Type I error, Type II error, operating characteristic, power.
  2. A chemical manufacturer is endeavouring to reduce the amount of a certain impurity in one of its bulk products by improving the production process. The amount of impurity is measured in a convenient unit of concentration, and this is modelled by the Normally distributed random variable \(X\). In the old production process, the mean of \(X\), denoted by \(\mu\), was 63 and the standard deviation of \(X\) was 3.7. Experimental batches of the product are to be made using the new process, and it is desired to examine the hypotheses \(\mathrm { H } _ { 0 } : \mu = 63\) and \(\mathrm { H } _ { 1 } : \mu < 63\) for the new process. Investigation of the variability in the new process has established that the standard deviation may be assumed unchanged. The usual Normal test based on \(\bar { X }\) is to be used, where \(\bar { X }\) is the mean of \(X\) over \(n\) experimental batches (regarded as a random sample), with a critical value \(c\) such that \(\mathrm { H } _ { 0 }\) is rejected if the value of \(\bar { X }\) is less than \(c\). The following criteria are set out.
    • If in fact \(\mu = 63\), the probability of concluding that \(\mu < 63\) must be only \(1 \%\).
    • If in fact \(\mu = 60\), the probability of concluding that \(\mu < 63\) must be \(90 \%\).
    Find \(c\) and the smallest value of \(n\) that is required. With these values, what is the power of the test if in fact \(\mu = 58.5\) ?
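The two criteria can be turned into a pair of equations for \(c\) and \(n\). The sketch below solves them numerically; it assumes that, once \(n\) is rounded up, \(c\) is fixed by the 1% criterion so that the 90% criterion is then met with a little to spare. Names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

std = NormalDist()
sigma, mu0, mu1 = 3.7, 63, 60
z_a = std.inv_cdf(0.99)                 # 1% one-tailed critical z
z_b = std.inv_cdf(0.90)                 # z for 90% power at mu = 60

n = ceil(((z_a + z_b) * sigma / (mu0 - mu1)) ** 2)   # smallest integer n
se = sigma / sqrt(n)
c = mu0 - z_a * se                      # reject H0 when the sample mean < c
power_at_60 = std.cdf((c - mu1) / se)
power_at_58_5 = std.cdf((c - 58.5) / se)
print(n, round(c, 3), round(power_at_60, 4), round(power_at_58_5, 4))
```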
OCR S2 2013 January Q6
10 marks Standard +0.3
6 Gordon is a cricketer. Over a long period he knows that his population mean score, in number of runs per innings, is 28, and the population standard deviation is 12. In a new season he adopts a different batting style and he finds that in 30 innings using this style his mean score is 28.98.
  1. Stating a necessary assumption, test at the \(5 \%\) significance level whether his population mean score has increased.
  2. Explain whether it was necessary to use the Central Limit Theorem in part (i).
OCR S2 2013 January Q7
9 marks Standard +0.8
7 The continuous random variable \(X\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\). The mean of a random sample of \(n\) observations of \(X\) is denoted by \(\bar { X }\). It is given that \(\mathrm { P } ( \bar { X } < 35.0 ) = 0.9772\) and \(\mathrm { P } ( \bar { X } < 20.0 ) = 0.1587\).
  1. Obtain a formula for \(\sigma\) in terms of \(n\). Two students are discussing this question. Aidan says "If you were told another probability, for instance \(\mathrm { P } ( \bar { X } > 32 ) = 0.1\), you could work out the value of \(\sigma\)." Binya says, "No, the value of \(\mathrm { P } ( \bar { X } > 32 )\) is fixed by the information you know already."
  2. State which of Aidan and Binya is right. If you think that Aidan is right, calculate the value of \(\sigma\) given that \(\mathrm { P } ( \bar { X } > 32 ) = 0.1\). If you think that Binya is right, calculate the value of \(\mathrm { P } ( \bar { X } > 32 )\).
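A numerical sketch of the reasoning: the two given probabilities fix both \(\mu\) and the standard error \(\sigma / \sqrt{n}\), so any further probability about \(\bar{X}\) is already determined. Variable names are illustrative.

```python
from statistics import NormalDist

std = NormalDist()
z_upper = std.inv_cdf(0.9772)           # close to 2
z_lower = std.inv_cdf(0.1587)           # close to -1

se = (35.0 - 20.0) / (z_upper - z_lower)    # sigma / sqrt(n)
mu = 20.0 - z_lower * se
p = 1 - NormalDist(mu, se).cdf(32)          # P(X-bar > 32), fixed by the data
print(round(se, 2), round(mu, 2), round(p, 4))
```

Since the standard error comes out close to 5, \(\sigma\) is close to \(5\sqrt{n}\), and \(\mathrm{P}(\bar{X} > 32)\) is close to 0.081 rather than 0.1.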
OCR S2 2009 January Q2
4 marks Standard +0.3
2 The continuous random variable \(Y\) has the distribution \(\mathrm { N } \left( 23.0,5.0 ^ { 2 } \right)\). The mean of \(n\) observations of \(Y\) is denoted by \(\bar { Y }\). It is given that \(\mathrm { P } ( \bar { Y } > 23.625 ) = 0.0228\). Find the value of \(n\).
OCR S2 2009 January Q7
12 marks Standard +0.3
7 A motorist records the time taken, \(T\) minutes, to drive a particular stretch of road on each of 64 occasions. Her results are summarised by $$\Sigma t = 876.8 , \quad \Sigma t ^ { 2 } = 12657.28$$
  1. Test, at the \(5 \%\) significance level, whether the mean time for the motorist to drive the stretch of road is greater than 13.1 minutes.
  2. Explain whether it is necessary to use the Central Limit Theorem in your test.
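The test statistic can be checked with the sketch below, which assumes a one-tailed z-test using the unbiased variance estimate from the sample of 64. Names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 64
sum_t, sum_t2 = 876.8, 12657.28

t_bar = sum_t / n
var_hat = (sum_t2 - sum_t ** 2 / n) / (n - 1)    # unbiased variance estimate
z = (t_bar - 13.1) / sqrt(var_hat / n)
p_value = 1 - NormalDist().cdf(z)                # upper-tail p-value
print(round(t_bar, 2), round(var_hat, 2), round(z, 3), round(p_value, 4))
```

A p-value above 0.05 corresponds to not rejecting the null hypothesis at the 5% level.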
OCR S2 2009 January Q8
14 marks Moderate -0.3
8 A sales office employs 21 representatives. Each day, for each representative, the probability that he or she achieves a sale is 0.7, independently of other representatives. The total number of representatives who achieve a sale on any one day is denoted by \(K\).
  1. Using a suitable approximation (which should be justified), find \(\mathrm { P } ( K \geqslant 16 )\).
  2. Using a suitable approximation (which should be justified), find the probability that the mean of 36 observations of \(K\) is less than or equal to 14.0.
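Both parts can be sketched with Normal approximations, assuming \(K \sim \mathrm{B}(21, 0.7)\). The continuity corrections used here (0.5 for \(K\), and \(1/(2 \times 36)\) for the mean of 36 values) are one common convention, not the only acceptable one.

```python
from math import sqrt
from statistics import NormalDist

n_reps, p = 21, 0.7
mean_k = n_reps * p                     # 14.7
var_k = n_reps * p * (1 - p)            # 4.41

# Part 1: P(K >= 16), continuity-corrected to 15.5
p1 = 1 - NormalDist(mean_k, sqrt(var_k)).cdf(15.5)

# Part 2: the mean of 36 observations moves in steps of 1/36
m = 36
p2 = NormalDist(mean_k, sqrt(var_k / m)).cdf(14.0 + 1 / (2 * m))
print(round(p1, 4), round(p2, 4))
```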
OCR S2 2011 January Q2
6 marks Standard +0.3
2 The random variable \(H\) has the distribution \(\mathrm { N } \left( \mu , 5 ^ { 2 } \right)\). The mean of a sample of \(n\) observations of \(H\) is denoted by \(\bar { H }\). It is given that \(\mathrm { P } ( \bar { H } > 53.28 ) = 0.0250\) and \(\mathrm { P } ( \bar { H } < 51.65 ) = 0.0968\), both correct to 4 decimal places. Find the values of \(\mu\) and \(n\).
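The two probabilities give two equations in \(\mu\) and the standard error \(5 / \sqrt{n}\). A sketch of solving them numerically, with illustrative names:

```python
from statistics import NormalDist

std = NormalDist()
sigma = 5
z_hi = std.inv_cdf(1 - 0.0250)          # z for the upper point 53.28
z_lo = std.inv_cdf(0.0968)              # z for the lower point 51.65

se = (53.28 - 51.65) / (z_hi - z_lo)    # sigma / sqrt(n)
n_est = (sigma / se) ** 2
mu = 53.28 - z_hi * se
print(round(n_est, 2), round(mu, 2))
```

Because the probabilities are given to 4 decimal places, the estimate of \(n\) is not exactly an integer and is rounded to the nearest whole number.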
OCR S2 2011 January Q4
7 marks Standard +0.3
4 The continuous random variable \(X\) has mean \(\mu\) and standard deviation 45. A significance test is to be carried out of the null hypothesis \(\mathrm { H } _ { 0 } : \mu = 230\) against the alternative hypothesis \(\mathrm { H } _ { 1 } : \mu \neq 230\), at the \(1 \%\) significance level. A random sample of size 50 is obtained, and the sample mean is found to be 213.4.
  1. Carry out the test.
  2. Explain whether it is necessary to use the Central Limit Theorem in your test.
OCR S2 2009 June Q6
10 marks Moderate -0.3
6 The continuous random variable \(R\) has the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\). The results of 100 observations of \(R\) are summarised by $$\Sigma r = 3360.0 , \quad \Sigma r ^ { 2 } = 115782.84 .$$
  1. Calculate an unbiased estimate of \(\mu\) and an unbiased estimate of \(\sigma ^ { 2 }\).
  2. The mean of 9 observations of \(R\) is denoted by \(\bar { R }\). Calculate an estimate of \(\mathrm { P } ( \bar { R } > 32.0 )\).
  3. Explain whether you need to use the Central Limit Theorem in your answer to part (ii).
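A sketch of the first two parts, assuming the unbiased estimates are substituted for \(\mu\) and \(\sigma^2\) when estimating the probability. Names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 100
sum_r, sum_r2 = 3360.0, 115782.84

mu_hat = sum_r / n
var_hat = (sum_r2 - sum_r ** 2 / n) / (n - 1)

# Mean of 9 observations: variance var_hat / 9
p = 1 - NormalDist(mu_hat, sqrt(var_hat / 9)).cdf(32.0)
print(round(mu_hat, 2), round(var_hat, 2), round(p, 4))
```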
OCR S2 2009 June Q7
16 marks Standard +0.3
7 The continuous random variable \(X\) has probability density function given by $$f ( x ) = \begin{cases} \frac { 2 } { 9 } x ( 3 - x ) & 0 \leqslant x \leqslant 3 , \\ 0 & \text { otherwise } . \end{cases}$$
  1. Find the variance of \(X\).
  2. Show that the probability that a single observation of \(X\) lies between 0.0 and 0.5 is \(\frac { 2 } { 27 }\).
  3. 108 observations of \(X\) are obtained. Using a suitable approximation, find the probability that at least 10 of the observations lie between 0.0 and 0.5.
  4. The mean of 108 observations of \(X\) is denoted by \(\bar { X }\). Write down the approximate distribution of \(\bar { X }\), giving the value(s) of any parameter(s).
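The integrals in this question are polynomial, so they can be checked exactly with rational arithmetic. In the sketch below the helper names are hypothetical, and the Poisson distribution with mean \(108 \times \frac{2}{27} = 8\) is assumed as one suitable approximation for the third part.

```python
from fractions import Fraction as F
from math import exp, factorial

def power_integral(r, a, b):
    """Integral of x^r over [a, b]."""
    return (F(b) ** (r + 1) - F(a) ** (r + 1)) / (r + 1)

def moment(r, a=0, b=3):
    """Integral of x^r * (2/9)(3x - x^2) over [a, b]."""
    return F(2, 9) * (3 * power_integral(r + 1, a, b) - power_integral(r + 2, a, b))

var_x = moment(2) - moment(1) ** 2      # Var(X)
p_single = moment(0, 0, F(1, 2))        # P(0 < X < 0.5)

lam = float(108 * p_single)             # Poisson mean for the count
p_at_least_10 = 1 - sum(exp(-lam) * lam ** j / factorial(j) for j in range(10))

var_mean = var_x / 108                  # variance of the mean of 108 observations
print(var_x, p_single, round(p_at_least_10, 4), var_mean)
```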
OCR S2 2009 June Q8
11 marks Standard +0.3
8 In a large company the time taken for an employee to carry out a certain task is a normally distributed random variable with mean 78.0 s and unknown variance. A new training scheme is introduced and after its introduction the times taken by a random sample of 120 employees are recorded. The mean time for the sample is 76.4 s and an unbiased estimate of the population variance is \(68.9 \mathrm {~s} ^ { 2 }\).
  1. Test, at the \(1 \%\) significance level, whether the mean time taken for the task has changed.
  2. It is required to redesign the test so that the probability of making a Type I error is less than 0.01 when the sample mean is 77.0 s. Calculate an estimate of the smallest sample size needed, and explain why your answer is only an estimate.
OCR S2 2011 June Q6
12 marks Standard +0.3
6 Records show that before the year 1990 the maximum daily temperature \(T ^ { \circ } \mathrm { C }\) at a seaside resort in August can be modelled by a distribution with mean 24.3. The maximum temperatures of a random sample of 50 August days since 1990 can be summarised by $$n = 50 , \quad \Sigma t = 1314.0 , \quad \Sigma t ^ { 2 } = 36602.17 .$$
  1. Test, at the \(1 \%\) significance level, whether there is evidence of a change in the mean maximum daily temperature in August since 1990.
  2. Give a reason why it is possible to use the Central Limit Theorem in your test.
OCR S2 2012 June Q2
6 marks Standard +0.3
2
  1. For the continuous random variable \(V\), it is known that \(\mathrm { E } ( V ) = 72.0\). The mean of a random sample of 40 observations of \(V\) is denoted by \(\bar { V }\). Given that \(\mathrm { P } ( \bar { V } < 71.2 ) = 0.35\), estimate the value of \(\operatorname { Var } ( V )\).
  2. Explain why you need to use the Central Limit Theorem in part (i), and why its use is justified.
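A sketch of the estimate in the first part, assuming \(\bar{V}\) is treated as approximately normal with mean 72.0 and variance \(\operatorname{Var}(V) / 40\):

```python
from statistics import NormalDist

n = 40
z = NormalDist().inv_cdf(0.35)          # negative, since 0.35 < 0.5
se = (71.2 - 72.0) / z                  # standard error of V-bar
var_v = n * se ** 2                     # estimate of Var(V)
print(round(z, 4), round(var_v, 1))
```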
OCR S2 2013 June Q3
9 marks Challenging +1.2
3 The mean of a sample of 80 independent observations of a continuous random variable \(Y\) is denoted by \(\bar { Y }\). It is given that \(\mathrm { P } ( \bar { Y } \leqslant 157.18 ) = 0.1\) and \(\mathrm { P } ( \bar { Y } \geqslant 164.76 ) = 0.7\).
  1. Calculate \(\mathrm { E } ( Y )\) and the standard deviation of \(Y\).
  2. State
    1. where in your calculations you have used the Central Limit Theorem,
    2. why it was necessary to use the Central Limit Theorem,
    3. why it was possible to use the Central Limit Theorem.
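The first part reduces to two simultaneous equations in \(\mathrm{E}(Y)\) and the standard error. A numerical sketch, assuming \(\bar{Y}\) is approximately normal; note that \(\mathrm{P}(\bar{Y} \geqslant 164.76) = 0.7\) is the same as \(\mathrm{P}(\bar{Y} \leqslant 164.76) = 0.3\).

```python
from math import sqrt
from statistics import NormalDist

std = NormalDist()
n = 80
z1 = std.inv_cdf(0.1)                   # for the point 157.18
z2 = std.inv_cdf(0.3)                   # for the point 164.76

se = (164.76 - 157.18) / (z2 - z1)      # sd of Y-bar
mean_y = 157.18 - z1 * se               # E(Y)
sd_y = se * sqrt(n)                     # sd of a single observation of Y
print(round(mean_y, 1), round(sd_y, 1))
```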
OCR MEI S2 2015 June Q2
19 marks Moderate -0.3
2 It was stated in 2012 that \(3 \%\) of \(\pounds 1\) coins were fakes. Throughout this question, you should assume that this is still the case.
  1. Find the probability that, in a random selection of 25 \(\pounds 1\) coins, there is exactly one fake coin. A random sample of 250 \(\pounds 1\) coins is selected.
  2. Explain why a Poisson distribution is an appropriate approximating distribution for the number of fake coins in the sample.
  3. Use a Poisson distribution to find the probability that, in this sample, there are
    (A) exactly 10 fake coins,
    (B) at least 10 fake coins.
  4. Use a suitable approximating distribution to find the probability that there are at least 50 fake coins in a sample of 2000 coins. It is known that \(0.2 \%\) of another type of coin are fakes.
  5. A random sample of size \(n\) of these coins is taken. Using a Poisson approximating distribution, show that the probability of at most one fake coin in the sample is equal to \(\mathrm { e } ^ { - \lambda } + \lambda \mathrm { e } ^ { - \lambda }\), where \(\lambda = 0.002 n\).
  6. Use the approximation \(\mathrm { e } ^ { - \lambda } + \lambda \mathrm { e } ^ { - \lambda } \approx 1 - \frac { \lambda ^ { 2 } } { 2 }\) for small values of \(\lambda\) to estimate the value of \(n\) for which the probability in part (v) is equal to 0.995.
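The numerical parts of this question can be checked with the sketch below. The distributions used are assumptions about what is intended: binomial for 25 coins, Poisson with mean 7.5 for 250 coins, a continuity-corrected Normal for 2000 coins, and the stated small-\(\lambda\) approximation for the final part.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist

p = 0.03

# 25 coins, exactly one fake (binomial)
p_one = comb(25, 1) * p * (1 - p) ** 24

# 250 coins, Poisson with mean 7.5
lam = 250 * p
def pois(j):
    return exp(-lam) * lam ** j / factorial(j)
p_exactly_10 = pois(10)
p_at_least_10 = 1 - sum(pois(j) for j in range(10))

# 2000 coins, Normal approximation to B(2000, 0.03), at least 50 fakes
mean, var = 2000 * p, 2000 * p * (1 - p)
p_at_least_50 = 1 - NormalDist(mean, sqrt(var)).cdf(49.5)

# Final part: 1 - lambda^2 / 2 = 0.995 with lambda = 0.002 n
n_est = round(sqrt(2 * (1 - 0.995)) / 0.002)
print(round(p_one, 4), round(p_exactly_10, 4), round(p_at_least_10, 4),
      round(p_at_least_50, 4), n_est)
```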