5.04a Linear combinations: E(aX+bY), Var(aX+bY)

316 questions

AQA S2 2007 June Q6
12 marks Standard +0.8
6 The continuous random variable \(X\) has the probability density function given by $$f ( x ) = \left\{ \begin{array} { c c } 3 x ^ { 2 } & 0 < x \leqslant 1 \\ 0 & \text { otherwise } \end{array} \right.$$
  1. Determine:
    1. \(\mathrm { E } \left( \frac { 1 } { X } \right)\);
      (3 marks)
    2. \(\operatorname { Var } \left( \frac { 1 } { X } \right)\).
  2. Hence, or otherwise, find the mean and the variance of \(\left( \frac { 5 + 2 X } { X } \right)\).
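The "hence" step in part 2 rests on writing \(\frac{5+2X}{X} = \frac{5}{X} + 2\) and applying \(\mathrm{E}(aW+b) = a\mathrm{E}(W)+b\) and \(\operatorname{Var}(aW+b) = a^2\operatorname{Var}(W)\) with \(W = 1/X\). A minimal sketch that checks the arithmetic exactly; the helper name `integrate_monomial` is an illustrative assumption, not part of the question.

```python
from fractions import Fraction

def integrate_monomial(coeff, power):
    """Exact integral of coeff * x**power over 0 < x <= 1."""
    return Fraction(coeff) / (power + 1)

# f(x) = 3x^2 on (0, 1], so E(1/X) integrates 3x and E(1/X^2) integrates 3.
e_inv = integrate_monomial(3, 1)       # E(1/X)
e_inv_sq = integrate_monomial(3, 0)    # E(1/X^2)
var_inv = e_inv_sq - e_inv**2          # Var(1/X)

# (5 + 2X)/X = 5*(1/X) + 2 is a linear function of 1/X.
mean_w = 5 * e_inv + 2
var_w = 5**2 * var_inv                 # the added constant 2 does not affect the variance

print(e_inv, var_inv, mean_w, var_w)
```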
OCR S2 Q2
7 marks Moderate -0.8
2 The random variable \(W\) has the distribution \(B \left( 40 , \frac { 2 } { 7 } \right)\). Use an appropriate approximation to find \(\mathrm { P } ( W > 13 )\).
AQA Further AS Paper 2 Statistics 2018 June Q3
4 marks Standard +0.3
3 The discrete random variable \(X\) has the following probability distribution
$$\begin{array} { c | c c c c } \boldsymbol { x } & 1 & 2 & 4 & 9 \\ \hline \mathbf { P } ( \boldsymbol { X } = \boldsymbol { x } ) & 0.2 & 0.4 & 0.35 & 0.05 \end{array}$$
The continuous random variable \(Y\) has the following probability density function $$\mathrm { f } ( y ) = \begin{cases} \frac { 1 } { 64 } y ^ { 3 } & 0 \leq y \leq 4 \\ 0 & \text { otherwise } \end{cases}$$ Given that \(X\) and \(Y\) are independent, show that \(\mathrm { E } \left( X ^ { 2 } + Y ^ { 2 } \right) = \frac { 1327 } { 60 }\)
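Because expectation is linear, \(\mathrm{E}(X^2 + Y^2) = \mathrm{E}(X^2) + \mathrm{E}(Y^2)\), so each term can be found separately. The sketch below checks the stated value with exact fractions, reading the table as \(x = 1, 2, 4, 9\) with probabilities 0.2, 0.4, 0.35, 0.05; the variable names are illustrative.

```python
from fractions import Fraction

# Distribution of X as read from the table.
x_dist = {1: Fraction("0.2"), 2: Fraction("0.4"),
          4: Fraction("0.35"), 9: Fraction("0.05")}
e_x2 = sum(x**2 * p for x, p in x_dist.items())

# E(Y^2) = integral of y^2 * y^3/64 over [0, 4] = 4^6 / (6 * 64).
e_y2 = Fraction(4**6, 6 * 64)

total = e_x2 + e_y2
print(e_x2, e_y2, total)
```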
AQA Further AS Paper 2 Statistics 2023 June Q1
1 marks Easy -1.2
1 The continuous random variable \(X\) has variance 9. The discrete random variable \(Y\) has standard deviation 2 and is independent of \(X\). Find \(\operatorname { Var } ( X + Y )\). Circle your answer.
5    11    13    85
AQA Further AS Paper 2 Statistics 2023 June Q6
8 marks Standard +0.3
6 An insurance company models the number of motor claims received in 1 day using a Poisson distribution with mean 65.
  1. Find the probability that the company receives at most 60 motor claims in 1 day. Give your answer to three decimal places.
  2. The company receives motor claims using a telephone line which is open 24 hours a day. Find the probability that the company receives exactly 2 motor claims in 1 hour. Give your answer to three decimal places.
  3. The company models the number of property claims received in 1 day using a Poisson distribution with mean 23. Assume that the number of property claims received is independent of the number of motor claims received.
    1. Find the standard deviation of the variable that represents the total number of motor claims and property claims received in 1 day. Give your answer to three significant figures.
    2. Find the probability that the company receives a total of more than 90 motor claims and property claims in 1 day. Give your answer to three significant figures.
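The last part uses the fact that a sum of independent Poisson variables is again Poisson, with mean (and variance) equal to the sum of the means. A hedged sketch of the calculations using only the standard library; `poisson_pmf` and `poisson_cdf` are hypothetical helper names written for this illustration.

```python
import math

def poisson_pmf(k, lam):
    """P(N = k) for N ~ Po(lam), computed via logarithms to avoid overflow."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def poisson_cdf(k, lam):
    """P(N <= k) for N ~ Po(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

motor, prop = 65, 23
p_at_most_60 = poisson_cdf(60, motor)
p_two_in_hour = poisson_pmf(2, motor / 24)   # hourly mean = daily mean / 24
total_mean = motor + prop                    # independent Poissons add
total_sd = math.sqrt(total_mean)             # Poisson variance equals its mean
p_more_than_90 = 1 - poisson_cdf(90, total_mean)

print(round(p_at_most_60, 3), round(p_two_in_hour, 3),
      round(total_sd, 2), round(p_more_than_90, 3))
```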
AQA Further Paper 3 Statistics 2020 June Q9
6 marks Challenging +1.2
9 The continuous random variable \(X\) has the cumulative distribution function shown below. $$\mathrm { F } ( x ) = \left\{ \begin{array} { c c } 0 & x < 0 \\ \frac { 1 } { 62 } \left( 4 x ^ { 3 } + 6 x ^ { 2 } + 3 x \right) & 0 \leq x \leq 2 \\ 1 & x > 2 \end{array} \right.$$ The discrete random variable \(Y\) has the probability distribution shown below.
$$\begin{array} { c | c c c c } y & 2 & 7 & 13 & 19 \\ \hline \mathrm { P } ( Y = y ) & 0.5 & 0.1 & 0.1 & 0.3 \end{array}$$
The random variables \(X\) and \(Y\) are independent.
Find the exact value of \(\mathrm { E } \left( X ^ { 3 } + Y \right)\).
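Here \(\mathrm{E}(X^3 + Y) = \mathrm{E}(X^3) + \mathrm{E}(Y)\), with the density obtained by differentiating the cumulative distribution function: \(\mathrm{f}(x) = \frac{1}{62}(12x^2 + 12x + 3)\) on \(0 \leq x \leq 2\). A minimal exact check, reading the table as \(y = 2, 7, 13, 19\) with probabilities 0.5, 0.1, 0.1, 0.3; the function name `moment` is an assumption for illustration.

```python
from fractions import Fraction

# pdf f(x) = F'(x) = (12x^2 + 12x + 3)/62 on [0, 2], stored as {power: coefficient}.
pdf_coeffs = {2: Fraction(12, 62), 1: Fraction(12, 62), 0: Fraction(3, 62)}

def moment(r, upper=2):
    """Exact E(X^r): integrate x^r * f(x) term by term over [0, upper]."""
    return sum(c * Fraction(upper) ** (p + r + 1) / (p + r + 1)
               for p, c in pdf_coeffs.items())

y_dist = {2: Fraction("0.5"), 7: Fraction("0.1"),
          13: Fraction("0.1"), 19: Fraction("0.3")}
e_y = sum(y * p for y, p in y_dist.items())

answer = moment(3) + e_y   # E(X^3 + Y) = E(X^3) + E(Y)
print(moment(0), moment(3), e_y, answer)
```

The call `moment(0)` doubles as a sanity check that the density integrates to 1.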
AQA Further Paper 3 Statistics 2022 June Q6
8 marks Standard +0.3
6 The discrete random variable \(X\) has probability distribution function $$\mathrm { P } ( X = x ) = \begin{cases} a & x = 0 \\ b & x = 1 \\ c & x = 2 \\ 0 & \text { otherwise } \end{cases}$$ where \(a , b\) and \(c\) are constants.
The mean of \(X\) is 1.2 and the variance of \(X\) is 0.56.
  1. Deduce the values of \(a , b\) and \(c\).
  2. The continuous random variable \(Y\) is independent of \(X\) and has variance 15. Find \(\operatorname { Var } ( X - 2 Y - 11 )\). [2 marks]
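Part 1 follows from \(\mathrm{E}(X) = b + 2c\) and \(\mathrm{E}(X^2) = b + 4c = \operatorname{Var}(X) + \mathrm{E}(X)^2\); part 2 uses \(\operatorname{Var}(aX + bY + c) = a^2\operatorname{Var}(X) + b^2\operatorname{Var}(Y)\) for independent variables. One possible exact check, with illustrative variable names:

```python
from fractions import Fraction

mean, var = Fraction("1.2"), Fraction("0.56")
e_x2 = var + mean**2           # E(X^2) = Var(X) + E(X)^2

# E(X) = b + 2c and E(X^2) = b + 4c, so the difference is 2c.
c = (e_x2 - mean) / 2
b = mean - 2 * c
a = 1 - b - c                  # probabilities sum to 1

# Var(X - 2Y - 11) = Var(X) + (-2)^2 Var(Y); the constant -11 contributes nothing.
var_y = 15
var_combo = var + (-2) ** 2 * var_y
print(a, b, c, var_combo)
```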
AQA Further Paper 3 Statistics 2023 June Q6
7 marks Easy -1.2
6 A game consists of two rounds. The first round of the game uses a random number generator to output the score \(X\), a real number between 0 and 10.
  1. Find \(\mathrm { P } ( X > 4 )\).
  2. The second round of the game uses an unbiased dice, with faces numbered 1 to 6, to give the score \(Y\). The variables \(X\) and \(Y\) are independent.
    1. Find the mean total score of the game.
    2. Find the variance of the total score of the game.
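A sketch of the working under the assumption (not stated explicitly in the question text above) that \(X\) is continuous uniform on \([0, 10]\), so that \(\mathrm{E}(X) = 5\) and \(\operatorname{Var}(X) = \frac{10^2}{12}\). The means add regardless of independence; the variances add because \(X\) and \(Y\) are independent.

```python
from fractions import Fraction

# Assumption: X ~ U(0, 10).
lo, hi = 0, 10
p_x_gt_4 = Fraction(hi - 4, hi - lo)
e_x = Fraction(lo + hi, 2)
var_x = Fraction((hi - lo) ** 2, 12)

# Y is the score on a fair six-sided dice.
faces = range(1, 7)
e_y = Fraction(sum(faces), 6)
var_y = Fraction(sum(f * f for f in faces), 6) - e_y**2

mean_total = e_x + e_y
var_total = var_x + var_y      # independence lets the variances add
print(p_x_gt_4, mean_total, var_total)
```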
OCR MEI Further Statistics Major Specimen Q5
7 marks Standard +0.3
5 A particular brand of pasta is sold in bags of two different sizes. The mass of pasta in the large bags is advertised as being 1500 g ; in fact it is Normally distributed with mean 1515 g and standard deviation 4.7 g . The mass of pasta in the small bags is advertised as being 500 g ; in fact it is Normally distributed with mean 508 g and standard deviation 3.3 g .
  1. Find the probability that the total mass of pasta in 5 randomly selected small bags is less than 2550 g .
  2. Find the probability that the mass of pasta in a randomly selected large bag is greater than three times the mass of pasta in a randomly selected small bag.
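The two parts contrast a sum of five independent bags (variance \(5\sigma^2\)) with three times a single bag (variance \(3^2\sigma^2\)). A minimal sketch using `statistics.NormalDist` from the Python standard library, assuming all bags are independent; variable names are illustrative.

```python
from statistics import NormalDist

large = NormalDist(1515, 4.7)
small = NormalDist(508, 3.3)

# Total of 5 independent small bags: means add and variances add.
total5 = NormalDist(5 * small.mean, (5 * small.variance) ** 0.5)
p_total_lt_2550 = total5.cdf(2550)

# D = L - 3S for one large and one small bag: Var(D) = Var(L) + 3^2 Var(S).
diff = NormalDist(large.mean - 3 * small.mean,
                  (large.variance + 9 * small.variance) ** 0.5)
p_large_gt_3small = 1 - diff.cdf(0)

print(round(p_total_lt_2550, 4), round(p_large_gt_3small, 4))
```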
OCR MEI Further Statistics Major Specimen Q11
24 marks Standard +0.3
11 Two girls, Lili and Hui, play a game with a fair six-sided dice. The dice is thrown 10 times. \(X _ { 1 } , X _ { 2 } , \ldots , X _ { 10 }\) represent the scores on the \(1 ^ { \text {st } } , 2 ^ { \text {nd } } , \ldots , 10 ^ { \text {th } }\) throws of the dice. \(L\) denotes Lili's score and \(L = 10 X _ { 1 }\). \(H\) denotes Hui's score and \(H = X _ { 1 } + X _ { 2 } + X _ { 3 } + \ldots + X _ { 10 }\).
  1. Calculate
    The spreadsheet below shows a simulation of 25 plays of the game. The cell E3, highlighted, shows the score when the dice is thrown the fourth time in the first game. \begin{table}[h]
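    \begin{tabular}{|c|l|cccccccccc|c|c|}
    \hline
     & A & B & C & D & E & F & G & H & I & J & K & & \\
    \hline
    1 & & \multicolumn{10}{c|}{Throw of dice} & Lili's & Hui's \\
    2 & & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & score & score \\
    3 & Game 1 & 3 & 5 & 2 & 1 & 1 & 3 & 1 & 1 & 1 & 4 & 30 & 22 \\
    4 & Game 2 & 6 & 3 & 2 & 4 & 4 & 3 & 5 & 3 & 3 & 5 & 60 & 38 \\
    5 & Game 3 & 6 & 4 & 2 & 6 & 5 & 2 & 1 & 5 & 2 & 3 & 60 & 36 \\
    6 & Game 4 & 1 & 5 & 1 & 6 & 6 & 3 & 1 & 4 & 6 & 2 & 10 & 35 \\
    7 & Game 5 & 4 & 4 & 3 & 1 & 6 & 4 & 4 & 1 & 6 & 2 & 40 & 35 \\
    8 & Game 6 & 2 & 1 & 5 & 1 & 2 & 5 & 1 & 5 & 2 & 3 & 20 & 27 \\
    9 & Game 7 & 1 & 1 & 3 & 4 & 4 & 5 & 6 & 3 & 4 & 2 & 10 & 33 \\
    10 & Game 8 & 1 & 1 & 3 & 6 & 3 & 4 & 4 & 5 & 2 & 3 & 10 & 32 \\
    11 & Game 9 & 2 & 2 & 2 & 4 & 3 & 2 & 1 & 5 & 5 & 6 & 20 & 32 \\
    12 & Game 10 & 3 & 5 & 3 & 3 & 5 & 3 & 4 & 3 & 1 & 1 & 30 & 31 \\
    13 & Game 11 & 5 & 3 & 6 & 5 & 5 & 4 & 2 & 1 & 1 & 5 & 50 & 37 \\
    14 & Game 12 & 6 & 4 & 3 & 2 & 4 & 1 & 3 & 3 & 5 & 3 & 60 & 34 \\
    15 & Game 13 & 2 & 3 & 2 & 1 & 2 & 2 & 2 & 2 & 2 & 1 & 20 & 19 \\
    16 & Game 14 & 4 & 1 & 3 & 3 & 1 & 2 & 6 & 6 & 1 & 3 & 40 & 30 \\
    17 & Game 15 & 5 & 1 & 2 & 6 & 3 & 4 & 6 & 3 & 6 & 4 & 50 & 40 \\
    18 & Game 16 & 3 & 6 & 1 & 1 & 5 & 3 & 1 & 3 & 3 & 3 & 30 & 29 \\
    19 & Game 17 & 5 & 2 & 5 & 2 & 4 & 5 & 2 & 2 & 3 & 4 & 50 & 34 \\
    20 & Game 18 & 3 & 6 & 3 & 5 & 5 & 2 & 3 & 1 & 1 & 2 & 30 & 31 \\
    21 & Game 19 & 6 & 6 & 3 & 1 & 5 & 6 & 3 & 4 & 1 & 6 & 60 & 41 \\
    22 & Game 20 & 2 & 6 & 4 & 5 & 6 & 5 & 2 & 4 & 3 & 3 & 20 & 40 \\
    23 & Game 21 & 5 & 3 & 5 & 4 & 5 & 3 & 3 & 6 & 6 & 1 & 50 & 41 \\
    24 & Game 22 & 6 & 3 & 5 & 5 & 6 & 3 & 5 & 6 & 1 & 1 & 60 & 41 \\
    25 & Game 23 & 5 & 4 & 5 & 5 & 6 & 4 & 2 & 1 & 3 & 6 & 50 & 41 \\
    26 & Game 24 & 3 & 5 & 2 & 3 & 2 & 4 & 3 & 2 & 3 & 3 & 30 & 30 \\
    27 & Game 25 & 5 & 2 & 4 & 2 & 4 & 5 & 2 & 2 & 5 & 2 & 50 & 33 \\
    28 & \multicolumn{13}{c|}{} \\
    29 & \multicolumn{11}{r|}{mean} & 37.60 & 33.68 \\
    30 & \multicolumn{11}{r|}{sd} & 17.39 & 5.77 \\
    \hline
    \end{tabular}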
    \captionsetup{labelformat=empty} \caption{Fig. 11}
    \end{table}
  2. Use the simulation to estimate \(\mathrm { P } ( L > 40 )\) and \(\mathrm { P } ( H > 40 )\).
  3. (A) Calculate the exact value of \(\mathrm { P } ( L > 40 )\).
    (B) Comment on how the exact value compares with your estimate of \(\mathrm { P } ( L > 40 )\) in part (v). Hui wonders whether it is appropriate to use the Central Limit Theorem to approximate the distribution of \(X _ { 1 } + X _ { 2 } + X _ { 3 } + \ldots + X _ { 10 }\).
  4. (A) State what type of diagram Hui could draw, based on the output from the spreadsheet, to investigate this.
    (B) Explain how she should interpret the diagram.
  5. (A) Calculate an approximate value of \(\mathrm { P } \left( X _ { 1 } + X _ { 2 } + X _ { 3 } + \ldots + X _ { 10 } > 40 \right)\) using the Central Limit Theorem.
    (B) Comment on how this value compares with your estimate of \(\mathrm { P } ( H > 40 )\) in part (v).
WJEC Further Unit 5 2022 June Q2
15 marks Challenging +1.2
2. Geraint is a beekeeper. The amounts of honey, \(X \mathrm {~kg}\), that he collects annually, from each hive are modelled by the normal distribution \(\mathrm { N } \left( 15,5 ^ { 2 } \right)\). At location \(A\), Geraint has three hives and at location \(B\) he has five hives. You may assume that the amounts of honey collected from the eight hives are independent of each other.
    1. Find the probability that Geraint collects more than 14 kg of honey from the first hive at location \(A\).
    2. Find the probability that he collects more than 14 kg of honey from exactly two out of the three hives at location \(A\).
  1. Find the probability that the total amount of honey that Geraint collects from all eight hives is more than 160 kg .
  2. Find the probability that Geraint collects at least twice as much honey from location B as from location A.
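The last two parts are linear combinations of independent normal variables: the total over eight hives, and \(D = B - 2A\), where \(A\) and \(B\) are the totals at the two locations. A hedged sketch with `statistics.NormalDist`; the names `hive`, `total` and `d_mean` are illustrative.

```python
from math import comb
from statistics import NormalDist

hive = NormalDist(15, 5)
p = 1 - hive.cdf(14)                         # one hive yields more than 14 kg
p_exactly_two = comb(3, 2) * p**2 * (1 - p)  # binomial count over the 3 hives at A

# Total over 8 independent hives: mean 8*15, variance 8*25.
total = NormalDist(8 * hive.mean, (8 * hive.variance) ** 0.5)
p_total_gt_160 = 1 - total.cdf(160)

# D = B - 2A, with A a sum of 3 hives and B a sum of 5 hives.
d_mean = 5 * hive.mean - 2 * 3 * hive.mean
d_var = 5 * hive.variance + 2**2 * 3 * hive.variance
p_b_at_least_2a = 1 - NormalDist(d_mean, d_var ** 0.5).cdf(0)

print(round(p, 4), round(p_exactly_two, 4),
      round(p_total_gt_160, 4), round(p_b_at_least_2a, 4))
```

Note that \(\operatorname{Var}(2A) = 4\operatorname{Var}(A) = 4 \times 3 \times 25\), which differs from the variance of six independent hives.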
WJEC Further Unit 5 2022 June Q7
19 marks Challenging +1.2
7. \includegraphics[max width=\textwidth, alt={}, center]{65369843-222f-48b2-b8cd-a1c304eac3d9-6_707_718_347_660} The diagram above shows a cyclic quadrilateral \(A B C D\), where \(\widehat { B A D } = \alpha , \widehat { B C D } = \beta\) and \(\alpha + \beta = 180 ^ { \circ }\). These angles are measured.
The random variables \(X\) and \(Y\) denote the measured values, in degrees, of \(\widehat { B A D }\) and \(\widehat { B C D }\) respectively. You are given that \(X\) and \(Y\) are independently normally distributed with standard deviation \(\sigma\) and means \(\alpha\) and \(\beta\) respectively.
  1. Calculate, correct to two decimal places, the probability that \(X + Y\) will differ from \(180 ^ { \circ }\) by less than \(\sigma\).
  2. Show that \(T _ { 1 } = 45 ^ { \circ } + \frac { 1 } { 4 } ( 3 X - Y )\) is an unbiased estimator for \(\alpha\) and verify that it is a better estimator than \(X\) for \(\alpha\).
  3. Now consider \(T _ { 2 } = \lambda X + ( 1 - \lambda ) \left( 180 ^ { \circ } - Y \right)\).
    1. Show that \(T _ { 2 }\) is an unbiased estimator for \(\alpha\) for all values of \(\lambda\).
    2. Find \(\operatorname { Var } \left( T _ { 2 } \right)\) in terms of \(\lambda\) and \(\sigma\).
    3. Hence determine the value of \(\lambda\) which gives the best unbiased estimator for \(\alpha\).
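The comparison of estimators in parts 2 and 3 comes down to \(\operatorname{Var}(aX + bY) = a^2\operatorname{Var}(X) + b^2\operatorname{Var}(Y)\) for independent \(X\) and \(Y\). A sketch of the working, assuming "better" and "best" mean smaller variance among unbiased estimators:

$$\begin{aligned} \operatorname{Var}(T_1) &= \tfrac{9}{16}\sigma^2 + \tfrac{1}{16}\sigma^2 = \tfrac{5}{8}\sigma^2 < \sigma^2 = \operatorname{Var}(X), \\ \operatorname{Var}(T_2) &= \lambda^2\sigma^2 + (1-\lambda)^2\sigma^2 = \left(2\lambda^2 - 2\lambda + 1\right)\sigma^2, \\ \frac{d}{d\lambda}\operatorname{Var}(T_2) &= (4\lambda - 2)\sigma^2 = 0 \;\Rightarrow\; \lambda = \tfrac{1}{2}, \qquad \operatorname{Var}(T_2) = \tfrac{1}{2}\sigma^2 . \end{aligned}$$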
OCR Further Statistics 2021 June Q1
5 marks Standard +0.3
1 The performance of a piece of music is being recorded. The piece consists of three sections, \(A , B\) and \(C\). The times, in seconds, taken to perform the three sections are normally distributed random variables with the following means and standard deviations. \end{table}
\begin{tabular}{|l|l|l|l|l|}
\hline
Question & Answer & Mark & AO & Guidance \\
\hline
1(a) & \(A + B + C \sim \mathrm{N}(701, 419)\) & M1 & 1.1a & Normal, mean \(\mu_A + \mu_B + \mu_C\) \\
 & & A1 & 1.1 & Variance 419 \\
 & \(\mathrm{P}(> 720) = 0.176649\) & A1 & 1.1 & Answer, 0.177 or better, www \\
\hline
1(b) & \(2A + B \sim \mathrm{N}(701, 757)\) & M1 & 1.1a & Normal, same mean, \(4\sigma_A^2 + \sigma_B^2\) \\
 & \(\mathrm{P}(> 720) = 0.244919\) & A1 [2] & 1.1 & Answer, art 0.245 \\
\hline
\end{tabular}
\multirow{2}{*}{2}\multirow{2}{*}{(a)}\(\frac { { } ^ { 8 } C _ { 3 } \times { } ^ { 20 } C _ { 5 } } { { } ^ { 28 } C _ { 8 } }\)M1 A13.1b 1.1(Product of two \({ } ^ { n } C _ { r }\) ) ÷ \({ } ^ { n } C _ { r }\) At least two \({ } ^ { n } C _ { r }\) correct\multirow[t]{2}{*}{Or \(\frac { 8 } { 28 } \times \frac { 7 } { 27 } \times \frac { 6 } { 26 } \times \frac { 20 } { 25 } \times \ldots \times \frac { 16 } { 21 } \times { } ^ { 8 } C _ { 3 } = 0.27934 \ldots\)}
\(\frac { 56 \times 15504 } { 3108105 } = 0.27934 \ldots\)A1 [3]1.1Any exact form or awrt 0.279
2(b)
× B × B × B × B × B × B × B × B x
GGG in one \(\mathrm { x } , \mathrm { G }\) in another: \(9 \times 8\) \(\div \frac { 12 ! } { 8 ! \times 4 ! }\) \(= \frac { 72 } { 495 } = \frac { 8 } { 55 } \text { or } 0.145 \ldots\)
M1 A13.1b 2.1
Or e.g. \(\frac { 10 ! } { 8 ! } - 2 \times 9\)
Divide by \({ } _ { 12 } \mathrm { C } _ { 4 }\) oe
Or, e.g. find \({ } _ { 12 } \mathrm { C } _ { 4 }\) - (\# (all separate) +\#(all together) \(+ \# ( 2,1,1 ) \times 3 +\) \#(2,2))
M11.1
A11.1
[4]
QuestionAnswerMarkAOGuidance
\multirow{7}{*}{3}\multirow{7}{*}{(a)}\(\mathrm { H } _ { 0 } : \mu = 700\)B21.1One error, e.g. no or wrongIgnore failure to define \(\mu\)
\(\mathrm { H } _ { 1 } : \mu < 700\) where \(\mu\) is the mean reaction1.1letter, \(\neq\), etc : B1here
\(\bar { x } = 607\)M13.3Find sample mean
\(z = - 1.822\) or \(p = 0.0342\) or \(\mathrm { CV } = 616.05 \ldots\)A13.4Correct \(z , p\) or CV
\(z < - 1.645\) or \(p < 0.05\) or \(607 < \mathrm { CV }\)A11.1Correct comparison
Reject \(\mathrm { H } _ { 0 }\)M1ft1.1Correct first conclusionNeeds correct method, like-
Significant evidence that mean reaction timesA1ft2.2bContext, not too definite (e.g. not "international athletes' reaction times are shorter"ft on their \(z , p\) or CV
3(b)(i)Uses more information (e.g. magnitudes of differences)B1 [1]2.4
\multirow{5}{*}{3}\multirow{5}{*}{(b)}\multirow{5}{*}{(ii)}\(\mathrm { H } _ { 0 } : m = 700 , \mathrm { H } _ { 1 } : m < 700\) where \(m\) is the median reaction time for all international athletesB12.5Same as in (i) but different letter or "median" stated
\(W _ { - } = 18\)
\(W _ { + } = 3\) so \(T = 3\)
For both, and \(T\) correct
\(n = 6 , \mathrm { CV } = 2\)A11.1Correct CV
Do not reject \(\mathrm { H } _ { 0 }\). Insufficient evidence that median reaction times of international athletes are shorterA1ft [6]2.2bIn context, not too definiteFT on their \(T\)
3(c)They use different assumptionsB1 [1]2.3Not "one is more accurate"
QuestionAnswerMarkAOGuidance
4(a)\(\begin{aligned} \int _ { 0 } ^ { a } x \frac { 2 x } { a ^ { 2 } } d x &= 4 \\ \left[ \frac { 2 x ^ { 3 } } { 3 a ^ { 2 } } \right] &= 4 \\ \frac { 2 } { 3 } a = 4 &\Rightarrow a = 6 \end{aligned}\)
M1
B1
A1 [3]
3.1a
1.1
2.2a
4(b)
\(\mathrm { F } ( x ) = \frac { x ^ { 2 } } { 36 }\)
Let the CDF of \(M\) be \(\mathrm { H } ( m )\). Then \(\mathrm { H } ( m ) = \mathrm { P } (\) all observations less than \(m )\) \(= [ \mathrm { P } ( X \leqslant m ) ] ^ { 5 }\) \(= \left[ \frac { m ^ { 2 } } { 36 } \right] ^ { 5 }\)
\(\mathrm { H } ( m ) = \begin{cases} 0 & m < 0 , \\ \frac { m ^ { 10 } } { 60466176 } & 0 \leq m \leq 6 , \\ 1 & m > 6 . \end{cases}\)
M1 A1ft
M1
M1
A1
A1
A1
A1
[8]
1.1
1.1
2.1
3.1a
2.2a
2.1
2.1
1.2
Find \(\mathrm { F } ( x ) ; = \frac { x ^ { 2 } } { a ^ { 2 } }\)
Correct basis for CDF of \(m\)
Correct function, any letter Range \(0 \leq m \leq 6\)
Letter not \(x\), and 0, 1 present
ft on their \(a\)
Allow
Edexcel S1 2024 October Q6
Moderate -0.3
  1. A biased die with six faces is rolled. The discrete random variable \(X\) represents the score which is uppermost. The cumulative distribution function of \(X\) is shown in the table below.
$$\begin{array} { c | c c c c c c } x & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline \mathrm { F } ( x ) & 0.1 & 0.2 & 3 k & 5 k & 7 k & 10 k \end{array}$$
  1. Find the value of the constant \(k\).
  2. Find the probability distribution of \(X\).
A biased die with eight faces is rolled. The discrete random variable \(Y\) represents the score which is uppermost. The probability distribution of \(Y\) is shown in the table below, where \(a\) and \(b\) are constants.
$$\begin{array} { c | c c c c c c c c } y & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline \mathrm { P } ( Y = y ) & a & a & a & b & b & b & 0.11 & 0.05 \end{array}$$
Given that \(\mathrm { E } ( Y ) = 4.02\),
  3. form and solve two equations in \(a\) and \(b\) to show that \(a = 0.15\). You must show your working.
    (Solutions relying on calculator technology are not acceptable.)
  4. Show that \(\mathrm { E } \left( Y ^ { 2 } \right) = 20.7\).
  5. Find \(\operatorname { Var } ( 5 - 2 Y )\).
These dice are each rolled once. The scores on the two dice are independent.
  6. Find the probability that the sum of these two scores is 3.
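The chain of results in this question can be cross-checked exactly. The sketch below is a numerical check rather than the written working the question asks for; it reads the cumulative table as \(\mathrm{F}(x) = 0.1, 0.2, 3k, 5k, 7k, 10k\) and the \(Y\) table as \(a, a, a, b, b, b, 0.11, 0.05\), and the alias `F` for `Fraction` is a local convenience.

```python
from fractions import Fraction as F

# Die X: F(6) = 10k must equal 1.
k = F(1, 10)
cdf_x = [F("0.1"), F("0.2"), 3 * k, 5 * k, 7 * k, 10 * k]
pmf_x = [cdf_x[0]] + [cdf_x[i] - cdf_x[i - 1] for i in range(1, 6)]

# Die Y: total probability 1 and E(Y) = 4.02 give
#   3a + 3b = 1 - 0.16   and   6a + 15b = 4.02 - (7*0.11 + 8*0.05).
s = (1 - F("0.16")) / 3                           # a + b
t = F("4.02") - (7 * F("0.11") + 8 * F("0.05"))   # 6a + 15b
b = (t - 6 * s) / 9
a = s - b
pmf_y = [a, a, a, b, b, b, F("0.11"), F("0.05")]

e_y = sum(y * p for y, p in enumerate(pmf_y, start=1))
e_y2 = sum(y * y * p for y, p in enumerate(pmf_y, start=1))
var_combo = (-2) ** 2 * (e_y2 - e_y**2)   # Var(5 - 2Y) = 4 Var(Y)

# A total of 3 arises only as (X, Y) = (1, 2) or (2, 1).
p_sum_3 = pmf_x[0] * pmf_y[1] + pmf_x[1] * pmf_y[0]
print(a, b, e_y2, var_combo, p_sum_3)
```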
Edexcel S3 Q7
Standard +0.8
7. A set of scaffolding poles come in two sizes, long and short. The length \(L\) of a long pole has the normal distribution \(\mathrm { N } \left( 19.7,0.5 ^ { 2 } \right)\). The length \(S\) of a short pole has the normal distribution \(\mathrm { N } \left( 4.9,0.2 ^ { 2 } \right)\). The random variables \(L\) and \(S\) are independent. A long pole and a short pole are selected at random.
  1. Find the probability that the length of the long pole is more than 4 times the length of the short pole. Four short poles are selected at random and placed end to end in a row. The random variable \(T\) represents the length of the row.
  2. Find the distribution of \(T\).
  3. Find \(\mathrm { P } ( | L - T | < 0.1 )\).
    \end{table}
    1. Some biologists were studying a large group of wading birds. A random sample of 36 were measured and the wing length, \(x \mathrm {~mm}\) of each wading bird was recorded. The results are summarised as follows
    $$\sum x = 6046 \quad \sum x ^ { 2 } = 1016338$$
  4. Calculate unbiased estimates of the mean and the variance of the wing lengths of these birds. Given that the standard deviation of the wing lengths of this particular type of bird is actually 5.1 mm ,
  5. find a \(99 \%\) confidence interval for the mean wing length of the birds from this group.
    2. Students in a mixed sixth form college are classified as taking courses in either Arts, Science or Humanities. A random sample of students from the college gave the following results \end{table}
    1. A telephone directory contains 50000 names. A researcher wishes to select a systematic sample of 100 names from the directory.
    2. Explain in detail how the researcher should obtain such a sample.
    3. Give one advantage and one disadvantage of
      1. quota sampling,
      2. systematic sampling.
    4. The heights of a random sample of 10 imported orchids are measured. The mean height of the sample is found to be 20.1 cm . The heights of the orchids are normally distributed.
    Given that the population standard deviation is 0.5 cm ,
  6. estimate limits between which \(95 \%\) of the heights of the orchids lie,
  7. find a 98\% confidence interval for the mean height of the orchids. A grower claims that the mean height of this type of orchid is 19.5 cm .
  8. Comment on the grower's claim. Give a reason for your answer.
    3. A doctor is interested in the relationship between a person's Body Mass Index (BMI) and their level of fitness. She believes that a lower BMI leads to a greater level of fitness. She randomly selects 10 female 18 year-olds and calculates each individual's BMI. The females then run a race and the doctor records their finishing positions. The results are shown in the table.
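    $$\begin{array} { l | c c c c c c c c c c } \text { Individual } & A & B & C & D & E & F & G & H & I & J \\ \hline \text { BMI } & 17.4 & 21.4 & 18.9 & 24.4 & 19.4 & 20.1 & 22.6 & 18.4 & 25.8 & 28.1 \\ \text { Finishing position } & 3 & 5 & 1 & 9 & 6 & 4 & 10 & 2 & 7 & 8 \end{array}$$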
  9. Calculate Spearman's rank correlation coefficient for these data.
  10. Stating your hypotheses clearly and using a one tailed test with a \(5 \%\) level of significance, interpret your rank correlation coefficient.
  11. Give a reason to support the use of the rank correlation coefficient rather than the product moment correlation coefficient with these data.
    4. A sample of size 8 is to be taken from a population that is normally distributed with mean 55 and standard deviation 3. Find the probability that the sample mean will be greater than 57.
    5. The number of goals scored by a football team is recorded for 100 games. The results are summarised in Table 1 below. \begin{table}[h]
    \begin{tabular}{|c|c|}
    \hline
    Number of goals & Frequency \\
    \hline
    0 & 40 \\
    1 & 33 \\
    2 & 14 \\
    3 & 8 \\
    4 & 5 \\
    \hline
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 1}
    \end{table}
  12. Calculate the mean number of goals scored per game. The manager claimed that the number of goals scored per match follows a Poisson distribution. He used the answer in part (a) to calculate the expected frequencies given in Table 2. \begin{table}[h]
    \begin{tabular}{|c|c|}
    \hline
    Number of goals & Expected Frequency \\
    \hline
    0 & 34.994 \\
    1 & \(r\) \\
    2 & \(s\) \\
    3 & 6.752 \\
    \(\geqslant 4\) & 2.221 \\
    \hline
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 2}
    \end{table}
  13. Find the value of \(r\) and the value of \(s\) giving your answers to 3 decimal places.
  14. Stating your hypotheses clearly, use a \(5 \%\) level of significance to test the manager's claim.
    1. The lengths of a random sample of 120 limpets taken from the upper shore of a beach had a mean of 4.97 cm and a standard deviation of 0.42 cm . The lengths of a second random sample of 150 limpets taken from the lower shore of the same beach had a mean of 5.05 cm and a standard deviation of 0.67 cm .
    2. Test, using a \(5 \%\) level of significance, whether or not the mean length of limpets from the upper shore is less than the mean length of limpets from the lower shore. State your hypotheses clearly.
    3. State two assumptions you made in carrying out the test in part (a).
    4. A company produces climbing ropes. The lengths of the climbing ropes are normally distributed. A random sample of 5 ropes is taken and the length, in metres, of each rope is measured. The results are given below.
      119.9
      120.3
      120.1
      120.4
      120.2
    5. Calculate unbiased estimates for the mean and the variance of the lengths of the climbing ropes produced by the company.
    The lengths of climbing rope are known to have a standard deviation of 0.2 m . The company wants to make sure that there is a probability of at least 0.90 that the estimate of the population mean, based on a random sample size of \(n\), lies within 0.05 m of its true value.
  15. Find the minimum sample size required.
Edexcel S3 Q8
Moderate -0.3
  1. The random variable \(A\) is defined as
$$A = 4 X - 3 Y$$ where \(X \sim \mathrm {~N} \left( 30,3 ^ { 2 } \right) , Y \sim \mathrm {~N} \left( 20,2 ^ { 2 } \right)\) and \(X\) and \(Y\) are independent. Find
  1. \(\mathrm { E } ( A )\),
  2. \(\operatorname { Var } ( A )\). The random variables \(Y _ { 1 } , Y _ { 2 } , Y _ { 3 }\) and \(Y _ { 4 }\) are independent and each has the same distribution as \(Y\). The random variable \(B\) is defined as $$B = \sum _ { i = 1 } ^ { 4 } Y _ { i }$$
  3. Find \(\mathrm { P } ( B > A )\).
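This question applies \(\mathrm{E}(aX + bY) = a\mathrm{E}(X) + b\mathrm{E}(Y)\) and \(\operatorname{Var}(aX + bY) = a^2\operatorname{Var}(X) + b^2\operatorname{Var}(Y)\) directly. A minimal sketch, assuming in part 3 that the \(Y_i\) are also independent of the \(X\) and \(Y\) used to form \(A\); variable names are illustrative.

```python
from statistics import NormalDist

x_mean, x_var = 30, 3**2
y_mean, y_var = 20, 2**2

# A = 4X - 3Y with X and Y independent.
a_mean = 4 * x_mean - 3 * y_mean
a_var = 4**2 * x_var + (-3) ** 2 * y_var   # the minus sign is squared away

# B = Y1 + Y2 + Y3 + Y4: four independent copies of Y.
b_mean, b_var = 4 * y_mean, 4 * y_var

# B - A is normal with the variances added (assumed independence of A and B).
d = NormalDist(b_mean - a_mean, (b_var + a_var) ** 0.5)
p_b_gt_a = 1 - d.cdf(0)
print(a_mean, a_var, round(p_b_gt_a, 4))
```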
    1. A report states that employees spend, on average, 80 minutes every working day on personal use of the Internet. A company takes a random sample of 100 employees and finds their mean personal Internet use is 83 minutes with a standard deviation of 15 minutes. The company's managing director claims that his employees spend more time on average on personal use of the Internet than the report states.
    Test, at the \(5 \%\) level of significance, the managing director's claim. State your hypotheses clearly.
    2. Philip and James are racing car drivers. Philip's lap times, in seconds, are normally distributed with mean 90 and variance 9. James' lap times, in seconds, are normally distributed with mean 91 and variance 12. The lap times of Philip and James are independent. Before a race, they each take a qualifying lap.
  4. Find the probability that James' time for the qualifying lap is less than Philip's. The race is made up of 60 laps. Assuming that they both start from the same starting line and lap times are independent,
  5. find the probability that Philip beats James in the race by more than 2 minutes.
    3. A woodwork teacher measures the width, \(w \mathrm {~mm}\), of a board. The measured width, \(X \mathrm {~mm}\), is normally distributed with mean \(w \mathrm {~mm}\) and standard deviation 0.5 mm .
  6. Find the probability that \(X\) is within 0.6 mm of \(w\). The same board is measured 16 times and the results are recorded.
  7. Find the probability that the mean of these results is within 0.3 mm of \(w\). Given that the mean of these 16 measurements is 35.6 mm ,
  8. find a \(98 \%\) confidence interval for \(w\).
    1. A researcher claims that, at a river bend, the water gradually gets deeper as the distance from the inner bank increases. He measures the distance from the inner bank, \(b \mathrm {~cm}\), and the depth of a river, \(s \mathrm {~cm}\), at seven positions. The results are shown in the table below.
    2. A county councillor is investigating the level of hardship, h , of a town and the number of calls per 100 people to the emergency services, c. He collects data for 7 randomly selected towns in the county. The results are shown in the table below.
    1. Interviews for a job are carried out by two managers. Candidates are given a score by each manager and the results for a random sample of 8 candidates are shown in the table below.
    2. A random sample of size n is to be taken from a population that is normally distributed with mean 40 and standard deviation 3 . Find the minimum sample size such that the probability of the sample mean being greater than 42 is less than \(5 \%\).
    (5)
    3. The table below shows the population and the number of council employees for different towns and villages. \end{table} Answers without working may not gain full credit. \begin{figure}[h]
    \begin{tabular}{|c|c|}
    \hline
    Waiting Time & Frequency \\
    \hline
    \(0 - 3\) & 8 \\
    \hline
    \(3 - 5\) & 12 \\
    \hline
    \(5 - 6\) & 13 \\
    \hline
    \(6 - 8\) & 9 \\
    \hline
    \(8 - 12\) & 8 \\
    \hline
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 1}
    \end{table}
  9. Show that an estimate of \(\bar { X } = 5.49\) and an estimate of \(S _ { X } ^ { 2 } = 6.88\) The post office manager believes that the customers' waiting times can be modelled by a normal distribution.
    Assuming the data is normally distributed, she calculates the expected frequencies for these data and some of these frequencies are shown in Table 2. \begin{table}[h]
    \begin{tabular}{|l|c|c|c|c|c|}
    \hline
    Waiting Time & \(x < 3\) & \(3 - 5\) & \(5 - 6\) & \(6 - 8\) & \(x > 8\) \\
    \hline
    Expected Frequency & 8.56 & 12.73 & 7.56 & \(a\) & \(b\) \\
    \hline
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 2}
    \end{table}
  10. Find the value of a and the value of b .
  11. Test, at the \(5 \%\) level of significance, the manager's belief. State your hypotheses clearly.
    1. Blumen is a perfume sold in bottles. The amount of perfume in each bottle is normally distributed. The amount of perfume in a large bottle has mean 50 ml and standard deviation 5 ml . The amount of perfume in a small bottle has mean 15 ml and standard deviation 3 ml .
    One large and 3 small bottles of Blumen are chosen at random.
  12. Find the probability that the amount in the large bottle is less than the total amount in the 3 small bottles. A large bottle and a small bottle of Blumen are chosen at random.
  13. Find the probability that the large bottle contains more than 3 times the amount in the small bottle.
    6. Fruit-n-Veg4U Market Gardens grow tomatoes. They want to improve their yield of tomatoes by at least 1 kg per plant by buying a new variety. The variance of the yield of the old variety of plant is \(0.5 \mathrm {~kg} ^ { 2 }\) and the variance of the yield for the new variety of plant is \(0.75 \mathrm {~kg} ^ { 2 }\). A random sample of 60 plants of the old variety has a mean yield of 5.5 kg . A random sample of 70 of the new variety has a mean yield of 7 kg .
  14. Stating your hypotheses clearly test, at the \(5 \%\) level of significance, whether or not there is evidence that the mean yield of the new variety is more than 1 kg greater than the mean yield of the old variety.
  15. Explain the relevance of the Central Limit Theorem to the test in part (a).
    7. Lambs are born in a shed on Mill Farm. The birth weights, \(x \mathrm {~kg}\), of a random sample of 8 newborn lambs are given below. $$\begin{array} { l l l l l l l l } 4.12 & 5.12 & 4.84 & 4.65 & 3.55 & 3.65 & 3.96 & 3.40 \end{array}$$
  16. Calculate unbiased estimates of the mean and variance of the birth weight of lambs born on Mill Farm. A further random sample of 32 lambs is chosen and the unbiased estimates of the mean and variance of the birth weight of lambs from this sample are 4.55 and 0.25 respectively.
  17. Treating the combined sample of 40 lambs as a single sample, estimate the standard error of the mean. The owner of Mill Farm researches the breed of lamb and discovers that the population of birth weights is normally distributed with standard deviation 0.67 kg .
  18. Calculate a \(95 \%\) confidence interval for the mean birth weight of this breed of lamb using your combined sample mean.
    \end{figure}
Pre-U Pre-U 9795/2 2010 June Q10
11 marks Challenging +1.2
10 A box contains a large number, \(n\), of identical dice, which are thought to be biased. The probability that one of these dice will show a six in a single roll is \(p\). The \(n\) dice are rolled many times and the number of sixes obtained in each trial is recorded. In \(4.01 \%\) of these trials 56 or more dice showed a six. In \(10.56 \%\) of these trials 37 or fewer dice showed a six. Using a suitable normal approximation, find the values of \(n\) and \(p\).
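One way to set this up is to convert the two tail percentages into standard normal quantiles, apply a continuity correction, solve for \(\mu\) and \(\sigma\), and then use \(\mu = np\) and \(\sigma^2 = np(1-p)\). The sketch below follows that route with `statistics.NormalDist`; because the percentages are rounded, the output is only approximately a whole number for \(n\).

```python
from statistics import NormalDist

z = NormalDist()  # standard normal
# Continuity correction: P(X >= 56) ~ P(N > 55.5) and P(X <= 37) ~ P(N < 37.5).
z_upper = z.inv_cdf(1 - 0.0401)
z_lower = z.inv_cdf(0.1056)

# 55.5 = mu + z_upper * sigma and 37.5 = mu + z_lower * sigma.
sigma = (55.5 - 37.5) / (z_upper - z_lower)
mu = 55.5 - z_upper * sigma

# Binomial moments: mu = n p and sigma^2 = n p (1 - p).
p = 1 - sigma**2 / mu
n = mu / p
print(round(mu, 2), round(sigma, 2), round(p, 3), round(n))
```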
Pre-U Pre-U 9795/2 2010 June Q11
12 marks Standard +0.8
11 The thickness of a randomly chosen paperback book is \(P \mathrm {~cm}\) and the thickness of a randomly chosen hardback is \(H \mathrm {~cm}\), where \(P\) and \(H\) have distributions \(\mathrm { N } ( 2.0,0.75 )\) and \(\mathrm { N } ( 5.0,2.25 )\) respectively. When more than one book is selected, any book is selected independently of all other books.
  1. Calculate the probability that a randomly chosen hardback is more than 1 cm thicker than a randomly chosen paperback.
  2. Calculate the probability that 2 paperbacks and 4 hardbacks, randomly chosen, have a combined thickness of less than 20 cm .
  3. Find the probability that a randomly chosen hardback is more than twice the thickness of a randomly chosen paperback.
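Each part reduces to a single normal variable built from \(P\) and \(H\). Note the contrast between part 2, where independent books are added (variances \(2 \times 0.75\) and \(4 \times 2.25\)), and part 3, where one paperback is doubled (variance \(2^2 \times 0.75\)). A hedged sketch in which the helper `normal` is a hypothetical convenience that converts a variance into the standard deviation `NormalDist` expects:

```python
from statistics import NormalDist

p_mean, p_var = 2.0, 0.75   # paperback: N(mean, variance)
h_mean, h_var = 5.0, 2.25   # hardback: N(mean, variance)

def normal(mean, var):
    """Build a NormalDist from a mean and a variance."""
    return NormalDist(mean, var ** 0.5)

# Part 1: H - P > 1.
p_1 = 1 - normal(h_mean - p_mean, h_var + p_var).cdf(1)

# Part 2: total thickness of 2 paperbacks and 4 hardbacks is below 20.
total = normal(2 * p_mean + 4 * h_mean, 2 * p_var + 4 * h_var)
p_2 = total.cdf(20)

# Part 3: H > 2P, i.e. H - 2P > 0, with Var(2P) = 4 Var(P).
diff = normal(h_mean - 2 * p_mean, h_var + 4 * p_var)
p_3 = 1 - diff.cdf(0)

print(round(p_1, 4), round(p_2, 4), round(p_3, 4))
```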
Pre-U Pre-U 9795/2 2011 June Q1
3 marks Standard +0.3
1 The independent random variables \(X\) and \(Y\) have distributions \(\mathrm { N } ( 30,9 )\) and \(\mathrm { N } ( 20,4 )\) respectively.
  1. Give the distribution of $$\left( X _ { 1 } + X _ { 2 } + X _ { 3 } \right) - \left( Y _ { 1 } + Y _ { 2 } + Y _ { 3 } + Y _ { 4 } \right)$$ where \(X _ { i } , i = 1,2,3\), and \(Y _ { j } , j = 1,2,3,4\), are independent observations of \(X\) and \(Y\) respectively. The time for female cadets to complete an assault course is \(X\) minutes and the time for male cadets to complete the same assault course is \(Y\) minutes.
  2. Find the probability that the total time for three randomly chosen female cadets to complete the assault course is greater than the total time for four randomly chosen male cadets to complete the assault course.
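The difference of the two totals is normal with mean \(3 \times 30 - 4 \times 20\) and variance \(3 \times 9 + 4 \times 4\); the variances add even though the \(Y\) terms are subtracted. A short sketch of part 2 under the independence stated in the question:

```python
from statistics import NormalDist

# D = (X1 + X2 + X3) - (Y1 + Y2 + Y3 + Y4), all observations independent.
d_mean = 3 * 30 - 4 * 20
d_var = 3 * 9 + 4 * 4          # variances add for sums and for differences
d = NormalDist(d_mean, d_var ** 0.5)

# P(female total > male total) = P(D > 0).
p_female_total_longer = 1 - d.cdf(0)
print(d_mean, d_var, round(p_female_total_longer, 4))
```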
Pre-U Pre-U 9795/2 2012 June Q2
9 marks Standard +0.3
2 The independent random variables \(X\) and \(Y\) have normal distributions where \(X \sim \mathrm {~N} \left( \mu , \sigma ^ { 2 } \right)\) and \(Y \sim \mathrm {~N} \left( 3 \mu , 4 \sigma ^ { 2 } \right)\). Two random samples each of size \(n\) are taken, one from each of these normal populations.
  1. Show that \(a \bar { X } + b \bar { Y }\) is an unbiased estimator of \(\mu\) provided that \(a + 3 b = 1\), where \(a\) and \(b\) are constants and \(\bar { X }\) and \(\bar { Y }\) are the respective sample means. In the remainder of the question assume that \(a \bar { X } + b \bar { Y }\) is an unbiased estimator of \(\mu\).
  2. Show that \(\operatorname { Var } ( a \bar { X } + b \bar { Y } )\) can be written as \(\frac { \sigma ^ { 2 } } { n } \left( 1 - 6 b + 13 b ^ { 2 } \right)\).
  3. The value of the constant \(b\) can be varied. Find the value of \(b\) that gives the minimum of \(\operatorname { Var } ( a \bar { X } + b \bar { Y } )\), and hence find the minimum of \(\operatorname { Var } ( a \bar { X } + b \bar { Y } )\) in terms of \(\sigma\) and \(n\).
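The algebra in this question can be checked with exact rational arithmetic. The sketch below works in units of \(\sigma^2 / n\) and assumes the unbiasedness constraint \(a = 1 - 3b\); the function name `scaled_variance` is hypothetical. It is a check on the stated quadratic, not a substitute for the written derivation.

```python
from fractions import Fraction


def scaled_variance(b):
    """Var(a*Xbar + b*Ybar) in units of sigma^2/n, with a = 1 - 3b.

    Var(Xbar) = sigma^2/n and Var(Ybar) = 4*sigma^2/n, and the samples
    are independent, so the variances combine as a^2 * 1 + b^2 * 4.
    """
    a = 1 - 3 * b
    return a**2 * 1 + b**2 * 4


# The expression should agree with 1 - 6b + 13b^2 for any b.
for b in (Fraction(0), Fraction(1, 2), Fraction(-2)):
    assert scaled_variance(b) == 1 - 6 * b + 13 * b**2

# A quadratic p*b^2 + q*b + r with p > 0 is minimised at b = -q / (2p).
b_min = Fraction(6, 2 * 13)
min_var = scaled_variance(b_min)

print(b_min, min_var)  # minimum variance is min_var * sigma^2 / n
```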
Pre-U Pre-U 9795/2 2013 June Q5
8 marks Standard +0.3
5 The discrete random variable \(X\) has probability generating function given by $$\mathrm { G } _ { X } ( t ) = k \left( 5 t ^ { - 1 } + 3 + 2 t ^ { 2 } \right) ,$$ where \(k\) is a constant.
  1. Find
    1. the value of \(k\),
    2. the modal value of \(X\).
  2. The random variables \(X _ { 1 }\) and \(X _ { 2 }\) are independent observations of \(X\).
    (a) Write down the probability generating function of \(Y\), where \(Y = X _ { 1 } + X _ { 2 }\).
    (b) Use your answer to part 2(a) to find \(\mathrm { E } ( Y )\) and \(\operatorname { Var } ( Y )\).
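One way to cross-check the answers to this question is to read the distribution of \(X\) off the generating function and convolve it with itself. This is an assumption-laden sketch rather than the method the question asks for: it uses the probability mass function directly instead of differentiating the generating function, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# In k(5 t^-1 + 3 + 2 t^2) the exponents are the values of X and the
# coefficients (times k) are their probabilities.
weights = {-1: 5, 0: 3, 2: 2}
k = Fraction(1, sum(weights.values()))  # probabilities must sum to 1
pmf_x = {x: k * w for x, w in weights.items()}


def convolve(p, q):
    """Distribution of the sum of two independent discrete variables."""
    out = {}
    for (x, px), (y, qy) in product(p.items(), q.items()):
        out[x + y] = out.get(x + y, 0) + px * qy
    return out


def mean_var(pmf):
    """Mean and variance of a discrete distribution given as {value: prob}."""
    mean = sum(x * p for x, p in pmf.items())
    second_moment = sum(x * x * p for x, p in pmf.items())
    return mean, second_moment - mean**2


mode_x = max(pmf_x, key=pmf_x.get)
pmf_y = convolve(pmf_x, pmf_x)  # Y = X1 + X2
e_y, var_y = mean_var(pmf_y)

print(k, mode_x, e_y, var_y)
```

Multiplying two generating functions corresponds to exactly this convolution, so the mean and variance found here should match those obtained from the derivatives of the generating function of \(Y\) at \(t = 1\).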
Pre-U Pre-U 9795/2 2013 November Q2
Standard +0.8
2
  1. The statistic \(T\) is derived from a random sample taken from a population which has an unknown parameter \(\theta\). \(T\) is an unbiased estimator of \(\theta\). What does the statement '\(T\) is an unbiased estimator of \(\theta\)' imply?
  2. A random sample of size \(n\) is taken from each of two independent populations. The first population has a non-zero mean \(\mu\) and variance \(\sigma ^ { 2 }\) and \(\bar { X } _ { 1 }\) denotes the sample mean. The second population has mean \(\frac { 1 } { 2 } \mu\) and variance \(b \sigma ^ { 2 }\), where \(b\) is a positive constant, and \(\bar { X } _ { 2 }\) denotes the sample mean. Two unbiased estimators for \(\mu\) are defined by $$T _ { 1 } = 3 \bar { X } _ { 1 } - a \bar { X } _ { 2 } \quad \text { and } \quad T _ { 2 } = \frac { 1 } { 5 } \left( 4 \bar { X } _ { 1 } + 2 \bar { X } _ { 2 } \right) .$$
    1. Determine the value of \(a\).
    2. Show that \(\operatorname { Var } \left( T _ { 1 } \right) = \frac { \sigma ^ { 2 } } { n } ( 9 + 16 b )\) and find a similar expression for \(\operatorname { Var } \left( T _ { 2 } \right)\).
    3. The estimator with the smaller variance is preferred. State which of \(T _ { 1 }\) and \(T _ { 2 }\) is the preferred estimator of \(\mu\).
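The comparison of the two estimators can be sketched numerically. The code below works in units of \(\sigma^2 / n\), takes \(a\) from the unbiasedness condition \(3\mu - a \cdot \frac{1}{2}\mu = \mu\), and evaluates both variances for a few sample values of \(b\) (chosen here purely for illustration); the function names are hypothetical.

```python
from fractions import Fraction

# Unbiasedness of T1 = 3*X1bar - a*X2bar: 3*mu - a*(mu/2) = mu.
a = (3 - 1) / Fraction(1, 2)


def var_t1(b):
    """Var(T1) in units of sigma^2/n: 3^2 * 1 + a^2 * b."""
    return 3**2 + a**2 * b


def var_t2(b):
    """Var(T2) in units of sigma^2/n: (4^2 * 1 + 2^2 * b) / 5^2."""
    return Fraction(1, 25) * (4**2 + 2**2 * b)


# Sample positive values of b, for illustration only.
sample_b = [Fraction(1, 10), Fraction(1), Fraction(10)]
comparison = [(b, var_t1(b), var_t2(b)) for b in sample_b]

for b, v1, v2 in comparison:
    print(f"b={b}: Var(T1)={v1}, Var(T2)={v2}")
```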
Pre-U Pre-U 9795/2 2013 November Q3
Standard +0.3
3 The number of signal failures in a certain region of the railway network averages 10 every 3 weeks. Assume that signal failures occur independently, randomly and at constant mean rate.
  1. Find the probability that
    1. there are between 7 and 12 (inclusive) signal failures in a three-week period,
    2. there are more than 4 signal failures in a one-week period.
  2. It has been calculated, using a suitable distributional approximation, that the probability of more than 62 signal failures in a period of \(n\) weeks is 0.0385. Find the value of \(n\).
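The signal-failure question can be checked with a short script using only the standard library. Two assumptions are made and should be read as such: the "suitable distributional approximation" is taken to be a normal approximation with a continuity correction, and \(n\) is taken to be a whole number of weeks, so the script simply searches over integers. The function names are hypothetical.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

RATE_PER_WEEK = 10 / 3  # 10 failures every 3 weeks


def poisson_cdf(k, lam):
    """P(N <= k) for N ~ Poisson(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


# Between 7 and 12 inclusive in three weeks (mean 10).
p_7_to_12 = poisson_cdf(12, 10) - poisson_cdf(6, 10)

# More than 4 in one week (mean 10/3).
p_more_than_4 = 1 - poisson_cdf(4, RATE_PER_WEEK)


def normal_tail(n_weeks, threshold=62):
    """Approximate P(N > threshold) over n_weeks, assuming N ~ N(lam, lam)
    with a continuity correction of 0.5."""
    lam = RATE_PER_WEEK * n_weeks
    return 1 - NormalDist(lam, sqrt(lam)).cdf(threshold + 0.5)


# Integer number of weeks whose approximate tail is closest to 0.0385.
n_best = min(range(1, 41), key=lambda n: abs(normal_tail(n) - 0.0385))

print(round(p_7_to_12, 4), round(p_more_than_4, 4), n_best)
```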
Pre-U Pre-U 9795/2 2015 June Q1
4 marks Standard +0.8
1 The independent random variables \(X\) and \(Y\) are such that $$X \sim \mathrm { N } ( \mu , 11 ) , \quad Y \sim \mathrm { N } \left( 10 , \sigma ^ { 2 } \right) \quad \text { and } \quad 2 X - 5 Y \sim \mathrm { N } ( 0,144 ) .$$ Find
  1. the values of \(\mu\) and \(\sigma ^ { 2 }\),
  2. \(\mathrm { P } ( X - Y > 10 )\).
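As a hedged check on this question, the two unknowns can be recovered by matching the mean and variance of \(2X - 5Y\) to the given \(\mathrm{N}(0, 144)\), assuming \(\mathrm{E}(aX + bY) = a\mathrm{E}(X) + b\mathrm{E}(Y)\) and, by independence, \(\operatorname{Var}(aX + bY) = a^2 \operatorname{Var}(X) + b^2 \operatorname{Var}(Y)\). The constant names below are illustrative.

```python
from math import sqrt
from statistics import NormalDist

VAR_X, MEAN_Y = 11, 10
A, B = 2, -5  # coefficients in 2X - 5Y
TARGET_MEAN, TARGET_VAR = 0, 144

# Mean: A*mu + B*MEAN_Y = TARGET_MEAN.
mu = (TARGET_MEAN - B * MEAN_Y) / A

# Variance: A^2 * VAR_X + B^2 * sigma2 = TARGET_VAR.
sigma2 = (TARGET_VAR - A**2 * VAR_X) / B**2

# X - Y is normal with mean mu - MEAN_Y and variance VAR_X + sigma2.
diff = NormalDist(mu - MEAN_Y, sqrt(VAR_X + sigma2))
p_diff_over_10 = 1 - diff.cdf(10)

print(mu, sigma2, round(p_diff_over_10, 3))
```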
Pre-U Pre-U 9795/2 2016 June Q6
6 marks Standard +0.3
6 A continuous random variable \(X\) has probability density function $$\mathrm { f } ( x ) = \begin{cases} 4 x \mathrm { e } ^ { - 2 x } & x \geqslant 0 \\ 0 & \text { otherwise } . \end{cases}$$
  1. Show that the moment generating function \(\mathrm { M } _ { X } ( t )\) of \(X\) is \(\frac { 4 } { ( 2 - t ) ^ { 2 } }\). You may assume that \(x \mathrm { e } ^ { - k x } \rightarrow 0\) as \(x \rightarrow + \infty\).
  2. What condition on \(t\) is needed in finding \(\mathrm { M } _ { X } ( t )\)?
  3. \(Y\) is the sum of three independent observations of \(X\). Find the moment generating function of \(Y\), and use your answer to find \(\operatorname { Var } ( Y )\).
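The variance asked for in the last part can be sanity-checked numerically. The sketch below assumes the moment generating function stated in the question, uses the fact that the generating function of a sum of independent variables is the product of the individual ones, and estimates the first two derivatives at \(t = 0\) by central finite differences. The step size `h` and the function names are illustrative choices, so the result is approximate rather than exact.

```python
def mgf_x(t):
    """MGF of X as given in the question, valid for t < 2."""
    return 4 / (2 - t) ** 2


def mgf_y(t):
    """MGF of the sum of three independent observations of X."""
    return mgf_x(t) ** 3


def mean_and_variance(mgf, h=1e-4):
    """Estimate E and Var from M'(0) and M''(0) by central differences."""
    first = (mgf(h) - mgf(-h)) / (2 * h)              # approximates M'(0) = E(Y)
    second = (mgf(h) - 2 * mgf(0) + mgf(-h)) / h**2   # approximates M''(0) = E(Y^2)
    return first, second - first**2


mean_y, var_y = mean_and_variance(mgf_y)
print(round(mean_y, 4), round(var_y, 4))
```

The estimates should come out close to 3 and 1.5, which can then be compared with the values obtained by differentiating the generating function of \(Y\) by hand.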