Questions — OCR MEI Further Statistics Minor (42 questions)

OCR MEI Further Statistics Minor 2019 June Q1
1 In a game at a charity fair, a spinner is spun 4 times.
On each spin the chance that the spinner lands on a score of 5 is 0.2.
The random variable \(X\) represents the number of spins on which the spinner lands on a score of 5.
  1. Find \(\mathrm { P } ( X = 3 )\).
  2. Find each of the following.
    • \(\mathrm { E } ( X )\)
    • \(\operatorname { Var } ( X )\)
    One game costs \(\pounds 1\) to play and, for each spin that lands on a score of 5, the player receives 50 pence.
    1. Find the expected total amount of money gained by a player in one game.
    2. Find the standard deviation of the total amount of money gained by a player in one game.
OCR MEI Further Statistics Minor 2019 June Q2
2 A market researcher wants to interview people who watched a particular television programme. Audience research data used by the broadcaster indicates that \(12 \%\) of the adult population watched this programme. This figure is used to model the situation.
The researcher asks people in a shopping centre, one at a time, if they watched the programme. You should assume that these people form a random sample of the adult population.
  1. Find the probability that the fifth person the researcher asks is the first to have watched the programme.
  2. Find the probability that the researcher has to ask at least 10 people in order to find one who watched the programme.
  3. Find the probability that the twentieth person the researcher asks is the third to have watched the programme.
  4. Find how many people the researcher would have to ask to ensure that there is a probability of at least 0.95 that at least one of them watched the programme.
OCR MEI Further Statistics Minor 2019 June Q3
3 A company has been commissioned to make 50 very expensive titanium components.
A sample of the components needs to be tested to ensure that they are sufficiently strong. However, this is a test to destruction, so the components which are tested can no longer be used.
  1. Explain why it would not be appropriate to use a census in these circumstances.
    A manager suggests that the first 5 components to be manufactured should be tested.
  2. Explain why this would not be a sensible method of selecting the sample.
    A statistician advises the manager that the sample selected should be a random sample.
  3. Give two desirable features (other than randomness) that the sample should have.
OCR MEI Further Statistics Minor 2019 June Q4
4 Zara uses a metal detector to search for coins on a beach.
She wonders if the numbers of coins that she finds in an area of \(10 \mathrm {~m} ^ { 2 }\) can be modelled by a Poisson distribution. The table below shows the numbers of coins that she finds in randomly chosen areas of \(10 \mathrm {~m} ^ { 2 }\) over a period of months.
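\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
Number of coins found & 0 & 1 & 2 & 3 & 4 & 5 & 6 & \(> 6\) \\
\hline
Frequency & 13 & 28 & 30 & 14 & 10 & 2 & 3 & 0 \\
\hline
\end{tabular}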
  1. Software gives the sample mean as 1.98 and the sample standard deviation as 1.4212. Explain how these values suggest that a Poisson distribution may be an appropriate model for the numbers of coins found.
    Zara decides to carry out a chi-squared test to investigate whether a Poisson distribution is an appropriate model.
    Fig. 4 is a screenshot showing part of the spreadsheet used to analyse the data. Some values in the spreadsheet have been deliberately omitted. \begin{table}[h]
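    \begin{tabular}{|c|c|c|c|c|}
    \hline
     & A & B & C & D \\
    \hline
    1 & Number of coins found & Observed frequency & Expected frequency & Chi-squared contribution \\
    \hline
    2 & 0 & 13 & 13.8069 & 0.0472 \\
    \hline
    3 & 1 & 28 &  &  \\
    \hline
    4 & 2 & 30 & 27.0643 & 0.3184 \\
    \hline
    5 & 3 & 14 & 17.8625 & 0.8352 \\
    \hline
    6 & 4 & 10 & 8.8419 & 0.1517 \\
    \hline
    7 & \(\geqslant 5\) & 5 &  & 0.0015 \\
    \hline
    \end{tabular}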
    \captionsetup{labelformat=empty} \caption{Fig. 4}
    \end{table}
  2. Showing your calculations, find the missing values in each of the following cells.
    • C3
    • C7
    • D3
    • Explain why the numbers for 5, 6 and more than 6 coins found have been combined into the single category of at least 5 coins found, as shown in the spreadsheet.
    • Complete the hypothesis test at the \(5 \%\) level of significance.
    For the rest of this question, you should assume that the number of coins that Zara finds in an area of \(10 \mathrm {~m} ^ { 2 }\) can be modelled by a Poisson distribution with mean 1.98.
    Zara also finds pieces of jewellery independently of the coins she finds. The number of pieces of jewellery that she finds per \(10 \mathrm {~m} ^ { 2 }\) area is modelled by a Poisson distribution with mean 0.42.
  3. Find the probability that Zara finds a total of exactly 3 items (coins and/or jewellery) in an area of \(10 \mathrm {~m} ^ { 2 }\).
  4. Find the probability that Zara finds a total of at least 30 items (coins and/or jewellery) in an area of \(100 \mathrm {~m} ^ { 2 }\).
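The values quoted in this question can be cross-checked numerically. The sketch below is a minimal illustration, not a model answer: it assumes the frequency table reads 13, 28, 30, 14, 10, 2, 3, 0 and that each expected frequency is the total frequency multiplied by the Poisson probability with the sample mean as parameter. The helper names `poisson_pmf`, `expected_frequency` and `chi_sq_contribution` are hypothetical.

```python
import math

# Observed frequencies for 0, 1, ..., 6 coins; the final entry is the ">6"
# category, which has frequency 0 and so contributes nothing to the sums.
observed = [13, 28, 30, 14, 10, 2, 3, 0]
n = sum(observed)

# Sample mean and sample standard deviation (divisor n - 1).
sample_mean = sum(x * f for x, f in enumerate(observed)) / n
sum_sq = sum(x * x * f for x, f in enumerate(observed))
sample_sd = math.sqrt((sum_sq - n * sample_mean**2) / (n - 1))


def poisson_pmf(k, mu):
    """P(X = k) for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)


def expected_frequency(k):
    """Expected frequency of k coins under the fitted Poisson model."""
    return n * poisson_pmf(k, sample_mean)


def chi_sq_contribution(obs, exp):
    """Contribution (O - E)^2 / E of one category to the test statistic."""
    return (obs - exp) ** 2 / exp


print("mean:", sample_mean, "sd:", round(sample_sd, 4))
# Only the rows whose values are given in Fig. 4 are printed here.
for k in (0, 2, 3, 4):
    e = expected_frequency(k)
    print(k, round(e, 4), round(chi_sq_contribution(observed[k], e), 4))
```

The printed rows can be compared with the given cells of Fig. 4; the omitted cells are left for the reader.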
OCR MEI Further Statistics Minor 2019 June Q5
5 A student wants to know if there is a positive correlation between the amounts of two pollutants, sulphur dioxide and PM10 particulates, on different days in the area of London in which he lives; these amounts, measured in suitable units, are denoted by \(s\) and \(p\) respectively.
He uses a government website to obtain data for a random sample of 15 days on which the amounts of these pollutants were measured simultaneously. Fig. 5.1 is a scatter diagram showing the data. Summary statistics for these 15 values of \(s\) and \(p\) are as follows.
\(\sum s = 155.4 \quad \sum p = 518.9 \quad \sum s ^ { 2 } = 2322.7 \quad \sum p ^ { 2 } = 21270.5 \quad \sum s p = 6009.1\) \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{4a4d5816-5b53-49a1-b72f-f8bcf3b4e8bc-4_935_1134_683_260} \captionsetup{labelformat=empty} \caption{Fig. 5.1}
\end{figure}
  1. Explain why the student might come to the conclusion that a test based on Pearson's product moment correlation coefficient may be valid.
  2. Find the value of Pearson's product moment correlation coefficient.
  3. Carry out a test at the \(5 \%\) significance level to investigate whether there is positive correlation between the amounts of sulphur dioxide and PM10 particulates.
  4. Explain why the student made sure that the sample chosen was a random sample.
    The student also wishes to model the relationship between the amounts of nitrogen dioxide \(n\) and PM10 particulates \(p\).
    He takes a random sample of 54 values of the two variables, both measured at the same times. Fig. 5.2 is a scatter diagram which shows the data, together with the regression line of \(n\) on \(p\), the equation of the regression line and the value of \(r ^ { 2 }\). \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{4a4d5816-5b53-49a1-b72f-f8bcf3b4e8bc-5_824_1230_495_258} \captionsetup{labelformat=empty} \caption{Fig. 5.2}
    \end{figure}
  5. Predict the value of \(n\) for \(p = 150\).
  6. Discuss the reliability of your prediction in part (e).
OCR MEI Further Statistics Minor 2019 June Q6
6 The discrete random variable \(X\) has a uniform distribution over \(\{ n , n + 1 , \ldots , 2 n \}\).
  1. Given that \(n\) is odd, find \(\mathrm { P } \left( X < \frac { 3 } { 2 } n \right)\).
  2. Given instead that \(n\) is even, find \(\mathrm { P } \left( X < \frac { 3 } { 2 } n \right)\), giving your answer as a single algebraic fraction.
  3. The sum of 6 independent values of \(X\) is denoted by \(Y\). Find \(\operatorname { Var } ( Y )\).
OCR MEI Further Statistics Minor 2022 June Q1
1 In a quiz a contestant is asked up to four questions. The contestant's turn ends once the contestant gets a question wrong or has answered all four questions. The probability that a particular contestant gets any question correct is 0.6, independently of other questions. The discrete random variable \(X\) models the number of questions which the contestant gets correct in a turn.
  1. Show that \(\mathrm { P } ( X = 4 ) = 0.1296\).
    The probability distribution of \(X\) is shown in Fig. 1.1. \begin{table}[h]
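    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
    \(r\) & 0 & 1 & 2 & 3 & 4 \\
    \hline
    \(\mathrm { P } ( X = r )\) & 0.4 & 0.24 & 0.144 & 0.0864 & 0.1296 \\
    \hline
    \end{tabular}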
    \captionsetup{labelformat=empty} \caption{Fig. 1.1}
    \end{table}
  2. Find each of the following.
    • \(\mathrm { E } ( X )\)
    • \(\operatorname { Var } ( X )\)
    The number of points that a contestant scores is as shown in Fig. 1.2. \begin{table}[h]
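    \begin{tabular}{|c|c|}
    \hline
    Number of questions correct & Number of points scored \\
    \hline
    0 or 1 & 0 \\
    \hline
    2 & 2 \\
    \hline
    3 & 3 \\
    \hline
    4 & 5 \\
    \hline
    \end{tabular}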
    \captionsetup{labelformat=empty} \caption{Fig. 1.2}
    \end{table} The discrete random variable \(Y\) models the number of points which the contestant scores.
  3. Without doing any working, explain whether each of the following will be less than, equal to or greater than the corresponding value for \(X\).
    • \(\mathrm { E } ( Y )\)
    • \(\operatorname { Var } ( Y )\)
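The distribution in Fig. 1.1 can be reproduced from the rules of the quiz. The sketch below is an illustration only; it assumes that \(X = r\) for \(r < 4\) means \(r\) correct answers followed by one wrong answer, and that \(X = 4\) means four correct answers. The names `p` and `dist` are hypothetical.

```python
p = 0.6  # probability that any one question is answered correctly

# X = r (r < 4): r correct answers, then a wrong answer ends the turn.
dist = {r: p**r * (1 - p) for r in range(4)}
# X = 4: all four questions are answered correctly.
dist[4] = p**4

for r, prob in dist.items():
    print(r, round(prob, 4))
print("total probability:", round(sum(dist.values()), 10))
```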
OCR MEI Further Statistics Minor 2022 June Q2
2 A forester is investigating the relationship between the diameter and the height of young beech trees. She selects a random sample of 15 young beech trees in a forest and records their diameters, \(d \mathrm {~cm}\), and their heights, \(h \mathrm {~m}\). The data are illustrated in the scatter diagram.
\includegraphics[max width=\textwidth, alt={}, center]{e8624e9b-5143-49d2-9683-cc3a1082694e-3_649_1116_386_230}
  1. State whether either or both of the variables \(d\) and \(h\) are random variables.
    Summary data for the diameters and heights are as follows. $$n = 15 \quad \sum d = 84.9 \quad \sum h = 124.7 \quad \sum d ^ { 2 } = 624.55 \quad \sum h ^ { 2 } = 1230.57 \quad \sum d h = 866.63$$
  2. Find the equation of the regression line of \(h\) on \(d\). Give your answer in the form \(h = a d + b\), giving the values of \(a\) and \(b\) correct to \(\mathbf { 2 }\) decimal places.
  3. Use the regression line to predict the heights of beech trees with the following diameters.
    • 7.5 cm
    • 20.0 cm
    • Comment on the reliability of your predictions.
    • There are many mature beech trees with diameter of 60 cm or greater. However, there are no beech trees with a height of more than 50 m.
    Comment on this in relation to your regression line.
  4. State the coordinates of the point at which the regression line of \(d\) on \(h\) meets the line which you calculated in part (b).
OCR MEI Further Statistics Minor 2022 June Q3
3 Jane wonders whether the number of wasps entering a wasp's nest per 5-second interval can be modelled by a Poisson distribution with mean \(\mu\). She counts the number of wasps entering the nest over 60 randomly selected 5-second intervals. The results are shown in Fig. 3.1. \begin{table}[h]
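\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|c|}
\hline
Number of wasps & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & \(\geqslant 10\) \\
\hline
Frequency & 0 & 2 & 5 & 5 & 12 & 10 & 10 & 11 & 1 & 4 & 0 \\
\hline
\end{tabular}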
\captionsetup{labelformat=empty} \caption{Fig. 3.1}
\end{table}
  1. Show that a suitable estimate for the value of \(\mu\) is 5.1.
    Fig. 3.2 shows part of a screenshot for a \(\chi ^ { 2 }\) test to assess the goodness of fit of a Poisson model. The sample mean has been used as an estimate for the population mean. Some of the values in the spreadsheet have been deliberately omitted. \begin{table}[h]
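    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
     & A & B & C & D & E \\
    \hline
    1 & Number of wasps & Observed frequency & Poisson probability & Expected frequency & Chi-squared contribution \\
    \hline
    2 & \(\leqslant 2\) & 7 & 0.1165 & 6.9887 & 0.0000 \\
    \hline
    3 & 3 & 5 &  & 8.0874 & 1.1786 \\
    \hline
    4 & 4 & 12 &  &  & 0.2765 \\
    \hline
    5 & 5 & 10 &  &  & 0.0255 \\
    \hline
    6 & 6 & 10 & 0.1490 & 8.9400 & 0.1257 \\
    \hline
    7 & 7 & 11 & 0.1086 & 6.5134 & 3.0904 \\
    \hline
    8 & \(\geqslant 8\) & 5 & 0.1440 & 8.6414 &  \\
    \hline
    9 &  &  &  &  &  \\
    \hline
    \end{tabular}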
    \captionsetup{labelformat=empty} \caption{Fig. 3.2}
    \end{table}
  2. Determine the missing values in each of the following cells, giving your answers correct to 4 decimal places.
    • C3
    • D5
    • E8
    • Explain why some of the frequencies have been combined into the categories \(\leqslant 2\) and \(\geqslant 8\).
    • In this question you must show detailed reasoning.
    Carry out the hypothesis test at the 5\% significance level.
  3. Jane also carries out a \(\chi ^ { 2 }\) test for the number of wasps leaving another nest. As part of her calculations, she finds that the probability of no wasps leaving the nest in a 5-second period is 0.0053. She finds that a Poisson distribution is also an appropriate model in this case. Find a suitable estimate for the value of the mean number of wasps leaving the nest per 5-second period.
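The given entries of Fig. 3.2 can be checked against the data in Fig. 3.1. The sketch below is an illustration under stated assumptions: the frequencies are read as 0, 2, 5, 5, 12, 10, 10, 11, 1, 4, 0, the estimate of \(\mu\) is the sample mean, and the \(\leqslant 2\) category pools the counts 0, 1 and 2. The helper `poisson_pmf` is a hypothetical name.

```python
import math

# Observed frequencies for 0, 1, ..., 9 wasps; the last entry is the
# ">= 10" category, which has frequency 0.
observed = [0, 2, 5, 5, 12, 10, 10, 11, 1, 4, 0]
n = sum(observed)
mean = sum(x * f for x, f in enumerate(observed)) / n


def poisson_pmf(k, mu):
    """P(X = k) for a Poisson distribution with mean mu."""
    return math.exp(-mu) * mu**k / math.factorial(k)


# Pooled "<= 2" category (row 2 of the spreadsheet).
p_le_2 = sum(poisson_pmf(k, mean) for k in range(3))
print("estimate of mu:", mean)
print("<=2:", round(p_le_2, 4), round(n * p_le_2, 4))

# Rows 6 and 7, whose probabilities and expected frequencies are given.
for k in (6, 7):
    prob = poisson_pmf(k, mean)
    expected = n * prob
    contribution = (observed[k] - expected) ** 2 / expected
    print(k, round(prob, 4), round(expected, 4), round(contribution, 4))
```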
OCR MEI Further Statistics Minor 2022 June Q4
4 Alex is practising bowling at a cricket wicket. Every time she bowls a ball, she has a \(30 \%\) chance of hitting the wicket.
  1. Assuming that successive bowls are independent, determine the probability that Alex first hits the wicket on her third attempt.
  2. Determine the probability that Alex hits the wicket for the fourth time on her tenth attempt.
OCR MEI Further Statistics Minor 2022 June Q5
5 A medical researcher is investigating whether there is any relationship between the age of a person and the level of a particular protein in the person’s blood. She measures the levels of the protein (measured in suitable units) in a random sample of 12 hospital patients of various ages (in years). The spreadsheet shows the values obtained, together with a scatter diagram which illustrates the data.
\includegraphics[max width=\textwidth, alt={}, center]{e8624e9b-5143-49d2-9683-cc3a1082694e-5_736_1470_1087_246}
  1. The researcher decides that a test based on Pearson's product moment correlation coefficient may not be valid. Explain why she comes to this conclusion.
  2. Calculate the value of Spearman's rank correlation coefficient.
  3. Carry out a test based on this coefficient at the \(5 \%\) significance level to investigate whether there is any association between age and protein level.
  4. Explain why the researcher chose a sample that was random.
  5. The researcher had originally intended to use a sample size of 6 rather than the 12 that she actually used. Explain what advantage there is in using the larger sample size.
OCR MEI Further Statistics Minor 2022 June Q6
6 The random variable \(X\) has a uniform distribution over the values \(\{ 1,4,7 , \ldots , 3 n - 2 \}\), where \(n\) is a positive integer.
  1. Determine \(\operatorname { Var } ( X )\) in terms of \(n\).
  2. Given that \(n = 100\), find the probability that \(X\) is within one standard deviation of the mean.
OCR MEI Further Statistics Minor 2023 June Q1
1 A fair spinner has ten sectors, labelled \(1,2 , \ldots , 10\). In order to start a game, Kofi has to obtain an 8, 9 or 10 on the spinner.
  1. Find the probability that Kofi starts the game on the third spin.
  2. Find the probability that Kofi takes at least 5 spins to start the game.
  3. Determine the probability that the number of spins required to start the game is within 1 standard deviation of its mean.
OCR MEI Further Statistics Minor 2023 June Q2
2 A company manufactures batches of twenty thousand tins which are subsequently filled with fruit. The company tests tins from each batch to make sure that they are strong enough. The test is easy and cheap to carry out, but when a tin has been tested it is no longer suitable for filling with fruit.
    1. Explain why a sample size of 5 tins per batch may not be appropriate in this case.
    2. Explain why a sample size of 1000 tins per batch may not be appropriate in this case.
    The company tests a sample of 30 tins from each batch.
  1. Explain why it would not be sensible for the sample to consist of the final 30 tins produced in a batch.
  2. Give two features that the sample should have.
OCR MEI Further Statistics Minor 2023 June Q3
3 A fair four-sided dice has its faces numbered \(0,1,2,3\). The dice is rolled three times. The discrete random variable \(X\) is the sum of the lowest and highest scores obtained.
  1. Show that \(\mathrm { P } ( X = 1 ) = \frac { 3 } { 32 }\).
    The table below shows the probability distribution of \(X\).
    \begin{tabular}{|c|c|c|c|c|c|c|c|}
    \hline
    \(r\) & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
    \hline
    \(\mathrm { P } ( X = r )\) & \(\frac { 1 } { 64 }\) & \(\frac { 3 } { 32 }\) & \(\frac { 13 } { 64 }\) & \(\frac { 3 } { 8 }\) & \(\frac { 13 } { 64 }\) & \(\frac { 3 } { 32 }\) & \(\frac { 1 } { 64 }\) \\
    \hline
    \end{tabular}
  2. In this question you must show detailed reasoning. Find each of the following.
    • \(\mathrm { E } ( X )\)
    • \(\operatorname { Var } ( X )\)
    • The random variable \(Y\) represents the sum of 10 values of \(X\).
      1. State a property of the 10 values of \(X\) that would make it possible to deduce the standard deviation of \(Y\).
      2. Given that this property holds, determine the standard deviation of \(Y\).
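The probability distribution of \(X\) can be confirmed by listing all \(4^3 = 64\) equally likely outcomes of the three rolls. The sketch below is one way to do this by brute force; it is not the algebraic argument the question asks for, and the names `faces`, `counts` and `dist` are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = [0, 1, 2, 3]

# For each of the 64 equally likely outcomes, X = lowest score + highest score.
counts = Counter(min(rolls) + max(rolls) for rolls in product(faces, repeat=3))
total = len(faces) ** 3

# Exact probabilities as fractions, ordered by the value of X.
dist = {x: Fraction(c, total) for x, c in sorted(counts.items())}

for x, prob in dist.items():
    print(x, prob)
```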
OCR MEI Further Statistics Minor 2023 June Q4
4 Eve lives in a narrow lane in the country. She wonders whether the number of vehicles passing her house per minute can be modelled by a Poisson distribution with mean \(\mu\). She counts the number of vehicles passing her house over 100 randomly selected one-minute intervals. The results are shown in Table 4.1. \begin{table}[h]
\captionsetup{labelformat=empty} \caption{Table 4.1}
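\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline
Number of vehicles & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & \(\geqslant 11\) \\
\hline
Frequency & 36 & 33 & 14 & 10 & 4 & 1 & 0 & 0 & 1 & 0 & 1 & 0 \\
\hline
\end{tabular}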
\end{table}
  1. Use the results to find an estimate for \(\mu\).
    The spreadsheet in Fig. 4.2 shows data for a \(\chi ^ { 2 }\) test to assess the goodness of fit of a Poisson model. The sample mean from part (a) has been used as an estimate for the population mean. Some of the values in the spreadsheet have been deliberately omitted. \begin{table}[h]
    \captionsetup{labelformat=empty} \caption{Fig. 4.2}
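    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
     & A & B & C & D & E \\
    \hline
    1 & Number of vehicles & Observed frequency & Poisson probability & Expected frequency & Chi-squared contribution \\
    \hline
    2 & 0 & 36 & 0.2725 & 27.2532 & 2.8073 \\
    \hline
    3 & 1 & 33 & 0.3543 & 35.4291 &  \\
    \hline
    4 & 2 & 14 &  &  & 3.5400 \\
    \hline
    5 & \(\geqslant 3\) & 17 &  &  & 0.5145 \\
    \hline
    6 &  &  &  &  &  \\
    \hline
    \end{tabular}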
    \end{table}
  2. Calculate the missing values in each of the following cells, giving your answers correct to 4 decimal places.
    • C4
    • D5
    • E3
    • In this question you must show detailed reasoning.
    Carry out the \(\chi ^ { 2 }\) test at the 5\% significance level.
  3. Eve checks her data and notices that the two largest numbers of vehicles per minute (8 and 10) occurred when some horses were being ridden along the lane, causing delays to the vehicles. She therefore repeats the analysis, missing out these two items of data. She finds that the value of the \(\chi ^ { 2 }\) test statistic is now 4.748. The number of degrees of freedom of the test is unchanged. Make two comments about this revised test.
OCR MEI Further Statistics Minor 2023 June Q5
5 An ornithologist is investigating the link between the wing length and the mass of small birds, in order to try to predict the mass from the wing length without having to weigh birds. The ornithologist takes a random sample of 9 birds and measures their wing lengths \(w \mathrm {~mm}\) and their masses \(m \mathrm {~g}\). The spreadsheet below shows the data, together with a scatter diagram which illustrates the data.
\includegraphics[max width=\textwidth, alt={}, center]{72215d69-c3e6-492d-bb3e-bdc28aeb4613-5_719_1424_495_246}
  1. Find the equation of the regression line of \(m\) on \(w\), giving the coefficients correct to \(\mathbf { 3 }\) significant figures.
  2. Use the equation which you found in part (a) to estimate the mass for each of the following wing lengths.
    • 99 mm
    • 110 mm
    • Comment on the reliability of your estimates.
    • The equation of the regression line of \(w\) on \(m\) is \(w = 0.473 m + 87.5\). A friend of the ornithologist suggests that this equation could also be used to estimate the masses of birds from their wing lengths.
    Comment on this suggestion.
OCR MEI Further Statistics Minor 2023 June Q6
6 Each competitor in a lumberjacking competition has to perform various disciplines for which they are timed. A spectator thinks that the times for two of the disciplines, chopping wood and sawing wood, are related. The table and the scatter diagram below show the times of a random sample of 8 competitors in these two disciplines.
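\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
Competitor & A & B & C & D & E & F & G & H \\
\hline
Sawing & 17.1 & 16.7 & 14.3 & 14.0 & 12.8 & 21.5 & 15.3 & 14.4 \\
\hline
Chopping & 23.5 & 20.6 & 21.9 & 18.8 & 21.5 & 24.8 & 19.7 & 19.3 \\
\hline
\end{tabular}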
\includegraphics[max width=\textwidth, alt={}, center]{72215d69-c3e6-492d-bb3e-bdc28aeb4613-6_786_1130_708_239}
  1. The spectator decides to carry out a hypothesis test to investigate whether there is any relationship. Explain why the spectator decides that a test based on Pearson's product moment correlation coefficient may not be valid.
  2. Determine the value of Spearman's rank correlation coefficient.
  3. Carry out a hypothesis test at the \(5 \%\) significance level to investigate whether there is positive association between sawing and chopping times.
OCR MEI Further Statistics Minor 2023 June Q7
7 The discrete random variable \(X\) has a uniform distribution over the set of all integers between 100 and \(n\) inclusive, where \(n\) is a positive integer with \(n > 100\).
  1. Given that \(n\) is even, determine \(\mathrm { P } \left( X < \frac { 100 + n } { 2 } \right)\).
  2. Determine the variance of the sum of 50 independent values of \(X\), giving your answer in the form \(a \left( n ^ { 2 } + b n + c \right)\), where \(a , b\) and \(c\) are constants.
OCR MEI Further Statistics Minor 2024 June Q1
1 When a footballer takes a penalty kick the result is that either a goal is scored or a goal is not scored. It is known that, on average, a certain footballer scores a goal on \(85 \%\) of penalty kicks. During one practice session, the footballer decides to take penalty kicks until a goal is not scored. You may assume that the outcome of each penalty kick that the footballer takes is independent of the outcome of each other penalty kick. The random variable representing the number of penalty kicks up to and including the first penalty kick that does not result in a goal is denoted by \(X\).
  1. State one further assumption that is necessary for \(X\) to be modelled by a Geometric distribution.
    For the remainder of this question you may assume that this assumption is valid.
  2. Find each of the following.
    • \(\mathrm { E } ( X )\)
    • \(\operatorname { Var } ( X )\)
    • Find the probability that the footballer takes exactly 3 penalty kicks.
    • Find the probability that the footballer takes at least 5 penalty kicks.
OCR MEI Further Statistics Minor 2024 June Q2
2 The sides of a fair 12-sided spinner are labelled \(1,2 , \ldots , 12\). The spinner is spun and \(X\) is the random variable denoting the number on the side of the spinner that it lands on.
  1. Suggest a suitable distribution to model \(X\). You should state the value(s) of any parameter(s).
  2. Find each of the following.
    • \(\mathrm { E } ( X )\)
    • \(\operatorname { Var } ( X )\)
    You are given that \(\mathrm { E } ( X )\) is denoted by \(\mu\) and \(\operatorname { Var } ( X )\) is denoted by \(\sigma ^ { 2 }\).
  3. Determine \(\mathrm { P } \left( \left| \frac { 2 ( X - \mu ) } { \sigma } \right| > 1 \right)\).
OCR MEI Further Statistics Minor 2024 June Q3
3 The scatter diagram below illustrates data concerning average annual income per person, \(\$ x\), and average life expectancy, \(y\) years, for 45 randomly selected cities.
\includegraphics[max width=\textwidth, alt={}, center]{464c80be-007b-4d5a-9fe5-2f35100bdea6-3_860_1465_354_244}
  1. State whether neither variable, one variable or both variables can be considered to be random in this situation.
    A student is researching possible positive association between average annual income and average life expectancy. The student decides that the data point labelled A on the scatter diagram is an outlier.
  2. Describe the apparent relationship between average annual income and average life expectancy for this data point relative to the rest of the data.
    The data for point A is removed. The student now wishes to carry out a hypothesis test using the product moment correlation coefficient for the remaining 44 data points to investigate whether there is positive correlation between average annual income and average life expectancy.
  3. Explain why this type of hypothesis test is appropriate in this situation. Justify your answer.
    The summary statistics for these 44 data points are as follows.
    \(\sum x = 751120 \quad \sum y = 2397.1 \quad \sum x ^ { 2 } = 14363849200 \quad \sum y ^ { 2 } = 133014.63 \quad \sum x y = 42465962\)
  4. Determine the value of the product moment correlation coefficient.
  5. Carry out the test at the 1\% significance level.
OCR MEI Further Statistics Minor 2024 June Q4
4 A genetics researcher is investigating whether there is any association between natural hair colour and natural eye colour. A random sample of 800 adults is selected. Each adult can categorise their natural hair colour as blonde, brown, black or red and their natural eye colour as brown, blue or green.
  1. Explain the benefit of using a random sample in this investigation.
    The data collected from the sample are summarised in Table 4.1. \begin{table}[h]
    \captionsetup{labelformat=empty} \caption{Table 4.1}
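    \begin{tabular}{|l|l|c|c|c|c|c|}
    \hline
    \multicolumn{2}{|l|}{\multirow{2}{*}{Observed frequency}} & \multicolumn{5}{c|}{Hair Colour} \\
    \cline { 3 - 7 }
    \multicolumn{2}{|l|}{} & Blonde & Brown & Black & Red & Total \\
    \hline
    \multirow{3}{*}{Eye Colour} & Brown & 47 & 153 & 196 & 36 & 432 \\
    \cline { 2 - 7 }
     & Blue & 61 & 78 & 115 & 26 & 280 \\
    \cline { 2 - 7 }
     & Green & 19 & 22 & 31 & 16 & 88 \\
    \hline
    \multicolumn{2}{|l|}{Total} & 127 & 253 & 342 & 78 & 800 \\
    \hline
    \end{tabular}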
    \end{table} The researcher decides to carry out a chi-squared test.
  2. Determine the expected frequencies for each eye colour in the blonde hair category.
    You are given that the test statistic is 28.62 to 2 decimal places.
  3. Carry out the chi-squared test at the 10\% significance level.
    Table 4.2 shows the chi-squared contributions for some of the categories. The contributions for the categories relating to green eye colour have been deliberately omitted. \begin{table}[h]
    \captionsetup{labelformat=empty} \caption{Table 4.2}
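    \begin{tabular}{|l|l|c|c|c|c|}
    \hline
    \multicolumn{2}{|l|}{} & \multicolumn{4}{c|}{Hair Colour} \\
    \cline { 3 - 6 }
    \multicolumn{2}{|l|}{} & Blonde & Brown & Black & Red \\
    \hline
    \multirow{3}{*}{Eye Colour} & Brown & 6.791 & 1.964 & 0.694 & 0.889 \\
    \cline { 2 - 6 }
     & Blue & 6.162 & 1.257 & 0.185 & 0.062 \\
    \cline { 2 - 6 }
     & Green &  &  &  &  \\
    \hline
    \end{tabular}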
    \end{table}
  4. Calculate the chi-squared contribution for the green eye and blonde hair category.
  5. With reference to the values in Table 4.2, discuss what the data suggest about brown eye colour and blue eye colour for people with blonde hair.
  6. A different researcher, carrying out the same investigation, independently takes a different random sample of size 800 and performs the same hypothesis test, but at the 1\% significance level, reaching the same conclusion as the original test. By comparing only the significance level of the two tests, specify which test, the one at the 10\% significance level or the one at the 1\% significance level, provides stronger evidence for the conclusion. Justify your answer.
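The figures quoted in this question can be cross-checked from Table 4.1. The sketch below is illustrative only: it assumes the usual expected frequency (row total multiplied by column total, divided by the grand total) and the contribution \((O - E)^2 / E\) for each cell. The names `observed` and `contribution` are hypothetical.

```python
# Observed frequencies from Table 4.1; columns are Blonde, Brown, Black, Red.
observed = {
    "Brown": [47, 153, 196, 36],
    "Blue": [61, 78, 115, 26],
    "Green": [19, 22, 31, 16],
}

row_totals = {eye: sum(row) for eye, row in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())


def contribution(eye, j):
    """Chi-squared contribution of the cell in row `eye`, column index j."""
    expected = row_totals[eye] * col_totals[j] / grand_total
    return (observed[eye][j] - expected) ** 2 / expected


# The test statistic is the sum of the contributions over all 12 cells.
statistic = sum(contribution(eye, j) for eye in observed for j in range(4))

print("column totals:", col_totals, "grand total:", grand_total)
print("brown eyes, blonde hair:", round(contribution("Brown", 0), 3))
print("test statistic:", round(statistic, 2))
```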
OCR MEI Further Statistics Minor 2024 June Q5
5 Over a long period of time, it is found that the mean number of mistakes made by a certain player when playing a particular piece of music is 5. The number of mistakes that the player makes when playing the piece is denoted by the random variable \(Y\).
  1. State two assumptions necessary for \(Y\) to be modelled by a Poisson distribution.
    For the remainder of this question you may assume that \(Y\) can be modelled by a Poisson distribution.
    1. Find the probability that the player makes exactly 3 mistakes when playing the piece.
    2. Find the probability that the player makes fewer than 3 mistakes when playing the piece.
    3. Find the probability that the player makes fewer than 6 mistakes in total when playing the piece twice, assuming that the performances are independent.
    In a recording studio, the player plays the piece once in the morning and once in the afternoon each day for one week (7 days). It can be assumed that all the performances are independent of each other. The performances are recorded onto two CDs, one for each of two critics, A and B, to review. The critics are interested in the total number of mistakes made by the player per day. Unfortunately, there is a recording error in one of the CDs; on this CD, every piece that is supposed to be an afternoon recording is in fact just a repeat of that morning’s recording. The random variables \(M _ { 1 }\) and \(M _ { 2 }\) represent the total number of mistakes per day for the correctly recorded CD and for the wrongly recorded CD respectively.
  2. By considering the values of \(\mathrm { E } \left( M _ { 1 } \right)\) and \(\mathrm { E } \left( M _ { 2 } \right)\) explain why it is not possible to use the mean number of mistakes per day on the CDs to determine which critic received the wrongly recorded CD.
    Each critic counts the total number of mistakes made per day, for each of the 7 days of recordings on their CD. Summary data for this is given below.
    Critic A: \(\quad n = 7 , \quad \sum x _ { A } = 70 , \quad \sum x _ { A } ^ { 2 } = 812\)
    Critic B: \(\quad n = 7 , \quad \sum x _ { B } = 72 , \quad \sum x _ { B } ^ { 2 } = 800\)
  3. By considering the values of \(\operatorname { Var } \left( M _ { 1 } \right)\) and \(\operatorname { Var } \left( M _ { 2 } \right)\) determine which critic is likely to have received the wrongly recorded CD.
OCR MEI Further Statistics Minor 2024 June Q6
6 The probability distribution of a discrete random variable, \(X\), is shown in the table below.
\begin{tabular}{|c|c|c|c|}
\hline
\(x\) & 0 & 1 & 2 \\
\hline
\(\mathrm { P } ( X = x )\) & \(1 - a - b\) & \(a\) & \(b\) \\
\hline
\end{tabular}
  1. Find \(\mathrm { E } ( X )\) in terms of \(a\) and \(b\).
    1. In the case where \(\mathrm { E } ( X ) = a + 0.4\), find an expression for \(\operatorname { Var } ( X )\) in terms of \(a\).
    2. In this case, show that the greatest possible value of \(\operatorname { Var } ( X )\) is 0.65. You must state the associated value of \(a\).
  2. You are now given instead that \(\mathrm { E } ( X )\) is not known.
    1. State the least possible value of \(\operatorname { Var } ( X )\).
    2. Give all possible pairs of values of \(a\) and \(b\) which give the least possible value of \(\operatorname { Var } ( X )\) stated in part (c)(i).