Questions — OCR MEI Further Statistics Major (78 questions)

OCR MEI Further Statistics Major 2020 November Q7
10 marks
7 The lengths in mm of a random sample of 6 one-year-old fish of a particular species are as follows.
\(\begin{array} { l l l l l l } 271 & 293 & 306 & 287 & 264 & 290 \end{array}\)
  1. State an assumption required in order to find a confidence interval for the mean length of one-year-old fish of this species. Fig. 7 shows a Normal probability plot for these data. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{8d36bc92-07ac-40c3-9e75-26f2bc9d2fcc-07_599_753_646_246} \captionsetup{labelformat=empty} \caption{Fig. 7}
    \end{figure}
  2. Explain why the Normal probability plot suggests that the assumption in part (a) may be valid.
  3. In this question you must show detailed reasoning. Assuming that this assumption is true, find a 95\% confidence interval for the mean length of one-year-old fish of this species.
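The confidence interval arithmetic for the fish lengths can be sketched in Python as follows. This is a minimal sketch, assuming the Normality assumption discussed in the question holds; the critical value 2.571 is the tabulated two-sided 5% point of the t distribution with 5 degrees of freedom, and the variable names are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev

# Sample of fish lengths in mm, as given in the question.
lengths = [271, 293, 306, 287, 264, 290]

n = len(lengths)
x_bar = mean(lengths)      # sample mean
s = stdev(lengths)         # sample standard deviation (divisor n - 1)

# Assumption: two-sided 5% critical value of t with n - 1 = 5 degrees
# of freedom, read from statistical tables rather than computed.
T_CRIT_5DF = 2.571

half_width = T_CRIT_5DF * s / sqrt(n)
lower, upper = x_bar - half_width, x_bar + half_width
print(f"95% CI: ({lower:.1f}, {upper:.1f}) mm")
```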
OCR MEI Further Statistics Major 2020 November Q9
9 A supermarket sells trays of peaches. Each tray contains 10 peaches. Often some of the peaches in a tray are rotten. The numbers of rotten peaches in a random sample of 150 trays are shown in Table 9.1. \begin{table}[h]
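\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
Number of rotten peaches & 0 & 1 & 2 & 3 & 4 & 5 & 6 & \(\geqslant 7\) \\
\hline
Frequency & 39 & 39 & 33 & 19 & 8 & 8 & 4 & 0 \\
\hline
\end{tabular}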
\captionsetup{labelformat=empty} \caption{Table 9.1}
\end{table} A manager at the supermarket thinks that the number of rotten peaches in a tray may be modelled by a binomial distribution.
  1. Use these data to estimate the value of the parameter \(p\) for the binomial model \(\mathrm { B } ( 10 , p )\). The manager decides to carry out a goodness of fit test to investigate further. The screenshot in Fig. 9.2 shows part of a spreadsheet to assess the goodness of fit of the distribution \(\mathrm { B } ( 10 , p )\), using the value of \(p\) estimated from the data. \begin{table}[h]
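    \begin{tabular}{|c|l|c|c|c|c|}
    \hline
     & A & B & C & D & E \\
    \hline
    1 & Number of rotten peaches & Observed frequency & Binomial probability & Expected frequency & Chi-squared contribution \\
    \hline
    2 & 0 & 39 &  &  &  \\
    \hline
    3 & 1 & 39 &  &  & 1.4229 \\
    \hline
    4 & 2 & 33 & 0.2941 & 44.1167 & 2.8012 \\
    \hline
    5 & 3 & 19 & 0.1629 & 24.4383 & 1.2102 \\
    \hline
    6 & \(\geqslant 4\) & 20 & 0.0769 & 11.5311 & 6.2199 \\
    \hline
    7 &  &  &  &  &  \\
    \hline
    \end{tabular}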
    \captionsetup{labelformat=empty} \caption{Fig. 9.2}
    \end{table}
  2. Calculate the missing values in each of the following cells.
    • C2
    • D2
    • E2
  3. Explain why the numbers for 4, 5, 6 and at least 7 rotten peaches have been combined into the single category of at least 4 rotten peaches, as shown in the spreadsheet.
  4. Carry out the test at the \(1 \%\) significance level.
  5. Using the values of the contributions, comment on the results of the test.
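The spreadsheet calculation for the peaches data can be reproduced with a short Python sketch. It is an illustration under stated assumptions, not a mark scheme: the degrees of freedom are taken as 5 - 1 - 1 = 3 (five categories, one estimated parameter), and 11.345 is the tabulated 1% point of the chi-squared distribution with 3 degrees of freedom. The helper `binom_pmf` is a name introduced here.

```python
from math import comb

# Observed frequencies for 0, 1, ..., 6 and >= 7 rotten peaches (Table 9.1).
observed = [39, 39, 33, 19, 8, 8, 4, 0]
n_trays, n_per_tray = sum(observed), 10

# Estimate p as (total rotten peaches) / (total peaches).
# The ">= 7" class has frequency 0, so it contributes nothing to the total.
total_rotten = sum(k * f for k, f in enumerate(observed))
p_hat = total_rotten / (n_trays * n_per_tray)

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Merge 4, 5, 6 and >= 7 into one ">= 4" category, as in Fig. 9.2.
obs_merged = observed[:4] + [sum(observed[4:])]
probs = [binom_pmf(k, n_per_tray, p_hat) for k in range(4)]
probs.append(1 - sum(probs))
expected = [n_trays * q for q in probs]
contributions = [(o - e) ** 2 / e for o, e in zip(obs_merged, expected)]
statistic = sum(contributions)

# Assumption: tabulated 1% critical value of chi-squared with 3 df.
CHI2_CRIT_3DF_1PCT = 11.345
print(f"p_hat = {p_hat:.3f}")
print(f"C2 = {probs[0]:.4f}, D2 = {expected[0]:.4f}, E2 = {contributions[0]:.4f}")
print(f"X^2 = {statistic:.2f}, reject H0: {statistic > CHI2_CRIT_3DF_1PCT}")
```

Comparing the computed rows with the values printed in Fig. 9.2 (for example the expected frequency 44.1167 for 2 rotten peaches) is a quick check that the estimate of \(p\) is right.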
OCR MEI Further Statistics Major 2020 November Q10
10 The discrete random variables \(X\) and \(Y\) have distributions as follows: \(X \sim \mathrm {~B} ( 20,0.3 )\) and \(Y \sim \operatorname { Po } ( 3 )\). The spreadsheet in Fig. 10 shows a simulation of the distributions of \(X\) and \(Y\). Each of the 20 rows below the heading row consists of a value of \(X\), a value of \(Y\), and the value of \(X - 2 Y\). \begin{table}[h]
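\begin{tabular}{|c|c|c|c|}
\hline
 & A & B & C \\
\hline
1 & \(X\) & \(Y\) & \(X - 2 Y\) \\
\hline
2 & 6 & 6 & -6 \\
3 & 5 & 4 & -3 \\
4 & 8 & 1 & 6 \\
5 & 6 & 5 & -4 \\
6 & 6 & 3 & 0 \\
7 & 8 & 1 & 6 \\
8 & 6 & 4 & -2 \\
9 & 5 & 4 & -3 \\
10 & 7 & 4 & -1 \\
11 & 8 & 3 & 2 \\
12 & 6 & 2 & 2 \\
13 & 5 & 1 & 3 \\
14 & 6 & 1 & 4 \\
15 & 5 & 4 & -3 \\
16 & 7 & 2 & 3 \\
17 & 5 & 2 & 1 \\
18 & 4 & 4 & -4 \\
19 & 5 & 0 & 5 \\
20 & 5 & 1 & 3 \\
21 & 4 & 2 & 0 \\
\hline
\end{tabular}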
\captionsetup{labelformat=empty} \caption{Fig. 10}
\end{table}
  1. Use the spreadsheet to estimate each of the following.
    • \(\mathrm { P } ( X - 2 Y > 0 )\)
    • \(\mathrm { P } ( X - 2 Y > 1 )\)
  2. How could the estimates in part (a) be improved?
    The mean of 50 values of \(X - 2 Y\) is denoted by the random variable \(W\).
  3. Calculate an estimate of \(\mathrm { P } ( W > 1 )\).
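The two simulation estimates and the estimate of \(\mathrm { P } ( W > 1 )\) can be sketched in Python. The sketch assumes that \(X\) and \(Y\) are independent (the question does not say so explicitly) and that the central limit theorem gives an adequate Normal approximation for the mean of 50 values; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Column C of Fig. 10: the 20 simulated values of X - 2Y.
diffs = [-6, -3, 6, -4, 0, 6, -2, -3, -1, 2,
         2, 3, 4, -3, 3, 1, -4, 5, 3, 0]

# Relative frequencies in the simulation.
est_gt0 = sum(d > 0 for d in diffs) / len(diffs)
est_gt1 = sum(d > 1 for d in diffs) / len(diffs)

# Theoretical moments of X - 2Y for X ~ B(20, 0.3) and Y ~ Po(3),
# assuming X and Y are independent.
mean_d = 20 * 0.3 - 2 * 3                  # E(X) - 2E(Y)
var_d = 20 * 0.3 * 0.7 + 2**2 * 3          # Var(X) + 4Var(Y)

# By the central limit theorem, W is approximately N(mean_d, var_d / 50).
w = NormalDist(mu=mean_d, sigma=sqrt(var_d / 50))
p_w_gt1 = 1 - w.cdf(1)

print(est_gt0, est_gt1, round(p_w_gt1, 4))
```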
OCR MEI Further Statistics Major 2020 November Q11
11 The length of time in minutes for which a particular geyser erupts is modelled by the continuous random variable \(T\) with cumulative distribution function given by
\(\mathrm { F } ( t ) = \begin{cases} 0 & t \leqslant 2 , \\
k \left( 8 t ^ { 2 } - t ^ { 3 } - 24 \right) & 2 < t < 4 , \\
1 & t \geqslant 4 , \end{cases}\)
where \(k\) is a positive constant.
  1. Show that \(k = \frac { 1 } { 40 }\).
  2. Find the probability that a randomly selected eruption time lies between 2.5 and 3.5 minutes.
  3. Show that the median \(m\) of the distribution satisfies the equation \(m ^ { 3 } - 8 m ^ { 2 } + 44 = 0\).
  4. Verify that the median eruption time is 2.95 minutes, correct to 2 decimal places. The mean and standard deviation of \(T\) are denoted by \(\mu\) and \(\sigma\) respectively.
  5. Find \(\mathrm { P } ( \mu - \sigma < T < \mu + \sigma )\).
  6. Sketch the graph of the probability density function of \(T\).
  7. A Normally distributed random variable \(X\) has the same mean and standard deviation as \(T\). By considering the shape of the Normal distribution, and without doing any calculations, explain whether \(\mathrm { P } ( \mu - \sigma < X < \mu + \sigma )\) will be greater than, equal to or less than the probability that you calculated in part (e).
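A short Python sketch can be used to check the value of \(k\), the probability between 2.5 and 3.5 minutes, and the median equation numerically. The function names `cdf` and `g` are introduced here for illustration; the checks use only the cumulative distribution function stated in the question.

```python
def cdf(t, k=1 / 40):
    """Cumulative distribution function F(t) of the eruption time T."""
    if t <= 2:
        return 0.0
    if t >= 4:
        return 1.0
    return k * (8 * t**2 - t**3 - 24)

# Continuity at t = 4 needs k * (8*16 - 64 - 24) = 40k = 1.
k_check = 1 / (8 * 4**2 - 4**3 - 24)

# P(2.5 < T < 3.5) = F(3.5) - F(2.5).
p_mid = cdf(3.5) - cdf(2.5)

def g(m):
    """Left-hand side of the median equation m^3 - 8m^2 + 44 = 0."""
    return m**3 - 8 * m**2 + 44

# A sign change on [2.945, 2.955] shows the root rounds to 2.95.
sign_change = g(2.945) > 0 > g(2.955)

print(k_check, p_mid, sign_change)
```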
OCR MEI Further Statistics Major 2021 November Q1
1 When babies are born, their head circumferences are measured. A random sample of 50 newborn female babies is selected. The sample mean head circumference is 34.711 cm. The sample standard deviation head circumference is 1.530 cm.
  1. Determine a 95\% confidence interval for the population mean head circumference of newborn female babies.
  2. Explain why you can calculate this interval even though the distribution of the population of head circumferences of newborn female babies is unknown.
OCR MEI Further Statistics Major 2021 November Q2
2 In a game at a charity fair, a player rolls 3 unbiased six-sided dice. The random variable \(X\) represents the difference between the highest and lowest scores.
  1. Show that \(\mathrm { P } ( X = 0 ) = \frac { 1 } { 36 }\). The table shows the probability distribution of \(X\).
    \begin{tabular}{|c|c|c|c|c|c|c|}
    \hline
    \(r\) & 0 & 1 & 2 & 3 & 4 & 5 \\
    \hline
    \(\mathrm { P } ( X = r )\) & \(\frac { 1 } { 36 }\) & \(\frac { 5 } { 36 }\) & \(\frac { 2 } { 9 }\) & \(\frac { 1 } { 4 }\) & \(\frac { 2 } { 9 }\) & \(\frac { 5 } { 36 }\) \\
    \hline
    \end{tabular}
  2. Draw a graph to illustrate the distribution.
  3. Describe the shape of the distribution.
  4. In this question you must show detailed reasoning. Find each of the following.
    • \(\mathrm { E } ( X )\)
    • \(\operatorname { Var } ( X )\)
    As a result of playing the game, the player receives \(30 X\) pence from the organiser of the game.
  5. Find the variance of the amount that the player receives.
  6. The player pays \(k\) pence to play the game. Given that the average profit made by the organiser is 12.5 pence per game, determine the value of \(k\).
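The expectation and variance calculations for the dice game can be checked exactly with Python's `fractions` module. This is a sketch only; it takes the organiser's profit per game to be the entry fee \(k\) minus the payout \(30 X\), which is how the question describes it.

```python
from fractions import Fraction as F

# Probability distribution of X from the table in the question.
dist = {0: F(1, 36), 1: F(5, 36), 2: F(2, 9), 3: F(1, 4), 4: F(2, 9), 5: F(5, 36)}
assert sum(dist.values()) == 1

e_x = sum(r * p for r, p in dist.items())
e_x2 = sum(r**2 * p for r, p in dist.items())
var_x = e_x2 - e_x**2

# Payout is 30X pence, so Var(30X) = 30^2 Var(X).
var_payout = 30**2 * var_x

# Organiser's mean profit per game is k - E(30X) = 12.5 pence.
k = F(25, 2) + 30 * e_x

print(e_x, var_x, var_payout, k)
```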
OCR MEI Further Statistics Major 2021 November Q3
3 In air traffic management, air traffic controllers send radio messages to pilots. On receiving a message, the pilot repeats it back to the controller to check that it has been understood correctly. At a particular site, on average \(4 \%\) of messages sent by controllers are not repeated back correctly and so have been misunderstood. You should assume that instances of messages being misunderstood occur randomly and independently.
  1. Find the probability that exactly 2 messages are misunderstood in a sequence of 50 messages.
  2. Find the probability that in a sequence of messages, the 10th message is the first one which is misunderstood.
  3. Find the probability that in a sequence of 20 messages, there are no misunderstood messages.
  4. Determine the expected number of messages required for 3 of them to be misunderstood.
  5. Determine the probability that in a sequence of messages, the 3rd misunderstood message is the 60th message in the sequence.
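The five quantities in this question map onto binomial, geometric and negative binomial calculations. A minimal Python sketch, using only the 4% figure and the independence assumption stated in the question (variable names are illustrative):

```python
from math import comb

p = 0.04          # probability a message is misunderstood
q = 1 - p

# Exactly 2 misunderstood in 50 messages: binomial B(50, p).
p_two_in_50 = comb(50, 2) * p**2 * q**48

# First misunderstood message is the 10th: geometric distribution.
p_first_at_10 = q**9 * p

# No misunderstood messages in 20.
p_none_in_20 = q**20

# Expected number of messages until the 3rd misunderstanding:
# the sum of three independent geometric waiting times, each with mean 1/p.
expected_for_3 = 3 / p

# 3rd misunderstanding occurs on the 60th message: exactly 2 in the
# first 59, then a misunderstanding on the 60th.
p_third_at_60 = comb(59, 2) * p**2 * q**57 * p

print(p_two_in_50, p_first_at_10, p_none_in_20, expected_for_3, p_third_at_60)
```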
OCR MEI Further Statistics Major 2021 November Q4
4 A radioactive source contains 1000000 nuclei of a particular radioisotope. On average 1 in 200000 of these nuclei will decay in a period of 1 second. The random variable \(X\) represents the number of nuclei which decay in a period of 1 second. You should assume that nuclei decay randomly and independently of each other.
  1. Explain why you could use either a binomial distribution or a Poisson distribution to model the distribution of \(X\). Use a Poisson distribution to answer parts (b) and (c).
  2. Calculate each of the following probabilities.
    • \(\mathrm { P } ( X = 6 )\)
    • \(\mathrm { P } ( X > 6 )\)
  3. Determine an estimate of the probability that at least 60 nuclei decay in a period of 10 seconds.
OCR MEI Further Statistics Major 2021 November Q5
5 A manufacturer uses three types of capacitor in a particular electronic device. The capacitances, measured in suitable units, are modelled by independent Normal distributions with means and standard deviations as shown in the table.
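\begin{tabular}{|c|c|c|}
\cline { 2 - 3 }
\multicolumn{1}{c|}{} & \multicolumn{2}{c|}{Capacitance} \\
\hline
Type & Mean & Standard deviation \\
\hline
A & 3.9 & 0.32 \\
\hline
B & 7.8 & 0.41 \\
\hline
C & 30.2 & 0.64 \\
\hline
\end{tabular}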
  1. Determine the probability that the total capacitance of a randomly chosen capacitor of Type B and two randomly chosen capacitors of Type A is at least 16 units.
  2. Determine the probability that the capacitance of a randomly chosen capacitor of Type C is within 1 unit of the total capacitance of four randomly chosen capacitors of Type B. When the manufacturer gets a new batch of 1000 capacitors from the supplier, a random sample of 10 of them is tested to check the capacitances. For a new batch of Type C capacitors, summary statistics for the capacitances, \(x\) units, of the random sample are as follows.
    \(n = 10\) $$\sum x = 299.6 \quad \sum x ^ { 2 } = 8981.0$$ You should assume that the capacitances of the sample come from a Normally distributed population, but you should not assume that the standard deviation is 0.64 as for previous Type C capacitors.
  3. In this question you must show detailed reasoning. Carry out a hypothesis test at the \(5 \%\) significance level to check whether it is reasonable to assume that the capacitors in this batch have the specified mean capacitance for Type C of 30.2 units.
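The linear combinations of Normal variables and the test statistic for the new batch can be sketched in Python. The sketch relies on the independence stated in the question; the value 2.262 is the tabulated two-sided 5% point of t with 9 degrees of freedom, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# (mean, standard deviation) of each capacitor type, from the table.
mu_a, sd_a = 3.9, 0.32
mu_b, sd_b = 7.8, 0.41
mu_c, sd_c = 30.2, 0.64

# One Type B plus two independent Type A capacitors: variances add.
total = NormalDist(mu_b + 2 * mu_a, sqrt(sd_b**2 + 2 * sd_a**2))
p_at_least_16 = 1 - total.cdf(16)

# D = C - (B1 + B2 + B3 + B4); "within 1 unit" means -1 < D < 1.
diff = NormalDist(mu_c - 4 * mu_b, sqrt(sd_c**2 + 4 * sd_b**2))
p_within_1 = diff.cdf(1) - diff.cdf(-1)

# t-test of H0: mu = 30.2 from the summary statistics.
n, sum_x, sum_x2 = 10, 299.6, 8981.0
x_bar = sum_x / n
s2 = (sum_x2 - sum_x**2 / n) / (n - 1)     # unbiased variance estimate
t_stat = (x_bar - 30.2) / sqrt(s2 / n)

# Assumption: tabulated two-sided 5% critical value of t with 9 df.
T_CRIT_9DF = 2.262
reject_h0 = abs(t_stat) > T_CRIT_9DF

print(p_at_least_16, p_within_1, t_stat, reject_h0)
```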
OCR MEI Further Statistics Major 2021 November Q6
6 Cosmic rays passing through the upper atmosphere cause muons, and other types of particle, to be formed. Muons can be detected when they reach the surface of the earth. It is known that the mean number of muons reaching a particular detector is 1.7 per second. The numbers of muons reaching this detector in 200 randomly selected periods of 1 second are shown in Fig. 6.1. \begin{table}[h]
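\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
Number of muons & 0 & 1 & 2 & 3 & 4 & 5 & 6 & \(\geqslant 7\) \\
\hline
Frequency & 34 & 65 & 55 & 24 & 14 & 6 & 2 & 0 \\
\hline
\end{tabular}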
\captionsetup{labelformat=empty} \caption{Fig. 6.1}
\end{table}
  1. Use the values of the sample mean and sample variance to discuss the suitability of a Poisson distribution as a model. The screenshot in Fig. 6.2 shows part of a spreadsheet to assess the goodness of fit of the distribution Po(1.7). \begin{table}[h]
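    \begin{tabular}{|c|l|c|c|c|c|}
    \hline
     & A & B & C & D & E \\
    \hline
    1 & Number of muons & Observed frequency & Poisson probability & Expected frequency & Chi-squared contribution \\
    \hline
    2 & 0 & 34 & 0.1827 & 36.5367 & 0.1761 \\
    \hline
    3 & 1 & 65 &  &  &  \\
    \hline
    4 & 2 & 55 & 0.2640 & 52.7955 & 0.0920 \\
    \hline
    5 & 3 & 24 & 0.1496 & 29.9175 & 1.1704 \\
    \hline
    6 & 4 & 14 &  &  & 0.1299 \\
    \hline
    7 & \(\geqslant 5\) & 8 & 0.0296 & 5.9230 & 0.7284 \\
    \hline
    \end{tabular}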
    \captionsetup{labelformat=empty} \caption{Fig. 6.2}
    \end{table}
  2. Calculate the missing values in each of the following cells.
    • C3
    • D3
    • E3
  3. Explain why the numbers for 5, 6 and at least 7 muons have been combined into the single category of at least 5 muons, as shown in Fig. 6.2.
  4. In this question you must show detailed reasoning. Carry out the test at the 5\% significance level.
OCR MEI Further Statistics Major 2021 November Q7
7 A physiotherapist is investigating hand grip strength in adult women under 30 years old. She thinks that the grip strength of the dominant hand will be on average 2 kg higher than the grip strength of the non-dominant hand. The physiotherapist selects a random sample of 12 adult women under 30 years old and measures the grip strength of each of their hands. She then uses software to produce a \(95 \%\) confidence interval for the mean difference in grip strength between the two hands (dominant minus nondominant), as shown in Fig. 7. \begin{table}[h]
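\begin{tabular}{|l|l|}
\hline
\multicolumn{2}{|l|}{T Estimate of a Mean} \\
\hline
Confidence Level & 0.95 \\
\hline
\multicolumn{2}{|l|}{Sample} \\
\hline
\multicolumn{2}{|l|}{Result} \\
\hline
\multicolumn{2}{|l|}{T Estimate of a Mean} \\
\hline
Mean & 2.79 \\
s & 3.92 \\
SE & 1.13161 \\
N & 12 \\
df & 11 \\
Lower Limit & 0.29935 \\
Upper Limit & 5.28065 \\
Interval & \(2.79 \pm 2.49065\) \\
\hline
\end{tabular}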
\captionsetup{labelformat=empty} \caption{Fig. 7} \end{table}
  1. Explain why the physiotherapist used the same people for testing their dominant and nondominant grip strengths.
  2. State any assumptions necessary in order to construct the confidence interval shown in Fig. 7.
  3. Explain whether the confidence interval supports the physiotherapist's belief.
  4. The physiotherapist then finds some data which have previously been collected on grip strength using a sample of 100 adult women. A 95\% confidence interval, based on this sample and calculated using a Normal distribution, for the mean difference in grip strength between the two hands (dominant minus non-dominant) is (1.94, 2.84).
    1. For this sample, find
      • the mean difference,
      • the standard deviation of the differences.
    2. Explain what you would need to know about the nature of this sample if you wanted to draw conclusions about the mean difference in grip strength in the population of adult women.
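Recovering the sample mean and standard deviation from the interval (1.94, 2.84) is a matter of reversing the confidence interval formula. A minimal Python sketch, assuming the interval was formed as mean \(\pm z s / \sqrt { n }\) with the Normal 97.5% point (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

lower, upper, n = 1.94, 2.84, 100

# The interval is symmetric about the sample mean.
mean_diff = (lower + upper) / 2

# Half-width = z * s / sqrt(n), with z the two-sided 5% Normal point.
z = NormalDist().inv_cdf(0.975)
half_width = (upper - lower) / 2
s = half_width * sqrt(n) / z

print(round(mean_diff, 2), round(s, 2))
```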
OCR MEI Further Statistics Major 2021 November Q8
8
  1. \(\mathrm { VO } _ { 2 \max }\) is a measure of athletic fitness. Since \(\mathrm { VO } _ { 2 \max }\) is fairly time-consuming and expensive to measure, an exercise scientist wants to predict \(\mathrm { VO } _ { 2 \max }\) from data such as times for running different distances. The scientist uses these data for a random sample of 15 athletes to predict their \(\mathrm { VO } _ { 2 \max }\) values, denoted by \(y\), in suitable units. She also obtains accurate measurements of the \(\mathrm { VO } _ { 2 \max }\) values, denoted by \(x\), in the same units. The scatter diagram in Fig. 8.1 shows the values of \(x\) and \(y\) obtained, together with the equation of the regression line of \(y\) on \(x\) and the value of \(r ^ { 2 }\). \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{ce557137-f9eb-4c09-a7e3-e4ec626109dc-08_750_1324_660_317} \captionsetup{labelformat=empty} \caption{Fig. 8.1}
    \end{figure}
    1. Use the regression line to estimate the predicted \(\mathrm { VO } _ { 2 \max }\) of an athlete whose accurately measured \(\mathrm { VO } _ { 2 \max }\) is 50.
    2. Comment on the reliability of your estimate.
    3. The equation of the regression line of \(x\) on \(y\) is \(x = 0.7565 y + 10.493\). Find the coordinates of the point at which the two regression lines meet.
    4. State what the point you found in part (iii) represents.
  2. It is known that there is negative correlation between \(\mathrm { VO } _ { 2 \max }\) and marathon times in very good runners (those whose best marathon times are under 3 hours). The exercise scientist wishes to know whether the same applies to runners who take longer to run a marathon. She selects a random sample of 20 runners whose best marathon times are between \(3 \frac { 1 } { 2 }\) hours and \(4 \frac { 1 } { 2 }\) hours and accurately measures their \(\mathrm { VO } _ { 2 \max }\). Fig. 8.2 is a scatter diagram of accurately measured \(\mathrm { VO } _ { 2 \max }\), \(v\) units, against best marathon time, \(t\) hours, for these runners. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{ce557137-f9eb-4c09-a7e3-e4ec626109dc-09_671_1064_648_319} \captionsetup{labelformat=empty} \caption{Fig. 8.2}
    \end{figure}
    1. Explain why the exercise scientist comes to the conclusion that a test based on Pearson's product moment correlation coefficient may be valid. Summary statistics for the 20 runners are as follows. $$\sum t = 80.37 \quad \sum v = 970.86 \quad \sum t ^ { 2 } = 324.71 \quad \sum v ^ { 2 } = 47829.24 \quad \sum t v = 3886.53$$
    2. Find the value of Pearson's product moment correlation coefficient.
    3. Carry out a test at the \(5 \%\) significance level to investigate whether there is negative correlation between accurately measured \(\mathrm { VO } _ { 2 \max }\) and best marathon time for runners whose best marathon times are between \(3 \frac { 1 } { 2 }\) hours and \(4 \frac { 1 } { 2 }\) hours.
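The product moment correlation coefficient for the 20 runners follows directly from the summary statistics. A minimal Python sketch; the critical value 0.3783 is an assumption taken from standard tables (one-tailed 5% level, \(n = 20\)), and the variable names are illustrative.

```python
from math import sqrt

n = 20
sum_t, sum_v = 80.37, 970.86
sum_t2, sum_v2, sum_tv = 324.71, 47829.24, 3886.53

# Corrected sums of squares and products.
s_tt = sum_t2 - sum_t**2 / n
s_vv = sum_v2 - sum_v**2 / n
s_tv = sum_tv - sum_t * sum_v / n

r = s_tv / sqrt(s_tt * s_vv)

# Assumption: tabulated one-tailed 5% critical value of the pmcc for
# n = 20. H1 is negative correlation, so reject H0 when r < -0.3783.
R_CRIT = 0.3783
reject_h0 = r < -R_CRIT

print(round(r, 4), reject_h0)
```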
OCR MEI Further Statistics Major 2021 November Q9
9 The discrete random variable \(X\) has a uniform distribution over the set of all integers between \(- n\) and \(n\) inclusive, where \(n\) is a positive integer.
  1. Given that \(n\) is odd, determine \(\mathrm { P } \left( \mathrm { X } > \frac { 1 } { 2 } \mathrm { n } \right)\), giving your answer as a single fraction in terms of \(n\).
  2. Determine the variance of the sum of 10 independent values of \(X\), giving your answer in the form \(\mathrm { an } ^ { 2 } + \mathrm { bn }\), where \(a\) and \(b\) are constants.
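Candidate closed forms for this question can be checked by brute force over small odd values of \(n\). The Python sketch below is a verification aid rather than a derivation; the closed forms \(\frac { n + 1 } { 2 ( 2 n + 1 ) }\) and \(\frac { 10 } { 3 } n ^ { 2 } + \frac { 10 } { 3 } n\) are the candidates being tested, and the function names are introduced here.

```python
from fractions import Fraction as F

def p_greater_than_half(n):
    """P(X > n/2) for X uniform on the integers -n, ..., n."""
    values = range(-n, n + 1)
    return F(sum(1 for x in values if 2 * x > n), len(values))

def var_sum_of_10(n):
    """Variance of the sum of 10 independent copies of X."""
    values = range(-n, n + 1)
    var_x = F(sum(x * x for x in values), len(values))   # E(X) = 0
    return 10 * var_x

# Compare brute force with the candidate closed forms for several odd n.
for n in (1, 3, 5, 7, 9, 11):
    assert p_greater_than_half(n) == F(n + 1, 2 * (2 * n + 1))
    assert var_sum_of_10(n) == F(10, 3) * n**2 + F(10, 3) * n
```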
OCR MEI Further Statistics Major 2021 November Q10
10 Sarah takes a bus to work each weekday morning and returns each evening. The times in minutes that she has to wait for the bus in the morning and evening are modelled by uniform distributions over the intervals \([ 0,10 ]\) and \([ 0,6 ]\) respectively. The times in minutes for the bus journeys in the morning and evening are modelled by \(\mathrm { N } ( 25,4 )\) and \(\mathrm { N } ( 28,16 )\) respectively. You should assume that all of the times are independent. The total time in minutes that she takes for her two journeys, including the waiting times, is denoted by the random variable \(T\). The spreadsheet below shows the first 20 rows of a simulation of 500 return journeys. It also shows in column H the numbers of values of \(T\) that are less than or equal to the corresponding values in column G. For example, there are 156 out of the 500 simulated values of \(T\) which are less than or equal to 58 minutes. All of the times have been rounded to 2 decimal places.
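\begin{tabular}{|c|c|c|c|c|c|c|c|c|}
\hline
 & A & B & C & D & E & F & G & H \\
\hline
1 & Waiting time morning & Journey time morning & Waiting time evening & Journey time evening & Total time &  & Total time \(t\) & Number \(\leqslant t\) \\
\hline
2 & 0.89 & 20.78 & 1.88 & 26.30 & 49.86 &  & 46 & 0 \\
3 & 3.55 & 21.24 & 1.04 & 29.61 & 55.44 &  & 48 & 4 \\
4 & 2.13 & 21.83 & 2.40 & 28.64 & 55.00 &  & 50 & 13 \\
5 & 5.12 & 25.04 & 3.13 & 24.30 & 57.60 &  & 52 & 35 \\
6 & 4.03 & 27.49 & 2.19 & 30.81 & 64.52 &  & 54 & 57 \\
7 & 2.47 & 20.54 & 4.32 & 34.61 & 61.93 &  & 56 & 104 \\
8 & 3.21 & 26.93 & 3.78 & 27.66 & 61.58 &  & 58 & 156 \\
9 & 9.72 & 24.15 & 0.63 & 27.53 & 62.03 &  & 60 & 218 \\
10 & 1.59 & 28.45 & 0.08 & 35.87 & 65.99 &  & 62 & 288 \\
11 & 7.34 & 23.04 & 4.02 & 24.77 & 59.17 &  & 64 & 357 \\
12 & 1.04 & 24.69 & 1.66 & 31.95 & 59.33 &  & 66 & 408 \\
13 & 7.17 & 22.16 & 2.55 & 25.39 & 57.28 &  & 68 & 441 \\
14 & 5.20 & 26.97 & 2.41 & 30.05 & 64.62 &  & 70 & 475 \\
15 & 5.01 & 26.84 & 1.88 & 36.21 & 69.93 &  & 72 & 490 \\
16 & 3.76 & 26.03 & 2.21 & 30.96 & 62.96 &  & 74 & 496 \\
17 & 0.96 & 23.72 & 2.55 & 29.36 & 56.59 &  & 76 & 500 \\
18 & 8.64 & 24.97 & 2.82 & 26.39 & 62.82 &  &  &  \\
19 & 0.59 & 20.82 & 4.57 & 31.41 & 57.38 &  &  &  \\
20 & 9.85 & 23.68 & 5.54 & 29.92 & 68.99 &  &  &  \\
\hline
\end{tabular}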
  1. Use the spreadsheet output to estimate each of the following.
    • \(\mathrm { P } ( T \leqslant 56 )\)
    • \(\mathrm { P } ( T > 61 )\)
  2. The random variable \(W\) is Normally distributed with the same mean and variance as \(T\). Find each of the following.
    • \(\mathrm { P } ( W \leqslant 56 )\)
    • \(\mathrm { P } ( W > 61 )\)
  3. Explain why, if many more journeys were used in the simulation, you would expect \(\mathrm { P } ( T > 61 )\) to be extremely close to \(\mathrm { P } ( W > 61 )\).
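The simulation estimates and the Normal probabilities can be compared with a short Python sketch. Since \(t = 61\) does not appear in the cumulative counts, the sketch interpolates linearly between \(t = 60\) and \(t = 62\); that interpolation is an assumption made here, as are the variable names.

```python
from math import sqrt
from statistics import NormalDist

n_sim = 500
# Column H of the spreadsheet: number of simulated T values <= t.
count_le = {56: 104, 60: 218, 62: 288}

est_le_56 = count_le[56] / n_sim

# t = 61 is not tabulated; linear interpolation between t = 60 and
# t = 62 is an assumption made here to approximate the count.
count_le_61 = (count_le[60] + count_le[62]) / 2
est_gt_61 = 1 - count_le_61 / n_sim

# Mean and variance of T: U[a, b] has mean (a + b)/2 and
# variance (b - a)^2 / 12; the independent components add.
mean_t = 10 / 2 + 25 + 6 / 2 + 28
var_t = 10**2 / 12 + 4 + 6**2 / 12 + 16

w = NormalDist(mean_t, sqrt(var_t))
p_w_le_56 = w.cdf(56)
p_w_gt_61 = 1 - w.cdf(61)

print(est_le_56, est_gt_61, round(p_w_le_56, 4), p_w_gt_61)
```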
OCR MEI Further Statistics Major 2021 November Q11
11 The continuous random variable \(X\) has probability density function given by
\(f ( x ) = \begin{cases} a x ^ { 2 } & 0 \leqslant x < 2 , \\
b ( 3 - x ) ^ { 2 } & 2 \leqslant x \leqslant 3 , \\
0 & \text { otherwise } \end{cases}\)
where \(a\) and \(b\) are positive constants.
  1. Given that \(\mathrm { E } ( X ) = 2\), determine the values of \(a\) and \(b\).
  2. Determine the median value of \(X\).
  3. A random sample of 50 observations of \(X\) is selected. Given that \(\operatorname { Var } ( X ) = 0.2\), determine an estimate of the probability that the mean value of the 50 observations is less than 1.9.
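The constants \(a\) and \(b\) come from two linear equations (total probability 1 and mean 2), which can be solved exactly in Python. The sketch below writes the integrals out by hand in the comments; the use of Cramer's rule and the variable names are choices made here for illustration, and the last step assumes the central limit theorem applies to the mean of 50 observations.

```python
from fractions import Fraction as F
from math import sqrt
from statistics import NormalDist

# Total probability: a * (8/3) + b * (1/3) = 1.
# Mean:              a * 4     + b * (3/4) = 2.
# (Integrals of x^2 and x^3 over [0, 2], and of (3 - x)^2 and
# x(3 - x)^2 over [2, 3].)
a11, a12, rhs1 = F(8, 3), F(1, 3), F(1)
a21, a22, rhs2 = F(4), F(3, 4), F(2)

# Solve the 2x2 linear system by Cramer's rule.
det = a11 * a22 - a12 * a21
a = (rhs1 * a22 - a12 * rhs2) / det
b = (a11 * rhs2 - rhs1 * a21) / det

# P(X < 2) = 8a/3 < 1/2, so the median m lies in [2, 3] and satisfies
# b * (3 - m)^3 / 3 = 1/2.
median = 3 - (3 / (2 * float(b))) ** (1 / 3)

# Sample mean of 50 observations: approximately N(2, 0.2 / 50) by the CLT.
p_mean_lt = NormalDist(2, sqrt(0.2 / 50)).cdf(1.9)

print(a, b, round(median, 4), round(p_mean_lt, 4))
```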
OCR MEI Further Statistics Major 2022 June Q6
  1. Determine a 95\% confidence interval for the mean weight of liquid paraffin in a tub.
  2. Explain whether the confidence interval supports the researcher's belief.
  3. Explain why the sample has to be random in order to construct the confidence interval.
  4. A 95\% confidence interval for the mean weight in grams of another ingredient in the skin cream is [1.202, 1.398]. This confidence interval is based on a large sample and the unbiased estimate of the population variance calculated from the sample is 0.25. Find each of the following.
    • The mean of the sample
    • The size of the sample
OCR MEI Further Statistics Major 2020 November Q8
10 marks
8 In this question you must show detailed reasoning. On the manufacturer's website, it is claimed that the average daily electricity consumption of a particular model of fridge is 1.25 kWh (kilowatt hours). A researcher at a consumer organisation decides to check this figure. A random sample of 40 fridges is selected. Summary statistics for the electricity consumption \(x \mathrm { kWh }\) of these fridges, measured over a period of 24 hours, are as follows.
\(\Sigma x = 51.92 \quad \Sigma x ^ { 2 } = 70.57\) Carry out a test at the \(5 \%\) significance level to investigate the validity of the claim on the website.
OCR MEI Further Statistics Major Specimen Q1
1 In a promotion for a new type of cereal, a toy dinosaur is included in each pack. There are three different types of dinosaur to collect. They are distributed, with equal probability, randomly and independently in the packs. Sam is trying to collect all three of the dinosaurs.
  1. Find the probability that Sam has to open only 3 packs in order to collect all three dinosaurs. Sam continues to open packs until she has collected all three dinosaurs, but once she has opened 6 packs she gives up even if she has not found all three. The random variable \(X\) represents the number of packs which Sam opens.
  2. Complete the table below, using the copy in the Printed Answer Booklet, to show the probability distribution of \(X\).
    \begin{tabular}{|c|c|c|c|c|}
    \hline
    \(r\) & 3 & 4 & 5 & 6 \\
    \hline
    \(\mathrm { P } ( X = r )\) &  & \(\frac { 2 } { 9 }\) & \(\frac { 14 } { 81 }\) &  \\
    \hline
    \end{tabular}
  3. In this question you must show detailed reasoning. Find
    • \(\mathrm { E } ( X )\) and
    • \(\operatorname { Var } ( X )\).
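The distribution of \(X\) can be checked by enumerating all \(3 ^ { 6 }\) equally likely sequences of dinosaur types in six packs. This Python sketch is a brute-force check rather than the detailed reasoning the question asks for; the function name `packs_opened` is introduced here.

```python
from fractions import Fraction as F
from itertools import product

def packs_opened(seq):
    """Packs opened for one sequence of 6 packs: stop at full set or at 6."""
    seen = set()
    for i, toy in enumerate(seq, start=1):
        seen.add(toy)
        if len(seen) == 3:
            return i
    return len(seq)

# All 3^6 equally likely sequences of dinosaur types in 6 packs.
outcomes = list(product(range(3), repeat=6))
dist = {r: F(0) for r in (3, 4, 5, 6)}
for seq in outcomes:
    dist[packs_opened(seq)] += F(1, len(outcomes))

e_x = sum(r * p for r, p in dist.items())
var_x = sum(r**2 * p for r, p in dist.items()) - e_x**2

print(dist, e_x, var_x)
```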
OCR MEI Further Statistics Major Specimen Q2
2 The continuous random variable \(X\) takes values in the interval \(- 1 \leq x \leq 1\) and has probability density function $$f ( x ) = \left\{ \begin{array} { l r } a & - 1 \leq x < 0 \\
a + x ^ { 2 } & 0 \leq x \leq 1 \end{array} \right.$$ where \(a\) is a constant.
  1. (A) Sketch the probability density function.
    (B) Show that \(a = \frac { 1 } { 3 }\).
  2. Find
    (A) \(\mathrm { P } \left( X < \frac { 1 } { 2 } \right)\),
    (B) the mean of \(X\).
  3. Show that the median of \(X\) satisfies the equation \(2 m ^ { 3 } + 2 m - 1 = 0\).
OCR MEI Further Statistics Major Specimen Q3
3 A researcher is investigating factors that might affect how many hours per day different species of mammals spend asleep. First she investigates human beings. She collects data on body mass index, \(x\), and hours of sleep, \(y\), for a random sample of people. A scatter diagram of the data is shown in Fig. 3.1 together with the regression line of \(y\) on \(x\). \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{e6ee3a4a-3e76-4422-9a78-17b64b458f83-04_885_1584_598_274} \captionsetup{labelformat=empty} \caption{Fig. 3.1}
\end{figure}
  1. Calculate the residual for the data point which has the residual with the greatest magnitude.
  2. Use the equation of the regression line to estimate the mean number of hours spent asleep by a person with body mass index
    (A) 26,
    (B) 16,
    commenting briefly on each of your predictions. The researcher then collects additional data for a large number of species of mammals and analyses different factors for effect size. Definitions of the variables measured for a typical animal of the species, the correlations between these variables, and guidelines often used when considering effect size are given in Fig. 3.2.
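    \begin{tabular}{|l|l|}
    \hline
    Variable & Definition \\
    \hline
    Body mass & Mass of animal in kg \\
    \hline
    Brain mass & Mass of brain in g \\
    \hline
    Hours of sleep/day & Number of hours per day spent asleep \\
    \hline
    Life span & How many years the animal lives \\
    \hline
    Danger & A measure of how dangerous the animal's situation is when asleep, taking into account predators and how protected the animal's den is: higher value indicates greater danger. \\
    \hline
    \end{tabular}

    \begin{tabular}{|l|c|c|c|c|c|}
    \hline
    Correlations (pmcc) & Body Mass & Brain Mass & Hours of sleep/day & Life span & Danger \\
    \hline
    Body Mass & 1.00 &  &  &  &  \\
    \hline
    Brain Mass & 0.93 & 1.00 &  &  &  \\
    \hline
    Hours of sleep/day & -0.31 & -0.36 & 1.00 &  &  \\
    \hline
    Life span & 0.30 & 0.51 & -0.41 & 1.00 &  \\
    \hline
    Danger & 0.13 & 0.15 & -0.59 & 0.06 & 1.00 \\
    \hline
    \end{tabular}

    \begin{table}[h]
    \begin{tabular}{|c|l|}
    \hline
    Product moment correlation coefficient & Effect size \\
    \hline
    0.1 & Small \\
    \hline
    0.3 & Medium \\
    \hline
    0.5 & Large \\
    \hline
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Fig. 3.2}
    \end{table}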
  3. State two conclusions the researcher might draw from these tables, relevant to her investigation into how many hours mammals spend asleep. One of the researcher's students notices the high correlation between body mass and brain mass and produces a scatter diagram for these two variables, shown in Fig. 3.3 below. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e6ee3a4a-3e76-4422-9a78-17b64b458f83-05_675_698_1802_735} \captionsetup{labelformat=empty} \caption{Fig. 3.3}
    \end{figure}
  4. Comment on the suitability of a linear model for these two variables.
OCR MEI Further Statistics Major Specimen Q4
4 A fair six-sided dice is rolled repeatedly. Find the probability of the following events.
  1. A five occurs for the first time on the fourth roll.
  2. A five occurs at least once in the first four rolls.
  3. A five occurs for the second time on the third roll.
  4. At least two fives occur in the first three rolls. The dice is rolled repeatedly until a five occurs for the second time.
  5. Find the expected number of rolls required for two fives to occur. Justify your answer.
OCR MEI Further Statistics Major Specimen Q5
5 A particular brand of pasta is sold in bags of two different sizes. The mass of pasta in the large bags is advertised as being 1500 g; in fact it is Normally distributed with mean 1515 g and standard deviation 4.7 g. The mass of pasta in the small bags is advertised as being 500 g; in fact it is Normally distributed with mean 508 g and standard deviation 3.3 g.
  1. Find the probability that the total mass of pasta in 5 randomly selected small bags is less than 2550 g.
  2. Find the probability that the mass of pasta in a randomly selected large bag is greater than three times the mass of pasta in a randomly selected small bag.
OCR MEI Further Statistics Major Specimen Q6
6 Fig. 6 shows the wages earned in the last 12 months by each of a random sample of American males aged between 16 and 65 . \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{e6ee3a4a-3e76-4422-9a78-17b64b458f83-07_771_1278_340_392} \captionsetup{labelformat=empty} \caption{Fig. 6}
\end{figure} A researcher wishes to test whether the sample provides evidence of a tendency for higher wages to be earned by older men in the age range 16 to 65 in America.
  1. The researcher needs to decide whether to use a test based on Pearson's product moment correlation coefficient or Spearman's rank correlation coefficient. Use the information in Fig. 6 to decide which test is more appropriate.
  2. Should it be a one-tail or a two-tail test? Justify your answer.
OCR MEI Further Statistics Major Specimen Q7
7 A newspaper reports that the average price of unleaded petrol in the UK is 110.2 p per litre. The price, in pence, of a litre of unleaded petrol at a random sample of 15 petrol stations in Yorkshire is shown below together with some output from software used to analyse the data.
\(\begin{array} { l l l l l } 116.9 & 114.9 & 110.9 & 113.9 & 114.9 \\
117.9 & 112.9 & 99.9 & 114.9 & 103.9 \\
123.9 & 105.7 & 108.9 & 102.9 & 112.7 \end{array}\)
\begin{table}[h]
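\begin{tabular}{|l|l|}
\hline
\multicolumn{2}{|l|}{Statistics} \\
\hline
n & 15 \\
Mean & 111.6733 \\
\(\sigma\) & 6.1877 \\
s & 6.4048 \\
\(\Sigma \mathrm { x }\) & 1675.1 \\
\(\Sigma \mathrm { x } ^ { 2 }\) & 187638.31 \\
Min & 99.9 \\
Q1 & 105.7 \\
Median & 112.9 \\
Q3 & 114.9 \\
Max & 123.9 \\
\hline
\end{tabular}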
\captionsetup{labelformat=empty} \caption{Fig. 7.1}
\end{table}
\begin{tabular}{|l|l|}
\hline
\(n\) & 15 \\
\hline
Kolmogorov-Smirnov test & \(p > 0.15\) \\
\hline
Null hypothesis & The data can be modelled by a Normal distribution \\
\hline
Alternative hypothesis & The data cannot be modelled by a Normal distribution \\
\hline
\end{tabular}
  1. Select a suitable hypothesis test to investigate whether there is any evidence that the average price of unleaded petrol in Yorkshire is different from 110.2 p. Justify your choice of test.
  2. Conduct the hypothesis test at the \(5 \%\) level of significance.
OCR MEI Further Statistics Major Specimen Q8
8 Natural background radiation consists of various particles, including neutrons. A detector is used to count the number of neutrons per second at a particular location.
  1. State the conditions required for a Poisson distribution to be a suitable model for the number of neutrons detected per second. The number of neutrons detected per second due to background radiation only is modelled by a Poisson distribution with mean 1.1.
  2. Find the probability that the detector detects
    (A) no neutrons in a randomly chosen second,
    (B) at least 60 neutrons in a randomly chosen period of 1 minute. A neutron source is switched on. It emits neutrons which should all be contained in a protective casing. The detector is used to check whether any neutrons have not been contained; these are known as stray neutrons. If the detector detects more than 8 neutrons in a period of 1 second, an alarm will be triggered in case this high reading is due to stray neutrons.
  3. Suppose that there are no stray neutrons and so the neutrons detected are all due to the background radiation. Find the expected number of times the alarm is triggered in 1000 randomly chosen periods of 1 second.
  4. Suppose instead that stray neutrons are being produced at a rate of 3.4 per second in addition to the natural background radiation. Find the probability that at least one alarm will be triggered in 10 randomly chosen periods of 1 second. You should assume that all stray neutrons produced are detected.
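The Poisson probabilities in this question can be computed directly from the probability mass function. A minimal Python sketch, assuming background and stray neutrons arrive independently so that their rates add; the helper `poisson_cdf` is a name introduced here.

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

background = 1.1      # mean neutrons per second, background only

p_none = exp(-background)                          # no neutrons in 1 s
p_60_in_minute = 1 - poisson_cdf(59, 60 * background)

# Alarm: more than 8 neutrons in one second.
p_alarm_bg = 1 - poisson_cdf(8, background)
expected_alarms = 1000 * p_alarm_bg

# With stray neutrons at 3.4 per second, the rates add: Po(1.1 + 3.4).
p_alarm_stray = 1 - poisson_cdf(8, background + 3.4)
p_at_least_one = 1 - (1 - p_alarm_stray) ** 10

print(p_none, p_60_in_minute, expected_alarms, p_at_least_one)
```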