OCR MEI Further Statistics A AS (Further Statistics A AS) 2024 June

Question 1
1 The probability distribution for a discrete random variable \(X\) is given in the table below.
\begin{tabular}{|c|c|c|c|c|}
\hline
\(x\) & 0 & 1 & 2 & 3 \\
\hline
\(\mathrm{P}(X = x)\) & \(2c\) & \(3c\) & \(0.5 - c\) & \(c\) \\
\hline
\end{tabular}
  1. Find the value of \(c\).
  2. Find the value of each of the following.
    • \(\mathrm { E } ( X )\)
    • \(\operatorname { Var } ( X )\)
    The random variable \(Y\) is defined by \(Y = 2 X - 3\).
  3. Find the value of each of the following.
    • \(\mathrm { E } ( Y )\)
    • \(\operatorname { Var } ( Y )\)
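The arithmetic for this question can be checked with a short sketch. It is not a mark-scheme answer: it only assumes that the four probabilities sum to 1 and uses the standard results \(\mathrm{E}(aX + b) = a\mathrm{E}(X) + b\) and \(\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)\). All variable names are illustrative.

```python
from fractions import Fraction

# The probabilities 2c, 3c, 0.5 - c and c must sum to 1, so 5c + 0.5 = 1.
c = (1 - Fraction(1, 2)) / 5

# Probability of each value x = 0, 1, 2, 3.
probs = {0: 2 * c, 1: 3 * c, 2: Fraction(1, 2) - c, 3: c}

# E(X) and Var(X) = E(X^2) - E(X)^2.
mean_x = sum(x * p for x, p in probs.items())
var_x = sum(x * x * p for x, p in probs.items()) - mean_x**2

# Y = 2X - 3 is a linear transformation of X.
mean_y = 2 * mean_x - 3
var_y = 2**2 * var_x

print(c, mean_x, var_x, mean_y, var_y)
```

Exact fractions are used so that the check is not affected by floating-point rounding.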
Question 2
2 In a game of chance there are 32 slots, numbered 1 to 32, and on each turn a ball lands in one of them. You may assume that the process is completely random. You are given that \(X\) is the random variable denoting the number of the slot that the ball lands in on a given turn.
  1. Suggest a suitable distribution to model \(X\). You should state the value(s) of any parameter(s).
  2. Write down \(\mathrm { P } ( X = 7 )\).
    Players of the game start with a score of 0. On each turn a player may choose to play the game by selecting a number. If the ball lands in the slot with that number then 15 is added to the player's score. Otherwise, the player's score is reduced by 1. A player's score may become negative.
    A player decides to play the game, selecting the number 7 on each turn, until the ball lands in the slot numbered 7. You are given that \(Y\) is the random variable denoting the number of turns up to and including the turn in which the ball lands in the slot numbered 7.
  3. Determine \(\mathrm { P } ( Y \leqslant 15 )\).
  4. Determine the player's expected final score.
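A possible numerical check for this question is sketched below. It assumes the natural models suggested by the wording: a discrete uniform distribution on 1 to 32 for \(X\), and a geometric distribution with success probability \(1/32\) for \(Y\). The final score is taken to be 15 for the winning turn minus 1 for each of the \(Y - 1\) losing turns. Variable names are illustrative.

```python
from fractions import Fraction

n_slots = 32

# Under a discrete uniform model every slot is equally likely.
p_seven = Fraction(1, n_slots)

# Y is geometric: P(Y <= 15) = 1 - P(no success in the first 15 turns).
p_y_le_15 = 1 - (1 - p_seven) ** 15

# A geometric distribution with success probability p has mean 1/p.
expected_turns = 1 / p_seven

# Final score = 15 - (Y - 1), so take expectations term by term.
expected_score = 15 - (expected_turns - 1)

print(p_seven, float(p_y_le_15), expected_score)
```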
Question 3
3 A glassware factory produces a large number of ornaments each week. Just before they leave the factory, all the ornaments are checked and some may be found to be defective. The Quality Assurance Manager of the factory wishes to model the number of defective ornaments that are found each week using a Poisson distribution. The numbers of defective ornaments found each week in a period of 40 weeks are shown in Table 3.1.
\begin{table}[h]
\captionsetup{labelformat=empty} \caption{Table 3.1}
\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
No. of defective ornaments in a week, \(r\) & 0 & 1 & 2 & 3 & 4 & 5 & 6 & \(\geqslant 7\) \\
\hline
No. of weeks with \(r\) defective ornaments, \(f\) & 2 & 14 & 13 & 5 & 3 & 1 & 2 & 0 \\
\hline
\end{tabular}
\end{table}
You are given that summary statistics for the data are \(\sum f = 40\), \(\sum rf = 84\) and \(\sum r^{2} f = 256\).
  1. By using the summary statistics to determine estimates for the mean and variance of the number of defective ornaments produced by the factory each week, explain how the data support the suggestion that the number of defective ornaments produced each week can be modelled using a Poisson distribution.
    The Quality Assurance Manager is asked by the head office to carry out a chi-squared hypothesis test for goodness of fit based on a \(\operatorname { Po } ( 2 )\) distribution.
  2. Table 3.2, which is incomplete, gives observed frequency, probability, expected frequency and chi-squared contribution.
    \begin{table}[h]
    \captionsetup{labelformat=empty} \caption{Table 3.2}
    \begin{tabular}{|c|c|c|c|c|}
    \hline
    No. of defective ornaments in a week, \(r\) & Observed frequency & Probability & Expected frequency & Chi-squared contribution \\
    \hline
    0 & 2 & 0.13534 & 5.4134 & 2.15232 \\
    \hline
    1 & 14 & & & \\
    \hline
    2 & 13 & 0.27067 & & 0.43620 \\
    \hline
    3 & 5 & & 7.2179 & \\
    \hline
    \(\geqslant 4\) & 6 & 0.14288 & & 0.01421 \\
    \hline
    \end{tabular}
    \end{table}
    1. Complete the copy of the table in the Printed Answer Booklet.
    2. Carry out the test at the \(10 \%\) significance level.
  3. On one occasion a fork-lift truck in the factory drops a crate containing eight ornaments and all of them are subsequently found to be defective. Explain why the Poisson model cannot model defects occurring in this manner.
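The calculations behind this question can be sketched as follows. This is a rough check rather than a model answer. It assumes the variance estimate uses the divisor \(n - 1\), that the classes are \(r = 0, 1, 2, 3\) and \(r \geqslant 4\) as in Table 3.2, and that the 10% critical value for 4 degrees of freedom is 7.779, taken from standard chi-squared tables (no parameter is estimated from the data because \(\operatorname{Po}(2)\) is specified). The helper `poisson_pmf` is illustrative.

```python
import math

# Summary statistics from the question.
n, sum_rf, sum_r2f = 40, 84, 256

# Sample mean and sample variance (divisor n - 1 is an assumption).
mean = sum_rf / n
variance = (sum_r2f - sum_rf**2 / n) / (n - 1)


def poisson_pmf(r, lam):
    """P(R = r) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**r / math.factorial(r)


lam = 2
observed = [2, 14, 13, 5, 6]  # classes r = 0, 1, 2, 3 and r >= 4

# Probabilities for r = 0..3, then the remaining tail for r >= 4.
probs = [poisson_pmf(r, lam) for r in range(4)]
probs.append(1 - sum(probs))

expected = [n * p for p in probs]
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)

critical_value = 7.779  # chi-squared, 4 degrees of freedom, 10% level (tables)
reject_h0 = statistic > critical_value

print(round(mean, 3), round(variance, 3))
print([round(e, 4) for e in expected])
print([round(x, 5) for x in contributions])
print(round(statistic, 3), reject_h0)
```

A mean and variance that are close to each other are consistent with a Poisson model, which is the comparison part 1 asks for.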
Question 4
4 A chemist is conducting an experiment in which the concentration of a certain chemical, A, is supposed to be recorded at the start of the experiment and then every 30 seconds after the start. The time after the start is denoted by \(t \mathrm {~s}\) and the concentration by \(z \mathrm{~mg~cm}^{-3}\). The collected data are shown in the table below. Note that the concentration at \(t = 90\) was not recorded.
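\begin{tabular}{|l|c|c|c|c|c|}
\hline
Time, \(t\) & 0 & 30 & 60 & 120 & 150 \\
\hline
Concentration of A, \(z\) & 40.0 & 31.3 & 27.5 & 12.8 & 11.4 \\
\hline
\end{tabular}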
The chemist wishes to plot the data on a graph.
  1. Explain why \(t\) should be plotted on the horizontal axis.
    You are given that the summary statistics for the data are as follows.
    \(n = 5 \quad \sum t = 360 \quad \sum z = 123.0 \quad \sum t ^ { 2 } = 41400 \quad \sum z ^ { 2 } = 3629.74 \quad \sum tz = 5835\)
    The regression line of \(z\) on \(t\) is given by \(z = a + bt\) and is used to model the concentration of chemical A for \(t \geqslant 0\).
  2.
    1. Use the summary statistics to determine the value of \(a\) and the value of \(b\).
    2. Find the value of the residual at each of the following values of \(t\).
      • \(t = 60\)
      • \(t = 120\)
  3.
    1. Use the equation of the regression line to estimate the value of the concentration at 90 seconds.
    2. With reference to your answers to part (b)(ii), comment on the reliability of your answer to part (c)(i).
    Further experiments indicate that the model is reasonably reliable for times greater than 150 seconds up to about 200 seconds.
  4. Show that the model cannot be valid beyond a time of about 200 seconds.
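The regression calculations can be checked with the sketch below. It reads the final summary statistic, 5835, as \(\sum tz\), which agrees with the tabulated data, and uses the usual least-squares formulas \(b = S_{tz} / S_{tt}\) and \(a = \bar{z} - b\bar{t}\). The function `predict` and the other names are illustrative, and the time at which the fitted line reaches zero is offered only as one way to approach the last part.

```python
# Summary statistics (5835 is taken to be the sum of t*z).
n = 5
sum_t, sum_z = 360, 123.0
sum_t2, sum_tz = 41400, 5835

# Corrected sums of squares and products.
s_tt = sum_t2 - sum_t**2 / n
s_tz = sum_tz - sum_t * sum_z / n

# Least-squares gradient and intercept for the line z = a + b*t.
b = s_tz / s_tt
a = sum_z / n - b * sum_t / n


def predict(t):
    """Concentration predicted by the regression line at time t."""
    return a + b * t


# Residual = observed value minus predicted value.
residual_60 = 27.5 - predict(60)
residual_120 = 12.8 - predict(120)

# Interpolated estimate at the missing time.
estimate_90 = predict(90)

# Time at which the line predicts zero concentration; beyond this the
# predicted concentration is negative, which is not physically possible.
t_zero = -a / b

print(round(a, 3), round(b, 5))
print(round(residual_60, 3), round(residual_120, 3))
print(round(estimate_90, 2), round(t_zero, 1))
```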
Question 5
5 A student is investigating possible association between the amount of coffee that an adult drinks each day and the number of hours that they remain awake each day. In an initial investigation, a random sample of 8 adults is selected. The student obtains the following information from each of these adults: the amount of coffee that they drink each day and the number of hours that they remain awake each day. The student analyses the data and finds that the associated product moment correlation coefficient is 0.6030.
  1. State one assumption that must be made for a hypothesis test based on the product moment correlation coefficient to be carried out.
    For the remainder of this question you may assume that this assumption is true.
  2. Carry out a test at the \(5 \%\) significance level to investigate whether there is any correlation between amount of coffee drunk and number of hours awake.
    The student conducts a second investigation which is similar to the first but this time based on a random sample of 30 adults. The product moment correlation coefficient for the new data is 0.5487. The student carries out an equivalent hypothesis test to the one carried out in part (b), again using a \(5 \%\) significance level.
  3. Identify any differences between the two tests and their results. You do not need to restate the hypotheses or explain the conclusion in context.
  4. You may assume the following guidelines for considering effect size.
    \begin{tabular}{|c|c|}
    \hline
    Product moment correlation coefficient & Effect size \\
    \hline
    0.1 & Small \\
    \hline
    0.3 & Medium \\
    \hline
    0.5 & Large \\
    \hline
    \end{tabular}
    Explain briefly why the results of the student's second investigation are likely to be more reliable than the results of the initial investigation.
Question 6
6 A bank monitors the amounts of cash withdrawn from a cash machine. It categorises any withdrawal of an amount of \(\pounds 50\) or less as 'small' and any withdrawal of an amount greater than \(\pounds 50\) as 'large'. Over a long period of time the bank finds that the proportion of withdrawals that are small is 0.43 .
The bank wishes to model a sample of 10 withdrawals to examine the number of small withdrawals.
  1.
    1. State a suitable probability distribution for such a model, justifying your answer.
    2. State one assumption needed for the model to be valid.
  2.
    1. Find the probability that exactly 4 of the 10 withdrawals are small.
    2. Find the probability that exactly 4 of the 10 withdrawals are large.
    3. Find the probability that no more than 4 of the 10 withdrawals are large.
  3. Find the probability that, in the 10 withdrawals, the 7th withdrawal is large and there are exactly 3 that are small.
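The probabilities in this question can be checked with the sketch below. It assumes the number of small withdrawals follows \(\mathrm{B}(10, 0.43)\) with independent withdrawals. For the final part it assumes the event can be split into "the 7th withdrawal is large" and "exactly 3 of the other 9 are small". The helper `binom_pmf` is illustrative.

```python
from math import comb

n = 10
p_small = 0.43
p_large = 1 - p_small


def binom_pmf(k, trials, p):
    """P(exactly k successes in the given number of independent trials)."""
    return comb(trials, k) * p**k * (1 - p) ** (trials - k)


# Exactly 4 small withdrawals out of 10.
p_4_small = binom_pmf(4, n, p_small)

# Exactly 4 large is the same event as exactly 6 small.
p_4_large = binom_pmf(4, n, p_large)

# No more than 4 large: sum over 0, 1, 2, 3, 4 large withdrawals.
p_at_most_4_large = sum(binom_pmf(k, n, p_large) for k in range(5))

# 7th withdrawal large, and exactly 3 small among the remaining 9.
p_7th_large_3_small = p_large * binom_pmf(3, n - 1, p_small)

print(round(p_4_small, 4), round(p_4_large, 4))
print(round(p_at_most_4_large, 4), round(p_7th_large_3_small, 4))
```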