OCR MEI Further Statistics B AS (Further Statistics B AS) 2018 June

Question 1
1 The birth weights, in kilograms, of a random sample of 9 captive-bred elephants are as follows. $$\begin{array} { l l l l l l l l l } 94 & 138 & 130 & 118 & 146 & 165 & 82 & 115 & 69 \end{array}$$ A researcher uses software to produce a \(99 \%\) confidence interval for the mean birth weight of captive-bred elephants. The output from the software is shown in Fig. 1. \begin{table}[h]
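\begin{tabular}{|l|l|}
\hline
\multicolumn{2}{|l|}{Result} \\
\hline
\multicolumn{2}{|l|}{T Estimate of a Mean} \\
\hline
Mean & \\
\hline
s & \\
\hline
SE & \\
\hline
N & \\
\hline
df & \\
\hline
Lower limit & \\
\hline
Upper limit & \\
\hline
Interval & \\
\hline
\end{tabular}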
\captionsetup{labelformat=empty} \caption{Fig. 1}
\end{table}
  1. State an assumption about the distribution of the population from which these weights come that is necessary in order to produce this interval.
  2. State the confidence interval which the software gives, in the form \(a < \mu < b\).
  3. Explain
    • what the label df means,
    • how the value of df is calculated for a confidence interval produced using the \(t\) distribution.
  4. State two ways in which the researcher could have obtained a narrower confidence interval.
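The interval asked for above can be recomputed from the nine listed weights. The following is a minimal sketch, assuming the population is Normal and taking the two-sided 99\% critical value for \(t\) with 8 degrees of freedom (about 3.355) from standard tables, hard-coded because the Python standard library has no \(t\) distribution. The variable names are illustrative, and the printed figures are a recomputation, not the software output of Fig. 1.

```python
import math
import statistics

# Birth weights (kg) of the 9 captive-bred elephants, as listed in the question.
weights = [94, 138, 130, 118, 146, 165, 82, 115, 69]

n = len(weights)
df = n - 1                          # degrees of freedom for a one-sample t interval
mean = statistics.mean(weights)
s = statistics.stdev(weights)       # sample standard deviation (divisor n - 1)
se = s / math.sqrt(n)               # standard error of the mean

# Assumption: two-sided 99% critical value of t with 8 df, read from tables.
T_CRIT_99_DF8 = 3.355

lower = mean - T_CRIT_99_DF8 * se
upper = mean + T_CRIT_99_DF8 * se

print(f"mean = {mean:.2f}, s = {s:.2f}, SE = {se:.2f}, N = {n}, df = {df}")
print(f"{lower:.2f} < mu < {upper:.2f}")
```

The half-width is the product of the critical value and SE, so it shrinks if the confidence level is lowered or if the sample size is increased.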
Question 2
2 A supermarket sells oranges. Their weights are modelled by the random variable \(X\) which has a Normal distribution with mean 345 grams and standard deviation 15 grams. When the oranges have been peeled, their weights in grams, \(Y\), are modelled by \(Y = 0.7 X\).
  1. Find the probability that a randomly chosen peeled orange weighs less than 250 grams.
    I randomly choose 5 oranges to buy.
  2. Find the probability that the total weight of the 5 unpeeled oranges is at least 1800 grams.
  3. I peel three of the oranges and leave the remaining two unpeeled. Find the probability that the total weight of the two unpeeled oranges is greater than the total weight of the three peeled ones.
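The three probabilities above all reduce to Normal distributions of linear combinations. The following is a rough sketch of that calculation, assuming the weights of different oranges are independent (the question does not say so explicitly); the variable names are illustrative.

```python
import math
from statistics import NormalDist

MU_X, SD_X = 345.0, 15.0     # unpeeled weight X ~ N(345, 15^2), in grams
K = 0.7                      # peeled weight Y = 0.7 X

# Part 1: Y = 0.7 X is Normal with mean 0.7 * 345 and sd 0.7 * 15.
peeled = NormalDist(K * MU_X, K * SD_X)
p_peeled_lt_250 = peeled.cdf(250)

# Part 2: total of 5 independent unpeeled oranges; means add and variances add.
total5 = NormalDist(5 * MU_X, math.sqrt(5) * SD_X)
p_total_ge_1800 = 1 - total5.cdf(1800)

# Part 3: D = (X1 + X2) - 0.7 * (X3 + X4 + X5), all five weights independent.
mean_d = 2 * MU_X - K * 3 * MU_X
var_d = 2 * SD_X**2 + (K**2) * 3 * SD_X**2
diff = NormalDist(mean_d, math.sqrt(var_d))
p_unpeeled_heavier = 1 - diff.cdf(0)

print(f"P(Y < 250)           = {p_peeled_lt_250:.4f}")
print(f"P(total >= 1800)     = {p_total_ge_1800:.4f}")
print(f"P(2 unpeeled > 3 peeled) = {p_unpeeled_heavier:.4f}")
```

Note that in part 3 each peeled weight is \(0.7\) times a separate orange's weight, so the variance of the peeled total is \(3 \times 0.7^2 \times 15^2\), not \((3 \times 0.7)^2 \times 15^2\).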
Question 3
3 The probability density function of the continuous random variable \(X\) is given by $$\mathrm { f } ( x ) = \begin{cases} c + x & - c \leqslant x \leqslant 0 \\
c - x & 0 \leqslant x \leqslant c \\
0 & \text { otherwise } \end{cases}$$ where \(c\) is a positive constant.
  1. (A) Sketch the graph of the probability density function.
    (B) Show that \(c = 1\).
  2. Find \(\mathrm { P } \left( X < \frac { 1 } { 4 } \right)\).
  3. Find
    • the mean of \(X\),
    • the standard deviation of \(X\).
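The results above can be checked numerically. The following is a minimal sketch using a midpoint-rule integration with \(c = 1\); the helper names `pdf` and `integrate` are illustrative, and the numerical values are approximations to the exact answers obtained by integration.

```python
import math

def pdf(x, c=1.0):
    """Triangular density from the question: c + x on [-c, 0], c - x on [0, c]."""
    if -c <= x <= 0:
        return c + x
    if 0 < x <= c:
        return c - x
    return 0.0

def integrate(g, a, b, n=20000):
    """Midpoint-rule approximation to the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

total_area = integrate(pdf, -1, 1)                  # should be 1 when c = 1
p_less_quarter = integrate(pdf, -1, 0.25)           # P(X < 1/4)
mean = integrate(lambda x: x * pdf(x), -1, 1)       # zero by symmetry
variance = integrate(lambda x: (x - mean) ** 2 * pdf(x), -1, 1)
sd = math.sqrt(variance)

print(total_area, p_less_quarter, mean, sd)
```

For general \(c\) the area under the triangle is \(c^2\), which is why the density is only valid for \(c = 1\).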
Question 4
4 The random variable \(X\) has a continuous uniform distribution on \([0, 10]\).
  1. Find \(\mathrm { P } ( 3 < X < 6 )\).
  2. Find each of the following.
    • \(\mathrm { E } ( X )\)
    • \(\operatorname { Var } ( X )\)
    Marisa is investigating the sample mean, \(Y\), of 8 independent values of \(X\). She designs a simulation shown in the spreadsheet in Fig. 4.1. Each of the 25 rows below the heading row consists of 8 values of \(X\) together with the value of \(Y\). All of the values in the spreadsheet have been rounded to 2 decimal places. \begin{table}[h]
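    \begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
    \hline
     & A & B & C & D & E & F & G & H & I & J \\
    \hline
    1 & \(X _ { 1 }\) & \(X _ { 2 }\) & \(X _ { 3 }\) & \(X _ { 4 }\) & \(X _ { 5 }\) & \(X _ { 6 }\) & \(X _ { 7 }\) & \(X _ { 8 }\) & \(Y\) & \\
    \hline
    2 & 6.31 & 2.45 & 3.27 & 3.06 & 4.16 & 1.53 & 0.43 & 7.99 & 3.65 & \\
    \hline
    3 & 1.70 & 1.52 & 7.10 & 8.93 & 6.44 & 2.70 & 9.96 & 7.83 & 5.77 & \\
    \hline
    4 & 9.15 & 0.52 & 4.95 & 6.99 & 6.52 & 3.15 & 0.81 & 5.35 & 4.68 & \\
    \hline
    5 & 0.65 & 2.71 & 7.92 & 9.65 & 0.50 & 4.87 & 6.46 & 2.67 & 4.43 & \\
    \hline
    6 & 3.09 & 6.11 & 3.96 & 0.09 & 0.18 & 4.67 & 0.67 & 6.20 & 3.12 & \\
    \hline
    7 & 7.06 & 5.84 & 1.97 & 3.60 & 9.36 & 1.97 & 4.48 & 3.47 & 4.72 & \\
    \hline
    8 & 1.46 & 1.57 & 5.45 & 0.37 & 3.76 & 7.56 & 8.48 & 9.12 & 4.72 & \\
    \hline
    9 & 9.42 & 1.85 & 4.91 & 1.61 & 1.94 & 8.00 & 1.77 & 5.34 & 4.36 & \\
    \hline
    10 & 2.98 & 5.32 & 2.91 & 4.12 & 9.16 & 1.76 & 9.97 & 6.88 & 5.39 & \\
    \hline
    11 & 2.83 & 3.44 & 3.28 & 7.85 & 1.00 & 0.93 & 8.77 & 4.03 & 4.01 & \\
    \hline
    12 & 4.51 & 0.59 & 5.84 & 9.87 & 8.65 & 3.94 & 7.18 & 0.23 & 5.10 & \\
    \hline
    13 & 4.49 & 0.69 & 3.65 & 8.78 & 4.96 & 8.96 & 3.77 & 1.43 & 4.59 & \\
    \hline
    14 & 6.57 & 8.08 & 4.85 & 6.75 & 7.92 & 0.27 & 9.69 & 4.04 & 6.02 & \\
    \hline
    15 & 8.35 & 1.09 & 8.63 & 8.04 & 7.23 & 2.12 & 2.57 & 9.59 & 5.95 & \\
    \hline
    16 & 5.24 & 9.53 & 6.08 & 8.21 & 3.61 & 7.07 & 6.65 & 7.63 & 6.75 & \\
    \hline
    17 & 7.89 & 5.50 & 3.09 & 0.71 & 6.47 & 5.49 & 6.47 & 4.95 & 5.07 & \\
    \hline
    18 & 8.36 & 7.27 & 2.35 & 9.04 & 0.58 & 2.26 & 3.01 & 7.90 & 5.10 & \\
    \hline
    19 & 3.76 & 1.01 & 9.61 & 9.65 & 7.89 & 9.98 & 6.28 & 4.34 & 6.56 & \\
    \hline
    20 & 9.94 & 6.84 & 3.38 & 5.53 & 0.26 & 8.53 & 5.72 & 5.12 & 5.66 & \\
    \hline
    21 & 7.25 & 9.10 & 0.34 & 2.88 & 4.66 & 2.65 & 6.37 & 7.63 & 5.11 & \\
    \hline
    22 & 7.18 & 7.14 & 5.38 & 0.04 & 4.09 & 6.47 & 4.96 & 4.23 & 4.94 & \\
    \hline
    23 & 8.69 & 5.04 & 4.90 & 2.94 & 2.00 & 4.23 & 4.13 & 0.97 & 4.11 & \\
    \hline
    24 & 3.46 & 6.33 & 0.48 & 9.35 & 0.23 & 1.18 & 7.97 & 6.37 & 4.42 & \\
    \hline
    25 & 2.37 & 7.26 & 7.16 & 1.24 & 5.26 & 2.80 & 3.55 & 3.84 & 4.19 & \\
    \hline
    26 & 2.16 & 8.30 & 7.17 & 3.32 & 2.96 & 1.30 & 9.11 & 0.31 & 4.33 & \\
    \hline
    27 &  &  &  &  &  &  &  &  &  & \\
    \hline
    \end{tabular}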
    \captionsetup{labelformat=empty} \caption{Fig. 4.1}
    \end{table}
  3. Use the spreadsheet to estimate \(\mathrm { P } ( 3 < Y < 6 )\).
  4. Explain why it is not surprising that this estimated probability is substantially greater than the value which you calculated in part (i).
    Marisa wonders whether, even though the sample size is only 8, use of the Central Limit Theorem will provide a good approximation to \(\mathrm { P } ( 3 < Y < 6 )\).
  5. Calculate an estimate of \(\mathrm { P } ( 3 < Y < 6 )\) using the Central Limit Theorem.
    A Normal probability plot of the 25 simulated values of \(Y\) is shown in Fig. 4.2.
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{0c58d4d7-10e9-473a-888a-b407ec90bf08-5_800_1291_306_386} \captionsetup{labelformat=empty} \caption{Fig. 4.2}
    \end{figure}
  6. Explain what the Normal probability plot suggests about the use of the Central Limit Theorem to approximate \(\mathrm { P } ( 3 < Y < 6 )\).
    Marisa now decides to use a spreadsheet with 1000 rows below the heading row, rather than the 25 which she used in the initial simulation shown in Fig. 4.1. She uses a counter to count the number of values of \(Y\) between 3 and 6. This value is 808.
  7. Explain whether the value 808 supports the suggestion that the Central Limit Theorem provides a good approximation to \(\mathrm { P } ( 3 < Y < 6 )\).
    Marisa decides to repeat each of her two simulations many times in order to investigate how variable the probability estimates are in each case.
  8. Explain whether you would expect there to be more, the same or less variability in the probability estimates based on 1000 rows than in the probability estimates based on 25 rows.
  8. Explain whether you would expect there to be more, the same or less variability in the probability estimates based on 1000 rows than in the probability estimates based on 25 rows.
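The quantities compared in this question can be put side by side. The following is a rough sketch, assuming the 25 values of \(Y\) are read from column I of Fig. 4.1 and that the variability of a simulated proportion is summarised by the binomial standard error \(\sqrt{p(1-p)/\text{rows}}\); the helper `proportion_se` and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Column I of Fig. 4.1: the 25 simulated sample means Y (spreadsheet rows 2 to 26).
y_values = [3.65, 5.77, 4.68, 4.43, 3.12, 4.72, 4.72, 4.36, 5.39, 4.01,
            5.10, 4.59, 6.02, 5.95, 6.75, 5.07, 5.10, 6.56, 5.66, 5.11,
            4.94, 4.11, 4.42, 4.19, 4.33]

# Exact results for X ~ U[0, 10].
a, b, n = 0.0, 10.0, 8
p_x = (6 - 3) / (b - a)          # P(3 < X < 6)
mean_x = (a + b) / 2             # E(X)
var_x = (b - a) ** 2 / 12        # Var(X)

# Estimate of P(3 < Y < 6) from the 25 simulated rows.
est_25 = sum(3 < y < 6 for y in y_values) / len(y_values)

# Central Limit Theorem approximation: Y is roughly N(E(X), Var(X) / 8).
clt = NormalDist(mean_x, math.sqrt(var_x / n))
p_clt = clt.cdf(6) - clt.cdf(3)

# Estimate from the 1000-row simulation (count of 808 given in the question).
est_1000 = 808 / 1000

def proportion_se(p, rows):
    """Binomial standard error of a proportion estimated from `rows` rows."""
    return math.sqrt(p * (1 - p) / rows)

se_25 = proportion_se(p_clt, 25)
se_1000 = proportion_se(p_clt, 1000)

print(f"P(3<X<6) = {p_x:.2f}, 25-row estimate = {est_25:.2f}")
print(f"CLT estimate = {p_clt:.4f}, 1000-row estimate = {est_1000:.3f}")
print(f"SE with 25 rows = {se_25:.3f}, SE with 1000 rows = {se_1000:.3f}")
```

The sample mean has a smaller variance than a single value of \(X\), so more of its distribution lies within 3 to 6, and the standard error falls as the number of rows increases.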
Question 5
5 The flight time between two airports is known to be Normally distributed with mean 3.75 hours and standard deviation 0.21 hours. A new airline starts flying the same route. The flight times for a random sample of 12 flights with the new airline are shown in the spreadsheet (Fig. 5), together with the sample mean. \begin{table}[h]
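\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline
 & A & B & C & D & E & F & G & H & I & J & K & L \\
\hline
1 & 3.595 & 3.723 & 3.584 & 3.643 & 3.669 & 3.697 & 3.550 & 3.674 & 3.924 & 3.563 & 3.330 & 3.706 \\
\hline
2 &  &  &  &  &  &  &  &  &  &  &  &  \\
\hline
3 & Mean & 3.638 &  &  &  &  &  &  &  &  &  &  \\
\hline
\end{tabular}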
\captionsetup{labelformat=empty} \caption{Fig. 5}
\end{table}
\section*{(i) In this question you must show detailed reasoning.}
You should assume that:
  • the flight times for the new airline are Normally distributed,
  • the standard deviation of the flight times is still 0.21 hours.
Carry out a test at the \(5 \%\) significance level to investigate whether the mean flight time for the new airline is less than 3.75 hours.
(ii) If both of the assumptions in part (i) were false, name an alternative test that you could carry out to investigate average flight times, stating any assumption necessary for this test.
(iii) If instead the flight times were still Normally distributed but the standard deviation was not known to be 0.21 hours, name another test that you could carry out.
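The test in part (i) can be sketched numerically as follows. This is a minimal illustration, assuming the hypotheses \(\mathrm{H}_0: \mu = 3.75\) and \(\mathrm{H}_1: \mu < 3.75\) with known \(\sigma = 0.21\), so that a one-tailed Normal (\(z\)) test applies; the variable names are illustrative, and the code is a check on the arithmetic rather than a substitute for the detailed reasoning the question requires.

```python
import math
from statistics import NormalDist, mean

# Flight times (hours) for the 12 sampled flights, from row 1 of Fig. 5.
times = [3.595, 3.723, 3.584, 3.643, 3.669, 3.697,
         3.550, 3.674, 3.924, 3.563, 3.330, 3.706]

MU_0 = 3.75      # mean flight time under H0
SIGMA = 0.21     # population standard deviation, assumed unchanged
ALPHA = 0.05     # significance level (one-tailed, lower tail)

n = len(times)
sample_mean = mean(times)
z = (sample_mean - MU_0) / (SIGMA / math.sqrt(n))   # test statistic

z_crit = NormalDist().inv_cdf(ALPHA)   # lower-tail critical value, about -1.645
p_value = NormalDist().cdf(z)          # P(Z <= z) under H0
reject_h0 = z < z_crit

print(f"mean = {sample_mean:.3f}, z = {z:.3f}, critical value = {z_crit:.3f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Comparing `z` with `z_crit` and comparing `p_value` with `ALPHA` are equivalent ways of reaching the same decision.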
Question 6
6 A company has a large fleet of cars. It is claimed that use of a fuel additive will reduce fuel consumption. In order to test this claim a researcher at the company randomly selects 40 of the cars. The fuel consumption of each of the cars is measured, both with and without the fuel additive. The researcher then calculates the difference \(d\) litres per kilometre between the two figures for each car, where \(d\) is the fuel consumption without the additive minus the fuel consumption with the additive. The sample mean of \(d\) is 0.29 and the sample standard deviation is 1.64.
  1. Showing your working, find a 95\% confidence interval for the population mean difference.
  2. Explain whether the confidence interval suggests that, on average, the fuel additive does reduce fuel consumption.
  3. Explain why you can construct the interval in part (i) despite not having any information about the distribution of the population of differences.
  4. Explain why the sample used was random.
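The interval in part (i) can be sketched as follows. This is one possible approach, assuming that with \(n = 40\) the Central Limit Theorem justifies a Normal critical value with the sample standard deviation used in place of \(\sigma\); using a \(t\) critical value with 39 degrees of freedom instead would give a slightly wider interval. The variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 40           # number of cars sampled
d_bar = 0.29     # sample mean difference (without additive minus with additive)
s = 1.64         # sample standard deviation of the differences

# Assumption: large-sample interval using the Normal critical value.
z_crit = NormalDist().inv_cdf(0.975)   # two-sided 95% critical value, about 1.96
se = s / math.sqrt(n)                  # standard error of the mean difference

lower = d_bar - z_crit * se
upper = d_bar + z_crit * se
contains_zero = lower < 0 < upper      # is a mean difference of zero plausible?

print(f"{lower:.3f} < mu_d < {upper:.3f}, contains zero: {contains_zero}")
```

Whether zero lies inside the interval is the point that part (ii) turns on.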