Edexcel FS2 (Further Statistics 2) 2024 June

Question 1
Two students are experimenting with some water in a plastic bottle. The bottle is filled with water and a hole is put in the bottom of the bottle. The students record the time, \(t\) seconds, it takes for the water level to fall to each of 10 given values of the height, \(h \mathrm {~cm}\), above the hole.
Student \(A\) models the data with an equation of the form \(t = a + b \sqrt { h }\)
The data is coded using \(v = t - 40\) and \(w = \sqrt { h }\) and the following information is obtained.
$$\sum v = 626 \quad \sum v ^ { 2 } = 64678 \quad \sum w = 22.47 \quad \mathrm {~S} _ { w w } = 4.52 \quad \mathrm {~S} _ { v w } = - 338.83$$
  (a) Find the equation of the regression line of \(t\) on \(\sqrt { h }\) in the form \(t = a + b \sqrt { h }\)
The time it takes the water level to fall to a height of 9 cm above the hole is 47 seconds.
  (b) Calculate the residual for this data point. Give your answer to 2 decimal places.
Given that the residual sum of squares (RSS) for the model of \(t\) on \(\sqrt { h }\) is the same as the RSS for the model of \(v\) on \(w\),
  (c) calculate the RSS for these 10 data points.
Student \(B\) models the data with an equation of the form \(t = c + d h\)
The regression line of \(t\) on \(h\) is calculated and the residual sum of squares (RSS) is found to be 980 to 3 significant figures.
  (d) With reference to part (c) state, giving a reason, whether Student \(B\)'s model or Student \(A\)'s model is the more suitable for these data.
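The arithmetic behind the regression line, the residual and the RSS in this question can be checked with a short script. This is a sketch built only from the summary statistics quoted above, not an official solution; the variable names and the helper `predict_t` are illustrative assumptions.

```python
import math

# Summary statistics quoted in the question (coded with v = t - 40, w = sqrt(h)).
n = 10
sum_v, sum_v2, sum_w = 626, 64678, 22.47
s_ww, s_vw = 4.52, -338.83

# The gradient is unchanged by the shift v = t - 40.
b = s_vw / s_ww
# Intercept of v on w, then add 40 to return to t.
a = (sum_v / n - b * sum_w / n) + 40


def predict_t(h):
    """Fitted time in seconds at height h cm under t = a + b*sqrt(h)."""
    return a + b * math.sqrt(h)


# Residual = observed - fitted, for the data point h = 9, t = 47.
residual = 47 - predict_t(9)

# RSS = S_vv - S_vw^2 / S_ww, where S_vv = sum(v^2) - (sum v)^2 / n.
s_vv = sum_v2 - sum_v**2 / n
rss = s_vv - s_vw**2 / s_ww

print(f"t = {a:.2f} + ({b:.2f}) * sqrt(h)")
print(f"residual = {residual:.2f}, RSS = {rss:.1f}")
```

Comparing `rss` with the value of 980 quoted for the model of \(t\) on \(h\) is then a direct numerical comparison: the model with the smaller RSS fits these 10 points more closely.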
Question 2
An estate agent asks customers to rank 7 features of a house, \(A , B , C , D , E , F\) and \(G\), in order of importance. The responses for two randomly selected customers are in the table below.
\begin{tabular}{|l|c|c|c|c|c|c|c|}
\hline
Rank & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\
\hline
Customer 1 & \(A\) & \(E\) & \(C\) & \(F\) & \(G\) & \(B\) & \(D\) \\
\hline
Customer 2 & \(E\) & \(F\) & \(C\) & \(G\) & \(A\) & \(D\) & \(B\) \\
\hline
\end{tabular}
  (a) Calculate Spearman's rank correlation coefficient for these data.
  (b) Stating your hypotheses and critical value clearly, test at the \(5 \%\) level of significance, whether or not the two customers are generally in agreement.
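A possible way to check the rank correlation is to convert each customer's ordering into a rank per feature and apply \(r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}\). The sketch below assumes the table is read left to right as ranks 1 to 7; the helper `ranks` is a name introduced here for illustration.

```python
features = "ABCDEFG"
customer_1 = "AECFGBD"  # features listed in rank order 1..7
customer_2 = "EFCGADB"


def ranks(order):
    """Map each feature to its rank (1 = most important)."""
    return {feature: position for position, feature in enumerate(order, start=1)}


r1, r2 = ranks(customer_1), ranks(customer_2)
n = len(features)

# Sum of squared rank differences, taken feature by feature.
sum_d2 = sum((r1[f] - r2[f]) ** 2 for f in features)

# Spearman's coefficient (valid here because there are no tied ranks).
r_s = 1 - 6 * sum_d2 / (n * (n**2 - 1))

print(sum_d2, round(r_s, 4))
```

For the hypothesis test, `r_s` would be compared with the one-tailed 5% critical value for \(n = 7\) read from the Spearman table in the formula booklet.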
Question 3
A factory produces bolts. The lengths of the bolts are normally distributed with mean \(\mu \mathrm { mm }\) and standard deviation 0.868 mm
A random sample of 15 of these bolts is taken and the mean length is 30.03 mm
  (a) Calculate a \(90 \%\) confidence interval for \(\mu\)
A suitable test, at the \(10 \%\) level of significance, is carried out using these 15 bolts, to see whether or not there is evidence that the variance of the length of the bolts has increased.
  (b) Calculate the critical region for \(S ^ { 2 }\)
The manager of the factory decides that, in future, he will check each month whether the machine making the bolts is working properly. He uses a \(10 \%\) level of significance to test whether or not there is evidence that
    • the mean length of the bolts has changed
    • the variance of the length of the bolts has increased
The next month a random sample of 15 bolts is taken.
The mean length of these bolts is 30.06 mm and the standard deviation is 1.02 mm
  (c) With reference to your answers to part (a) and part (b), state whether or not there is any evidence that the machine is not working properly.
    Give reasons for your answer.
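The interval and the critical region in this question can be reproduced numerically. The sketch below assumes the usual known-variance interval \(\bar{x} \pm z \, \sigma / \sqrt{n}\) and the test statistic \((n-1)S^2/\sigma^2 \sim \chi^2_{n-1}\). The chi-squared critical value is hardcoded from statistical tables (an assumption, since the Python standard library has no chi-squared quantile function).

```python
from math import sqrt
from statistics import NormalDist

n, sigma, sample_mean = 15, 0.868, 30.03

# 90% interval with known sigma: mean +/- z * sigma / sqrt(n).
z = NormalDist().inv_cdf(0.95)
half_width = z * sigma / sqrt(n)
ci = (sample_mean - half_width, sample_mean + half_width)

# Upper-tail variance test: reject when (n - 1) * S^2 / sigma^2 exceeds
# the chi-squared critical value.
CHI2_14_UPPER_10 = 21.064  # assumed table value: 14 d.f., 10% upper tail
s2_critical = CHI2_14_UPPER_10 * sigma**2 / (n - 1)

# Next month's sample, checked against both results.
new_mean, new_sd = 30.06, 1.02
mean_outside_interval = not (ci[0] <= new_mean <= ci[1])
variance_in_critical_region = new_sd**2 > s2_critical

print(f"CI = ({ci[0]:.3f}, {ci[1]:.3f}), critical region S^2 > {s2_critical:.3f}")
print(mean_outside_interval, variance_in_critical_region)
```

The two boolean flags correspond to the manager's two checks: whether the new mean falls outside the interval and whether the new sample variance falls in the critical region.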
Question 4
The random variable \(G\) has a continuous uniform distribution over the interval \([ - 3,15 ]\)
  (a) Calculate \(\mathrm { P } ( G > 12 )\)
The random variable \(H\) has a continuous uniform distribution over the interval \([ 2 , w ]\)
The random variables \(G\) and \(H\) are independent and \(\mathrm { E } ( H ) = 10\)
  (b) Show that the probability that \(G\) and \(H\) are both greater than 12 is \(\frac { 1 } { 16 }\)
The random variable \(A\) is the area on a coordinate grid bounded by
$$\begin{aligned} & y = - 3 \\ & y = - 4 | x | + k \end{aligned}$$
where \(k\) is a value from the continuous uniform distribution over the interval \([ 5,10 ]\)
  (c) Calculate the expected value of \(A\)
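The probabilities and the expected area can be checked with exact fractions. The sketch rests on one geometric reading, stated here as an assumption: the region is the triangle with apex \((0, k)\) and base on \(y = -3\), where \(|x| = \frac{k+3}{4}\), so that \(A = \frac{(k+3)^2}{4}\). The helper `uniform_tail` is a name introduced for illustration.

```python
from fractions import Fraction


def uniform_tail(lower, upper, x):
    """P(U > x) for U uniform on [lower, upper], with lower <= x <= upper."""
    return Fraction(upper - x, upper - lower)


p_g = uniform_tail(-3, 15, 12)

# E(H) = (2 + w) / 2 = 10 gives the upper end of H's interval.
w = 2 * 10 - 2
p_h = uniform_tail(2, w, 12)

# G and H are independent, so the joint probability is the product.
p_both = p_g * p_h

# Assumed geometry: triangle of height k + 3 and base (k + 3) / 2,
# so A = (k + 3)^2 / 4 and E(A) = (E(k^2) + 6 E(k) + 9) / 4.
lo, hi = 5, 10
e_k = Fraction(lo + hi, 2)
var_k = Fraction((hi - lo) ** 2, 12)
e_k2 = var_k + e_k**2
expected_area = (e_k2 + 6 * e_k + 9) / 4

print(p_g, p_both, expected_area)
```

Using `Fraction` keeps every step exact, which makes it easy to compare against the value \(\frac{1}{16}\) that the question asks to be shown.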
Question 5
A continuous random variable \(X\) has probability density function
$$f ( x ) = \left\{ \begin{array} { c l } a x ^ { - 2 } - b x ^ { - 3 } & 2 \leqslant x < \infty \\ 0 & \text { otherwise } \end{array} \right.$$
where \(a\) and \(b\) are constants.
Given that \(\mathrm { P } ( X \leqslant 4 ) = \frac { 3 } { 8 }\)
  (a) use algebraic integration to show that \(a = 3\)
    Show your working clearly.
  (b) Find the exact value of the median of \(X\)
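As a numerical check (not a substitute for the algebraic working the question asks for), integrating \(f\) gives two linear conditions on \(a\) and \(b\): \(\frac{a}{2} - \frac{b}{8} = 1\) from the total probability and \(\frac{a}{4} - \frac{3b}{32} = \frac{3}{8}\) from \(\mathrm{P}(X \leqslant 4)\). The sketch below solves them by Cramer's rule and then finds the median from \(F(x) = 1 - \frac{a}{x} + \frac{b}{2x^2}\); these intermediate equations are derived here and are not part of the original paper.

```python
from fractions import Fraction
from math import sqrt

# Coefficients of the two linear equations in a and b.
a11, a12, c1 = Fraction(1, 2), Fraction(-1, 8), Fraction(1)
a21, a22, c2 = Fraction(1, 4), Fraction(-3, 32), Fraction(3, 8)

# Cramer's rule for the 2x2 system.
det = a11 * a22 - a12 * a21
a = (c1 * a22 - a12 * c2) / det
b = (a11 * c2 - c1 * a21) / det


def cdf(x):
    """F(x) for x >= 2, using the normalisation a/2 - b/8 = 1."""
    return float(1 - a / x + b / (2 * x**2))


# F(m) = 1/2 rearranges to m^2 - 2*a*m + b = 0; the larger root is the
# one that lies in the support x >= 2.
median = float(a) + sqrt(float(a**2 - b))

print(a, b, median, cdf(median))
```

With the values of `a` and `b` found, the quadratic for the median has roots \(a \pm \sqrt{a^2 - b}\), and only the larger one satisfies \(x \geqslant 2\).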
Question 6
A researcher set up a trial to assess the effect that a food supplement has on the increase in weight of Herdwick lambs. The researcher randomly selected 8 sets of twin lambs. One of each set of twins was given the food supplement and the other had no food supplement. The gain in weight, in kg, of each lamb over the period of the trial was recorded.
\begin{tabular}{|l|l|c|c|c|c|c|c|c|c|}
\hline
\multicolumn{2}{|l|}{Set of twin lambs} & \(A\) & \(B\) & \(C\) & \(D\) & \(E\) & \(F\) & \(G\) & \(H\) \\
\hline
\multirow{2}{*}{Weight gain (kg)} & With food supplement & 4.1 & 5.3 & 6.0 & 3.6 & 5.9 & 4.2 & 7.1 & 6.4 \\
\cline{2-10}
 & No food supplement & 5.0 & 4.8 & 5.2 & 3.4 & 5.1 & 3.9 & 7.0 & 6.5 \\
\hline
\end{tabular}
  (a) State why a two sample \(t\)-test is not suitable for use with these data.
  (b) Suggest 2 other factors about the lambs that the researcher may need to control when selecting the sample.
  (c) State one assumption, in context, that needs to be made for a paired \(t\)-test to be valid.
For a pair of twin lambs, the random variable \(W\) represents the weight gain of the lamb given the food supplement minus the weight gain of the lamb not given the food supplement.
  (d) Using the data in the table, calculate a \(98 \%\) confidence interval for the mean of \(W\)
    Show your working clearly.
The researcher believes that the mean of \(W\) is greater than 200 g
  (e) Stating your hypotheses clearly, use your confidence interval to explain whether or not there is evidence to support the researcher's belief.
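The paired-difference interval can be reproduced from the table. The sketch below assumes the standard interval \(\bar{d} \pm t_{7} \, s_d / \sqrt{n}\); the \(t\) critical value is hardcoded from statistical tables (an assumption, since the Python standard library has no \(t\) quantile function).

```python
from math import sqrt
from statistics import mean, stdev

with_supplement = [4.1, 5.3, 6.0, 3.6, 5.9, 4.2, 7.1, 6.4]
no_supplement = [5.0, 4.8, 5.2, 3.4, 5.1, 3.9, 7.0, 6.5]

# W = gain with supplement minus gain without, one value per set of twins.
differences = [x - y for x, y in zip(with_supplement, no_supplement)]
n = len(differences)
d_bar = mean(differences)
s_d = stdev(differences)  # sample standard deviation, divisor n - 1

T_7_UPPER_1 = 2.998  # assumed table value: t with 7 d.f., 1% upper tail
half_width = T_7_UPPER_1 * s_d / sqrt(n)
ci = (d_bar - half_width, d_bar + half_width)

# The researcher's claim is in grams; the data are in kilograms.
claimed_mean_kg = 0.2
claim_inside_interval = ci[0] <= claimed_mean_kg <= ci[1]

print(f"mean = {d_bar:.4f}, CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(claim_inside_interval)
```

Note the unit conversion: the belief is stated as 200 g, so it is compared with the interval as 0.2 kg.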
Question 7
Two organisations are each asked to carry out a survey to find out the proportion, \(p\), of the population that would vote for a particular political party.
The first organisation finds that out of \(m\) people, \(X\) would vote for this particular political party. The second organisation finds that out of \(n\) people, \(Y\) would vote for this particular political party. An unbiased estimator, \(Q\), of \(p\) is proposed where
$$Q = k \left( \frac { X } { m } + \frac { Y } { n } \right)$$
  (a) Show that \(k = \frac { 1 } { 2 }\)
A second unbiased estimator, \(R\), of \(p\) is proposed where
$$R = \frac { a X } { m } + \frac { b Y } { n }$$
  (b) Show that \(a + b = 1\)
Given that \(m = 100\) and \(n = 200\) and that \(R\) is a better estimator of \(p\) than \(Q\)
  (c) calculate the range of possible values of \(a\)
    Show your working clearly.
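The comparison of the two estimators can be checked exactly. The sketch assumes \(X \sim \mathrm{B}(m, p)\) and \(Y \sim \mathrm{B}(n, p)\) independently, takes "better" to mean smaller variance, and uses \(b = 1 - a\). Under those assumptions the common factor \(p(1-p)\) cancels, leaving a quadratic inequality in \(a\). The helper `var_coefficient` is a name introduced for illustration.

```python
from fractions import Fraction
from math import isqrt

m, n = 100, 200


def var_coefficient(a, b):
    """Var(aX/m + bY/n) divided by p(1 - p), for independent binomial X, Y."""
    return Fraction(a) ** 2 / m + Fraction(b) ** 2 / n


var_q = var_coefficient(Fraction(1, 2), Fraction(1, 2))

# R beats Q when a^2/m + (1 - a)^2/n < var_q. Multiplying by m*n gives
# (m + n) a^2 - 2 m a + (m - m n var_q) < 0.
A = Fraction(m + n)
B = Fraction(-2 * m)
C = m - m * n * var_q

# The discriminant is a perfect square for these values of m and n.
disc = B**2 - 4 * A * C
root = Fraction(isqrt(disc.numerator), isqrt(disc.denominator))
assert root**2 == disc

a_low = (-B - root) / (2 * A)
a_high = (-B + root) / (2 * A)

print(var_q, a_low, a_high)
```

Because the quadratic opens upwards, the inequality holds strictly between the two roots `a_low` and `a_high`.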
Question 8
A company packs chickpeas into small bags and large bags.
The weight of a small bag of chickpeas is normally distributed with mean 500 g and standard deviation 5 g
A random sample of 3 small bags of chickpeas is taken.
  (a) Find the probability that the total weight of these 3 bags of chickpeas is between 1490 g and 1530 g
The weight of a large bag of chickpeas is normally distributed with mean 1020 g and standard deviation 20 g
One large bag and one small bag of chickpeas are chosen at random.
  (b) Calculate the probability that the weight of the large bag of chickpeas is at least 30 g more than twice the weight of the small bag of chickpeas.
    Show your working clearly.
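Both probabilities come from linear combinations of independent normal variables, which can be sketched with the standard library's `NormalDist`. The sketch assumes the bag weights are independent; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

small = NormalDist(mu=500, sigma=5)
large = NormalDist(mu=1020, sigma=20)

# Total of three independent small bags: means add and variances add.
total = NormalDist(mu=3 * small.mean, sigma=sqrt(3 * small.variance))
p_total = total.cdf(1530) - total.cdf(1490)

# D = L - 2S for one large and one small bag: the variance of 2S is
# 4 * Var(S), not 2 * Var(S), because it is twice a single bag.
diff = NormalDist(
    mu=large.mean - 2 * small.mean,
    sigma=sqrt(large.variance + 4 * small.variance),
)
p_diff = 1 - diff.cdf(30)

print(f"{p_total:.4f} {p_diff:.4f}")
```

The contrast between the two parts is the design point: a sum of three different bags has variance \(3\sigma^2\), whereas twice one bag has variance \(4\sigma^2\).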