Edexcel FS2 (Further Statistics 2) 2021 June

Question 1
  1. Anisa is investigating the relationship between marks on a History test and marks on a Geography test. She collects information from 7 students. She wants to calculate the Spearman's rank correlation coefficient for the 7 students so she ranks their performance on each test.
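| Student | History mark | Geography mark | History rank | Geography rank |
|---|---|---|---|---|
| A | 76 | 58 | 1 | 3 |
| B | 70 | 60 | 2 | 2 |
| C | 64 | 57 | \(s\) | \(t\) |
| D | 64 | 63 | \(s\) | 1 |
| E | 64 | 57 | \(s\) | \(t\) |
| F | 59 | 50 | 6 | 7 |
| G | 55 | 52 | 7 | 6 |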
  1. Write down the value of \(s\) and the value of \(t\).
The full product moment correlation coefficient (pmcc) formula is used with the ranks to calculate the Spearman's rank correlation coefficient instead of \(r _ { s } = 1 - \frac { 6 \Sigma d ^ { 2 } } { n \left( n ^ { 2 } - 1 \right) }\) and the value obtained is 0.7106 to 4 significant figures.
  2. Explain why the full pmcc formula is used to carry out the calculation.
  3. Stating your hypotheses clearly, test whether or not there is evidence to suggest that the higher a student ranks in the History test, the higher the student ranks in the Geography test. Use a \(5 \%\) level of significance.
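The quoted value of 0.7106 can be checked with a short sketch. It assumes the usual convention that tied marks share the mean of the ranks they occupy, which fixes \(s\) and \(t\); the function names `pmcc` and `shortcut` are illustrative only. The second function evaluates the shortcut formula on the same ranks to show that it gives a different value when ranks are tied.

```python
from math import sqrt

# Assumption: tied marks share the mean of the ranks they would occupy.
s = (3 + 4 + 5) / 3  # three students tied on 64 in History
t = (4 + 5) / 2      # two students tied on 57 in Geography

# Ranks for students A to G, in table order.
history_rank = [1, 2, s, s, s, 6, 7]
geography_rank = [3, 2, t, 1, t, 7, 6]


def pmcc(x, y):
    """Product moment correlation coefficient, S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    s_xx = sum(v * v for v in x) - sum(x) ** 2 / n
    s_yy = sum(v * v for v in y) - sum(y) ** 2 / n
    s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    return s_xy / sqrt(s_xx * s_yy)


def shortcut(x, y):
    """Shortcut 1 - 6*sum(d^2) / (n(n^2 - 1)); exact only when there are no ties."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d_squared / (n * (n * n - 1))


r_full = pmcc(history_rank, geography_rank)
r_short = shortcut(history_rank, geography_rank)
print(round(r_full, 4), round(r_short, 4))
```

Under this assumption the full formula reproduces the stated 0.7106, while the shortcut does not.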
Question 2
  1. A company produces two colours of candles, blue and white. The standard deviation of the burning times of the blue candles is 2.6 minutes and the standard deviation of the burning times of the white candles is 2.4 minutes.
Nissim claims that the mean burning time of blue candles is more than 5 minutes greater than the mean burning time of white candles. A random sample of 90 blue candles is found to have a mean burning time of 39.5 minutes. A random sample of 80 white candles is found to have a mean burning time of 33.7 minutes.
  1. Stating your hypotheses clearly, use a suitable test to assess Nissim's belief. Use a \(1 \%\) level of significance.
  2. Explain how the hypothesis test in part (a) would be carried out differently if the variances of the burning times of candles were unknown.
The burning times for the candles may not follow a normal distribution.
  3. Describe the effect this would have on the calculations in the hypothesis test in part (a). Give a reason for your answer.
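The arithmetic for part (a) can be sketched as follows. This assumes a one-tailed two-sample z-test with known variances, testing \(\mathrm{H}_0: \mu_b - \mu_w = 5\) against \(\mathrm{H}_1: \mu_b - \mu_w > 5\); the variable names are illustrative, and it is not presented as the official mark scheme method.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the question.
sigma_blue, sigma_white = 2.6, 2.4
n_blue, n_white = 90, 80
mean_blue, mean_white = 39.5, 33.7
claimed_difference = 5.0  # H0: mu_blue - mu_white = 5

# Standard error of the difference between the two sample means.
standard_error = sqrt(sigma_blue**2 / n_blue + sigma_white**2 / n_white)

# Test statistic and one-tailed 1% critical value.
z = (mean_blue - mean_white - claimed_difference) / standard_error
critical = NormalDist().inv_cdf(0.99)

reject_h0 = z > critical
print(round(z, 4), round(critical, 4), reject_h0)
```

Because both samples are large, the same calculation would be used with sample variances in place of the known ones, and the Central Limit Theorem would justify treating the sample means as approximately normal.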
Question 3
  1. The continuous random variable \(X\) has cumulative distribution function given by
$$\mathrm { F } ( x ) = \left\{ \begin{array} { c r } 0 & x < 2 \\
1.25 - \frac { 2.5 } { x } & 2 \leqslant x \leqslant 10 \\
1 & x > 10 \end{array} \right.$$
  1. Find \(\mathrm { P } ( \{ X < 5 \} \cup \{ X > 8 \} )\)
  2. Find the median of \(X\).
  3. Find \(\mathrm { E } \left( X ^ { 2 } \right)\)
    1. Sketch the probability density function of \(X\).
    2. Describe the skewness of the distribution of \(X\).
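The following sketch works directly from the cumulative distribution function. It assumes the density on \(2 \leqslant x \leqslant 10\) is the derivative \(\mathrm{f}(x) = 2.5 / x^{2}\), and uses a simple midpoint rule for \(\mathrm{E}(X^{2})\); the helper names are illustrative.

```python
def cdf(x):
    """Cumulative distribution function F(x) from the question."""
    if x < 2:
        return 0.0
    if x > 10:
        return 1.0
    return 1.25 - 2.5 / x


def pdf(x):
    """Density obtained by differentiating F(x) on [2, 10]."""
    return 2.5 / x**2 if 2 <= x <= 10 else 0.0


# The events X < 5 and X > 8 are disjoint, so their probabilities add.
p_union = cdf(5) + (1 - cdf(8))

# Median m solves 1.25 - 2.5/m = 0.5.
median = 2.5 / (1.25 - 0.5)

# Midpoint-rule approximation of E(X^2), the integral of x^2 f(x) over [2, 10].
steps = 10_000
width = (10 - 2) / steps
e_x2 = sum(
    (2 + (i + 0.5) * width) ** 2 * pdf(2 + (i + 0.5) * width) * width
    for i in range(steps)
)

print(p_union, median, round(e_x2, 6))
```

The median lies well below the midpoint of the interval and the density decreases across it, which is consistent with positive skew.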
Question 4
  1. A researcher is investigating the relationship between elevation, \(x\) metres, and annual mean temperature, \(t ^ { \circ } \mathrm { C }\).
From a random sample of 20 weather stations in Switzerland, the following results were obtained $$\mathrm { S } _ { x x } = 8820655 \quad \mathrm { S } _ { t t } = 444.7 \quad \sum x = 28130 \quad \sum t = 94.62$$ The product moment correlation coefficient for these data is found to be \(-0.959\).
  1. Interpret the value of this correlation coefficient.
  2. Show that the equation of the regression line of \(t\) on \(x\) can be written as $$t = 14.3 - 0.00681 x$$
The random variable \(W\) represents the elevations of the weather stations in kilometres.
  3. Write down the equation of the regression line of \(t\) on \(w\) for these 20 weather stations in the form \(t = a + b w\)
  4. Show that the residual sum of squares (RSS) for the model for \(t\) and \(x\) is 35.7 correct to one decimal place.
One of the weather stations in the sample had a recorded elevation of 1100 metres and an annual mean temperature of \(1.4 ^ { \circ } \mathrm { C }\).
    1. Calculate this weather station's contribution to the residual sum of squares. Give your answer as a percentage.
    2. Comment on the data for this weather station in light of your answer to part (e)(i).
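The two "show that" results can be checked from the summary statistics alone. The sketch below assumes the standard identities \(\mathrm{S}_{xt} = r \sqrt{\mathrm{S}_{xx} \mathrm{S}_{tt}}\), \(b = \mathrm{S}_{xt} / \mathrm{S}_{xx}\), \(a = \bar{t} - b \bar{x}\) and \(\mathrm{RSS} = \mathrm{S}_{tt} (1 - r^{2})\); variable names are illustrative. The percentage contribution is computed with unrounded coefficients, so it may differ slightly from a value obtained with the rounded line.

```python
from math import sqrt

# Summary statistics from the question.
n = 20
s_xx, s_tt = 8820655, 444.7
sum_x, sum_t = 28130, 94.62
r = -0.959

# Recover S_xt from r, then the regression coefficients of t on x.
s_xt = r * sqrt(s_xx * s_tt)
b = s_xt / s_xx
a = sum_t / n - b * sum_x / n

# Residual sum of squares for the model.
rss = s_tt * (1 - r**2)

# Elevation in kilometres: w = x / 1000, so the gradient is multiplied by 1000.
b_km = 1000 * b

# Contribution of the station at 1100 m with mean temperature 1.4 degrees C.
residual = 1.4 - (a + b * 1100)
share_percent = 100 * residual**2 / rss

print(round(a, 1), round(b, 5), round(b_km, 2), round(rss, 1))
print(round(share_percent, 1))
```

A single station accounting for most of the RSS would suggest that its data point is a possible outlier.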
Question 5
  1. The continuous random variable \(X\) is uniformly distributed over the interval \([ 0,4 \beta ]\), where \(\beta\) is an unknown constant.
Three independent observations, \(X _ { 1 } , X _ { 2 }\) and \(X _ { 3 }\), are taken of \(X\) and the following estimators for \(\beta\) are proposed $$\begin{aligned} & A = \frac { X _ { 1 } + X _ { 2 } } { 2 } \\
& B = \frac { X _ { 1 } + 2 X _ { 2 } + 3 X _ { 3 } } { 8 } \\
& C = \frac { X _ { 1 } + 2 X _ { 2 } - X _ { 3 } } { 8 } \end{aligned}$$
  1. Calculate the bias of \(A\), the bias of \(B\) and the bias of \(C\)
  2. By calculating the variances, explain which of \(B\) or \(C\) is the better estimator for \(\beta\)
  3. Find an unbiased estimator for \(\beta\)
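Since each estimator is a linear combination of independent observations, its bias and variance follow from \(\mathrm{E}(X) = 2\beta\) and \(\mathrm{Var}(X) = \frac{(4\beta)^{2}}{12} = \frac{4}{3}\beta^{2}\). The sketch below works with exact fractions, reporting the bias as a multiple of \(\beta\) and the variance as a multiple of \(\beta^{2}\); the helper `bias_and_variance` is a hypothetical name introduced for illustration.

```python
from fractions import Fraction

MEAN_COEFF = Fraction(2)     # E(X) = 2 * beta for X ~ U[0, 4*beta]
VAR_COEFF = Fraction(4, 3)   # Var(X) = (4/3) * beta^2


def bias_and_variance(weights, divisor):
    """Return (bias / beta, variance / beta^2) for sum(w_i * X_i) / divisor."""
    expectation = Fraction(sum(weights), divisor) * MEAN_COEFF
    bias = expectation - 1  # the estimator targets beta itself
    variance = Fraction(sum(w * w for w in weights), divisor**2) * VAR_COEFF
    return bias, variance


estimators = {
    "A": ((1, 1, 0), 2),
    "B": ((1, 2, 3), 8),
    "C": ((1, 2, -1), 8),
}

results = {name: bias_and_variance(w, d) for name, (w, d) in estimators.items()}
for name, (bias, variance) in results.items():
    print(name, "bias:", bias, "variance:", variance)
```

\(B\) and \(C\) have biases of equal magnitude, so comparing their variances is enough to rank them. Any estimator whose expectation has been rescaled to \(\beta\), for example a suitable multiple of \(A\), is unbiased.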
Question 6
  1. Elsa is collecting information on the wingspan of two different species of butterfly, Ringlet and Meadow Brown. She takes a random sample of each type of butterfly. The wingspans, \(w \mathrm {~cm}\), are summarised in the table below. The wingspans of Ringlet and Meadow Brown butterflies each follow normal distributions.
| Species | Number of butterflies | \(\sum w\) | \(\sum w ^ { 2 }\) |
|---|---|---|---|
| Ringlet | 8 | 410 | 21032 |
| Meadow Brown | 6 | 294 | 14426 |
  1. Test, at the \(2 \%\) level of significance, whether or not there is evidence that the variance of the wingspans of Ringlet butterflies is different from the variance of the wingspans of Meadow Brown butterflies. You should state your hypotheses clearly.
The \(k \%\) confidence interval for the variance of the wingspans of Meadow Brown butterflies is (1.194, 48.54).
  2. Find the value of \(k\)
  3. Calculate a \(95 \%\) confidence interval for the difference between the mean wingspan of the Ringlet butterfly and the mean wingspan of the Meadow Brown butterfly.
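The summary statistics needed for all three parts can be computed from the table. The sketch below assumes the unbiased sample variance \(s^{2} = \frac{\sum w^{2} - (\sum w)^{2}/n}{n - 1}\), an F ratio with the larger variance on top, and a pooled variance for part (c). It stops short of looking up critical values, which would come from statistical tables; instead it reports the chi-squared quantiles implied by the stated interval, using \(\left( \frac{(n-1)s^{2}}{\chi^{2}_{\text{upper}}}, \frac{(n-1)s^{2}}{\chi^{2}_{\text{lower}}} \right)\).

```python
def sample_variance(n, sum_w, sum_w2):
    """Unbiased sample variance from n, sum(w) and sum(w^2)."""
    return (sum_w2 - sum_w**2 / n) / (n - 1)


# Summary data from the table.
n_ringlet, n_meadow = 8, 6
var_ringlet = sample_variance(n_ringlet, 410, 21032)
var_meadow = sample_variance(n_meadow, 294, 14426)

# F ratio with the larger sample variance in the numerator: F(5, 7) here.
f_ratio = max(var_ringlet, var_meadow) / min(var_ringlet, var_meadow)

# Pooled variance estimate for the difference-of-means interval.
pooled_variance = (
    (n_ringlet - 1) * var_ringlet + (n_meadow - 1) * var_meadow
) / (n_ringlet + n_meadow - 2)

# Chi-squared quantiles (5 degrees of freedom) implied by the interval (1.194, 48.54).
chi_upper = (n_meadow - 1) * var_meadow / 1.194
chi_lower = (n_meadow - 1) * var_meadow / 48.54

print(round(var_ringlet, 4), round(var_meadow, 4), round(f_ratio, 4))
print(round(pooled_variance, 4), round(chi_upper, 2), round(chi_lower, 3))
```

Matching the two implied quantiles against the 5 degrees of freedom row of a chi-squared table gives the tail probability, and hence \(k\).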
Question 7
  1. The weights of a particular type of apple, \(A\) grams, and a particular type of orange, \(R\) grams, each follow independent normal distributions.
$$A \sim \mathrm {~N} \left( 160,12 ^ { 2 } \right) \quad R \sim \mathrm {~N} \left( 140,10 ^ { 2 } \right)$$
  1. Find the distribution of
    1. \(A + R\)
    2. the total weight of 2 randomly selected apples.
A box contains 4 apples and 1 orange only. Jesse selects 2 pieces of fruit at random from the box.
  2. Find the probability that the total weight of the 2 pieces of fruit exceeds 310 grams.
From a large number of apples and oranges, Celeste selects \(m\) apples and 1 orange at random. The random variable \(W\) is given by $$W = \left( \sum _ { i = 1 } ^ { m } A _ { i } \right) - n \times R$$ where \(n\) is a positive integer.
  3. Given that the middle \(95 \%\) of the distribution of \(W\) lies between 1100.08 and 1499.92 grams, find the value of \(m\) and the value of \(n\).
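Parts (b) and (c) can be explored numerically. The sketch below assumes that Jesse selects without replacement with every pair of fruit equally likely, and that \(W \sim \mathrm{N}(160m - 140n, \; 144m + 100n^{2})\) by independence. The search range of 1 to 100 and the matching tolerances are arbitrary choices made for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

# Part (b): condition on which pair of fruit is selected.
p_two_apples = comb(4, 2) / comb(5, 2)
p_apple_orange = 1 - p_two_apples

two_apples = NormalDist(160 + 160, sqrt(12**2 + 12**2))
apple_orange = NormalDist(160 + 140, sqrt(12**2 + 10**2))

p_exceeds_310 = (
    p_two_apples * (1 - two_apples.cdf(310))
    + p_apple_orange * (1 - apple_orange.cdf(310))
)

# Part (c): the middle 95% is mean +/- z * sd, with z about 1.96.
lower, upper = 1100.08, 1499.92
z = NormalDist().inv_cdf(0.975)
target_mean = (lower + upper) / 2
target_sd = (upper - lower) / (2 * z)

# Brute-force search over small positive integers m and n.
solutions = [
    (m, n)
    for m in range(1, 101)
    for n in range(1, 101)
    if abs(160 * m - 140 * n - target_mean) < 0.5
    and abs(sqrt(144 * m + 100 * n * n) - target_sd) < 0.05
]

print(round(p_exceeds_310, 4), solutions)
```

An exhaustive search is used here instead of solving the simultaneous equations by hand because it also confirms that the pair of values is unique within the range searched.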