Edexcel FS2 (Further Statistics 2) Specimen

Question 1
  1. The three independent random variables \(A , B\) and \(C\) each have a continuous uniform distribution over the interval \([ 0,5 ]\).
    1. Find the probability that \(A , B\) and \(C\) are all greater than 3
    The random variable \(Y\) represents the maximum value of \(A , B\) and \(C\).
    The cumulative distribution function of \(Y\) is $$\mathrm { F } ( y ) = \begin{cases} 0 & y < 0 \\
    \frac { y ^ { 3 } } { 125 } & 0 \leqslant y \leqslant 5 \\
    1 & y > 5 \end{cases}$$
  2. Using algebraic integration, show that \(\operatorname { Var } ( Y ) = 0.9375\)
  3. Find the mode of \(Y\), giving a reason for your answer.
  4. Describe the skewness of the distribution of \(Y\). Give a reason for your answer.
  5. Find the value of \(k\) such that \(\mathrm { P } ( k < Y < 2 k ) = 0.189\)
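The variance quoted in part 2 can be sanity-checked numerically. The sketch below assumes the density \(f(y) = 3y^2/125\) on \([0, 5]\), obtained by differentiating the given \(\mathrm{F}(y)\); the helper name `moment` is illustrative only, and the check does not replace the algebraic integration the question asks for.

```python
from fractions import Fraction

def moment(n):
    """Return E[Y^n] exactly for the density f(y) = 3y^2/125 on [0, 5]."""
    # Integrating y^n * 3y^2/125 from 0 to 5 gives 3 * 5^(n+3) / (125 * (n+3)).
    return Fraction(3 * 5 ** (n + 3), 125 * (n + 3))

mean_y = moment(1)               # E[Y]
var_y = moment(2) - mean_y ** 2  # Var(Y) = E[Y^2] - (E[Y])^2
print(mean_y, var_y, float(var_y))
```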
Question 2
  1. A researcher claims that, at a river bend, the water gradually gets deeper as the distance from the inner bank increases. He measures the distance from the inner bank, \(b \mathrm {~cm}\), and the depth of a river, \(s \mathrm {~cm}\), at 7 positions. The results are shown in the table below.
\begin{tabular}{|l|c|c|c|c|c|c|c|}
\hline
Position & A & B & C & D & E & F & G \\
\hline
Distance from inner bank \(b\) cm & 100 & 200 & 300 & 400 & 500 & 600 & 700 \\
\hline
Depth \(s\) cm & 60 & 75 & 85 & 76 & 110 & 120 & 104 \\
\hline
\end{tabular}
The Spearman's rank correlation coefficient between \(b\) and \(s\) is \(\frac { 6 } { 7 }\)
  1. Stating your hypotheses clearly, test whether or not the data provides support for the researcher's claim. Use a \(1 \%\) level of significance.
  2. Without re-calculating the correlation coefficient, explain how the Spearman's rank correlation coefficient would change if
    1. the depth for G is 109 instead of 104
    2. an extra value H with distance from the inner bank of 800 cm and depth 130 cm is included.
The researcher decided to collect extra data and found that there were now many tied ranks.
  3. Describe how you would find the correlation with many tied ranks.
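The stated coefficient of \(\frac{6}{7}\) can be reproduced from the seven tabulated positions. This is a minimal sketch using the no-ties formula \(r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}\); the helper `ranks` is hypothetical and would need replacing if tied values were present.

```python
from fractions import Fraction

b = [100, 200, 300, 400, 500, 600, 700]  # distance from inner bank, cm
s = [60, 75, 85, 76, 110, 120, 104]      # depth, cm

def ranks(values):
    """Rank values from 1 (smallest) upwards; assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

n = len(b)
# Sum of squared rank differences across the seven positions.
d_squared = sum((rb - rs) ** 2 for rb, rs in zip(ranks(b), ranks(s)))
r_s = 1 - Fraction(6 * d_squared, n * (n ** 2 - 1))
print(d_squared, r_s)
```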
Question 3
  1. A nutritionist studied the levels of cholesterol, \(X \mathrm { mg } / \mathrm { cm } ^ { 3 }\), of male students at a large college. She assumed that \(X\) was distributed \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\) and examined a random sample of 25 male students. Using this sample she obtained unbiased estimates of \(\mu\) and \(\sigma ^ { 2 }\) as \(\hat { \mu }\) and \(\hat { \sigma } ^ { 2 }\)
A \(95 \%\) confidence interval for \(\mu\) was found to be \(( 1.128,2.232 )\)
  1. Show that \(\hat { \sigma } ^ { 2 } = 1.79\) (correct to 3 significant figures)
  2. Obtain a \(95 \%\) confidence interval for \(\sigma ^ { 2 }\)
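The value in part 1 can be checked by working backwards from the interval for \(\mu\). The sketch assumes the interval has the form \(\hat{\mu} \pm t \, \hat{\sigma} / \sqrt{n}\) with 24 degrees of freedom, and takes the two-tailed 5% critical value 2.064 from standard \(t\)-tables rather than computing it.

```python
import math

lower, upper = 1.128, 2.232  # 95% confidence interval for mu
n = 25                       # sample size
t_crit = 2.064               # assumed table value for t with 24 d.f., 2.5% in each tail

half_width = (upper - lower) / 2
# half_width = t_crit * sigma_hat / sqrt(n), so rearrange for sigma_hat.
sigma_hat = half_width * math.sqrt(n) / t_crit
var_hat = sigma_hat ** 2
print(round(var_hat, 2))
```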
Question 4
  1. The times, \(x\) seconds, taken by the competitors in the 100 m freestyle events at a school swimming gala are recorded. The following statistics are obtained from the data.
\begin{tabular}{|l|c|c|c|}
\cline{2-4}
\multicolumn{1}{c|}{} & No. of competitors & Sample mean \(\overline{x}\) & \(\sum x^{2}\) \\
\hline
Girls & 8 & 83.1 & 55746 \\
\hline
Boys & 7 & 88.9 & 56130 \\
\hline
\end{tabular}
Following the gala, a mother claims that girls are faster swimmers than boys. Assuming that the times taken by the competitors are two independent random samples from normal distributions,
  1. test, at the \(10 \%\) level of significance, whether or not the variances of the two distributions are the same. State your hypotheses clearly.
  2. Stating your hypotheses clearly, test the mother's claim. Use a \(5 \%\) level of significance.
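Both tests start from the unbiased sample variances, which follow from the summary statistics via \(s^2 = \frac{\sum x^2 - n \overline{x}^2}{n - 1}\). The sketch below computes them along with the larger-over-smaller variance ratio; the function name `sample_variance` is illustrative, and comparing the ratio with a critical value from tables is left to the reader.

```python
def sample_variance(n, mean, sum_sq):
    """Unbiased variance estimate from n, the sample mean and the sum of squares."""
    return (sum_sq - n * mean ** 2) / (n - 1)

var_girls = sample_variance(8, 83.1, 55746)
var_boys = sample_variance(7, 88.9, 56130)

# Variance ratio with the larger estimate on top, as used in an F-test.
f_ratio = max(var_girls, var_boys) / min(var_girls, var_boys)
print(round(var_girls, 3), round(var_boys, 3), round(f_ratio, 3))
```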
Question 5
  1. Scaffolding poles come in two sizes, long and short. The length \(L\) of a long pole has the normal distribution \(\mathrm { N } \left( 19.6,0.6 ^ { 2 } \right)\). The length \(S\) of a short pole has the normal distribution \(\mathrm { N } \left( 4.8,0.3 ^ { 2 } \right)\). The random variables \(L\) and \(S\) are independent.
A long pole and a short pole are selected at random.
  1. Find the probability that the length of the long pole is more than 4 times the length of the short pole. Show your working clearly.
Four short poles are selected at random and placed end to end in a row. The random variable \(T\) represents the length of the row.
  2. Find the distribution of \(T\).
  3. Find \(\mathrm { P } ( | L - T | < 0.2 )\)
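The distinction between \(4S\) (one pole scaled by 4) and \(T\) (four independent poles added) can be illustrated with Python's `statistics.NormalDist`, which treats added instances as independent. This sketch assumes the short-pole standard deviation is 0.3, and it is a numerical check only, not the written working the question requires.

```python
from statistics import NormalDist

L = NormalDist(mu=19.6, sigma=0.6)  # long pole
S = NormalDist(mu=4.8, sigma=0.3)   # short pole (standard deviation assumed to be 0.3)

# One short pole scaled by 4: the standard deviation is multiplied by 4.
four_S = S * 4
# Four independent short poles end to end: the variances add.
T = S + S + S + S

# P(L > 4S) = P(L - 4S > 0)
p_long_exceeds = 1 - (L - four_S).cdf(0)
# P(|L - T| < 0.2)
diff = L - T
p_close = diff.cdf(0.2) - diff.cdf(-0.2)
print(four_S, T, p_long_exceeds, p_close)
```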
Question 6
  1. A random sample of 10 female pigs was taken. The number of piglets, \(x\), born to each female pig and their average weight at birth, \(m \mathrm {~kg}\), was recorded. The results were as follows:
\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|}
\hline
Number of piglets, \(x\) & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 \\
\hline
Average weight at birth, \(m\) kg & 1.50 & 1.20 & 1.40 & 1.40 & 1.23 & 1.30 & 1.20 & 1.15 & 1.25 & 1.15 \\
\hline
\end{tabular}
(You may use \(\mathrm { S } _ { x x } = 82.5\) and \(\mathrm { S } _ { m m } = 0.12756\) and \(\mathrm { S } _ { x m } = - 2.29\) )
  1. Find the equation of the regression line of \(m\) on \(x\) in the form \(m = a + b x\) as a model for these results.
  2. Show that the residual sum of squares (RSS) is 0.064 to 3 decimal places.
  3. Calculate the residual values.
  4. Write down the outlier.
    1. Comment on the validity of ignoring this outlier.
    2. Ignoring the outlier, produce another model.
    3. Use this model to estimate the average weight at birth if \(x = 15\)
    4. Comment, giving a reason, on the reliability of your estimate.
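The quoted summary statistics and the residual sum of squares can be reproduced from the ten tabulated pairs. The sketch below fits the least-squares line of \(m\) on \(x\) and lists the residuals \(m_i - (a + b x_i)\); all variable names are illustrative.

```python
x = list(range(4, 14))  # number of piglets
m = [1.50, 1.20, 1.40, 1.40, 1.23, 1.30, 1.20, 1.15, 1.25, 1.15]  # average weight, kg

n = len(x)
x_bar = sum(x) / n
m_bar = sum(m) / n

s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xm = sum((xi - x_bar) * (mi - m_bar) for xi, mi in zip(x, m))

b = s_xm / s_xx        # gradient of the regression line
a = m_bar - b * x_bar  # intercept

# Residual = observed value minus fitted value, one per data point.
residuals = [mi - (a + b * xi) for xi, mi in zip(x, m)]
rss = sum(r ** 2 for r in residuals)

print(round(a, 4), round(b, 4), round(rss, 3))
print([round(r, 3) for r in residuals])
```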
Question 7
  1. Over a period of time, researchers took 10 blood samples from one patient with a blood disease. For each sample, they measured the levels of serum magnesium, \(s \mathrm { mg } / \mathrm { dl }\), in the blood and the corresponding level of the disease protein, \(d \mathrm { mg } / \mathrm { dl }\). One of the researchers coded the data for each sample using \(x = 10 s\) and \(y = 10 ( d - 9 )\) but spilt ink over his work.
The following summary statistics and unfinished scatter diagram are the only remaining information. $$\sum d ^ { 2 } = 1081.74 \quad \mathrm {~S} _ { d s } = 59.524$$ and $$\sum y = 64 \quad \mathrm {~S} _ { x x } = 2658.9$$
\includegraphics[max width=\textwidth, alt={}, center]{e777c787-0d39-4d84-a0f9-fc4a6712184f-22_983_1534_840_303}
  1. Use the formula for \(\mathrm { S } _ { x x }\) to show that \(\mathrm { S } _ { s s } = 26.589\)
  2. Find the value of the product moment correlation coefficient between \(s\) and \(d\).
  3. With reference to the unfinished scatter diagram, comment on your result in part (b).
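The effect of the coding on the summary statistics can be checked numerically. The sketch assumes the standard results that scaling by 10 multiplies \(\mathrm{S}_{xx}\) by \(10^2\) and that \(\sum y = 10 \left( \sum d - 9n \right)\) for \(y = 10(d - 9)\); variable names are illustrative.

```python
import math

n = 10
s_xx = 2658.9      # from the coded data, x = 10s
sum_y = 64         # from the coded data, y = 10(d - 9)
sum_d_sq = 1081.74
s_ds = 59.524

# x = 10s, so S_xx = 10^2 * S_ss.
s_ss = s_xx / 10 ** 2

# y = 10(d - 9), so sum(d) = sum(y)/10 + 9n.
sum_d = sum_y / 10 + 9 * n
s_dd = sum_d_sq - sum_d ** 2 / n

# Product moment correlation coefficient between s and d.
r = s_ds / math.sqrt(s_ss * s_dd)
print(round(s_ss, 3), round(s_dd, 3), round(r, 3))
```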