Edexcel S3 (Statistics 3) 2023 June

Question 1
  1. (a) State two conditions under which it might be more appropriate to use Spearman's rank correlation coefficient rather than the product moment correlation coefficient.
A random sample of 10 melons was taken from a market stall. The length, in centimetres, and maximum diameter, in centimetres, of each melon were recorded. Spearman's rank correlation coefficient between the results was \(-0.673\)
(b) Test, at the \(5 \%\) level of significance, whether or not there is evidence of a correlation. State clearly your hypotheses and the critical value used.
The product moment correlation coefficient between the results was \(-0.525\)
(c) Test, at the \(5 \%\) level of significance, whether or not there is evidence of a negative correlation. State clearly your hypotheses and the critical value used.
Question 2
  2. A business accepts cash, bank cards or mobile apps as payment methods.
The manager wishes to test whether or not there is an association between the payment amount and the payment method used. The manager takes a random sample of 240 payments and records the payment amount and the payment method used. The manager's results are shown in the table.

| Payment method \ Payment amount | Under £50 | £50 to £150 | Over £150 |
|---|---|---|---|
| Cash | 23 | 19 | 18 |
| Bank card | 21 | 32 | 31 |
| Mobile app | 16 | 39 | 41 |

Using these results,
  (a) calculate the expected frequencies for the payment amount under \(\pounds 50\) that
    (i) use cash
    (ii) use a bank card
    (iii) use a mobile app

Given that for the other 6 classes \(\sum \frac { ( O - E ) ^ { 2 } } { E } = 2.4048\) to 4 decimal places,
  (b) test, at the \(5 \%\) level of significance, whether or not there is evidence for an association between the payment amount and the payment method used. You should state the hypotheses, the test statistic, the degrees of freedom and the critical value used for this test.
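
As a rough check on the arithmetic in this question, the sketch below computes expected frequencies under the assumption of no association (row total × column total ÷ grand total) and adds the contributions of the three "under £50" cells to the 2.4048 quoted for the other six classes. The names `observed`, `expected_frequencies` and `test_statistic` are illustrative and not part of the paper.

```python
# Observed counts from the table: rows are payment methods,
# columns are payment amounts (under £50, £50 to £150, over £150).
observed = {
    "Cash": [23, 19, 18],
    "Bank card": [21, 32, 31],
    "Mobile app": [16, 39, 41],
}

grand_total = sum(sum(row) for row in observed.values())
column_totals = [sum(col) for col in zip(*observed.values())]


def expected_frequencies(table):
    """Expected counts under independence: row total * column total / grand total."""
    return {
        method: [sum(row) * col_total / grand_total for col_total in column_totals]
        for method, row in table.items()
    }


expected = expected_frequencies(observed)

# Contribution of the "under £50" column (index 0) to the chi-squared statistic.
under_50 = sum(
    (observed[m][0] - expected[m][0]) ** 2 / expected[m][0] for m in observed
)

# 2.4048 is the value given in the question for the other 6 classes.
test_statistic = under_50 + 2.4048

for method in observed:
    print(method, expected[method][0])
print(round(test_statistic, 4))
```

For a 3 × 3 table the statistic is compared with a chi-squared distribution on (3 − 1)(3 − 1) = 4 degrees of freedom.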
Question 3
  3. A random sample of 2 observations, \(X _ { 1 }\) and \(X _ { 2 }\), is taken from a population with unknown mean \(\mu\) and unknown variance \(\sigma ^ { 2 }\)
  (a) Explain why \(\frac { X _ { 1 } - X _ { 2 } } { \sigma }\) is not a statistic.
$$S = \frac { 3 } { 5 } X _ { 1 } + \frac { 5 } { 7 } X _ { 2 }$$
  (b) Show that \(S\) is a biased estimator of \(\mu\)
  (c) Hence find the bias, in terms of \(\mu\), when \(S\) is used as an estimator of \(\mu\)

Given that \(Y = a X _ { 1 } + b X _ { 2 }\) is an unbiased estimator of \(\mu\), where \(a\) and \(b\) are constants,
  (d) find an equation, in terms of \(a\) and \(b\), that must be satisfied.
  (e) Using your answer to part (d), show that \(\operatorname { Var } ( Y ) = \left( 2 a ^ { 2 } - 2 a + 1 \right) \sigma ^ { 2 }\)
Question 4
  4. It is suggested that the delay, in hours, of certain flights from a particular country may be modelled by the continuous random variable, \(T\), with probability density function
$$f ( t ) = \left\{ \begin{array} { c l } \frac { 2 } { 25 } t & 0 \leqslant t < 5 \\
0 & \text { otherwise } \end{array} \right.$$
  (a) Show that for \(0 \leqslant a \leqslant 4\)
$$P ( a \leqslant T < a + 1 ) = \frac { 1 } { 25 } ( 2 a + 1 )$$
A random sample of 150 of these flights is taken. The delays are summarised in the table below.

| Delay (\(t\) hours) | Frequency |
|---|---|
| \(0 \leqslant t < 1\) | 10 |
| \(1 \leqslant t < 2\) | 13 |
| \(2 \leqslant t < 3\) | 24 |
| \(3 \leqslant t < 4\) | 35 |
| \(4 \leqslant t < 5\) | 68 |

  (b) Test, at the \(5 \%\) significance level, whether the given probability density function is a suitable model for these delays.
    You should state your hypotheses, expected frequencies, test statistic and the critical value used.
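
The expected frequencies for the goodness-of-fit test follow from the interval probability \(\frac { 1 } { 25 } ( 2 a + 1 )\) and the sample size of 150. The sketch below is one way to tabulate them and the resulting \(\sum \frac { ( O - E ) ^ { 2 } } { E }\); the function and variable names are hypothetical.

```python
n_flights = 150
observed = [10, 13, 24, 35, 68]  # frequencies for 0<=t<1, 1<=t<2, ..., 4<=t<5


def interval_probability(a):
    """P(a <= T < a + 1) = (2a + 1) / 25 for 0 <= a <= 4, as stated in the question."""
    return (2 * a + 1) / 25


# Expected frequency for each one-hour interval starting at a = 0, 1, 2, 3, 4.
expected = [n_flights * interval_probability(a) for a in range(5)]

# Pearson chi-squared statistic for observed against expected counts.
chi_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print([round(e, 2) for e in expected])
print(round(chi_squared, 4))
```

No parameters are estimated from the data here, so with 5 classes the statistic would be compared with a chi-squared distribution on 5 − 1 = 4 degrees of freedom.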
Question 5
  5. The continuous random variable \(X\) is normally distributed with
$$X \sim \mathrm { N } \left( \mu , 5 ^ { 2 } \right)$$
A random sample of 10 observations of \(X\) is taken and \(\bar { X }\) denotes the sample mean.
  (a) Show that a \(90 \%\) confidence interval for \(\mu\), in terms of \(\bar { x }\), is given by
$$( \bar { x } - 2.60 , \bar { x } + 2.60 )$$
The continuous random variable \(Y\) is normally distributed with
$$Y \sim \mathrm { N } \left( \mu , 3 ^ { 2 } \right)$$
A random sample of 20 observations of \(Y\) is taken and \(\bar { Y }\) denotes the sample mean.
  (b) Find a \(95 \%\) confidence interval for \(\mu\), in terms of \(\bar { y }\)
  (c) Given that \(X\) and \(Y\) are independent,
    (i) find the distribution of \(\bar { X } - \bar { Y }\)
    (ii) calculate the probability that the two confidence intervals from part (a) and part (b) do not overlap.
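
With a known population standard deviation, the half-width of a two-sided confidence interval for the mean is \(z \sigma / \sqrt { n }\). The sketch below evaluates this with Python's standard library and is only a numerical check; `half_width` is a hypothetical helper, and it uses an exact normal quantile rather than the rounded table values an exam answer would quote.

```python
from math import sqrt
from statistics import NormalDist


def half_width(sigma, n, confidence):
    """Half-width z * sigma / sqrt(n) of a two-sided CI for the mean, sigma known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)


# X ~ N(mu, 5^2), n = 10, 90% interval (the 2.60 quoted in the question).
print(round(half_width(5, 10, 0.90), 4))

# Y ~ N(mu, 3^2), n = 20, 95% interval.
print(round(half_width(3, 20, 0.95), 4))
```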
Question 6
  6. Roxane, a scientist, carries out an investigation into the fat content of different brands of crisps.
Roxane took random samples of different brands of crisps and recorded, in grams, the fat content (\(x\)) of a 30 gram serving. The table below shows some results for just two of these brands.

| Brand | \(\sum x\) | \(\sum x ^ { 2 }\) | \(\bar { x }\) | \(s\) | Sample size |
|---|---|---|---|---|---|
| A | 350 | 1753.9744 | 5.0 | 0.24 | 70 |
| B | 331.5 | 1694.65 | \(\alpha\) | \(\beta\) | 65 |

  (a) Calculate the value of \(\alpha\) and the value of \(\beta\)

Roxane claims that these results show that the crisps from brand A have a lower fat content than the crisps from brand B, as the mean fat content for brand A is, statistically, significantly less than the mean fat content for brand B.
  (b) Stating your hypotheses clearly, carry out a suitable test, at the \(5 \%\) level of significance, to assess Roxane's claim.
    You should state your test statistic and critical value.
  (c) For the test in part (b), state whether or not it is necessary to assume that the fat content of crisps is normally distributed. Give a reason for your answer.
  (d) State an assumption you have made in carrying out the test in part (b).
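
The summary columns in the table are linked by \(\bar { x } = \sum x / n\) and \(s ^ { 2 } = \left( \sum x ^ { 2 } - n \bar { x } ^ { 2 } \right) / ( n - 1 )\). The sketch below assumes \(s\) uses the \(n - 1\) divisor, an assumption that the brand A row supports, and checks the formulas against that row. The helper name `sample_mean_and_sd` is hypothetical.

```python
from math import sqrt


def sample_mean_and_sd(sum_x, sum_x2, n):
    """Return (mean, s) from summary totals, using the n - 1 divisor for s^2."""
    mean = sum_x / n
    variance = (sum_x2 - n * mean ** 2) / (n - 1)
    return mean, sqrt(variance)


# Check against brand A, whose mean and s are given in the table.
mean_a, s_a = sample_mean_and_sd(350, 1753.9744, 70)
print(round(mean_a, 2), round(s_a, 2))
```

The same function can be applied to the brand B totals to obtain \(\alpha\) and \(\beta\).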
Question 7
  7. The random variable \(X\) is defined as
$$X = 4 A - 3 B$$
where \(A\) and \(B\) are independent and
$$A \sim \mathrm { N } \left( 15,5 ^ { 2 } \right) \quad B \sim \mathrm { N } \left( 10,4 ^ { 2 } \right)$$
  (a) Find \(\mathrm { P } ( X < 40 )\)

The random variable \(C\) is such that \(C \sim \mathrm { N } \left( 20 , \sigma ^ { 2 } \right)\)
The random variables \(C _ { 1 } , C _ { 2 }\) and \(C _ { 3 }\) are independent and each has the same distribution as \(C\)
The random variable \(D\) is defined as
$$D = \sum _ { i = 1 } ^ { 3 } C _ { i }$$
Given that \(\mathrm { P } ( A + B + D < 76 ) = 0.2420\) and that \(A , B\) and \(D\) are independent,
  (b) showing your working clearly, find the standard deviation of \(C\)
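
A linear combination of independent normal variables is itself normal, with mean \(\sum c _ { i } \mu _ { i }\) and variance \(\sum c _ { i } ^ { 2 } \sigma _ { i } ^ { 2 }\). The sketch below applies this to \(X = 4 A - 3 B\) as a numerical check; `linear_combination` is a hypothetical helper, not part of the paper.

```python
from math import sqrt
from statistics import NormalDist


def linear_combination(coeffs, means, sds):
    """Distribution of sum(c_i * V_i) for independent V_i ~ N(mean_i, sd_i^2)."""
    mean = sum(c * m for c, m in zip(coeffs, means))
    variance = sum((c * s) ** 2 for c, s in zip(coeffs, sds))
    return NormalDist(mean, sqrt(variance))


# X = 4A - 3B with A ~ N(15, 5^2) and B ~ N(10, 4^2).
x = linear_combination([4, -3], [15, 10], [5, 4])
print(x.mean, round(x.variance, 6))
print(round(x.cdf(40), 4))  # P(X < 40)
```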