Edexcel S3 (Statistics 3) 2022 June

Question 1
  1. The table below shows the number of televised tournaments won and the total number of tournaments won by the top 10 ranked darts players in 2020.
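| Player's rank | Televised tournaments won | Total tournaments won |
| --- | --- | --- |
| 1 | 55 | 135 |
| 2 | 7 | 33 |
| 3 | 5 | 17 |
| 4 | 2 | 14 |
| 5 | 4 | 9 |
| 6 | 2 | 5 |
| 7 | 9 | 36 |
| 8 | 0 | 15 |
| 9 | 3 | 3 |
| 10 | 0 | 13 |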
Michael did not want to calculate Spearman’s rank correlation coefficient between player's rank and the rank in televised tournaments won because there would be tied ranks.
  1. Explain how Michael could have dealt with these tied ranks.
Given that the largest number of total tournaments won is ranked number 1,
  2. calculate the value of Spearman's rank correlation coefficient between player's rank and the rank in total tournaments won.
  3. Stating your hypotheses and critical value clearly, test at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between player's rank and the rank in total tournaments won for these darts players.
Michael does not believe that there is a positive correlation between player's rank and the rank in total number of tournaments won.
  4. Find the largest level of significance, that is given in the tables provided, which could be used to support Michael's claim.
    You must state your critical value.
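The calculation asked for in part 2 can be sketched in a few lines of Python. This is an illustrative check only, not a model answer: the variable names are hypothetical, and the data are transcribed from the table above on the assumption that the first row reads rank 1, 55 televised wins, 135 total wins.

```python
# Data transcribed from the table: player's rank -> total tournaments won.
totals = {1: 135, 2: 33, 3: 17, 4: 14, 5: 9, 6: 5, 7: 36, 8: 15, 9: 3, 10: 13}

# Rank the totals so that the largest total gets rank 1 (no ties occur here).
ordered = sorted(totals.values(), reverse=True)
total_rank = {player: ordered.index(won) + 1 for player, won in totals.items()}

# Spearman's formula for untied ranks: r_s = 1 - 6 * sum(d^2) / (n (n^2 - 1)).
n = len(totals)
sum_d2 = sum((player - total_rank[player]) ** 2 for player in totals)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(sum_d2, round(r_s, 4))
```

The resulting coefficient would then be compared with the tabulated one-tailed critical value for a sample of size 10 at the chosen significance level.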
Question 2
  1. An experiment is conducted to compare the heat retention of two brands of flasks, brand \(A\) and brand \(B\). Both brands of flask have a capacity of 750 ml.
In the experiment 750 ml of boiling water is poured into the flask, which is then sealed. Four hours later the temperature, in \({ } ^ { \circ } \mathrm { C }\), of the water in the flask is recorded.
A random sample of 100 flasks from brand \(A\) gives the following summary statistics, where \(x\) is the temperature of the water in the flask after four hours.
$$\sum x = 7690 \quad \sum ( x - \bar { x } ) ^ { 2 } = 669.24$$
  1. Find unbiased estimates for the mean and variance of the temperature of the water, after four hours, for brand \(A\).
A random sample of 80 flasks from brand \(B\) gives the following results, where \(y\) is the temperature of the water in the flask after four hours.
$$\bar { y } = 75.9 \quad s _ { y } = 2.2$$
  2. Test, at the \(1 \%\) significance level, whether there is a difference in the mean water temperature after four hours between brand \(A\) and brand \(B\). You should state your hypotheses, test statistic and critical value clearly.
  3. Explain why it is reasonable to assume that \(\sigma ^ { 2 } = s ^ { 2 }\) in this situation.
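A minimal sketch of the arithmetic behind parts 1 and 2, assuming the usual large-sample two-sample z-test in which each sample variance stands in for the population variance. The variable names are hypothetical and the script is a numerical check, not a full written answer with hypotheses.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics given in the question.
n_a, sum_x, sxx = 100, 7690, 669.24   # brand A: n, sum of x, sum of (x - xbar)^2
n_b, mean_b, s_b = 80, 75.9, 2.2      # brand B: n, sample mean, sample sd

# Unbiased estimates for brand A.
mean_a = sum_x / n_a
var_a = sxx / (n_a - 1)

# Two-tailed test of H0: mu_A = mu_B against H1: mu_A != mu_B at the 1% level.
standard_error = sqrt(var_a / n_a + s_b ** 2 / n_b)
z = (mean_a - mean_b) / standard_error
critical = NormalDist().inv_cdf(1 - 0.01 / 2)
reject_h0 = abs(z) > critical

print(mean_a, round(var_a, 2), round(z, 3), round(critical, 4), reject_h0)
```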
Question 3
  1. The random variable \(X\) is normally distributed with unknown mean \(\mu\) and known variance \(\sigma ^ { 2 }\)
A random sample of 25 observations of \(X\) produced a \(95 \%\) confidence interval for \(\mu\) of (26.624, 28.976)
  1. Find the mean of the sample.
  2. Show that the standard deviation is 3
The \(a \%\) confidence interval using the 25 observations has a width of 2.1
  3. Calculate the value of \(a\)
  4. Find the smallest sample size, of observations from \(X\), that would be required to obtain a 95\% confidence interval of width at most 1.5
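The four parts all follow from the fact that a confidence interval for \(\mu\) with known \(\sigma\) has half-width \(z \sigma / \sqrt{n}\). The sketch below checks the numbers under that assumption; the variable names are hypothetical, and the normal quantiles come from Python's `statistics.NormalDist` rather than printed tables, so the last decimal places may differ slightly from a table-based answer.

```python
from math import ceil, sqrt
from statistics import NormalDist

std_normal = NormalDist()
lower, upper, n = 26.624, 28.976, 25

# Part 1: the interval is symmetric about the sample mean.
sample_mean = (lower + upper) / 2

# Part 2: half-width = z * sigma / sqrt(n) with z for a 95% interval.
z95 = std_normal.inv_cdf(0.975)
sigma_estimate = (upper - lower) / 2 * sqrt(n) / z95

# Part 3: a width of 2.1 with sigma = 3 fixes the z value and hence a.
sigma = 3.0
z_a = (2.1 / 2) / (sigma / sqrt(n))
a = 100 * (2 * std_normal.cdf(z_a) - 1)

# Part 4: smallest n with 2 * z95 * sigma / sqrt(n) <= 1.5.
n_min = ceil((2 * z95 * sigma / 1.5) ** 2)

print(sample_mean, round(sigma_estimate, 3), round(a, 1), n_min)
```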
Question 4
  1. Navtej travels to work by train. A train leaves the station every 7 minutes and Navtej's arrival at the station is independent of when the train is due to leave.
  1. Write down a suitable model for the distribution of the time, \(T\) minutes, that he has to wait for a train to leave.
  2. Find the mean and standard deviation of \(T\)
During a 10-week period, Navtej travels to work by train on 46 occasions.
  3. Estimate the probability that the mean length of time that he has to wait for a train to leave is between 3.4 and 3.6 minutes.
  4. State a necessary assumption for the calculation in part (c).
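One way to check the numbers in this question is sketched below. It assumes the waiting time is modelled as continuous uniform on \([0, 7]\) and that the mean of 46 independent waits is approximately normal by the central limit theorem; both are modelling assumptions rather than facts given in the question, and the variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

interval = 7.0                      # minutes between trains

# Continuous uniform on [0, interval]: mean = b/2, variance = b^2 / 12.
mean_t = interval / 2
sd_t = interval / sqrt(12)

# Central limit theorem: the mean of n waits is roughly N(mean_t, sd_t^2 / n).
n = 46
mean_wait = NormalDist(mean_t, sd_t / sqrt(n))
p = mean_wait.cdf(3.6) - mean_wait.cdf(3.4)

print(mean_t, round(sd_t, 4), round(p, 4))
```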
Question 5
  1. A random sample of two observations \(X _ { 1 }\) and \(X _ { 2 }\) is taken from a population with unknown mean \(\mu\) and unknown variance \(\sigma ^ { 2 }\)
    1. Explain why \(\frac { X _ { 1 } - \mu } { \sigma }\) is not a statistic.
    2. Explain what you understand by an unbiased estimator for \(\mu\)
    Two estimators for \(\mu\) are \(U _ { 1 }\) and \(U _ { 2 }\) where $$U _ { 1 } = 3 X _ { 1 } - 2 X _ { 2 } \quad \text { and } \quad U _ { 2 } = \frac { X _ { 1 } + 3 X _ { 2 } } { 4 }$$
  2. Show that both \(U _ { 1 }\) and \(U _ { 2 }\) are unbiased estimators for \(\mu\)
The most efficient estimator among a group of unbiased estimators is the one with the smallest variance.
  3. By finding the variance of \(U _ { 1 }\) and the variance of \(U _ { 2 }\) state, giving a reason, the most efficient estimator for \(\mu\) from these two estimators.
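For a linear estimator \(U = \sum c _ { i } X _ { i }\) built from independent observations, \(E(U) = \mu \sum c _ { i }\) and \(\operatorname{Var}(U) = \sigma ^ { 2 } \sum c _ { i } ^ { 2 }\). The sketch below applies these two facts to the estimators in the question using exact fractions; the helper `linear_estimator_moments` is a hypothetical name introduced for illustration.

```python
from fractions import Fraction


def linear_estimator_moments(coeffs):
    """Return (a, b) such that E(U) = a * mu and Var(U) = b * sigma^2
    for U = sum(c_i * X_i) with independent, identically distributed X_i."""
    return sum(coeffs), sum(c * c for c in coeffs)


# U1 = 3 X1 - 2 X2 and U2 = (X1 + 3 X2) / 4.
u1 = linear_estimator_moments([Fraction(3), Fraction(-2)])
u2 = linear_estimator_moments([Fraction(1, 4), Fraction(3, 4)])

# An estimator is unbiased when its mean coefficient is exactly 1.
for name, (mean_coeff, var_coeff) in {"U1": u1, "U2": u2}.items():
    print(name, "unbiased:", mean_coeff == 1, "variance coefficient:", var_coeff)
```

Under these assumptions the estimator with the smaller variance coefficient is the more efficient of the two.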
Question 6
A particular lift has a maximum load capacity of 700 kg.
The weights of men are normally distributed with mean 80 kg and standard deviation 10 kg. The weights of women are normally distributed with mean 69 kg and standard deviation 5 kg. You may assume that weights of people are independent.
  1. Find the probability that when 6 men and 3 women are in the lift, the load exceeds 700 kg.
A sign in the lift states: "Maximum number of people in the lift is \(c\) "
  2. Find the value of \(c\) such that the probability of the load exceeding 700 kg is less than \(2.5 \%\) no matter the gender of the occupants.
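Because the weights are independent and normal, the total load is normal with the means and variances added. The sketch below uses that fact for both parts. It interprets \(c\) as the largest number of people for which every possible mix of men and women keeps the overload probability below \(2.5 \%\), which is an assumption about the intended reading; the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

LIMIT = 700                       # maximum load in kg
MEN_MEAN, MEN_SD = 80, 10
WOMEN_MEAN, WOMEN_SD = 69, 5


def prob_overload(n_men, n_women):
    """P(total weight > LIMIT) for independent normal weights."""
    mean = n_men * MEN_MEAN + n_women * WOMEN_MEAN
    variance = n_men * MEN_SD ** 2 + n_women * WOMEN_SD ** 2
    return 1 - NormalDist(mean, sqrt(variance)).cdf(LIMIT)


def worst_case(people):
    """Largest overload probability over every split into men and women."""
    return max(prob_overload(m, people - m) for m in range(people + 1))


# Part 1: 6 men and 3 women.
p_mixed = prob_overload(6, 3)

# Part 2: keep adding people while the worst case stays below 2.5%.
c = 1
while worst_case(c + 1) < 0.025:
    c += 1

print(round(p_mixed, 4), c, round(worst_case(c), 4))
```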
Question 7
The following table shows observed frequencies, where \(x\) is an integer, from an experiment to test whether or not a six-sided die is biased.
| Number on die | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| Observed frequency | \(x + 6\) | \(x - 8\) | \(x + 8\) | \(x - 5\) | \(x + 4\) | \(x - 5\) |
A goodness of fit test is conducted to determine if there is evidence that the die is biased.
  1. Write down suitable null and alternative hypotheses for this test.
It is found that the null hypothesis is not rejected at the \(5 \%\) significance level.
  2. Hence
    1. find the minimum value of \(x\)
    2. determine the minimum number of times the die was rolled.
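The offsets in the table sum to zero, so the total number of rolls is \(6x\) and each expected frequency under a fair die is \(x\). A minimal sketch of the search for the smallest \(x\) follows; it assumes the tabulated \(5 \%\) critical value of 11.070 for \(\chi ^ { 2 }\) with 5 degrees of freedom, and the names used are hypothetical.

```python
# Tabulated upper 5% point of chi-squared with 6 - 1 = 5 degrees of freedom.
CRITICAL_VALUE = 11.070

# Observed frequency for each face is x + offset.
offsets = [6, -8, 8, -5, 4, -5]
assert sum(offsets) == 0          # so the expected frequency per face is x


def chi_squared(x):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E with E = x."""
    return sum(d ** 2 for d in offsets) / x


# Start at the smallest x that keeps every observed frequency non-negative,
# then increase x until the null hypothesis is no longer rejected.
x = -min(offsets)
while chi_squared(x) > CRITICAL_VALUE:
    x += 1

rolls = 6 * x
print(x, round(chi_squared(x), 3), rolls)
```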