Edexcel S1 (Statistics 1) 2024 January

Question 1
  The histogram below shows the distribution of the heights, to the nearest cm, of 408 plants.
    \includegraphics[max width=\textwidth, alt={}, center]{86446ce3-496a-4f02-9566-9b207bac9efa-02_1001_1473_340_296}
  (a) Use the histogram to complete the following table.
    Height (\(h\) cm): \(5 \leqslant h < 9\) | \(9 \leqslant h < 13\) | \(13 \leqslant h < 15\) | \(15 \leqslant h < 17\) | \(17 \leqslant h < 25\)
    Frequency: 32152120
  (b) Use interpolation to estimate the median.
  The mean height of these plants is 13.2 cm correct to one decimal place.
  (c) Describe the skew of these data. Give a reason for your answer.
  Two of these plants are chosen at random.
  (d) Estimate the probability that both of their heights are between 8 cm and 14 cm
Question 2
  The average minimum monthly temperature, \(x\) degrees Fahrenheit (\({ } ^ { \circ } \mathrm { F }\)), and the average maximum monthly temperature, \(y\) degrees Fahrenheit (\({ } ^ { \circ } \mathrm { F }\)), in Kolkata were recorded for 12 months.
  Some of the summary statistics are given below.
  $$\sum x = 862 \quad \sum x ^ { 2 } = 62802 \quad S _ { y y } = 413.67 \quad S _ { x y } = 512.67 \quad n = 12$$
  (a) (i) Calculate the mean of the 12 values of the average minimum monthly temperature.
      (ii) Show that the standard deviation of the 12 values of the average minimum monthly temperature is \(8.57 ^ { \circ } \mathrm { F }\) to 3 significant figures.
  (b) Calculate the product moment correlation coefficient between \(x\) and \(y\)
  For comparative purposes with a UK city, it was necessary to convert the temperatures from degrees Fahrenheit (\({ } ^ { \circ } \mathrm { F }\)) to degrees Celsius (\({ } ^ { \circ } \mathrm { C }\)). The formula used was
  $$c = \frac { 5 } { 9 } ( f - 32 )$$
  where \(f\) is the temperature in \({ } ^ { \circ } \mathrm { F }\) and \(c\) is the temperature in \({ } ^ { \circ } \mathrm { C }\)
  (c) Use this formula and the values from part (a) to calculate, in \({ } ^ { \circ } \mathrm { C }\), the mean and the standard deviation of the 12 values of the average minimum monthly temperature in Kolkata.
    Give your answers to 3 significant figures.
  Given that
    • \(u\) is the equivalent temperature in \({ } ^ { \circ } \mathrm { C }\) of \(x\)
    • \(v\) is the equivalent temperature in \({ } ^ { \circ } \mathrm { C }\) of \(y\)
  (d) state, giving a reason, the product moment correlation coefficient between \(u\) and \(v\)
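
As a quick numerical check of the "show that" result for the standard deviation, the following minimal sketch applies the formula \(\sigma = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}\) to the summary statistics quoted in the question. The variable names are illustrative only and are not part of the paper.

```python
import math

# Summary statistics quoted in the question.
n = 12
sum_x = 862
sum_x2 = 62802

# Mean of the 12 average minimum monthly temperatures (degrees Fahrenheit).
mean_x = sum_x / n

# Variance with divisor n: (sum of squares / n) - mean^2.
var_x = sum_x2 / n - mean_x ** 2
sd_x = math.sqrt(var_x)

print(f"mean = {mean_x:.2f} F, standard deviation = {sd_x:.3g} F")
```

The printed standard deviation should agree with the 8.57 stated in the question.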
Question 3
  In a sixth form college each student in Year 12 and Year 13 is either left-handed (L) or right-handed (R).
  The partially completed tree diagram, where \(p\) is a probability, gives information about these students.
  \includegraphics[max width=\textwidth, alt={}, center]{86446ce3-496a-4f02-9566-9b207bac9efa-10_960_981_477_543}
  (a) Complete the tree diagram, in terms of \(p\) where necessary.
  The probability that a student is left-handed is 0.11
  (b) Find the value of \(p\)
  (c) Find the probability that a student selected at random is in Year 12 and left-handed.
  Given that a student is right-handed,
  (d) find the probability that the student is in Year 12
Question 4
  A French test and a Spanish test were sat by 11 students.
  The table below shows their marks.

  | Student | A | B | C | D | E | F | G | H | I | J | K |
  |---|---|---|---|---|---|---|---|---|---|---|---|
  | French mark (\(f\)) | 24 | 30 | 32 | 32 | 36 | 36 | 40 | 44 | 50 | 60 | 68 |
  | Spanish mark (\(\boldsymbol { s }\)) | 16 | 90 | 24 | 28 | 32 | 36 | 38 | 44 | 48 | 48 | 68 |

  Greg says that if these points were plotted on a scatter diagram, then the point \(( 30,90 )\) would be an outlier because 90 is an outlier for the Spanish marks. An outlier is defined as a value that is
  $$\text { greater than } Q _ { 3 } + 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { or smaller than } Q _ { 1 } - 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right)$$
  (a) Show that 90 is an outlier for the Spanish marks.
  Ignoring the point (30, 90), Greg calculated the following summary statistics.
  $$\sum f = 422 \quad \sum s = 382 \quad S _ { f f } = 1667.6 \quad S _ { f s } = 1735.6$$
  (b) Use these summary statistics to show that the equation of the least squares regression line of \(s\) on \(f\) for the remaining 10 students is
  $$s = - 5.72 + 1.04 f$$
  where the values of the intercept and gradient are given to 3 significant figures. You must show your working.
  (c) Give an interpretation of the gradient of the regression line.
  Two further students sat the French test but missed the Spanish test.
  (d) Using the equation given in part (b), estimate
    (i) a Spanish mark for the student who scored 55 marks in their French test,
    (ii) a Spanish mark for the student who scored 18 marks in their French test.
  (e) State, giving a reason, which of the two estimates found in part (d) would be the more reliable estimate.
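
The two "show that" results in this question can be checked numerically. The sketch below is illustrative only: the helper `s1_quartile` is a hypothetical name, and it assumes the quartile convention commonly taught for S1 (take position \(n/4\) or \(3n/4\); round up if it is not whole, otherwise average that value and the next). The regression coefficients use \(b = S_{fs} / S_{ff}\) and \(a = \bar{s} - b\bar{f}\).

```python
import math

# Spanish marks for students A to K, as listed in the table.
spanish = sorted([16, 90, 24, 28, 32, 36, 38, 44, 48, 48, 68])


def s1_quartile(sorted_vals, fraction):
    """Quartile under the assumed S1 convention (hypothetical helper)."""
    pos = len(sorted_vals) * fraction
    if pos != int(pos):
        # Position is not whole: round up to the next observation.
        return sorted_vals[math.ceil(pos) - 1]
    k = int(pos)
    # Position is whole: average this observation and the next.
    return (sorted_vals[k - 1] + sorted_vals[k]) / 2


q1 = s1_quartile(spanish, 0.25)
q3 = s1_quartile(spanish, 0.75)
upper_fence = q3 + 1.5 * (q3 - q1)
is_outlier = 90 > upper_fence

# Summary statistics for the 10 students left after ignoring (30, 90).
n = 10
sum_f, sum_s = 422, 382
S_ff, S_fs = 1667.6, 1735.6

gradient = S_fs / S_ff
intercept = sum_s / n - gradient * sum_f / n

print(f"Q1 = {q1}, Q3 = {q3}, upper fence = {upper_fence}, outlier: {is_outlier}")
print(f"s = {intercept:.3g} + {gradient:.3g} f")
```

Other quartile conventions can give slightly different values of \(Q_1\) and \(Q_3\), so the fence computed here depends on the assumed rule.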
Question 5
  The distance an athlete can throw a discus is normally distributed with mean 40 m and standard deviation 4 m
  (a) Using standardisation, show that the probability that this athlete throws the discus less than 38.8 m is 0.3821
  This athlete enters a discus competition.
  To qualify for the final, they have 3 attempts to throw the discus a distance of more than 38.8 m
  Once they qualify, they do not use any of their remaining attempts.
  Given that they qualified for the final and that throws are independent,
  (b) find the probability that this athlete qualified for the final on their second throw with a distance of more than 44 m
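
The standardisation in the first part can be checked with a short sketch. It uses `statistics.NormalDist` from the Python standard library in place of printed tables, so the value is computed rather than looked up; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 40, 4          # mean and standard deviation in metres
threshold = 38.8           # qualifying distance in metres

# Standardise: z = (x - mu) / sigma
z = (threshold - mu) / sigma

# P(Z < z) for the standard normal distribution
p_below = NormalDist().cdf(z)

print(f"z = {z:.1f}, P(throw < {threshold} m) = {p_below:.4f}")
```

The result should agree with the 0.3821 quoted in the question, which corresponds to \(1 - \Phi(0.3)\) from tables.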
Question 6
  The events \(A\) and \(B\) satisfy
  $$\mathrm { P } ( A ) = x \quad \mathrm { P } ( B ) = y \quad \mathrm { P } ( A \cup B ) = 0.65 \quad \mathrm { P } ( B \mid A ) = 0.3$$
  (a) Show that
  $$14 x + 20 y = 13$$
  The events \(B\) and \(C\) are mutually exclusive such that
  $$\mathrm { P } ( B \cup C ) = 0.85 \quad \mathrm { P } ( C ) = \frac { 1 } { 2 } x + y$$
  (b) (i) Find a second equation in \(x\) and \(y\)
      (ii) Hence find the value of \(x\) and the value of \(y\)
  (c) Determine whether or not \(A\) and \(B\) are statistically independent. You must show your working clearly.
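
One possible route to the "show that" result \(14x + 20y = 13\), sketched here using only the conditional probability definition and the addition rule, is:

$$\begin{aligned}
\mathrm{P}(A \cap B) &= \mathrm{P}(B \mid A)\,\mathrm{P}(A) = 0.3x \\
\mathrm{P}(A \cup B) &= \mathrm{P}(A) + \mathrm{P}(B) - \mathrm{P}(A \cap B) \\
0.65 &= x + y - 0.3x = 0.7x + y \\
13 &= 14x + 20y \quad \text{(multiplying through by 20)}
\end{aligned}$$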
Question 7
  The cumulative distribution of a discrete random variable \(X\) is given by

  | \(x\) | 1 | 2 | 3 | 4 |
  |---|---|---|---|---|
  | \(\mathrm { F } ( x )\) | \(\frac { 1 } { 13 }\) | \(\frac { 2 k - 1 } { 26 }\) | \(\frac { 3 ( k + 1 ) } { 26 }\) | \(\frac { k + 4 } { 8 }\) |

  where \(k\) is a positive constant.
  (a) Show that \(k = 4\)
  (b) Find the probability distribution of the discrete random variable \(X\)
  (c) Using your answer to part (b), write down the mode of \(X\)
  (d) Calculate \(\operatorname { Var } ( 13 X - 6 )\)
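
The value \(k = 4\) follows from requiring \(\mathrm{F}(4) = 1\) at the largest value of \(X\). The sketch below is a sanity check only, with the illustrative helper name `cdf_values`: it confirms that \(k = 4\) gives \(\mathrm{F}(4) = 1\) and that the tabulated values are non-decreasing, as a cumulative distribution function must be.

```python
from fractions import Fraction


def cdf_values(k):
    """Tabulated F(1), F(2), F(3), F(4) for a given k (hypothetical helper)."""
    return [
        Fraction(1, 13),
        Fraction(2 * k - 1, 26),
        Fraction(3 * (k + 1), 26),
        Fraction(k + 4, 8),
    ]


k = 4
F = cdf_values(k)

# F at the largest value of X must equal 1.
reaches_one = F[-1] == 1

# A cumulative distribution function never decreases.
non_decreasing = all(a <= b for a, b in zip(F, F[1:]))

print(F, reaches_one, non_decreasing)
```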
Question 8
  The random variable \(X\) is normally distributed with mean \(\mu\) and variance 36
  Given that
  $$\mathrm { P } ( \mu - 2 k < X < \mu + 2 k ) = 0.6$$
  (a) find the value of \(k\)
  The random variable \(Y\) is normally distributed with mean \(\mu\) and standard deviation \(\sigma\)
  Given that
  $$2 \mu = 3 \sigma ^ { 2 } \quad \text { and } \quad \mathrm { P } \left( Y > \frac { 3 } { 2 } \mu \right) = 0.0668$$
  (b) find the value of \(\mu\) and the value of \(\sigma\)