Edexcel S1 (Statistics 1) 2024 June

Question 1
1. A researcher is investigating the growth of two types of tree, Birch and Maple. The height, to the nearest cm, a seedling grows in one year is recorded for 35 Birch trees and 32 Maple trees. The results are summarised in the back-to-back stem and leaf diagram below.
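| Totals | Birch | | Maple | Totals |
|---|---:|:---:|:---|---|
| (2) | 9 8 | 2 | 5 7 7 8 9 | (5) |
| (8) | 9 9 9 6 5 3 1 1 | 3 | 0 2 6 6 8 9 9 | (7) |
| (9) | 9 8 8 7 6 3 1 1 1 | 4 | 1 1 1 \(\boldsymbol { k }\) 7 8 | (6) |
| (9) | 7 7 7 5 4 3 2 1 0 | 5 | 0 1 2 3 4 4 4 | (7) |
| (3) | 7 6 5 | 6 | 3 4 6 | (3) |
| (3) | 6 5 4 | 7 | 0 7 | (2) |
| (1) | 5 | 8 | 0 0 | (2) |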
Key: 5 | 6 | 3 means 65 cm for a Birch tree and 63 cm for a Maple tree
The median height that these Maple trees grow in one year is 45 cm.
  1. Find the value of \(\boldsymbol { k }\), used in the stem and leaf diagram.
  2. Find the lower quartile and the upper quartile of the height grown in one year for these Birch trees.
    The researcher defines an outlier as an observation that is $$\text { greater than } Q _ { 3 } + 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right) \text { or less than } Q _ { 1 } - 1.5 \times \left( Q _ { 3 } - Q _ { 1 } \right)$$
  3. Show that there is only one outlier amongst the Birch trees.
    The grid on page 3 shows a box plot for the heights that the Maple trees grow in one year.
  4. On the same grid draw a box plot for the heights that the Birch trees grow in one year.
  5. Comment on any difference in the distributions of the growth of these Birch trees and the growth of these Maple trees.
    State the values of any statistics you have used to support your comment.
    The researcher realises he has missed out 4 pieces of data for the Maple trees. The heights each seedling grows in one year, to the nearest cm, in ascending order, for these 4 Maple trees are \(27 \mathrm {~cm} , a \mathrm {~cm} , 48 \mathrm {~cm} , 2 a \mathrm {~cm}\).
    Given that there is no change to the box plot for the Maple trees given on page 3,
  6. find the range of possible values for \(a\). Show your working clearly.
    \includegraphics[max width=\textwidth, alt={}]{ee0c7c12-84f3-479c-b36a-3357f8529a1c-03_1243_1659_1464_210}
    Only use this grid if you need to redraw your answer for part (d)
    \includegraphics[max width=\textwidth, alt={}, center]{ee0c7c12-84f3-479c-b36a-3357f8529a1c-05_1154_1643_1503_217}
    (Total for Question 1 is 13 marks)
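The outlier rule quoted in Question 1 can be sketched in a few lines of Python. This is only an illustration of the rule as stated, not a solution to the question: the function names and the sample values below are hypothetical, and the quartiles are passed in rather than computed, because the quartile convention is left to the candidate.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower, upper) limits: Q1 - k*(Q3 - Q1) and Q3 + k*(Q3 - Q1)."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3):
    """List the values lying strictly outside the two limits."""
    lower, upper = outlier_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Made-up data and quartiles, chosen only to exercise the rule.
sample = [3, 10, 12, 15, 20, 40]
fences = outlier_fences(10, 20)
outliers = find_outliers(sample, 10, 20)
print(fences, outliers)
```
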
Question 2
2. A spinner can land on the numbers \(2,4,5,7\) or 8 only. The random variable \(X\) represents the number that this spinner lands on when it is spun once. The probability distribution of \(X\) is given in the table below.
| \(\boldsymbol { x }\) | 2 | 4 | 5 | 7 | 8 |
|---|---|---|---|---|---|
| \(\mathbf { P } ( \boldsymbol { X } = \boldsymbol { x } )\) | 0.25 | 0.3 | 0.2 | 0.1 | 0.15 |
  1. Find \(\mathrm { P } ( 2 X - 3 > 5 )\)
    Given that \(\mathrm { E } ( X ) = 4.6\)
  2. show that \(\operatorname { Var } ( X ) = 4.14\)
    The random variable \(Y = a X - b\) where \(a\) and \(b\) are positive constants.
    Given that $$\mathrm { E } ( Y ) = 13.4 \quad \text { and } \quad \operatorname { Var } ( Y ) = 66.24$$
  3. find the value of \(a\) and the value of \(b\)
    In a game Sam and Alex each spin the spinner once, landing on \(X _ { 1 }\) and \(X _ { 2 }\) respectively.
    Sam's score is given by the random variable \(S = X _ { 1 }\)
    Alex's score is given by the random variable \(R = 2 X _ { 2 } - 3\)
    The person with the higher score wins the game. If the scores are the same it is a draw.
  4. Find the probability that Sam wins the game.
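As a rough check on the figures quoted in Question 2, the sketch below recomputes \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\) from the probability table using exact fractions. The helper names are illustrative assumptions; only the table values come from the question.

```python
from fractions import Fraction

# Probability distribution of X, copied from the table in Question 2.
dist = {
    2: Fraction("0.25"),
    4: Fraction("0.3"),
    5: Fraction("0.2"),
    7: Fraction("0.1"),
    8: Fraction("0.15"),
}


def expectation(dist):
    """E(X) = sum of x * P(X = x)."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X) = E(X^2) - [E(X)]^2."""
    mean = expectation(dist)
    return sum(x * x * p for x, p in dist.items()) - mean ** 2


total = sum(dist.values())   # should be 1 for a valid distribution
mean_x = expectation(dist)
var_x = variance(dist)
print(total, float(mean_x), float(var_x))
```

Exact fractions avoid floating-point rounding when comparing the results with the stated 4.6 and 4.14.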
Question 3
3. The lengths, \(x \mathrm {~mm}\), of 50 pebbles are summarised in the table below.
| Length, \(x\) (mm) | Frequency |
|---|---|
| \(20 \leqslant x < 30\) | 2 |
| \(30 \leqslant x < 32\) | 16 |
| \(32 \leqslant x < 36\) | 20 |
| \(36 \leqslant x < 40\) | 8 |
| \(40 \leqslant x < 45\) | 3 |
| \(45 \leqslant x < 50\) | 1 |
A histogram is drawn to represent these data.
The bar representing the class \(32 \leqslant x < 36\) is 2.5 cm wide and 7.5 cm tall.
  1. Calculate the width and the height of the bar representing the class \(30 \leqslant x < 32\)
  2. Using linear interpolation, estimate the median of \(x\)
    The weight, \(w\) grams, of each of the 50 pebbles is coded using \(10 y = w - 20\)
    These coded data are summarised by $$\sum y = 104 \quad \sum y ^ { 2 } = 233.54$$
  3. Show that the mean of \(w\) is 40.8
  4. Calculate the standard deviation of \(w\)
    The weight of a pebble recorded as 40.8 grams is added to the sample.
  5. Without carrying out any further calculations, state, giving a reason, what effect this would have on the value of
    1. the mean of \(w\)
    2. the standard deviation of \(w\)
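The coding in Question 3 rearranges to \(w = 10 y + 20\), so the mean decodes through the same formula while the standard deviation is only scaled by 10. A minimal sketch of that arithmetic follows; it assumes the divisor-\(n\) form of the variance, and the function name `decode` is illustrative.

```python
import math

n = 50            # number of pebbles
sum_y = 104       # sum of coded values, from the question
sum_y2 = 233.54   # sum of squared coded values, from the question


def decode(y):
    """Undo the coding 10y = w - 20, i.e. w = 10y + 20."""
    return 10 * y + 20


mean_y = sum_y / n
var_y = sum_y2 / n - mean_y ** 2   # assumption: variance with divisor n

mean_w = decode(mean_y)            # the shift and the scale both affect the mean
sd_w = 10 * math.sqrt(var_y)       # only the scale factor affects the spread
print(round(mean_w, 1), round(sd_w, 2))
```
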
Question 4
4. A biologist is studying bears. The biologist records the length, \(d \mathrm {~cm}\), and the girth, \(g \mathrm {~cm}\), of 8 bears. The biologist summarises the data as follows
$$\begin{gathered} \sum d = 1456.8 \quad \sum g = 713.2 \quad \sum d g = 141978.84 \quad \sum g ^ { 2 } = 72675.98 \\
S _ { d d } = 16769.78 \end{gathered}$$
  1. Calculate the exact value of \(S _ { d g }\) and the exact value of \(S _ { g g }\)
  2. Calculate the value of the product moment correlation coefficient between \(d\) and \(g\)
  3. Show that the equation of the regression line of \(g\) on \(d\) can be written as $$g = - 42.3 + 0.722 d$$ where the values of the intercept and gradient are given to 3 significant figures.
  4. Give an interpretation, in context, of the gradient of the regression line.
    Using the equation of the regression line given in part (c)
    1. estimate the girth of a bear with a length of 2.5 metres,
    2. explain why an estimate for the girth of a bear with a length of 0.5 metres is not reliable.
    Using the regression line from part (c), the biologist estimates that for each \(x \mathrm {~cm}\) increase in the length of a bear there will be a 17.3 cm increase in the girth.
  5. Find the value of \(x\)
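The summary statistics in Question 4 are enough to reproduce the regression line quoted in part (c). The sketch below applies the standard formulas \(S _ { d g } = \sum d g - \frac { \sum d \sum g } { n }\) and \(S _ { g g } = \sum g ^ { 2 } - \frac { ( \sum g ) ^ { 2 } } { n }\); the variable names are illustrative, and floating-point output may differ from the exact values in the last decimal places.

```python
import math

n = 8
sum_d, sum_g = 1456.8, 713.2          # from the question
sum_dg, sum_g2 = 141978.84, 72675.98  # from the question
s_dd = 16769.78                       # given directly in the question

s_dg = sum_dg - sum_d * sum_g / n
s_gg = sum_g2 - sum_g ** 2 / n

# Product moment correlation coefficient between d and g.
r = s_dg / math.sqrt(s_dd * s_gg)

# Regression line of g on d: g = intercept + gradient * d.
gradient = s_dg / s_dd
intercept = sum_g / n - gradient * sum_d / n
print(round(s_dg, 2), round(s_gg, 2), round(r, 3))
print(round(intercept, 1), round(gradient, 3))
```
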
Question 5
5. A competition consists of two rounds.
The time, in minutes, taken by adults to complete round one is modelled by a normal distribution with mean 15 minutes and standard deviation 2 minutes.
  1. Use standardisation to find the proportion of adults that take less than 18 minutes to complete round one.
    Only the fastest \(60 \%\) of adults from round one take part in round two.
  2. Use standardisation to find the longest time that an adult can take to complete round one if they are to take part in round two.
    The time, \(T\) minutes, taken by adults to complete round two is modelled by a normal distribution with mean \(\mu\)
    Given that \(\mathrm { P } ( \mu - 10 < T < \mu + 10 ) = 0.95\)
  3. find \(\mathrm { P } ( T > \mu - 5 \mid T > \mu - 10 )\)
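The standardisation step in the first two parts of Question 5 can be sketched with Python's standard-library `statistics.NormalDist`. This is an illustration under the round-one model stated in the question (mean 15, standard deviation 2); results read from printed tables may differ slightly in the last digit, and the variable names are assumptions.

```python
from statistics import NormalDist

mu, sigma = 15, 2            # round-one model from the question
standard = NormalDist()      # standard normal Z ~ N(0, 1)

# Proportion taking less than 18 minutes: standardise, then use the cdf of Z.
z = (18 - mu) / sigma
proportion = standard.cdf(z)

# Longest time among the fastest 60%: find z with P(Z < z) = 0.6, then unstandardise.
z_cut = standard.inv_cdf(0.6)
cutoff = mu + sigma * z_cut
print(round(proportion, 4), round(cutoff, 2))
```
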
Question 6
6. The Venn diagram shows the probabilities related to teenagers playing 3 particular board games.
    \(C\) is the event that a teenager plays Chess
    \(S\) is the event that a teenager plays Scrabble
    \(G\) is the event that a teenager plays Go
    where \(p\) and \(q\) are probabilities.
    \includegraphics[max width=\textwidth, alt={}, center]{ee0c7c12-84f3-479c-b36a-3357f8529a1c-22_684_935_598_566}
  1. Find the probability that a randomly selected teenager plays Chess but does not play Go.
    Given that the events \(C\) and \(S\) are independent,
  2. find the value of \(p\)
  3. Hence find the value of \(q\)
  4. Find
    (i) \(\mathrm { P } \left( ( C \cup S ) \cap G ^ { \prime } \right)\)
    (ii) \(\mathrm { P } ( C \mid ( S \cap G ) )\)
    A youth club consists of a large number of teenagers.
    In this youth club 76 teenagers play Chess and Go.
  5. Use the information in the Venn diagram to estimate how many of the teenagers in the youth club do not play Scrabble.