Questions — SPS SPS SM Statistics (20 questions)

SPS SPS SM Statistics 2024 January Q1
4 marks Easy -1.8
At the beginning of the academic year, all the pupils in year 12 at a college take part in an assessment. Summary statistics for the marks obtained by the 2021 cohort are given below. \(n = 205\) \(\sum x = 23042\) \(\sum x^2 = 2591716\) Marks may only be whole numbers, but the Head of Mathematics believes that the distribution of marks may be modelled by a Normal distribution.
  1. Calculate
    [2]
  2. Use your answers to part (a) to write down a possible Normal model for the distribution of marks. [2]
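A minimal sketch of the arithmetic, assuming the elided part (a) asks for the mean and standard deviation of the marks. The variable names are illustrative, and the n - 1 divisor is an assumption (the divisor n is also commonly used).

```python
import math

# Summary statistics quoted in the question
n = 205
sum_x = 23042
sum_x2 = 2591716

mean = sum_x / n                      # sample mean
s_xx = sum_x2 - sum_x**2 / n          # sum of squared deviations from the mean
sample_var = s_xx / (n - 1)           # assumption: n - 1 divisor
sample_sd = math.sqrt(sample_var)

print(f"mean = {mean:.1f}, variance = {sample_var:.2f}, sd = {sample_sd:.3f}")
```

Under these assumptions, one possible model for part (b) is a Normal distribution with this mean and variance.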
SPS SPS SM Statistics 2024 January Q2
14 marks Moderate -0.8
The heights, in centimetres, of a random sample of 150 plants of a certain variety were measured. The results are summarised in the histogram. \includegraphics{figure_2} One of the 150 plants is chosen at random, and its height, \(X\) cm, is noted.
  1. Show that P\((20 < X < 30) = 0.147\), correct to 3 significant figures. [2]
Sam suggests that the distribution of \(X\) can be well modelled by the distribution N\((40, 100)\).
    1. Give a brief justification for the use of the normal distribution in this context. [1]
    2. Give a brief justification for the choice of the parameter values 40 and 100. [2]
  1. Use Sam's model to find P\((20 < X < 30)\). [1]
Nina suggests a different model. She uses the midpoints of the classes to calculate estimates, \(m\) and \(s\), for the mean and standard deviation respectively, in centimetres, of the 150 heights. She then uses the distribution N\((m, s^2)\) as her model.
  1. Use Nina's model to find P\((20 < X < 30)\). [4]
    1. Complete the table in the Printed Answer Booklet to show the probabilities obtained from Sam's model and Nina's model. [2]
    2. By considering the different ranges of values of \(X\) given in the table, discuss how well the two models fit the original distribution. [2]
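The probability under Sam's model can be checked numerically. This is a sketch only: it uses Python's standard-library `NormalDist` and covers Sam's model alone, because Nina's model depends on the histogram, which is not reproduced here.

```python
from statistics import NormalDist

# Sam's model: N(40, 100), i.e. mean 40 and variance 100, so sd = 10
sam = NormalDist(mu=40, sigma=10)

# P(20 < X < 30) under Sam's model
p_sam = sam.cdf(30) - sam.cdf(20)
print(f"P(20 < X < 30) under Sam's model = {p_sam:.4f}")
```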
SPS SPS SM Statistics 2024 January Q3
12 marks Moderate -0.8
Zac is planning to write a report on the music preferences of the students at his college. There is a large number of students at the college.
  1. State one reason why Zac might wish to obtain information from a sample of students, rather than from all the students. [1]
  2. Amaya suggests that Zac should use a sample that is stratified by school year. Give one advantage of this method as compared with random sampling, in this context. [1]
Zac decides to take a random sample of 60 students from his college. He asks each student how many hours per week, on average, they spend listening to music during term. From his results he calculates the following statistics.
| Mean | Standard deviation | Median | Lower quartile | Upper quartile |
|---|---|---|---|---|
| 21.0 | 4.20 | 20.5 | 18.0 | 22.9 |
  1. Sundip tells Zac that, during term, she spends on average 30 hours per week listening to music. Discuss briefly whether this value should be considered an outlier. [3]
  2. Layla claims that, during term, each student spends on average 20 hours per week listening to music. Zac believes that the true figure is higher than 20 hours. He uses his results to carry out a hypothesis test at the 5\% significance level. Assume that the time spent listening to music is normally distributed with standard deviation 4.20 hours. Carry out the test. [7]
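The outlier discussion and the hypothesis test can be sketched as below. The two outlier rules used (1.5 × IQR beyond the upper quartile, and 2 standard deviations above the mean) are assumed conventions, not something the question fixes, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean, sd, n = 21.0, 4.20, 60
lq, uq = 18.0, 22.9

# Two common outlier conventions (assumed; the question does not name one)
upper_fence = uq + 1.5 * (uq - lq)    # 1.5 x IQR above the upper quartile
two_sd_limit = mean + 2 * sd          # 2 standard deviations above the mean
print(f"fence = {upper_fence:.2f}, mean + 2sd = {two_sd_limit:.2f}")

# One-tailed z-test: H0: mu = 20 against H1: mu > 20, sigma = 4.20 assumed known
z = (mean - 20) / (sd / sqrt(n))
p_value = 1 - NormalDist().cdf(z)
reject_h0 = p_value < 0.05
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

The value 30 lies between the two limits, so the two conventions disagree about whether it is an outlier, which is why the question asks for a discussion rather than a single answer.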
SPS SPS SM Statistics 2024 January Q4
6 marks Easy -1.2
The table shows the increases, between 2001 and 2011, in the percentages of employees travelling to work by various methods, in the Local Authorities (LAs) in the North East region of the UK. \includegraphics{figure_4} The first two digits of the Geography code give the type of each of the LAs:
  • 06: Unitary authority
  • 07: Non-metropolitan district
  • 08: Metropolitan borough
  1. In what type of LA are the largest increases in percentages of people travelling by underground, metro, light rail or tram? [1]
  2. Identify two main changes in the pattern of travel to work in the North East region between 2001 and 2011. [2]
Now assume the following.
  • The data refer to residents in the given LAs who are in the age range 20 to 65 at the time of each census.
  • The number of people in the age range 20 to 65 who move into or out of each given LA, or who die, between 2001 and 2011 is negligible.
  3. Estimate the percentage of the people in the age range 20 to 65 in 2011 whose data appears in both 2001 and 2011. [2]
  4. In the light of your answer to part (c), suggest a reason for the changes in the pattern of travel to work in the North East region between 2001 and 2011. [1]
SPS SPS SM Statistics 2024 January Q5
7 marks Standard +0.8
Labrador puppies may be black, yellow or chocolate in colour. Some information about a litter of 9 puppies is given in the table.
| | male | female |
|---|---|---|
| black | 1 | 3 |
| yellow | 2 | 1 |
| chocolate | 1 | 1 |
Four puppies are chosen at random to train as guide dogs.
  1. Determine the probability that at least 3 black puppies are chosen. [3]
  2. Determine the probability that exactly 3 females are chosen given that at least 3 black puppies are chosen. [3]
  3. Explain whether the 2 events 'choosing exactly 3 females' and 'choosing at least 3 black puppies' are independent events. [1]
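One way to check the counting is to enumerate every selection of 4 puppies. This is a brute-force sketch rather than the intended written method, and the helper names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# The litter from the table: (colour, sex) for each of the 9 puppies
puppies = (
    [("black", "M")] * 1 + [("black", "F")] * 3
    + [("yellow", "M")] * 2 + [("yellow", "F")] * 1
    + [("chocolate", "M")] * 1 + [("chocolate", "F")] * 1
)

# Choose by index so that identical-looking puppies stay distinct
selections = list(combinations(range(len(puppies)), 4))

def prob(event):
    """Probability that a uniformly random selection satisfies `event`."""
    hits = sum(1 for sel in selections if event([puppies[i] for i in sel]))
    return Fraction(hits, len(selections))

def at_least_3_black(chosen):
    return sum(colour == "black" for colour, _ in chosen) >= 3

def exactly_3_female(chosen):
    return sum(sex == "F" for _, sex in chosen) == 3

p_black = prob(at_least_3_black)
p_female = prob(exactly_3_female)
p_both = prob(lambda ch: at_least_3_black(ch) and exactly_3_female(ch))
p_female_given_black = p_both / p_black

print(p_black, p_female_given_black, p_female)
```

The events are independent only if the conditional probability equals the unconditional one, so comparing the last two printed values settles part 3.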
SPS SPS SM Statistics 2024 January Q6
6 marks Standard +0.3
A firm claims that no more than 2\% of their packets of sugar are underweight. A market researcher believes that the actual proportion is greater than 2\%. In order to test the firm's claim, the researcher weighs a random sample of 600 packets and carries out a hypothesis test, at the 5\% significance level, using the null hypothesis \(p = 0.02\).
  1. Given that the researcher's null hypothesis is correct, determine the probability that the researcher will conclude that the firm's claim is incorrect. [5]
  2. The researcher finds that 18 out of the 600 packets are underweight. A colleague says "18 out of 600 is 3\%, so there is evidence that the actual proportion of underweight bags is greater than 2\%." Criticise this statement. [1]
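The probability in part 1 is the actual size of the test, which can be found by locating the critical region under B(600, 0.02). The sketch below uses exact binomial probabilities; the function names are illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def upper_tail(c, n, p):
    """P(X >= c) for X ~ B(n, p)."""
    return 1 - sum(binom_pmf(k, n, p) for k in range(c))

n, p0, alpha = 600, 0.02, 0.05

# Smallest c with P(X >= c) <= alpha: the critical region is X >= c
critical = next(c for c in range(n + 1) if upper_tail(c, n, p0) <= alpha)
p_reject = upper_tail(critical, n, p0)    # P(reject H0 | H0 true)

# Tail probability for the observed 18 underweight packets
p_18_or_more = upper_tail(18, n, p0)

print(f"critical region: X >= {critical}")
print(f"P(reject H0 | H0 true) = {p_reject:.4f}")
print(f"P(X >= 18) = {p_18_or_more:.4f}")
```

Comparing `p_18_or_more` with the 5% level shows why quoting 3% on its own is not evidence: the conclusion has to come from the tail probability, not from the sample proportion alone.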
SPS SPS SM Statistics 2024 January Q7
11 marks Standard +0.8
The probability distribution of a random variable \(X\) is modelled as follows. $$\text{P}(X = x) = \begin{cases} \frac{k}{x} & x = 1, 2, 3, 4, \\ 0 & \text{otherwise,} \end{cases}$$ where \(k\) is a constant.
  1. Show that \(k = \frac{12}{25}\). [2]
  2. Show in a table the values of \(X\) and their probabilities. [1]
  3. The values of three independent observations of \(X\) are denoted by \(X_1\), \(X_2\) and \(X_3\). Find P\((X_1 > X_2 + X_3)\). [3]
In a game, a player notes the values of successive independent observations of \(X\) and keeps a running total. The aim of the game is to reach a total of exactly 7.
  1. Determine the probability that a total of exactly 7 is first reached on the 5th observation. [5]
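The whole question can be checked by exact enumeration. This is a verification sketch using `Fraction`, not the expected written solution. For the last part it relies on the fact that every observation is at least 1, so five observations summing to 7 cannot have reached 7 any earlier.

```python
from fractions import Fraction
from itertools import product

values = [1, 2, 3, 4]

# Probabilities k/x must sum to 1, which fixes k
k = 1 / sum(Fraction(1, x) for x in values)
pmf = {x: k / x for x in values}

def joint(outcome):
    """Probability of a sequence of independent observations."""
    prob = Fraction(1)
    for x in outcome:
        prob *= pmf[x]
    return prob

# P(X1 > X2 + X3)
p_c = sum(joint((a, b, c))
          for a, b, c in product(values, repeat=3) if a > b + c)

# P(total of exactly 7 is first reached on the 5th observation)
p_d = sum(joint(seq) for seq in product(values, repeat=5) if sum(seq) == 7)

print(k, pmf, p_c, p_d)
```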
SPS SPS SM Statistics 2025 April Q2
5 marks Easy -1.2
The histogram shows information about the lengths, \(l\) centimetres, of a sample of worms of a certain species. \includegraphics{figure_2} The number of worms in the sample with lengths in the class \(3 \leq l < 4\) is 30.
  1. Find the number of worms in the sample with lengths in the class \(0 \leq l < 2\). [2]
  2. Find an estimate of the number of worms in the sample with lengths in the range \(4.5 \leq l < 5.5\). [3]
SPS SPS SM Statistics 2025 April Q3
5 marks Moderate -0.8
A researcher has collected data on the heights of a sample of adults but has encoded the actual values using a linear transformation of the form \(aX + b\), where \(X\) represents the original height in centimetres. Given the following information about the encoded data:
  • The mean of the encoded heights is 5.4 cm.
  • The standard deviation of the encoded heights is 2.0 cm.
  • The researcher knows that the transformation used was \(0.2X - 30\).
  1. Find the mean of the original heights in the sample. [2]
  2. Find the standard deviation of the original heights in the sample. [2]
  3. If an encoded height value is 6.8, what was the original height in centimetres? [1]
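Undoing the coding can be sketched as follows. The point is that the mean and individual values are decoded by inverting \(0.2X - 30\), while the standard deviation is only rescaled by the factor 0.2. The function name `decode` is illustrative.

```python
a, b = 0.2, -30          # encoded value = a * X + b

def decode(y):
    """Invert the coding y = a * x + b."""
    return (y - b) / a

original_mean = decode(5.4)       # the mean transforms like a single value
original_sd = 2.0 / abs(a)        # spread is unaffected by the shift b
original_value = decode(6.8)

print(f"{original_mean:.1f} cm, {original_sd:.1f} cm, {original_value:.1f} cm")
```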
SPS SPS SM Statistics 2025 April Q4
8 marks Moderate -0.3
A manufacturing plant produces electronic circuit boards that need to pass two quality checks - a mechanical inspection and an electrical test. Historical data shows that 15% of boards fail the mechanical inspection. Of those that pass the mechanical inspection, 8% fail the electrical test. Of those that fail the mechanical inspection, 60% fail the electrical test.
  1. If a board is randomly selected from production, what is the probability that it passes both inspections? [2]
  2. If a board is selected at random and is found to have passed the electrical test, what is the probability that it also passed the mechanical inspection? [3]
  3. The company continues to test boards from a large batch until finding one that passes both inspections. Each board is tested independently of all others. What is the probability that they need to test exactly 3 boards to find one that passes both inspections? [3]
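The three parts follow from a tree diagram, which can be sketched with exact fractions as below. The variable names are illustrative.

```python
from fractions import Fraction

p_fail_mech = Fraction(15, 100)
p_pass_mech = 1 - p_fail_mech
p_pass_elec_given_pass_mech = 1 - Fraction(8, 100)
p_pass_elec_given_fail_mech = 1 - Fraction(60, 100)

# Part 1: passes both inspections
p_both = p_pass_mech * p_pass_elec_given_pass_mech

# Part 2: P(passed mechanical | passed electrical), by conditional probability
p_pass_elec = p_both + p_fail_mech * p_pass_elec_given_fail_mech
p_mech_given_elec = p_both / p_pass_elec

# Part 3: first success on the third board (geometric distribution)
p_third = (1 - p_both) ** 2 * p_both

print(float(p_both), float(p_mech_given_elec), float(p_third))
```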
SPS SPS SM Statistics 2025 April Q5
13 marks Easy -1.3
In a study of reaction times, 25 participants completed a test where their reaction times (in milliseconds) were recorded. The results are shown in the stem-and-leaf diagram below:
20 | 3 5 7 9
21 | 0 2 5 6 8
22 | 1 3 4 5 7 9
23 | 0 2 5 8
24 | 1 4 6 7
25 | 2 5
Key: 21 | 0 represents a reaction time of 210 milliseconds
  1. State the median reaction time. [1]
  2. Calculate the interquartile range of these reaction times. [2]
  3. Find the mean and standard deviation of these reaction times. [3]
  4. State one advantage of using a stem-and-leaf diagram to display this data rather than a frequency table. [1]
  5. One participant completed the test again and recorded a reaction time of 195 milliseconds. Add this result to the stem-and-leaf diagram and state the effect this would have on: a. the median b. the mean c. the standard deviation [4]
  6. Explain why the interquartile range might be preferred to the standard deviation as a measure of spread in this context. [2]
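The summary statistics can be computed directly from the diagram, as in the sketch below. Quartile conventions differ between textbooks and exam boards; the `"exclusive"` method of `statistics.quantiles` is used here as an assumption, and the n - 1 divisor in `stdev` is likewise an assumed choice.

```python
import statistics

# Stem-and-leaf diagram from the question (stem = tens of milliseconds)
stems = {
    20: [3, 5, 7, 9],
    21: [0, 2, 5, 6, 8],
    22: [1, 3, 4, 5, 7, 9],
    23: [0, 2, 5, 8],
    24: [1, 4, 6, 7],
    25: [2, 5],
}
times = [10 * stem + leaf for stem, leaves in stems.items() for leaf in leaves]

median = statistics.median(times)
q1, _, q3 = statistics.quantiles(times, n=4, method="exclusive")
iqr = q3 - q1
mean = statistics.fmean(times)
sd = statistics.stdev(times)          # assumption: n - 1 divisor

# Effect of the extra reading of 195 ms
with_extra = times + [195]
new_median = statistics.median(with_extra)
new_mean = statistics.fmean(with_extra)
new_sd = statistics.stdev(with_extra)

print(median, iqr, round(mean, 2), round(sd, 2))
print(new_median, round(new_mean, 2), round(new_sd, 2))
```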
SPS SPS SM Statistics 2025 April Q6
11 marks Moderate -0.8
A retail bakery makes cherry muffins where, due to the production process, 15% of muffins contain a lower than expected quantity of cherries. The bakery sells these muffins in boxes of 20.
  1. State a suitable distribution to model the number of muffins with a lower than expected quantity of cherries in a box, giving the value(s) of any parameter(s). State any assumptions needed for your model to be valid. [4]
  2. Using your model from part (a), find the probability that a randomly selected box contains:
    1. exactly 3 muffins with a lower than expected quantity of cherries, [2]
    2. at least 5 muffins with a lower than expected quantity of cherries. [2]
  3. The bakery sells 25 boxes of muffins in one day. Find the probability that fewer than 4 of these boxes contain exactly 3 muffins with a lower than expected quantity of cherries. [3]
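Assuming the model in part 1 is B(20, 0.15), the probabilities can be sketched with exact binomial terms. Part 3 then uses a second binomial whose success probability is the answer to part 2(i). The function name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Number of under-filled muffins in a box: X ~ B(20, 0.15)
p_exactly_3 = binom_pmf(3, 20, 0.15)
p_at_least_5 = 1 - sum(binom_pmf(k, 20, 0.15) for k in range(5))

# Number of boxes (out of 25) with exactly 3 such muffins: Y ~ B(25, p_exactly_3)
p_fewer_than_4_boxes = sum(binom_pmf(k, 25, p_exactly_3) for k in range(4))

print(f"{p_exactly_3:.4f} {p_at_least_5:.4f} {p_fewer_than_4_boxes:.4f}")
```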
SPS SPS SM Statistics 2025 April Q7
9 marks Standard +0.3
Miguel has six numbered tiles, labelled 2, 2, 3, 3, 4, 4. He selects two tiles at random, without replacement. The variable \(M\) denotes the sum of the numbers on the two tiles.
  1. Show that \(P(M = 6) = \frac{1}{3}\). [2]
The table shows the probability distribution of \(M\).
| \(m\) | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|
| \(P(M = m)\) | \(\frac{1}{15}\) | \(\frac{4}{15}\) | \(\frac{1}{3}\) | \(\frac{4}{15}\) | \(\frac{1}{15}\) |
Miguel returns the two tiles to the collection. Now Sofia selects two tiles at random from the six tiles, without replacement. The variable \(S\) denotes the sum of the numbers on the two tiles that Sofia selects.
  1. Find \(P(M = S)\). [3]
  2. Find \(P(S = 7 | M = S)\). [4]
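The distribution and both probabilities can be verified by enumeration. The sketch below relies on \(M\) and \(S\) being independent with the same distribution, which holds because Miguel returns his tiles before Sofia draws.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

tiles = [2, 2, 3, 3, 4, 4]

# All 15 equally likely unordered pairs of distinct tiles
pairs = list(combinations(range(len(tiles)), 2))
counts = Counter(tiles[i] + tiles[j] for i, j in pairs)
pmf = {total: Fraction(c, len(pairs)) for total, c in counts.items()}

# M and S are independent and identically distributed
p_equal = sum(p * p for p in pmf.values())
p_s7_given_equal = pmf[7] ** 2 / p_equal

print(dict(sorted(pmf.items())), p_equal, p_s7_given_equal)
```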
SPS SPS SM Statistics 2024 September Q1
5 marks Easy -1.2
The histogram shows information about the lengths, \(l\) centimetres, of a sample of worms of a certain species. \includegraphics{figure_1} The number of worms in the sample with lengths in the class \(3 \leqslant l < 4\) is 30.
  1. Find the number of worms in the sample with lengths in the class \(0 \leqslant l < 2\). [2]
  2. Find an estimate of the number of worms in the sample with lengths in the range \(4.5 \leqslant l < 5.5\). [3]
SPS SPS SM Statistics 2024 September Q2
4 marks Moderate -0.8
A factory buys 10\% of its components from supplier \(A\), 30\% from supplier \(B\) and the rest from supplier \(C\). It is known that 6\% of the components it buys are faulty. Of the components bought from supplier \(A\), 9\% are faulty and of the components bought from supplier \(B\), 3\% are faulty.
  1. Find the percentage of components bought from supplier \(C\) that are faulty. [3]
A component is selected at random.
  1. Explain why the event "the component was bought from supplier \(B\)" is not statistically independent from the event "the component is faulty". [1]
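The first part is an application of the law of total probability, which can be sketched as follows. The dictionary names are illustrative.

```python
from fractions import Fraction

share = {"A": Fraction(10, 100), "B": Fraction(30, 100)}
share["C"] = 1 - share["A"] - share["B"]          # the rest come from C

faulty_rate = {"A": Fraction(9, 100), "B": Fraction(3, 100)}
overall_faulty = Fraction(6, 100)

# Total probability: overall = sum over suppliers of share * faulty rate
faulty_rate["C"] = (
    overall_faulty
    - share["A"] * faulty_rate["A"]
    - share["B"] * faulty_rate["B"]
) / share["C"]

# Independence would need P(faulty | B) = P(faulty)
independent_of_b = faulty_rate["B"] == overall_faulty

print(f"C faulty rate = {float(faulty_rate['C']):.0%}, independent: {independent_of_b}")
```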
SPS SPS SM Statistics 2024 September Q3
11 marks Standard +0.3
The discrete random variable \(X\) takes values 1, 2, 3, 4 and 5, and its probability distribution is defined as follows. $$\mathrm{P}(X = x) = \begin{cases} a & x = 1, \\ \frac{1}{2}\mathrm{P}(X = x - 1) & x = 2, 3, 4, 5, \\ 0 & \text{otherwise,} \end{cases}$$ where \(a\) is a constant.
  1. Show that \(a = \frac{16}{31}\). [2]
The discrete probability distribution for \(X\) is given in the table.
| \(x\) | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| P\((X = x)\) | \(\frac{16}{31}\) | \(\frac{8}{31}\) | \(\frac{4}{31}\) | \(\frac{2}{31}\) | \(\frac{1}{31}\) |
  1. Find the probability that \(X\) is odd. [1]
Two independent values of \(X\) are chosen, and their sum \(S\) is found.
  1. Find the probability that \(S\) is odd. [2]
  2. Find the probability that \(S\) is greater than 8, given that \(S\) is odd. [3]
Sheila sometimes needs several attempts to start her car in the morning. She models the number of attempts she needs by the discrete random variable \(Y\) defined as follows. $$\mathrm{P}(Y = y + 1) = \frac{1}{2}\mathrm{P}(Y = y) \quad \text{for all positive integers } y.$$
  1. Find P\((Y = 1)\). [2]
  2. Give a reason why one of the variables, \(X\) or \(Y\), might be more appropriate as a model for the number of attempts that Sheila needs to start her car. [1]
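The numerical parts can be checked with exact fractions. The sketch below enumerates ordered pairs for the sum \(S\) and, for \(Y\), uses the sum to infinity of a geometric series with common ratio 1/2. Variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# P(X = x) is a, a/2, a/4, a/8, a/16 for x = 1..5
weights = [Fraction(1, 2**i) for i in range(5)]
a = 1 / sum(weights)
pmf = {x: a * weights[x - 1] for x in range(1, 6)}

p_x_odd = sum(p for x, p in pmf.items() if x % 2 == 1)

# S = X1 + X2 for two independent values of X
pairs = list(product(pmf, repeat=2))
p_s_odd = sum(pmf[x1] * pmf[x2] for x1, x2 in pairs if (x1 + x2) % 2 == 1)
p_s_odd_and_gt8 = sum(pmf[x1] * pmf[x2] for x1, x2 in pairs
                      if (x1 + x2) % 2 == 1 and x1 + x2 > 8)
p_gt8_given_odd = p_s_odd_and_gt8 / p_s_odd

# Y: probabilities p, p/2, p/4, ... sum to p / (1 - 1/2) = 1
ratio = Fraction(1, 2)
p_y1 = 1 - ratio

print(a, p_x_odd, p_s_odd, p_gt8_given_odd, p_y1)
```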
SPS SPS SM Statistics 2024 September Q4
7 marks Easy -1.8
The radar diagrams illustrate some population figures from the 2011 census results. \includegraphics{figure_4} Each radius represents an age group, as follows:
| Radius | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Age group | 0-17 | 18-29 | 30-44 | 45-59 | 60-74 | 75+ |
The distance of each dot from the centre represents the number of people in the relevant age group.
  1. The scales on the two diagrams are different. State an advantage and a disadvantage of using different scales in order to make comparisons between the ages of people in these two Local Authorities. [2]
  2. Approximately how many people aged 45 to 59 were there in Liverpool? [1]
  3. State the main two differences between the age profiles of the two Local Authorities. [2]
  4. James makes the following claim. "Assuming that there are no significant movements of population either into or out of the two regions, the 2021 census results are likely to show an increase in the number of children in Liverpool and a decrease in the number of children in Rutland." Use the radar diagrams to give a justification for this claim. [2]
SPS SPS SM Statistics 2024 September Q5
10 marks Moderate -0.3
At a factory that makes crockery the quality control department has found that 10\% of plates have minor faults. These are classed as 'seconds'. Plates are stored in batches of 12. The number of seconds in a batch is denoted by \(X\).
  1. State an appropriate distribution with which to model \(X\). Give the value(s) of any parameter(s) and state any assumptions required for the model to be valid. [4]
Assume now that your model is valid.
  1. Find
    1. P\((X = 3)\), [2]
  2. A random sample of 4 batches is selected. Find the probability that the number of these batches that contain at least 1 second is fewer than 3. [4]
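Assuming the model is B(12, 0.1), the calculations can be sketched as below. The last part nests one binomial inside another: each batch independently contains at least one second with the same probability. Only P\((X = 3)\) is computed for the "Find" part, since that is the only sub-part shown.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Number of seconds in a batch: X ~ B(12, 0.1)
p_x3 = binom_pmf(3, 12, 0.1)

# Probability that a batch contains at least one second
p_at_least_one = 1 - binom_pmf(0, 12, 0.1)

# Number of such batches among 4: W ~ B(4, p_at_least_one); want P(W < 3)
p_fewer_than_3 = sum(binom_pmf(k, 4, p_at_least_one) for k in range(3))

print(f"{p_x3:.4f} {p_at_least_one:.4f} {p_fewer_than_3:.4f}")
```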
SPS SPS SM Statistics 2024 September Q6
11 marks Standard +0.3
A television company believes that the proportion of households that can receive Channel C is 0.35.
  1. In a random sample of 14 households it is found that 2 can receive Channel C. Test, at the 2.5\% significance level, whether there is evidence that the proportion of households that can receive Channel C is less than 0.35. [7]
  2. On another occasion the test is carried out again, with the same hypotheses and significance level as in part (i), but using a new sample, of size \(n\). It is found that no members of the sample can receive Channel C. Find the largest value of \(n\) for which the null hypothesis is not rejected. Show all relevant working. [4]
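Both parts reduce to lower-tail binomial probabilities under \(p = 0.35\), sketched below. The search bound of 50 for \(n\) is an arbitrary illustrative limit, and the function name is not from the source.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

p0, alpha = 0.35, 0.025

# Part 1: H0: p = 0.35, H1: p < 0.35, observed 2 out of 14
p_value = binom_cdf(2, 14, p0)
reject_h0 = p_value < alpha
print(f"P(X <= 2) = {p_value:.4f}, reject H0: {reject_h0}")

# Part 2: with 0 successes, H0 is not rejected while P(X = 0) = 0.65^n > alpha
n_max = max(n for n in range(1, 50) if (1 - p0) ** n > alpha)
print(f"largest n = {n_max}")
```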
SPS SPS SM Statistics 2024 September Q7
4 marks Moderate -0.3
The Venn diagram shows the numbers of students studying various subjects, in a year group of 100 students. \includegraphics{figure_7} A student is chosen at random from the 100 students. Then another student is chosen from the remaining students. Find the probability that the first student studies History and the second student studies Geography but not Psychology. [4]