SPS SM Statistics 2026 January

Question 1
1. A telephone directory contains 50000 names. A researcher wishes to select a systematic sample of 100 names from the directory.
  1. Explain in detail how the researcher should obtain such a sample.
  2. Give one advantage and one disadvantage of
    1. quota sampling,
    2. systematic sampling.
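One possible way to picture the systematic sample in Question 1 is the sketch below. It assumes the names are numbered 1 to 50000 in directory order; the function name `systematic_sample` and the fixed seed are illustrative choices, not part of the question.

```python
import random

def systematic_sample(population_size, sample_size, rng=None):
    """Return the 1-based positions of a systematic sample (illustrative helper)."""
    if rng is None:
        rng = random.Random()
    step = population_size // sample_size   # sampling interval: 50000 // 100 = 500
    start = rng.randint(1, step)            # random start among the first 500 names
    # Take the starting name and then every 500th name after it.
    return [start + i * step for i in range(sample_size)]

# Assumption: a fixed seed is used only to make the sketch repeatable.
positions = systematic_sample(50000, 100, random.Random(2026))
print(positions[:3], positions[-1])
```

The random start matters: without it, the name at position 1 would always be chosen and the sample would not be random at all.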
Question 2
2. Each member of a group of 27 people was timed when completing a puzzle.
The time taken, \(x\) minutes, for each member of the group was recorded.
These times are summarised in the following box and whisker plot.
\includegraphics[max width=\textwidth, alt={}, center]{fdff6575-679e-4d25-ad43-e9d343c1746f-06_346_1383_427_278}
  1. Find the range of the times.
  2. Find the interquartile range of the times.
For these 27 people \(\sum x = 607.5\) and \(\sum x ^ { 2 } = 17623.25\)
  3. calculate the mean time taken to complete the puzzle,
  4. calculate the standard deviation of the times taken to complete the puzzle.
Taruni defines an outlier as a value more than 3 standard deviations above the mean.
  5. State how many outliers Taruni would say there are in these data, giving a reason for your answer.
Adam and Beth also completed the puzzle in \(a\) minutes and \(b\) minutes respectively, where \(a > b\).
When their times are included with the data of the other 27 people
    • the median time increases
    • the mean time does not change
  6. Suggest a possible value for \(a\) and a possible value for \(b\), explaining how your values satisfy the above conditions.
  7. Without carrying out any further calculations, explain why the standard deviation of all 29 times will be lower than your answer to part (d).
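The summary totals in Question 2 are enough to check the mean and standard deviation numerically. The sketch below assumes the standard deviation is the one that divides by \(n\) (not \(n - 1\)); the variable names are illustrative. Counting outliers still needs the largest value from the box and whisker plot, which is not reproduced here.

```python
import math

n = 27
sum_x = 607.5        # total of the 27 times
sum_x2 = 17623.25    # total of the squared times

mean = sum_x / n
# Assumption: variance taken as (sum of squares / n) - mean^2.
variance = sum_x2 / n - mean ** 2
sd = math.sqrt(variance)

# Taruni's rule: an outlier is more than 3 standard deviations above the mean.
outlier_threshold = mean + 3 * sd

print(f"mean = {mean:.2f}, sd = {sd:.2f}, threshold = {outlier_threshold:.2f}")
```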
Question 3
3. Researchers investigated the change in the numbers of people in employment using underground, metro, light rail or tram (UMLRT) between 2001 and 2011. The data are combined for those Local Authorities (LAs) with UMLRT stations into five regions: Birmingham, Liverpool, Manchester, Sheffield and Rotherham, and Tyne and Wear.
Fig. 1 shows the total numbers of people in employment in those LAs. Fig. 2 shows the total numbers of people in employment who use UMLRT in those LAs.
\begin{figure}[h]
\captionsetup{labelformat=empty}
\caption{Fig. 1}
\includegraphics[alt={},max width=\textwidth]{fdff6575-679e-4d25-ad43-e9d343c1746f-08_834_1694_836_166}
\end{figure}
\begin{figure}[h]
\captionsetup{labelformat=empty}
\caption{Fig. 2}
\includegraphics[alt={},max width=\textwidth]{fdff6575-679e-4d25-ad43-e9d343c1746f-08_833_1694_1822_166}
\end{figure}
  1. Use these charts to explain which of Birmingham and Liverpool has the larger proportion of people in employment who used UMLRT in 2011.
One of the researchers says, "Between 2001 and 2011, the increase in the number of people in employment who use UMLRT is greatest in Tyne and Wear."
Sam says, "But what matters more is which region has the greatest increase in the proportion of people in employment who use UMLRT."
  2. Give a reason why the planners responsible for the building of trains and the maintenance of infrastructure might disagree with Sam.
  3. Explain whether those responsible for encouraging the greater use of public transport would agree with Sam.
  4. The charts are compiled from data in the Large Data Set by using those LAs which contain UMLRT stations in each region. Explain a disadvantage of using these data.
Question 4
4. Patrick is practising his skateboarding skills. On each day, he has 30 attempts at performing a difficult trick. Every time he attempts the trick, there is a probability of 0.2 that he will fall off his skateboard.
Assume that the number of times he falls off on any given day may be modelled by a binomial distribution.
  1. (i) Find the mean number of times he falls off in a day.
     (ii) Find the variance of the number of times he falls off in a day.
  2. (i) Find the probability that, on a particular day, he falls off exactly 10 times.
     (ii) Find the probability that, on a particular day, he falls off 5 or more times.
  3. Patrick has 30 attempts to perform the trick on each of 5 consecutive days.
     (i) Calculate the probability that he will fall off his skateboard at least 5 times on each of the 5 days.
     (ii) Explain why it may be unrealistic to use the same value of 0.2 for the probability of falling off for all 5 days.
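The binomial calculations in Question 4 can be checked with a short sketch using only the standard library. It assumes \(X \sim \mathrm{B}(30, 0.2)\) each day and, for the five-day probability, that the days are independent; the helper name `binom_pmf` is illustrative.

```python
from math import comb

n, p = 30, 0.2

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

mean = n * p                    # E(X) = np
variance = n * p * (1 - p)      # Var(X) = np(1 - p)

p_exactly_10 = binom_pmf(10, n, p)
# P(X >= 5) = 1 - P(X <= 4)
p_at_least_5 = 1 - sum(binom_pmf(k, n, p) for k in range(5))
# Assumption: the 5 days are independent with the same probability each day.
p_all_5_days = p_at_least_5 ** 5

print(mean, variance)
print(round(p_exactly_10, 4), round(p_at_least_5, 4), round(p_all_5_days, 4))
```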
Question 5
5. The proportion of left-handed adults in a country is 10\%.
Freya believes that the proportion of left-handed adults under the age of 25 in this country is different from 10\%.
She takes a random sample of 40 adults under the age of 25 from this country to investigate her belief.
  1. Find the critical region for a suitable test to assess Freya's belief. You should
    • state your hypotheses clearly
    • use a \(5 \%\) level of significance
    • state the probability of rejection in each tail
  2. Given that the null hypothesis is true, what is the probability of it being rejected in part (a)?
In Freya's sample 7 adults were left-handed.
  3. With reference to your answer in part (a), comment on Freya's belief.
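A critical region for Question 5 can be found by scanning the cumulative probabilities of \(\mathrm{B}(40, 0.1)\). The sketch below assumes a two-tailed test with \(H_0: p = 0.1\), \(H_1: p \neq 0.1\), and that each tail probability must be at most 0.025; a different convention (for example, "as close as possible to 0.025") could give a different region. All variable names are illustrative.

```python
from math import comb

n, p0, tail = 40, 0.1, 0.025

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# cdf[c] holds P(X <= c) under the null hypothesis.
cdf = []
running = 0.0
for k in range(n + 1):
    running += binom_pmf(k, n, p0)
    cdf.append(running)

# Lower tail: largest c with P(X <= c) <= 0.025 (assumed rule).
lower = max(c for c in range(n + 1) if cdf[c] <= tail)
# Upper tail: smallest c with P(X >= c) = 1 - P(X <= c - 1) <= 0.025.
upper = min(c for c in range(1, n + 1) if 1 - cdf[c - 1] <= tail)

lower_tail_prob = cdf[lower]
upper_tail_prob = 1 - cdf[upper - 1]
actual_significance = lower_tail_prob + upper_tail_prob

observed = 7
in_critical_region = observed <= lower or observed >= upper

print(f"Critical region: X <= {lower} or X >= {upper}")
print(f"Tail probabilities: {lower_tail_prob:.4f}, {upper_tail_prob:.4f}")
print(f"Observed value 7 in critical region: {in_critical_region}")
```

The sum of the two tail probabilities is the probability of rejecting a true null hypothesis, which is generally not exactly 5% because the distribution is discrete.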
Question 6
6. Skilled operators make a particular component for an engine. The company believes that the time taken to make this component may be modelled by the normal distribution. They timed one of their operators, Sheila, over a long period. They found that she takes over 90 minutes to make a component \(10 \%\) of the time, and less than 70 minutes \(20 \%\) of the time. Estimate the mean and standard deviation of the time Sheila takes to make a component.
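The two percentages in Question 6 give two simultaneous equations in \(\mu\) and \(\sigma\). The sketch below assumes the normal model holds exactly and uses the standard library's inverse normal function to get the \(z\)-values; the variable names are illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()

# P(T > 90) = 0.10  means  P(T < 90) = 0.90
z_upper = standard_normal.inv_cdf(0.90)
# P(T < 70) = 0.20
z_lower = standard_normal.inv_cdf(0.20)

# 90 = mu + z_upper * sigma and 70 = mu + z_lower * sigma;
# subtracting the equations eliminates mu.
sigma = (90 - 70) / (z_upper - z_lower)
mu = 90 - z_upper * sigma

print(f"mu = {mu:.1f} minutes, sigma = {sigma:.2f} minutes")
```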
Question 7
7. A team game involves solving puzzles to escape from a room.
Using data from the past, the mean time to solve the puzzles and escape from one of these rooms is 65 minutes with a standard deviation of 11.3 minutes. After recent changes to the puzzles in the room, it is claimed that the mean time to solve the puzzles and escape has changed. To test this claim, a random sample of 100 teams is selected.
The total time to solve the puzzles and escape for the 100 teams is 6780 minutes.
Assuming that the times are normally distributed, test at the \(2 \%\) level the claim that the mean time has changed.
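The test in Question 7 can be sketched as a two-tailed \(z\)-test on the sample mean. The sketch assumes \(H_0: \mu = 65\), \(H_1: \mu \neq 65\), and that the standard deviation is still 11.3 minutes after the changes; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 65        # mean under the null hypothesis
sigma = 11.3    # assumption: standard deviation unchanged by the new puzzles
n = 100
total_time = 6780
alpha = 0.02

sample_mean = total_time / n
standard_error = sigma / sqrt(n)
z = (sample_mean - mu0) / standard_error

# Two-tailed test: 1% in each tail.
critical_z = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_h0 = abs(z) > critical_z

print(f"z = {z:.3f}, critical value = {critical_z:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```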
Question 8
8. The discrete random variable \(R\) takes even integer values from 2 to \(2 n\) inclusive.
The probability distribution of \(R\) is given by $$\mathrm { P } ( R = r ) = \frac { r } { k } \quad r = 2,4,6 , \ldots , 2 n$$ where \(k\) is a constant.
  1. Show that \(k = n ( n + 1 )\)
When \(n = 20\)
  2. find the exact value of \(\mathrm { P } ( 16 \leqslant R < 26 )\)
When \(n = 20\), a random value \(g\) of \(R\) is taken and the quadratic equation in \(x\) $$x ^ { 2 } + g x + 3 g = 5$$ is formed.
  3. Find the exact probability that the equation has no real roots.
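The exact probabilities in Question 8 can be checked by enumerating the distribution with exact fractions. The sketch below takes \(n = 20\) and uses the discriminant of \(x^2 + gx + (3g - 5) = 0\) for the no-real-roots condition; the variable names are illustrative.

```python
from fractions import Fraction

n = 20
values = range(2, 2 * n + 1, 2)   # R takes the values 2, 4, ..., 2n

# The probabilities r / k must sum to 1, so k is the sum of the values.
k = sum(values)
k_matches_formula = (k == n * (n + 1))

# P(16 <= R < 26)
p_range = sum(Fraction(r, k) for r in values if 16 <= r < 26)

# No real roots when the discriminant g^2 - 4(3g - 5) is negative.
p_no_real_roots = sum(
    Fraction(g, k) for g in values if g * g - 4 * (3 * g - 5) < 0
)

print(k, k_matches_formula, p_range, p_no_real_roots)
```

This only verifies the formula for \(k\) at \(n = 20\); the "show that" part still needs the general sum \(2 + 4 + \ldots + 2n = n(n + 1)\).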
Question 9
9. The Venn diagram, where \(p , q\) and \(r\) are probabilities, shows the events \(A , B , C\) and \(D\) and associated probabilities.
\includegraphics[max width=\textwidth, alt={}, center]{fdff6575-679e-4d25-ad43-e9d343c1746f-22_623_1130_326_438}
  1. State any pair of mutually exclusive events from \(A\), \(B\), \(C\) and \(D\)
The events \(B\) and \(C\) are independent.
  2. Find the value of \(p\)
  3. Find the greatest possible value of \(\mathrm { P } \left( A \mid B ^ { \prime } \right)\)
Given that \(\mathrm { P } \left( B \mid A ^ { \prime } \right) = 0.5\)
  4. find the value of \(q\) and the value of \(r\)
  5. Find \(\mathrm { P } \left( [ A \cup B ] ^ { \prime } \cap C \right)\)
  6. Use set notation to write an expression for the event with probability \(p\)