SPS SM Statistics 2024 April

Question 1
1. The masses of a random sample of 120 boulders in a certain area were recorded. The results are summarized in the histogram.
\includegraphics[max width=\textwidth, alt={}, center]{d59e9fea-31cb-4b6d-b1d6-f09f912b5b37-04_773_1765_402_148}
  (a) Calculate the number of boulders with masses between 60 and 65 kg.
  (b) (i) Use midpoints to find estimates of the mean and standard deviation of the masses of the boulders in the sample.
      (ii) Explain why your answers are only estimates.
  (c) Use your answers to part (b)(i) to determine an estimate of the number of outliers, if any, in the distribution.
  (d) Give one advantage of using a histogram rather than a pie chart in this context.
Question 2 11 marks
2.
  1. A certain five-sided die is biased, with faces numbered 0 to 4. The score, \(Y\), on each throw is a random variable with probability distribution given by:
    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
    \(Y\) & 0 & 1 & 2 & 3 & 4 \\
    \hline
    \(\mathrm{P}(Y = y)\) & \(a\) & \(b\) & \(c\) & 0.1 & 0.15 \\
    \hline
    \end{tabular}
    where \(a\), \(b\) and \(c\) are constants.
    $$\begin{aligned} & \mathrm{P}(Y = 1) = \mathrm{P}(Y \geq 3) \\ & \mathrm{P}(Y = 0) = \mathrm{P}(Y = 2) - 0.1 \end{aligned}$$
    Find the values of \(a\), \(b\) and \(c\).
    [4 marks]
  2. The same die is thrown 10 times. Find the probability that there are not more than 4 throws on which the score is 3, stating the distribution used as well as any modelling assumptions made.
    [4 marks]
  3. A game uses the same biased die. The die is thrown once. If it shows 1, 3 or 4 then this number is the final score. If it shows 0 or 2 then the die is thrown again and the final score is the sum of the numbers shown on the two throws.
    (a) Find the probability that the final score is 3.
    (b) Given that the die is thrown twice, find the probability that the final score is 3.
    [3 marks]
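
The calculations asked for in Question 2 can be checked numerically. The following is a minimal sketch in Python, not a mark scheme: the function and variable names are assumptions introduced here, and the binomial model in part 2 assumes that the throws are independent and that the probability of scoring 3 stays at 0.1 on every throw.

```python
from fractions import Fraction
from math import comb

P3 = Fraction(1, 10)    # P(Y = 3), given in the table
P4 = Fraction(15, 100)  # P(Y = 4), given in the table

def solve_abc():
    """Solve for a, b, c using the two stated conditions and total probability 1."""
    b = P3 + P4                    # P(Y = 1) = P(Y >= 3)
    a_plus_c = 1 - b - P3 - P4     # probabilities must sum to 1
    # P(Y = 0) = P(Y = 2) - 0.1, so a = c - 0.1 and a + c = a_plus_c
    c = (a_plus_c + Fraction(1, 10)) / 2
    a = c - Fraction(1, 10)
    return a, b, c

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(k + 1))

a, b, c = solve_abc()
prob = {0: a, 1: b, 2: c, 3: P3, 4: P4}

# Part 2: number of threes in 10 throws modelled as B(10, 0.1)
at_most_4_threes = binom_cdf(4, 10, prob[3])

# Part 3: final score 3 arises from a first throw of 3, or 0 then 3, or 2 then 1
two_throw_score_3 = prob[0] * prob[3] + prob[2] * prob[1]
score_3 = prob[3] + two_throw_score_3
thrown_twice = prob[0] + prob[2]
score_3_given_twice = two_throw_score_3 / thrown_twice

print(a, b, c)
print(float(at_most_4_threes))
print(score_3, score_3_given_twice)
```

Exact fractions are used so that the constraint "probabilities sum to 1" is not blurred by floating-point rounding.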
Question 3
3. The table shows the increases, between 2001 and 2011, in the percentages of employees travelling to work by various methods, in the Local Authorities (LAs) in the North East region of the UK.
\section*{Increase in percentage of employees travelling to work by various methods}
\begin{tabular}{|l|l|r|r|r|r|r|r|}
\hline
Geography code & Local authority & Work mainly at or from home & Underground, metro, light rail or tram & Bus, minibus or coach & Driving a car or van & Passenger in a car or van & On foot \\
\hline
E06000047 & County Durham & 0.74\% & 0.05\% & -1.50\% & 4.58\% & -2.99\% & -0.97\% \\
\hline
E06000005 & Darlington & 0.26\% & -0.01\% & -3.25\% & 3.06\% & -1.28\% & 0.99\% \\
\hline
E08000020 & Gateshead & -0.01\% & -0.01\% & -2.28\% & 4.62\% & -2.35\% & -0.18\% \\
\hline
E06000001 & Hartlepool & 0.03\% & -0.04\% & -1.62\% & 4.80\% & -2.38\% & -0.26\% \\
\hline
E06000002 & Middlesbrough & -0.34\% & -0.01\% & -2.32\% & 2.19\% & -1.33\% & 0.67\% \\
\hline
E08000021 & Newcastle upon Tyne & 0.10\% & -0.23\% & -0.67\% & -0.48\% & -1.51\% & 1.75\% \\
\hline
E08000022 & North Tyneside & 0.05\% & 0.54\% & -1.18\% & 3.30\% & -2.21\% & -0.60\% \\
\hline
E06000048 & Northumberland & 1.39\% & -0.08\% & -0.95\% & 3.50\% & -2.37\% & -1.44\% \\
\hline
E06000003 & Redcar and Cleveland & -0.02\% & -0.01\% & -2.09\% & 4.20\% & -2.06\% & -0.49\% \\
\hline
E08000023 & South Tyneside & -0.36\% & 2.03\% & -3.05\% & 4.50\% & -2.41\% & -0.51\% \\
\hline
E06000004 & Stockton-on-Tees & 0.14\% & 0.03\% & -2.02\% & 3.52\% & -2.01\% & -0.15\% \\
\hline
E08000024 & Sunderland & 0.17\% & 1.48\% & -3.11\% & 4.89\% & -2.21\% & -0.52\% \\
\hline
\end{tabular}
The first two digits of the Geography code give the type of each of the LAs:
06: Unitary authority
07: Non-metropolitan district
08: Metropolitan borough
  (a) In what type of LA are the largest increases in percentages of people travelling by underground, metro, light rail or tram?
  (b) Identify two main changes in the pattern of travel to work in the North East region between 2001 and 2011.
Now assume the following.
    • The data refer to residents in the given LAs who are in the age range 20 to 65 at the time of each census.
    • The number of people in the age range 20 to 65 who move into or out of each given LA, or who die, between 2001 and 2011 is negligible.
  (c) Estimate the percentage of the people in the age range 20 to 65 in 2011 whose data appears in both 2001 and 2011.
  (d) In the light of your answer to part (c), suggest a reason for the changes in the pattern of travel to work in the North East region between 2001 and 2011.
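
A rough sketch of how the estimate of the percentage appearing in both censuses might be formed. It relies on an extra assumption that is not stated in the question, namely that ages are spread roughly evenly across the range 20 to 65. Under the stated assumptions, a person aged 20 to 65 in 2011 was also aged 20 to 65 in 2001 only if they were aged 30 to 65 in 2011, so

$$\frac{65 - 30}{65 - 20} = \frac{35}{45} \approx 78 \%$$

A different assumed age distribution would give a somewhat different figure, so this is an order-of-magnitude estimate only.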
Question 4
4. An online shopping company takes orders through its website. On average \(80 \%\) of orders from the website are delivered within 24 hours. The quality controller selects 10 orders at random to check when they are delivered.
  1. Find the probability that
    (A) exactly 8 of these orders are delivered within 24 hours,
    (B) at least 8 of these orders are delivered within 24 hours.
The company changes its delivery method. The quality controller suspects that the changes will mean that fewer than \(80 \%\) of orders will be delivered within 24 hours. A random sample of 18 orders is checked and it is found that 12 of them arrive within 24 hours.
  2. Write down suitable hypotheses and carry out a test at the \(5 \%\) significance level to determine whether there is any evidence to support the quality controller's suspicion.
  3. A statistician argues that it is possible that the new method could result in either better or worse delivery times. Therefore it would be better to carry out a 2-tail test at the \(5 \%\) significance level. State the alternative hypothesis for this test. Assuming that the sample size is still 18, find the critical region for this test, showing all of your calculations.
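
The binomial calculations in Question 4 can be checked with a short Python sketch. It is an illustration under the usual modelling assumption that orders are delivered independently with a constant probability, and the helper names (`binom_pmf`, `binom_cdf`, `two_tail_critical_region`) are introduced here for convenience rather than taken from the paper. The two-tail region is built by allowing at most half of the significance level in each tail, which is one common convention.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p); returns 0 when k < 0."""
    return sum(binom_pmf(r, n, p) for r in range(k + 1))

# Part 1: X ~ B(10, 0.8)
p_exactly_8 = binom_pmf(8, 10, 0.8)
p_at_least_8 = 1 - binom_cdf(7, 10, 0.8)

# Part 2: H0: p = 0.8 against H1: p < 0.8, with X ~ B(18, 0.8) under H0.
# The lower-tail probability P(X <= 12) is compared with 5%.
p_value = binom_cdf(12, 18, 0.8)
reject_h0 = p_value < 0.05

# Part 3: two-tail test, H1: p != 0.8, at most 2.5% in each tail.
def two_tail_critical_region(n, p, alpha):
    half = alpha / 2
    lower = [k for k in range(n + 1) if binom_cdf(k, n, p) <= half]
    upper = [k for k in range(n + 1) if 1 - binom_cdf(k - 1, n, p) <= half]
    return lower, upper

lower, upper = two_tail_critical_region(18, 0.8, 0.05)

print(p_exactly_8, p_at_least_8)
print(p_value, reject_h0)
print(lower, upper)
```

The printed tail probabilities are only a check; the question still requires the individual probabilities either side of each critical value to be shown.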
Question 5
5. In this question you must show detailed reasoning.
A disease that affects trees shows no visible evidence for the first few years after the tree is infected. A test has been developed to determine whether a particular tree has the disease. A positive result to the test suggests that the tree has the disease. However, the test is not \(100 \%\) reliable, and a researcher uses the following model.
  • If the tree has the disease, the probability of a positive result is 0.95.
  • If the tree does not have the disease, the probability of a positive result is 0.1.
  1. It is known that in a certain county, \(A\), \(35 \%\) of the trees have the disease. A tree in county \(A\) is chosen at random and is tested. Given that the result is positive, determine the probability that this tree has the disease.
A forestry company wants to determine what proportion of trees in another county, \(B\), have the disease. They choose a large random sample of trees in county \(B\). Each tree in the sample is tested and it is found that the result is positive for \(43 \%\) of these trees.
  2. By carrying out a calculation, determine an estimate of the proportion of trees in county \(B\) that have the disease.
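
Both parts of Question 5 rest on the law of total probability for a positive result. The Python sketch below is a numerical check only, since the question requires detailed reasoning to be shown; the function names are hypothetical, and the county \(B\) estimate assumes the sample proportion of positive results equals the true probability of a positive result.

```python
def posterior_disease(prior, p_pos_given_disease, p_pos_given_healthy):
    """P(disease | positive) by Bayes' theorem."""
    true_positive = p_pos_given_disease * prior
    false_positive = p_pos_given_healthy * (1 - prior)
    return true_positive / (true_positive + false_positive)

def prevalence_from_positive_rate(pos_rate, p_pos_given_disease, p_pos_given_healthy):
    """Solve pos_rate = s*p + f*(1 - p) for the disease proportion p."""
    return (pos_rate - p_pos_given_healthy) / (p_pos_given_disease - p_pos_given_healthy)

# County A: 35% of trees have the disease
county_a = posterior_disease(0.35, 0.95, 0.1)

# County B: 43% of sampled trees test positive
county_b = prevalence_from_positive_rate(0.43, 0.95, 0.1)

print(county_a, county_b)
```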