WJEC Further Unit 2 (Further Unit 2) 2022 June

Question 1
  1. The probability distribution for the prize money, \(\pounds X\) per ticket, in a local fundraising lottery is shown below.
$$\begin{array} { c | c c c c } x & 0 & 2 & 100 & 1000 \\ \hline \mathrm { P } ( X = x ) & 0.9 & 0.09 & p & 0.0001 \end{array}$$
  1. Calculate the value of \(p\).
  2. Find \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\).
    1. What is the minimum lottery ticket price that the organiser should set in order to make a profit in the long run?
    2. Suggest why, in practice, people would be prepared to pay more than this minimum price.
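The arithmetic behind parts 1 and 2 can be sketched as follows. This is only an illustrative check, and it assumes the flattened table reads as prize values 0, 2, 100 and 1000 with probabilities 0.9, 0.09, \(p\) and 0.0001; the variable names are hypothetical.

```python
from fractions import Fraction

# Assumed reading of the table: prizes 0, 2, 100, 1000 (in pounds),
# with the probability of the 100 prize unknown (p).
prizes = [0, 2, 100, 1000]
known = {0: Fraction("0.9"), 2: Fraction("0.09"), 1000: Fraction("0.0001")}

# Probabilities must sum to 1, which fixes p.
p = 1 - sum(known.values())
probs = {**known, 100: p}

# E(X) = sum of x * P(X = x); Var(X) = E(X^2) - E(X)^2.
mean = sum(x * probs[x] for x in prizes)
var = sum(x * x * probs[x] for x in prizes) - mean**2

print(float(p), float(mean), float(var))
```

Under this reading, any ticket price above \(\mathrm { E } ( X )\) gives the organiser a positive expected profit per ticket.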
Question 2
2. An economist suggested the rate of unemployment and the rate of wage inflation are independent. Amy sets about investigating this suggestion. She collects unemployment data and wage inflation data from a random sample of regions in the UK and decides that it is appropriate to carry out a significance test on Pearson's product moment correlation coefficient. Amy's summary statistics for percentage unemployment, \(x\), and percentage wage inflation, \(y\), are shown below. $$\begin{array} { l l l } \sum x = 62.8 & \sum y = 19.4 & n = 10 \\
\sum x ^ { 2 } = 413.44 & \sum y ^ { 2 } = 46.16 & \sum x y = 113.16 \end{array}$$
  1. Calculate Pearson's product moment correlation coefficient for these data.
  2. Carry out Amy's test at the \(5 \%\) level of significance and state whether the economist's suggestion is reasonable.
     Amy also collects unemployment data and wage inflation data from a random sample of 10 regions in Spain and calculates Pearson's product moment correlation coefficient to be \(-0.2525\).
  3. Should this change Amy's opinion on the economist's suggestion above? What could she do to improve her investigation?
  4. What assumption has Amy made in deciding that it is appropriate to carry out a significance test on Pearson's product moment correlation coefficient?
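As a rough check on part 1, the correlation coefficient can be computed from the summary statistics using \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\). The sketch below assumes the decimal readings 62.8, 19.4, 413.44, 46.16 and 113.16 for the sums; the variable names are illustrative only.

```python
import math

# Summary statistics as given in the question (n = 10 UK regions).
n = 10
sum_x, sum_y = 62.8, 19.4
sum_x2, sum_y2, sum_xy = 413.44, 46.16, 113.16

# Corrected sums of squares and products.
s_xx = sum_x2 - sum_x**2 / n
s_yy = sum_y2 - sum_y**2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Pearson's product moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)
print(round(r, 4))
```

For part 2, the magnitude of \(r\) would then be compared with the tabulated two-tailed critical value for \(n = 10\) at the \(5 \%\) level.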
Question 3
3. Two basketball players, Steph and Klay, score baskets at random at a rate of \(2.1\) and \(1.9\) respectively per quarter of a game. Assume that baskets are scored independently, and that Steph and Klay each play all four quarters of the game.
  1. Stating the model that you are using, find the probability that they will score a combined total of exactly 20 baskets in a randomly selected game.
  2. A quarter of a game lasts 12 minutes.
    1. State the distribution of the time between baskets for Steph. Give the mean and standard deviation of this distribution.
    2. Given that Klay scores at the end of the third minute in a quarter of a game, find the probability that Klay doesn't score for the rest of the quarter.
  3. When practising, Klay misses \(4 \%\) of the free throws he takes. One week he takes 530 free throws. Calculate the probability that he misses more than 25 free throws.
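One possible way to check the numerical parts of this question is sketched below. It assumes a Poisson model for baskets (so waiting times are exponential and memoryless), rates of 2.1 and 1.9 baskets per 12-minute quarter, and an exact binomial model B(530, 0.04) for missed free throws; the helper names are hypothetical.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for X ~ Poisson(mean)."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Part 1: combined total over four quarters ~ Poisson(4 * (2.1 + 1.9)).
game_mean = 4 * (2.1 + 1.9)
p_exactly_20 = poisson_pmf(20, game_mean)

# Part 2(i): Steph's time between baskets is exponential;
# its mean and standard deviation both equal 1 / rate (in minutes).
steph_rate_per_min = 2.1 / 12
steph_mean_gap = 1 / steph_rate_per_min

# Part 2(ii): 9 minutes of the quarter remain after the third minute;
# by memorylessness, P(no basket in 9 minutes) = exp(-rate * 9).
klay_rate_per_min = 1.9 / 12
p_klay_no_score = math.exp(-klay_rate_per_min * 9)

# Part 3: P(more than 25 misses) = 1 - P(at most 25 misses).
p_more_than_25 = 1 - sum(binomial_pmf(k, 530, 0.04) for k in range(26))

print(p_exactly_20, steph_mean_gap, p_klay_no_score, p_more_than_25)
```

Part 3 is computed exactly here; an exam solution might instead use a Poisson approximation with mean \(530 \times 0.04\), which gives a similar value.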
Question 4
4. The continuous random variable \(R\) has probability density function \(f ( r )\) given by $$f ( r ) = \begin{cases} k r ( b - r ) & \text { for } 1 \leqslant r \leqslant 4 , \\
0 & \text { otherwise } , \end{cases}$$ where \(k\) and \(b\) are positive constants.
  1. Explain why \(b \geqslant 4\).
  2. Given that \(b = 4\),
    1. show that \(k = \frac { 1 } { 9 }\),
    2. find an expression for \(F ( r )\), valid for \(1 \leqslant r \leqslant 4\), where \(F\) denotes the cumulative distribution function of \(R\),
    3. find the probability that \(R\) lies between 2 and 3.
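The integration in part 2 can be checked with exact fractions. The sketch below assumes \(b = 4\), so that \(r ( 4 - r ) = 4 r - r ^ { 2 }\) has antiderivative \(2 r ^ { 2 } - r ^ { 3 } / 3\); the function names are illustrative.

```python
from fractions import Fraction

def antiderivative(r):
    """Antiderivative of r(4 - r) = 4r - r^2, namely 2r^2 - r^3/3."""
    r = Fraction(r)
    return 2 * r**2 - r**3 / 3

# The density must integrate to 1 over [1, 4], which fixes k.
area = antiderivative(4) - antiderivative(1)
k = 1 / area

def cdf(r):
    """F(r) for 1 <= r <= 4: k times the integral of t(4 - t) from 1 to r."""
    return k * (antiderivative(r) - antiderivative(1))

# P(2 < R < 3) = F(3) - F(2).
prob_2_to_3 = cdf(3) - cdf(2)
print(k, cdf(4), prob_2_to_3)
```

Checking that `cdf(1)` is 0 and `cdf(4)` is 1 is a quick sanity test on any closed-form expression for \(F ( r )\).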
Question 5
5. John has a game that involves throwing a set of three identical, cubical dice with faces numbered 1 to 6. He wishes to investigate whether these dice are fair in terms of the number of sixes obtained when they are thrown. John throws the set of three dice 1100 times and records the number of sixes obtained for each throw. The results are shown in the table below.
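$$\begin{array} { l | c c c c } \text { Number of sixes } & 0 & 1 & 2 & 3 \\ \hline \text { Frequency } & 625 & 384 & 81 & 10 \end{array}$$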
Using these results, conduct a goodness of fit test and draw an appropriate conclusion.
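A minimal sketch of the test statistic is given below. It assumes the null hypothesis is that the number of sixes per throw follows B(3, 1/6), that the observed frequencies are 625, 384, 81 and 10, and that the test is carried out at the \(5 \%\) level (the question does not state a level); the critical value 7.815 for 3 degrees of freedom is taken from standard chi-squared tables.

```python
from math import comb

# Observed frequencies for 0, 1, 2, 3 sixes (assumed reading of the table).
observed = [625, 384, 81, 10]
n_throws = sum(observed)

# Expected frequencies under H0: number of sixes ~ B(3, 1/6).
probs = [comb(3, k) * (1 / 6) ** k * (5 / 6) ** (3 - k) for k in range(4)]
expected = [n_throws * p for p in probs]

# Chi-squared statistic: sum of (O - E)^2 / E over the four classes.
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# No parameters are estimated from the data, so dof = classes - 1.
dof = len(observed) - 1
critical_5pct = 7.815  # chi-squared, 3 degrees of freedom, 5% level (tables)

reject_h0 = chi_sq > critical_5pct
print([round(e, 2) for e in expected], round(chi_sq, 2), reject_h0)
```

All expected frequencies exceed 5 under this model, so no classes need to be combined before computing the statistic.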
Question 6
6. An online survey on the use of social media asked the following question: \begin{displayquote} "Do you use any form of social media?" \end{displayquote} The results for a total of 1953 respondents are shown in the table below.
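$$\begin{array} { l | c c c c | c }
 & \text { Age in years } & & & & \\
\text { Use social media } & \text { 18-29 } & \text { 30-49 } & \text { 50-64 } & \text { 65 or older } & \text { Total } \\ \hline
\text { Yes } & 310 & 412 & 348 & 196 & 1266 \\
\text { No } & 42 & 116 & 196 & 333 & 687 \\ \hline
\text { Total } & 352 & 528 & 544 & 529 & 1953
\end{array}$$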
To test whether there is a relationship between social media use and age, a significance test is carried out at the \(5 \%\) level.
  1. State the null and alternative hypotheses.
  2. Show how the expected frequency \(228.18\) is calculated in the table below.
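    $$\begin{array} { l | c c c c }
    \text { Expected values } & \text { Age in years } & & & \\
    \text { Use social media } & \text { 18-29 } & \text { 30-49 } & \text { 50-64 } & \text { 65 or older } \\ \hline
    \text { Yes } & 228.18 & 342.27 & 352.64 & 342.92 \\
    \text { No } & 123.82 & 185.73 & 191.36 & 186.08
    \end{array}$$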
  3. Determine the value of \(s\) in the table below.
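    $$\begin{array} { l | c c c c }
    \text { Chi-squared contributions } & \text { Age in years } & & & \\
    \text { Use social media } & \text { 18-29 } & \text { 30-49 } & \text { 50-64 } & \text { 65 or older } \\ \hline
    \text { Yes } & 29.34 & s & 0.06 & 62.94 \\
    \text { No } & 54.07 & 26.18 & 0.11 & 115.99
    \end{array}$$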
  4. Complete the significance test, showing all your working.
  5. A student, analysing these data on a spreadsheet, obtains the following output.
    \includegraphics[max width=\textwidth, alt={}, center]{77fd7ad7-f5a3-4947-afc6-e5ef45bef7a8-5_202_1271_445_415}
    Explain why the student must have made an error in calculating the \(p\)-value.
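The expected frequencies and chi-squared contributions in parts 2 to 4 follow from the observed table, and a short sketch of that calculation is shown here. It assumes the observed counts are 310, 412, 348, 196 (Yes) and 42, 116, 196, 333 (No) across the four age groups; the variable names are hypothetical.

```python
ages = ["18-29", "30-49", "50-64", "65 or older"]
observed = [
    [310, 412, 348, 196],  # Yes
    [42, 116, 196, 333],   # No
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Contribution of each cell = (O - E)^2 / E.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

s = contributions[0][1]  # the "Yes", 30-49 cell
chi_sq = sum(sum(row) for row in contributions)
dof = (len(observed) - 1) * (len(ages) - 1)

print(round(expected[0][0], 2), round(s, 2), round(chi_sq, 2), dof)
```

The resulting statistic would be compared with the tabulated \(5 \%\) critical value for the computed degrees of freedom to complete the test.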
Question 7
7. Data from a large dataset shows the percentage of children enrolled in secondary education and the percentage of the adult population who are literate. The following graphs show data from 30 randomly selected regions from each of the Arab World, Africa and Asia. In each case, the least squares regression line of '\% Literacy' on '\% Enrolled in Secondary Education' is shown.
\includegraphics[max width=\textwidth, alt={}, center]{77fd7ad7-f5a3-4947-afc6-e5ef45bef7a8-6_682_1200_584_395}
\begin{figure}[h]
\captionsetup{labelformat=empty} \caption{Africa}
\includegraphics[alt={},max width=\textwidth]{77fd7ad7-f5a3-4947-afc6-e5ef45bef7a8-6_623_1191_1548_397}
\end{figure}
\includegraphics[max width=\textwidth, alt={}, center]{77fd7ad7-f5a3-4947-afc6-e5ef45bef7a8-7_665_1200_331_434}
  1. Calculate the equation of the least squares regression line of '\% Literacy' ( \(y\) ) on '\% Enrolled in Secondary Education' ( \(x\) ) for Asia, given the following summary statistics. $$\begin{array} { l l l } \sum x = 2850.836 & \sum y = 2738.656 & S _ { x x } = 88.42142 \\
    S _ { y y } = 204.733 & S _ { x y } = 96.60984 & n = 30 \end{array}$$
  2. The Arab World, Africa and Asia each contain a region where \(70 \%\) are enrolled in secondary education. The three regression lines are used to estimate the corresponding \% Literacy. Which of these estimates is likely to be the most reliable? Clearly explain your reasoning.
\section*{END OF PAPER}