OCR MEI Further Statistics Minor 2020 November

Question 1
1 A quiz team of 4 students is to be selected from a group of 7 girls and 5 boys. The team is selected at random from the students in the group. The number of girls in the team is denoted by the random variable \(X\).
  1. Show that \(\mathrm { P } ( X = 4 ) = \frac { 7 } { 99 }\).
    Table 1 shows the probability distribution of \(X\).
    \begin{table}[h]
    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
    \(r\) & 0 & 1 & 2 & 3 & 4 \\
    \hline
    \(\mathrm { P } ( X = r )\) & \(\frac { 1 } { 99 }\) & \(\frac { 14 } { 99 }\) & \(\frac { 42 } { 99 }\) & \(\frac { 35 } { 99 }\) & \(\frac { 7 } { 99 }\) \\
    \hline
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 1}
    \end{table}
  2. Find each of the following.
    • \(\mathrm { E } ( X )\)
    • \(\operatorname { Var } ( X )\)
    It is decided that the quiz team must have at least 1 girl and at least 1 boy, but the team is still otherwise selected at random.
  3. Explain whether \(\mathrm { E } ( X )\) would be smaller than, equal to or larger than the value which you found in part (b).
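As a numerical cross-check on Table 1 and on the quantities asked for in this question, the sketch below rebuilds the distribution of \(X\) with exact fractions. It assumes the usual counting argument, \(\mathrm{P}(X = r) = \binom{7}{r}\binom{5}{4-r} / \binom{12}{4}\); the function names are illustrative and not part of the paper.

```python
from fractions import Fraction
from math import comb

GIRLS, BOYS, TEAM = 7, 5, 4  # values taken from the question


def girls_distribution(girls=GIRLS, boys=BOYS, team=TEAM):
    """Return {r: P(X = r)} for the number of girls in a random team."""
    total = comb(girls + boys, team)
    return {r: Fraction(comb(girls, r) * comb(boys, team - r), total)
            for r in range(team + 1)}


def mean_and_variance(dist):
    """Exact E(X) and Var(X) for a discrete distribution {value: prob}."""
    mean = sum(r * p for r, p in dist.items())
    second_moment = sum(r * r * p for r, p in dist.items())
    return mean, second_moment - mean ** 2


def condition_on(dist, allowed):
    """Renormalise the distribution over the allowed values only."""
    kept = {r: p for r, p in dist.items() if r in allowed}
    scale = sum(kept.values())
    return {r: p / scale for r, p in kept.items()}


dist = girls_distribution()
mean, variance = mean_and_variance(dist)

# "At least 1 girl and at least 1 boy" restricts X to the values 1, 2, 3.
mixed_mean, _ = mean_and_variance(condition_on(dist, {1, 2, 3}))

print(dist[4], mean, variance, mixed_mean)
```

Comparing `mixed_mean` with `mean` gives a numerical check on the direction of the change; the question itself asks for an explanation rather than a calculation.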
Question 2
2 On computer monitor screens there are often one or more tiny dots which are permanently dark and do not display any of the image. Such dots are known as 'dead pixels'. Dead pixels occur on screens randomly and independently of each other. A company manufactures three types of monitor, Types A, B and C. For a monitor of Type A, the screen has a total of 2 304 000 pixels. For this type of monitor, the probability of a randomly chosen pixel being dead is 1 in 500 000. Let \(X\) represent the number of dead pixels on a monitor screen of this type.
  1. Explain why you could use either a binomial distribution or a Poisson distribution to model the distribution of \(X\).
  2. Use a Poisson distribution to calculate estimates of each of the following probabilities.
    • \(\mathrm { P } ( X = 4 )\)
    • \(\mathrm { P } ( X > 4 )\)
  3. In this question you must show detailed reasoning.
    For a monitor of Type B, the probability of a randomly chosen pixel being dead is also 1 in 500 000. The screen of a monitor of Type B has a total of \(n\) pixels. Use a binomial distribution to find the least value of \(n\) for which the probability of finding at least 1 dead pixel is greater than 0.99. Give your answer in millions correct to 3 significant figures.
    For a monitor of Type C, the number of dead pixels on the screen is modelled by a Poisson distribution with mean \(\lambda\).
  4. Given that the probability of finding at least one dead pixel is 0.8, find \(\lambda\).
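The three calculations in this question can be sketched numerically as follows. This is a check under stated assumptions rather than a model answer: the Type A mean is taken as \(np = 2\,304\,000 / 500\,000\), the Type B condition is rearranged to \(n > \ln(0.01) / \ln(1 - p)\), and the Type C condition to \(\lambda = -\ln(0.2)\). All names are illustrative.

```python
from math import ceil, exp, factorial, log, log1p

PIXELS_A = 2_304_000
P_DEAD = 1 / 500_000


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam ** k / factorial(k)


# Type A: Poisson approximation with mean n * p.
lam_a = PIXELS_A * P_DEAD
p_exactly_4 = poisson_pmf(4, lam_a)
p_more_than_4 = 1 - sum(poisson_pmf(k, lam_a) for k in range(5))

# Type B: 1 - (1 - p)^n > 0.99  is equivalent to  n > ln(0.01) / ln(1 - p).
# log1p(-p) evaluates ln(1 - p) accurately for very small p.
n_min = ceil(log(0.01) / log1p(-P_DEAD))

# Type C: 1 - exp(-lambda) = 0.8  gives  lambda = -ln(0.2).
lam_c = -log(0.2)

print(lam_a, p_exactly_4, p_more_than_4, n_min / 1e6, lam_c)
```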
Question 3
3 In this question you must show detailed reasoning. In a survey into pet ownership, one of the questions was 'Do you own either a cat or a dog (or both)?'. A total of 121 people took part in the survey and you should assume that they form a random sample of people in a particular town. The results, classified by the age of the person being surveyed, are shown in Table 3.
\begin{table}[h]
\begin{tabular}{|c|c|c|c|}
\hline
\multicolumn{2}{|c|}{} & \multicolumn{2}{c|}{Ownership of cat or dog} \\
\cline{3-4}
\multicolumn{2}{|c|}{} & Does own & Does not own \\
\hline
\multirow{2}{*}{Age} & Over 45 years & 38 & 29 \\
\cline{2-4}
 & Under 45 years & 23 & 31 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty} \caption{Table 3}
\end{table}
Carry out a test at the 10\% significance level to investigate whether, for people in this town, there is any association between age and ownership of a cat or dog.
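The arithmetic behind a chi-squared test on Table 3 can be sketched as below. The sketch assumes a standard \(\chi^2\) test of association with 1 degree of freedom and no continuity correction, and it takes the 10% critical value 2.706 from standard tables; whether a correction is expected is not stated in the question. Function names are illustrative.

```python
# Observed frequencies from Table 3: rows are age groups,
# columns are "does own" and "does not own".
observed = [[38, 29],   # over 45 years
            [23, 31]]   # under 45 years

# Tabulated chi-squared critical value, 1 degree of freedom, 10% level.
CRITICAL_10_PERCENT_1_DF = 2.706


def expected_frequencies(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared_statistic(table):
    """Sum of (O - E)^2 / E over every cell, without continuity correction."""
    expected = expected_frequencies(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))


statistic = chi_squared_statistic(observed)
reject_h0 = statistic > CRITICAL_10_PERCENT_1_DF

print(round(statistic, 3), reject_h0)
```

A full answer would also state the hypotheses, list the expected frequencies and individual contributions, and give the conclusion in context.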
Question 4
4 Cards are drawn at random from a standard pack of 52 cards, one at a time, until one of the 4 aces is drawn. After each card is drawn, it is replaced in the pack before the next one is drawn. The random variable \(X\) represents the number of draws required to draw the first ace.
  1. State fully the distribution of \(X\).
  2. Find \(\mathrm { P } ( X = 10 )\).
  3. Find each of the following.
    • \(\mathrm { E } ( X )\)
    • \(\operatorname { Var } ( X )\)
    A further \(k\) aces are added to the full pack and the process described above is repeated. The random variable \(Y\) represents the number of draws required to draw the first ace.
  4. In this question you must show detailed reasoning. Given that \(\mathrm { P } ( Y = 2 ) = \frac { 8 } { 81 }\), find the two possible values of \(k\).
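A short exact-arithmetic check of this question is sketched below. It assumes \(X\) is geometric with success probability \(\frac{4}{52}\), and that after adding \(k\) aces the success probability becomes \(\frac{4 + k}{52 + k}\). The search limit of 1000 is an arbitrary assumption chosen for illustration, and the names are not from the paper.

```python
from fractions import Fraction

ACES, PACK = 4, 52

# X ~ Geo(p) with p = 4/52: P(X = r) = (1 - p)^(r - 1) * p.
p = Fraction(ACES, PACK)
p_x_equals_10 = (1 - p) ** 9 * p
mean_x = 1 / p                 # E(X) = 1 / p
var_x = (1 - p) / p ** 2       # Var(X) = (1 - p) / p^2


def p_y_equals_2(k):
    """P(Y = 2) when k extra aces are added to the 52-card pack."""
    q = Fraction(ACES + k, PACK + k)
    return (1 - q) * q


SEARCH_LIMIT = 1000  # assumed upper bound for the brute-force search
solutions = [k for k in range(1, SEARCH_LIMIT + 1)
             if p_y_equals_2(k) == Fraction(8, 81)]

print(float(p_x_equals_10), mean_x, var_x, solutions)
```

The brute-force search only confirms values; the detailed reasoning the question asks for comes from solving \(q(1 - q) = \frac{8}{81}\) for \(q\) and then for \(k\).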
Question 5
5 A student is investigating immunisation. He wonders if there is any relationship between the percentage of young children who have been given measles vaccine and the percentage who have been given BCG vaccine in various countries. He takes a random sample of 8 countries and finds the data for the two variables. The spreadsheet in Fig. 5.1 shows the values obtained, together with a scatter diagram which illustrates the data. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{882f9f3c-40d8-4abb-822a-49bd505a33ea-5_910_1653_541_246} \captionsetup{labelformat=empty} \caption{Fig. 5.1}
\end{figure}
  1. The student decides that a test based on Pearson's product moment correlation coefficient is not valid. Explain why he comes to this conclusion.
    The student carries out a test based on Spearman's rank correlation coefficient.
  2. Calculate the value of Spearman’s rank correlation coefficient.
  3. Carry out a test based on this coefficient at the \(5 \%\) significance level to investigate whether there is any association between measles and BCG vaccination levels.
    The student then decides to investigate the relationship between number of doctors per 1000 people in a country and unemployment rate in that country (unemployment rate is the percentage of the working age population who are not in paid work). He selects a random sample of 6 countries. The spreadsheet in Fig. 5.2 shows the values obtained, together with a scatter diagram which illustrates the data.
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{882f9f3c-40d8-4abb-822a-49bd505a33ea-6_776_1649_495_248} \captionsetup{labelformat=empty} \caption{Fig. 5.2}
    \end{figure}
  4. Use your calculator to write down the equation of the regression line of unemployment rate on doctors per 1000.
  5. Use the regression line to estimate the unemployment rate for a country with 2.00 doctors per 1000.
  6. Comment briefly on the reliability of your answer to part (e).
    The student decides to add the data for another country with 3.99 doctors per 1000 and unemployment rate 11.42 to his diagram.
  7. Add this point to the scatter diagram in the Printed Answer Booklet.
  8. Without doing any further calculations, comment on what difference, if any, including this extra data point would make to the usefulness of a regression line of unemployment rate on doctors per 1000.
Question 6
6
  1. The random variable \(X\) has a uniform distribution over the values \(\{ 1,2 , \ldots , n \}\). Show that \(\operatorname { Var } ( X )\) is given by \(\frac { 1 } { 12 } \left( n ^ { 2 } - 1 \right)\).
  2. The random variable \(Y\) has a uniform distribution over the values \(\{ 1,3,5 , \ldots , 2 n - 1 \}\). Using the result in part (a) or otherwise, show that \(\operatorname { Var } ( Y )\) is given by \(\frac { 1 } { 3 } \left( n ^ { 2 } - 1 \right)\).
  3. Given that \(n = 100\), find the least value of \(k\) for which \(\mathrm { P } ( \mu - k \sigma \leqslant Y \leqslant \mu + k \sigma ) = 1\), where the mean and standard deviation of \(Y\) are represented by \(\mu\) and \(\sigma\) respectively.
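The two variance results and the final part can be spot-checked numerically, as sketched below. This is a verification for a few values of \(n\), not a proof. It assumes that the least \(k\) is fixed by the largest value of \(Y\), which is as far from the mean as the smallest value by symmetry. Names are illustrative.

```python
from fractions import Fraction
from math import sqrt


def variance(values):
    """Exact variance of a uniform distribution over the given values."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / n


def x_values(n):
    """Values 1, 2, ..., n."""
    return list(range(1, n + 1))


def y_values(n):
    """Values 1, 3, 5, ..., 2n - 1."""
    return list(range(1, 2 * n, 2))


# Spot-check both formulas for a handful of n (not a proof).
for n in (2, 5, 10, 100):
    assert variance(x_values(n)) == Fraction(n * n - 1, 12)
    assert variance(y_values(n)) == Fraction(n * n - 1, 3)

# Final part with n = 100: the interval must reach the extreme value of Y.
ys = y_values(100)
mu = Fraction(sum(ys), len(ys))
sigma = sqrt(variance(ys))
least_k = float(max(ys) - mu) / sigma

print(mu, sigma, least_k)
```

The factor of 4 between the two variances reflects the relationship \(Y = 2X - 1\), which is the route suggested by "using the result in part (a)".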