OCR S1 (Statistics 1) 2008 January

Question 1
1
  1. The letters \(\mathrm { A } , \mathrm { B } , \mathrm { C } , \mathrm { D }\) and E are arranged in a straight line.
    (a) How many different arrangements are possible?
    (b) In how many of these arrangements are the letters A and B next to each other?
  2. From the letters \(\mathrm { A } , \mathrm { B } , \mathrm { C } , \mathrm { D }\) and E, two different letters are selected at random. Find the probability that these two letters are A and B.
Question 2
2 A random variable \(T\) has the distribution \(\operatorname { Geo } \left( \frac { 1 } { 5 } \right)\). Find
  1. \(\mathrm { P } ( T = 4 )\),
  2. \(\mathrm { P } ( T > 4 )\),
  3. \(\mathrm { E } ( T )\).
Question 3
3 A sample of bivariate data was taken and the results were summarised as follows. $$n = 5 \quad \Sigma x = 24 \quad \Sigma x ^ { 2 } = 130 \quad \Sigma y = 39 \quad \Sigma y ^ { 2 } = 361 \quad \Sigma x y = 212$$
  1. Show that the value of the product moment correlation coefficient \(r\) is 0.855, correct to 3 significant figures.
  2. The ranks of the data were found. One student calculated Spearman's rank correlation coefficient \(r _ { s }\), and found that \(r _ { s } = 0.7\). Another student calculated the product moment coefficient, \(R\), of these ranks. State which one of the following statements is true, and explain your answer briefly.
    (A) \(R = 0.855\)
    (B) \(R = 0.7\)
    (C) It is impossible to give the value of \(R\) without carrying out a calculation using the original data.
  3. All the values of \(x\) are now multiplied by a scaling factor of 2. State the new values of \(r\) and \(r _ { s }\).
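
The value quoted in part 1 can be checked numerically from the summary statistics. The following is a minimal sketch, assuming the standard summary-statistic form \(r = S _ { x y } / \sqrt { S _ { x x } S _ { y y } }\); the function name `pmcc_from_summary` and its parameter names are illustrative and not part of the paper.

```python
import math


def pmcc_from_summary(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy):
    """Product moment correlation coefficient from summary statistics.

    Hypothetical helper: the name and signature are assumptions made
    for this illustration only.
    """
    s_xx = sum_x2 - sum_x ** 2 / n      # S_xx = sum(x^2) - (sum x)^2 / n
    s_yy = sum_y2 - sum_y ** 2 / n      # S_yy = sum(y^2) - (sum y)^2 / n
    s_xy = sum_xy - sum_x * sum_y / n   # S_xy = sum(xy) - (sum x)(sum y) / n
    return s_xy / math.sqrt(s_xx * s_yy)


# Summary values as given in Question 3.
r = pmcc_from_summary(n=5, sum_x=24, sum_x2=130, sum_y=39, sum_y2=361, sum_xy=212)
print(round(r, 3))  # 0.855
```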
Question 4
4 A supermarket has a large stock of eggs. 40\% of the stock are from a firm called Eggzact. 12\% of the stock are brown eggs from Eggzact. An egg is chosen at random from the stock. Calculate the probability that
  1. this egg is brown, given that it is from Eggzact,
  2. this egg is from Eggzact and is not brown.
Question 5
5
  1. \(20 \%\) of people in the large town of Carnley support the Residents' Party. 12 people from Carnley are selected at random. Out of these 12 people, the number who support the Residents' Party is denoted by \(U\). Find
    (a) \(\mathrm { P } ( U \leqslant 5 )\),
    (b) \(\mathrm { P } ( U \geqslant 3 )\).
  2. \(30 \%\) of people in Carnley support the Commerce Party. 15 people from Carnley are selected at random. Out of these 15 people, the number who support the Commerce Party is denoted by \(V\). Find \(\mathrm { P } ( V = 4 )\).
Question 6
6 The probability distribution for a random variable \(Y\) is shown in the table.

| \(y\) | 1 | 2 | 3 |
|---|---|---|---|
| \(\mathrm { P } ( Y = y )\) | 0.2 | 0.3 | 0.5 |

  1. Calculate \(\mathrm { E } ( Y )\) and \(\operatorname { Var } ( Y )\).

Another random variable, \(Z\), is independent of \(Y\). The probability distribution for \(Z\) is shown in the table.

| \(z\) | 1 | 2 | 3 |
|---|---|---|---|
| \(\mathrm { P } ( Z = z )\) | 0.1 | 0.25 | 0.65 |

One value of \(Y\) and one value of \(Z\) are chosen at random. Find the probability that
  2. \(Y + Z = 3\),
  3. \(Y \times Z\) is even.
Question 7
7
  1. Andrew plays 10 tennis matches. In each match he either wins or loses.
    (a) State, in this context, two conditions needed for a binomial distribution to arise.
    (b) Assuming these conditions are satisfied, define a variable in this context which has a binomial distribution.
  2. The random variable \(X\) has the distribution \(\mathrm { B } ( 21 , p )\), where \(0 < p < 1\). Given that \(\mathrm { P } ( X = 10 ) = \mathrm { P } ( X = 9 )\), find the value of \(p\).
Question 8
8 The stem-and-leaf diagram shows the age in completed years of the members of a sports club.

| Male | | Female |
|---:|:---:|:---|
| 8 8 7 6 | 1 | 6 6 6 7 7 8 8 9 |
| 7 6 5 5 3 3 2 1 | 2 | 1 3 3 4 5 7 8 8 9 9 |
| 9 8 4 4 3 | 3 | 2 3 3 4 7 |
| 5 2 1 | 4 | 0 1 8 |
| 9 0 | 5 | 0 |

Key: 1 | 4 | 0 represents a male aged 41 and a female aged 40.
  1. Find the median and interquartile range for the males.
  2. The median and interquartile range for the females are 27 and 15 respectively. Make two comparisons between the ages of the males and the ages of the females.
  3. The mean age of the males is 30.7 and the mean age of the females is 27.5, each correct to 1 decimal place. Give one advantage of using the median rather than the mean to compare the ages of the males with the ages of the females.

A record was kept of the number of hours, \(X\), spent by each member at the club in a year. The results were summarised by $$n = 49 , \quad \Sigma ( x - 200 ) = 245 , \quad \Sigma ( x - 200 ) ^ { 2 } = 9849 .$$
  4. Calculate the mean and standard deviation of \(X\).
Question 9
9 It is thought that the pH value of sand (a measure of the sand's acidity) may affect the extent to which a particular species of plant will grow in that sand. A botanist wished to determine whether there was any correlation between the pH value of the sand on certain sand dunes, and the amount of each of two plant species growing there. She chose random sections of equal area on each of eight sand dunes and measured the pH values. She then measured the area within each section that was covered by each of the two species. The results were as follows.

| Dune | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
|---|---|---|---|---|---|---|---|---|
| pH value, \(x\) | 8.5 | 8.5 | 9.5 | 8.5 | 6.5 | 7.5 | 8.5 | 9.0 |
| Area covered, \(y \mathrm {~cm} ^ { 2 }\): Species \(P\) | 150 | 150 | 575 | 330 | 45 | 15 | 340 | 330 |
| Area covered, \(y \mathrm {~cm} ^ { 2 }\): Species \(Q\) | 170 | 15 | 80 | 230 | 75 | 25 | 0 | 0 |

The results for species \(P\) can be summarised by $$n = 8 , \quad \Sigma x = 66.5 , \quad \Sigma x ^ { 2 } = 558.75 , \quad \Sigma y = 1935 , \quad \Sigma y ^ { 2 } = 711275 , \quad \Sigma x y = 17082.5 .$$
  1. Give a reason why it might be appropriate to calculate the equation of the regression line of \(y\) on \(x\) rather than \(x\) on \(y\) in this situation.
  2. Calculate the equation of the regression line of \(y\) on \(x\) for species \(P\), in the form \(y = a + b x\), giving the values of \(a\) and \(b\) correct to 3 significant figures.
  3. Estimate the value of \(y\) for species \(P\) on sand where the pH value is 7.0.

The values of the product moment correlation coefficient between \(x\) and \(y\) for species \(P\) and \(Q\) are \(r _ { P } = 0.828\) and \(r _ { Q } = 0.0302\).
  4. Describe the relationship between the area covered by species \(Q\) and the pH value.
  5. State, with a reason, whether the regression line of \(y\) on \(x\) for species \(P\) will provide a reliable estimate of the value of \(y\) when the pH value is
    (a) 8,
    (b) 4.
  6. Assume that the equation of the regression line of \(y\) on \(x\) for species \(Q\) is also known. State, with a reason, whether this line will provide a reliable estimate of the value of \(y\) when the pH value is 8 .