OCR S1 (Statistics 1) 2006 June

Question 1
1 Some observations of bivariate data were made and the equations of the two regression lines were found to be as follows.
$$\begin{array} { c c } y \text { on } x : & y = - 0.6 x + 13.0 \\ x \text { on } y : & x = - 1.6 y + 21.0 \end{array}$$
  1. State, with a reason, whether the correlation between \(x\) and \(y\) is negative or positive.
  2. Neither variable is controlled. Calculate an estimate of the value of \(x\) when \(y = 7.0\).
  3. Find the values of \(\bar { x }\) and \(\bar { y }\).
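
One way to check answers to parts 2 and 3 numerically is sketched below. It relies on the standard fact that both regression lines pass through \(( \bar { x } , \bar { y } )\), and it uses the \(x\) on \(y\) line to estimate \(x\) because neither variable is controlled. The names `estimate_x` and `mean_point` are illustrative and not part of the question.

```python
from fractions import Fraction

# Coefficients copied from the two regression lines in the question.
# y on x: y = A1 + B1 * x;  x on y: x = A2 + B2 * y
A1, B1 = Fraction(13), Fraction(-6, 10)
A2, B2 = Fraction(21), Fraction(-16, 10)


def estimate_x(y):
    """Estimate x from y using the x-on-y line (neither variable is controlled)."""
    return A2 + B2 * y


def mean_point():
    """Solve the two lines simultaneously; both pass through (x_bar, y_bar)."""
    # Substitute y = A1 + B1*x into x = A2 + B2*y and rearrange for x.
    x_bar = (A2 + B2 * A1) / (1 - B1 * B2)
    y_bar = A1 + B1 * x_bar
    return x_bar, y_bar


print(float(estimate_x(Fraction(7))))
print(mean_point())
```

Exact fractions are used so that the intersection is not affected by floating-point rounding.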
Question 2
2 A bag contains 5 black discs and 3 red discs. A disc is selected at random from the bag. If it is red it is replaced in the bag. If it is black, it is not replaced. A second disc is now selected at random from the bag. Find the probability that
  1. the second disc is black, given that the first disc was black,
  2. the second disc is black,
  3. the two discs are of different colours.
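
The tree diagram for this question can be sketched in code as follows. This is only a checking aid under the rules stated in the question (a black disc is kept out, a red disc is put back); the function name `second_draw_outcomes` is hypothetical.

```python
from fractions import Fraction

BLACK, RED = 5, 3  # discs in the bag at the start


def second_draw_outcomes():
    """Return {(first, second): probability} for the two-stage selection."""
    total = BLACK + RED
    outcomes = {}
    # First disc black: it is NOT replaced, so one fewer black disc remains.
    p_b = Fraction(BLACK, total)
    outcomes[("B", "B")] = p_b * Fraction(BLACK - 1, total - 1)
    outcomes[("B", "R")] = p_b * Fraction(RED, total - 1)
    # First disc red: it IS replaced, so the bag is unchanged.
    p_r = Fraction(RED, total)
    outcomes[("R", "B")] = p_r * Fraction(BLACK, total)
    outcomes[("R", "R")] = p_r * Fraction(RED, total)
    return outcomes


probs = second_draw_outcomes()
p_first_black = probs[("B", "B")] + probs[("B", "R")]
p_second_black_given_first_black = probs[("B", "B")] / p_first_black
p_second_black = probs[("B", "B")] + probs[("R", "B")]
p_different = probs[("B", "R")] + probs[("R", "B")]

print(p_second_black_given_first_black, p_second_black, p_different)
```

The four branch probabilities sum to 1, which is a quick sanity check on the tree.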
Question 3
3 Each of the 7 letters in the word DIVIDED is printed on a separate card. The cards are arranged in a row.
  1. How many different arrangements of the letters are possible?
  2. In how many of these arrangements are all three Ds together?
The 7 cards are now shuffled and 2 cards are selected at random, without replacement.
  3. Find the probability that at least one of these 2 cards has D printed on it.
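
Because the word has only 7 letters, the counts in parts 1 and 2 can be checked by brute-force enumeration rather than by formula. The sketch below does that, and handles part 3 through the complement "neither card is a D". Variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

WORD = "DIVIDED"

# Distinct arrangements: a set removes duplicates caused by repeated letters.
arrangements = {"".join(p) for p in permutations(WORD)}
total = len(arrangements)

# Arrangements in which the three Ds are adjacent.
ds_together = sum("DDD" in a for a in arrangements)

# P(at least one D in 2 cards) = 1 - P(both cards come from the non-D cards).
n_d = WORD.count("D")
p_at_least_one_d = 1 - Fraction(comb(len(WORD) - n_d, 2), comb(len(WORD), 2))

print(total, ds_together, p_at_least_one_d)
```

The enumeration agrees with the formula counts \(\frac { 7 ! } { 3 ! \, 2 ! }\) and \(\frac { 5 ! } { 2 ! }\).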
Question 4
4
  1. The random variable \(X\) has the distribution \(\mathrm { B } ( 25,0.2 )\). Using the tables of cumulative binomial probabilities, or otherwise, find \(\mathrm { P } ( X \geqslant 5 )\).
  2. The random variable \(Y\) has the distribution \(\mathrm { B } ( 10,0.27 )\). Find \(\mathrm { P } ( Y = 3 )\).
  3. The random variable \(Z\) has the distribution \(\mathrm { B } ( n , 0.27 )\). Find the smallest value of \(n\) such that \(\mathrm { P } ( Z \geqslant 1 ) > 0.95\).
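
For the "or otherwise" route, the binomial probabilities can be computed directly instead of read from tables. The following is a minimal sketch; `binom_pmf` is a helper defined here, not a library function.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Part 1: P(X >= 5) = 1 - P(X <= 4) for X ~ B(25, 0.2).
p_x_ge_5 = 1 - sum(binom_pmf(k, 25, 0.2) for k in range(5))

# Part 2: P(Y = 3) for Y ~ B(10, 0.27).
p_y_eq_3 = binom_pmf(3, 10, 0.27)

# Part 3: smallest n with P(Z >= 1) = 1 - 0.73**n > 0.95.
n = 1
while 1 - (1 - 0.27) ** n <= 0.95:
    n += 1

print(round(p_x_ge_5, 4), round(p_y_eq_3, 4), n)
```

The loop in part 3 is equivalent to solving \(0.73 ^ { n } < 0.05\) with logarithms and rounding up.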
Question 5
5 The probability distribution of a discrete random variable, \(X\), is given in the table.
$$\begin{array} { | c | c | c | c | c | } \hline x & 0 & 1 & 2 & 3 \\ \hline \mathrm { P } ( X = x ) & \frac { 1 } { 3 } & \frac { 1 } { 4 } & p & q \\ \hline \end{array}$$
It is given that the expectation, \(\mathrm { E } ( X )\), is \(1 \frac { 1 } { 4 }\).
  1. Calculate the values of \(p\) and \(q\).
  2. Calculate the standard deviation of \(X\).
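
The two conditions "probabilities sum to 1" and "\(\mathrm { E } ( X ) = 1 \frac { 1 } { 4 }\)" give a pair of linear equations in \(p\) and \(q\). A short sketch of solving them exactly and then finding the standard deviation is given below; the intermediate names `s` and `t` are illustrative.

```python
from fractions import Fraction
from math import sqrt

p0, p1 = Fraction(1, 3), Fraction(1, 4)  # P(X = 0), P(X = 1) from the table
mean = Fraction(5, 4)                    # E(X) = 1 1/4

# Sum of probabilities:  p + q   = 1 - p0 - p1
# Expectation:           2p + 3q = E(X) - 1 * p1
s = 1 - p0 - p1
t = mean - p1
q = t - 2 * s  # subtract twice the first equation from the second
p = s - q

dist = {0: p0, 1: p1, 2: p, 3: q}
e_x2 = sum(x * x * pr for x, pr in dist.items())
variance = e_x2 - mean**2
sd = sqrt(variance)

print(p, q, variance, round(sd, 3))
```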
Question 6
6 The table shows the total distance travelled, in thousands of miles, and the amount of commission earned, in thousands of pounds, by each of seven sales agents in 2005.
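$$\begin{array} { | l | c | c | c | c | c | c | c | } \hline \text { Agent } & A & B & C & D & E & F & G \\ \hline \text { Distance travelled } & 18 & 15 & 12 & 14 & 16 & 24 & 13 \\ \hline \text { Commission earned } & 18 & 45 & 19 & 24 & 27 & 22 & 23 \\ \hline \end{array}$$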
  1. (a) Calculate Spearman's rank correlation coefficient, \(r _ { s }\), for these data.
    (b) Comment briefly on your value of \(r _ { s }\) with reference to this context.
    (c) After these data were collected, agent \(A\) found that he had made a mistake. He had actually travelled 19000 miles in 2005. State, with a reason, but without further calculation, whether the value of Spearman's rank correlation coefficient will increase, decrease or stay the same.
The agents were asked to indicate their level of job satisfaction during 2005. A score of 0 represented no job satisfaction, and a score of 10 represented high job satisfaction. Their scores, \(y\), together with the data for distance travelled, \(x\), are illustrated in the scatter diagram below.
    \includegraphics[max width=\textwidth, alt={}, center]{b37239e4-5d63-47e4-844a-f01b79f8dd67-3_691_981_1231_587}
  2. For this scatter diagram, what can you say about the value of
    (a) Spearman's rank correlation coefficient,
    (b) the product moment correlation coefficient?
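
For part 1(a), the ranking and the formula \(r _ { s } = 1 - \frac { 6 \sum d ^ { 2 } } { n ( n ^ { 2 } - 1 ) }\) can be sketched as follows. The data are those given for the seven agents (distance 18, 15, 12, 14, 16, 24, 13 and commission 18, 45, 19, 24, 27, 22, 23); the helper names `ranks` and `spearman` are hypothetical, and the ranking assumes no tied values, which holds for these data.

```python
from fractions import Fraction

distance = {"A": 18, "B": 15, "C": 12, "D": 14, "E": 16, "F": 24, "G": 13}
commission = {"A": 18, "B": 45, "C": 19, "D": 24, "E": 27, "F": 22, "G": 23}


def ranks(values):
    """Rank 1 = smallest value; assumes there are no ties."""
    ordered = sorted(values, key=values.get)
    return {key: i + 1 for i, key in enumerate(ordered)}


def spearman(xs, ys):
    """Spearman's rank correlation coefficient via the sum of squared rank differences."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    sum_d2 = sum((rx[k] - ry[k]) ** 2 for k in xs)
    return 1 - Fraction(6 * sum_d2, n * (n * n - 1))


r_s = spearman(distance, commission)

# Part 1(c): agent A's distance corrected from 18 to 19 (thousand miles).
corrected = dict(distance, A=19)
r_s_corrected = spearman(corrected, commission)

print(r_s, float(r_s), r_s_corrected == r_s)
```

The second calculation is only a check on the reasoning asked for in part 1(c): the coefficient depends on the ranks, not on the raw distances.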
Question 7
7 In a UK government survey in 2000, smokers were asked to estimate the time between their waking and their having the first cigarette of the day. For heavy smokers, the results were as follows.
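$$\begin{array} { | l | c | c | c | c | c | } \hline \text { Time between waking and first cigarette } & \text { 1 to 4 minutes } & \text { 5 to 14 minutes } & \text { 15 to 29 minutes } & \text { 30 to 59 minutes } & \text { At least 60 minutes } \\ \hline \text { Percentage of smokers } & 31 & 27 & 19 & 14 & 9 \\ \hline \end{array}$$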
Times are given correct to the nearest minute.
  1. Assuming that 'At least 60 minutes' means 'At least 60 minutes but less than 240 minutes', calculate estimates for the mean and standard deviation of the time between waking and first cigarette for these smokers.
  2. Find an estimate for the interquartile range of the time between waking and first cigarette for these smokers. Give your answer correct to the nearest minute.
  3. The meaning of 'At least 60 minutes' is now changed to 'At least 60 minutes but less than 480 minutes'. Without further calculation, state whether this would cause an increase, a decrease or no change in the estimated value of
    (a) the mean,
    (b) the standard deviation,
    (c) the interquartile range.
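
A rough sketch of the grouped-data estimates in part 1 is given below, using the percentages 31, 27, 19, 14 and 9 as frequencies. The class boundaries are an assumption made here: since times are correct to the nearest minute, "1 to 4" is treated as 0.5 to 4.5, and the last class as 59.5 to 239.5. Other reasonable boundary choices shift the last midpoint slightly (for example to 150) and change the estimates a little.

```python
from math import sqrt

# (lower boundary, upper boundary, percentage) for each class.
# ASSUMPTION: boundaries follow from rounding to the nearest minute,
# with the open-ended class capped as stated in part 1 of the question.
classes = [
    (0.5, 4.5, 31),
    (4.5, 14.5, 27),
    (14.5, 29.5, 19),
    (29.5, 59.5, 14),
    (59.5, 239.5, 9),
]

total = sum(f for _, _, f in classes)
midpoints = [(lo + hi) / 2 for lo, hi, _ in classes]
freqs = [f for _, _, f in classes]

# Estimated mean and standard deviation from class midpoints.
mean = sum(f * m for f, m in zip(freqs, midpoints)) / total
mean_sq = sum(f * m * m for f, m in zip(freqs, midpoints)) / total
sd = sqrt(mean_sq - mean**2)

print(round(mean, 1), round(sd, 1))
```

Only the last class changes in part 3, so rerunning the sketch with a different upper boundary shows how each estimate responds to the wider class.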
Question 8
8 Henry makes repeated attempts to light his gas fire. He makes the modelling assumption that the probability that the fire will light on any attempt is \(\frac { 1 } { 3 }\). Let \(X\) be the number of attempts at lighting the fire, up to and including the successful attempt.
  1. Name the distribution of \(X\), stating a further modelling assumption needed.
In the rest of this question, you should use the distribution named in part 1.
  2. Calculate
    (a) \(\mathrm { P } ( X = 4 )\),
    (b) \(\mathrm { P } ( X < 4 )\).
  3. State the value of \(\mathrm { E } ( X )\).
  4. Henry has to light the fire once a day, starting on March 1st. Calculate the probability that the first day on which fewer than 4 attempts are needed to light the fire is March 3rd.
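
Parts 2 to 4 can be checked with a small sketch, assuming \(X\) is geometric with success probability \(\frac { 1 } { 3 }\) and that attempts are independent. Part 4 additionally assumes that different days are independent of each other, so that "fewer than 4 attempts on a given day" is itself a success in a second geometric model. The helper names are hypothetical.

```python
from fractions import Fraction

P_LIGHT = Fraction(1, 3)  # probability the fire lights on any one attempt


def geo_pmf(k, p=P_LIGHT):
    """P(X = k): k - 1 failures followed by one success."""
    return (1 - p) ** (k - 1) * p


def geo_less_than(k, p=P_LIGHT):
    """P(X < k) = 1 - P(the first k - 1 attempts all fail)."""
    return 1 - (1 - p) ** (k - 1)


p_x_eq_4 = geo_pmf(4)
p_x_lt_4 = geo_less_than(4)
expected_attempts = 1 / P_LIGHT

# Part 4: days form a geometric process whose "success" is X < 4 on that day.
# ASSUMPTION: outcomes on different days are independent.
p_first_on_march_3 = geo_pmf(3, p_x_lt_4)

print(p_x_eq_4, p_x_lt_4, expected_attempts, p_first_on_march_3)
```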