OCR S1 (Statistics 1) 2007 January

Question 1
1 Part of the probability distribution of a variable, \(X\), is given in the table.
| \(x\) | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| \(\mathrm { P } ( X = x )\) | | \(\frac { 3 } { 10 }\) | \(\frac { 1 } { 5 }\) | \(\frac { 2 } { 5 }\) |
  1. Find \(\mathrm { P } ( X = 0 )\).
  2. Find \(\mathrm { E } ( X )\).
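One way to check both parts numerically is sketched below. It assumes that the missing table entry is the one for \(x = 0\) and that 0, 1, 2 and 3 are the only values \(X\) can take; the variable names are illustrative only.

```python
from fractions import Fraction

# Known probabilities from the table; P(X = 0) is assumed to be the missing entry.
known = {1: Fraction(3, 10), 2: Fraction(1, 5), 3: Fraction(2, 5)}

# The probabilities over all values of X must sum to 1.
p0 = 1 - sum(known.values())

# Expectation: sum of x * P(X = x) over the full distribution.
pmf = {0: p0, **known}
expectation = sum(x * p for x, p in pmf.items())

print(p0, expectation)
```

Exact fractions are used so that the results can be compared directly with a hand calculation.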
Question 2
2 The table contains data concerning five households selected at random from a certain town.
| Number of people in the household | 2 | 3 | 3 | 5 | 7 |
|---|---|---|---|---|---|
| Number of cars belonging to people in the household | 1 | 1 | 3 | 2 | 4 |
  1. Calculate the product moment correlation coefficient, \(r\), for the data in the table.
  2. Give a reason why it would not be sensible to use your answer to draw a conclusion about all the households in the town.
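The calculation in part 1 can be checked with a short sketch such as the following. It uses the usual summary quantities \(S_{xx}\), \(S_{yy}\) and \(S_{xy}\); the variable names are illustrative, not part of the question.

```python
from math import sqrt

people = [2, 3, 3, 5, 7]   # number of people in each household
cars = [1, 1, 3, 2, 4]     # number of cars in each household
n = len(people)

# Summary quantities S_xx, S_yy and S_xy computed from the raw sums.
s_xx = sum(x * x for x in people) - sum(people) ** 2 / n
s_yy = sum(y * y for y in cars) - sum(cars) ** 2 / n
s_xy = sum(x * y for x, y in zip(people, cars)) - sum(people) * sum(cars) / n

# Product moment correlation coefficient.
r = s_xy / sqrt(s_xx * s_yy)
print(round(r, 3))
```

With only five data points, the value of \(r\) is sensitive to any single household, which is relevant to part 2.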
Question 3
3 The digits 1, 2, 3, 4 and 5 are arranged in random order, to form a five-digit number.
  1. How many different five-digit numbers can be formed?
  2. Find the probability that the five-digit number is
    (a) odd,
    (b) less than 23000.
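Because there are few arrangements, the counts can be checked by brute-force enumeration. The sketch below lists every arrangement and counts those meeting each condition; it is a check on a counting argument, not a substitute for one.

```python
from fractions import Fraction
from itertools import permutations

# Every arrangement of the digits 1 to 5, read as a five-digit number.
numbers = [int("".join(map(str, p))) for p in permutations([1, 2, 3, 4, 5])]

total = len(numbers)
p_odd = Fraction(sum(1 for m in numbers if m % 2 == 1), total)
p_below_23000 = Fraction(sum(1 for m in numbers if m < 23000), total)

print(total, p_odd, p_below_23000)
```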
Question 4
4 Each of the variables \(W , X , Y\) and \(Z\) takes eight integer values only. The probability distributions are illustrated in the following diagrams.
\includegraphics[max width=\textwidth, alt={}, center]{43f7e091-9ae7-4373-a209-e2ebdba5260f-3_437_394_397_280}
\includegraphics[max width=\textwidth, alt={}, center]{43f7e091-9ae7-4373-a209-e2ebdba5260f-3_433_380_397_685}
\includegraphics[max width=\textwidth, alt={}, center]{43f7e091-9ae7-4373-a209-e2ebdba5260f-3_428_383_402_1082}
\includegraphics[max width=\textwidth, alt={}, center]{43f7e091-9ae7-4373-a209-e2ebdba5260f-3_425_376_402_1482}
  1. For which one or more of these variables is
    (a) the mean equal to the median,
    (b) the mean greater than the median?
  2. Give a reason why none of these diagrams could represent a geometric distribution.
  3. Which one of these diagrams could not represent a binomial distribution? Explain your answer briefly.
Question 5
5 A chemical solution was gradually heated. At five-minute intervals the time, \(x\) minutes, and the temperature, \(y ^ { \circ } \mathrm { C }\), were noted.
| \(x\) | 0 | 5 | 10 | 15 | 20 | 25 | 30 | 35 |
|---|---|---|---|---|---|---|---|---|
| \(y\) | 0.8 | 3.0 | 6.8 | 10.9 | 15.6 | 19.6 | 23.4 | 26.7 |
$$\left[ n = 8 , \Sigma x = 140 , \Sigma y = 106.8 , \Sigma x ^ { 2 } = 3500 , \Sigma y ^ { 2 } = 2062.66 , \Sigma x y = 2685.0 . \right]$$
  1. Calculate the equation of the regression line of \(y\) on \(x\).
  2. Use your equation to estimate the temperature after 12 minutes.
  3. It is given that the value of the product moment correlation coefficient is close to \(+1\). Comment on the reliability of using your equation to estimate \(y\) when
    (a) \(x = 17\),
    (b) \(x = 57\).
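Parts 1 and 2 can be checked from the summary values quoted in the question. The sketch below assumes the line is written as \(y = a + bx\); the names `a`, `b` and `estimate` are illustrative.

```python
# Summary values quoted in the question.
n = 8
sum_x, sum_y = 140, 106.8
sum_x2, sum_xy = 3500, 2685.0

s_xx = sum_x2 - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

b = s_xy / s_xx                  # gradient of the regression line of y on x
a = sum_y / n - b * sum_x / n    # intercept: the line passes through the point of means

def estimate(x):
    """Temperature predicted by the fitted line y = a + b x."""
    return a + b * x

print(round(a, 3), round(b, 3), round(estimate(12), 2))
```

The recorded times run from 0 to 35 minutes, so `estimate(17)` is an interpolation, while `estimate(57)` lies well outside the observed range.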
Question 6
6 A coin is biased so that the probability that it will show heads on any throw is \(\frac { 2 } { 3 }\). The coin is thrown repeatedly. The number of throws up to and including the first head is denoted by \(X\). Find
  1. \(\mathrm { P } ( X = 4 )\),
  2. \(\mathrm { P } ( X < 4 )\),
  3. \(\mathrm { E } ( X )\).
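Since \(X\) counts throws up to and including the first head, it follows a geometric distribution with success probability \(\frac{2}{3}\). A minimal sketch for checking the three parts, with illustrative names, might be:

```python
from fractions import Fraction

p = Fraction(2, 3)   # probability of heads on any throw
q = 1 - p            # probability of tails

def geometric_pmf(k):
    """P(X = k): k - 1 tails followed by the first head."""
    return q ** (k - 1) * p

p_equals_4 = geometric_pmf(4)
p_below_4 = sum(geometric_pmf(k) for k in range(1, 4))
mean = 1 / p   # standard result E(X) = 1/p for a geometric distribution

print(p_equals_4, p_below_4, mean)
```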
Question 7
7 A bag contains three 1p coins and seven 2p coins. Coins are removed at random one at a time, without replacement, until the total value of the coins removed is at least 3p. Then no more coins are removed.
  1. Copy and complete the probability tree diagram.
    \includegraphics[max width=\textwidth, alt={}, center]{43f7e091-9ae7-4373-a209-e2ebdba5260f-4_350_317_1279_568}
    (The first set of branches in the diagram is labelled "First coin".)
    Find the probability that
  2. exactly two coins are removed,
  3. the total value of the coins removed is 4p.
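The tree diagram can be mirrored by a small recursive enumeration, which may help in checking parts 2 and 3. The function `outcomes` and its parameters are hypothetical helpers written for this sketch. It follows each branch until the total reaches at least 3p, multiplying the conditional probabilities along the way.

```python
from fractions import Fraction

def outcomes(ones, twos, total=0, count=0, prob=Fraction(1)):
    """Yield (coins removed, total value, probability) for each stopping path."""
    if total >= 3:
        yield count, total, prob
        return
    remaining = ones + twos
    if ones:   # draw a 1p coin
        yield from outcomes(ones - 1, twos, total + 1, count + 1,
                            prob * Fraction(ones, remaining))
    if twos:   # draw a 2p coin
        yield from outcomes(ones, twos - 1, total + 2, count + 1,
                            prob * Fraction(twos, remaining))

paths = list(outcomes(ones=3, twos=7))
p_two_coins = sum(p for n, t, p in paths if n == 2)
p_total_4p = sum(p for n, t, p in paths if t == 4)

print(p_two_coins, p_total_4p)
```

The path probabilities should sum to 1, which gives a quick sanity check on the completed tree.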
Question 8
8 In the 2001 census, the household size (the number of people living in each household) was recorded. The percentages of households of different sizes were then calculated. The table shows the percentages for two wards, Withington and Old Moat, in Manchester.
| Household size | 1 | 2 | 3 | 4 | 5 | 6 | 7 or more |
|---|---|---|---|---|---|---|---|
| Withington | 34.1 | 26.1 | 12.7 | 12.8 | 8.2 | 4.0 | 2.1 |
| Old Moat | 35.1 | 27.1 | 14.7 | 11.4 | 7.6 | 2.8 | 1.3 |
  1. Calculate the median and interquartile range of the household size for Withington.
  2. Making an appropriate assumption for the last class, which should be stated, calculate the mean and standard deviation of the household size for Withington. Give your answers to an appropriate degree of accuracy.
    The corresponding results for Old Moat are as follows.
    | Median | Interquartile range | Mean | Standard deviation |
    |---|---|---|---|
    | 2 | 2 | 2.4 | 1.5 |
  3. State one advantage of using the median rather than the mean as a measure of the average household size.
  4. By comparing the values for Withington with those for Old Moat, explain briefly why the interquartile range may be less suitable than the standard deviation as a measure of the variation in household size.
  5. For one of the above wards, the value of Spearman's rank correlation coefficient between household size and percentage is \(-1\). Without any calculation, state which ward this is. Explain your answer.
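Parts 1 and 2 can be checked with a sketch along the following lines. It assumes that every household in the "7 or more" class has exactly 7 people, which is only one of several assumptions that could reasonably be made. It also takes each quartile to be the smallest household size whose cumulative percentage reaches 25%, 50% or 75%. The names are illustrative.

```python
from math import sqrt

sizes = [1, 2, 3, 4, 5, 6, 7]   # ASSUMPTION: "7 or more" is treated as exactly 7
withington = [34.1, 26.1, 12.7, 12.8, 8.2, 4.0, 2.1]   # percentages of households

def quantile(percent):
    """Smallest household size whose cumulative percentage reaches `percent`."""
    cumulative = 0.0
    for size, share in zip(sizes, withington):
        cumulative += share
        if cumulative >= percent:
            return size
    return sizes[-1]

median = quantile(50)
iqr = quantile(75) - quantile(25)

# Mean and standard deviation, using the percentages as frequencies.
total = sum(withington)
mean = sum(x * f for x, f in zip(sizes, withington)) / total
variance = sum(x * x * f for x, f in zip(sizes, withington)) / total - mean ** 2
sd = sqrt(variance)

print(median, iqr, round(mean, 1), round(sd, 1))
```

A different assumption for the last class, such as a larger representative value, would change the mean and standard deviation slightly but not the median or quartiles.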
Question 9
9 A variable \(X\) has the distribution \(\mathrm { B } ( 11 , p )\).
  1. Given that \(p = \frac { 3 } { 4 }\), find \(\mathrm { P } ( X = 5 )\).
  2. Given that \(\mathrm { P } ( X = 0 ) = 0.05\), find \(p\).
  3. Given that \(\operatorname { Var } ( X ) = 1.76\), find the two possible values of \(p\).
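The three parts can be checked numerically with a sketch such as the one below. It uses the binomial probability formula and the standard result \(\operatorname{Var}(X) = np(1 - p)\); the variable names are illustrative.

```python
from math import comb, sqrt

n = 11

# Part 1: P(X = 5) when p = 3/4.
p = 3 / 4
prob_5 = comb(n, 5) * p ** 5 * (1 - p) ** 6

# Part 2: P(X = 0) = (1 - p)^11 = 0.05, so p = 1 - 0.05^(1/11).
p_from_zero = 1 - 0.05 ** (1 / n)

# Part 3: Var(X) = 11 p (1 - p) = 1.76 gives the quadratic p^2 - p + 1.76/11 = 0.
c = 1.76 / n
root = sqrt(1 - 4 * c)
p_low, p_high = (1 - root) / 2, (1 + root) / 2

print(round(prob_5, 4), round(p_from_zero, 3), round(p_low, 3), round(p_high, 3))
```

The two roots in part 3 sum to 1, reflecting the symmetry of \(p(1 - p)\) about \(p = \frac{1}{2}\).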