OCR MEI S2 (Statistics 2) 2007 January

Question 1
1 In a science investigation into energy conservation in the home, a student is collecting data on the time taken for an electric kettle to boil as the volume of water in the kettle is varied. The student's data are shown in the table below, where \(v\) litres is the volume of water in the kettle and \(t\) seconds is the time taken for the kettle to boil (starting with the water at room temperature in each case). Also shown are summary statistics and a scatter diagram on which the regression line of \(t\) on \(v\) is drawn.
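\begin{tabular}{|c|c|c|c|c|c|}
\hline
\(v\) & 0.2 & 0.4 & 0.6 & 0.8 & 1.0 \\
\hline
\(t\) & 44 & 78 & 114 & 156 & 172 \\
\hline
\end{tabular}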
$$n = 5, \quad \Sigma v = 3.0, \quad \Sigma t = 564, \quad \Sigma v^{2} = 2.20, \quad \Sigma v t = 405.2.$$
\includegraphics[max width=\textwidth, alt={}, center]{7ba30ff3-af90-4741-aab1-576efcbcb0b2-2_563_1376_742_386}
  1. Calculate the equation of the regression line of \(t\) on \(v\), giving your answer in the form \(t = a + b v\).
  2. Use this equation to predict the time taken for the kettle to boil when the amount of water which it contains is
    (A) 0.5 litres,
    (B) 1.5 litres.
    Comment on the reliability of each of these predictions.
  3. In the equation of the regression line found in part (i), explain the role of the coefficient of \(v\) in the relationship between time taken and volume of water.
  4. Calculate the values of the residuals for \(v = 0.8\) and \(v = 1.0\).
  5. Explain how, on a scatter diagram with the regression line drawn accurately on it, a residual could be measured and its sign determined.
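The calculations asked for in parts 1, 2 and 4 can be checked numerically. The following is a minimal sketch rather than a model answer: it uses only the summary statistics and data quoted in the question, and the names `s_vt`, `s_vv`, `predict` and `residuals` are illustrative choices, not notation from the paper.

```python
# Summary statistics quoted in the question.
n = 5
sum_v, sum_t = 3.0, 564
sum_vv, sum_vt = 2.20, 405.2

# Corrected sums of products and squares.
s_vt = sum_vt - sum_v * sum_t / n
s_vv = sum_vv - sum_v ** 2 / n

# Gradient b and intercept a of the regression line t = a + b v.
b = s_vt / s_vv
a = sum_t / n - b * sum_v / n


def predict(v):
    """Predicted boiling time in seconds for v litres of water."""
    return a + b * v


# Residual = observed value minus value predicted by the line.
observed = {0.8: 156, 1.0: 172}
residuals = {v: t - predict(v) for v, t in observed.items()}

print(f"t = {a:.1f} + {b:.1f} v")
print(f"prediction at 0.5 litres: {predict(0.5):.1f} s")
print(f"prediction at 1.5 litres: {predict(1.5):.1f} s")
for v, r in residuals.items():
    print(f"residual at v = {v}: {r:.1f}")
```

The prediction at 1.5 litres lies outside the range of the data (0.2 to 1.0 litres), so it is an extrapolation and the code gives no indication of how reliable it is.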
Question 2
2 (a) A farmer grows Brussels sprouts. The diameter of sprouts in a particular batch, measured in mm, is Normally distributed with mean 28 and variance 16. Sprouts that are between 24 mm and 33 mm in diameter are sold to a supermarket.
  1. Find the probability that the diameter of a randomly selected sprout will be within this range.
  2. The farmer sells the sprouts in this range to the supermarket for 10 pence per kilogram. The farmer sells sprouts under 24 mm in diameter to a frozen food factory for 5 pence per kilogram. Sprouts over 33 mm in diameter are thrown away. Estimate the total income received by the farmer for the batch, which weighs 25000 kg.
  3. By harvesting sprouts earlier, the mean diameter for another batch can be reduced to \(k \mathrm {~mm}\). Find the value of \(k\) for which only \(5 \%\) of the sprouts will be above 33 mm in diameter. You may assume that the variance is still 16.
(b) The farmer also grows onions. The weight in kilograms of the onions is Normally distributed with mean 0.155 and variance 0.005. He is trying out a new variety, which he hopes will yield a higher mean weight. In order to test this, he takes a random sample of 25 onions of the new variety and finds that their total weight is 4.77 kg. You should assume that the weight in kilograms of the new variety is Normally distributed with variance 0.005.
  1. Write down suitable null and alternative hypotheses for the test in terms of \(\mu\). State the meaning of \(\mu\) in this case.
  2. Carry out the test at the \(1 \%\) level.
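The Normal distribution calculations in the sprouts and onions question can be sketched with Python's standard-library `statistics.NormalDist`. This is a hedged illustration, not a mark scheme. The income estimate assumes that the proportion of the batch's weight in each size band equals the proportion of sprouts in that band, and the test is treated as one-tailed (\(\mathrm{H}_0: \mu = 0.155\) against \(\mathrm{H}_1: \mu > 0.155\)) because the farmer hopes for a higher mean weight. All variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Sprout diameters: mean 28 mm, variance 16, so standard deviation 4.
sprouts = NormalDist(mu=28, sigma=sqrt(16))
p_supermarket = sprouts.cdf(33) - sprouts.cdf(24)  # between 24 mm and 33 mm
p_factory = sprouts.cdf(24)                        # under 24 mm

# Assumption: weight is shared between size bands in the same
# proportions as the sprouts themselves.
income_pence = 25000 * (10 * p_supermarket + 5 * p_factory)

# New mean k so that only 5% of sprouts exceed 33 mm (sd still 4).
z_95 = NormalDist().inv_cdf(0.95)
k = 33 - z_95 * 4

# Onions: one-tailed z-test of the sample mean at the 1% level.
sample_mean = 4.77 / 25
standard_error = sqrt(0.005 / 25)
z = (sample_mean - 0.155) / standard_error
critical_value = NormalDist().inv_cdf(0.99)
reject_h0 = z > critical_value

print(f"P(24 < diameter < 33) = {p_supermarket:.4f}")
print(f"estimated income = {income_pence / 100:.0f} pounds")
print(f"k = {k:.2f} mm")
print(f"z = {z:.3f}, critical value = {critical_value:.3f}, reject H0: {reject_h0}")
```

Because `inv_cdf` returns unrounded quantiles, the results may differ in the last decimal place from answers worked with printed tables.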
Question 3
3 An electrical retailer gives customers extended guarantees on washing machines. Under this guarantee all repairs in the first 3 years are free. The retailer records the numbers of free repairs made to 80 machines.
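\begin{tabular}{|c|c|c|c|c|c|}
\hline
Number of repairs & 0 & 1 & 2 & 3 & \(> 3\) \\
\hline
Frequency & 53 & 20 & 6 & 1 & 0 \\
\hline
\end{tabular}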
  1. Show that the sample mean is 0.4375.
  2. The sample standard deviation \(s\) is 0.6907. Explain why this supports a suggestion that a Poisson distribution may be a suitable model for the distribution of the number of free repairs required by a randomly chosen washing machine.
    The random variable \(X\) denotes the number of free repairs required by a randomly chosen washing machine. For the remainder of this question you should assume that \(X\) may be modelled by a Poisson distribution with mean 0.4375.
  3. Find \(\mathrm { P } ( X = 1 )\). Comment on your answer in relation to the data in the table.
  4. The manager decides to monitor 8 washing machines sold on one day. Find the probability that there are at least 12 free repairs in total on these 8 machines. You may assume that the 8 machines form an independent random sample.
  5. A launderette with 8 washing machines has needed 12 free repairs. Why does your answer to part (iv) suggest that the Poisson model with mean 0.4375 is unlikely to be a suitable model for free repairs on the machines in the launderette? Give a reason why the model may not be appropriate for the launderette.
    The retailer also sells tumble driers with the same guarantee. The number of free repairs on a tumble drier in three years can be modelled by a Poisson distribution with mean 0.15. A customer buys a tumble drier and a washing machine.
  6. Assuming that free repairs are required independently, find the probability that
    (A) the two appliances need a total of 3 free repairs between them,
    (B) each appliance needs exactly one free repair.
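The Poisson calculations in this question can be checked with a short script. The sketch below uses only the standard library. The helper `poisson_pmf` and the variable names are illustrative, and the code relies on the standard result that a sum of independent Poisson variables is itself Poisson with mean equal to the sum of the means.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson distribution with the given mean."""
    return exp(-mean) * mean ** k / factorial(k)


# Sample mean from the frequency table (no machine needed more than 3 repairs).
frequencies = {0: 53, 1: 20, 2: 6, 3: 1}
n = sum(frequencies.values())
sample_mean = sum(k * f for k, f in frequencies.items()) / n

# A Poisson model has variance equal to its mean, so compare s^2 with the mean.
sample_variance = 0.6907 ** 2

# P(X = 1) and the matching expected frequency out of 80 machines.
p_one = poisson_pmf(1, 0.4375)
expected_ones = n * p_one

# Total repairs on 8 independent machines: Poisson with mean 8 * 0.4375 = 3.5.
p_at_least_12 = 1 - sum(poisson_pmf(k, 8 * 0.4375) for k in range(12))

# Washing machine (mean 0.4375) and tumble drier (mean 0.15), independent.
p_total_three = poisson_pmf(3, 0.4375 + 0.15)
p_one_each = poisson_pmf(1, 0.4375) * poisson_pmf(1, 0.15)

print(f"sample mean = {sample_mean}, s^2 = {sample_variance:.4f}")
print(f"P(X = 1) = {p_one:.4f}, expected frequency = {expected_ones:.1f}")
print(f"P(at least 12 repairs on 8 machines) = {p_at_least_12:.5f}")
print(f"P(total of 3) = {p_total_three:.4f}, P(one each) = {p_one_each:.4f}")
```

The probability of at least 12 repairs is of the order of 0.0003, which is why observing 12 repairs at the launderette casts doubt on the model there.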
Question 4
4 Two educational researchers are investigating the relationship between personal ambitions and home location of students. The researchers classify students into those whose main personal ambition is good academic results and those who have some other ambition. A random sample of 480 students is selected.
  1. One researcher summarises the data as follows.
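    \begin{tabular}{|c|c|c|c|}
    \hline
    \multicolumn{2}{|c|}{\multirow{2}{*}{Observed}} & \multicolumn{2}{c|}{Home location} \\
    \cline{3-4}
    \multicolumn{2}{|c|}{} & City & Non-city \\
    \hline
    \multirow{2}{*}{Ambition} & Good results & 102 & 147 \\
    \cline{2-4}
     & Other & 75 & 156 \\
    \hline
    \end{tabular}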
    Carry out a test at the \(5 \%\) significance level to examine whether there is any association between home location and ambition. State carefully your null and alternative hypotheses. Your working should include a table showing the contributions of each cell to the test statistic.
  2. The other researcher summarises the same data in a different way as follows.
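    \begin{tabular}{|c|c|c|c|c|}
    \hline
    \multicolumn{2}{|c|}{\multirow{2}{*}{Observed}} & \multicolumn{3}{c|}{Home location} \\
    \cline{3-5}
    \multicolumn{2}{|c|}{} & City & Town & Country \\
    \hline
    \multirow{2}{*}{Ambition} & Good results & 102 & 83 & 64 \\
    \cline{2-5}
     & Other & 75 & 64 & 92 \\
    \hline
    \end{tabular}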
    (A) Calculate the expected frequencies for both 'Country' cells.
    (B) The test statistic for these data is 10.94. Carry out a test at the \(5 \%\) level based on this table, using the same hypotheses as in part (i).
    (C) The table below gives the contribution of each cell to the test statistic. Discuss briefly how personal ambitions are related to home location.
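    \begin{tabular}{|c|c|c|c|c|}
    \hline
    \multicolumn{2}{|c|}{} & \multicolumn{3}{c|}{Home location} \\
    \cline{2-5}
    \multicolumn{2}{|c|}{} & City & Town & Country \\
    \hline
    \multirow{2}{*}{Ambition} & Good results & 1.129 & 0.596 & 3.540 \\
    \cline{2-5}
     & Other & 1.217 & 0.643 & 3.816 \\
    \hline
    \end{tabular}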
  3. Comment briefly on whether the analysis in part (ii) means that the conclusion in part (i) is invalid.
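The expected frequencies, cell contributions and test statistics for both tables can be reproduced with a short script. This is a sketch under stated assumptions. The function `chi_squared` is an illustrative helper, not part of any library. The statistic is computed without Yates' continuity correction, which would give a smaller value for the \(2 \times 2\) table. The critical values 3.841 (1 degree of freedom) and 5.991 (2 degrees of freedom) are the standard 5% values from \(\chi^2\) tables.

```python
def chi_squared(observed):
    """Return expected frequencies, cell contributions and the test statistic."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    # Expected frequency = row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    # Contribution of each cell = (O - E)^2 / E, with no continuity correction.
    contributions = [
        [(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
        for o_row, e_row in zip(observed, expected)
    ]
    statistic = sum(sum(row) for row in contributions)
    return expected, contributions, statistic


# Rows: Good results, Other.
city_noncity = [[102, 147], [75, 156]]          # columns: City, Non-city
city_town_country = [[102, 83, 64], [75, 64, 92]]  # columns: City, Town, Country

_, contrib_2, stat_2 = chi_squared(city_noncity)
expected_3, contrib_3, stat_3 = chi_squared(city_town_country)

# 5% critical values taken from standard chi-squared tables.
CRITICAL_1_DF = 3.841
CRITICAL_2_DF = 5.991

print(f"2x2 statistic = {stat_2:.3f}, significant: {stat_2 > CRITICAL_1_DF}")
print(f"Country expected frequencies: {expected_3[0][2]:.3f}, {expected_3[1][2]:.3f}")
print(f"2x3 statistic = {stat_3:.2f}, significant: {stat_3 > CRITICAL_2_DF}")
```

The \(2 \times 3\) statistic agrees with the value 10.94 quoted in part (ii)(B), and the individual contributions agree with the table in part (ii)(C).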