Edexcel S1 (Statistics 1) 2007 June

Question 1
1. A young family were looking for a new 3 bedroom semi-detached house. A local survey recorded the price \(x\), in \(\pounds 1000\), and the distance \(y\), in miles, from the station of such houses. The following summary statistics were provided:
$$S _ { x x } = 113573 , \quad S _ { y y } = 8.657 , \quad S _ { x y } = - 808.917$$
  1. Use these values to calculate the product moment correlation coefficient.
  2. Give an interpretation of your answer to part (a).
  Another family asked for the distances to be measured in km rather than miles.
  3. State the value of the product moment correlation coefficient in this case.
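The calculation in part (a) and the effect of the unit change in part (c) can be checked numerically. The following is a minimal sketch rather than a mark-scheme solution; the function name `pmcc` and the conversion factor of 1.609 km per mile are assumptions introduced for illustration.

```python
import math

def pmcc(s_xx: float, s_yy: float, s_xy: float) -> float:
    """Product moment correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    return s_xy / math.sqrt(s_xx * s_yy)

# Summary statistics quoted in the question.
S_XX, S_YY, S_XY = 113573, 8.657, -808.917
r_miles = pmcc(S_XX, S_YY, S_XY)

# Assumed (approximate) conversion factor: 1 mile is about 1.609 km.
# Rescaling y by k multiplies S_xy by k and S_yy by k**2, so r is unchanged.
K = 1.609
r_km = pmcc(S_XX, S_YY * K**2, S_XY * K)

print(round(r_miles, 3), round(r_km, 3))
```

Both values agree because the coefficient is unaffected by a positive linear rescaling of either variable.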
Question 2
2. The box plot in Figure 1 shows a summary of the weights of the luggage, in kg, for each musician in an orchestra on an overseas tour.
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{045e10d2-1766-4399-aa0a-5619dd0cce0f-03_346_1452_324_228}
\captionsetup{labelformat=empty}
\caption{Figure 1}
\end{figure}
The airline's recommended weight limit for each musician's luggage was 45 kg. Given that none of the musicians' luggage weighed exactly 45 kg,
  1. state the proportion of the musicians whose luggage was below the recommended weight limit.
  A quarter of the musicians had to pay a charge for taking heavy luggage.
  2. State the smallest weight for which the charge was made.
  3. Explain what you understand by the + on the box plot in Figure 1, and suggest an instrument that the owner of this luggage might play.
  4. Describe the skewness of this distribution. Give a reason for your answer.
  One musician of the orchestra suggests that the weights of luggage, in kg, can be modelled by a normal distribution with quartiles as given in Figure 1.
  5. Find the standard deviation of this normal distribution.
Question 3
3. A student is investigating the relationship between the price (\(y\) pence) of 100 g of chocolate and the percentage (\(x \%\)) of cocoa solids in the chocolate.
The following data is obtained:

| Chocolate brand | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
|---|---|---|---|---|---|---|---|---|
| \(x\) (\% cocoa) | 10 | 20 | 30 | 35 | 40 | 50 | 60 | 70 |
| \(y\) (pence) | 35 | 55 | 40 | 100 | 60 | 90 | 110 | 130 |

(You may use: \(\sum x = 315 , \sum x ^ { 2 } = 15225 , \sum y = 620 , \sum y ^ { 2 } = 56550 , \sum x y = 28750\) )
  1. On the graph paper on page 9 draw a scatter diagram to represent these data.
  2. Show that \(S _ { x y } = 4337.5\) and find \(S _ { x x }\).
  The student believes that a linear relationship of the form \(y = a + b x\) could be used to describe these data.
  3. Use linear regression to find the value of \(a\) and the value of \(b\), giving your answers to 1 decimal place.
  4. Draw the regression line on your scatter diagram.
  The student believes that one brand of chocolate is overpriced.
  5. Use the scatter diagram to
    1. state which brand is overpriced,
    2. suggest a fair price for this brand.
    Give reasons for both your answers.
      \includegraphics[max width=\textwidth, alt={}]{045e10d2-1766-4399-aa0a-5619dd0cce0f-06_2454_1485_282_228}
      The data on page 8 has been repeated here to help you
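
      | Chocolate brand | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
      |---|---|---|---|---|---|---|---|---|
      | \(x\) (\% cocoa) | 10 | 20 | 30 | 35 | 40 | 50 | 60 | 70 |
      | \(y\) (pence) | 35 | 55 | 40 | 100 | 60 | 90 | 110 | 130 |
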
      (You may use: \(\sum x = 315 , \sum x ^ { 2 } = 15225 , \sum y = 620 , \sum y ^ { 2 } = 56550 , \sum x y = 28750\) )
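
The summary values in this question are enough to reproduce \(S_{xy}\), \(S_{xx}\) and the regression coefficients. The sketch below is a numerical cross-check only, under stated assumptions: the helper name `corrected_sum` is hypothetical, and using the largest positive residual to pick out the "overpriced" brand is an assumption, since the question asks for that judgement to be made from the scatter diagram.

```python
# Summary values quoted in the question (n = 8 brands, A to H).
N = 8
SUM_X, SUM_X2, SUM_Y, SUM_XY = 315, 15225, 620, 28750

def corrected_sum(sum_uv: float, sum_u: float, sum_v: float, n: int) -> float:
    """S_uv = sum(uv) - sum(u) * sum(v) / n."""
    return sum_uv - sum_u * sum_v / n

s_xy = corrected_sum(SUM_XY, SUM_X, SUM_Y, N)
s_xx = corrected_sum(SUM_X2, SUM_X, SUM_X, N)

# Least squares line y = a + b x.
b = s_xy / s_xx
a = SUM_Y / N - b * SUM_X / N

# Raw data, used here only to compare each price with the fitted line.
brands = "ABCDEFGH"
x = [10, 20, 30, 35, 40, 50, 60, 70]
y = [35, 55, 40, 100, 60, 90, 110, 130]
residuals = {br: yi - (a + b * xi) for br, xi, yi in zip(brands, x, y)}

# Assumption: "overpriced" is read as the point furthest above the line.
most_above_line = max(residuals, key=residuals.get)
fitted_price = a + b * x[brands.index(most_above_line)]

print(s_xy, s_xx, round(a, 1), round(b, 1), most_above_line, round(fitted_price, 1))
```

The fitted price for the flagged brand is one possible reading of a "fair price"; a value read off the drawn regression line would serve the same purpose.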
Question 4
4. A survey of the reading habits of some students revealed that, on a regular basis, \(25 \%\) read quality newspapers, \(45 \%\) read tabloid newspapers and \(40 \%\) do not read newspapers at all.
  1. Find the proportion of students who read both quality and tabloid newspapers.
  2. In the space on page 13 draw a Venn diagram to represent this information.
  A student is selected at random. Given that this student reads newspapers on a regular basis,
  3. find the probability that this student only reads quality newspapers.
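
The overlap and the conditional probability in this question follow from the addition rule \(\mathrm{P}(Q \cup T) = \mathrm{P}(Q) + \mathrm{P}(T) - \mathrm{P}(Q \cap T)\). A minimal sketch of that arithmetic, with variable names chosen here for illustration and exact fractions used to avoid rounding noise:

```python
from fractions import Fraction

# Percentages quoted in the question, as exact fractions.
p_quality = Fraction(25, 100)
p_tabloid = Fraction(45, 100)
p_none = Fraction(40, 100)

# Students who read at least one kind of newspaper.
p_any = 1 - p_none

# Addition rule rearranged for the intersection.
p_both = p_quality + p_tabloid - p_any

# Quality only, then conditioned on reading newspapers at all.
p_only_quality = p_quality - p_both
p_only_quality_given_reader = p_only_quality / p_any

print(p_both, p_only_quality_given_reader)
```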
Question 5
5. Figure 2 shows a histogram for the variable \(t\) which represents the time taken, in minutes, by a group of people to swim 500 m.
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{045e10d2-1766-4399-aa0a-5619dd0cce0f-10_726_1509_255_278}
\captionsetup{labelformat=empty}
\caption{Figure 2}
\end{figure}
  1. Complete the frequency table for \(t\).

    | \(t\) | \(5 - 10\) | \(10 - 14\) | \(14 - 18\) | \(18 - 25\) | \(25 - 40\) |
    |---|---|---|---|---|---|
    | Frequency | 10 | 16 | 24 | | |

  2. Estimate the number of people who took longer than 20 minutes to swim 500 m.
  3. Find an estimate of the mean time taken.
  4. Find an estimate for the standard deviation of \(t\).
  5. Find the median and quartiles for \(t\).
  One measure of skewness is found using \(\frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }\).
  6. Evaluate this measure and describe the skewness of these data.
Question 6
6. The random variable \(X\) has a normal distribution with mean 20 and standard deviation 4.
  1. Find \(\mathrm { P } ( X > 25 )\).
  2. Find the value of \(d\) such that \(\mathrm { P } ( 20 < X < d ) = 0.4641\).
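
Both parts can be checked with the standard library's `statistics.NormalDist` (Python 3.8 or later). This is a sketch for verification, not the table-based method an exam answer would show; the variable names are illustrative.

```python
from statistics import NormalDist

# X ~ N(20, 4^2): mean 20, standard deviation 4, as stated in the question.
X = NormalDist(mu=20, sigma=4)

# Part 1: upper-tail probability P(X > 25).
p_above_25 = 1 - X.cdf(25)

# Part 2: P(20 < X < d) = 0.4641 and P(X < 20) = 0.5,
# so P(X < d) = 0.9641 and d is the corresponding quantile.
d = X.inv_cdf(0.5 + 0.4641)

print(round(p_above_25, 4), round(d, 1))
```

Values read from printed normal tables may differ from these in the last decimal place.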
Question 7
7. The random variable \(X\) has probability distribution

| \(x\) | 1 | 3 | 5 | 7 | 9 |
|---|---|---|---|---|---|
| \(\mathrm { P } ( X = x )\) | 0.2 | \(p\) | 0.2 | \(q\) | 0.15 |

  1. Given that \(\mathrm { E } ( X ) = 4.5\), write down two equations involving \(p\) and \(q\).
  Find
  2. the value of \(p\) and the value of \(q\),
  3. \(\mathrm { P } ( 4 < X \leqslant 7 )\).
  Given that \(\mathrm { E } \left( X ^ { 2 } \right) = 27.4\), find
  4. \(\operatorname { Var } ( X )\),
  5. \(\mathrm { E } ( 19 - 4 X )\),
  6. \(\operatorname { Var } ( 19 - 4 X )\).
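
The two equations in \(p\) and \(q\) come from the probabilities summing to 1 and from \(\mathrm{E}(X) = 4.5\); the remaining parts use \(\operatorname{Var}(X) = \mathrm{E}(X^2) - [\mathrm{E}(X)]^2\), \(\mathrm{E}(aX + b) = a\mathrm{E}(X) + b\) and \(\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)\). A minimal sketch of the arithmetic, assuming exact fractions and variable names chosen here for illustration:

```python
from fractions import Fraction as F

# Known probabilities from the table; p = P(X = 3) and q = P(X = 7) are unknown.
known = {1: F(2, 10), 5: F(2, 10), 9: F(15, 100)}
mean = F(45, 10)          # E(X) = 4.5, given
mean_sq = F(274, 10)      # E(X^2) = 27.4, given

# Equation 1: p + q = 1 - (sum of known probabilities).
rhs1 = 1 - sum(known.values())
# Equation 2: 3p + 7q = E(X) - (contribution of the known values).
rhs2 = mean - sum(x * pr for x, pr in known.items())

# Eliminate p: (3p + 7q) - 3(p + q) = 4q.
q = (rhs2 - 3 * rhs1) / 4
p = rhs1 - q

dist = {**known, 3: p, 7: q}

# P(4 < X <= 7) covers the values 5 and 7 only.
p_4_to_7 = sum(pr for x, pr in dist.items() if 4 < x <= 7)

var_x = mean_sq - mean**2
mean_lin = 19 - 4 * mean           # E(19 - 4X)
var_lin = (-4) ** 2 * var_x        # Var(19 - 4X)

print(float(p), float(q), float(p_4_to_7), float(var_x), float(mean_lin), float(var_lin))
```

As a consistency check, the completed distribution reproduces the stated value of \(\mathrm{E}(X^2)\).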