Draw scatter diagram from data

Question provides numerical data in a table and asks the student to draw or plot a scatter diagram.

4 questions

Edexcel S1 2015 June Q7
7. A doctor is investigating the correlation between blood protein, \(p\), and body mass index, \(b\). He takes a random sample of 8 patients and the data are shown in the table below.
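| Patient | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
|---|---|---|---|---|---|---|---|---|
| \(b\) | 32 | 36 | 40 | 44 | 42 | 21 | 27 | 37 |
| \(p\) | 18 | 21 | 31 | 39 | 21 | 12 | 19 | 70 |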
  1. Draw a scatter diagram of these data on the axes provided.
    \includegraphics[max width=\textwidth, alt={}, center]{36cf6341-1957-45b9-9f7d-0914506f5919-13_938_673_785_614}
    The doctor decides to leave out patient \(H\) from his calculations.
  2. Give a reason for the doctor's decision.
    For the 7 patients \(A , B , C , D , E , F\) and \(G\),
    $$S _ { b p } = 369 , \quad S _ { p p } = 490 \text { and } S _ { b b } = 423 \frac { 5 } { 7 }$$
  3. Find the product moment correlation coefficient, \(r\), for these 7 patients.
  4. Without any further calculations, state how \(r\) would differ from your answer in part (c) if it was calculated for all 8 patients.
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{36cf6341-1957-45b9-9f7d-0914506f5919-15_1322_1593_207_173} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure}
    The histogram in Figure 1 summarises the times, in minutes, that 200 people spent shopping in a supermarket.
  5. Give a reason to justify the use of a histogram to represent these data.
    Given that 40 people spent between 11 and 21 minutes shopping in the supermarket, estimate
  6. the number of people that spent between 18 and 25 minutes shopping in the supermarket,
  7. the median time spent shopping in the supermarket by these 200 people.
    The mid-point of each bar is represented by \(x\) and the corresponding frequency by \(\mathrm { f }\).
  8. Show that \(\sum \mathrm { f } x = 6390\).
    Given that \(\sum \mathrm { f } x ^ { 2 } = 238430\),
  9. for the data shown in the histogram, calculate estimates of
    1. the mean,
    2. the standard deviation.
    A coefficient of skewness is given by \(\frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }\)
  10. Calculate this coefficient of skewness for these data.
    The manager of the supermarket decides to model these data with a normal distribution.
  11. Comment on the manager's decision. Give a justification for your answer.
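As a check on the quoted summary values, the following is a minimal Python sketch, not a model answer. It recomputes \(S_{bp}\), \(S_{pp}\) and \(S_{bb}\) from the table for patients \(A\) to \(G\), forms \(r = S_{bp} / \sqrt{S_{bb} S_{pp}}\), and then estimates the mean and standard deviation of the shopping times from \(\sum \mathrm{f}x\) and \(\sum \mathrm{f}x^2\). The helper name `s_sum` is hypothetical, and the standard deviation is assumed to use divisor \(n\).

```python
from fractions import Fraction
from math import sqrt

# Table values for patients A to G (patient H is left out, as in the question).
b = [32, 36, 40, 44, 42, 21, 27]
p = [18, 21, 31, 39, 21, 12, 19]

def s_sum(u, v):
    """Return S_uv = sum(u*v) - sum(u)*sum(v)/n as an exact fraction."""
    n = len(u)
    return sum(ui * vi for ui, vi in zip(u, v)) - Fraction(sum(u) * sum(v), n)

s_bp = s_sum(b, p)  # the question quotes 369
s_pp = s_sum(p, p)  # the question quotes 490
s_bb = s_sum(b, b)  # the question quotes 423 5/7

# Product moment correlation coefficient for the 7 patients.
r = float(s_bp) / sqrt(float(s_bb) * float(s_pp))

# Histogram summaries quoted in the question.
n_people, sum_fx, sum_fx2 = 200, 6390, 238430
mean = sum_fx / n_people
# Assumption: standard deviation with divisor n.
sd = sqrt(sum_fx2 / n_people - mean ** 2)

print(s_bp, s_pp, s_bb, round(r, 3), mean, round(sd, 2))
```

Exact fractions are used for the \(S\) values so that the comparison with \(423\frac{5}{7}\) is not affected by rounding.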
Edexcel S1 2005 January Q3
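3. The following table shows the height \(x\), to the nearest cm, and the weight \(y\), to the nearest kg, of a random sample of 12 students.

| \(x\) | 148 | 164 | 156 | 172 | 147 | 184 | 162 | 155 | 182 | 165 | 175 | 152 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 39 | 59 | 56 | 77 | 44 | 77 | 65 | 49 | 80 | 72 | 70 | 52 |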
  1. On graph paper, draw a scatter diagram to represent these data.
  2. Write down, with a reason, whether the correlation coefficient between \(x\) and \(y\) is positive or negative.
    The data in the table can be summarised as follows.
    $$\Sigma x = 1962 , \quad \Sigma y = 740 , \quad \Sigma y ^ { 2 } = 47746 , \quad \Sigma x y = 122783 , \quad S _ { x x } = 1745 .$$
  3. Find \(S _ { x y }\).
    The equation of the regression line of \(y\) on \(x\) is \(y = - 106.331 + b x\).
  4. Find, to 3 decimal places, the value of \(b\).
  5. Find, to 3 significant figures, the mean \(\bar { y }\) and the standard deviation \(s\) of the weights of this sample of students.
  6. Find the values of \(\bar { y } \pm 1.96 s\).
  7. Comment on whether or not you think that the weights of these students could be modelled by a normal distribution.
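The arithmetic in this question can be followed with a short Python sketch that uses only the summary values quoted above. It is an illustration, not a mark scheme. It assumes the usual formulas \(S_{xy} = \Sigma xy - \frac{\Sigma x \Sigma y}{n}\) and \(b = S_{xy} / S_{xx}\), and it assumes the standard deviation is taken with divisor \(n\).

```python
from math import sqrt

# Summary values quoted in the question.
n = 12
sum_x, sum_y = 1962, 740
sum_y2, sum_xy = 47746, 122783
s_xx = 1745

s_xy = sum_xy - sum_x * sum_y / n   # S_xy
b = s_xy / s_xx                     # gradient of the regression line of y on x

x_bar, y_bar = sum_x / n, sum_y / n
a = y_bar - b * x_bar               # intercept; the question quotes -106.331

# Assumption: standard deviation with divisor n.
s = sqrt(sum_y2 / n - y_bar ** 2)
lower, upper = y_bar - 1.96 * s, y_bar + 1.96 * s

print(s_xy, round(b, 3), round(y_bar, 1), round(s, 1))
print(round(lower, 1), round(upper, 1))
```

Comparing the recomputed intercept `a` with the quoted \(-106.331\) gives a quick consistency check on \(b\).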
AQA S1 2006 January Q5
5 [Figure 1, printed on the insert, is provided for use in this question.]
The table shows the times, in seconds, taken by a random sample of 10 boys from a junior swimming club to swim 50 metres freestyle and 50 metres backstroke.
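| Boy | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| Freestyle (\(x\) seconds) | 30.2 | 32.8 | 25.1 | 31.8 | 31.2 | 35.6 | 32.4 | 38.0 | 36.1 | 34.1 |
| Backstroke (\(y\) seconds) | 33.5 | 35.4 | 37.4 | 27.2 | 34.7 | 38.2 | 37.7 | 41.4 | 42.3 | 38.4 |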
  1. On Figure 1, complete the scatter diagram for these data.
  2. Hence:
    1. give two distinct comments on what your scatter diagram reveals;
    2. state, without calculation, which of the following 3 values is most likely to be the value of the product moment correlation coefficient for the data in your scatter diagram.
    $$0.912 \quad 0.088 \quad 0.462$$
  3. In the sample of 10 boys, one boy is a junior-champion freestyle swimmer and one boy is a junior-champion backstroke swimmer. Identify the two most likely boys.
  4. Removing the data for the two boys whom you identified in part (c):
    1. calculate the value of the product moment correlation coefficient for the remaining 8 pairs of values of \(x\) and \(y\);
    2. comment, in context, on the value that you obtain.
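The coefficient asked for in this question can be checked with a small Python sketch. The helper names `pmcc` and `pmcc_excluding` are hypothetical. The boys to remove are deliberately left as an argument, since identifying them is part of the question.

```python
from math import sqrt

# Times in seconds from the table: boy -> (freestyle x, backstroke y).
times = {
    "A": (30.2, 33.5), "B": (32.8, 35.4), "C": (25.1, 37.4),
    "D": (31.8, 27.2), "E": (31.2, 34.7), "F": (35.6, 38.2),
    "G": (32.4, 37.7), "H": (38.0, 41.4), "I": (36.1, 42.3),
    "J": (34.1, 38.4),
}

def pmcc(pairs):
    """Product moment correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    s_xy = sum(x * y for x, y in pairs) - sx * sy / n
    s_xx = sum(x * x for x, _ in pairs) - sx * sx / n
    s_yy = sum(y * y for _, y in pairs) - sy * sy / n
    return s_xy / sqrt(s_xx * s_yy)

def pmcc_excluding(excluded=()):
    """PMCC after dropping the boys whose labels are in `excluded`."""
    kept = [xy for boy, xy in times.items() if boy not in excluded]
    return pmcc(kept)

r_all = pmcc_excluding()  # all 10 boys
print(round(r_all, 3))
```

Calling `pmcc_excluding` with the two labels chosen in part (c) gives the value for the remaining 8 pairs.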
AQA S1 2008 June Q3
3 [Figure 1, printed on the insert, is provided for use in this question.]
The table shows, for each of a sample of 12 handmade decorative ceramic plaques, the length, \(x\) millimetres, and the width, \(y\) millimetres.
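| Plaque | \(x\) | \(y\) |
|---|---|---|
| A | 232 | 109 |
| B | 235 | 112 |
| C | 236 | 114 |
| D | 234 | 118 |
| E | 230 | 117 |
| F | 230 | 113 |
| G | 246 | 121 |
| H | 240 | 125 |
| I | 244 | 128 |
| J | 241 | 122 |
| K | 246 | 126 |
| L | 245 | 123 |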
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret your value in the context of this question.
  3. On Figure 1, complete the scatter diagram for these data.
  4. In fact, the 6 plaques \(\mathrm { A } , \mathrm { B } , \ldots , \mathrm { F }\) are from a different source to the 6 plaques \(\mathrm { G } , \mathrm { H } , \ldots , \mathrm { L }\). With reference to your scatter diagram, but without further calculations, estimate the value of the product moment correlation coefficient between \(x\) and \(y\) for each source of plaque.
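Estimates made by eye from the scatter diagram can be compared with computed values afterwards. The sketch below is a numerical check only, not the method the question asks for in its final part. It evaluates the product moment correlation coefficient for all 12 plaques and then separately for plaques A to F and G to L. The helper name `pmcc` is hypothetical.

```python
from math import sqrt

# Length x and width y in millimetres, from the table.
plaques = {
    "A": (232, 109), "B": (235, 112), "C": (236, 114), "D": (234, 118),
    "E": (230, 117), "F": (230, 113), "G": (246, 121), "H": (240, 125),
    "I": (244, 128), "J": (241, 122), "K": (246, 126), "L": (245, 123),
}

def pmcc(pairs):
    """Product moment correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    s_xy = sum(x * y for x, y in pairs) - sx * sy / n
    s_xx = sum(x * x for x, _ in pairs) - sx * sx / n
    s_yy = sum(y * y for _, y in pairs) - sy * sy / n
    return s_xy / sqrt(s_xx * s_yy)

r_all = pmcc(list(plaques.values()))
r_first = pmcc([plaques[k] for k in "ABCDEF"])   # first source
r_second = pmcc([plaques[k] for k in "GHIJKL"])  # second source

print(round(r_all, 3), round(r_first, 3), round(r_second, 3))
```

Splitting the data by source shows how two separate clusters can produce a strong overall coefficient even when the correlation within each cluster is weak.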