- Following some school examinations, Chetna is studying the results of the 16 students in her class. The mark for paper 1, \(x\), and the mark for paper 2, \(y\), for each student are summarised in the following statistics.
$$\bar { x } = 35.75 \quad \bar { y } = 25.75 \quad \sigma _ { x } = 7.79 \quad \sigma _ { y } = 11.91 \quad \sum x y = 15837$$
- Comment on the differences between the marks of the students on paper 1 and paper 2.
Chetna decides to examine these data in more detail and plots the marks for each of the 16 students on the scatter diagram opposite.
- Explain why the circled point \(( 38,0 )\) is possibly an outlier.
- Suggest a possible reason for this result.
Chetna decides to omit the data point \(( 38,0 )\) and examine the other 15 students' marks.
- Find the value of \(\bar { x }\) and the value of \(\bar { y }\) for these 15 students.
For these 15 students
- explain why \(\sum x y\) is still 15837
- show that \(\mathrm { S } _ { x y } = 1169.8\)
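The arithmetic behind the 15-student summary can be checked with a short sketch. It assumes only the summary statistics quoted in the question; the variable names are illustrative, and the totals are recovered from the means rather than from the raw marks, which are not given.

```python
# Summary statistics for all 16 students, as quoted in the question.
n_all = 16
mean_x_all, mean_y_all = 35.75, 25.75
sum_xy_all = 15837
outlier = (38, 0)  # the circled point that Chetna omits

# Recover the totals from the means: sum = n * mean.
sum_x_all = n_all * mean_x_all  # 572.0
sum_y_all = n_all * mean_y_all  # 412.0

# Remove the omitted student's contribution from each total.
n = n_all - 1
sum_x = sum_x_all - outlier[0]
sum_y = sum_y_all - outlier[1]
# The omitted product is 38 * 0 = 0, so the sum of xy is unchanged.
sum_xy_15 = sum_xy_all - outlier[0] * outlier[1]

mean_x = sum_x / n
mean_y = sum_y / n

# S_xy = sum(xy) - sum(x) * sum(y) / n
s_xy = sum_xy_15 - sum_x * sum_y / n

print(round(mean_x, 2), round(mean_y, 2))  # 35.6 27.47
print(sum_xy_15)                           # 15837
print(round(s_xy, 1))                      # 1169.8
```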
For these 15 students, Chetna calculates \(\mathrm { S } _ { x x } = 965.6\) and \(\mathrm { S } _ { y y } = 1561.7\) correct to 1 decimal place.
- Calculate the product moment correlation coefficient for these 15 students.
- Calculate the equation of the line of regression of \(y\) on \(x\) for these 15 students, giving your answer in the form \(y = a + b x\)
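As a check on the two calculations above, the following sketch applies the standard formulas \(r = \mathrm{S}_{xy} / \sqrt{\mathrm{S}_{xx}\mathrm{S}_{yy}}\), \(b = \mathrm{S}_{xy} / \mathrm{S}_{xx}\) and \(a = \bar{y} - b\bar{x}\). The means \(534/15\) and \(412/15\) are an assumption derived from the 16-student means with the point \((38, 0)\) removed; they are not stated in the question.

```python
import math

# Values quoted in the question for the 15 remaining students.
s_xy, s_xx, s_yy = 1169.8, 965.6, 1561.7

# Assumed means for the 15 students, derived from the 16-student totals
# (572 - 38 and 412 - 0) rather than quoted in the question.
mean_x = 534 / 15
mean_y = 412 / 15

# Product moment correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

# Regression line of y on x: y = a + b x.
b = s_xy / s_xx
a = mean_y - b * mean_x

print(round(r, 3))               # 0.953
print(round(a, 1), round(b, 2))  # -15.7 1.21
```

To this accuracy the line is roughly \(y = -15.7 + 1.21x\); more decimal places should be kept in \(a\) and \(b\) if the line is used for further calculation.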
The product moment correlation coefficient between \(x\) and \(y\) for all 16 students is 0.746
- Explain how your calculation in part (e) supports Chetna's decision to omit the point \(( 38,0 )\) before calculating the equation of the linear regression line.
- Estimate the mark in the second paper for a student who scored 38 marks in the first paper.
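The estimate can be sketched by substituting \(x = 38\) into the regression line of \(y\) on \(x\). This is a minimal check, assuming the 15-student means \(534/15\) and \(412/15\) derived from the quoted statistics; `predict_paper2` is a hypothetical helper name used only for illustration.

```python
# Values quoted in the question for the 15 remaining students.
s_xy, s_xx = 1169.8, 965.6

# Assumed means for the 15 students, derived from the quoted summary data.
mean_x = 534 / 15
mean_y = 412 / 15

# Coefficients of the regression line y = a + b x.
b = s_xy / s_xx
a = mean_y - b * mean_x


def predict_paper2(paper1_mark):
    """Estimate the paper 2 mark from a paper 1 mark using y = a + b x."""
    return a + b * paper1_mark


estimate = predict_paper2(38)
print(round(estimate, 1))  # 30.4
```

Since 38 lies close to the mean paper 1 mark, this is interpolation within the range of the data rather than extrapolation.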
\includegraphics[max width=\textwidth, alt={}]{d3f4450d-60eb-49b6-be1b-d2fcfad0451f-17_1127_1146_301_406}
\includegraphics[max width=\textwidth, alt={}]{d3f4450d-60eb-49b6-be1b-d2fcfad0451f-20_2630_1828_121_121}