OCR MEI Paper 2 2023 June — Question 14 8 marks

Exam Board: OCR MEI
Module: Paper 2
Year: 2023
Session: June
Marks: 8
Topic: Bivariate data
Type: Interpret census or real-world data
Difficulty: Moderate (-0.8). This is a straightforward data interpretation question requiring basic statistical literacy. Part (a) asks students to identify missing data (#N/A) that needs removal, a simple data cleaning concept. Part (b) requires recognizing that weak correlation (0.37) and a non-linear scatter pattern suggest linear modeling is inappropriate, both standard A-level observations requiring no complex calculation or novel insight.
Spec: 2.02c Scatter diagrams and regression lines; 2.02j Clean data: missing data, errors; 5.08a Pearson correlation: calculate pmcc

## Question 14(a):

| Answer | Marks | Guidance |
|--------|-------|----------|
| discard City of London (as part of the data not available) or discard any regions where one or more pieces of data are missing oe | B1 | **LDS** advantage; do not allow if answer spoiled; eg because it's an anomaly; eg because it's an outlier |
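
The cleaning step can be sketched in Python. This is only an illustration, not part of the mark scheme: the rows are transcribed from Fig. 14.1, `None` is an assumed stand-in for the `#N/A` entry, and the names `fig_14_1` and `clean` are hypothetical.

```python
# Rows transcribed from Fig. 14.1: (area, median income in pounds, percentage).
# None is an assumed stand-in for the "#N/A" entry in the source table.
fig_14_1 = [
    ("City of London", 61100, None),
    ("Barking and Dagenham", 21800, 54.0),
    ("Barnet", 27100, 70.1),
    ("Bexley", 24400, 55.0),
    ("Brent", 22700, 60.0),
    ("Bromley", 28100, 68.0),
]


def clean(rows):
    """Keep only the rows in which every data value is present."""
    return [row for row in rows if all(value is not None for value in row[1:])]


cleaned = clean(fig_14_1)
# Areas discarded because at least one piece of data is missing.
removed = [row[0] for row in fig_14_1 if row not in cleaned]
print(removed)  # ['City of London']
```

The row is discarded because a value is missing, not because 61100 looks unusual, which matches the guidance that "outlier" or "anomaly" spoils the answer.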

---

## Question 14(b):

| Answer | Marks | Guidance |
|--------|-------|----------|
| **scatter** does not look linear oe | B1 | ignore extra comments unless they contradict an otherwise correct answer |
| pmcc not close to 1 oe | B1 | ignore extra comments unless they contradict an otherwise correct answer |

---

## Question 14(c):

| Answer | Marks | Guidance |
|--------|-------|----------|
| $27216 \pm 2 \times 4177.5$ or $61.0 \pm 2 \times 5.32$ | M1 | use of 2 standard deviation check for one of the 4 calculations soi |
| $m < 18861$ or $m > 35571$ | A1 | allow $\leq$ and $\geq$ |
| percentage $< 50.36$ or percentage $> 71.64$ | A1 | allow $\leq$ and $\geq$; if **M1A0A0** allow **M1 SCB1** for all 4 correct values seen |
| [scatter diagram with outliers circled] | A1 | |
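
The four boundary values can be checked with a short Python sketch. It assumes the "more than 2 standard deviations from the mean" rule used in the mark scheme; the function name `outlier_bounds` and the parameter `k` are hypothetical.

```python
def outlier_bounds(mean, sd, k=2):
    """Return (lower, upper); values outside this interval count as outliers."""
    return mean - k * sd, mean + k * sd


# Means and standard deviations taken from Fig. 14.3.
income_low, income_high = outlier_bounds(27216, 4177.5)
pct_low, pct_high = outlier_bounds(61.0, 5.32)

print(round(income_low, 2), round(income_high, 2))  # 18861.0 35571.0
print(round(pct_low, 2), round(pct_high, 2))        # 50.36 71.64
```

A point is circled on Fig. 14.2 if either of its coordinates falls outside the corresponding interval.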

---

## Question 14(d):

| Answer | Marks | Guidance |
|--------|-------|----------|
| between 0 and 0.3743 since eg outliers gave a false impression of linearity; eg scatter will be more like a circle | B1 | need to refer to the shape of the scatter oe |
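
The idea that outliers can give a false impression of linearity can be illustrated with a small Python sketch. The points below are invented for illustration and are not the London data; `pmcc` is a hypothetical helper implementing the usual formula $S_{xy}/\sqrt{S_{xx}S_{yy}}$.

```python
import math


def pmcc(xs, ys):
    """Product moment correlation coefficient, Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented data: a square cluster with no linear trend, plus one distant point.
cluster = [(0, 0), (0, 1), (1, 0), (1, 1)]
with_outlier = cluster + [(5, 5)]

r_with = pmcc(*zip(*with_outlier))
r_without = pmcc(*zip(*cluster))
print(round(r_with, 4), round(r_without, 4))
```

Here the single distant point produces a strong positive coefficient, and removing it leaves none. In general the direction of the change depends on where the outliers lie relative to the rest of the scatter, which is why the answer must refer to the shape of the scatter in Fig. 14.2.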

---
14 The pre-release material contains information concerning the median income of taxpayers in $\pounds$ and the percentage of all pupils at the end of KS4 achieving 5 or more GCSEs at grade A*-C, including English and Maths, for different areas of London.

Some of the data for 2014/15 is shown in Fig. 14.1.

\begin{table}[h]
\begin{center}
\captionsetup{labelformat=empty}
\caption{Fig. 14.1}
\begin{tabular}{|l|l|l|}
\hline
 & Median Income of Taxpayers in \pounds & Percentage of Pupils Achieving 5 or more A*-C, including English and Maths \\
\hline
City of London & 61100 & \#N/A \\
\hline
Barking and Dagenham & 21800 & 54.0 \\
\hline
Barnet & 27100 & 70.1 \\
\hline
Bexley & 24400 & 55.0 \\
\hline
Brent & 22700 & 60.0 \\
\hline
Bromley & 28100 & 68.0 \\
\hline
\end{tabular}
\end{center}
\end{table}

A student investigated whether there is any relationship between median income of taxpayers and percentage of pupils achieving 5 or more GCSEs at grade A*-C, including English and Maths.
\begin{enumerate}[label=(\alph*)]
\item With reference to Fig. 14.1, explain how the data should be cleaned before any analysis can take place.

After the data was cleaned, the student used software to draw the scatter diagram shown in Fig. 14.2.

Scatter diagram to show percentage of pupils achieving 5 A*-C grades against median income of taxpayers

\begin{figure}[h]
\begin{center}
\captionsetup{labelformat=empty}
\caption{Fig. 14.2}
  \includegraphics[alt={},max width=\textwidth]{11788aaf-98fb-4a78-8a40-a40743b1fe15-10_574_1481_1900_241}
\end{center}
\end{figure}

The student calculated that the product moment correlation coefficient for these data is 0.3743.
\item Give two reasons why it may not be appropriate to use a linear model for the relationship between median income of taxpayers in $\pounds$ and the percentage of all pupils at the end of KS4 achieving 5 or more GCSEs at grade A*-C.

The student carried out some further analysis. The results are shown in Fig. 14.3.

\begin{table}[h]
\begin{center}
\captionsetup{labelformat=empty}
\caption{Fig. 14.3}
\begin{tabular}{ | l | c | c | }
\hline
 & \begin{tabular}{ l }
median income of \\
taxpayers in $\pounds$ \\
\end{tabular} & \begin{tabular}{ l }
percentage of pupils \\
achieving 5+ A*-C \\
\end{tabular} \\
\hline
mean & 27216 & 61.0 \\
\hline
standard deviation & 4177.5 & 5.32 \\
\hline
\end{tabular}
\end{center}
\end{table}

The student identified three outliers in total.
\item \begin{itemize}
  \item Use the information in Fig. 14.3 to determine the range of values of the median income of taxpayers in $\pounds$ which are outliers.
  \item Use the information in Fig. 14.3 to determine the range of values of the percentage of all pupils at the end of KS4 achieving 5 or more GCSEs at grade A*-C which are outliers.
  \item On the copy of Fig. 14.2 in the Printed Answer Booklet, circle the three outliers identified by the student.
\end{itemize}

The student decided to remove these outliers and recalculate the product moment correlation coefficient.
\item Explain whether the new value of the product moment correlation coefficient would be between 0.3743 and 1 or between 0 and 0.3743.
\end{enumerate}

\hfill \mbox{\textit{OCR MEI Paper 2 2023 Q14 [8]}}