| Exam Board | OCR MEI |
|---|---|
| Module | Further Statistics Major |
| Year | 2024 |
| Session | June |
| Marks | 14 |
| Topic | Linear regression |
| Type | Interpret features of scatter diagram |
| Difficulty | Moderate -0.3. This is a straightforward linear regression question requiring standard calculations (finding the regression line from summary statistics, making predictions, interpreting correlation). Part (a) tests basic scatter diagram interpretation, parts (b)-(c) are routine formula application, and parts (d)-(e) require standard commentary on correlation strength and appropriate use of regression lines. Although it is a multi-part question worth several marks, all components are textbook exercises with no novel problem-solving required, making it slightly easier than average. |
| Spec | 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |
Question 8:
8 | (a) | Label flat A at the point approx. (120, 600)
Label flat B at the point approx. (90, 1000) | B1
B1
[2] | 3.3
1.1 | B0 unless point labelled A
B0 unless point labelled B
8 | (b) | DR NB: $\bar{x} = \frac{652.5}{11} = 59.318$, $\bar{y} = \frac{5067}{11} = 460.63$
$b = \frac{S_{xy}}{S_{xx}} = \frac{315928.2 - (652.5 \times 5067/11)}{41987.35 - 652.5^2/11} = \frac{15362.97}{3282.236} = 4.6806\ldots$
For correct line (y on x) so equation is $y - \bar{y} = b(x - \bar{x})$
$y - 460.63 = 4.6806(x - 59.318)$
$y = 4.6806x + 182.99$ | M1
A1
B1
M1
A1
[5] | 1.1a
1.1
3.3
1.1
1.1 | For attempt at gradient (b)
Use of 13 instead of 11 can get Max M1A0B1M1A0 which
would lead to $y = 6.669x + 55.05$
Allow 4.7 or better
For equation of line
Condone use of x on y regression line for Max
M1A0B0M1A0
8 | (c) | Area 40 → £370 thousand
Area 110 → £698 thousand | B1
B1
[2] | 1.1
1.1 | FT provided y on x. Allow B1B0 if answers given to more
than nearest whole number of thousands or if thousands
omitted and B0B0 if both
FT provided y on x.
n = 13 leads to £321 thousand and £788 thousand
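The calculations in parts (b) and (c) can be checked numerically. The following is a minimal Python sketch, not part of the mark scheme: it takes the summary statistics quoted in the question, and the function name `regression_y_on_x` is a hypothetical helper introduced here for illustration.

```python
# Summary statistics for the 11 remaining flats, as given in the question.
n = 11
sum_x, sum_y = 652.5, 5067
sum_x2, sum_xy = 41987.35, 315928.2


def regression_y_on_x(n, sum_x, sum_y, sum_x2, sum_xy):
    """Return (gradient, intercept) of the least squares line of y on x.

    Hypothetical helper for illustration: b = Sxy / Sxx, a = ybar - b * xbar.
    """
    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    b = s_xy / s_xx
    a = sum_y / n - b * sum_x / n
    return b, a


b, a = regression_y_on_x(n, sum_x, sum_y, sum_x2, sum_xy)
print(f"y = {b:.4f}x + {a:.2f}")

# Part (c): estimated prices (in thousands of pounds) for the two floor areas.
estimates = {area: round(a + b * area) for area in (40, 110)}
for area, price in estimates.items():
    print(f"Area {area} m^2: about £{price} thousand")
```

Rounding is applied only at the final step, which is why the sketch keeps the unrounded gradient and intercept when evaluating the line.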
8 | (d) | Although the prediction for $40 \mathrm{~m}^2$ lies within the data
(interpolation), the points do not lie too close to the line, so it is
not too reliable.
The value of $r^2$ is 0.585, which is not close to 1; this
further suggests that the estimate is only moderately reliable.
The prediction for $110 \mathrm{~m}^2$ is even less reliable since it is an
extrapolation. | B1
B1
B1
[3] | 2.2a
3.5b
3.5b | Allow first B1 for any correct comment about $40 \mathrm{~m}^2$
Condone "Near the centre of the data"
Condone comment about the PMCC for first B1
Allow second B1 for all 3 correct comments about $40 \mathrm{~m}^2$
and must use $r^2$ rather than $r$
Allow $r^2$ is reasonably close to 1 and the points are fairly
close to a straight line
Max 2 out of 3 if any wrong comments seen
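The value of $r^2$ used in part (d) can be reproduced from the summary statistics. This is a rough Python sketch, assuming the standard formula $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$; the variable names are chosen here for illustration only.

```python
import math

# Summary statistics for the 11 remaining flats, as given in the question.
n = 11
sum_x, sum_y = 652.5, 5067
sum_x2, sum_y2, sum_xy = 41987.35, 2456813, 315928.2

# Corrected sums of squares and products.
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Product moment correlation coefficient and its square.
r = s_xy / math.sqrt(s_xx * s_yy)
r_squared = r ** 2

print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
```

Squaring the unrounded $r$ can differ in the third decimal place from squaring the rounded value 0.765 quoted in the question, so small discrepancies with the mark scheme figure are expected.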
8 | (e) | The regression line of x on y would be needed.
It would not be sensible since the line in part (b) only measures
the average cost for a given area and not the reverse | B1
B1
[2] | 3.5b
3.5c | Any suitable context
The regression line of floor area on price would be needed
gets B1B1.
Condone "the regression coefficient will be calculated using $\frac{S_{xy}}{S_{xx}}$
so the line found in part (b) cannot be used" for B1
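To illustrate the point in part (e), the sketch below compares the gradient obtained by rearranging the y on x line with the gradient of the x on y line. It is an illustration only, assuming the standard x on y gradient $S_{xy} / S_{yy}$; the names `d`, `c` and `rearranged_gradient` are hypothetical and do not appear in the mark scheme.

```python
# Summary statistics for the 11 remaining flats, as given in the question.
n = 11
sum_x, sum_y = 652.5, 5067
sum_x2, sum_y2, sum_xy = 41987.35, 2456813, 315928.2

s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# y on x line (part (b)): y = a + b * x
b = s_xy / s_xx

# x on y line: x = c + d * y, the line suited to predicting area from price.
d = s_xy / s_yy
c = sum_x / n - d * sum_y / n

# Gradient obtained by simply rearranging the y on x line for x.
rearranged_gradient = 1 / b

print(f"x on y gradient:        {d:.4f}")
print(f"rearranged y on x line: {rearranged_gradient:.4f}")
```

The two gradients agree only when $r^2 = 1$, since $b \times d = r^2$; for these data they differ noticeably, which is why the line from part (b) is not suitable for predicting floor area from price.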
8 An estate agent collects data for a random selection of 13 flats in order to investigate the link between the floor areas of flats and their price. The scatter diagram shows the floor areas, $x \mathrm {~m} ^ { 2 }$, and prices, $\pounds y$ thousand, of the 13 flats.\\
\includegraphics[max width=\textwidth, alt={}, center]{bab116b3-6e5f-44db-ac86-670e4040d649-07_613_1246_386_242}
\begin{enumerate}[label=(\alph*)]
\item The estate agent notes that two of the data points are outliers. One is Flat A which has a large floor area but is in poor condition. The other is Flat B which has a balcony with a desirable view overlooking the sea.
Label these two data points on the copy of the scatter diagram in the Printed Answer Booklet.
The estate agent decides to remove these two data points from the analysis. Summary statistics for the remaining 11 flats are as follows.
$$\sum x = 652.5 \quad \sum y = 5067 \quad \sum x ^ { 2 } = 41987.35 \quad \sum y ^ { 2 } = 2456813 \quad \sum x y = 315928.2$$
\item In this question you must show detailed reasoning.
Calculate the equation of a regression line which is suitable for estimating the price of a flat from its floor area.
\item Use the regression line to estimate the price for the following floor areas.
\begin{itemize}
  \item $40 \mathrm {~m} ^ { 2 }$
  \item $110 \mathrm {~m} ^ { 2 }$
\end{itemize}
\item Given that the value of the product moment correlation coefficient for these 11 data items is 0.765, comment on the reliability of your estimates.
\item The estate agent thinks that he can predict the floor area of a flat from its price, using the equation of the regression line found in part (b).
Comment briefly on the estate agent's idea.
\end{enumerate}
\hfill \mbox{\textit{OCR MEI Further Statistics Major 2024 Q8 [14]}}