| Exam Board | OCR |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2007 |
| Session | June |
| Marks | 12 |
| Topic | Bivariate data |
| Type | Calculate regression line equation |
| Difficulty | Moderate -0.3. This is a standard S1 regression question requiring routine application of the correlation coefficient formula and interpretation of bivariate data. While it has multiple parts, each involves either straightforward calculation (computing r using given summations) or basic conceptual understanding (recognizing perfect rank correlation, understanding extrapolation reliability). No novel problem-solving or deep insight is required, just methodical application of A-level statistics formulas and standard interpretation principles. |
| Spec | 5.08a Pearson correlation: calculate pmcc; 5.08e Spearman rank correlation; 5.09c Calculate regression line |

| x | 0 | 1 | 2 | 3 | 4 | 7 | 13 | 30 |
|---|---|---|---|---|---|---|---|---|
| y | 0 | 4 | 8 | 10 | 11 | 12 | 13 | 14 |
### Part ia
| Answer | Marks | Guidance |
|---|---|---|
| Correct substitution in ≥ two S formulae | M1 | Any version |
| $\frac{767 - \frac{60\times72}{8}}{\sqrt{(1148-\frac{60^2}{8})(810-\frac{72^2}{8})}}$ or $\frac{227}{\sqrt{698\times162}}$ | M1 | All correct. Or $\frac{767-8\times7.5\times9}{\sqrt{(1148-8\times7.5^2)(810-8\times9^2)}}$, or correct substitution in any correct formula for $r$ |
| $= 0.675$ (3 sf) | A1 [3] | |
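
As a check on the arithmetic, the substitution above can be sketched in Python from the summary statistics quoted in the question. The variable names (`s_xy`, `s_xx`, `s_yy`) are illustrative and are not part of the mark scheme.

```python
import math

# Summary statistics quoted in the question
n = 8
sum_x, sum_y = 60, 72
sum_x2, sum_y2 = 1148, 810
sum_xy = 767

# The three S formulae
s_xy = sum_xy - sum_x * sum_y / n   # 767 - 540 = 227
s_xx = sum_x2 - sum_x ** 2 / n      # 1148 - 450 = 698
s_yy = sum_y2 - sum_y ** 2 / n      # 810 - 648 = 162

# Product moment correlation coefficient
r = s_xy / math.sqrt(s_xx * s_yy)
print(round(r, 3))  # 0.675
```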
### Part ib
| Answer | Marks | Guidance |
|---|---|---|
| 1 | B1 | |
| y always increases with x, or ranks same | B1 [2] | +ve gradient throughout. Increase in steps. Same order. Both in ascending order. Perfect RANK correlation. Ignore extra. NOT "increasing proportionately" |
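
The question asks for this value without calculation, but the reasoning can be confirmed with a short Python sketch. It ranks both variables and applies the usual formula $r_s = 1 - \frac{6\Sigma d^2}{n(n^2-1)}$. The helper `ranks` is a hypothetical name and assumes no tied values, which holds for this data set.

```python
x = [0, 1, 2, 3, 4, 7, 13, 30]
y = [0, 4, 8, 10, 11, 12, 13, 14]

def ranks(values):
    """Rank from 1 (smallest) upwards; assumes no ties, as in this data set."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

rank_x, rank_y = ranks(x), ranks(y)
n = len(x)

# Every pair has the same rank in x and in y, so each difference d is 0
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))
print(sum_d2, r_s)  # 0 1.0
```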
### Part iia
| Answer | Marks | Guidance |
|---|---|---|
| Closer to 1, or increases | B1 | Correlation stronger |
| Because nearer to straight line | B1 [2] | Fewer outliers. "They" are outliers. Ignore extra |
### Part iib
| Answer | Marks | Guidance |
|---|---|---|
| None, or remains at 1 | B1 | |
| Because y still increasing with x, oe | B1 [2] | $\Sigma d^2$ still 0. Still same order. Ignore extra. NOT "differences still the same". NOT ft (i)(b) |
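
Part (ii) is meant to be answered without calculation, but a rough Python sketch can confirm both answers by recomputing each coefficient with the points for $x = 0$ and $x = 1$ removed. The functions `pmcc` and `spearman` are illustrative helpers written for this check, and `spearman` assumes no tied values.

```python
import math

x = [0, 1, 2, 3, 4, 7, 13, 30]
y = [0, 4, 8, 10, 11, 12, 13, 14]

def pmcc(xs, ys):
    """Product moment correlation coefficient via the S formulae."""
    n = len(xs)
    s_xy = sum(a * b for a, b in zip(xs, ys)) - sum(xs) * sum(ys) / n
    s_xx = sum(a * a for a in xs) - sum(xs) ** 2 / n
    s_yy = sum(b * b for b in ys) - sum(ys) ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)

def spearman(xs, ys):
    """Spearman's r_s from rank differences; assumes no tied values."""
    n = len(xs)
    rank_x = [sorted(xs).index(v) + 1 for v in xs]
    rank_y = [sorted(ys).index(v) + 1 for v in ys]
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# All eight points versus the six points left after dropping x = 0 and x = 1
r_all, r_trim = pmcc(x, y), pmcc(x[2:], y[2:])
rs_all, rs_trim = spearman(x, y), spearman(x[2:], y[2:])
print(f"r:   {r_all:.3f} -> {r_trim:.3f}")    # r:   0.675 -> 0.820
print(f"r_s: {rs_all:.1f} -> {rs_trim:.1f}")  # r_s: 1.0 -> 1.0
```

The value of $r$ moves closer to 1 because the remaining points lie nearer to a straight line, while $r_s$ is unchanged because the remaining values are still in the same order.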
### Part iii
| Answer | Marks | Guidance |
|---|---|---|
| 13.8 to 14.0 | B1 [1] | |
### Part iv
| Answer | Marks | Guidance |
|---|---|---|
| (iii) or graph or diagram or "my estimate" | B1 | Must be clear which estimate. Can be implied. "This est" probably ⇒ using equation of line |
| Takes account of curve | B1 [2] | Straight line is not a good fit. Not linear. Correlation not strong |
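
Although the question asks for a reason without calculation, a brief Python sketch shows why Jack's estimate is the less reliable one. It fits the least-squares regression line of $y$ on $x$ from the quoted summary statistics and evaluates it at $x = 29$. The names `a`, `b` and `y_hat` are illustrative.

```python
# Summary statistics quoted in the question
n = 8
sum_x, sum_y = 60, 72
sum_x2, sum_xy = 1148, 767

s_xy = sum_xy - sum_x * sum_y / n   # 227
s_xx = sum_x2 - sum_x ** 2 / n      # 698

# Regression line of y on x: y = a + b x
b = s_xy / s_xx                     # gradient
a = sum_y / n - b * sum_x / n       # intercept; the line passes through the mean point

y_hat = a + b * 29
print(f"{y_hat:.1f}")  # 16.0
```

The straight line gives roughly 16.0, which is above every observed value of $y$ (the largest is 14) and above the 13.8 to 14.0 read from the diagram, because the line ignores the way the data level off.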
**Total: 12**
---
6 A machine with artificial intelligence is designed to improve its efficiency rating with practice. The table shows the values of the efficiency rating, $y$, after the machine has carried out its task various numbers of times, $x$.
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | c | c | }
\hline
$x$ & 0 & 1 & 2 & 3 & 4 & 7 & 13 & 30 \\
\hline
$y$ & 0 & 4 & 8 & 10 & 11 & 12 & 13 & 14 \\
\hline
\end{tabular}
\end{center}
$$\left[ n = 8 , \Sigma x = 60 , \Sigma y = 72 , \Sigma x ^ { 2 } = 1148 , \Sigma y ^ { 2 } = 810 , \Sigma x y = 767 . \right]$$
These data are illustrated in the scatter diagram.\\
\includegraphics[max width=\textwidth, alt={}, center]{dfad6626-75ca-4dbd-9c45-42f809c163f3-4_769_1328_760_411}\\
(i) (a) Calculate the value of $r$, the product moment correlation coefficient.\\
(b) Without calculation, state with a reason the value of $r_s$, Spearman's rank correlation coefficient.\\
(ii) A researcher suggests that the data for $x = 0$ and $x = 1$ should be ignored. Without calculation, state with a reason what effect this would have on the value of\\
(a) $r$,\\
(b) $r _ { s }$.\\
(iii) Use the diagram to estimate the value of $y$ when $x = 29$.\\
(iv) Jack finds the equation of the regression line of $y$ on $x$ for all the data, and uses it to estimate the value of $y$ when $x = 29$. Without calculation, state with a reason whether this estimate or the one found in part (iii) will be the more reliable.
\hfill \mbox{\textit{OCR S1 2007 Q6 [12]}}