| Exam Board | Edexcel |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2007 |
| Session | June |
| Marks | 15 |
| Topic | Linear regression |
| Type | Calculate y on x from raw data table |
| Difficulty | Moderate (-0.3). This is a standard S1 linear regression question with all summations provided. Students must apply memorized formulas for \(S_{xy}\), \(S_{xx}\), and the regression coefficients, then interpret residuals. The calculations are straightforward with no conceptual challenges, making it slightly easier than average but still requiring proper technique and interpretation. |
| Spec | 2.02c Scatter diagrams and regression lines; 5.08a Pearson correlation: calculate pmcc; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |

| Chocolate brand | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| \(x\) (\% cocoa) | 10 | 20 | 30 | 35 | 40 | 50 | 60 | 70 |
| \(y\) (pence) | 35 | 55 | 40 | 100 | 60 | 90 | 110 | 130 |

## Question 3:
**Part (a):**

| Answer | Marks | Guidance |
|---|---|---|
| Use overlay | B2 (2 marks) | Points B2: each within 1 small square of the correct point; subtract 1 mark for each error, minimum 0. |

**Part (b):**

| Answer | Marks | Guidance |
|---|---|---|
| $S_{xy} = 28750 - \frac{315 \times 620}{8} = 4337.5$ (**answer given**, so award for method) | M1 | |
| $S_{xx} = 15225 - \frac{315^2}{8} = 2821.875$ | M1A1 (3 marks) | Anything rounding to 2820 for A1. |

**Part (c):**

| Answer | Marks | Guidance |
|---|---|---|
| $b = \frac{4337.5}{S_{xx}} = 1.537\ldots \approx 1.5$ | M1, A1 | |
| $a = \bar{y} - b\bar{x} = \frac{620}{8} - b\,\frac{315}{8} = 16.97\ldots \approx 17.0$ | M1, A1 (4 marks) | Anything rounding to 1.5 and 17.0 (accept 17). |

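
As a cross-check on parts (b) and (c), the sums, $S_{xy}$, $S_{xx}$ and the coefficients can be recomputed from the raw data. The following is a minimal Python sketch, not part of the mark scheme; the variable names are illustrative assumptions.

```python
# Raw data from the question: x = % cocoa solids, y = price in pence per 100 g.
x = [10, 20, 30, 35, 40, 50, 60, 70]
y = [35, 55, 40, 100, 60, 90, 110, 130]
n = len(x)

# Summary sums (the question supplies these; recomputed here as a check).
sum_x = sum(x)
sum_y = sum(y)
sum_xx = sum(xi * xi for xi in x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# S_xy and S_xx as used in part (b).
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_xx - sum_x ** 2 / n

# Least-squares coefficients for y = a + b x, as in part (c).
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

print(f"S_xy = {s_xy}, S_xx = {s_xx}")
print(f"b = {b:.1f}, a = {a:.1f}")
```

Working with the unrounded value of `b` when computing `a` mirrors the mark scheme, which rounds only the final answers to 1 decimal place.
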
**Part (d):**

| Answer | Marks | Guidance |
|---|---|---|
| Use overlay | B1, B1 (2 marks) | Follow through for intercept for first B1. Correct slope of straight line for second B1. |

**Part (e):**

| Answer | Marks | Guidance |
|---|---|---|
| Brand D, since it is a long way above/from the line (dependent on 'Brand D' above) | B1, B1 | |
| Using line: $y = 17 + 35 \times 1.5 = 69.5$ | M1A1 (4 marks) | Anything rounding to 69p–71p for final A1. Reading from graph acceptable for M1A1. If the answer given is a value read from the graph at $x = 35$ but is out of range, award M1A0. |

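
The mark scheme expects part (e) to be answered by eye from the scatter diagram. As a numerical stand-in (an assumption, not the marked method), the vertical distance of each point from the line $y = 17 + 1.5x$ can be computed. This is a minimal Python sketch; the names `residuals`, `overpriced` and `fair_price` are illustrative.

```python
# Raw data from the question, with brand labels A to H.
brands = "ABCDEFGH"
x = [10, 20, 30, 35, 40, 50, 60, 70]
y = [35, 55, 40, 100, 60, 90, 110, 130]

# Rounded coefficients from part (c): y = a + b x.
a, b = 17.0, 1.5

# Residual = actual price minus price predicted by the regression line.
residuals = {brand: yi - (a + b * xi) for brand, xi, yi in zip(brands, x, y)}

# The brand furthest above the line is the candidate for "overpriced".
overpriced = max(residuals, key=residuals.get)

# A fair price is the value on the line at that brand's cocoa percentage.
fair_price = a + b * x[brands.index(overpriced)]

print(overpriced, residuals[overpriced], fair_price)
```

The largest positive residual picks out the same brand as the visual argument, and the fitted value at that brand's $x$ matches the price obtained from the line in the mark scheme.
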
---
3. A student is investigating the relationship between the price ($y$ pence) of 100 g of chocolate and the percentage ($x\%$) of cocoa solids in the chocolate.\\
The following data is obtained
\begin{center}
\begin{tabular}{|l|l|l|l|l|l|l|l|l|}
\hline
Chocolate brand & A & B & C & D & E & F & G & H \\
\hline
$x$ (\% cocoa) & 10 & 20 & 30 & 35 & 40 & 50 & 60 & 70 \\
\hline
$y$ (pence) & 35 & 55 & 40 & 100 & 60 & 90 & 110 & 130 \\
\hline
\end{tabular}
\end{center}
(You may use: $\sum x = 315$, $\sum x^{2} = 15225$, $\sum y = 620$, $\sum y^{2} = 56550$, $\sum xy = 28750$)
\begin{enumerate}[label=(\alph*)]
\item On the graph paper on page 9 draw a scatter diagram to represent these data.
\item Show that $S_{xy} = 4337.5$ and find $S_{xx}$.
The student believes that a linear relationship of the form $y = a + b x$ could be used to describe these data.
\item Use linear regression to find the value of $a$ and the value of $b$, giving your answers to 1 decimal place.
\item Draw the regression line on your scatter diagram.
The student believes that one brand of chocolate is overpriced.
\item Use the scatter diagram to
\begin{enumerate}[label=(\roman*)]
\item state which brand is overpriced,
\item suggest a fair price for this brand.
\end{enumerate}
Give reasons for both your answers.
\end{enumerate}
\begin{center}
\includegraphics[max width=\textwidth, alt={}]{045e10d2-1766-4399-aa0a-5619dd0cce0f-06_2454_1485_282_228}
\end{center}
The data on page 8 has been repeated here to help you.
\begin{center}
\begin{tabular}{|l|l|l|l|l|l|l|l|l|}
\hline
Chocolate brand & A & B & C & D & E & F & G & H \\
\hline
$x$ (\% cocoa) & 10 & 20 & 30 & 35 & 40 & 50 & 60 & 70 \\
\hline
$y$ (pence) & 35 & 55 & 40 & 100 & 60 & 90 & 110 & 130 \\
\hline
\end{tabular}
\end{center}
(You may use: $\sum x = 315$, $\sum x^{2} = 15225$, $\sum y = 620$, $\sum y^{2} = 56550$, $\sum xy = 28750$)
\hfill \mbox{\textit{Edexcel S1 2007 Q3 [15]}}