Edexcel S1 2007 June — Question 3 15 marks

Exam Board: Edexcel
Module: S1 (Statistics 1)
Year: 2007
Session: June
Marks: 15
Topic: Linear regression
Type: Calculate y on x from raw data table
Difficulty: Moderate (-0.3). This is a standard S1 linear regression question with all summations provided. Students must apply memorized formulas for Sxy, Sxx, and the regression coefficients, then interpret residuals. The calculations are straightforward with no conceptual challenges, making it slightly easier than average but still requiring proper technique and interpretation.
Spec: 2.02c Scatter diagrams and regression lines; 5.08a Pearson correlation: calculate pmcc; 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context


## Question 3:

**Part (a):**
Use overlay | B2 (2 marks) | B2 for all points plotted within 1 small square of the correct position; subtract 1 mark for each error, to a minimum of 0.

**Part (b):**
$S_{xy} = 28750 - \frac{315 \times 620}{8} = 4337.5$ **answer given** so award for method | M1 |

$S_{xx} = 15225 - \frac{315^2}{8} = 2821.875$ | M1A1 (3 marks) | Anything rounding to 2820 for A1.
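
The arithmetic in Part (b) can be checked with a short script. This is a minimal sketch that uses only the summary statistics quoted in the question; the variable names are illustrative and not part of the mark scheme.

```python
# Summary statistics quoted in the question (n = 8 chocolate brands).
n = 8
sum_x, sum_x2 = 315, 15225
sum_y, sum_xy = 620, 28750

# S_xy = sum(xy) - sum(x) * sum(y) / n
s_xy = sum_xy - sum_x * sum_y / n

# S_xx = sum(x^2) - (sum(x))^2 / n
s_xx = sum_x2 - sum_x ** 2 / n

print(s_xy)  # 4337.5
print(s_xx)  # 2821.875
```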

**Part (c):**
$b = \frac{S_{xy}}{S_{xx}} = \frac{4337.5}{2821.875} = 1.537... \approx 1.5$ | M1, A1 |

$a = \bar{y} - b\bar{x} = \frac{620}{8} - b\frac{315}{8} = 16.97... \approx 17.0$ | M1, A1 (4 marks) | Anything rounding to 1.5 and 17.0 (accept 17).
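
As a check on Part (c), the following sketch computes the gradient and intercept from the values of $S_{xy}$ and $S_{xx}$ found in Part (b). The variable names are illustrative assumptions; the formulas are the ones stated in the mark scheme.

```python
# Values from the question and from Part (b).
n = 8
sum_x, sum_y = 315, 620
s_xy, s_xx = 4337.5, 2821.875

# Gradient: b = S_xy / S_xx
b = s_xy / s_xx

# Intercept: a = y_bar - b * x_bar
a = sum_y / n - b * sum_x / n

print(round(b, 1))  # 1.5
print(round(a, 1))  # 17.0
```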

**Part (d):**
Use overlay | B1, B1 (2 marks) | Follow through for intercept for first B1. Correct slope of straight line for second B1.

**Part (e):**
Brand D, since a long way above/from the line (dependent on 'Brand D' above) | B1, B1 |

Using line: $y = 17 + 35 \times 1.5 = 69.5$ | M1A1 (4 marks) | Anything rounding to 69p to 71p for final A1. Reading from the graph is acceptable for M1A1. If the value read from the graph at $x = 35$ is given as the answer but is out of range, award M1A0.
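
The question asks for Part (e) to be answered from the scatter diagram, but the same conclusion can be reached numerically. The sketch below is an illustration rather than the expected exam method: it uses the rounded line $y = 17 + 1.5x$ and the raw data from the question to compute each brand's residual (actual price minus predicted price). The names `residuals`, `overpriced` and `fair_price` are hypothetical.

```python
# Raw data from the question table.
brands = "ABCDEFGH"
x = [10, 20, 30, 35, 40, 50, 60, 70]        # % cocoa solids
y = [35, 55, 40, 100, 60, 90, 110, 130]     # price in pence per 100 g

# Rounded regression coefficients from Part (c).
a, b = 17.0, 1.5

# Residual = actual price - price predicted by the regression line.
residuals = {brand: yi - (a + b * xi) for brand, xi, yi in zip(brands, x, y)}

# The brand lying furthest above the line is the overpriced candidate.
overpriced = max(residuals, key=residuals.get)

# A fair price is the value on the line at that brand's cocoa percentage.
fair_price = a + b * x[brands.index(overpriced)]

print(overpriced, residuals[overpriced], fair_price)  # D 30.5 69.5
```

Brand D sits 30.5p above the line, far more than any other brand, which matches the reasoning "a long way above the line" in the mark scheme.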

---
3. A student is investigating the relationship between the price ($y$ pence) of 100 g of chocolate and the percentage ($x\,\%$) of cocoa solids in the chocolate.\\
The following data are obtained.

\begin{center}
\begin{tabular}{|l|l|l|l|l|l|l|l|l|}
\hline
Chocolate brand & A & B & C & D & E & F & G & H \\
\hline
$x$ (\% cocoa) & 10 & 20 & 30 & 35 & 40 & 50 & 60 & 70 \\
\hline
$y$ (pence) & 35 & 55 & 40 & 100 & 60 & 90 & 110 & 130 \\
\hline
\end{tabular}
\end{center}

(You may use: $\sum x = 315$, $\sum x^{2} = 15225$, $\sum y = 620$, $\sum y^{2} = 56550$, $\sum xy = 28750$)
\begin{enumerate}[label=(\alph*)]
\item On the graph paper on page 9 draw a scatter diagram to represent these data.
\item Show that $S_{xy} = 4337.5$ and find $S_{xx}$.

The student believes that a linear relationship of the form $y = a + b x$ could be used to describe these data.
\item Use linear regression to find the value of $a$ and the value of $b$, giving your answers to 1 decimal place.
\item Draw the regression line on your scatter diagram.

The student believes that one brand of chocolate is overpriced.
\item Use the scatter diagram to
\begin{enumerate}[label=(\roman*)]
\item state which brand is overpriced,
\item suggest a fair price for this brand.

Give reasons for both your answers.

\begin{center}
\includegraphics[max width=\textwidth, alt={}]{045e10d2-1766-4399-aa0a-5619dd0cce0f-06_2454_1485_282_228}
\end{center}

The data on page 8 have been repeated here to help you.

\begin{center}
\begin{tabular}{|l|l|l|l|l|l|l|l|l|}
\hline
Chocolate brand & A & B & C & D & E & F & G & H \\
\hline
$x$ (\% cocoa) & 10 & 20 & 30 & 35 & 40 & 50 & 60 & 70 \\
\hline
$y$ (pence) & 35 & 55 & 40 & 100 & 60 & 90 & 110 & 130 \\
\hline
\end{tabular}
\end{center}

(You may use: $\sum x = 315$, $\sum x^{2} = 15225$, $\sum y = 620$, $\sum y^{2} = 56550$, $\sum xy = 28750$)
\end{enumerate}\end{enumerate}

\hfill \mbox{\textit{Edexcel S1 2007 Q3 [15]}}