| Exam Board | AQA |
|---|---|
| Module | S1 (Statistics 1) |
| Year | 2010 |
| Session | January |
| Marks | 8 |
| Topic | Linear regression |
| Type | Calculate y on x from raw data table |
| Difficulty | Moderate (-0.3). This is a standard S1 regression calculation requiring computation of the means, \(S_{xx}\) and \(S_{xy}\), then \(b\) and \(a\), followed by substitution and interpretation. Although arithmetically tedious, it follows a completely routine algorithm with no conceptual challenges or novel problem-solving, so it is slightly easier than average. |
| Spec | 5.09b Least squares regression: concepts; 5.09c Calculate regression line; 5.09e Use regression: for estimation in context |
## Part (a)
| Answer | Marks | Guidance |
|---|---|---|
| $b$ (gradient) $= 7.05$ | B2 | AWRT (7.05134) |
| $b$ (gradient) $= 7.00$ to $7.10$ | (B1) | AWFW |
| $a$ (intercept) $= 2500$ to $2502$ | B2 | AWFW |
| $a$ (intercept) $= 2490$ to $2510$ | (B1) | AWFW. Treat rounding of correct stated answers as ISW |
| or | | |
| Attempt at $\sum x$, $\sum x^2$, $\sum y$ and $\sum xy$ ($\sum y^2$) | (M1) | 1351, 268047, 27034 and 5269065 (105653202); all 4 attempted |
| or | | |
| Attempt at $S_{xx}$ and $S_{xy}$ ($S_{yy}$) | | 7304 and 51503 (1247894); both attempted |
| or | | |
| Attempt at correct formula for $b$ (gradient) | (m1) | |
| $b$ (gradient) $= 7.05$ | (A1) | AWRT |
| $a$ (intercept) $= 2500$ to $2502$ | (A1) | AWFW |
| Total | 4 | |

Accept $a$ and $b$ interchanged only if identified correctly by a clearly shown equation, or in part (b); stated answers are not sufficient.

If $a$ and $b$ are not identified anywhere in the solution, then: 7.05 $\Rightarrow$ B1; 2500 to 2502 $\Rightarrow$ B1.
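
The figures quoted in the guidance can be checked with a short worked calculation. The sketch below assumes the usual S1 definitions $S_{xx} = \sum x^2 - (\sum x)^2 / n$ and $S_{xy} = \sum xy - (\sum x)(\sum y) / n$ with $n = 7$; the layout is illustrative and is not taken from the mark scheme itself.

$$
\begin{aligned}
S_{xx} &= 268047 - \frac{1351^2}{7} = 7304 \\
S_{xy} &= 5269065 - \frac{1351 \times 27034}{7} = 51503 \\
b &= \frac{S_{xy}}{S_{xx}} = \frac{51503}{7304} \approx 7.05134 \\
a &= \bar{y} - b\,\bar{x} = \frac{27034}{7} - 7.05134 \times \frac{1351}{7} = 3862 - 7.05134 \times 193 \approx 2501.09
\end{aligned}
$$

This gives a regression line of approximately $y = 2501 + 7.05x$, which agrees with the accepted ranges for $a$ and $b$.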
## Part (b)
| Answer | Marks | Guidance |
|---|---|---|
| $y_{200} = a + b \times 200$ | M1 | Used. May be implied by correct answer |
| $= 3890$ to $3930$ | A1 | |
| Total | 2 | |
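
As an illustration of the substitution, taking $b \approx 7.05134$ (the value quoted in the part (a) guidance) and $a \approx 2501.09$ (an assumed value, computed from $a = \bar{y} - b\bar{x}$ and lying inside the accepted range 2500 to 2502):

$$
y_{200} = 2501.09 + 7.05134 \times 200 \approx 2501.09 + 1410.27 \approx 3911 \text{ litres}
$$

This falls inside the accepted range of 3890 to 3930. Slightly different rounding of $a$ and $b$ would shift the final figure by a few litres.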
## Part (c)
| Answer | Marks | Guidance |
|---|---|---|
| Large residuals / residual range | B1 | |
| suggest estimate may be unreliable | B1dep | |
| or | | |
| Largest residuals only small in relation to $y$-values (10%) | B1 | |
| so estimate may be reliable (unreliable) | B1dep | (unreliable) requires (10% or equivalent) |
| Special Case: if B0 B0dep, then: involves interpolation / does not involve extrapolation / within observed range | (B1) | |
| Total | 2 | |
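
The 10% figure can be illustrated by comparing the extreme residuals with the size of the estimate. The check below assumes an estimate of about 3911 litres for part (b), which is one value inside the accepted range of 3890 to 3930; it is a sketch of the reasoning, not wording from the mark scheme.

$$
\frac{430}{3911} \approx 0.110, \qquad \frac{415}{3911} \approx 0.106
$$

On this basis the estimate could be out by roughly 10% in either direction. Separately, $x = 200$ lies inside the observed range $147 \le x \le 241$, so the estimate is an interpolation, which is the point credited by the special case.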
---
3 The table shows, for each of a random sample of 7 weeks, the number of customers, $x$, who purchased fuel from a filling station, together with the total volume, $y$ litres, of fuel purchased by these customers.
\begin{center}
\begin{tabular}{ | c | c | c | c | c | c | c | c | }
\hline
$\boldsymbol { x }$ & 230 & 184 & 165 & 147 & 241 & 174 & 210 \\
\hline
$\boldsymbol { y }$ & 4551 & 3410 & 3252 & 3756 & 3787 & 4024 & 4254 \\
\hline
\end{tabular}
\end{center}
\begin{enumerate}[label=(\alph*)]
\item Calculate the equation of the least squares regression line of $y$ on $x$.
\item Estimate the volume of fuel sold during a week in which 200 customers purchase fuel.
\item Comment on the likely reliability of your estimate in part (b), given that, for the regression line calculated in part (a), the values of the 7 residuals lie between approximately $-415$ litres and $+430$ litres.
\end{enumerate}
\hfill \mbox{\textit{AQA S1 2010 Q3 [8]}}