Question 4 - A-Level Maths

Edexcel S1 2007 January — Question 4 14 marks

Exam Board	Edexcel
Module	S1 (Statistics 1)
Year	2007
Session	January
Marks	14
Topic	Measures of Location and Spread
Type	Histogram from discrete rounded data
Difficulty	Moderate -0.8. This is a routine S1 statistics question testing standard procedures: describing distribution shape, linear interpolation for the median, calculating the mean and standard deviation from summary statistics, and applying a given skewness formula. All parts follow textbook methods with no problem-solving or novel insight required, making it easier than the average A-level question.
Spec	2.02a Interpret single variable data: tables and diagrams; 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation

Summarised below are the distances, to the nearest mile, travelled to work by a random sample of 120 commuters.

Distance (to the nearest mile)	Number of commuters
0-9	10
10-19	19
20-29	43
30-39	25
40-49	8
50-59	6
60-69	5
70-79	3
80-89	1

For this distribution,

(a) describe its shape,
(b) use linear interpolation to estimate its median.

The mid-point of each class was represented by $x$ and its corresponding frequency by $f$, giving
$$\Sigma f x = 3550 \text{ and } \Sigma f x^{2} = 138020$$

(c) Estimate the mean and the standard deviation of this distribution.

One coefficient of skewness is given by
$$\frac{3(\text{mean} - \text{median})}{\text{standard deviation}}.$$

(d) Evaluate this coefficient for this distribution.
(e) State whether or not the value of your coefficient is consistent with your description in part (a). Justify your answer.
(f) State, with a reason, whether you should use the mean or the median to represent the data in this distribution.
(g) State the circumstance under which it would not matter whether you used the mean or the median to represent a set of data.
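The summary statistics quoted in the question can be checked directly from the frequency table. The sketch below is only an illustration: it assumes each class mid-point is the average of the printed class limits (for example 4.5 for the 0-9 class), and the variable names are chosen here, not taken from the paper.

```python
# Class limits as printed in the table (distances to the nearest mile).
classes = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49),
           (50, 59), (60, 69), (70, 79), (80, 89)]
frequencies = [10, 19, 43, 25, 8, 6, 5, 3, 1]

# Assumed mid-point of each class: the average of its two printed limits.
midpoints = [(low + high) / 2 for low, high in classes]

n = sum(frequencies)
sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))

print(n, sum_fx, sum_fx2)
```

Under that mid-point assumption the totals agree with the sample size of 120 and the values of $\Sigma f x$ and $\Sigma f x^2$ given in the question.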


Answer	Marks	Guidance
(a) Positive skew (both bits)	B1	(1 mark)
(b) $19.5 + \frac{(60 - 29)}{43} \times 10 = 26.7093\ldots$	M1, A1	awrt 26.7 (2 marks)
(N.B. Use of 60.5 gives 26.825... so allow awrt 26.8)
(c) $\mu = \frac{3550}{120} = 29.5833\ldots$ or $29\frac{7}{12}$	B1	awrt 29.6
$\sigma^2 = \frac{138020}{120} - \mu^2$ or $\sigma = \sqrt{\frac{138020}{120} - \mu^2}$	M1
$\sigma = 16.5829\ldots$ (or $s = 16.652\ldots$)	A1	awrt 16.6 (or $s = 16.7$) (3 marks)
(d) $\frac{3(29.6 - 26.7)}{16.6} = 0.52\ldots$	M1, A1 f.t., A1	awrt 0.520 (or with $s$ awrt 0.518) (3 marks)
(N.B. 60.5 in (b) gives awrt 0.499 [or with $s$ awrt 0.497])
(e) $0.520 > 0$	B1 f.t.
So it is consistent with their (d) being >0 or <0	dB1 f.t.	ft their (d) (2 marks)
(f) Use Median	B1
Since the data is skewed or less affected by outliers/extreme values	dB1	(2 marks)
(g) If the data are symmetrical or skewness is zero or normal/uniform distribution ("mean = median" or "no outliers" or "evenly distributed" all score B0)	B1	(1 mark)

Total: 14 marks
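The arithmetic in parts (b), (c) and (d) of the mark scheme can be reproduced with a short script. This is a minimal sketch, not part of the official mark scheme: it assumes lower class boundaries of -0.5, 9.5, 19.5, ... with class width 10, takes the median position as n/2 = 60 as the mark scheme does, and uses the divide-by-n form of the standard deviation (the mark scheme's $\sigma$). The helper name `interpolated_median` is hypothetical.

```python
import math

frequencies = [10, 19, 43, 25, 8, 6, 5, 3, 1]
n = sum(frequencies)              # 120 commuters
sum_fx, sum_fx2 = 3550, 138020    # summary statistics given in the question

# Assumed lower class boundaries for data rounded to the nearest mile.
lower_boundaries = [-0.5 + 10 * i for i in range(len(frequencies))]
width = 10


def interpolated_median(boundaries, class_width, freqs, position):
    """Linear interpolation within the class that contains `position`."""
    cumulative = 0
    for boundary, f in zip(boundaries, freqs):
        if cumulative + f >= position:
            return boundary + (position - cumulative) / f * class_width
        cumulative += f
    raise ValueError("position exceeds total frequency")


# (b) median at position n/2 = 60, which falls in the 20-29 class
median = interpolated_median(lower_boundaries, width, frequencies, n / 2)

# (c) mean and standard deviation (divide-by-n form)
mean = sum_fx / n
sd = math.sqrt(sum_fx2 / n - mean ** 2)

# (d) coefficient of skewness as defined in the question
skew = 3 * (mean - median) / sd

print(round(median, 1), round(mean, 1), round(sd, 1), round(skew, 3))
```

Using unrounded intermediate values gives a coefficient that rounds to the mark scheme's awrt 0.520; replacing `n / 2` with 60.5 would reproduce the alternative median that the mark scheme also allows.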


---

LaTeX source

\begin{enumerate}
  \item Summarised below are the distances, to the nearest mile, travelled to work by a random sample of 120 commuters.
\end{enumerate}

\begin{center}
\begin{tabular}{|l|l|}
\hline
Distance (to the nearest mile) & Number of commuters \\
\hline
0-9 & 10 \\
\hline
10-19 & 19 \\
\hline
20-29 & 43 \\
\hline
30-39 & 25 \\
\hline
40-49 & 8 \\
\hline
50-59 & 6 \\
\hline
60-69 & 5 \\
\hline
70-79 & 3 \\
\hline
80-89 & 1 \\
\hline
\end{tabular}
\end{center}

For this distribution,\\
(a) describe its shape,\\
(b) use linear interpolation to estimate its median.

The mid-point of each class was represented by $x$ and its corresponding frequency by $f$ giving

$$\Sigma f x = 3550 \text { and } \Sigma f x ^ { 2 } = 138020$$

(c) Estimate the mean and the standard deviation of this distribution.

One coefficient of skewness is given by

$$\frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } } .$$

(d) Evaluate this coefficient for this distribution.\\
(e) State whether or not the value of your coefficient is consistent with your description in part (a). Justify your answer.\\
(f) State, with a reason, whether you should use the mean or the median to represent the data in this distribution.\\
(g) State the circumstance under which it would not matter whether you used the mean or the median to represent a set of data.

\hfill \mbox{\textit{Edexcel S1 2007 Q4 [14]}}
