Edexcel AS Paper 2 2019 June: Question 4 (8 marks)

Exam Board: Edexcel
Module: AS Paper 2
Year: 2019
Session: June
Marks: 8
Topic: Measures of Location and Spread
Type: Clean or interpret large data set structure
Difficulty: Moderate -0.8. This is a straightforward AS-level statistics question testing basic knowledge of the large data set (cleaning data for missing/trace values), linear interpolation for quartiles, and standard deviation from grouped data. All techniques are routine recall with no problem-solving insight required, though part (d) requires some understanding of data distribution. Easier than average because it is mostly procedural and the summations are given.
Spec: 2.01d Select/critique sampling: in context; 2.02b Histogram: area represents frequency; 2.02f Measures of average and spread; 2.02g Calculate mean and standard deviation; 2.02j Clean data: missing data, errors

Joshua is investigating the daily total rainfall in Hurn for May to October 2015.

Using the information from the large data set, Joshua wishes to calculate the mean of the daily total rainfall in Hurn for May to October 2015.

(a) Using your knowledge of the large data set, explain why Joshua needs to clean the data before calculating the mean.

Using the information from the large data set, he produces the grouped frequency table below.

| Daily total rainfall (\(r\) mm) | Frequency | Midpoint (\(x\) mm) |
|---|---|---|
| \(0 \leqslant r < 0.5\) | 121 | 0.25 |
| \(0.5 \leqslant r < 1.0\) | 10 | 0.75 |
| \(1.0 \leqslant r < 5.0\) | 24 | 3.0 |
| \(5.0 \leqslant r < 10.0\) | 12 | 7.5 |
| \(10.0 \leqslant r < 30.0\) | 17 | 20.0 |

$$\text{You may use } \sum \mathrm{f} x = 539.75 \text{ and } \sum \mathrm{f} x^{2} = 7704.1875$$

(b) Use linear interpolation to calculate an estimate for the upper quartile of the daily total rainfall.

(c) Calculate an estimate for the standard deviation of the daily total rainfall in Hurn for May to October 2015.

(d) (i) State the assumption involved with using class midpoints to calculate an estimate of a mean from a grouped frequency table.

(ii) Using your knowledge of the large data set, explain why this assumption does not hold in this case.

(iii) State, giving a reason, whether you would expect the actual mean daily total rainfall in Hurn for May to October 2015 to be larger than, smaller than or the same as an estimate based on the grouped frequency table.
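
The summations quoted in the question can be checked directly from the grouped frequency table. The following is a minimal sketch, not part of the original paper; the names `classes` and `grouped_sums` are illustrative, and each class is represented by its midpoint exactly as in the table.

```python
# Grouped frequency table from the question: (lower bound, upper bound, frequency).
classes = [
    (0.0, 0.5, 121),
    (0.5, 1.0, 10),
    (1.0, 5.0, 24),
    (5.0, 10.0, 12),
    (10.0, 30.0, 17),
]


def grouped_sums(table):
    """Return (n, sum of f*x, sum of f*x^2), where x is each class midpoint."""
    n = 0
    sum_fx = 0.0
    sum_fx2 = 0.0
    for lower, upper, freq in table:
        midpoint = (lower + upper) / 2
        n += freq
        sum_fx += freq * midpoint
        sum_fx2 += freq * midpoint ** 2
    return n, sum_fx, sum_fx2


n, sum_fx, sum_fx2 = grouped_sums(classes)
print(n, sum_fx, sum_fx2)  # 184 539.75 7704.1875
```

The total frequency of 184 matches the number of days from 1 May to 31 October.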

# Question 4:

## Part (a):
| Answer | Marks | Guidance |
|---|---|---|
| Tr(ace) (data needs to be converted to numbers before the calculation can be carried out) | B1 | Identifying tr(ace) data; ignore comments about n/a, missing data, anomalies etc. |
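
As an illustration of what cleaning might look like, the sketch below converts trace entries to a number before a mean is taken. It makes several assumptions that are not in the mark scheme: the readings in `sample` are made up rather than taken from the large data set, trace is assumed to be stored as the text `tr`, and the replacement value `trace_value` (0.0 here) is one possible convention among several.

```python
def clean_rainfall(raw_values, trace_value=0.0):
    """Convert raw rainfall entries to floats, replacing 'tr' with trace_value.

    trace_value is an assumed convention; the mark scheme only requires that
    trace entries are turned into numbers.
    """
    cleaned = []
    for entry in raw_values:
        text = str(entry).strip().lower()
        cleaned.append(trace_value if text == "tr" else float(text))
    return cleaned


# Made-up readings for illustration only, not values from the large data set.
sample = ["tr", "0", "1.4", "tr", "6.2"]
cleaned = clean_rainfall(sample)
sample_mean = sum(cleaned) / len(cleaned)
print(cleaned, sample_mean)
```

Without the conversion step, `float("tr")` raises a `ValueError`, which mirrors the point of part (a): the mean cannot be computed until the text entries are numeric.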

## Part (b):
| Answer | Marks | Guidance |
|---|---|---|
| $\left[1+\right]\frac{138-131}{24} \times 4$ | M1 | Correct fraction $\frac{7}{24} \times 4$; allow working down $[5] - \frac{155-138}{24} \times 4$; allow a correct equation leading to a correct fraction, e.g. $\frac{x-1}{5-1} = \frac{138-131}{155-131}$, for M1; with use of $(n+1)$ giving 138.75, allow $\frac{7.75}{24} \times 4$ |
| $= 2.1666\ldots$ awrt $\mathbf{2.17}$ | A1 | awrt 2.17 (condone $\frac{13}{6}$); awrt 2.29 from $(n+1)$ (condone $\frac{55}{24}$) |
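
The interpolation above can be reproduced with a short sketch. The function name `interpolate_quantile` and the tuple layout are illustrative choices, not part of the mark scheme; the class boundaries and frequencies are those of the question's table.

```python
def interpolate_quantile(table, position):
    """Estimate the value at cumulative position `position` by linear interpolation.

    `table` is a list of (lower, upper, frequency) tuples in ascending order.
    """
    cumulative = 0
    for lower, upper, freq in table:
        if cumulative + freq >= position:
            # Fraction of the way through this class, scaled by the class width.
            return lower + (position - cumulative) / freq * (upper - lower)
        cumulative += freq
    raise ValueError("position exceeds the total frequency")


classes = [
    (0.0, 0.5, 121),
    (0.5, 1.0, 10),
    (1.0, 5.0, 24),
    (5.0, 10.0, 12),
    (10.0, 30.0, 17),
]
n = sum(freq for _, _, freq in classes)

q3 = interpolate_quantile(classes, 3 * n / 4)            # position 138
q3_alt = interpolate_quantile(classes, 3 * (n + 1) / 4)  # position 138.75
print(round(q3, 2), round(q3_alt, 2))  # 2.17 2.29
```

Both positions fall in the class $1.0 \leqslant r < 5.0$, because the cumulative frequency is 131 before it and 155 after it.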

## Part (c):
| Answer | Marks | Guidance |
|---|---|---|
| $\sigma = \sqrt{\dfrac{7704.1875}{184} - \left(\dfrac{539.75}{184}\right)^2} = 5.7676\ldots$, so $\sigma =$ awrt $\mathbf{5.77}$ | M1 A1 | M1: Correct expression for standard deviation (allow mean = awrt 2.93). A1: awrt 5.77; correct answer only scores M1A1 (allow $s = 5.78$). SC: 5.76 with no working scores M1A0. |
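
A quick numerical check of this calculation, using only the summations given in the question, is sketched below. The variable names are illustrative; `sigma` divides by $n$ as in the mark scheme's expression, and `s` is the alternative with divisor $n-1$ that the guidance also allows.

```python
import math

n = 184              # total frequency from the table
sum_fx = 539.75      # given in the question
sum_fx2 = 7704.1875  # given in the question

mean = sum_fx / n
variance = sum_fx2 / n - mean ** 2           # mean of squares minus square of mean
sigma = math.sqrt(variance)                  # divisor n
s = math.sqrt(variance * n / (n - 1))        # divisor n - 1

print(round(mean, 2), round(sigma, 2), round(s, 2))  # 2.93 5.77 5.78
```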

## Part (d)(i):
| Answer | Marks | Guidance |
|---|---|---|
| Using class midpoints to estimate the mean assumes that the values are uniformly distributed **within the class(es)**. | B1 | Explaining that data are assumed to be spread evenly across each class (o.e.); e.g. the midpoint of each class is the mean of each class; or all the values in the class are located at the midpoint; condone normally distributed within each class |

## Parts (d)(ii) & (d)(iii):
| Answer | Marks | Guidance |
|---|---|---|
| This is not the case here as the majority of the data (in the first class) are 0. | B1 | Demonstrating an understanding of the LDS that the majority of data values (in the first class) are at 0 or close to 0 (trace) |
| The actual mean is likely to be smaller than the estimate (since the first group has more values at 0 and close to 0) | dB1 | (dependent upon 2nd B1) Correct inference based on knowledge of the LDS. SC: If B1 is scored in (i) for 'The data are spread evenly across each class', then in (ii) 'The data are not evenly distributed in the classes' scores B1, but in (iii) 'the actual mean is smaller' with no further justification scores B0 |
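
The direction of the bias can be illustrated numerically. The sketch below is hypothetical: the question does not say how many of the 121 days in the first class recorded exactly 0 mm, so the count of 90 is a made-up value chosen only to show the effect, and the remaining first-class days are assumed to still average the midpoint 0.25.

```python
N_DAYS = 184                 # total frequency from the table
SUM_FX = 539.75              # given in the question
FIRST_CLASS_FREQ = 121       # frequency of the class 0 <= r < 0.5
FIRST_CLASS_MIDPOINT = 0.25


def mean_with_dry_days(zero_days):
    """Mean estimate if `zero_days` of the first-class days are treated as exactly 0 mm.

    The other first-class days are still assumed to average the class midpoint.
    """
    if not 0 <= zero_days <= FIRST_CLASS_FREQ:
        raise ValueError("zero_days must lie between 0 and the first-class frequency")
    removed = zero_days * FIRST_CLASS_MIDPOINT
    return (SUM_FX - removed) / N_DAYS


grouped_estimate = mean_with_dry_days(0)
hypothetical_estimate = mean_with_dry_days(90)  # 90 is an illustrative, made-up count
print(round(grouped_estimate, 2), round(hypothetical_estimate, 2))  # 2.93 2.81
```

Any positive number of days at 0 lowers the result, which is the inference required for the dB1 mark.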

---
\begin{enumerate}
  \item Joshua is investigating the daily total rainfall in Hurn for May to October 2015.
\end{enumerate}

Using the information from the large data set, Joshua wishes to calculate the mean of the daily total rainfall in Hurn for May to October 2015.\\
(a) Using your knowledge of the large data set, explain why Joshua needs to clean the data before calculating the mean.

Using the information from the large data set, he produces the grouped frequency table below.

\begin{center}
\begin{tabular}{|l|l|l|}
\hline
Daily total rainfall ( $r \mathrm {~mm}$ ) & Frequency & Midpoint ( $\boldsymbol { x } \mathbf { m m }$ ) \\
\hline
$0 \leqslant r < 0.5$ & 121 & 0.25 \\
\hline
$0.5 \leqslant r < 1.0$ & 10 & 0.75 \\
\hline
$1.0 \leqslant r < 5.0$ & 24 & 3.0 \\
\hline
$5.0 \leqslant r < 10.0$ & 12 & 7.5 \\
\hline
$10.0 \leqslant r < 30.0$ & 17 & 20.0 \\
\hline
\end{tabular}
\end{center}

$$\text { You may use } \sum \mathrm { f } x = 539.75 \text { and } \sum \mathrm { f } x ^ { 2 } = 7704.1875$$

(b) Use linear interpolation to calculate an estimate for the upper quartile of the daily total rainfall.\\
(c) Calculate an estimate for the standard deviation of the daily total rainfall in Hurn for May to October 2015.\\
(d) (i) State the assumption involved with using class midpoints to calculate an estimate of a mean from a grouped frequency table.\\
(ii) Using your knowledge of the large data set, explain why this assumption does not hold in this case.\\
(iii) State, giving a reason, whether you would expect the actual mean daily total rainfall in Hurn for May to October 2015 to be larger than, smaller than or the same as an estimate based on the grouped frequency table.

\hfill \mbox{\textit{Edexcel AS Paper 2 2019 Q4 [8]}}