Estimate from grouped frequency data

OCR H240/02 Q8

8 A market gardener records the masses of a random sample of 100 of this year's crop of plums. The table shows his results.

Mass, \(m\) grams	\(m < 25\)	\(25 \leq m < 35\)	\(35 \leq m < 45\)	\(45 \leq m < 55\)	\(55 \leq m < 65\)	\(65 \leq m < 75\)	\(m \geq 75\)
Number of plums	0	3	29	36	30	2	0

Explain why the normal distribution might be a reasonable model for this distribution.
The market gardener models the distribution of masses by \(\mathrm{N}\left(47.5, 10^{2}\right)\).
Find the number of plums in the sample that this model would predict to have masses in the range:
1. \(35 \leq m < 45\)
2. \(m < 25\).
Use your answers to parts (b)(i) and (b)(ii) to comment on the suitability of this model.
The market gardener plans to use this model to predict the distribution of the masses of next year's crop of plums.
Comment on this plan.
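The predicted counts asked for in this question can be checked numerically. The following is a minimal sketch, assuming the stated model \(\mathrm{N}(47.5, 10^{2})\) and the sample size of 100; the helper name `expected_count` is illustrative only and not part of the question.

```python
from statistics import NormalDist


def expected_count(n, mean, sd, lower=None, upper=None):
    """Expected number of n observations in [lower, upper) under N(mean, sd^2).

    A bound of None means the interval is open-ended on that side.
    """
    dist = NormalDist(mean, sd)
    p_upper = 1.0 if upper is None else dist.cdf(upper)
    p_lower = 0.0 if lower is None else dist.cdf(lower)
    return n * (p_upper - p_lower)


# Predicted number of plums with 35 <= m < 45
mid_class = expected_count(100, 47.5, 10, lower=35, upper=45)
# Predicted number of plums with m < 25
low_tail = expected_count(100, 47.5, 10, upper=25)

print(round(mid_class, 1), round(low_tail, 1))
```

For these inputs the script should print roughly 29.6 and 1.2, which can be set against the observed frequencies of 29 and 0 in the table.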

OCR MEI Paper 2 2020 November Q8

8 Rosella is carrying out an investigation into the age at which adults retire from work in the city where she lives. She collects a sample of size 50, ensuring that it comprises 25 randomly selected retired men and 25 randomly selected retired women.

State the name of the sampling method she uses.
Fig. 8.1 shows the data she obtains in a frequency table and Fig. 8.2 shows these data displayed in a histogram.

Age in years at retirement	\(45 -\)	\(50 -\)	\(55 -\)	\(60 -\)	\(65 -\)	\(70 -\)	\(75 - 80\)
Frequency density	0.4	1.8	2.4	2.2	1.8	1.2	0.2

Fig. 8.1

\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{cea67565-8074-4703-8e1a-09b98e380baf-08_805_1006_1160_244} \captionsetup{labelformat=empty} \caption{Fig. 8.2}
\end{figure}
How many people in the sample are aged between 50 and 55?
Rosella obtains a list of the names of all 4960 people who have retired in the city during the previous month.
Describe how Rosella could collect a sample of size 200 from her list using
- systematic sampling such that every item on the list could be selected,
- simple random sampling.
Rosella collects two simple random samples, one of size 200 and one of size 500, from her list. The histograms in Fig. 8.3 show the data from the sample of size 200 on the left and the data from the sample of size 500 on the right.
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{cea67565-8074-4703-8e1a-09b98e380baf-09_659_1909_388_77} \captionsetup{labelformat=empty} \caption{Fig. 8.3}
\end{figure}
With reference to the histograms shown in Fig. 8.2 and Fig. 8.3, explain why it appears reasonable to model the age of retirement in this city using the Normal distribution.
Summary statistics for the sample of 500 are shown in Fig. 8.4.

Statistics
n	500
Mean	60.0515
\(\sigma\)	6.5717
s	6.5783
\(\Sigma x\)	30025.7601
\(\Sigma x^{2}\)	1824686.322
Min	36.0793
Q1	55.2573
Median	59.9202
Q3	64.4239
Max	81.742

Fig. 8.4

Use an appropriate Normal model based on the information in Fig. 8.4 to estimate the number of people aged over 65 who retired in the city in the previous month.
Identify a limitation in using this model to predict the number of people aged over 65 retiring in the following month.
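Two of the calculations in this question lend themselves to a quick numerical check: converting the frequency densities in Fig. 8.1 to frequencies, and scaling a Normal probability up to the list of 4960 people. The sketch below assumes every class is 5 years wide (as the class labels suggest) and uses the sample mean and `s` from Fig. 8.4 as the model parameters; using \(\sigma\) instead of `s` is equally defensible and changes the answer only slightly.

```python
from statistics import NormalDist

# Fig. 8.1: lower class bounds and frequency densities.
# Assumption: each class is 5 years wide, the last one being 75-80.
lower_bounds = [45, 50, 55, 60, 65, 70, 75]
densities = [0.4, 1.8, 2.4, 2.2, 1.8, 1.2, 0.2]
class_width = 5

# frequency = frequency density * class width
# (rounded because frequencies are whole numbers and floats are inexact)
frequencies = [round(d * class_width) for d in densities]
freq_50_55 = frequencies[lower_bounds.index(50)]

# Normal model from Fig. 8.4 (assumption: mean 60.0515, standard deviation s = 6.5783).
model = NormalDist(60.0515, 6.5783)
population = 4960
estimate_over_65 = population * (1 - model.cdf(65))

print(freq_50_55, sum(frequencies), round(estimate_over_65))
```

The frequencies sum to 50, which matches the stated sample size and is a useful sanity check on the class-width assumption.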

AQA S1 2011 January Q3

3 The volume, \(X\) litres, of orange juice in a 1-litre carton may be modelled by a normal distribution with unknown mean \(\mu\). The volumes, \(x\) litres, recorded to the nearest 0.01 litre, in a random sample of 100 cartons are shown in the table.

Volume (\(x\) litres)	Number of cartons (\(f\))
0.95-0.97	2
0.98-1.00	7
1.01-1.03	15
1.04-1.06	32
1.07-1.09	22
1.10-1.12	14
1.13-1.15	7
1.16-1.18	1
Total	100

For the group '\(0.98 - 1.00\)':
1. show that it has a mid-point of 0.99 litres;
2. state the minimum and the maximum values of \(x\) that could be included in this group.
Calculate, to three decimal places, estimates of the mean and the standard deviation of these 100 volumes.
1. Construct an approximate \(99 \%\) confidence interval for \(\mu\).
2. State why use of the Central Limit Theorem was not required when calculating this confidence interval.
3. Give a reason why the confidence interval is approximate rather than exact.
Give a reason in support of the claim that:
1. \(\mu > 1\);
2. \(\mathrm{P}(0.94 < X < 1.16)\) is approximately 1.
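The grouped-data estimates and the confidence interval in this question can be sketched as follows. This is a rough illustration rather than a model answer: it assumes class mid-points of 0.96, 0.99, ..., 1.17, uses the divisor \(n - 1\) for the standard deviation, and takes the 99% critical value from the standard normal distribution.

```python
from math import sqrt
from statistics import NormalDist

# Class mid-points (litres) and frequencies from the table.
midpoints = [0.96, 0.99, 1.02, 1.05, 1.08, 1.11, 1.14, 1.17]
freqs = [2, 7, 15, 32, 22, 14, 7, 1]

n = sum(freqs)
mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Sum of squared deviations about the mean, weighted by frequency.
sxx = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints))
s = sqrt(sxx / (n - 1))  # assumption: unbiased (n - 1) divisor

# Approximate 99% confidence interval for mu: mean +/- z * s / sqrt(n).
z = NormalDist().inv_cdf(0.995)
half_width = z * s / sqrt(n)
ci = (mean - half_width, mean + half_width)

print(round(mean, 3), round(s, 3), tuple(round(v, 3) for v in ci))
```

Because the whole interval lies above 1, the output gives one way to support the claim that \(\mu > 1\).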

OCR Stats 1 2018 September Q9

9 The finance department of a retail firm recorded the daily income each day for 300 days. The results are summarised in the histogram.
\includegraphics[max width=\textwidth, alt={}, center]{85de9a39-f8be-40ee-b0c8-e2e632be93d8-6_689_1575_488_246}

Find the number of days on which the daily income was between \(\pounds 4000\) and \(\pounds 6000\).
Calculate an estimate of the number of days on which the daily income was between \(\pounds 2700\) and \(\pounds 3600\).
Use the midpoints of the classes to show that an estimate of the mean daily income is \(\pounds 3275\).
An estimate of the standard deviation of the daily income is \(\pounds 1060\). The finance department uses the distribution \(\mathrm{N}\left(3275, 1060^{2}\right)\) to model the daily income, in pounds.
Calculate the number of days on which, according to this model, the daily income would be between \(\pounds 4000\) and \(\pounds 6000\).
It is given that approximately \(95 \%\) of values of the distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\) lie within the range \(\mu \pm 2 \sigma\). Without further calculation, use this fact to comment briefly on whether the proposed model is a good fit to the data illustrated in the histogram.
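The model-based count and the \(\mu \pm 2\sigma\) range in this question can be sketched numerically. The snippet assumes the stated model \(\mathrm{N}(3275, 1060^{2})\) and the 300 recorded days; the histogram itself is not reproduced here, so no comparison with the observed frequencies is attempted.

```python
from statistics import NormalDist

mu, sigma = 3275, 1060
days = 300

model = NormalDist(mu, sigma)

# Number of days the model predicts with income between 4000 and 6000 pounds.
predicted_days = days * (model.cdf(6000) - model.cdf(4000))

# Range that should contain roughly 95% of values under the model.
two_sigma_range = (mu - 2 * sigma, mu + 2 * sigma)

print(round(predicted_days, 1), two_sigma_range)
```

The range comes out as (1155, 5395), so under the model only about 5% of the 300 days, roughly 15, would be expected to fall outside it.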

OCR H240/02 2020 November Q11

11 As part of a research project, the masses, \(m\) grams, of a random sample of 1000 pebbles from a certain beach were recorded. The results are summarised in the table.

Mass \(( \mathrm { g } )\)	\(50 \leqslant m < 150\)	\(150 \leqslant m < 200\)	\(200 \leqslant m < 250\)	\(250 \leqslant m < 350\)
Frequency	162	318	355	165

Calculate estimates of the mean and standard deviation of these masses.
The masses, \(x\) grams, of a random sample of 1000 pebbles on a different beach were also found. It was proposed that the distribution of these masses should be modelled by the random variable \(X \sim \mathrm{N}(200, 3600)\).
Use the model to find \(\mathrm { P } ( 150 < X < 210 )\).
Use the model to determine \(x_{1}\) such that \(\mathrm{P}\left(160 < X < x_{1}\right) = 0.6\), giving your answer correct to five significant figures.
It was found that the smallest and largest masses of the pebbles in this second sample were 112 g and 288 g respectively.
Use these results to show that the model may not be appropriate.
Suggest a different value of a parameter of the model in the light of these results.
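The calculations in this question can be checked with a short script. This is a sketch under stated assumptions: class mid-points of 100, 175, 225 and 300 are used for the grouped estimates, the standard deviation uses the divisor \(n\) (with \(n = 1000\) the choice makes little difference), and the second parameter of \(\mathrm{N}(200, 3600)\) is read as the variance, giving a standard deviation of 60.

```python
from math import sqrt
from statistics import NormalDist

# (lower bound, upper bound, frequency) for each class in the table.
classes = [(50, 150, 162), (150, 200, 318), (200, 250, 355), (250, 350, 165)]

n = sum(f for _, _, f in classes)
sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)

mean = sum_fx / n
sd = sqrt(sum_fx2 / n - mean ** 2)  # assumption: divisor n

# Proposed model for the second beach: variance 3600, so standard deviation 60.
model = NormalDist(200, sqrt(3600))

p_150_210 = model.cdf(210) - model.cdf(150)

# P(160 < X < x1) = 0.6  =>  cdf(x1) = cdf(160) + 0.6
x1 = model.inv_cdf(model.cdf(160) + 0.6)

print(round(mean, 1), round(sd, 1), round(p_150_210, 4), x1)
```

The grouped mean works out as 201.225 g; the value of `x1` is left unrounded in the output so that it can be quoted to whatever accuracy is required.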
