Standard CI with summary statistics

Questions that give the sample sizes together with either the sample means and variances directly or summary statistics (Σx, Σx²) from which they must be calculated, and that are answered using the standard normal approximation or a t-distribution for the difference of two means.

19 questions
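
The calculation these questions share can be sketched in Python. This is a minimal illustration rather than a mark scheme: it assumes both samples are large enough for the normal approximation with unpooled variance estimates, and the helper names `unbiased_stats` and `diff_means_z_interval` are hypothetical. The figures in the usage lines are those given in the 2020 June Q2 statement.

```python
from math import sqrt
from statistics import NormalDist


def unbiased_stats(n, sum_x, sum_x2):
    """Return (mean, unbiased variance estimate) from n, sum of x and sum of x^2."""
    mean = sum_x / n
    var = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return mean, var


def diff_means_z_interval(n_x, mean_x, var_x, n_y, mean_y, var_y, level):
    """Large-sample CI for mu_X - mu_Y using unpooled variance estimates."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    se = sqrt(var_x / n_x + var_y / n_y)       # standard error of the difference
    diff = mean_x - mean_y
    return diff - z * se, diff + z * se


# Figures from 2020 June Q2: samples of 40 and 50, 90% level.
low, high = diff_means_z_interval(40, 24.4, 10.2, 50, 17.2, 11.1, 0.90)
print(round(low, 2), round(high, 2))  # roughly 6.06 8.34
```

When a question gives Σx and Σx² instead, `unbiased_stats` would supply the mean and variance arguments first. For small samples with an equal-variance assumption, a pooled estimate and a t critical value would replace the normal one.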

CAIE Further Paper 4 2020 June Q2
2 A random sample of 40 observations of a random variable \(X\) and a random sample of 50 observations of a random variable \(Y\) are taken. The resulting values for the sample means, \(\bar { x }\) and \(\bar { y }\), and the unbiased estimates, \(\mathrm { s } _ { \mathrm { x } } ^ { 2 }\) and \(\mathrm { s } _ { \mathrm { y } } ^ { 2 }\), for the population variances are as follows. $$\bar { x } = 24.4 \quad \bar { y } = 17.2 \quad s _ { x } ^ { 2 } = 10.2 \quad s _ { y } ^ { 2 } = 11.1$$ Find a \(90 \%\) confidence interval for the difference between the population means of \(X\) and \(Y\).
CAIE Further Paper 4 2021 June Q3
3 The heights, \(x \mathrm {~m}\), of a random sample of 50 adult males from country \(A\) were recorded. The heights, \(y \mathrm {~m}\), of a random sample of 40 adult males from country \(B\) were also recorded. The results are summarised as follows. $$\Sigma x = 89.0 \quad \Sigma x ^ { 2 } = 159.4 \quad \Sigma y = 67.2 \quad \Sigma y ^ { 2 } = 113.1$$ Find a 95\% confidence interval for the difference between the mean heights of adult males from country \(A\) and adult males from country \(B\).
4 \(X\) is a discrete random variable which takes the values \(0,2,4 , \ldots\). The probability generating function of \(X\) is given by $$G _ { X } ( t ) = \frac { 1 } { 3 - 2 t ^ { 2 } }$$
  1. Find \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\).
  2. Find \(\mathrm { P } ( X = 4 )\).
CAIE Further Paper 4 2024 June Q6
6 Seva is investigating the lengths of the tails of adult wallabies in two regions of Australia, \(X\) and \(Y\). He chooses a random sample of 50 adult wallabies from region \(X\) and records the lengths, \(x \mathrm {~cm}\), of their tails. He also chooses a random sample of 40 adult wallabies from region \(Y\) and records the lengths, \(y \mathrm {~cm}\), of their tails. His results are summarised as follows. $$\sum x = 1080 \quad \sum x ^ { 2 } = 23480 \quad \sum y = 940 \quad \sum y ^ { 2 } = 22220$$ It cannot be assumed that the population variances of the two distributions are the same.
  1. Find a \(90 \%\) confidence interval for the difference between the population mean lengths of the tails of adult wallabies in regions \(X\) and \(Y\).
    The population mean lengths of the tails of adult wallabies in regions \(X\) and \(Y\) are \(\mu _ { X } \mathrm {~cm}\) and \(\mu _ { Y } \mathrm {~cm}\) respectively.
  2. Test, at the \(10 \%\) significance level, the null hypothesis \(\mu _ { Y } - \mu _ { X } = 1.1\) against the alternative hypothesis \(\mu _ { Y } - \mu _ { X } > 1.1\). State your conclusion in the context of the question.
CAIE Further Paper 4 2020 November Q1
1 Kayla is investigating the lengths of the leaves of a certain type of tree found in two forests \(X\) and \(Y\). She chooses a random sample of 40 leaves of this type from forest \(X\) and records their lengths, \(x \mathrm {~cm}\). She also records the lengths, \(y \mathrm {~cm}\), for a random sample of 60 leaves of this type from forest \(Y\). Her results are summarised as follows. $$\sum x = 242.0 \quad \sum x ^ { 2 } = 1587.0 \quad \sum y = 373.2 \quad \sum y ^ { 2 } = 2532.6$$ Find a \(90 \%\) confidence interval for the difference between the population mean lengths of leaves in forests \(X\) and \(Y\).
CAIE Further Paper 4 2021 November Q1
1 The number, \(x\), of pine trees was counted in each of 40 randomly chosen regions of equal size in country \(A\). The number, \(y\), of pine trees was counted in each of 60 randomly chosen regions of the same equal size in country \(B\). The results are summarised as follows. $$\sum x = 752 \quad \sum x ^ { 2 } = 14320 \quad \sum y = 1548 \quad \sum y ^ { 2 } = 40200$$ Find a 95\% confidence interval for the difference between the mean number of pine trees in regions of this size in countries \(A\) and \(B\).
CAIE Further Paper 4 2022 November Q1
1 Jasmine is researching the heights of pine trees in forests in two regions \(A\) and \(B\). She chooses a random sample of 50 pine trees in region \(A\) and records their heights, \(x \mathrm {~m}\). She also chooses a random sample of 60 pine trees in region \(B\) and records their heights, \(y \mathrm {~m}\). Her results are summarised as follows. $$\sum x = 1625 \quad \sum x ^ { 2 } = 53200 \quad \sum y = 1854 \quad \sum y ^ { 2 } = 57900$$ Find a \(95 \%\) confidence interval for the difference between the population mean heights of pine trees in regions \(A\) and \(B\).
CAIE Further Paper 4 2023 November Q1
1 A factory produces small bottles of natural spring water. Two different machines, \(X\) and \(Y\), are used to fill empty bottles with the water. A quality control engineer checks the volumes of water in the bottles filled by each of the machines. He chooses a random sample of 60 bottles filled by machine \(X\) and a random sample of 75 bottles filled by machine \(Y\). The volumes of water, \(x\) and \(y\) respectively, in millilitres, are summarised as follows. $$\sum x = 6345 \quad \sum ( x - \bar { x } ) ^ { 2 } = 243.8 \quad \sum y = 7614 \quad \sum ( y - \bar { y } ) ^ { 2 } = 384.9$$ \(\bar { x }\) and \(\bar { y }\) are the sample means of the volume of water in the bottles filled by machines \(X\) and \(Y\) respectively. Find a \(95 \%\) confidence interval for the difference between the mean volume of water in bottles filled by machine \(X\) and the mean volume of water in bottles filled by machine \(Y\).
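
When the summary is given as a sum of squared deviations, as here, the unbiased variance estimate is Σ(x − x̄)²/(n − 1) with no further subtraction. A hedged sketch of the arithmetic, assuming the samples of 60 and 75 are large enough for the normal approximation with unpooled variances (the variable names are illustrative only):

```python
from math import sqrt
from statistics import NormalDist

# Summary figures quoted in the question.
n_x, sum_x, sxx = 60, 6345, 243.8   # sxx = sum of (x - x_bar)^2
n_y, sum_y, syy = 75, 7614, 384.9   # syy = sum of (y - y_bar)^2

mean_x, mean_y = sum_x / n_x, sum_y / n_y
var_x, var_y = sxx / (n_x - 1), syy / (n_y - 1)   # unbiased estimates

se = sqrt(var_x / n_x + var_y / n_y)   # standard error of x_bar - y_bar
z = NormalDist().inv_cdf(0.975)        # two-sided 95% critical value
diff = mean_x - mean_y
ci = (diff - z * se, diff + z * se)
print(round(ci[0], 2), round(ci[1], 2))  # roughly 3.5 4.96
```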
CAIE Further Paper 4 2024 November Q1
1 Ellie is investigating the heights of two types of beech tree, \(A\) and \(B\), in a certain region. She has chosen a random sample of 60 beech trees of type \(A\) in the region, recorded their heights, \(x \mathrm {~m}\), and calculated unbiased estimates for the population mean and population variance as 35.6 m and \(4.95 \mathrm {~m} ^ { 2 }\) respectively. Ellie also chooses a random sample of 50 beech trees of type \(B\) in the region and records their heights, \(y \mathrm {~m}\). Her results are summarised as follows. $$\sum y = 1654 \quad \sum y ^ { 2 } = 54850$$ Find a \(95 \%\) confidence interval for the difference between the population mean heights of type \(A\) and type \(B\) beech trees in the region.
OCR S3 2010 January Q3
3 It is given that \(X _ { 1 }\) and \(X _ { 2 }\) are independent random variables with \(X _ { 1 } \sim \mathrm {~N} \left( \mu _ { 1 } , 2.47 \right)\) and \(X _ { 2 } \sim \mathrm {~N} \left( \mu _ { 2 } , 4.23 \right)\). Random samples of \(n _ { 1 }\) observations of \(X _ { 1 }\) and \(n _ { 2 }\) observations of \(X _ { 2 }\) are taken. The sample means are denoted by \(\bar { X } _ { 1 }\) and \(\bar { X } _ { 2 }\).
  1. State the distribution of \(\bar { X } _ { 1 } - \bar { X } _ { 2 }\), giving its parameters.
    For two particular samples, \(n _ { 1 } = 5 , \Sigma x _ { 1 } = 48.25 , n _ { 2 } = 10\) and \(\Sigma x _ { 2 } = 72.30\).
  2. Test at the \(2 \%\) significance level whether \(\mu _ { 1 }\) differs from \(\mu _ { 2 }\).
    A student stated that because of the Central Limit Theorem the sample means will have normal distributions so it is unnecessary for \(X _ { 1 }\) and \(X _ { 2 }\) to have normal distributions.
  3. Comment on the student's statement.
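
Because the population variances are known in this question, the difference of sample means is exactly normal with variance 2.47/n₁ + 4.23/n₂, and the test statistic is a z-value. The following is a sketch of that arithmetic only, with illustrative variable names, not a model answer:

```python
from math import sqrt
from statistics import NormalDist

var_1, var_2 = 2.47, 4.23      # known population variances from the question
n_1, sum_1 = 5, 48.25
n_2, sum_2 = 10, 72.30

# Variance of X1_bar - X2_bar for independent samples.
var_diff = var_1 / n_1 + var_2 / n_2

# Test statistic under H0: mu_1 = mu_2.
z_stat = (sum_1 / n_1 - sum_2 / n_2) / sqrt(var_diff)

# Two-tailed critical value at the 2% level.
z_crit = NormalDist().inv_cdf(1 - 0.02 / 2)

reject_h0 = abs(z_stat) > z_crit
print(round(var_diff, 3), round(z_stat, 3), round(z_crit, 3))  # roughly 0.917 2.527 2.326
```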
CAIE FP2 2010 June Q8
8 An examination involved writing an essay. In order to compare the time taken to write the essay by students in two large colleges, a sample of 12 students from college \(A\) and a sample of 8 students from college \(B\) were randomly selected. The times, \(t _ { A }\) and \(t _ { B }\), taken for these students to write the essay were measured, correct to the nearest minute, and are summarised by $$n _ { A } = 12 , \quad \Sigma t _ { A } = 257 , \quad \Sigma t _ { A } ^ { 2 } = 5629 , \quad n _ { B } = 8 , \quad \Sigma t _ { B } = 206 , \quad \Sigma t _ { B } ^ { 2 } = 5359$$ Stating any required assumptions, calculate a \(95 \%\) confidence interval for the difference in the population means. State, giving a reason, whether your confidence interval supports the statement that the population means, for the two colleges, are equal.
CAIE FP2 2012 June Q10
10 Engineers are investigating the speed of the internet connection received by households in two towns \(P\) and \(Q\). The speeds, in suitable units, in \(P\) and \(Q\) are denoted by \(x\) and \(y\) respectively. For a random sample of 50 houses in town \(P\) and a random sample of 40 houses in town \(Q\) the results are summarised as follows. $$\Sigma x = 240 \quad \Sigma x ^ { 2 } = 1224 \quad \Sigma y = 168 \quad \Sigma y ^ { 2 } = 754$$ Calculate a \(95 \%\) confidence interval for \(\mu _ { P } - \mu _ { Q }\), where \(\mu _ { P }\) and \(\mu _ { Q }\) are the population mean speeds for \(P\) and \(Q\). Test, at the \(1 \%\) significance level, whether \(\mu _ { P }\) is greater than \(\mu _ { Q }\).
CAIE FP2 2013 June Q8
8 The number, \(x\), of a certain type of sea shell was counted at 60 randomly chosen sites, each one metre square, along the coastline in country \(A\). The number, \(y\), of the same type of shell was counted at 50 randomly chosen sites, each one metre square, along the coastline in country \(B\). The results are summarised as follows. $$\Sigma x = 1752 \quad \Sigma x ^ { 2 } = 55500 \quad \Sigma y = 1220 \quad \Sigma y ^ { 2 } = 33500$$ Find a 95\% confidence interval for the difference between the mean number of sea shells, per square metre, on the coastlines in country \(A\) and in country \(B\).
CAIE FP2 2017 June Q8
8 The number, \(x\), of beech trees was counted in each of 50 randomly chosen regions of equal size in beech forests in country \(A\). The number, \(y\), of beech trees was counted in each of 40 randomly chosen regions of the same equal size in beech forests in country \(B\). The results are summarised as follows. $$\Sigma x = 1416 \quad \Sigma x ^ { 2 } = 41100 \quad \Sigma y = 888 \quad \Sigma y ^ { 2 } = 20140$$ Find a 95\% confidence interval for the difference between the mean number of beech trees in regions of this size in country \(A\) and in country \(B\).
AQA S3 2014 June Q4
8 marks
4 A sample of 50 male Eastern Grey kangaroos had a mean weight of 42.6 kg and a standard deviation of 6.2 kg. A sample of 50 male Western Grey kangaroos had a mean weight of 39.7 kg and a standard deviation of 5.3 kg.
  1. Construct a 98\% confidence interval for the difference between the mean weight of male Eastern Grey kangaroos and that of male Western Grey kangaroos.
    [5 marks]
    1. What assumption about the selection of each of the two samples was it necessary to make in order that the confidence interval constructed in part (a) was valid?
      [1 mark]
    2. Why was it not necessary to assume anything about the distributions of the weights of male kangaroos in order that the confidence interval constructed in part (a) was valid?
      [2 marks]
Edexcel S4 2006 January Q6
6. A tree is cut down and sawn into pieces. Half of the pieces are stored outside and half of the pieces are stored inside. After a year, a random sample of pieces is taken from each location and the hardness is measured. The hardness \(x\) units are summarised in the following table.
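\begin{tabular}{|l|c|c|c|}
\hline
 & Number of pieces sampled & \(\Sigma x\) & \(\Sigma x ^ { 2 }\) \\
\hline
Stored outside & 20 & 2340 & 274050 \\
\hline
Stored inside & 37 & 4884 & 645282 \\
\hline
\end{tabular}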
  1. Show that unbiased estimates for the variance of the values of hardness for wood stored outside and for the wood stored inside are 14.2 and 16.5, to 1 decimal place, respectively.
    (2)
    The hardness of wood stored outside and the hardness of wood stored inside can be assumed to be normally distributed with equal variances.
  2. Calculate \(95 \%\) confidence limits for the difference in mean hardness between the wood that was stored outside and the wood that was stored inside.
    (8)
  3. Using your answer to part (b), comment on the means of the hardness of wood stored outside and inside. Give a reason for your answer.
    (2)
    (Total 12 marks)
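
With an equal-variance assumption the two unbiased estimates are combined into a pooled estimate before the interval is formed. The sketch below reproduces the 14.2 and 16.5 quoted in part 1 from sample sizes of 20 and 37 and the tabulated sums, then pools them; the function name `unbiased_variance` is hypothetical, and the critical value (from the t-distribution with 20 + 37 − 2 = 55 degrees of freedom) is left to be read from tables.

```python
from math import sqrt


def unbiased_variance(n, sum_x, sum_x2):
    """Unbiased variance estimate from n, sum of x and sum of x^2."""
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


n_out, n_in = 20, 37
var_out = unbiased_variance(n_out, 2340, 274050)
var_in = unbiased_variance(n_in, 4884, 645282)

# Pooled estimate of the common variance, weighted by degrees of freedom.
pooled = ((n_out - 1) * var_out + (n_in - 1) * var_in) / (n_out + n_in - 2)

# Standard error of the difference in sample means under equal variances.
se = sqrt(pooled * (1 / n_out + 1 / n_in))

print(round(var_out, 1), round(var_in, 1), round(pooled, 2))  # 14.2 16.5 15.71
```

The confidence limits would then be the difference in sample means plus or minus the t critical value times `se`.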
Edexcel S4 2005 June Q6
6. Brickland and Goodbrick are two manufacturers of bricks. The lengths of the bricks produced by each manufacturer can be assumed to be normally distributed. A random sample of 20 bricks is taken from Brickland and the length, \(x \mathrm {~mm}\), of each brick is recorded. The mean of this sample is 207.1 mm and the variance is \(3.2 \mathrm {~mm} ^ { 2 }\).
  1. Calculate the \(98 \%\) confidence interval for the mean length of brick from Brickland.
    A random sample of 10 bricks is selected from those manufactured by Goodbrick. The length of each brick, \(y \mathrm {~mm}\), is recorded. The results are summarised as follows. $$\sum y = 2046.2 \quad \sum y ^ { 2 } = 418785.4$$ The variances of the length of brick for each manufacturer are assumed to be the same.
  2. Find a \(90 \%\) confidence interval for the value by which the mean length of brick made by Brickland exceeds the mean length of brick made by Goodbrick.
Edexcel S4 2017 June Q5
5. Jamland and Goodjam are two suppliers of jars of jam. The weights of the jars of jam produced by each supplier can be assumed to be normally distributed with unknown, but equal, variances. A random sample of 20 jars of jam is taken from those supplied by Jamland. Based on this sample, the 95\% confidence interval for the mean weight of a jar of Jamland jam, in grams, is \([492, 507]\).
A random sample of 10 jars of jam is selected from those supplied by Goodjam. The weight of each jar of Goodjam jam, \(y\) grams, is recorded. The results are summarised as follows. $$\bar { y } = 480 \quad s _ { y } ^ { 2 } = 280$$ Find a 90\% confidence interval for the value by which the mean weight of a jar of jam supplied by Jamland exceeds the mean weight of a jar of jam supplied by Goodjam.
AQA S3 2006 June Q7
7 A shop sells cooked chickens in two sizes: medium and large.
The weights, \(X\) grams, of medium chickens may be assumed to be normally distributed with mean \(\mu _ { X }\) and standard deviation 45. The weights, \(Y\) grams, of large chickens may be assumed to be normally distributed with mean \(\mu _ { Y }\) and standard deviation 65. A random sample of 20 medium chickens had a mean weight, \(\bar { x }\) grams, of 936.
A random sample of 10 large chickens had the following weights in grams: $$\begin{array} { l l l l l l l l l l } 1165 & 1202 & 1077 & 1144 & 1195 & 1275 & 1136 & 1215 & 1233 & 1288 \end{array}$$
  1. Calculate the mean weight, \(\bar { y }\) grams, of this sample of large chickens.
  2. Hence investigate, at the \(1 \%\) level of significance, the claim that the mean weight of large chickens exceeds that of medium chickens by more than 200 grams.
    1. Deduce that, for your test in part (b), the critical value of \(( \bar { y } - \bar { x } )\) is 253.24, correct to two decimal places.
    2. Hence determine the power of your test in part (b), given that \(\mu _ { Y } - \mu _ { X } = 275\).
    3. Interpret, in the context of this question, the value that you obtained in part (c)(ii).
      (3 marks)
AQA S3 2007 June Q1
1 As part of an investigation into the starting salaries of graduates in a European country, the following information was collected.
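\begin{tabular}{|l|c|c|c|}
\hline
 & Sample size & Sample mean starting salary (€) & Sample standard deviation of starting salary (€) \\
\hline
Science graduates & 175 & 19268 & 7321 \\
\hline
Arts graduates & 225 & 17896 & 8205 \\
\hline
\end{tabular}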
  1. Stating a necessary assumption about the samples, construct a \(98 \%\) confidence interval for the difference between the mean starting salary of science graduates and that of arts graduates.
  2. What can be concluded from your confidence interval?