Edexcel S4 (Statistics 4)

Question 3

The weights, in grams, of mice are normally distributed. A biologist takes a random sample of 10 mice. She weighs each mouse and records its weight.

The ten mice are then fed on a special diet. They are weighed again after two weeks.
Their weights in grams are as follows:

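\begin{tabular}{lcccccccccc}
Mouse & $A$ & $B$ & $C$ & $D$ & $E$ & $F$ & $G$ & $H$ & $I$ & $J$ \\
Weight before diet (g) & 50.0 & 48.3 & 47.5 & 54.0 & 38.9 & 42.7 & 50.1 & 46.8 & 40.3 & 41.2 \\
Weight after diet (g) & 52.1 & 47.6 & 50.1 & 52.3 & 42.2 & 44.3 & 51.8 & 48.0 & 41.9 & 43.6
\end{tabular}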

Stating your hypotheses clearly, and using a $1 \%$ level of significance, test whether or not the diet causes an increase in the mean weight of the mice.
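The arithmetic for this question can be checked with a short sketch. It assumes the intended method is a paired $t$-test on the differences $d$ = (weight after) $-$ (weight before), with $\mathrm{H}_0 : \mu_d = 0$ against $\mathrm{H}_1 : \mu_d > 0$. The critical value is copied from standard $t$ tables rather than computed, and the variable names are illustrative only.

```python
import math
from statistics import mean, stdev

before = [50.0, 48.3, 47.5, 54.0, 38.9, 42.7, 50.1, 46.8, 40.3, 41.2]
after = [52.1, 47.6, 50.1, 52.3, 42.2, 44.3, 51.8, 48.0, 41.9, 43.6]

# Assumed hypotheses: H0: mu_d = 0, H1: mu_d > 0, with d = after - before
d = [a - b for a, b in zip(after, before)]
n = len(d)
d_bar = mean(d)
s_d = stdev(d)  # sample standard deviation (divisor n - 1)
t_stat = d_bar / (s_d / math.sqrt(n))

# Table value (assumption: read from t tables): 9 df, one-tailed 1% point
T_CRIT = 2.821
reject_h0 = t_stat > T_CRIT

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Under these assumptions the statistic is about 2.94, which exceeds 2.821, so the sketch points towards evidence of an increase in mean weight.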

Question 5


A machine is filling bottles of milk. A random sample of 16 bottles was taken and the volume of milk in each bottle was measured and recorded. The volume of milk in a bottle is normally distributed and the unbiased estimate of the variance, $s ^ { 2 }$, of the volume of milk in a bottle is 0.003.

Find a 95\% confidence interval for the variance of the population of volumes of milk from which the sample was taken.

The machine should fill bottles so that the standard deviation of the volumes is equal to 0.07.

Comment on this with reference to your 95\% confidence interval.
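A minimal sketch of the interval calculation, assuming the usual result that $(n-1)s^2/\sigma^2$ follows a $\chi^2$ distribution with $n-1 = 15$ degrees of freedom. The two percentage points are taken from standard $\chi^2$ tables, not computed, and the names are illustrative.

```python
n = 16
s2 = 0.003  # unbiased estimate of the variance, from the question

# Table values (assumption: read from chi-squared tables), 15 df
CHI2_LOWER = 6.262   # lower 2.5% point
CHI2_UPPER = 27.488  # upper 2.5% point

# 95% confidence interval for the population variance
ci_low = (n - 1) * s2 / CHI2_UPPER
ci_high = (n - 1) * s2 / CHI2_LOWER

# The required standard deviation 0.07 corresponds to a variance of 0.0049
target_var = 0.07 ** 2
target_inside = ci_low < target_var < ci_high

print(f"CI = ({ci_low:.5f}, {ci_high:.5f}), 0.0049 inside: {target_inside}")
```

With these table values the interval is roughly (0.00164, 0.00719), which contains 0.0049.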

Question 7


An engineering firm buys steel rods. The steel rods from its present supplier are known to have a mean tensile strength of $230 \mathrm {~N} / \mathrm { mm } ^ { 2 }$.

A new supplier of steel rods offers to supply rods at a cheaper price than the present supplier. A random sample of ten rods from this new supplier gave tensile strengths, $x \mathrm { N } / \mathrm { mm } ^ { 2 }$, which are summarised below.

A company manufactures bolts with a mean diameter of 5 mm. The company wishes to check that the diameter of the bolts has not decreased. A random sample of 10 bolts is taken and the diameters, $x \mathrm {~mm}$, of the bolts are measured. The results are summarised below.

$$\sum x = 49.1 \quad \sum x ^ { 2 } = 241.2$$

Using a $1 \%$ level of significance, test whether or not the mean diameter of the bolts is less than 5 mm.
(You may assume that the diameter of the bolts follows a normal distribution.)
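The test statistic for the bolts can be sketched as follows, assuming a one-sample $t$-test of $\mathrm{H}_0 : \mu = 5$ against $\mathrm{H}_1 : \mu < 5$. The critical value is a table value and the names are illustrative.

```python
import math

n = 10
sum_x = 49.1
sum_x2 = 241.2
mu_0 = 5.0

x_bar = sum_x / n
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)  # unbiased variance estimate
t_stat = (x_bar - mu_0) / math.sqrt(s2 / n)

# Table value (assumption: read from t tables): 9 df, lower 1% point
T_CRIT = -2.821
reject_h0 = t_stat < T_CRIT

print(f"x_bar = {x_bar:.2f}, s^2 = {s2:.5f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Here the statistic is about $-2.48$, which does not fall below $-2.821$, so on this sketch there is insufficient evidence at the 1% level that the mean diameter is less than 5 mm.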
2. An emission-control device is tested to see if it reduces $\mathrm { CO } _ { 2 }$ emissions from cars. The emissions from 6 randomly selected cars are measured with and without the device. The results are as follows.

A teacher wishes to test whether playing background music enables students to complete a task more quickly. The same task was completed by 15 students, divided at random into two groups. The first group had background music playing during the task and the second group had no background music playing.
The times taken, in minutes, to complete the task are summarised below.

(d) Find the value of $s$.

The graph of the power function for the manager's test is shown in Figure 1.

\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{a1841cf5-93f3-4043-b6ed-651168b13b87-34_1157_1436_847_260}
\captionsetup{labelformat=empty}
\caption{Figure 1}
\end{figure}

(e) On the same axes, draw the graph of the power function for the deputy's test.
(f) (i) State the value of $p$ where these graphs intersect.
(ii) Compare the effectiveness of the two tests if $p$ is greater than this value.

The deputy suggests that they should use his sampling method rather than the manager's.
(g) Give a reason why the manager might not agree to this change.

A random sample of 15 strawberries is taken from a large field and the weight $x$ grams of each strawberry is recorded. The results are summarised below.

$$\sum x = 291 \quad \sum x ^ { 2 } = 5968$$

Assume that the weights of strawberries are normally distributed.

(a) Calculate a 95\% confidence interval for
(i) the mean of the weights of the strawberries in the field,
(ii) the variance of the weights of the strawberries in the field.

Strawberries weighing more than 23 g are considered to be less tasty.
(b) Use appropriate confidence limits from part (a) to find the highest estimate of the proportion of strawberries that are considered to be less tasty.
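A sketch of the calculations for both parts, under stated assumptions. The mean interval uses the $t$ distribution with 14 degrees of freedom and the variance interval uses $\chi^2$ with 14 degrees of freedom, with percentage points copied from tables. For part (b) the sketch assumes that the "highest estimate" comes from combining the upper limit for the mean with the upper limit for the variance. The helper `normal_cdf` is illustrative.

```python
import math

n = 15
sum_x, sum_x2 = 291, 5968

x_bar = sum_x / n
sxx = sum_x2 - sum_x ** 2 / n  # equals (n - 1) * s^2
s2 = sxx / (n - 1)

# Table values (assumption: read from tables), 14 df
T_CRIT = 2.145       # t, two-tailed 5%
CHI2_UPPER = 26.119  # chi-squared, upper 2.5% point
CHI2_LOWER = 5.629   # chi-squared, lower 2.5% point

half_width = T_CRIT * math.sqrt(s2 / n)
mean_ci = (x_bar - half_width, x_bar + half_width)
var_ci = (sxx / CHI2_UPPER, sxx / CHI2_LOWER)


def normal_cdf(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Part (b), assumed approach: largest mean and largest variance from part (a)
z = (23 - mean_ci[1]) / math.sqrt(var_ci[1])
p_less_tasty = 1 - normal_cdf(z)

print(mean_ci, var_ci, round(p_less_tasty, 3))
```

With these inputs the intervals are roughly (16.74, 22.06) for the mean and (12.35, 57.31) for the variance, giving a highest estimated proportion of about 0.45.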

A car manufacturer claims that, on a motorway, the mean number of miles per gallon for the Panther car is more than 70. To test this claim a car magazine measures the number of miles per gallon, $x$, of each of a random sample of 20 Panther cars and obtains the following statistics.

$$\bar { x } = 71.2 \quad s = 3.4$$

The number of miles per gallon may be assumed to be normally distributed.
(a) Stating your hypotheses clearly and using a $5 \%$ level of significance, test the manufacturer's claim.

The standard deviation of the number of miles per gallon for the Tiger car is 4.
(b) Stating your hypotheses clearly, test, at the $5 \%$ level of significance, whether or not there is evidence that the variance of the number of miles per gallon for the Panther car is different from that of the Tiger car.
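Both test statistics can be checked with a short sketch. It assumes part (a) is a one-tailed $t$-test of $\mathrm{H}_0 : \mu = 70$ against $\mathrm{H}_1 : \mu > 70$, and that part (b) treats the Tiger standard deviation of 4 as a known population value, so that $(n-1)s^2/\sigma^2$ is compared with $\chi^2_{19}$. Critical values are table values.

```python
import math

n = 20
x_bar = 71.2
s = 3.4

# Part (a): assumed H0: mu = 70, H1: mu > 70
t_stat = (x_bar - 70) / (s / math.sqrt(n))
T_CRIT = 1.729  # table value: t, 19 df, one-tailed 5%
reject_a = t_stat > T_CRIT

# Part (b): assumed H0: sigma^2 = 16, H1: sigma^2 != 16
sigma_tiger = 4
chi2_stat = (n - 1) * s ** 2 / sigma_tiger ** 2
CHI2_LOWER, CHI2_UPPER = 8.907, 32.852  # table values: 19 df, 2.5% in each tail
reject_b = not (CHI2_LOWER < chi2_stat < CHI2_UPPER)

print(f"t = {t_stat:.3f}, reject: {reject_a}; chi2 = {chi2_stat:.3f}, reject: {reject_b}")
```

On these assumptions neither statistic is in its critical region: $t \approx 1.58 < 1.729$ and $8.907 < 13.73 < 32.852$.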

Faults occur in a roll of material at a rate of $\lambda$ per $\mathrm { m } ^ { 2 }$. To estimate $\lambda$, three pieces of material of sizes $3 \mathrm {~m} ^ { 2 } , 7 \mathrm {~m} ^ { 2 }$ and $10 \mathrm {~m} ^ { 2 }$ are selected and the number of faults $X _ { 1 } , X _ { 2 }$ and $X _ { 3 }$ respectively are recorded.

The estimator $\hat { \lambda }$, where

$$\hat { \lambda } = k \left( X _ { 1 } + X _ { 2 } + X _ { 3 } \right)$$

is an unbiased estimator of $\lambda$.
(a) Write down the distributions of $X _ { 1 } , X _ { 2 }$ and $X _ { 3 }$ and find the value of $k$.
(b) Find $\operatorname { Var } ( \hat { \lambda } )$.

A random sample of $n$ pieces of this material, each of size $4 \mathrm {~m} ^ { 2 }$, was taken. The number of faults on each piece, $Y$, was recorded.
(c) Show that $\frac { 1 } { 4 } \bar { Y }$ is an unbiased estimator of $\lambda$.
(d) Find $\operatorname { Var } \left( \frac { 1 } { 4 } \bar { Y } \right)$.
(e) Find the minimum value of $n$ for which $\frac { 1 } { 4 } \bar { Y }$ becomes a better estimator of $\lambda$ than $\hat { \lambda }$.
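The algebra in parts (a) to (e) can be sanity-checked with exact fractions. The sketch assumes the faults follow Poisson distributions, $X_i \sim \mathrm{Po}(a_i \lambda)$ for a piece of area $a_i$ and $Y \sim \mathrm{Po}(4\lambda)$, and that "better" means strictly smaller variance. Variances are stored as coefficients of $\lambda$. The names are illustrative.

```python
from fractions import Fraction

areas = [3, 7, 10]  # piece sizes in square metres, from the question

# E(X1 + X2 + X3) = (3 + 7 + 10) * lambda, so unbiasedness needs k = 1/20
total_area = sum(areas)
k = Fraction(1, total_area)

# Var(lambda_hat) = k^2 * (20 * lambda); keep only the coefficient of lambda
var_hat_coeff = k ** 2 * total_area


def var_ybar_coeff(n):
    """Coefficient of lambda in Var(Ybar / 4) = (1/16) * (4 * lambda / n)."""
    return Fraction(1, 16) * Fraction(4, n)


# Smallest n for which Ybar/4 has strictly smaller variance than lambda_hat
n_min = next(n for n in range(1, 100) if var_ybar_coeff(n) < var_hat_coeff)

print(k, var_hat_coeff, n_min)
```

Under these assumptions $k = \frac{1}{20}$, $\operatorname{Var}(\hat\lambda) = \frac{\lambda}{20}$, $\operatorname{Var}(\frac14 \bar Y) = \frac{\lambda}{4n}$, and the two variances are equal at $n = 5$, so the first strictly better sample size is $n = 6$.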

Find the value of the constant $a$ such that

$$\mathrm { P } \left( a < F _ { 8,10 } < 3.07 \right) = 0.94$$

2. Two independent random samples $X _ { 1 } , X _ { 2 } , \ldots , X _ { 7 }$ and $Y _ { 1 } , Y _ { 2 } , Y _ { 3 } , Y _ { 4 }$ were taken from different normal populations with a common standard deviation $\sigma$. The following sample statistics were calculated.

$$s _ { x } = 14.67 \quad s _ { y } = 12.07$$

Find the $99 \%$ confidence interval for $\sigma ^ { 2 }$ based on these two samples.
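A sketch of the pooled-variance interval for the two-sample question, assuming $s_x$ and $s_y$ are sample standard deviations with divisors $n-1$, so that the pooled sum of squares divided by $\sigma^2$ follows $\chi^2$ with $7 + 4 - 2 = 9$ degrees of freedom. The percentage points are table values.

```python
n_x, n_y = 7, 4
s_x, s_y = 14.67, 12.07

df = n_x + n_y - 2
pooled_ss = (n_x - 1) * s_x ** 2 + (n_y - 1) * s_y ** 2
s_p2 = pooled_ss / df  # pooled estimate of sigma^2

# Table values (assumption: read from chi-squared tables), 9 df
CHI2_UPPER = 23.589  # upper 0.5% point
CHI2_LOWER = 1.735   # lower 0.5% point

ci = (pooled_ss / CHI2_UPPER, pooled_ss / CHI2_LOWER)

print(f"s_p^2 = {s_p2:.2f}, 99% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

With these values the pooled estimate is about 192.0 and the interval is roughly (73.3, 996.1).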
3. Manuel is planning to buy a new machine to squeeze oranges in his cafe and he has two models, at the same price, on trial. The manufacturers of machine $B$ claim that their machine produces more juice from an orange than machine $A$. To test this claim Manuel takes a random sample of 8 oranges, cuts them in half and puts one half in machine $A$ and the other half in machine $B$. The amount of juice, in ml, produced by each machine is given in the table below.

\section*{Table 1}

Figure 1 shows the graph of the power function of the test used by the consultant.

\includegraphics[max width=\textwidth, alt={}, center]{a1841cf5-93f3-4043-b6ed-651168b13b87-48_1722_1671_657_132}

\section*{Figure 1}

(e) On Figure 1 draw the graph of the power function of the manager's test.
(2)
(f) State, giving your reasons, which test you would recommend.
(2)

The weights of the contents of breakfast cereal boxes are normally distributed.

A manufacturer changes the style of the boxes but claims that the weight of the contents remains the same.
A random sample of 6 old style boxes had contents with the following weights (in grams).

$$\begin{array} { l l l l l l } 512 & 503 & 514 & 506 & 509 & 515 \end{array}$$

The weights, $y$ grams, of the contents of an independent random sample of 5 new style boxes gave

$$\bar { y } = 504.8 \text { and } s _ { y } = 3.420$$

(a) Use a two-tail test to show, at the $10 \%$ level of significance, that the variances of the weights of the contents of the old and new style boxes can be assumed to be equal. State your hypotheses clearly.
(b) Showing your working clearly, find a $90 \%$ confidence interval for $\mu _ { x } - \mu _ { y }$, where $\mu _ { x }$ and $\mu _ { y }$ are the mean weights of the contents of old and new style boxes respectively.
(c) With reference to your confidence interval comment on the manufacturer's claim.

6. A random sample $X _ { 1 } , X _ { 2 } , \ldots , X _ { n }$ is taken from a population where each of the $X _ { i }$ has a continuous uniform distribution over the interval $[ 0 , \beta ]$.
The random variable $Y = \max \left\{ X _ { 1 } , X _ { 2 } , \ldots , X _ { n } \right\}$.
The probability density function of $Y$ is given by

$$\mathrm { f } ( y ) = \left\{ \begin{array} { c c } \frac { n } { \beta ^ { n } } y ^ { n - 1 } & 0 \leqslant y \leqslant \beta \\ 0 & \text { otherwise } \end{array} \right.$$

(a) Show that $\mathrm { E } \left( Y ^ { m } \right) = \frac { n } { n + m } \beta ^ { m }$.
(b) Write down $\mathrm { E } ( Y )$.
(c) Using your answers to parts (a) and (b), or otherwise, show that

$$\operatorname { Var } ( Y ) = \frac { n } { ( n + 1 ) ^ { 2 } ( n + 2 ) } \beta ^ { 2 }$$

(d) State, giving your reasons, whether or not $Y$ is a consistent estimator of $\beta$.

The random variables $M = 2 \bar { X }$, where $\bar { X } = \frac { 1 } { n } \left( X _ { 1 } + X _ { 2 } + \ldots + X _ { n } \right)$, and $S = k Y$, where $k$ is a constant, are both unbiased estimators of $\beta$.
(e) Find the value of $k$ in terms of $n$.
(f) State, giving your reasons, which of $M$ and $S$ is the better estimator of $\beta$ in this case.

Five observations of $X$ are:

$$\begin{array} { l l l l l } 8.5 & 6.3 & 5.4 & 9.1 & 7.6 \end{array}$$

(g) Calculate the better estimate of $\beta$.

7. A machine produces components whose lengths are normally distributed with mean 102.3 mm and standard deviation 2.8 mm. After the machine had been serviced, a random sample of 20 components were tested to see if the mean and standard deviation had changed. The lengths, $x \mathrm {~mm}$, of each of these 20 components are summarised as

$$\sum x = 2072 \quad \sum x ^ { 2 } = 214856$$

(a) Stating your hypotheses clearly, test, at the $5 \%$ level of significance, whether or not there is evidence of a change in standard deviation.
(b) Stating your hypotheses clearly, test, at the $5 \%$ level of significance, whether or not the mean length of the components has changed from the original value of 102.3 mm using
(i) a normal distribution,
(ii) a $t$ distribution.
(c) Comment on the mean length of components produced after the service in the light of the tests from part (a) and part (b). Give a reason for your answer.
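The three test statistics for the serviced machine can be sketched as follows. The sketch assumes two-tailed tests throughout: a $\chi^2_{19}$ test of $\sigma = 2.8$ in part (a), a $z$-test using the known $\sigma = 2.8$ in part (b)(i), and a $t_{19}$ test using the sample estimate in part (b)(ii). Critical values are table values.

```python
import math

n = 20
sum_x, sum_x2 = 2072, 214856
mu_0, sigma_0 = 102.3, 2.8

x_bar = sum_x / n
sxx = sum_x2 - sum_x ** 2 / n  # equals (n - 1) * s^2
s2 = sxx / (n - 1)

# Part (a): assumed H0: sigma = 2.8, H1: sigma != 2.8
chi2_stat = sxx / sigma_0 ** 2
CHI2_LOWER, CHI2_UPPER = 8.907, 32.852  # table values: 19 df, 2.5% each tail
reject_sd = not (CHI2_LOWER < chi2_stat < CHI2_UPPER)

# Part (b)(i): normal test with sigma taken as 2.8
z_stat = (x_bar - mu_0) / (sigma_0 / math.sqrt(n))
reject_z = abs(z_stat) > 1.96  # table value: two-tailed 5%

# Part (b)(ii): t test with the sample variance
t_stat = (x_bar - mu_0) / math.sqrt(s2 / n)
reject_t = abs(t_stat) > 2.093  # table value: t, 19 df, two-tailed 5%

print(round(chi2_stat, 3), round(z_stat, 3), round(t_stat, 3))
```

On these assumptions the $\chi^2$ statistic (about 25.10) is not significant, the $z$ statistic (about 2.08) is significant, and the $t$ statistic (about 1.81) is not, which is the contrast part (c) asks about.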

A medical student is investigating whether there is a difference in a person's blood pressure when sitting down and after standing up. She takes a random sample of 12 people and measures their blood pressure, in mmHg , when sitting down and after standing up.

The results are shown below.

Question 8


A random sample $W _ { 1 } , W _ { 2 } , \ldots , W _ { n }$ is taken from a distribution with mean $\mu$ and variance $\sigma ^ { 2 }$.

Write down $\mathrm { E } \left( \sum _ { i = 1 } ^ { n } W _ { i } \right)$ and show that $\mathrm { E } \left( \sum _ { i = 1 } ^ { n } W _ { i } ^ { 2 } \right) = n \left( \sigma ^ { 2 } + \mu ^ { 2 } \right)$.

An estimator for $\mu$ is

$$\bar { X } = \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i }$$

Show that $\bar { X }$ is a consistent estimator for $\mu$.

An estimator of $\sigma ^ { 2 }$ is

$$U = \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i } ^ { 2 } - \left( \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i } \right) ^ { 2 }$$

Find the bias of $U$.
Write down an unbiased estimator of $\sigma ^ { 2 }$ in the form $k U$, where $k$ is in terms of $n$.
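The bias of $U$ can be checked exactly on a small example. The distribution used here (values 0, 1 and 3, each equally likely) and the sample size $n = 3$ are hypothetical choices made only for illustration. The sketch enumerates every possible sample and compares $\mathrm{E}(U)$ with $\sigma^2$ using exact fractions.

```python
from fractions import Fraction
from itertools import product

values = [0, 1, 3]  # hypothetical equally likely values of W
n = 3               # hypothetical sample size

mu = Fraction(sum(values), len(values))
sigma2 = Fraction(sum(v * v for v in values), len(values)) - mu ** 2


def estimator_u(sample):
    """U = (1/n) * sum(W_i^2) - ((1/n) * sum(W_i))^2 for one sample."""
    m = len(sample)
    return Fraction(sum(w * w for w in sample), m) - Fraction(sum(sample), m) ** 2


# All 3^3 samples are equally likely, so E(U) is the plain average
samples = list(product(values, repeat=n))
expected_u = sum(estimator_u(s) for s in samples) / len(samples)

bias = expected_u - sigma2
k = Fraction(n, n - 1)  # candidate multiplier making kU unbiased

print(sigma2, expected_u, bias, k * expected_u)
```

For this example the bias equals $-\sigma^2/n$ and $\frac{n}{n-1}U$ has expectation $\sigma^2$, which agrees with the general result the question asks for.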
1. George owns a garage and he records the mileage of cars, $x$ thousands of miles, between services. The results from a random sample of 10 cars are summarised below.

$$\sum x = 113.4 \quad \sum x ^ { 2 } = 1414.08$$

The mileage of cars between services is normally distributed and George believes that the standard deviation is 2.4 thousand miles. Stating your hypotheses clearly, test, at the $5 \%$ level of significance, whether or not these data support George’s belief.
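A sketch of the test statistic for George's belief, assuming a two-tailed $\chi^2$ test of $\mathrm{H}_0 : \sigma = 2.4$ against $\mathrm{H}_1 : \sigma \neq 2.4$ with 9 degrees of freedom. The critical values are table values.

```python
n = 10
sum_x, sum_x2 = 113.4, 1414.08
sigma_0 = 2.4

sxx = sum_x2 - sum_x ** 2 / n  # equals (n - 1) * s^2
chi2_stat = sxx / sigma_0 ** 2

# Table values (assumption: read from chi-squared tables), 9 df, 2.5% each tail
CHI2_LOWER, CHI2_UPPER = 2.700, 19.023
reject_h0 = not (CHI2_LOWER < chi2_stat < CHI2_UPPER)

print(f"Sxx = {sxx:.3f}, chi2 = {chi2_stat:.3f}, reject H0: {reject_h0}")
```

The statistic is about 22.24, above 19.023, so on these assumptions the data do not support a standard deviation of 2.4 thousand miles.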
2. Every 6 months some engineers are tested to see if their times, in minutes, to assemble a particular component have changed. The times taken to assemble the component are normally distributed. A random sample of 8 engineers was chosen and their times to assemble the component were recorded in January and in July. The data are given in the table below.

Table 1

Figure 1 shows a graph of the power function for the scientist's test.

On the same axes draw the graph of the power function for the statistician's test.

Given that it takes 20 minutes to collect and test a 20 ml sample and 15 minutes to collect and test a 10 ml sample, show that the statistician's test is expected to be slower than the scientist's test for $\lambda \mathrm { e } ^ { - \lambda } > \frac { 1 } { 3 }$.

By considering the times when $\lambda = 1$ and $\lambda = 2$, together with the power curves in part (e), suggest, giving a reason, which test you would use.
(2)

\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{a1841cf5-93f3-4043-b6ed-651168b13b87-93_1179_1152_1455_395}
\captionsetup{labelformat=empty}
\caption{Figure 1}
\end{figure}
1. The carbon content, measured in suitable units, of steel is normally distributed. Two independent random samples of steel were taken from a refining plant at different times and their carbon content recorded. The results are given below.

$$\begin{array} { l l l l l l l } \text { Sample } A : & 1.5 & 0.9 & 1.3 & 1.2 & & \\ \text { Sample } B : & 0.4 & 0.6 & 0.8 & 0.3 & 0.5 & 0.4 \end{array}$$

Stating your hypotheses clearly, carry out a suitable test, at the $10 \%$ level of significance, to show that both samples can be assumed to have come from populations with a common variance $\sigma ^ { 2 }$.

Showing your working clearly, find the $99 \%$ confidence interval for $\sigma ^ { 2 }$ based on both samples.
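Both parts of the carbon question can be sketched numerically. The sketch assumes a two-tailed $F$-test with the larger sample variance in the numerator, compared with the upper 5% point of $F_{3,5}$, followed by a pooled interval based on $\chi^2$ with $4 + 6 - 2 = 8$ degrees of freedom. All percentage points are table values.

```python
from statistics import variance

sample_a = [1.5, 0.9, 1.3, 1.2]
sample_b = [0.4, 0.6, 0.8, 0.3, 0.5, 0.4]

var_a = variance(sample_a)  # unbiased estimate, 3 df
var_b = variance(sample_b)  # unbiased estimate, 5 df

# Sample A has the larger variance, so it goes in the numerator
f_stat = var_a / var_b
F_CRIT = 5.41  # table value: F(3, 5), upper 5% point
equal_variances_plausible = f_stat < F_CRIT

# Pooled sum of squares and 99% confidence interval for sigma^2
df = len(sample_a) + len(sample_b) - 2
pooled_ss = (len(sample_a) - 1) * var_a + (len(sample_b) - 1) * var_b
CHI2_UPPER = 21.955  # table value: 8 df, upper 0.5% point
CHI2_LOWER = 1.344   # table value: 8 df, lower 0.5% point
ci = (pooled_ss / CHI2_UPPER, pooled_ss / CHI2_LOWER)

print(round(f_stat, 3), equal_variances_plausible, ci)
```

With these values $F \approx 1.95 < 5.41$, so a common variance is plausible, and the interval for $\sigma^2$ is roughly (0.0158, 0.259).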
