Edexcel S3 (Statistics 3) 2014 June

Question 1
  1. A journalist is investigating factors which influence people when they buy a new car. One possible factor is fuel efficiency. The journalist randomly selects 8 car models. Each model's annual sales and fuel efficiency, in km/litre, are shown in the table below.
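$$\begin{array} { l r r r r r r r r }
\text{Car model} & A & B & C & D & E & F & G & H \\
\text{Annual sales} & 1800 & 5400 & 18100 & 7100 & 9300 & 4800 & 12200 & 10700 \\
\text{Fuel efficiency (km/litre)} & 5.2 & 18.6 & 14.8 & 13.2 & 18.3 & 11.9 & 16.5 & 17.7
\end{array}$$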
  1. Calculate Spearman's rank correlation coefficient for these data.

     The journalist believes that car models with higher fuel efficiency will achieve higher sales.
  2. Stating your hypotheses clearly, test whether or not the data support the journalist's belief. Use a \(5 \%\) level of significance.
  3. State the assumption necessary for a product moment correlation coefficient to be valid in this case.
  4. The mean and median fuel efficiencies of the car models in the random sample are 14.5 km/litre and 15.65 km/litre respectively. Considering these statistics, as well as the distribution of the fuel efficiency data, state whether or not the data suggest that the assumption in part (c) might be true in this case. Give a reason for your answer. (No further calculations are required.)
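As a rough numerical check of the figures in this question, the sketch below ranks the two data rows, applies the usual formula \(r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}\), and recomputes the mean and median quoted in the last part. The helper `ranks` and all variable names are illustrative, and the ranking assumes there are no tied values (true for these data).

```python
from statistics import mean, median

# Data from the table in Question 1 (car models A to H).
sales = [1800, 5400, 18100, 7100, 9300, 4800, 12200, 10700]
fuel = [5.2, 18.6, 14.8, 13.2, 18.3, 11.9, 16.5, 17.7]

def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

n = len(sales)
# Sum of squared rank differences, then Spearman's formula.
d_squared = sum((a - b) ** 2 for a, b in zip(ranks(sales), ranks(fuel)))
r_s = 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Summary statistics quoted in the final part of the question.
fuel_mean = mean(fuel)
fuel_median = median(fuel)

print(d_squared, round(r_s, 3), round(fuel_mean, 1), round(fuel_median, 2))
```

The mean lying below the median is the kind of feature the final part asks the reader to weigh against the assumption named in part (c).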
Question 2
  1. A survey asked a random sample of 200 people their age and the main use of their mobile phone.
The results are shown in Table 1 below. \begin{table}[h]
\begin{tabular}{llccc}
 & & \multicolumn{3}{c}{Main use of their mobile phone} \\
 & & Internet & Texts & Phone calls \\
Age & Under 20 & 27 & 14 & 9 \\
 & From 20 to 40 & 32 & 34 & 29 \\
 & Over 40 & 15 & 19 & 21 \\
\end{tabular}
\captionsetup{labelformat=empty} \caption{Table 1}
\end{table} The data are to be used to test whether or not age and main use of their mobile phone are independent. Table 2 shows the expected frequencies for each group, assuming people's age and main use of their mobile phone are independent. \begin{table}[h]
\begin{tabular}{llccc}
 & & \multicolumn{3}{c}{Main use of their mobile phone} \\
 & & Internet & Texts & Phone calls \\
Age & Under 20 & 18.5 & 16.75 & 14.75 \\
 & From 20 to 40 & 35.15 & 31.825 & 28.025 \\
 & Over 40 & 20.35 & 18.425 & 16.225 \\
\end{tabular}
\captionsetup{labelformat=empty} \caption{Table 2}
\end{table}
  1. For users under 20 choosing the Internet as the main use of their mobile phone,
    1. verify that the expected frequency is 18.5
    2. show that the contribution to the \(\chi ^ { 2 }\) test statistic is 3.91 to 3 significant figures.
  2. Given that the \(\chi ^ { 2 }\) test statistic for the data is 9.893 to 3 decimal places, test at the \(5 \%\) level of significance whether or not age and main use of their mobile phone are independent. State your hypotheses clearly.
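The figures quoted in this question can be reproduced with a short sketch. It rebuilds the expected frequencies from the row and column totals of Table 1 (expected = row total × column total / grand total) and sums the \((O - E)^2 / E\) contributions. Variable names are illustrative only.

```python
# Observed frequencies from Table 1 (rows: age group; columns: Internet, Texts, Phone calls).
observed = [
    [27, 14, 9],    # Under 20
    [32, 34, 29],   # From 20 to 40
    [15, 19, 21],   # Over 40
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Contribution of each cell to the test statistic: (O - E)^2 / E.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
    for o_row, e_row in zip(observed, expected)
]
chi_squared = sum(sum(row) for row in contributions)

# Degrees of freedom for a contingency table: (rows - 1) * (columns - 1).
dof = (len(row_totals) - 1) * (len(col_totals) - 1)

print(expected[0][0], round(contributions[0][0], 2), round(chi_squared, 3), dof)
```

The statistic would then be compared with the tabulated \(5 \%\) critical value of \(\chi^2\) for `dof` degrees of freedom.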
Question 3
  1. A company produces two types of milk powder, 'Semi-Skimmed' and 'Full Cream'. In tests, each type of milk powder is used to make a large number of cups of coffee. The mass, \(S\) grams, of 'Semi-Skimmed' milk powder used in one cup of coffee is modelled by \(S \sim \mathrm {~N} \left( 4.9,0.8 ^ { 2 } \right)\). The mass, \(C\) grams, of 'Full Cream' milk powder used in one cup of coffee is modelled by \(C \sim \mathrm {~N} \left( 2.5,0.4 ^ { 2 } \right)\)
    1. Two cups of coffee, one with each type of milk powder, are to be selected at random. Find the probability that the mass of 'Semi-Skimmed' milk powder used will be at least double that of the 'Full Cream' milk powder used.
    2. 'Semi-Skimmed' milk powder is sold in 500 g packs. Find the probability that one pack will be sufficient for 100 cups of coffee.
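One possible way to set up both parts numerically is sketched below, using Python's `statistics.NormalDist`. It assumes that \(S\) and \(C\) are independent and that the masses used in different cups are independent, neither of which is stated explicitly in the question. The names `D` (for \(S - 2C\)) and `T` (for the total over 100 cups) are introduced here purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Part 1: D = S - 2C, assuming S and C are independent.
# E(D) = 4.9 - 2 * 2.5 and Var(D) = 0.8^2 + 2^2 * 0.4^2.
d_mean = 4.9 - 2 * 2.5
d_sd = sqrt(0.8 ** 2 + 2 ** 2 * 0.4 ** 2)
p_double = 1 - NormalDist(d_mean, d_sd).cdf(0)   # P(S >= 2C) = P(D >= 0)

# Part 2: T = S_1 + ... + S_100, assuming independent cups.
# E(T) = 100 * 4.9 and Var(T) = 100 * 0.8^2.
t_mean = 100 * 4.9
t_sd = sqrt(100 * 0.8 ** 2)
p_pack = NormalDist(t_mean, t_sd).cdf(500)       # P(T <= 500)

print(round(p_double, 4), round(p_pack, 4))
```

Note that the variance of \(2C\) is \(4 \operatorname{Var}(C)\), whereas the variance of a sum of 100 independent cups is \(100 \operatorname{Var}(S)\); mixing up these two cases is a common slip.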
Question 4
4. A manufacturing company produces solar panels. The output of each solar panel is normally distributed with standard deviation 6 watts. It is thought that the mean output, \(\mu\), is 160 watts. A researcher believes that the mean output of the solar panels is greater than 160 watts. He writes down the output values of 5 randomly selected solar panels. He uses the data to carry out a hypothesis test at the \(5 \%\) level of significance. He tests \(\mathrm { H } _ { 0 } : \mu = 160\) against \(\mathrm { H } _ { 1 } : \mu > 160\)
On reporting to his manager, the researcher can only find 4 of the output values. These are shown below: $$\begin{array} { l l l l } 168.2 & 157.4 & 173.3 & 161.1 \end{array}$$ Given that the result of the hypothesis test is that there is significant evidence to reject \(\mathrm { H } _ { 0 }\) at the \(5 \%\) level of significance, calculate the minimum possible missing output value, \(\alpha\). Give your answer correct to 1 decimal place.
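A sketch of the reasoning, under the assumption that the test is a one-tailed \(z\)-test on the sample mean with known \(\sigma = 6\): \(\mathrm{H}_0\) is rejected when \(\bar{x} > 160 + z_{0.95} \cdot 6 / \sqrt{5}\), which puts a lower bound on the sum of the five readings and hence on the missing one. Variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

known = [168.2, 157.4, 173.3, 161.1]   # the four surviving readings
sigma, n, mu_0 = 6, 5, 160

# One-tailed 5% critical value of the standard normal (about 1.6449).
z_crit = NormalDist().inv_cdf(0.95)

# Smallest sample mean that still leads to rejecting H0.
mean_crit = mu_0 + z_crit * sigma / sqrt(n)

# The five readings must sum to more than n * mean_crit.
alpha_bound = n * mean_crit - sum(known)

# Round up to 1 decimal place so the value stays in the rejection region.
alpha_min = ceil(alpha_bound * 10) / 10

print(round(alpha_bound, 3), alpha_min)
```

Rounding up rather than to the nearest tenth matters here, because a value rounded down would fall just short of the critical region.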
Question 5
5. A student believes that there is a difference in the mean lengths of English and French films. He goes to the university video library and randomly selects a sample of 120 English films and a sample of 70 French films. He notes the length, \(x\) minutes, of each of the films in his samples. His data are summarised in the table below.
$$\begin{array} { l r r r r }
 & \Sigma x & \Sigma x ^ { 2 } & s ^ { 2 } & n \\
\text{English films} & 10650 & 956909 & 98.5 & 120 \\
\text{French films} & 6510 & 615849 & 151 & 70
\end{array}$$
  1. Verify that the unbiased estimate of the variance, \(s ^ { 2 }\), of the lengths of English films is 98.5 minutes\(^ { 2 }\)
  2. Stating your hypotheses clearly, test, at the \(1 \%\) level of significance, whether or not the mean lengths of English and French films are different.
  3. Explain the significance of the Central Limit Theorem to the test in part (b).
  4. The university video library contained 724 English films and 473 French films. Explain how the student could have taken a stratified sample of 190 of these films.
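The summary statistics in the table can be checked, and the two-sample statistic computed, with the sketch below. It assumes the standard large-sample test \(z = (\bar{x}_E - \bar{x}_F) / \sqrt{s_E^2 / n_E + s_F^2 / n_F}\), and for the last part it assumes proportional allocation with simple rounding. The helper `summary` and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def summary(sum_x, sum_x2, n):
    """Return (sample mean, unbiased variance estimate) from summed data."""
    mean = sum_x / n
    s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    return mean, s2

mean_e, s2_e = summary(10650, 956909, 120)   # English films
mean_f, s2_f = summary(6510, 615849, 70)     # French films

# Two-sample z statistic for a difference in means (large samples).
z = (mean_e - mean_f) / sqrt(s2_e / 120 + s2_f / 70)

# Two-tailed 1% critical value of the standard normal.
z_crit = NormalDist().inv_cdf(0.995)

# Proportional allocation of 190 films across the two strata.
total = 724 + 473
n_english = round(190 * 724 / total)
n_french = round(190 * 473 / total)

print(s2_e, s2_f, round(z, 3), round(z_crit, 4), n_english, n_french)
```

Within each stratum the films would still need to be chosen by simple random sampling, for example by numbering them and using random numbers.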
Question 6
6. Bags of \(\pounds 1\) coins are paid into a bank. Each bag contains 20 coins. The bank manager believes that \(5 \%\) of the \(\pounds 1\) coins paid into the bank are fakes. He decides to use the distribution \(X \sim \mathrm {~B} ( 20,0.05 )\) to model the random variable \(X\), the number of fake \(\pounds 1\) coins in each bag.
  1. State the assumptions necessary for the binomial distribution to be an appropriate model in this case. The bank manager checks a random sample of 150 bags of \(\pounds 1\) coins and records the number of fake coins found in each bag. His results are summarised in Table 1. \begin{table}[h]
    \begin{tabular}{lccccc}
    Number of fake coins in each bag & 0 & 1 & 2 & 3 & 4 or more \\
    Observed frequency & 43 & 62 & 26 & 13 & 6 \\
    Expected frequency & 53.8 & 56.6 & \(r\) & 8.9 & \(s\) \\
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 1}
    \end{table}
  2. Calculate the values of \(r\) and \(s\), giving your answers to 1 decimal place.
  3. Carry out a hypothesis test, at the \(5 \%\) significance level, to see if the data supports the bank manager's statistical model. State your hypotheses clearly. The assistant manager thinks that a binomial distribution is a good model but suggests that the proportion of fake coins is higher than \(5 \%\). She calculates the actual proportion of fake coins in the sample and uses this value to carry out a new hypothesis test on the data. Her expected frequencies are shown in Table 2. \begin{table}[h]
    \begin{tabular}{lccccc}
    Number of fake coins in each bag & 0 & 1 & 2 & 3 & 4 or more \\
    Observed frequency & 43 & 62 & 26 & 13 & 6 \\
    Expected frequency & 44.5 & 55.7 & 33.2 & 12.5 & 4.1 \\
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 2}
    \end{table}
  4. Explain why there are 2 degrees of freedom in this case.
  5. Given that she obtains a \(\chi ^ { 2 }\) test statistic of 2.67 , test the assistant manager's hypothesis that the binomial distribution is a good model for the number of fake coins in each bag. Use a \(5 \%\) level of significance and state your hypotheses clearly.
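The expected frequencies in Table 1 of this question follow from \(150 \times \mathrm{P}(X = k)\) with \(X \sim \mathrm{B}(20, 0.05)\). The sketch below recomputes them, taking the final "4 or more" class as the remainder so that the expected frequencies sum to 150. The function `binom_pmf` and the variable names are illustrative.

```python
from math import comb

n_coins, p, n_bags = 20, 0.05, 150

def binom_pmf(k):
    """P(X = k) for X ~ B(20, 0.05)."""
    return comb(n_coins, k) * p ** k * (1 - p) ** (n_coins - k)

# Expected frequencies for 0, 1, 2 and 3 fake coins per bag.
expected = [n_bags * binom_pmf(k) for k in range(4)]

# The "4 or more" class takes whatever is left of the 150 bags.
expected.append(n_bags - sum(expected))

r = round(expected[2], 1)
s = round(expected[4], 1)

print([round(e, 1) for e in expected])
```

Any class with an expected frequency below 5 would normally be pooled with a neighbouring class before the \(\chi^2\) statistic is calculated, which affects the number of degrees of freedom in the later parts.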
Question 7
7. A petrol pump is tested regularly to check that the reading on its gauge is accurate. The random variable \(X\), in litres, is the quantity of petrol actually dispensed when the gauge reads 10.00 litres. \(X\) is known to have distribution \(X \sim \mathrm {~N} \left( \mu , 0.08 ^ { 2 } \right)\)
  1. Eight random tests gave the following values of \(x\) $$\begin{array} { l l l l l l l l } 10.01 & 9.97 & 9.93 & 9.99 & 9.90 & 9.95 & 10.13 & 9.94 \end{array}$$
    1. Find a \(95 \%\) confidence interval for \(\mu\) to 2 decimal places.
    2. Use your result to comment on the accuracy of the petrol gauge.
  2. A sample mean of 9.96 litres was obtained from a random sample of \(n\) tests. A \(90 \%\) confidence interval for \(\mu\) gave an upper limit of less than 10.00 litres. Find the minimum value of \(n\).
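Both parts rest on the interval \(\bar{x} \pm z \, \sigma / \sqrt{n}\) with known \(\sigma = 0.08\). A minimal sketch of the arithmetic follows; variable names are illustrative, and the critical values are taken from `statistics.NormalDist` rather than from printed tables, so the last digit may differ slightly from a table-based working.

```python
from math import ceil, sqrt
from statistics import NormalDist, mean

readings = [10.01, 9.97, 9.93, 9.99, 9.90, 9.95, 10.13, 9.94]
sigma = 0.08

# Part 1: 95% confidence interval for mu with known sigma.
z_95 = NormalDist().inv_cdf(0.975)
x_bar = mean(readings)
half_width = z_95 * sigma / sqrt(len(readings))
ci = (round(x_bar - half_width, 2), round(x_bar + half_width, 2))

# Part 2: need 9.96 + z_90 * sigma / sqrt(n) < 10.00, so
# n > (z_90 * sigma / (10.00 - 9.96))^2.
z_90 = NormalDist().inv_cdf(0.95)
n_bound = (z_90 * sigma / (10.00 - 9.96)) ** 2
n_min = ceil(n_bound)

print(ci, round(n_bound, 2), n_min)
```

Whether 10.00 lies inside the interval from part 1 is what the comment on the accuracy of the gauge turns on.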