Questions (30179 questions)

AQA S3 2006 June Q2
7 marks Standard +0.3
2 The table below shows the heart rates, \(x\) beats per minute, and the systolic blood pressures, \(y\) millimetres of mercury, of a random sample of 10 patients undergoing kidney dialysis.
$$\begin{array} { l | c c c c c c c c c c } \text { Patient } & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \hline x & 83 & 86 & 88 & 92 & 94 & 98 & 101 & 111 & 115 & 121 \\ y & 157 & 172 & 161 & 154 & 171 & 169 & 179 & 180 & 192 & 182 \end{array}$$
  1. Calculate the value of the product moment correlation coefficient for these data.
  2. Assuming that these data come from a bivariate normal distribution, investigate, at the \(1 \%\) level of significance, the claim that, for patients undergoing kidney dialysis, there is a positive correlation between heart rate and systolic blood pressure.
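As a rough check on both parts, the sketch below computes the product moment correlation coefficient from the tabulated data and compares it with a critical value. The critical value 0.7155 (one-tailed, 1%, n = 10) is an assumption taken from standard statistical tables, not from the question itself.

```python
import math

# Data transcribed from the table in the question.
x = [83, 86, 88, 92, 94, 98, 101, 111, 115, 121]
y = [157, 172, 161, 154, 171, 169, 179, 180, 192, 182]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Corrected sums of squares and products.
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_yy = sum((yi - mean_y) ** 2 for yi in y)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = s_xy / math.sqrt(s_xx * s_yy)

# Assumed table value: one-tailed 1% critical value of r for n = 10.
R_CRITICAL = 0.7155
reject_h0 = r > R_CRITICAL  # H0: rho = 0 against H1: rho > 0

print(f"r = {r:.4f}; reject H0 at 1%: {reject_h0}")
```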
AQA S3 2006 June Q3
11 marks Moderate -0.3
3 Each enquiry received by a business support unit is dealt with by Ewan, Fay or Gaby. The probabilities of them dealing with an enquiry are \(0.2\), \(0.3\) and \(0.5\) respectively. Of enquiries dealt with by Ewan, 60\% are answered immediately, 25\% are answered later the same day and the remainder are answered at a later date. Of enquiries dealt with by Fay, 75\% are answered immediately, 15\% are answered later the same day and the remainder are answered at a later date. Of enquiries dealt with by Gaby, 90\% are answered immediately and the remainder are answered at a later date.
  1. Determine the probability that an enquiry:
    1. is dealt with by Gaby and answered immediately;
    2. is answered immediately;
    3. is dealt with by Gaby, given that it is answered immediately.
  2. Determine the probability that an enquiry is dealt with by Ewan, given that it is answered later the same day.
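The four probabilities follow from the multiplication rule, the law of total probability and Bayes' theorem. The sketch below is one way to organise the arithmetic; the dictionary names are illustrative only.

```python
# P(person deals with the enquiry).
prior = {"Ewan": 0.2, "Fay": 0.3, "Gaby": 0.5}
# P(answered immediately | person) and P(answered later the same day | person).
immediate = {"Ewan": 0.60, "Fay": 0.75, "Gaby": 0.90}
same_day = {"Ewan": 0.25, "Fay": 0.15, "Gaby": 0.00}

# Gaby and immediate (multiplication rule).
p_gaby_and_immediate = prior["Gaby"] * immediate["Gaby"]
# Immediate (total probability over the three people).
p_immediate = sum(prior[k] * immediate[k] for k in prior)
# Gaby given immediate (Bayes' theorem).
p_gaby_given_immediate = p_gaby_and_immediate / p_immediate

# Ewan given "later the same day".
p_same_day = sum(prior[k] * same_day[k] for k in prior)
p_ewan_given_same_day = prior["Ewan"] * same_day["Ewan"] / p_same_day

print(p_gaby_and_immediate, p_immediate)
print(round(p_gaby_given_immediate, 4), round(p_ewan_given_same_day, 4))
```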
AQA S3 2006 June Q4
6 marks Moderate -0.3
4 The table below shows the probability distribution for the number of students, \(R\), attending classes for a particular mathematics module.
$$\begin{array} { l | c c c } r & 6 & 7 & 8 \\ \hline \mathrm { P } ( R = r ) & 0.1 & 0.6 & 0.3 \end{array}$$
  1. Find values for \(\mathrm { E } ( R )\) and \(\operatorname { Var } ( R )\).
  2. The number of students, \(S\), attending classes for a different mathematics module is such that $$\mathrm { E } ( S ) = 10.9 , \quad \operatorname { Var } ( S ) = 1.69 \quad \text { and } \quad \rho _ { R S } = \frac { 2 } { 3 }$$ Find values for the mean and variance of:
    1. \(T = R + S\);
    2. \(D = S - R\).
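A minimal sketch of the calculation, assuming the usual identities \(\operatorname{Cov}(R, S) = \rho_{RS} \sigma_R \sigma_S\) and \(\operatorname{Var}(R \pm S) = \operatorname{Var}(R) + \operatorname{Var}(S) \pm 2 \operatorname{Cov}(R, S)\). Variable names are illustrative.

```python
import math

# Probability distribution of R from the table.
dist_r = {6: 0.1, 7: 0.6, 8: 0.3}

e_r = sum(r * p for r, p in dist_r.items())
var_r = sum(r * r * p for r, p in dist_r.items()) - e_r ** 2

# Given moments of S and the correlation between R and S.
e_s, var_s, rho = 10.9, 1.69, 2 / 3
cov_rs = rho * math.sqrt(var_r * var_s)

# T = R + S
e_t = e_r + e_s
var_t = var_r + var_s + 2 * cov_rs

# D = S - R
e_d = e_s - e_r
var_d = var_r + var_s - 2 * cov_rs

print(round(e_r, 2), round(var_r, 2))
print(round(e_t, 2), round(var_t, 2), round(e_d, 2), round(var_d, 2))
```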
AQA S3 2006 June Q5
12 marks Standard +0.3
5 The number of letters per week received at home by Rosa may be modelled by a Poisson distribution with parameter 12.25.
  1. Using a normal approximation, estimate the probability that, during a 4-week period, Rosa receives at home at least 42 letters but at most 54 letters.
  2. Rosa also receives letters at work. During a 16-week period, she receives at work a total of 248 letters.
    1. Assuming that the number of letters received at work by Rosa may also be modelled by a Poisson distribution, calculate a \(98 \%\) confidence interval for the average number of letters per week received at work by Rosa.
    2. Hence comment on Rosa's belief that she receives, on average, fewer letters at home than at work.
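The sketch below illustrates one route through the numbers: a normal approximation with a continuity correction for the 4-week total, and a normal-based interval for the weekly rate at work. Treating the estimated rate as approximately normal with variance \(\hat{\lambda}/16\) is an assumption of this sketch.

```python
import math
from statistics import NormalDist

# Part 1: 4-week total at home is Poisson(4 * 12.25), approximated by a normal.
home_rate = 12.25
lam = 4 * home_rate
approx = NormalDist(mu=lam, sigma=math.sqrt(lam))
# Continuity correction for P(42 <= X <= 54).
p_42_to_54 = approx.cdf(54.5) - approx.cdf(41.5)

# Part 2: 98% confidence interval for the weekly rate at work.
work_total, work_weeks = 248, 16
rate_hat = work_total / work_weeks
z = NormalDist().inv_cdf(0.99)  # two-sided 98%
half_width = z * math.sqrt(rate_hat / work_weeks)
ci = (rate_hat - half_width, rate_hat + half_width)

print(f"P(42 <= X <= 54) ~ {p_42_to_54:.3f}")
print(f"98% CI for weekly rate at work: ({ci[0]:.2f}, {ci[1]:.2f})")
print("Home rate below the whole interval:", home_rate < ci[0])
```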
AQA S3 2006 June Q6
8 marks Challenging +1.2
6 The random variable \(X\) has a Poisson distribution with parameter \(\lambda\).
  1. Prove that \(\mathrm { E } ( X ) = \lambda\).
  2. By first proving that \(\mathrm { E } ( X ( X - 1 ) ) = \lambda ^ { 2 }\), or otherwise, prove that \(\operatorname { Var } ( X ) = \lambda\).
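One possible outline of the two proofs, using \(\mathrm { P } ( X = x ) = \mathrm { e } ^ { - \lambda } \lambda ^ { x } / x !\) and the series \(\sum _ { k = 0 } ^ { \infty } \lambda ^ { k } / k ! = \mathrm { e } ^ { \lambda }\), is sketched here; it is not necessarily the layout a mark scheme would expect.
$$\mathrm { E } ( X ) = \sum _ { x = 0 } ^ { \infty } x \, \frac { \mathrm { e } ^ { - \lambda } \lambda ^ { x } } { x ! } = \lambda \mathrm { e } ^ { - \lambda } \sum _ { x = 1 } ^ { \infty } \frac { \lambda ^ { x - 1 } } { ( x - 1 ) ! } = \lambda \mathrm { e } ^ { - \lambda } \mathrm { e } ^ { \lambda } = \lambda$$
$$\mathrm { E } ( X ( X - 1 ) ) = \sum _ { x = 0 } ^ { \infty } x ( x - 1 ) \frac { \mathrm { e } ^ { - \lambda } \lambda ^ { x } } { x ! } = \lambda ^ { 2 } \mathrm { e } ^ { - \lambda } \sum _ { x = 2 } ^ { \infty } \frac { \lambda ^ { x - 2 } } { ( x - 2 ) ! } = \lambda ^ { 2 }$$
$$\operatorname { Var } ( X ) = \mathrm { E } ( X ( X - 1 ) ) + \mathrm { E } ( X ) - [ \mathrm { E } ( X ) ] ^ { 2 } = \lambda ^ { 2 } + \lambda - \lambda ^ { 2 } = \lambda$$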
AQA S3 2006 June Q7
19 marks Challenging +1.2
7 A shop sells cooked chickens in two sizes: medium and large.
The weights, \(X\) grams, of medium chickens may be assumed to be normally distributed with mean \(\mu _ { X }\) and standard deviation 45. The weights, \(Y\) grams, of large chickens may be assumed to be normally distributed with mean \(\mu _ { Y }\) and standard deviation 65. A random sample of 20 medium chickens had a mean weight, \(\bar { x }\) grams, of 936.
A random sample of 10 large chickens had the following weights in grams: $$\begin{array} { l l l l l l l l l l } 1165 & 1202 & 1077 & 1144 & 1195 & 1275 & 1136 & 1215 & 1233 & 1288 \end{array}$$
  1. Calculate the mean weight, \(\bar { y }\) grams, of this sample of large chickens.
  2. Hence investigate, at the \(1 \%\) level of significance, the claim that the mean weight of large chickens exceeds that of medium chickens by more than 200 grams.
  3.
    1. Deduce that, for your test in part (b), the critical value of \(( \bar { y } - \bar { x } )\) is 253.24, correct to two decimal places.
    2. Hence determine the power of your test in part (b), given that \(\mu _ { Y } - \mu _ { X } = 275\).
    3. Interpret, in the context of this question, the value that you obtained in part (c)(ii).
      (3 marks)
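The following sketch works through the numbers for this question: the sample mean, a one-sided z-test of \(\mu _ { Y } - \mu _ { X } = 200\) against \(\mu _ { Y } - \mu _ { X } > 200\) with known standard deviations, the critical value of \(\bar { y } - \bar { x }\), and the power when the true difference is 275. Variable names are illustrative.

```python
import math
from statistics import NormalDist, fmean

large = [1165, 1202, 1077, 1144, 1195, 1275, 1136, 1215, 1233, 1288]
y_bar = fmean(large)
x_bar = 936

# Standard error of (y_bar - x_bar) with known standard deviations 65 and 45.
se = math.sqrt(45 ** 2 / 20 + 65 ** 2 / 10)

# One-sided test at the 1% level.
z_crit = NormalDist().inv_cdf(0.99)
z = (y_bar - x_bar - 200) / se
reject_h0 = z > z_crit

# Critical value of (y_bar - x_bar) and power when the true difference is 275.
critical_diff = 200 + z_crit * se
power = 1 - NormalDist(mu=275, sigma=se).cdf(critical_diff)

print(f"y_bar = {y_bar}, z = {z:.3f}, reject H0: {reject_h0}")
print(f"critical difference = {critical_diff:.2f}, power = {power:.3f}")
```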
AQA S3 2007 June Q1
8 marks Moderate -0.3
1 As part of an investigation into the starting salaries of graduates in a European country, the following information was collected.
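$$\begin{array} { l | c c c } & \text { Sample size } & \text { Sample mean (€) } & \text { Sample standard deviation (€) } \\ \hline \text { Science graduates } & 175 & 19268 & 7321 \\ \text { Arts graduates } & 225 & 17896 & 8205 \end{array}$$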
  1. Stating a necessary assumption about the samples, construct a \(98 \%\) confidence interval for the difference between the mean starting salary of science graduates and that of arts graduates.
  2. What can be concluded from your confidence interval?
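A minimal sketch of the interval, assuming the samples are independent random samples and large enough for the sample standard deviations to stand in for the population values (so a normal-based interval is reasonable).

```python
import math
from statistics import NormalDist

n_sci, mean_sci, sd_sci = 175, 19268, 7321
n_art, mean_art, sd_art = 225, 17896, 8205

diff = mean_sci - mean_art
se = math.sqrt(sd_sci ** 2 / n_sci + sd_art ** 2 / n_art)

z = NormalDist().inv_cdf(0.99)  # two-sided 98%
ci = (diff - z * se, diff + z * se)

print(f"difference = {diff}, 98% CI = ({ci[0]:.0f}, {ci[1]:.0f})")
# If the interval contains zero, the data do not show a difference in means.
print("Interval contains zero:", ci[0] < 0 < ci[1])
```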
AQA S3 2007 June Q2
11 marks Moderate -0.8
2 A hill-top monument can be visited by one of three routes: road, funicular railway or cable car. The percentages of visitors using these routes are 25, 35 and 40 respectively. The age distribution, in percentages, of visitors using each route is shown in the table. For example, 15 per cent of visitors using the road were under 18.
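Percentage of visitors using each route:
$$\begin{array} { l | c c c } \text { Age (years) } & \text { Road } & \text { Funicular railway } & \text { Cable car } \\ \hline \text { Under 18 } & 15 & 25 & 10 \\ \text { 18 to 64 } & 80 & 60 & 55 \\ \text { Over 64 } & 5 & 15 & 35 \end{array}$$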
Calculate the probability that a randomly selected visitor:
  1. who used the road is aged 18 or over;
  2. is aged between 18 and 64;
  3. used the funicular railway and is aged over 64;
  4. used the funicular railway, given that the visitor is aged over 64.
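The four answers can be checked with a short calculation using conditional probability and the law of total probability. The nested dictionary below simply encodes the table; the key names are illustrative.

```python
# P(route) and P(age band | route), taken from the question.
route = {"road": 0.25, "funicular": 0.35, "cable_car": 0.40}
age = {
    "road": {"under_18": 0.15, "18_to_64": 0.80, "over_64": 0.05},
    "funicular": {"under_18": 0.25, "18_to_64": 0.60, "over_64": 0.15},
    "cable_car": {"under_18": 0.10, "18_to_64": 0.55, "over_64": 0.35},
}

# 1. Aged 18 or over, given the road was used.
p1 = 1 - age["road"]["under_18"]
# 2. Aged between 18 and 64 (total probability).
p2 = sum(route[r] * age[r]["18_to_64"] for r in route)
# 3. Used the funicular railway and aged over 64.
p3 = route["funicular"] * age["funicular"]["over_64"]
# 4. Used the funicular railway, given aged over 64.
p_over_64 = sum(route[r] * age[r]["over_64"] for r in route)
p4 = p3 / p_over_64

print(round(p1, 4), round(p2, 4), round(p3, 4), round(p4, 4))
```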
AQA S3 2007 June Q3
11 marks Standard +0.8
3 Kutz and Styler are two unisex hair salons. An analysis of a random sample of 150 customers at Kutz shows that 28 per cent are male. An analysis of an independent random sample of 250 customers at Styler shows that 34 per cent are male.
  1. Test, at the \(5 \%\) level of significance, the hypothesis that there is no difference between the proportion of male customers at Kutz and that at Styler.
  2. State, with a reason, the probability of making a Type I error in the test in part (a) if, in fact, the actual difference between the two proportions is 0.05.
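For the test in part (a), a common approach is a two-sided z-test with a pooled estimate of the proportion. The sketch below assumes that approach; it is one reasonable reading of the question rather than the only valid method.

```python
import math
from statistics import NormalDist

n_kutz, p_kutz = 150, 0.28
n_styler, p_styler = 250, 0.34

# Pooled proportion under H0: the two population proportions are equal.
pooled = (n_kutz * p_kutz + n_styler * p_styler) / (n_kutz + n_styler)
se = math.sqrt(pooled * (1 - pooled) * (1 / n_kutz + 1 / n_styler))

z = (p_kutz - p_styler) / se
z_crit = NormalDist().inv_cdf(0.975)  # two-sided 5%
reject_h0 = abs(z) > z_crit

print(f"pooled = {pooled:.4f}, z = {z:.3f}, reject H0: {reject_h0}")
```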
AQA S3 2007 June Q4
6 marks Standard +0.8
4 A machine is used to fill 5-litre plastic containers with vinegar. The volume, in litres, of vinegar in a container filled by the machine may be assumed to be normally distributed with mean \(\mu\) and standard deviation 0.08. A quality control inspector requires a \(99 \%\) confidence interval for \(\mu\) to be constructed such that it has a width of at most 0.05 litres. Calculate, to the nearest 5, the sample size necessary in order to achieve the inspector's requirement.
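The width of a \(99 \%\) interval with known \(\sigma\) is \(2 z \sigma / \sqrt { n }\), so the requirement becomes \(n \geqslant ( 2 z \sigma / 0.05 ) ^ { 2 }\). A minimal sketch of that arithmetic:

```python
import math
from statistics import NormalDist

sigma = 0.08
max_width = 0.05
z = NormalDist().inv_cdf(0.995)  # two-sided 99%

# Smallest integer n with 2 * z * sigma / sqrt(n) <= max_width.
n_min = math.ceil((2 * z * sigma / max_width) ** 2)
# Rounded to the nearest 5, as the question asks.
n_nearest_5 = 5 * round(n_min / 5)

print(n_min, n_nearest_5)
```

Here rounding to the nearest 5 happens to move upwards, so the rounded sample size still satisfies the width requirement.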
AQA S3 2007 June Q5
7 marks Standard +0.3
5 The duration, \(X\) minutes, of a timetabled 1-hour lesson may be assumed to be normally distributed with mean 54 and standard deviation 2. The duration, \(Y\) minutes, of a timetabled \(1 \frac { 1 } { 2 }\)-hour lesson may be assumed to be normally distributed with mean 83 and standard deviation 3. Assuming the durations of lessons to be independent, determine the probability that the total duration of a random sample of three 1-hour lessons is less than the total duration of a random sample of two \(1 \frac { 1 } { 2 }\)-hour lessons.
(7 marks)
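Because the lessons are independent, the difference \(D = ( X _ { 1 } + X _ { 2 } + X _ { 3 } ) - ( Y _ { 1 } + Y _ { 2 } )\) is normal with mean \(3 \times 54 - 2 \times 83\) and variance \(3 \times 2 ^ { 2 } + 2 \times 3 ^ { 2 }\). The sketch below evaluates \(\mathrm { P } ( D < 0 )\); the symbol \(D\) is introduced here for illustration.

```python
import math
from statistics import NormalDist

# Sum of three independent 1-hour lessons minus sum of two 1.5-hour lessons.
mean_d = 3 * 54 - 2 * 83
var_d = 3 * 2 ** 2 + 2 * 3 ** 2

d = NormalDist(mu=mean_d, sigma=math.sqrt(var_d))
p = d.cdf(0)  # P(D < 0)

print(mean_d, var_d, round(p, 4))
```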
AQA S3 2007 June Q6
20 marks Standard +0.3
6
  1. The random variable \(X\) has a binomial distribution with parameters \(n\) and \(p\).
    1. Prove that \(\mathrm { E } ( X ) = n p\).
    2. Given that \(\mathrm { E } \left( X ^ { 2 } \right) - \mathrm { E } ( X ) = n ( n - 1 ) p ^ { 2 }\), show that \(\operatorname { Var } ( X ) = n p ( 1 - p )\).
    3. Given that \(X\) is found to have a mean of 3 and a variance of 2.97, find values for \(n\) and \(p\).
    4. Hence use a distributional approximation to estimate \(\mathrm { P } ( X > 2 )\).
  2. Dressher is a nationwide chain of stores selling women's clothes. It claims that the probability that a customer who buys clothes from its stores uses a Dressher store card is 0.45. Assuming this claim to be correct, use a distributional approximation to estimate the probability that, in a random sample of 500 customers who buy clothes from Dressher stores, at least half of them use a Dressher store card.
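The numerical parts can be checked as follows. The sketch assumes a Poisson approximation for the first binomial (large \(n\), small \(p\)) and a normal approximation with continuity correction for the second; these choices are the usual ones but are assumptions of this sketch.

```python
import math
from statistics import NormalDist

# Mean np = 3 and variance np(1 - p) = 2.97 give p and n.
mean, var = 3, 2.97
p = 1 - var / mean
n = round(mean / p)

# Poisson(3) approximation for P(X > 2).
lam = mean
p_gt_2 = 1 - sum(math.exp(-lam) * lam ** k / math.factorial(k) for k in range(3))

# Store card: B(500, 0.45) approximated by a normal distribution.
n_cust, p_card = 500, 0.45
approx = NormalDist(mu=n_cust * p_card,
                    sigma=math.sqrt(n_cust * p_card * (1 - p_card)))
p_at_least_half = 1 - approx.cdf(249.5)  # P(X >= 250) with continuity correction

print(n, round(p, 4), round(p_gt_2, 4), round(p_at_least_half, 4))
```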
AQA S3 2007 June Q7
12 marks Standard +0.8
7 In a town, the total number, \(R\), of houses sold during a week by estate agents may be modelled by a Poisson distribution with a mean of 13. A new housing development is completed in the town. During the first week in which houses on this development are offered for sale by the developer, the estate agents sell a total of 10 houses.
  1. Using the \(10 \%\) level of significance, investigate whether the offer for sale of houses by the developer has resulted in a reduction in the mean value of \(R\).
  2. Determine, for your test in part (a), the critical region for \(R\).
  3. Assuming that the offer for sale of houses on the new housing development has reduced the mean value of \(R\) to 6.5, determine, for a test at the 10\% level of significance, the probability of a Type II error.
    (4 marks)
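A sketch of the three calculations using exact Poisson probabilities. The helper `poisson_cdf` is defined here for illustration and is not part of any library.

```python
import math

def poisson_cdf(k: int, lam: float) -> float:
    """Return P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k + 1))

ALPHA = 0.10

# Lower-tail test of H0: mean = 13 against H1: mean < 13, with 10 observed.
p_value = poisson_cdf(10, 13)
reject_h0 = p_value < ALPHA

# Critical region R <= c: the largest c with P(R <= c | mean 13) <= 10%.
c = max(k for k in range(14) if poisson_cdf(k, 13) <= ALPHA)

# Type II error: H0 is not rejected (R > c) when the mean is really 6.5.
type_ii = 1 - poisson_cdf(c, 6.5)

print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
print(f"critical region: R <= {c}, P(Type II) = {type_ii:.4f}")
```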
OCR MEI S3 Q2
20 marks Standard +0.3
2 Geoffrey is a university lecturer. He has to prepare five questions for an examination. He knows by experience that it takes about 3 hours to prepare a question, and he models the time (in minutes) taken to prepare one by the Normally distributed random variable \(X\) with mean 180 and standard deviation 12, independently for all questions.
  1. One morning, Geoffrey has a gap of 2 hours 50 minutes (170 minutes) between other activities. Find the probability that he can prepare a question in this time.
  2. One weekend, Geoffrey can devote 14 hours to preparing the complete examination paper. Find the probability that he can prepare all five questions in this time.
A colleague, Helen, has to check the questions.
  3. She models the time (in minutes) to check a question by the Normally distributed random variable \(Y\) with mean 50 and standard deviation 6, independently for all questions and independently of \(X\). Find the probability that the total time for Geoffrey to prepare a question and Helen to check it exceeds 4 hours.
  4. When working under pressure of deadlines, Helen models the time to check a question in a different way. She uses the Normally distributed random variable \(\frac { 1 } { 4 } X\), where \(X\) is as above. Find the length of time, as given by this model, which Helen needs to ensure that, with probability 0.9, she has time to check a question.
Ian, an educational researcher, suggests that a better model for the time taken to prepare a question would be a constant \(k\) representing "thinking time" plus a random variable \(T\) representing the time required to write the question itself, independently for all questions.
  5. Taking \(k\) as 45 and \(T\) as Normally distributed with mean 120 and standard deviation 10 (all units are minutes), find the probability according to Ian's model that a question can be prepared in less than 2 hours 30 minutes.
Juliet, an administrator, proposes that the examination should be reduced in time and shorter questions should be used.
  6. Juliet suggests that Ian's model should be used for the time taken to prepare such shorter questions but with \(k = 30\) and \(T\) replaced by \(\frac { 3 } { 5 } T\). Find the probability as given by this model that a question can be prepared in less than \(1 \frac { 3 } { 4 }\) hours.
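Each of the six parts reduces to a single Normal distribution once means and variances are combined. The sketch below sets up each distribution in turn; all variable names are illustrative.

```python
import math
from statistics import NormalDist

x = NormalDist(mu=180, sigma=12)  # time to prepare one question (minutes)

# 1. One question within 170 minutes.
p1 = x.cdf(170)

# 2. Five independent questions within 14 hours: total ~ N(900, 5 * 12^2).
total5 = NormalDist(mu=5 * 180, sigma=12 * math.sqrt(5))
p2 = total5.cdf(14 * 60)

# 3. Prepare plus check exceeds 4 hours: X + Y ~ N(230, 12^2 + 6^2).
prep_check = NormalDist(mu=180 + 50, sigma=math.hypot(12, 6))
p3 = 1 - prep_check.cdf(240)

# 4. X/4 ~ N(45, 3^2): time that is sufficient with probability 0.9.
t4 = NormalDist(mu=180 / 4, sigma=12 / 4).inv_cdf(0.9)

# 5. Ian's model: 45 + T with T ~ N(120, 10^2), under 150 minutes.
p5 = NormalDist(mu=45 + 120, sigma=10).cdf(150)

# 6. Juliet's model: 30 + 0.6 T ~ N(102, 6^2), under 105 minutes.
p6 = NormalDist(mu=30 + 0.6 * 120, sigma=0.6 * 10).cdf(105)

for label, value in [("1", p1), ("2", p2), ("3", p3), ("4", t4), ("5", p5), ("6", p6)]:
    print(label, round(value, 4))
```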
OCR MEI S3 Q4
20 marks Standard +0.3
4 Quality control inspectors in a factory are investigating the lengths of glass tubes that will be used to make laboratory equipment.
  1. Data on the observed lengths of a random sample of 200 glass tubes from one batch are available in the form of a frequency distribution as follows.
  2. Use a suitable statistical procedure to assess the goodness of fit of \(X\) to these data. Discuss your conclusions briefly.
2 A bus route runs from the centre of town A through the town's urban area to a point B on its boundary and then through the country to a small town C. Because of traffic congestion and general road conditions, delays occur on both the urban and the country sections. All delays may be considered independent. The scheduled time for the journey from A to B is 24 minutes. In fact, journey times over this section are given by the Normally distributed random variable \(X\) with mean 26 minutes and standard deviation 3 minutes. The scheduled time for the journey from B to C is 18 minutes. In fact, journey times over this section are given by the Normally distributed random variable \(Y\) with mean 15 minutes and standard deviation 2 minutes. Journey times on the two sections of route may be considered independent. The timetable published to the public does not show details of times at intermediate points; thus, if a bus is running early, it merely continues on its journey and is not required to wait.
  3. Find the probability that a journey from A to B is completed in less than the scheduled time of 24 minutes.
  4. Find the probability that a journey from A to C is completed in less than the scheduled time of 42 minutes.
  5. It is proposed to introduce a system of bus lanes in the urban area. It is believed that this would mean that the journey time from A to B would be given by the random variable \(0.85 X\). Assuming this to be the case, find the probability that a journey from A to B would be completed in less than the currently scheduled time of 24 minutes.
  6. An alternative proposal is to introduce an express service. This would leave out some bus stops on both sections of the route and its overall journey time from A to C would be given by the random variable \(0.9 X + 0.8 Y\). The scheduled time from A to C is to be given as a whole number of minutes. Find the least possible scheduled time such that, with probability 0.75 , buses would complete the journey on time or early.
  7. A programme of minor road improvements is undertaken on the country section. After their completion, it is thought that the random variable giving the journey time from B to C is still Normally distributed with standard deviation 2 minutes. A random sample of 15 journeys is found to have a sample mean journey time from B to C of 13.4 minutes. Provide a two-sided \(95 \%\) confidence interval for the population mean journey time from B to C.
3 An employer has commissioned an opinion polling organisation to undertake a survey of the attitudes of staff to proposed changes in the pension scheme. The staff are categorised as management, professional and administrative, and it is thought that there might be considerable differences of opinion between the categories. There are 60, 140 and 300 staff respectively in the categories. The budget for the survey allows for a sample of 40 members of staff to be selected for in-depth interviews.
  8. Explain why it would be unwise to select a simple random sample from all the staff.
  9. Discuss whether it would be sensible to consider systematic sampling.
  10. What are the advantages of stratified sampling in this situation?
  11. State the sample sizes in each category if stratified sampling with as nearly as possible proportional allocation is used.
The opinion polling organisation needs to estimate the average wealth of staff in the categories, in terms of property, savings, investments and so on. In a random sample of 11 professional staff, the sample mean is \(\pounds 345818\) and the sample standard deviation is \(\pounds 69241\).
  12. Assuming the underlying population is Normally distributed, test at the \(5 \%\) level of significance the null hypothesis that the population mean is \(\pounds 300000\) against the alternative hypothesis that it is greater than \(\pounds 300000\). Provide also a two-sided \(95 \%\) confidence interval for the population mean.
    [10]
4 A company has many factories. It is concerned about incidents of trespassing and, in the hope of reducing if not eliminating these, has embarked on a programme of installing new fencing.
  13. Records for a random sample of 9 factories of the numbers of trespass incidents in typical weeks before and after installation of the new fencing are as follows.
  14. Find the probability that, on a randomly chosen visit, it takes less than 50 minutes to mow the lawns.
  15. Find the probability that, on a randomly chosen visit, the total time for hoeing and pruning is less than 50 minutes.
  16. If Bill mows the lawns while Ben does the hoeing and pruning, find the probability that, on a randomly chosen visit, Ben finishes first.
Bill and Ben do my gardening twice a month and send me an invoice at the end of the month.
  17. Write down the mean and variance of the total time (in minutes) they spend on mowing, hoeing and pruning per month.
  18. The company charges for the total time spent at 15 pence per minute. There is also a fixed charge of \(\pounds 10\) per month. Find the probability that the total charge for a month does not exceed \(\pounds 40\).
4 (a) An amateur weather forecaster has been keeping records of air pressure, measured in atmospheres. She takes the measurement at the same time every day using a barometer situated in her garden. A random sample of 100 of her observations is summarised in the table below. The corresponding expected frequencies for a Normal distribution, with its two parameters estimated by sample statistics, are also shown in the table.
  19. Find the probability that the weekly takings for coaches are less than \(\pounds 40000\).
  20. Find the probability that the weekly takings for lorries exceed the weekly takings for cars.
  21. Find the probability that over a 4-week period the total takings for cars exceed \(\pounds 225000\). What assumption must be made about the four weeks?
  22. Each week the operator allocates part of the takings for repairs. This is determined for each type of vehicle according to estimates of the long-term damage caused. It is calculated as follows: \(5 \%\) of takings for cars, \(10 \%\) for coaches and \(20 \%\) for lorries. Find the probability that in any given week the total amount allocated for repairs will exceed \(\pounds 20000\).
3 The management of a large chain of shops aims to reduce the level of absenteeism among its workforce by means of an incentive bonus scheme. In order to evaluate the effectiveness of the scheme, the management measures the percentage of working days lost before and after its introduction for each of a random sample of 11 shops. The results are shown below.
  23. Give three reasons why a \(t\) test would be appropriate.
  24. Carry out the test using a \(5 \%\) significance level. State your hypotheses and conclusion carefully.
  25. Find a 95\% confidence interval for the true mean temperature in the reaction chamber.
  26. Describe briefly one advantage and one disadvantage of having a 99\% confidence interval instead of a 95\% confidence interval.
4 (a) In Germany, towards the end of the nineteenth century, a study was undertaken into the distribution of the sexes in families of various sizes. The table shows some data about the numbers of girls in 500 families, each with 5 children. It is thought that the binomial distribution \(\mathrm { B } ( 5 , p )\) should model these data.
  27. The grower intends to perform a \(t\) test to examine whether there is any difference in the mean yield of the two types of plant. State the hypotheses he should use and also any necessary assumption.
  28. Carry out the test using a \(5 \%\) significance level.
    (b) The tea grower deals with many types of tea and employs tasters to rate them. The tasters do this by giving each tea a score out of 100. The tea grower wishes to compare the scores given by two of the tasters. Their scores for a random selection of 10 teas are as follows.
    A Wilcoxon signed rank test is to be used to decide whether there is any evidence of a preference for one of the uniforms.
  29. Explain why this test is appropriate in these circumstances and state the hypotheses that should be used.
  30. Carry out the test at the \(5 \%\) significance level.
4 A random variable \(X\) has probability density function \(\mathrm { f } ( x ) = \frac { 2 x } { \lambda ^ { 2 } }\) for \(0 < x < \lambda\), where \(\lambda\) is a positive constant.
  31. Show that, for any value of \(\lambda , \mathrm { f } ( x )\) is a valid probability density function.
  32. Find \(\mu\), the mean value of \(X\), in terms of \(\lambda\) and show that \(\mathrm { P } ( X < \mu )\) does not depend on \(\lambda\).
  33. Given that \(\mathrm { E } \left( X ^ { 2 } \right) = \frac { \lambda ^ { 2 } } { 2 }\), find \(\sigma ^ { 2 }\), the variance of \(X\), in terms of \(\lambda\).
The random variable \(X\) is used to model the depth of the space left by the filling machine at the top of a jar of jam. The model gives the following probabilities for \(X\) (whatever the value of \(\lambda\)).
  34. Initially it is assumed that the value of \(p\) is \(\frac { 1 } { 2 }\). Test at the \(5 \%\) level of significance whether it is reasonable to suppose that the model applies with \(p = \frac { 1 } { 2 }\).
  35. The model is refined by estimating \(p\) from the data. Find the mean of the observed data and hence an estimate of \(p\).
  36. Using the estimated value of \(p\), the value of the test statistic \(X ^ { 2 }\) turns out to be 2.3857 . Is it reasonable to suppose, at the \(5 \%\) level of significance, that this refined model applies?
  37. Discuss the reasons for the different outcomes of the tests in parts (i) and (iii).
2 (a) A continuous random variable, \(X\), has probability density function $$f ( x ) = \begin{cases} \frac { 1 } { 72 } \left( 8 x - x ^ { 2 } \right) & 2 \leqslant x \leqslant 8 \\ 0 & \text { otherwise } \end{cases}$$
  38. Find \(\mathrm { F } ( x )\), the cumulative distribution function of \(X\).
  39. Sketch \(\mathrm { F } ( x )\).
  40. The median of \(X\) is \(m\). Show that \(m\) satisfies the equation \(m ^ { 3 } - 12 m ^ { 2 } + 148 = 0\). Verify that \(m \approx 4.42\).
    (b) The random variable in part (a) is thought to model the weights, in kilograms, of lambs at birth. The birth weights, in kilograms, of a random sample of 12 lambs, given in ascending order, are as follows. $$\begin{array} { l l l l l l l l l l l l } 3.16 & 3.62 & 3.80 & 3.90 & 4.02 & 4.72 & 5.14 & 6.36 & 6.50 & 6.58 & 6.68 & 6.78 \end{array}$$ Test at the 5\% level of significance whether a median of 4.42 is consistent with these data.
3 Cholesterol is a lipid (fat) which is manufactured by the liver from the fatty foods that we eat. It plays a vital part in allowing the body to function normally. However, when high levels of cholesterol are present in the blood there is a risk of arterial disease. Among the factors believed to assist with achieving and maintaining low cholesterol levels are weight loss and exercise. A doctor wishes to test the effectiveness of exercise in lowering cholesterol levels. For a random sample of 12 of her patients, she measures their cholesterol levels before and after they have followed a programme of exercise. The measurements obtained are as follows.
This sample is to be tested to see whether the campaign appears to have been successful in raising the percentage receiving the booster.
  41. Explain why the use of paired data is appropriate in this context.
  42. Carry out an appropriate Wilcoxon signed rank test using these data, at the \(5 \%\) significance level.
    (b) Benford's Law predicts the following probability distribution for the first significant digit in some large data sets.
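    $$\begin{array} { l | c c c c c c c c c } \text { Digit } & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ \hline \text { Probability } & 0.301 & 0.176 & 0.125 & 0.097 & 0.079 & 0.067 & 0.058 & 0.051 & 0.046 \end{array}$$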
    On one particular day, the first significant digits of the stock market prices of the shares of a random sample of 200 companies gave the following results.
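    $$\begin{array} { l | c c c c c c c c c } \text { Digit } & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ \hline \text { Frequency } & 55 & 34 & 27 & 16 & 15 & 17 & 12 & 15 & 9 \end{array}$$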
    Test at the \(10 \%\) level of significance whether Benford's Law provides a reasonable model in the context of share prices.
4 A random variable \(X\) has an exponential distribution with probability density function \(\mathrm { f } ( x ) = \lambda \mathrm { e } ^ { - \lambda x }\) for \(x \geqslant 0\), where \(\lambda\) is a positive constant.
  43. Verify that \(\int _ { 0 } ^ { \infty } \mathrm { f } ( x ) \mathrm { d } x = 1\) and sketch \(\mathrm { f } ( x )\).
  44. In this part of the question you may use the following result. $$\int _ { 0 } ^ { \infty } x ^ { r } \mathrm { e } ^ { - \lambda x } \mathrm {~d} x = \frac { r ! } { \lambda ^ { r + 1 } } \quad \text { for } r = 0,1,2 , \ldots$$ Derive the mean and variance of \(X\) in terms of \(\lambda\).
The random variable \(X\) is used to model the lifetime, in years, of a particular type of domestic appliance. The manufacturer of the appliance states that, based on past experience, the mean lifetime is 6 years.
  45. Let \(\bar { X }\) denote the mean lifetime, in years, of a random sample of 50 appliances. Write down an approximate distribution for \(\bar { X }\).
  46. A random sample of 50 appliances is found to have a mean lifetime of 7.8 years. Does this cast any doubt on the model?
Edexcel S4 Q1
6 marks Standard +0.3
  1. A beach is divided into two areas \(A\) and \(B\). A random sample of pebbles is taken from each of the two areas and the length of each pebble is measured. A sample of size 26 is taken from area \(A\) and the unbiased estimate for the population variance is \(s _ { A } ^ { 2 } = 0.495 \mathrm {~mm} ^ { 2 }\). A sample of size 25 is taken from area \(B\) and the unbiased estimate for the population variance is \(s _ { B } ^ { 2 } = 1.04 \mathrm {~mm} ^ { 2 }\).
    1. Stating your hypotheses clearly, test, at the \(10 \%\) significance level, whether or not there is a difference in variability of pebble length between area \(A\) and area \(B\).
    2. State the assumption you have made about the populations of pebble lengths in order to carry out the test.
      (1)
    2. A random sample of 10 mustard plants had the following heights, in mm, after 4 days' growth.
    $$5.0, \quad 4.5, \quad 4.8, \quad 5.2, \quad 4.3, \quad 5.1, \quad 5.2, \quad 4.9, \quad 5.1, \quad 5.0$$ Those grown previously had a mean height of 5.1 mm after 4 days. Using a \(2.5 \%\) significance level, test whether or not the mean height of these plants is less than that of those grown previously.
    (You may assume that the height of mustard plants after 4 days follows a normal distribution.)
    (9)
    3. A train company claims that the probability \(p\) of one of its trains arriving late is \(10 \%\). A regular traveller on the company's trains believes that the probability is greater than \(10 \%\) and decides to test this by randomly selecting 12 trains and recording the number \(X\) of trains that were late. The traveller sets up the hypotheses \(\mathrm { H } _ { 0 } : p = 0.1\) and \(\mathrm { H } _ { 1 } : p > 0.1\) and accepts the null hypothesis if \(x \leq 2\).
  2. Find the size of the test.
  3. Show that the power function of the test is $$1 - ( 1 - p ) ^ { 10 } \left( 1 + 10 p + 55 p ^ { 2 } \right)$$
  4. Calculate the power of the test when
    1. \(p = 0.2\),
    2. \(p = 0.6\).
  5. Comment on your results from part (c).
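The quoted power function can be checked numerically against the binomial definition, since the test rejects \(\mathrm { H } _ { 0 }\) when \(X \geq 3\) for \(X \sim \mathrm { B } ( 12 , p )\). The function names below are illustrative.

```python
from math import comb

def power(p: float, n: int = 12, accept_max: int = 2) -> float:
    """P(reject H0) = 1 - P(X <= accept_max) for X ~ B(n, p)."""
    return 1 - sum(comb(n, k) * p ** k * (1 - p) ** (n - k)
                   for k in range(accept_max + 1))

def closed_form(p: float) -> float:
    """Power function as stated in the question."""
    return 1 - (1 - p) ** 10 * (1 + 10 * p + 55 * p ** 2)

size = power(0.1)  # size = probability of rejecting H0 when p = 0.1

print(f"size = {size:.4f}")
for p in (0.2, 0.6):
    print(f"power at p = {p}: {power(p):.4f} (closed form {closed_form(p):.4f})")
```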
    4. A random sample of 15 tomatoes is taken and the weight \(x\) grams of each tomato is found. The results are summarised by \(\sum x = 208\) and \(\sum x ^ { 2 } = 2962\).
  6. Assuming that the weights of the tomatoes are normally distributed, calculate the \(90 \%\) confidence interval for the variance \(\sigma ^ { 2 }\) of the weights of the tomatoes.
  7. State with a reason whether or not the confidence interval supports the assertion \(\sigma ^ { 2 } = 3\).
    5. (a) Define
    1. a Type I error,
    2. a Type II error.
    A small aviary, that leaves the eggs with the parent birds, rears chicks at an average rate of 5 per year. In order to increase the number of chicks reared per year it is decided to remove the eggs from the aviary as soon as they are laid and put them in an incubator. At the end of the first year of using an incubator 7 chicks had been successfully reared.
  8. Assuming that the number of chicks reared per year follows a Poisson distribution, test, at the \(5 \%\) significance level, whether or not there is evidence of an increase in the number of chicks reared per year. State your hypotheses clearly.
  9. Calculate the probability of the Type I error for this test.
  10. Given that the true average number of chicks reared per year when the eggs are hatched in an incubator is 8, calculate the probability of a Type II error.
    6. A random sample of three independent variables \(X _ { 1 } , X _ { 2 }\) and \(X _ { 3 }\) is taken from a distribution with mean \(\mu\) and variance \(\sigma ^ { 2 }\).
  11. Show that \(\frac { 2 } { 3 } X _ { 1 } - \frac { 1 } { 2 } X _ { 2 } + \frac { 5 } { 6 } X _ { 3 }\) is an unbiased estimator for \(\mu\).
    An unbiased estimator for \(\mu\) is given by \(\hat { \mu } = a X _ { 1 } + b X _ { 2 }\) where \(a\) and \(b\) are constants.
  12. Show that \(\operatorname { Var } ( \hat { \mu } ) = \left( 2 a ^ { 2 } - 2 a + 1 \right) \sigma ^ { 2 }\).
  13. Hence determine the value of \(a\) and the value of \(b\) for which \(\hat { \mu }\) has minimum variance.
    7. Two methods of extracting juice from an orange are to be compared. Eight oranges are halved. One half of each orange is chosen at random and allocated to Method \(A\) and the other half is allocated to Method \(B\). The amounts of juice extracted, in ml, are given in the table.
    $$\begin{array} { l | c c c c c c c c } \text { Orange } & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \hline \text { Method } A & 29 & 30 & 26 & 25 & 26 & 22 & 23 & 28 \\ \text { Method } B & 27 & 25 & 28 & 24 & 23 & 26 & 22 & 25 \end{array}$$
    One statistician suggests performing a two-sample \(t\)-test to investigate whether or not there is a difference between the mean amounts of juice extracted by the two methods.
  14. Stating your hypotheses clearly and using a \(5 \%\) significance level, carry out this test.
    (You may assume \(\bar { x } _ { A } = 26.125 , s _ { A } ^ { 2 } = 7.84 , \bar { x } _ { B } = 25 , s _ { B } ^ { 2 } = 4\) and \(\sigma _ { A } ^ { 2 } = \sigma _ { B } ^ { 2 }\).)
    Another statistician suggests analysing these data using a paired \(t\)-test.
  15. Using a \(5 \%\) significance level, carry out this test.
  16. State which of these two tests you consider to be more appropriate. Give a reason for your choice.
    (1) \section*{END} \section*{Advanced/Advanced Subsidiary} Wednesday 16 June 2004 - Afternoon Time: \(\mathbf { 1 }\) hour \(\mathbf { 3 0 }\) minutes Answer Book (AB16)
    Nil
    Graph Paper (ASG2)
    Mathematical Formulae (Lilac) Candidates may use any calculator EXCEPT those with the facility for symbolic algebra, differentiation and/or integration. Thus candidates may NOT use calculators such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP 48G. In the boxes on the answer book, write the name of the examining body (Edexcel), your centre number, candidate number, the unit title (Statistics S4), the paper reference (6686), your surname, other name and signature.
    Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.
    Full marks may be obtained for answers to ALL questions.
    This paper has seven questions. You must ensure that your answers to parts of questions are clearly labelled.
    You must show sufficient working to make your methods clear to the Examiner. Answers without working may gain no credit.
    1. The random variable \(X\) has an \(F\)-distribution with 8 and 12 degrees of freedom.
    Find \(\mathrm { P } \left( \frac { 1 } { 5.67 } < X < 2.85 \right)\).
Edexcel S4 Q3
10 marks Challenging +1.2
3. A train company claims that the probability \(p\) of one of its trains arriving late is \(10 \%\). A regular traveller on the company's trains believes that the probability is greater than \(10 \%\) and decides to test this by randomly selecting 12 trains and recording the number \(X\) of trains that were late. The traveller sets up the hypotheses \(\mathrm { H } _ { 0 } : p = 0.1\) and \(\mathrm { H } _ { 1 } : p > 0.1\) and accepts the null hypothesis if \(x \leq 2\).
  1. Find the size of the test.
  2. Show that the power function of the test is $$1 - ( 1 - p ) ^ { 10 } \left( 1 + 10 p + 55 p ^ { 2 } \right)$$
  3. Calculate the power of the test when
    1. \(p = 0.2\),
    2. \(p = 0.6\).
  4. Comment on your results from part (c).
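As a numerical cross-check on parts (a) to (c), the sketch below compares the power function quoted in part (b) with a direct binomial tail sum for 12 trains, where \(\mathrm { H } _ { 0 }\) is rejected when \(x \geq 3\). It uses only the Python standard library; the function and variable names are illustrative and are not part of the question.

```python
from math import comb

N_TRAINS = 12    # sample size given in the question
ACCEPT_MAX = 2   # H0 is accepted when x <= 2


def power_direct(p: float) -> float:
    """P(X >= 3) for X ~ B(12, p), found by summing the acceptance region."""
    accept = sum(
        comb(N_TRAINS, k) * p**k * (1 - p) ** (N_TRAINS - k)
        for k in range(ACCEPT_MAX + 1)
    )
    return 1 - accept


def power_closed_form(p: float) -> float:
    """The power function as quoted in part (b)."""
    return 1 - (1 - p) ** 10 * (1 + 10 * p + 55 * p**2)


# The size of the test is the power evaluated at the H0 value p = 0.1.
size = power_direct(0.1)
power_02 = power_closed_form(0.2)
power_06 = power_closed_form(0.6)
```

The two functions should agree for every \(p\) in \([0, 1]\), which is what the algebra in part (b) establishes.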
Edexcel S4 Q5
12 marks Standard +0.3
5. (a) Define
  1. a Type I error,
  2. a Type II error.
    A small aviary, that leaves the eggs with the parent birds, rears chicks at an average rate of 5 per year. In order to increase the number of chicks reared per year it is decided to remove the eggs from the aviary as soon as they are laid and put them in an incubator. At the end of the first year of using an incubator 7 chicks had been successfully reared.
    (b) Assuming that the number of chicks reared per year follows a Poisson distribution test, at the \(5 \%\) significance level, whether or not there is evidence of an increase in the number of chicks reared per year. State your hypotheses clearly.
    (c) Calculate the probability of the Type I error for this test.
    (d) Given that the true average number of chicks reared per year when the eggs are hatched in an incubator is 8, calculate the probability of a Type II error.
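Parts (b) to (d) can be explored numerically with the sketch below, assuming the hypotheses \(\mathrm { H } _ { 0 } : \lambda = 5\) and \(\mathrm { H } _ { 1 } : \lambda > 5\) and an upper-tail critical region. The helper names `poisson_cdf` and `critical_value` are hypothetical and exist only for this illustration.

```python
from math import exp, factorial


def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Po(lam); an empty sum (k < 0) gives 0."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))


def critical_value(lam0: float, alpha: float) -> int:
    """Smallest c such that P(X >= c | lam0) <= alpha."""
    c = 0
    while 1 - poisson_cdf(c - 1, lam0) > alpha:
        c += 1
    return c


c = critical_value(5, 0.05)            # critical region is X >= c
observed_in_region = 7 >= c            # 7 chicks were reared
type_1 = 1 - poisson_cdf(c - 1, 5)     # P(reject H0 | lambda = 5)
type_2 = poisson_cdf(c - 1, 8)         # P(do not reject H0 | lambda = 8)
```

Because the Poisson distribution is discrete, the Type I error probability is the actual tail probability of the critical region, which is in general smaller than the nominal \(5 \%\).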
Edexcel S4 Q7
16 marks Standard +0.3
7. Two methods of extracting juice from an orange are to be compared. Eight oranges are halved. One half of each orange is chosen at random and allocated to Method \(A\) and the other half is allocated to Method \(B\). The amounts of juice extracted, in ml , are given in the table. The lengths of components produced by the machines can be assumed to follow normal distributions.
  1. Use a two tail test to show, at the \(10 \%\) significance level, that the variances of the lengths of components produced by each machine can be assumed to be equal.
  2. Showing your working clearly, find a \(95 \%\) confidence interval for \(\mu _ { B } - \mu _ { A }\), where \(\mu _ { A }\) and \(\mu _ { B }\) are the mean lengths of the populations of components produced by machine \(A\) and machine \(B\) respectively. There are serious consequences for the production at the factory if the difference in mean lengths of the components produced by the two machines is more than 0.7 cm .
  3. State, giving your reason, whether or not the factory manager should be concerned.
    5. Rolls of cloth delivered to a factory contain defects at an average rate of \(\lambda\) per metre. A quality assurance manager selects a random sample of 15 metres of cloth from each delivery to test whether or not there is evidence that \(\lambda > 0.3\). The criterion that the manager uses for rejecting the hypothesis that \(\lambda = 0.3\) is that there are 9 or more defects in the sample.
  4. Find the size of the test. Table 1 gives some values, to 2 decimal places, of the power function of this test.
  5. Use a paired \(t\)-test to determine, at the \(10 \%\) level of significance, whether or not there is a difference in the mean blood pressure measured using the two methods. State your hypotheses clearly.
  6. State an assumption about the underlying distribution of measured blood pressure required for this test.
    2. The value of orders, in \(\pounds\), made to a firm over the internet has distribution \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\). A random sample of \(n\) orders is taken and \(\bar { X }\) denotes the sample mean.
  7. Write down the mean and variance of \(\bar { X }\) in terms of \(\mu\) and \(\sigma ^ { 2 }\). A second sample of \(m\) orders is taken and \(\bar { Y }\) denotes the mean of this sample.
    An estimator of the population mean is given by $$U = \frac { n \bar { X } + m \bar { Y } } { n + m }$$
  8. Show that \(U\) is an unbiased estimator for \(\mu\).
  9. Show that the variance of \(U\) is \(\frac { \sigma ^ { 2 } } { n + m }\).
  10. State which of \(\bar { X }\) or \(U\) is a better estimator for \(\mu\). Give a reason for your answer.
    3. The lengths, \(x \mathrm {~mm}\), of the forewings of a random sample of male and female adult butterflies are measured. The following statistics are obtained from the data.
  11. Stating your hypotheses clearly, and using a \(10 \%\) level of significance, test whether or not there is evidence of a difference between the variances of the marks of the two groups.
  12. State clearly an assumption you have made to enable you to carry out the test in part (a).
  13. Use a two tailed test, with a \(5 \%\) level of significance, to determine if the playing of music during the test has made any difference in the mean marks of the two groups. State your hypotheses clearly.
  14. Write down what you can conclude about the effect of music on a student's performance during the test.
    3. The weights, in grams, of mice are normally distributed. A biologist takes a random sample of 10 mice. She weighs each mouse and records its weight. The ten mice are then fed on a special diet. They are weighed again after two weeks.
    Their weights in grams are as follows:
  15. State an assumption that needs to be made in order to carry out a \(t\)-test in this case.
  16. State why a paired \(t\)-test is suitable for use with these data.
  17. Using a \(5 \%\) level of significance, test whether or not there is evidence that the device reduces \(\mathrm { CO } _ { 2 }\) emissions from cars.
  18. Explain, in context, what a type II error would be in this case.
    3. Define, in terms of \(\mathrm { H } _ { 0 }\) and/or \(\mathrm { H } _ { 1 }\),
  19. the size of a hypothesis test,
  20. the power of a hypothesis test. The probability of getting a head when a coin is tossed is denoted by \(p\). This coin is tossed 12 times in order to test the hypotheses \(\mathrm { H } _ { 0 } : p = 0.5\) against \(\mathrm { H } _ { 1 } : p \neq 0.5\), using a \(5 \%\) level of significance.
  21. Find the largest critical region for this test, such that the probability in each tail is less than \(2.5 \%\).
  22. Given that \(p = 0.4\)
    1. find the probability of a type II error when using this test,
    2. find the power of this test.
  23. Suggest two ways in which the power of the test can be increased.
    4. A farmer set up a trial to assess whether adding water to dry feed increases the milk yield of his cows. He randomly selected 22 cows. Thirteen of the cows were given dry feed and the other 9 cows were given the feed with water added. The milk yields, in litres per day, were recorded with the following results. You may assume that the times taken to complete the task by the students are two independent random samples from normal distributions.
  24. Stating your hypotheses clearly, test, at the \(10 \%\) level of significance, whether or not the variances of the times taken to complete the task with and without background music are equal.
  25. Find a \(99 \%\) confidence interval for the difference in the mean times taken to complete the task with and without background music. Experiments like this are often performed using the same people in each group.
  26. Explain why this would not be appropriate in this case.
    2. As part of an investigation, a random sample of 10 people had their heart rate, in beats per minute, measured whilst standing up and whilst lying down. The results are summarized below. Stating your hypotheses clearly, test, at the \(10 \%\) level of significance, whether or not the mean amount of juice produced by machine \(B\) is more than the mean amount produced by machine \(A\).
    4. A proportion \(p\) of letters sent by a company are incorrectly addressed and if \(p\) is thought to be greater than 0.05 then action is taken. Using \(\mathrm { H } _ { 0 } : p = 0.05\) and \(\mathrm { H } _ { 1 } : p > 0.05\), a manager from the company takes a random sample of 40 letters and rejects \(\mathrm { H } _ { 0 }\) if the number of incorrectly addressed letters is more than 3 .
  27. Find the size of this test.
  28. Find the probability of a Type II error in the case where \(p\) is in fact 0.10 . Table 1 below gives some values, to 2 decimal places, of the power function of this test. The student decides to carry out a paired \(t\)-test to investigate whether, on average, the blood pressure of a person when sitting down is more than their blood pressure after standing up.
  29. State clearly the hypotheses that should be used and any necessary assumption that needs to be made.
  30. Carry out the test at the \(1 \%\) level of significance.
    2. A biologist investigating the shell size of turtles takes random samples of adult female and adult male turtles and records the length, \(x \mathrm {~cm}\), of the shell. The results are summarised below. Assuming that the scores are normally distributed and stating your hypotheses clearly, test at the \(5 \%\) level of significance whether or not there is evidence to support the teacher's belief.
    (8)
    6. A machine fills bottles with water. The amount of water in each bottle is normally distributed. To check the machine is working properly, a random sample of 12 bottles is selected and the amount of water, in ml , in each bottle is recorded. Unbiased estimates for the mean and variance are $$\hat { \mu } = 502 \quad s ^ { 2 } = 5.6$$ Stating your hypotheses clearly, test at the \(1 \%\) level of significance
  31. whether or not the mean amount of water in a bottle is more than 500 ml ,
  32. whether or not the standard deviation of the amount of water in a bottle is less than 3 ml .
    7. A machine produces bricks. The lengths, \(x \mathrm {~mm}\), of the bricks are distributed \(\mathrm { N } \left( \mu , 2 ^ { 2 } \right)\). At the start of each week a random sample of \(n\) bricks is taken to check the machine is working correctly.
    A test is then carried out at the \(1 \%\) level of significance with $$\mathrm { H } _ { 0 } : \mu = 202 \quad \text { and } \quad \mathrm { H } _ { 1 } : \mu < 202$$
  33. Find, in terms of \(n\), the critical region of the test. The probability of a type II error, when \(\mu = 200\), is less than 0.05 .
  34. Find the minimum value of \(n\).
Edexcel S4 Q8
17 marks Challenging +1.2
8. A random sample \(W _ { 1 } , W _ { 2 } \ldots , W _ { n }\) is taken from a distribution with mean \(\mu\) and variance \(\sigma ^ { 2 }\).
  1. Write down \(\mathrm { E } \left( \sum _ { i = 1 } ^ { n } W _ { i } \right)\) and show that \(\mathrm { E } \left( \sum _ { i = 1 } ^ { n } W _ { i } ^ { 2 } \right) = n \left( \sigma ^ { 2 } + \mu ^ { 2 } \right)\).
    An estimator for \(\mu\) is $$\bar { X } = \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i }$$
  2. Show that \(\bar { X }\) is a consistent estimator for \(\mu\).
    An estimator of \(\sigma ^ { 2 }\) is $$U = \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i } ^ { 2 } - \left( \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i } \right) ^ { 2 }$$
  3. Find the bias of \(U\).
  4. Write down an unbiased estimator of \(\sigma ^ { 2 }\) in the form \(k U\), where \(k\) is in terms of \(n\).
    Advanced/Advanced Subsidiary. Friday 21 June 2013 - Morning. Mathematical Formulae (Pink) Nil. Candidates may use any calculator allowed by the regulations of the Joint Council for Qualifications. Calculators must not have the facility for symbolic algebra manipulation or symbolic differentiation/integration, or have retrievable mathematical formulae stored in them. In the boxes above, write your centre number, candidate number, your surname, initials and signature. Check that you have the correct question paper.
    Answer ALL the questions.
    You must write your answer for each question in the space following the question.
    Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.
    Full marks may be obtained for answers to ALL questions.
    The marks for the parts of questions are shown in round brackets, e.g. (2).
    There are 6 questions in this question paper. The total mark for this paper is 75.
    There are 20 pages in this question paper. Any blank pages are indicated. You must ensure that your answers to parts of questions are clearly labelled.
    You must show sufficient working to make your methods clear to the Examiner.
    Answers without working may not gain full credit.
    1. George owns a garage and he records the mileage of cars, \(x\) thousands of miles, between services. The results from a random sample of 10 cars are summarised below.
    $$\sum x = 113.4 \quad \sum x ^ { 2 } = 1414.08$$ The mileage of cars between services is normally distributed and George believes that the standard deviation is 2.4 thousand miles. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not these data support George's belief.
    2. Every 6 months some engineers are tested to see if their times, in minutes, to assemble a particular component have changed. The times taken to assemble the component are normally distributed. A random sample of 8 engineers was chosen and their times to assemble the component were recorded in January and in July. The data are given in the table below.
  5. Using a suitable test, at the \(5 \%\) level of significance, state whether or not, on the basis of this trial, you would recommend using the new medicine. State your hypotheses clearly.
  6. State an assumption needed to carry out this test.
    2. The cloth produced by a certain manufacturer has defects that occur randomly at a constant rate of \(\lambda\) per square metre. If \(\lambda\) is thought to be greater than 1.5 then action has to be taken. Using \(\mathrm { H } _ { 0 } : \lambda = 1.5\) and \(\mathrm { H } _ { 1 } : \lambda > 1.5\) a quality control officer takes a \(4 \mathrm {~m} ^ { 2 }\) sample of cloth and rejects \(\mathrm { H } _ { 0 }\) if there are 11 or more defects. If there are 8 or fewer defects she accepts \(\mathrm { H } _ { 0 }\). If there are 9 or 10 defects a second sample of \(4 \mathrm {~m} ^ { 2 }\) is taken and \(\mathrm { H } _ { 0 }\) is rejected if there are 11 or more defects in this second sample, otherwise it is accepted.
  7. Find the size of this test.
  8. Find the power of this test when \(\lambda = 2\).
    3. A farmer is investigating the milk yields of two breeds of cow. He takes a random sample of 9 cows of breed \(A\) and an independent random sample of 12 cows of breed \(B\). For a 5 day period he measures the amount of milk, \(x\) gallons, produced by each cow. The results are summarised in the table below.
  9. State one assumption that needs to be made in order to carry out a paired \(t\)-test.
  10. Stating your hypotheses clearly, test, at the \(1 \%\) level of significance, whether or not the drug increases the mean number of hours of sleep per night by more than 10 minutes. State the critical value for this test.
    5. A statistician believes a coin is biased and the probability, \(p\), of getting a head when the coin is tossed is less than 0.5 . The statistician decides to test this by tossing the coin 10 times and recording the number, \(X\), of heads. He sets up the hypotheses \(\mathrm { H } _ { 0 } : p = 0.5\) and \(\mathrm { H } _ { 1 } : p < 0.5\) and rejects the null hypothesis if \(x < 3\).
  11. Find the size of the test.
  12. Show that the power function of this test is $$( 1 - p ) ^ { 8 } \left( 36 p ^ { 2 } + 8 p + 1 \right)$$
    Table 1 gives values, to 2 decimal places, of the power function for the statistician's test.
  13. On the axes below draw the graph of the power function for the statistician's test.
  14. Find the range of values of \(p\) for which the probability of accepting the coin as unbiased, when in fact it is biased, is less than or equal to 0.4 .
    (3) [Figure: axes for the graph of the power function]
    6. (a) Explain what is meant by the sampling distribution of an estimator \(T\) of the population parameter \(\theta\).
  15. Explain what you understand by the statement that \(T\) is a biased estimator of \(\theta\). A population has mean \(\mu\) and variance \(\sigma ^ { 2 }\).
    A random sample \(X _ { 1 } , X _ { 2 } , \ldots , X _ { 10 }\) is taken from this population.
  16. Calculate the bias of each of the following estimators of \(\mu\). $$\begin{aligned} & \hat { \mu } _ { 1 } = \frac { X _ { 3 } + X _ { 5 } + X _ { 7 } } { 3 } \\ & \hat { \mu } _ { 2 } = \frac { 5 X _ { 1 } + 2 X _ { 2 } + X _ { 9 } } { 6 } \\ & \hat { \mu } _ { 3 } = \frac { 3 X _ { 10 } - X _ { 1 } } { 3 } \end{aligned}$$
  17. Find the variance of each of these three estimators.
  18. State, giving a reason, which of these three estimators for \(\mu\) is
    1. the best estimator,
    2. the worst estimator.
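For a linear estimator \(\sum c _ { i } X _ { i }\) of independent observations, the bias is \(\left( \sum c _ { i } - 1 \right) \mu\) and the variance is \(\left( \sum c _ { i } ^ { 2 } \right) \sigma ^ { 2 }\). The sketch below applies these two standard results to the three estimators using exact fractions; the dictionary layout and names are assumptions made for the illustration.

```python
from fractions import Fraction as F

# Non-zero coefficients of the X_i in each estimator.
estimators = {
    "mu_hat_1": [F(1, 3), F(1, 3), F(1, 3)],  # (X3 + X5 + X7) / 3
    "mu_hat_2": [F(5, 6), F(2, 6), F(1, 6)],  # (5 X1 + 2 X2 + X9) / 6
    "mu_hat_3": [F(3, 3), F(-1, 3)],          # (3 X10 - X1) / 3
}

summary = {}
for name, coeffs in estimators.items():
    bias_in_mu = sum(coeffs) - 1                # bias as a multiple of mu
    var_in_sigma2 = sum(c * c for c in coeffs)  # variance as a multiple of sigma^2
    summary[name] = (bias_in_mu, var_in_sigma2)
```

Each entry of `summary` is a pair (bias coefficient, variance coefficient), which can then be compared when judging the best and worst estimators.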
      7. Two groups of students take the same examination. A random sample of students is taken from each of the groups.
      The marks of the 9 students from Group 1 are as follows $$\begin{array} { l l l l l l l l l } 30 & 29 & 35 & 27 & 23 & 33 & 33 & 35 & 28 \end{array}$$ The marks, \(x\), of the 7 students from Group 2 gave the following statistics $$\bar { x } = 31.29 \quad s ^ { 2 } = 12.9$$ A test is to be carried out to see whether or not there is a difference between the mean marks of the two groups of students. You may assume that the samples are taken from normally distributed populations and that they are independent.
  19. State one other assumption that must be made in order to apply this test and show that this assumption is reasonable by testing it at a \(10 \%\) level of significance. State your hypotheses clearly.
  20. Stating your hypotheses clearly, test, using a significance level of \(5 \%\), whether or not there is a difference between the mean marks of the two groups of students.
    1. The random variable \(X\) has an \(F\) distribution with 10 and 12 degrees of freedom. Find \(a\) and \(b\) such that \(\mathrm { P } ( a < X < b ) = 0.90\).
    2. A chemist has developed a fuel additive and claims that it reduces the fuel consumption of cars. To test this claim, 8 randomly selected cars were each filled with 20 litres of fuel and driven around a race circuit. Each car was tested twice, once with the additive and once without. The distances, in miles, that each car travelled before running out of fuel are given in the table below.
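    | Car | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    |---|---|---|---|---|---|---|---|---|
    | Distance without additive | 163 | 172 | 195 | 170 | 183 | 185 | 161 | 176 |
    | Distance with additive | 168 | 185 | 187 | 172 | 180 | 189 | 172 | 175 |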
    Assuming that the distances travelled follow a normal distribution and stating your hypotheses clearly test, at the \(10 \%\) level of significance, whether or not there is evidence to support the chemist's claim.
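The paired \(t\) statistic for the fuel additive data can be computed as in the sketch below. It assumes the differences are taken as (with additive) minus (without additive), so that the chemist's claim corresponds to a positive mean difference, and it takes the one-tailed \(10 \%\) critical value for 7 degrees of freedom (1.415) from standard tables. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

without = [163, 172, 195, 170, 183, 185, 161, 176]
with_additive = [168, 185, 187, 172, 180, 189, 172, 175]

# Paired differences, one per car.
d = [w - wo for w, wo in zip(with_additive, without)]
n = len(d)

# t = d_bar / (s_d / sqrt(n)), with s_d the unbiased sample standard deviation.
t_stat = mean(d) / (stdev(d) / sqrt(n))

T_CRIT = 1.415              # t_7 upper 10% point, from statistical tables
reject_h0 = t_stat > T_CRIT
```

Working with the differences removes the car-to-car variation, which is the reason a paired test is used here rather than a two-sample test.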
    3. A technician is trying to estimate the area \(\mu ^ { 2 }\) of a metal square. The independent random variables \(X _ { 1 }\) and \(X _ { 2 }\) are each distributed \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\) and represent two measurements of the sides of the square. Two estimators of the area, \(A _ { 1 }\) and \(A _ { 2 }\), are proposed where $$A _ { 1 } = X _ { 1 } X _ { 2 } \quad \text { and } \quad A _ { 2 } = \left( \frac { X _ { 1 } + X _ { 2 } } { 2 } \right) ^ { 2 } .$$ [You may assume that if \(X _ { 1 }\) and \(X _ { 2 }\) are independent random variables then $$\left. \mathrm { E } \left( X _ { 1 } X _ { 2 } \right) = \mathrm { E } \left( X _ { 1 } \right) \mathrm { E } \left( X _ { 2 } \right) \right]$$
  21. Find \(\mathrm { E } \left( A _ { 1 } \right)\) and show that \(\mathrm { E } \left( A _ { 2 } \right) = \mu ^ { 2 } + \frac { \sigma ^ { 2 } } { 2 }\).
  22. Find the bias of each of these estimators. The technician is told that \(\operatorname { Var } \left( A _ { 1 } \right) = \sigma ^ { 4 } + 2 \mu ^ { 2 } \sigma ^ { 2 }\) and \(\operatorname { Var } \left( A _ { 2 } \right) = \frac { 1 } { 2 } \sigma ^ { 4 } + 2 \mu ^ { 2 } \sigma ^ { 2 }\). The technician decided to use \(A _ { 1 }\) as the estimator for \(\mu ^ { 2 }\).
  23. Suggest a possible reason for this decision. A statistician suggests taking a random sample of \(n\) measurements of sides of the square and finding the mean \(\bar { X }\). He knows that \(\mathrm { E } \left( \bar { X } ^ { 2 } \right) = \mu ^ { 2 } + \frac { \sigma ^ { 2 } } { n }\) and \(\operatorname { Var } \left( \bar { X } ^ { 2 } \right) = \frac { 2 \sigma ^ { 4 } } { n ^ { 2 } } + \frac { 4 \sigma ^ { 2 } \mu ^ { 2 } } { n }\).
  24. Explain whether or not \(\bar { X } ^ { 2 }\) is a consistent estimator of \(\mu ^ { 2 }\).
    4. A recent census in the U.K. revealed that the heights of females in the U.K. have a mean of 160.9 cm . A doctor is studying the heights of female Indians in a remote region of South America. The doctor measured the height, \(x \mathrm {~cm}\), of each of a random sample of 30 female Indians and obtained the following statistics. $$\Sigma x = 4400.7 , \quad \Sigma x ^ { 2 } = 646904.41 .$$ The heights of female Indians may be assumed to follow a normal distribution.
    The doctor presented the results of the study in a medical journal and wrote 'the female Indians in this region are more than 10 cm shorter than females in the U.K.'
  25. Stating your hypotheses clearly and using a \(5 \%\) level of significance, test the doctor's statement. The census also revealed that the standard deviation of the heights of U.K. females was 6.0 cm .
  26. Stating your hypotheses clearly test, at the \(5 \%\) level of significance, whether or not there is evidence that the variance of the heights of female Indians is different from that of females in the U.K.
    5. The times, \(x\) seconds, taken by the competitors in the 100 m freestyle events at a school swimming gala are recorded. The following statistics are obtained from the data.
    | | No. of competitors | Sample Mean \(\bar { x }\) | \(\sum x ^ { 2 }\) |
    |---|---|---|---|
    | Girls | 8 | 83.10 | 55746 |
    | Boys | 7 | 88.90 | 56130 |
    Following the gala a proud parent claims that girls are faster swimmers than boys. Assuming that the times taken by the competitors are two independent random samples from normal distributions,
  27. test, at the \(10 \%\) level of significance, whether or not the variances of the two distributions are the same. State your hypotheses clearly.
  28. Stating your hypotheses clearly, test the parent's claim. Use a \(5 \%\) level of significance.
    6. A nutritionist studied the levels of cholesterol, \(X \mathrm { mg } / \mathrm { cm } ^ { 3 }\), of male students at a large college. She assumed that \(X\) was distributed \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\) and examined a random sample of 25 male students. Using this sample she obtained unbiased estimates of \(\mu\) and \(\sigma ^ { 2 }\) as $$\hat { \mu } = 1.68 , \quad \hat { \sigma } ^ { 2 } = 1.79 .$$
  29. Find a 95\% confidence interval for \(\mu\).
  30. Obtain a \(95 \%\) confidence interval for \(\sigma ^ { 2 }\). A cholesterol reading of more than \(2.5 \mathrm { mg } / \mathrm { cm } ^ { 3 }\) is regarded as high.
  31. Use appropriate confidence limits from parts (a) and (b) to find the lowest estimate of the proportion of male students in the college with high cholesterol.
Edexcel S4 Q3
12 marks Standard +0.3
  1. The weights, in grams, of mice are normally distributed. A biologist takes a random sample of 10 mice. She weighs each mouse and records its weight.
The ten mice are then fed on a special diet. They are weighed again after two weeks.
Their weights in grams are as follows:
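| Mouse | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) | \(J\) |
|---|---|---|---|---|---|---|---|---|---|---|
| Weight before diet | 50.0 | 48.3 | 47.5 | 54.0 | 38.9 | 42.7 | 50.1 | 46.8 | 40.3 | 41.2 |
| Weight after diet | 52.1 | 47.6 | 50.1 | 52.3 | 42.2 | 44.3 | 51.8 | 48.0 | 41.9 | 43.6 |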
Stating your hypotheses clearly, and using a \(1 \%\) level of significance, test whether or not the diet causes an increase in the mean weight of the mice.
Edexcel S4 Q5
17 marks Standard +0.3
5. A machine is filling bottles of milk. A random sample of 16 bottles was taken and the volume of milk in each bottle was measured and recorded. The volume of milk in a bottle is normally distributed and the unbiased estimate of the variance, \(s ^ { 2 }\), of the volume of milk in a bottle is 0.003
  1. Find a 95\% confidence interval for the variance of the population of volumes of milk from which the sample was taken. The machine should fill bottles so that the standard deviation of the volumes is equal to 0.07
  2. Comment on this with reference to your 95\% confidence interval.
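A confidence interval for a normal variance uses \(\frac { ( n - 1 ) s ^ { 2 } } { \sigma ^ { 2 } } \sim \chi ^ { 2 } _ { n - 1 }\). The sketch below evaluates the interval for \(n = 16\) and \(s ^ { 2 } = 0.003\), assuming the \(\chi ^ { 2 } _ { 15 }\) percentage points 6.262 and 27.488 read from standard tables, and then checks whether \(0.07 ^ { 2 }\) lies inside it. The variable names are illustrative.

```python
n = 16        # number of bottles sampled
s2 = 0.003    # unbiased estimate of the variance

# Chi-squared percentage points for 15 degrees of freedom (from tables).
CHI2_LOWER = 6.262    # lower 2.5% point
CHI2_UPPER = 27.488   # upper 2.5% point

# 95% confidence limits for the population variance.
ci_low = (n - 1) * s2 / CHI2_UPPER
ci_high = (n - 1) * s2 / CHI2_LOWER

# The target is stated as a standard deviation, so square it before comparing.
target_var = 0.07 ** 2
target_inside = ci_low < target_var < ci_high
```

Note that the interval is for the variance, so the stated standard deviation has to be squared before the comparison in part (b).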
Edexcel S4 Q7
9 marks Standard +0.3
  1. An engineering firm buys steel rods. The steel rods from its present supplier are known to have a mean tensile strength of \(230 \mathrm {~N} / \mathrm { mm } ^ { 2 }\).
A new supplier of steel rods offers to supply rods at a cheaper price than the present supplier. A random sample of ten rods from this new supplier gave tensile strengths, \(x \mathrm { N } / \mathrm { mm } ^ { 2 }\), which are summarised below.
  1. A company manufactures bolts with a mean diameter of 5 mm . The company wishes to check that the diameter of the bolts has not decreased. A random sample of 10 bolts is taken and the diameters, \(x \mathrm {~mm}\), of the bolts are measured. The results are summarised below.
$$\sum x = 49.1 \quad \sum x ^ { 2 } = 241.2$$ Using a \(1 \%\) level of significance, test whether or not the mean diameter of the bolts is less than 5 mm .
(You may assume that the diameter of the bolts follows a normal distribution.)
2. An emission-control device is tested to see if it reduces \(\mathrm { CO } _ { 2 }\) emissions from cars. The emissions from 6 randomly selected cars are measured with and without the device. The results are as follows.
  1. A teacher wishes to test whether playing background music enables students to complete a task more quickly. The same task was completed by 15 students, divided at random into two groups. The first group had background music playing during the task and the second group had no background music playing.
    The times taken, in minutes, to complete the task are summarised below.
(d) Find the value of \(s\). The graph of the power function for the manager's test is shown in Figure 1.
[Figure 1: graph of the power function for the manager's test]
(e) On the same axes, draw the graph of the power function for the deputy's test.
(f) (i) State the value of \(p\) where these graphs intersect.
(ii) Compare the effectiveness of the two tests if \(p\) is greater than this value. The deputy suggests that they should use his sampling method rather than the manager's.
(g) Give a reason why the manager might not agree to this change.
  1. A random sample of 15 strawberries is taken from a large field and the weight \(x\) grams of each strawberry is recorded. The results are summarised below.
$$\sum x = 291 \quad \sum x ^ { 2 } = 5968$$ Assume that the weights of strawberries are normally distributed. Calculate a 95\% confidence interval for
(a) (i) the mean of the weights of the strawberries in the field,
(ii) the variance of the weights of the strawberries in the field. Strawberries weighing more than 23 g are considered to be less tasty.
(b) Use appropriate confidence limits from part (a) to find the highest estimate of the proportion of strawberries that are considered to be less tasty.
  1. A car manufacturer claims that, on a motorway, the mean number of miles per gallon for the Panther car is more than 70. To test this claim, a car magazine measures the number of miles per gallon, \(x\), of each of a random sample of 20 Panther cars and obtains the following statistics.
$$\bar { x } = 71.2 \quad s = 3.4$$
The number of miles per gallon may be assumed to be normally distributed.
(a) Stating your hypotheses clearly and using a \(5 \%\) level of significance, test the manufacturer's claim.
The standard deviation of the number of miles per gallon for the Tiger car is 4.
(b) Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not there is evidence that the variance of the number of miles per gallon for the Panther car is different from that of the Tiger car.
  1. Faults occur in a roll of material at a rate of \(\lambda\) per \(\mathrm { m } ^ { 2 }\). To estimate \(\lambda\), three pieces of material of sizes \(3 \mathrm {~m} ^ { 2 } , 7 \mathrm {~m} ^ { 2 }\) and \(10 \mathrm {~m} ^ { 2 }\) are selected and the number of faults \(X _ { 1 } , X _ { 2 }\) and \(X _ { 3 }\) respectively are recorded.
The estimator \(\hat { \lambda }\), where $$\hat { \lambda } = k \left( X _ { 1 } + X _ { 2 } + X _ { 3 } \right)$$ is an unbiased estimator of \(\lambda\).
(a) Write down the distributions of \(X _ { 1 } , X _ { 2 }\) and \(X _ { 3 }\) and find the value of \(k\).
(b) Find \(\operatorname { Var } ( \hat { \lambda } )\).
A random sample of \(n\) pieces of this material, each of size \(4 \mathrm {~m} ^ { 2 }\), was taken. The number of faults on each piece, \(Y\), was recorded.
(c) Show that \(\frac { 1 } { 4 } \bar { Y }\) is an unbiased estimator of \(\lambda\).
(d) Find \(\operatorname { Var } \left( \frac { 1 } { 4 } \bar { Y } \right)\).
(e) Find the minimum value of \(n\) for which \(\frac { 1 } { 4 } \bar { Y }\) becomes a better estimator of \(\lambda\) than \(\hat { \lambda }\).
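The comparison of the two estimators in this question comes down to comparing variance coefficients of \(\lambda\). The sketch below is one way to check that arithmetic exactly. It assumes a Poisson model with independent counts, so that a piece of area \(a\) has a count with mean and variance \(a\lambda\), and it takes "better" to mean smaller variance. All names in it are illustrative.

```python
from fractions import Fraction

areas = [3, 7, 10]           # sizes of the three pieces in square metres
total_area = sum(areas)

# Unbiasedness: E(k * (X1 + X2 + X3)) = k * total_area * lambda = lambda
k = Fraction(1, total_area)

# Var(lambda_hat) = k^2 * total_area * lambda; store the coefficient of lambda
var_hat_coeff = k**2 * total_area


def var_ybar_coeff(n):
    """Coefficient of lambda in Var(Ybar / 4) for n pieces of 4 square metres."""
    # Var(Y) = 4*lambda, so Var(Ybar) = 4*lambda/n and Var(Ybar/4) = lambda/(4n)
    return Fraction(1, 16) * Fraction(4, n)


# Smallest n for which Ybar/4 has strictly smaller variance than lambda_hat
n_min = 1
while var_ybar_coeff(n_min) >= var_hat_coeff:
    n_min += 1

print(k, var_hat_coeff, n_min)
```

Using `Fraction` keeps the comparison exact, which matters here because the two variances are equal at one value of \(n\).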
  1. Find the value of the constant \(a\) such that
$$\mathrm { P } \left( a < F _ { 8,10 } < 3.07 \right) = 0.94$$
  2. Two independent random samples \(X _ { 1 } , X _ { 2 } , \ldots , X _ { 7 }\) and \(Y _ { 1 } , Y _ { 2 } , Y _ { 3 } , Y _ { 4 }\) were taken from different normal populations with a common standard deviation \(\sigma\). The following sample statistics were calculated.
$$s _ { x } = 14.67 \quad s _ { y } = 12.07$$
Find the \(99 \%\) confidence interval for \(\sigma ^ { 2 }\) based on these two samples.
  3. Manuel is planning to buy a new machine to squeeze oranges in his cafe and he has two models, at the same price, on trial. The manufacturers of machine \(B\) claim that their machine produces more juice from an orange than machine \(A\). To test this claim Manuel takes a random sample of 8 oranges, cuts them in half and puts one half in machine \(A\) and the other half in machine \(B\). The amount of juice, in ml, produced by each machine is given in the table below.
\section*{Table 1}
Figure 1 shows the graph of the power function of the test used by the consultant.
\includegraphics[max width=\textwidth, alt={}, center]{a1841cf5-93f3-4043-b6ed-651168b13b87-48_1722_1671_657_132}
\section*{Figure 1}
(e) On Figure 1 draw the graph of the power function of the manager's test.
(2)
(f) State, giving your reasons, which test you would recommend.
(2)
  1. The weights of the contents of breakfast cereal boxes are normally distributed.
A manufacturer changes the style of the boxes but claims that the weight of the contents remains the same.
A random sample of 6 old style boxes had contents with the following weights (in grams).
$$\begin{array} { l l l l l l } 512 & 503 & 514 & 506 & 509 & 515 \end{array}$$
The weights, \(y\) grams, of the contents of an independent random sample of 5 new style boxes gave
$$\bar { y } = 504.8 \text { and } s _ { y } = 3.420$$
(a) Use a two-tail test to show, at the \(10 \%\) level of significance, that the variances of the weights of the contents of the old and new style boxes can be assumed to be equal. State your hypotheses clearly.
(b) Showing your working clearly, find a \(90 \%\) confidence interval for \(\mu _ { x } - \mu _ { y }\), where \(\mu _ { x }\) and \(\mu _ { y }\) are the mean weights of the contents of old and new style boxes respectively.
(c) With reference to your confidence interval, comment on the manufacturer's claim.
  6. A random sample \(X _ { 1 } , X _ { 2 } , \ldots , X _ { n }\) is taken from a population where each \(X _ { i }\) has a continuous uniform distribution over the interval \([ 0 , \beta ]\).
The random variable \(Y = \max \left\{ X _ { 1 } , X _ { 2 } , \ldots , X _ { n } \right\}\).
The probability density function of \(Y\) is given by
$$\mathrm { f } ( y ) = \left\{ \begin{array} { c c } \frac { n } { \beta ^ { n } } y ^ { n - 1 } & 0 \leqslant y \leqslant \beta \\ 0 & \text { otherwise } \end{array} \right.$$
(a) Show that \(\mathrm { E } \left( Y ^ { m } \right) = \frac { n } { n + m } \beta ^ { m }\).
(b) Write down \(\mathrm { E } ( Y )\).
(c) Using your answers to parts (a) and (b), or otherwise, show that
$$\operatorname { Var } ( Y ) = \frac { n } { ( n + 1 ) ^ { 2 } ( n + 2 ) } \beta ^ { 2 }$$
(d) State, giving your reasons, whether or not \(Y\) is a consistent estimator of \(\beta\).
The random variables \(M = 2 \bar { X }\), where \(\bar { X } = \frac { 1 } { n } \left( X _ { 1 } + X _ { 2 } + \ldots + X _ { n } \right)\), and \(S = k Y\), where \(k\) is a constant, are both unbiased estimators of \(\beta\).
(e) Find the value of \(k\) in terms of \(n\).
(f) State, giving your reasons, which of \(M\) and \(S\) is the better estimator of \(\beta\) in this case.
Five observations of \(X\) are: \(\quad \begin{array} { l l l l l } 8.5 & 6.3 & 5.4 & 9.1 & 7.6 \end{array}\)
(g) Calculate the better estimate of \(\beta\).
  7. A machine produces components whose lengths are normally distributed with mean 102.3 mm and standard deviation 2.8 mm. After the machine had been serviced, a random sample of 20 components was tested to see if the mean and standard deviation had changed. The lengths, \(x \mathrm {~mm}\), of each of these 20 components are summarised as
$$\sum x = 2072 \quad \sum x ^ { 2 } = 214856$$
(a) Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not there is evidence of a change in standard deviation.
(b) Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not the mean length of the components has changed from the original value of 102.3 mm using
(i) a normal distribution,
(ii) a \(t\) distribution.
(c) Comment on the mean length of components produced after the service in the light of the tests from part (a) and part (b). Give a reason for your answer.
  1. A medical student is investigating whether there is a difference in a person's blood pressure when sitting down and after standing up. She takes a random sample of 12 people and measures their blood pressure, in mmHg , when sitting down and after standing up.
The results are shown below.
Edexcel S4 Q8
12 marks Standard +0.8
  1. A random sample \(W _ { 1 } , W _ { 2 } , \ldots , W _ { n }\) is taken from a distribution with mean \(\mu\) and variance \(\sigma ^ { 2 }\).
    1. Write down \(\mathrm { E } \left( \sum _ { i = 1 } ^ { n } W _ { i } \right)\) and show that \(\mathrm { E } \left( \sum _ { i = 1 } ^ { n } W _ { i } ^ { 2 } \right) = n \left( \sigma ^ { 2 } + \mu ^ { 2 } \right)\).
    An estimator for \(\mu\) is
    $$\bar { X } = \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i }$$
    2. Show that \(\bar { X }\) is a consistent estimator for \(\mu\).
    An estimator of \(\sigma ^ { 2 }\) is
    $$U = \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i } ^ { 2 } - \left( \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i } \right) ^ { 2 }$$
    3. Find the bias of \(U\).
    4. Write down an unbiased estimator of \(\sigma ^ { 2 }\) in the form \(k U\), where \(k\) is in terms of \(n\).
    1. George owns a garage and he records the mileage of cars, \(x\) thousands of miles, between services. The results from a random sample of 10 cars are summarised below.
    $$\sum x = 113.4 \quad \sum x ^ { 2 } = 1414.08$$ The mileage of cars between services is normally distributed and George believes that the standard deviation is 2.4 thousand miles. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not these data support George’s belief.
    2. Every 6 months some engineers are tested to see if their times, in minutes, to assemble a particular component have changed. The times taken to assemble the component are normally distributed. A random sample of 8 engineers was chosen and their times to assemble the component were recorded in January and in July. The data are given in the table below. \end{table} Table 1
Figure 1 shows a graph of the power function for the scientist's test.
  5. On the same axes draw the graph of the power function for the statistician's test.
Given that it takes 20 minutes to collect and test a 20 ml sample and 15 minutes to collect and test a 10 ml sample,
  6. show that the expected time of the statistician's test is slower than the scientist's test for \(\lambda \mathrm { e } ^ { - \lambda } > \frac { 1 } { 3 }\).
  7. By considering the times when \(\lambda = 1\) and \(\lambda = 2\), together with the power curves in part (e), suggest, giving a reason, which test you would use.
    (2)
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{a1841cf5-93f3-4043-b6ed-651168b13b87-93_1179_1152_1455_395} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure}
    1. The carbon content, measured in suitable units, of steel is normally distributed. Two independent random samples of steel were taken from a refining plant at different times and their carbon content recorded. The results are given below.
    $$\begin{array} { l l l l l l l } \text { Sample } A : & 1.5 & 0.9 & 1.3 & 1.2 & & \\ \text { Sample } B : & 0.4 & 0.6 & 0.8 & 0.3 & 0.5 & 0.4 \end{array}$$
  8. Stating your hypotheses clearly, carry out a suitable test, at the \(10 \%\) level of significance, to show that both samples can be assumed to have come from populations with a common variance \(\sigma ^ { 2 }\).
  9. Showing your working clearly, find the \(99 \%\) confidence interval for \(\sigma ^ { 2 }\) based on both samples.
Edexcel S4 2002 June Q1
3 marks Standard +0.3
  1. The random variable \(X\) has an \(F\) distribution with 10 and 12 degrees of freedom. Find \(a\) and \(b\) such that \(\mathrm { P } ( a < X < b ) = 0.90\).
    (3)
  2. A chemist has developed a fuel additive and claims that it reduces the fuel consumption of cars. To test this claim, 8 randomly selected cars were each filled with 20 litres of fuel and driven around a race circuit. Each car was tested twice, once with the additive and once without. The distances, in miles, that each car travelled before running out of fuel are given in the table below.
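$$\begin{array} { l c c c c c c c c } \text { Car } & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \text { Distance without additive } & 163 & 172 & 195 & 170 & 183 & 185 & 161 & 176 \\ \text { Distance with additive } & 168 & 185 & 187 & 172 & 180 & 189 & 172 & 175 \end{array}$$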
Assuming that the distances travelled follow a normal distribution and stating your hypotheses clearly, test, at the \(10 \%\) level of significance, whether or not there is evidence to support the chemist's claim.
(8)
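Because each car in the fuel additive question is tested both with and without the additive, one natural approach is a paired \(t\)-test on the differences. The sketch below shows that arithmetic under stated assumptions and is not an official solution. It reads the claim as "the additive increases the distance travelled", so the hypotheses are \(H_0: \mu_d = 0\) against \(H_1: \mu_d > 0\), with \(d\) = distance with additive minus distance without. The critical value 1.415 (\(t_7\), one-tailed, 10%) is assumed from standard tables, and the variable names are illustrative.

```python
from math import sqrt

# Distances in miles for cars 1 to 8, copied from the table
without = [163, 172, 195, 170, 183, 185, 161, 176]
with_additive = [168, 185, 187, 172, 180, 189, 172, 175]

# Paired differences: with additive minus without
d = [w - wo for w, wo in zip(with_additive, without)]
n = len(d)

dbar = sum(d) / n                                   # mean difference
s2_d = sum((x - dbar) ** 2 for x in d) / (n - 1)    # unbiased variance of d
t_stat = dbar / sqrt(s2_d / n)                      # paired t statistic, n - 1 df

T_CRIT = 1.415   # assumed table value: t with 7 df, one-tailed 10% point
reject_h0 = t_stat > T_CRIT

print(d, round(dbar, 3), round(t_stat, 3), reject_h0)
```

Working with the differences removes the car-to-car variation, which is why the paired design is used rather than a two-sample comparison.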