Questions S4 (270 questions)

OCR S4 2015 June Q4
4 The discrete random variable \(Y\) has probability generating function $$\mathrm { G } _ { Y } ( t ) = 0.09 t ^ { 2 } + 0.24 t ^ { 3 } + 0.34 t ^ { 4 } + 0.24 t ^ { 5 } + 0.09 t ^ { 6 }$$
  1. Find the mean and variance of \(Y\).
    \(Y\) is the sum of two independent observations of a random variable \(X\).
  2. Find the probability generating function of \(X\), expressing your answer as a cubic polynomial in \(t\).
  3. Write down the value of \(\mathrm { P } ( X = 2 )\).
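One way to check answers to a question like this is to treat the probability generating function as a list of coefficients. The sketch below is illustrative only: the helper names are hypothetical, and the cubic used for \(\mathrm{G}_X(t)\) is a candidate that the code tests by squaring, not a quoted result.

```python
from fractions import Fraction as F

# Coefficients of G_Y(t), indexed by the power of t (index 0 is the constant term).
g_y = [F(0), F(0), F(9, 100), F(24, 100), F(34, 100), F(24, 100), F(9, 100)]

def mean_and_variance(pgf):
    """Mean and variance of a distribution given its PGF coefficients."""
    mean = sum(k * p for k, p in enumerate(pgf))
    second_moment = sum(k * k * p for k, p in enumerate(pgf))
    return mean, second_moment - mean ** 2

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists."""
    out = [F(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Candidate cubic G_X(t) = 0.3 t + 0.4 t^2 + 0.3 t^3 (an assumption to be checked).
g_x = [F(0), F(3, 10), F(4, 10), F(3, 10)]

mean_y, var_y = mean_and_variance(g_y)
candidate_ok = poly_mul(g_x, g_x) == g_y  # the PGF of a sum of two independent copies is the square
print(mean_y, var_y, candidate_ok)
```

If the candidate is accepted, \(\mathrm{P}(X = 2)\) is simply the coefficient of \(t^2\), here `g_x[2]`.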
OCR S4 2015 June Q5
5 The random variable \(X\) has a Poisson distribution with mean \(\lambda\). It is given that the moment generating function of \(X\) is \(e ^ { \lambda \left( e ^ { t } - 1 \right) }\).
  1. Use the moment generating function to verify that the mean of \(X\) is \(\lambda\), and to show that the variance of \(X\) is also \(\lambda\).
  2. Five independent observations of \(X\) are added to produce a new variable \(Y\). Find the moment generating function of \(Y\), simplifying your answer.
OCR S4 2015 June Q6
6 In a two-tail Wilcoxon rank-sum test, the sample sizes are 13 and 15. The sum of the ranks for the sample of size 13 is 135. Carry out the test at the \(5 \%\) level of significance.
OCR S4 2015 June Q7
7 The discrete random variable \(X\) can take the values 0, 1 and 2 with equal probabilities.
The random variables \(X _ { 1 }\) and \(X _ { 2 }\) are independent observations of \(X\), and the random variables \(Y\) and \(Z\) are defined as follows:
\(Y\) is the smaller of \(X _ { 1 }\) and \(X _ { 2 }\), or their common value if they are equal; \(Z = \left| X _ { 1 } - X _ { 2 } \right|\).
  1. Draw up a table giving the joint distribution of \(Y\) and \(Z\).
  2. Find \(P ( Y = 0 \mid Z = 0 )\).
  3. Find \(\operatorname { Cov } ( Y , Z )\).
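With only nine equally likely outcomes, the joint distribution can be checked by brute-force enumeration. The following is a minimal sketch under that approach; the variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction as F
from itertools import product

# joint[(y, z)] = P(Y = y, Z = z), built from the 9 equally likely (x1, x2) pairs.
joint = defaultdict(F)
for x1, x2 in product(range(3), repeat=2):
    joint[(min(x1, x2), abs(x1 - x2))] += F(1, 9)

def expect(f):
    """Expected value of f(Y, Z) under the joint distribution."""
    return sum(p * f(y, z) for (y, z), p in joint.items())

p_z0 = sum(p for (y, z), p in joint.items() if z == 0)
p_y0_given_z0 = joint[(0, 0)] / p_z0
cov_yz = expect(lambda y, z: y * z) - expect(lambda y, z: y) * expect(lambda y, z: z)

for (y, z), p in sorted(joint.items()):
    print(f"P(Y={y}, Z={z}) = {p}")
print(p_y0_given_z0, cov_yz)
```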
OCR S4 2015 June Q8
8 The independent random variables \(X _ { 1 }\) and \(X _ { 2 }\) have the distributions \(\mathrm { B } \left( n _ { 1 } , \theta \right)\) and \(\mathrm { B } \left( n _ { 2 } , \theta \right)\) respectively. Two possible estimators for \(\theta\) are $$T _ { 1 } = \frac { 1 } { 2 } \left( \frac { X _ { 1 } } { n _ { 1 } } + \frac { X _ { 2 } } { n _ { 2 } } \right) \text { and } T _ { 2 } = \frac { X _ { 1 } + X _ { 2 } } { n _ { 1 } + n _ { 2 } } .$$
  1. Show that \(T _ { 1 }\) and \(T _ { 2 }\) are both unbiased estimators, and calculate their variances.
  2. Find \(\frac { \operatorname { Var } \left( T _ { 1 } \right) } { \operatorname { Var } \left( T _ { 2 } \right) }\). Given that \(n _ { 1 } \neq n _ { 2 }\), use the inequality \(\left( n _ { 1 } - n _ { 2 } \right) ^ { 2 } > 0\) to find which of \(T _ { 1 }\) and \(T _ { 2 }\) is the more efficient estimator.
OCR S4 2018 June Q1
1 A Wilcoxon signed-rank test is carried out at the \(5 \%\) level of significance on a random sample of size 32. The hypotheses are \(\mathrm { H } _ { 0 } : m = m _ { 0 } , \mathrm { H } _ { 1 } : m < m _ { 0 }\) where \(m\) is the population median and \(m _ { 0 }\) is a specific numerical value. The value obtained for the test statistic \(T\) is 162. Find the outcome of the test.
OCR S4 2018 June Q2
2 The distances from home to work, in km, of 8 men and 5 women were recorded and are given below. The workers were chosen at random.
Men: 4, 7, 10, 13, 16, 17, 20, 21
Women: 1, 2, 14, 18, 22
Carry out a Wilcoxon rank-sum test at the \(5 \%\) significance level to test whether there is a significant difference between the distances from home to work between men and women.
OCR S4 2018 June Q3
3 Events \(A\) and \(B\) are such that \(\mathrm { P } ( A ) = 0.6 , \mathrm { P } ( B ) = 0.4\) and \(\mathrm { P } ( A \cup B ) = 0.8\).
  1. Find \(\mathrm { P } ( A \cap B )\).
  2. Find \(\mathrm { P } \left( A \cap B ^ { \prime } \right)\).
  3. Find \(\mathrm { P } ( A \mid B )\).
    Events \(A\) and \(B\) are as above and a third event \(C\) is such that \(\mathrm { P } ( A \cup B \cup C ) = 1 , \mathrm { P } ( A \cap B \cap C ) = 0.05\), \(\mathrm { P } ( A \cap C ) = \mathrm { P } ( B \cap C )\) and \(\mathrm { P } \left( A \cap B ^ { \prime } \cap C ^ { \prime } \right) = 3 \mathrm { P } \left( A ^ { \prime } \cap B \cap C ^ { \prime } \right)\).
  4. Find \(\mathrm { P } ( C )\).
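The conditions in this question translate into linear relations between the regions of a Venn diagram. The sketch below works through them with exact fractions; the region names (`ab_only`, `x`, `c_only`) are illustrative labels, not notation from the question.

```python
from fractions import Fraction as F

p_a, p_b, p_union = F(6, 10), F(4, 10), F(8, 10)

p_ab = p_a + p_b - p_union      # inclusion-exclusion for two events
p_a_not_b = p_a - p_ab          # P(A and not B)
p_a_given_b = p_ab / p_b        # P(A | B)

# Three-event part. Let x = P(A and C and not B); because P(A and C) = P(B and C),
# P(B and C and not A) is also x.
p_abc = F(5, 100)
ab_only = p_ab - p_abc          # P(A and B and not C)
ka = p_a - ab_only - p_abc      # P(A only) = ka - x
kb = p_b - ab_only - p_abc      # P(B only) = kb - x
x = (3 * kb - ka) / 2           # from ka - x = 3 * (kb - x)
c_only = 1 - p_union            # P(C only), since P(A or B or C) = 1
p_c = c_only + 2 * x + p_abc

print(p_ab, p_a_not_b, p_a_given_b, p_c)
```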
OCR S4 2018 June Q4
4 The random variable \(X\) has a \(\chi ^ { 2 }\) distribution with \(v\) degrees of freedom. The moment generating function of \(X\) is $$\mathrm { M } _ { X } ( t ) = ( 1 - 2 t ) ^ { - \frac { 1 } { 2 } v }$$
  1. Show that \(\mathrm { E } ( X ) = v\).
  2. Find \(\operatorname { Var } ( X )\).
  3. Obtain the moment generating function of the sum \(Y\) of two independent \(\chi ^ { 2 }\) random variables, one with 6 degrees of freedom and the other with 8 degrees of freedom.
  4. Identify the distribution of \(Y\).
OCR S4 2018 June Q5
5 The independent discrete random variables \(U\) and \(V\) can each take the values 1, 2 and 3, all with probability \(\frac { 1 } { 3 }\). The random variables \(X\) and \(Y\) are defined as follows: $$X = | U - V | , Y = U + V .$$
  1. In the Printed Answer Book complete the table showing the joint probability distribution of \(X\) and \(Y\).
  2. Find \(\operatorname { Cov } ( X , Y )\).
  3. State with a reason whether \(X\) and \(Y\) are independent.
  4. Find \(\mathrm { P } ( Y = 3 \mid X = 1 )\).
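This question is a useful illustration that zero covariance does not by itself imply independence. A brute-force enumeration, sketched below with illustrative names, can be used to check the table, the covariance and the independence claim.

```python
from collections import defaultdict
from fractions import Fraction as F
from itertools import product

# joint[(x, y)] = P(X = x, Y = y), from the 9 equally likely (u, v) pairs.
joint = defaultdict(F)
for u, v in product((1, 2, 3), repeat=2):
    joint[(abs(u - v), u + v)] += F(1, 9)

p_x, p_y = defaultdict(F), defaultdict(F)
for (x, y), p in joint.items():
    p_x[x] += p
    p_y[y] += p

e_x = sum(x * p for x, p in p_x.items())
e_y = sum(y * p for y, p in p_y.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov_xy = e_xy - e_x * e_y

# Independence requires P(X = x, Y = y) = P(X = x) P(Y = y) for every pair.
independent = all(
    joint.get((x, y), F(0)) == p_x[x] * p_y[y] for x in list(p_x) for y in list(p_y)
)
p_y3_given_x1 = joint.get((1, 3), F(0)) / p_x[1]

print(cov_xy, independent, p_y3_given_x1)
```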
OCR S4 2018 June Q6
6 In each round of a quiz a contestant can answer up to three questions. Each correct answer scores 1 point and allows the contestant to go on to the next question. A wrong answer scores 0 points and the contestant is allowed no further question in that round. If all 3 questions are answered correctly 1 bonus point is scored, making a total score of 4 for the round. For a certain contestant, \(A\), the probability of giving a correct answer is \(\frac { 3 } { 4 }\), independently of any other question. The random variable \(X _ { r }\) is the number of points scored by \(A\) during the \(r ^ { \text {th } }\) round.
  1. Find the probability generating function of \(X _ { r }\).
  2. Use the probability generating function found in part (i) to find the mean and variance of \(X _ { r }\).
  3. Write down an expression for the probability generating function of \(X _ { 1 } + X _ { 2 }\) and find the probability that \(A\) has a total score of 4 at the end of two rounds.
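The probability generating function here has only four non-zero coefficients, so its moments and the two-round total can be checked directly from the probability mass function. This is a sketch with illustrative names; it reads the scoring rule as "a score of 3 is impossible because three correct answers always earn the bonus".

```python
from fractions import Fraction as F

p = F(3, 4)  # probability of a correct answer

# Scores 0, 1 and 2 end on a wrong answer; three correct answers score 3 + 1 bonus = 4.
pmf = {0: 1 - p, 1: p * (1 - p), 2: p**2 * (1 - p), 4: p**3}

mean = sum(k * q for k, q in pmf.items())
variance = sum(k * k * q for k, q in pmf.items()) - mean**2

# P(X1 + X2 = 4) is the coefficient of t^4 in the square of the PGF.
p_total_4 = sum(pmf[a] * pmf[b] for a in pmf for b in pmf if a + b == 4)

print(mean, variance, p_total_4)
```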
OCR S4 2018 June Q7
7 Two independent observations \(X _ { 1 }\) and \(X _ { 2 }\) are made of a continuous random variable with probability density function $$f ( x ) = \begin{cases} \frac { 1 } { \theta } & 0 \leqslant x \leqslant \theta \\
0 & \text { otherwise } \end{cases}$$ where \(\theta\) is a parameter whose value is to be estimated.
  1. Find \(\mathrm { E } ( X )\).
  2. Show that \(S _ { 1 } = X _ { 1 } + X _ { 2 }\) is an unbiased estimator of \(\theta\).
    \(L\) is the larger of \(X _ { 1 }\) and \(X _ { 2 }\), or their common value if they are equal.
  3. Show that the probability density function of \(L\) is \(\frac { 2 l } { \theta ^ { 2 } }\) for \(0 \leqslant l \leqslant \theta\).
  4. Find \(\mathrm { E } ( L )\).
  5. Find an unbiased estimator \(S _ { 2 }\) of \(\theta\), based on \(L\).
  6. Determine which of the two estimators \(S _ { 1 }\) and \(S _ { 2 }\) is the more efficient.
OCR MEI S4 Q4
12 marks
4 An experiment is carried out to compare five industrial paints, A, B, C, D, E, that are intended to be used to protect exterior surfaces in polluted urban environments. Five different types of surface (I, II, III, IV, V) are to be used in the experiment, and five specimens of each type of surface are available. Five different external locations ( \(1,2,3,4,5\) ) are used in the experiment. The paints are applied to the specimens of the surfaces which are then left in the locations for a period of six months. At the end of this period, a "score" is given to indicate how effective the paint has been in protecting the surface.
  1. Name a suitable experimental design for this trial and give an example of an experimental layout.
    Initial analysis of the data indicates that any differences between the types of surface are negligible, as also are any differences between the locations. It is therefore decided to analyse the data by one-way analysis of variance.
  2. State the usual model, including the accompanying distributional assumptions, for the one-way analysis of variance. Interpret the terms in the model.
  3. The data for analysis are as follows. Higher scores indicate better performance. The underlying distributions of strengths are assumed to be Normal for both suppliers, with variances 2.45 for supplier A and 1.40 for supplier B.
  4. Test at the \(5 \%\) level of significance whether it is reasonable to assume that the mean strengths from the two suppliers are equal.
  5. Provide a two-sided 90\% confidence interval for the true mean difference.
  6. Show that the test procedure used in part (i), with samples of sizes 7 and 5 and a \(5 \%\) significance level, leads to acceptance of the null hypothesis of equal means if \(- 1.556 < \bar { x } - \bar { y } < 1.556\), where \(\bar { x }\) and \(\bar { y }\) are the observed sample means from suppliers A and B. Hence find the probability of a Type II error for this test procedure if in fact the true mean strength from supplier A is 2.0 units more than that from supplier B.
  7. A manager suggests that the Wilcoxon rank sum test should be used instead, comparing the median strengths for the samples of sizes 7 and 5. Give one reason why this suggestion might be sensible and two why it might not.
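The acceptance region quoted in the question follows from the known variances and sample sizes, and the Type II error probability is the chance of landing in that region when the true difference is 2.0. A minimal sketch, assuming a two-sided Normal test on \(\bar{x} - \bar{y}\) and using only the standard library:

```python
from math import sqrt
from statistics import NormalDist

var_a, n_a = 2.45, 7   # supplier A: known variance, sample size
var_b, n_b = 1.40, 5   # supplier B: known variance, sample size

# Standard error of the difference of sample means under known variances.
se = sqrt(var_a / n_a + var_b / n_b)

# Half-width of the acceptance region for a two-sided 5% test.
half_width = NormalDist().inv_cdf(0.975) * se

# Type II error: H0 is accepted although the true difference in means is 2.0.
true_diff = 2.0
diff_dist = NormalDist(mu=true_diff, sigma=se)
type_ii = diff_dist.cdf(half_width) - diff_dist.cdf(-half_width)

print(round(half_width, 3), round(type_ii, 4))
```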
OCR MEI S4 2009 June Q1
1 An industrial process produces components. Some of the components contain faults. The number of faults in a component is modelled by the random variable \(X\) with probability function $$\mathrm { P } ( X = x ) = \theta ( 1 - \theta ) ^ { x } \quad \text { for } x = 0,1,2 , \ldots$$ where \(\theta\) is a parameter with \(0 < \theta < 1\). The numbers of faults in different components are independent.
A random sample of \(n\) components is inspected. \(n _ { 0 }\) are found to have no faults, \(n _ { 1 }\) to have one fault and the remainder \(\left( n - n _ { 0 } - n _ { 1 } \right)\) to have two or more faults.
  1. Find \(\mathrm { P } ( X \geqslant 2 )\) and hence show that the likelihood is $$\mathrm { L } ( \theta ) = \theta ^ { n _ { 0 } + n _ { 1 } } ( 1 - \theta ) ^ { 2 n - 2 n _ { 0 } - n _ { 1 } }$$
  2. Find the maximum likelihood estimator \(\hat { \theta }\) of \(\theta\). You are not required to verify that any turning point you locate is a maximum.
  3. Show that \(\mathrm { E } ( X ) = \frac { 1 - \theta } { \theta }\). Deduce that another plausible estimator of \(\theta\) is \(\tilde { \theta } = \frac { 1 } { 1 + \bar { X } }\) where \(\bar { X }\) is the sample mean. What additional information is needed in order to calculate the value of this estimator?
  4. You are given that, in large samples, \(\tilde { \theta }\) may be taken as Normally distributed with mean \(\theta\) and variance \(\theta ^ { 2 } ( 1 - \theta ) / n\). Use this to obtain a \(95 \%\) confidence interval for \(\theta\) for the case when 100 components are inspected and it is found that 92 have no faults, 6 have one fault and the remaining 2 have exactly four faults each.
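For the final part, the interval can be computed by plugging the estimate into the stated large-sample variance. The sketch below assumes the usual Normal-based interval \(\tilde{\theta} \pm z \sqrt{\tilde{\theta}^2 (1 - \tilde{\theta}) / n}\) with the variance evaluated at the estimate; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 100
total_faults = 92 * 0 + 6 * 1 + 2 * 4   # 92 with none, 6 with one, 2 with four each
x_bar = total_faults / n

theta_tilde = 1 / (1 + x_bar)

# Large-sample variance theta^2 (1 - theta) / n, evaluated at the estimate.
se = sqrt(theta_tilde**2 * (1 - theta_tilde) / n)
z = NormalDist().inv_cdf(0.975)
ci = (theta_tilde - z * se, theta_tilde + z * se)

print(round(theta_tilde, 4), tuple(round(v, 4) for v in ci))
```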
OCR MEI S4 2009 June Q2
2
  1. The random variable \(Z\) has the standard Normal distribution with probability density function $$\mathrm { f } ( z ) = \frac { 1 } { \sqrt { 2 \pi } } \mathrm { e } ^ { - z ^ { 2 } / 2 } , \quad - \infty < z < \infty$$ Obtain the moment generating function of \(Z\).
  2. Let \(\mathrm { M } _ { Y } ( t )\) denote the moment generating function of the random variable \(Y\). Show that the moment generating function of the random variable \(a Y + b\), where \(a\) and \(b\) are constants, is \(\mathrm { e } ^ { b t } \mathrm { M } _ { Y } ( a t )\).
  3. Use the results in parts (i) and (ii) to obtain the moment generating function \(\mathrm { M } _ { X } ( t )\) of the random variable \(X\) having the Normal distribution with parameters \(\mu\) and \(\sigma ^ { 2 }\).
  4. If \(W = \mathrm { e } ^ { X }\) where \(X\) is as in part (iii), \(W\) is said to have a lognormal distribution. Show that, for any positive integer \(k\), the expected value of \(W ^ { k }\) is \(\mathrm { M } _ { X } ( k )\). Use this result to find the expected value and variance of the lognormal distribution.
OCR MEI S4 2009 June Q3
3
  1. At a waste disposal station, two methods for incinerating some of the rubbish are being compared. Of interest is the amount of particulates in the exhaust, which can be measured over the working day in a convenient unit of concentration. It is assumed that the underlying distributions of concentrations of particulates are Normal. It is also assumed that the underlying variances are equal. During a period of several months, measurements are made for method A on a random sample of 10 working days and for method B on a separate random sample of 7 working days, with results, in the convenient unit, as follows.
    Method A: 124.8, 136.4, 116.6, 129.1, 140.7, 120.2, 124.6, 127.5, 111.8, 130.3
    Method B: 130.4, 136.2, 119.8, 150.6, 143.5, 126.1, 130.7
    Use a \(t\) test at the \(10 \%\) level of significance to examine whether either method is better in resulting, on the whole, in a lower concentration of particulates. State the null and alternative hypotheses under test.
  2. The company's statistician criticises the design of the trial in part (i) on the grounds that it is not paired. Summarise the arguments the statistician will have used. A new trial is set up with a paired design, measuring the concentrations of particulates on a random sample of 9 paired occasions. The results are as follows.
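    | Pair | I | II | III | IV | V | VI | VII | VIII | IX |
    |---|---|---|---|---|---|---|---|---|---|
    | Method A | 119.6 | 127.6 | 141.3 | 139.5 | 141.3 | 124.1 | 116.6 | 136.2 | 128.8 |
    | Method B | 112.2 | 128.8 | 130.2 | 134.0 | 135.1 | 120.4 | 116.9 | 134.4 | 125.2 |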
    Use a \(t\) test at the \(5 \%\) level of significance to examine the same hypotheses as in part (i). State the underlying distributional assumption that is needed in this case.
  3. State the names of procedures that could be used in the situations of parts (i) and (ii) if the underlying distributional assumptions could not be made. What hypotheses would be under test?
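The two test statistics in this question differ only in how the standard error is formed. The sketch below computes both from the data given, using the standard library; the function names are illustrative, and critical values would still need to be read from \(t\) tables with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Part (i): independent samples.
a_unpaired = [124.8, 136.4, 116.6, 129.1, 140.7, 120.2, 124.6, 127.5, 111.8, 130.3]
b_unpaired = [130.4, 136.2, 119.8, 150.6, 143.5, 126.1, 130.7]

# Part (ii): paired occasions I to IX.
a_paired = [119.6, 127.6, 141.3, 139.5, 141.3, 124.1, 116.6, 136.2, 128.8]
b_paired = [112.2, 128.8, 130.2, 134.0, 135.1, 120.4, 116.9, 134.4, 125.2]

def pooled_t(x, y):
    """Two-sample t statistic with a pooled variance estimate, and its degrees of freedom."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny)), nx + ny - 2

def paired_t(x, y):
    """Paired t statistic on the differences x - y, and its degrees of freedom."""
    d = [xi - yi for xi, yi in zip(x, y)]
    return mean(d) / (stdev(d) / sqrt(len(d))), len(d) - 1

t_unpaired, df_unpaired = pooled_t(a_unpaired, b_unpaired)
t_paired, df_paired = paired_t(a_paired, b_paired)
print(round(t_unpaired, 3), df_unpaired)
print(round(t_paired, 3), df_paired)
```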
OCR MEI S4 2009 June Q4
4
  1. Describe, with the aid of a specific example, an experimental situation for which a Latin square design is appropriate, indicating carefully the features which show that a completely randomised or randomised blocks design would be inappropriate.
  2. The model for the one-way analysis of variance may be written, in a customary notation, as $$x _ { i j } = \mu + \alpha _ { i } + e _ { i j }$$ State the distributional assumptions underlying \(e _ { i j }\) in this model. What is the interpretation of the term \(\alpha _ { i }\)?
  3. An experiment for comparing 5 treatments is carried out, with a total of 20 observations. A partial one-way analysis of variance table for the analysis of the results is as follows.
    | Source of variation | Sums of squares | Degrees of freedom | Mean squares | Mean square ratio |
    |---|---|---|---|---|
    | Between treatments | | | | |
    | Residual | 68.76 | | | |
    | Total | 161.06 | | | |
    Copy and complete the table, and carry out the appropriate test using a \(1 \%\) significance level.
OCR MEI S4 2011 June Q1
1 The random variable \(X\) has the Normal distribution with mean 0 and variance \(\theta\), so that its probability density function is $$\mathrm { f } ( x ) = \frac { 1 } { \sqrt { 2 \pi \theta } } \mathrm { e } ^ { - x ^ { 2 } / 2 \theta } , \quad - \infty < x < \infty$$ where \(\theta ( \theta > 0 )\) is unknown. A random sample of \(n\) observations from \(X\) is denoted by \(X _ { 1 } , X _ { 2 } , \ldots , X _ { n }\).
  1. Find \(\hat { \theta }\), the maximum likelihood estimator of \(\theta\).
  2. Show that \(\hat { \theta }\) is an unbiased estimator of \(\theta\).
  3. In large samples, the variance of \(\hat { \theta }\) may be estimated by \(\frac { 2 \hat { \theta } ^ { 2 } } { n }\). Use this and the results of parts (i) and (ii) to find an approximate \(95 \%\) confidence interval for \(\theta\) in the case when \(n = 100\) and \(\Sigma X _ { i } ^ { 2 } = 1000\).
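For the numerical part, the interval follows from the estimate and its estimated variance. The sketch below assumes the maximum likelihood estimator from part (i) is \(\hat{\theta} = \frac{1}{n} \sum X_i^2\), which is the standard result for a Normal distribution with known mean 0, and uses a Normal approximation for the interval.

```python
from math import sqrt
from statistics import NormalDist

n = 100
sum_sq = 1000.0              # the given value of sum of X_i^2

theta_hat = sum_sq / n       # assumed form of the MLE (see lead-in)
se = sqrt(2 * theta_hat**2 / n)
z = NormalDist().inv_cdf(0.975)
ci = (theta_hat - z * se, theta_hat + z * se)

print(theta_hat, tuple(round(v, 3) for v in ci))
```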
OCR MEI S4 2011 June Q2
2 The random variable \(X\) has the \(\chi _ { n } ^ { 2 }\) distribution. This distribution has moment generating function \(\mathrm { M } ( \theta ) = ( 1 - 2 \theta ) ^ { - \frac { 1 } { 2 } n }\), where \(\theta < \frac { 1 } { 2 }\).
  1. Verify the expression for \(\mathrm { M } ( \theta )\) quoted above for the cases \(n = 2\) and \(n = 4\), given that the probability density functions of \(X\) in these cases are as follows. $$\begin{array} { l l } n = 2 : & \mathrm { f } ( x ) = \frac { 1 } { 2 } \mathrm { e } ^ { - \frac { 1 } { 2 } x } \quad ( x > 0 ) \\
    n = 4 : & \mathrm { f } ( x ) = \frac { 1 } { 4 } x \mathrm { e } ^ { - \frac { 1 } { 2 } x } \quad ( x > 0 ) \end{array}$$
  2. For the general case, use \(\mathrm { M } ( \theta )\) to find the mean and variance of \(X\) in terms of \(n\).
  3. \(Y _ { 1 } , Y _ { 2 } , \ldots , Y _ { k }\) are independent random variables, each with the \(\chi _ { 1 } ^ { 2 }\) distribution. Show that \(W = \sum _ { i = 1 } ^ { k } Y _ { i }\) has the \(\chi _ { k } ^ { 2 }\) distribution.
  4. Use the Central Limit Theorem to find an approximation for \(\mathrm { P } ( W < 118.5 )\) for the case \(k = 100\).
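The Central Limit Theorem approximation in the last part only needs the mean and variance of \(W\). A minimal sketch, assuming the standard results that a \(\chi^2_k\) variable has mean \(k\) and variance \(2k\) (which part (ii) asks to be derived):

```python
from math import sqrt
from statistics import NormalDist

k = 100
mean_w, var_w = k, 2 * k     # mean and variance of a chi-squared variable with k d.f.

# Normal approximation to P(W < 118.5).
approx = NormalDist(mu=mean_w, sigma=sqrt(var_w)).cdf(118.5)
print(round(approx, 4))
```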
OCR MEI S4 2011 June Q3
3
  1. Explain the meaning of the following terms in the context of hypothesis testing: Type I error, Type II error, operating characteristic, power.
  2. A market research organisation is designing a sample survey to investigate whether expenditure on everyday food items has increased in 2011 compared with 2010. For one of the populations being studied, the random variable \(X\) is used to model weekly expenditure, in \(\pounds\), on these items in 2011, where \(X\) is Normally distributed with mean \(\mu\) and variance \(\sigma ^ { 2 }\). As the corresponding mean value in 2010 was 94, the hypotheses to be examined are $$\begin{aligned} & \mathrm { H } _ { 0 } : \mu = 94 \\
    & \mathrm { H } _ { 1 } : \mu > 94 \end{aligned}$$ By comparison with the corresponding 2010 value, \(\sigma ^ { 2 }\) is assumed to be 25.
    The following criteria for the survey are laid down.
    • If in fact \(\mu = 94\), the probability of concluding that \(\mu > 94\) must be only \(2 \%\)
    • If in fact \(\mu = 97\), the probability of concluding that \(\mu > 94\) must be \(95 \%\)
    A random sample of size \(n\) is to be taken and the usual Normal test based on \(\bar { X }\) is to be used, with a critical value of \(c\) such that \(\mathrm { H } _ { 0 }\) is rejected if the value of \(\bar { X }\) exceeds \(c\). Find \(c\) and the smallest value of \(n\) that is required.
  3. Sketch the power function of an ideal test for examining the hypotheses in part (ii).
OCR MEI S4 2011 June Q4
4
  1. Provide an example of an experimental situation where there is one factor of primary interest and where a suitable experimental design would be
    1. randomised blocks,
    2. a Latin square.
    In each case, explain carefully why the design is suitable and why the other design would not be appropriate.
  2. An industrial experiment to compare four treatments for increasing the tensile strength of steel is carried out according to a completely randomised design. For various reasons, it is not possible to use the same number of replicates for each treatment. The increases, in a suitable unit of tensile strength, are as follows.
    Treatment A   Treatment B   Treatment C   Treatment D
    10.1   21.1   9.2   22.6
    21.2   20.3   8.8   17.4
    11.6   16.0   15.2   23.1
    13.6   15.0   19.2
    [The sum of these data items is 256.8 and the sum of their squares is 4471.92.]
    Construct the usual one-way analysis of variance table. Carry out the appropriate test, using a \(5 \%\) significance level.
OCR MEI S4 2013 June Q1
1 Traffic engineers are studying the flow of vehicles along a road. At an initial stage of the investigation, they assume that the average flow remains the same throughout the working day. An automatic counter records the number of vehicles passing a certain point per minute during the working day. A random sample of these records is selected; the sample values are denoted by \(x _ { 1 } , x _ { 2 } , \ldots , x _ { n }\).
  1. The engineers model the underlying random variable \(X\) by a Poisson distribution with unknown parameter \(\theta\). Obtain the likelihood of \(x _ { 1 } , x _ { 2 } , \ldots , x _ { n }\) and hence find the maximum likelihood estimate of \(\theta\).
  2. Write down the maximum likelihood estimate of the probability that no vehicles pass during a minute.
  3. The engineers note that, in a sample of size 1000 with sample mean \(\bar { x } = 5\), there are no observations of zero. Suggest why this might cast some doubt on the investigation.
  4. On checking the automatic counter, the engineers find that, due to a fault, no record at all is made if no vehicle passes in a minute. They therefore model \(X\) as a Poisson random variable, again with an unknown parameter \(\theta\), except that the value \(x = 0\) cannot occur. Show that, under this model, $$\mathrm { P } ( X = x ) = \frac { \theta ^ { x } } { \left( \mathrm { e } ^ { \theta } - 1 \right) x ! } , \quad x = 1,2 , \ldots$$ and hence show that the maximum likelihood estimate of \(\theta\) satisfies the equation $$\frac { \theta \mathrm { e } ^ { \theta } } { \mathrm { e } ^ { \theta } - 1 } = \bar { x }$$
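The likelihood equation in the last part has no closed-form solution, so in practice it is solved numerically. The sketch below uses bisection on the equivalent form \(\theta / (1 - \mathrm{e}^{-\theta}) = \bar{x}\), whose left-hand side increases from 1 as \(\theta\) increases from 0. The function name is hypothetical, and \(\bar{x} = 5\) is used purely as an illustrative input.

```python
from math import exp

def truncated_poisson_mle(x_bar, tol=1e-10):
    """Solve theta * e^theta / (e^theta - 1) = x_bar for theta > 0 by bisection."""
    if x_bar <= 1:
        raise ValueError("the sample mean must exceed 1 for a positive solution")

    def excess(theta):
        # Same equation with numerator and denominator divided by e^theta.
        return theta / (1 - exp(-theta)) - x_bar

    lo, hi = 1e-9, x_bar   # excess(lo) < 0 and excess(hi) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if excess(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

theta_hat = truncated_poisson_mle(5.0)
print(round(theta_hat, 4))
```

The estimate comes out slightly below the sample mean, as expected when zero counts are excluded.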
OCR MEI S4 2013 June Q2
2 The random variable \(X\) takes values \(- 2\), \(0\) and \(2\), each with probability \(\frac { 1 } { 3 }\).
  1. Write down the values of
    (A) \(\mu\), the mean of \(X\),
    (B) \(\mathrm { E } \left( X ^ { 2 } \right)\),
    (C) \(\sigma ^ { 2 }\), the variance of \(X\).
  2. Obtain the moment generating function (mgf) of \(X\).
    A random sample of \(n\) independent observations on \(X\) has sample mean \(\bar { X }\), and the standardised mean is denoted by \(Z\) where $$Z = \frac { \bar { X } - \mu } { \frac { \sigma } { \sqrt { n } } }$$
  3. Stating carefully the required general results for mgfs of sums and of linear transformations, show that the mgf of \(Z\) is $$M _ { Z } ( \theta ) = \left\{ \frac { 1 } { 3 } \left( 1 + e ^ { \frac { \theta \sqrt { 3 } } { \sqrt { 2 n } } } + e ^ { - \frac { \theta \sqrt { 3 } } { \sqrt { 2 n } } } \right) \right\} ^ { n } .$$
  4. By expanding the exponential functions in \(\mathrm { M } _ { Z } ( \theta )\), show that, for large \(n\), $$\mathrm { M } _ { Z } ( \theta ) \approx \left( 1 + \frac { \theta ^ { 2 } } { 2 n } \right) ^ { n }$$
  5. Use the result \(\mathrm { e } ^ { y } = \lim _ { n \rightarrow \infty } \left( 1 + \frac { y } { n } \right) ^ { n }\) to find the limit of \(\mathrm { M } _ { Z } ( \theta )\) as \(n \rightarrow \infty\), and deduce the approximate distribution of \(Z\) for large \(n\).
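The limiting behaviour in parts (iv) and (v) can be checked numerically by evaluating the stated expression for \(\mathrm{M}_Z(\theta)\) at increasing \(n\) and comparing it with \(\mathrm{e}^{\theta^2 / 2}\), the mgf of the standard Normal distribution. A minimal sketch with an illustrative function name:

```python
from math import exp, sqrt

def mgf_z(theta, n):
    """The expression for M_Z(theta) given in part (iii)."""
    a = theta * sqrt(3) / sqrt(2 * n)
    return ((1 + exp(a) + exp(-a)) / 3) ** n

theta = 1.0
for n in (10, 100, 10_000):
    print(n, mgf_z(theta, n), exp(theta**2 / 2))
```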
OCR MEI S4 2013 June Q3
3
  1. Explain the meaning of the following terms in the context of hypothesis testing: Type I error, Type II error, operating characteristic, power.
  2. A test is to be carried out concerning a parameter \(\theta\). The null hypothesis is that \(\theta\) has the particular value \(\theta _ { 0 }\). The alternative hypothesis is \(\theta \neq \theta _ { 0 }\). Draw a sketch of the operating characteristic for a perfect test that never makes an error.
  3. The random variable \(X\) is distributed as \(\mathrm { N } ( \mu , 9 )\). A random sample of size 25 is available. The null hypothesis \(\mu = 0\) is to be tested against the alternative hypothesis \(\mu \neq 0\). The null hypothesis will be accepted if \(- 1 < \bar { x } < 1\) where \(\bar { x }\) is the value of the sample mean, otherwise it will be rejected. Calculate the probability of a Type I error. Calculate the probability of a Type II error if in fact \(\mu = 0.5\); comment on the value of this probability.
  4. Without carrying out any further calculations, draw a sketch of the operating characteristic for the test in part (iii).
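The two error probabilities in part (iii) come from the sampling distribution \(\bar{X} \sim \mathrm{N}(\mu, 9/25)\). The sketch below evaluates them and, as an aid to part (iv), defines the operating characteristic as the probability of accepting \(\mathrm{H}_0\) for a given true \(\mu\); the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

sigma2, n = 9, 25
se = sqrt(sigma2 / n)          # standard deviation of the sample mean
lower, upper = -1.0, 1.0       # H0 is accepted when the sample mean lies in this range

def accept_probability(mu):
    """Operating characteristic: P(accept H0) when the true mean is mu."""
    d = NormalDist(mu=mu, sigma=se)
    return d.cdf(upper) - d.cdf(lower)

type_i = 1 - accept_probability(0.0)   # reject H0 although mu = 0
type_ii = accept_probability(0.5)      # accept H0 although mu = 0.5

print(round(type_i, 4), round(type_ii, 4))
```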
OCR MEI S4 2013 June Q4
4
  1. Explain the advantages of randomisation and replication in a statistically designed experiment.
  2. The usual statistical model underlying the one-way analysis of variance is given, in the usual notation, by $$x _ { i j } = \mu + \alpha _ { i } + e _ { i j }$$ where \(x _ { i j }\) denotes the \(j\) th observation on the \(i\) th treatment. Define carefully all the terms in this model and state the properties of the term that represents experimental error.
  3. A trial of five fertilisers is carried out at an agricultural research station according to a completely randomised design in which each fertiliser is applied to four experimental plots of a crop (so that there are 20 experimental units altogether). The sums of squares in a one-way analysis of variance of the resulting data on yields of the crop are as follows.
    | Source of variation | Sum of squares |
    |---|---|
    | Between fertilisers | 219.2 |
    | Residual | 304.5 |
    | Total | 523.7 |
    State the customary null and alternative hypotheses that are tested. Provide the degrees of freedom for each sum of squares. Hence copy and complete the analysis of variance table and carry out the test at the 5\% level.
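The remaining columns of the analysis of variance table follow mechanically from the sums of squares and the design (5 fertilisers, 20 plots). A minimal sketch with illustrative variable names; the critical value in the final comment is the tabulated figure, not something the code computes.

```python
treatments, n_total = 5, 20
ss_between, ss_residual = 219.2, 304.5

df_between = treatments - 1          # between fertilisers
df_residual = n_total - treatments   # residual
df_total = n_total - 1

ms_between = ss_between / df_between
ms_residual = ss_residual / df_residual
f_ratio = ms_between / ms_residual

print(df_between, df_residual, df_total)
print(round(ms_between, 2), round(ms_residual, 2), round(f_ratio, 3))
# Compare f_ratio with the upper 5% point of F(4, 15), roughly 3.06 in standard tables.
```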