Questions — OCR MEI S4 (40 questions)

OCR MEI S4 2016 June Q2
2 The random variable \(X\) has probability density function \(\mathrm { f } ( x )\) where $$\mathrm { f } ( x ) = \lambda \mathrm { e } ^ { - \lambda x } , \quad x > 0 .$$
  1. Obtain the moment generating function (mgf) of \(X\).
  2. Use the mgf to find \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\).
The random variable \(Y\) is defined as follows: $$Y = X _ { 1 } + \ldots + X _ { n } ,$$ where the \(X _ { i }\) are independently and identically distributed as \(X\).
  3. Write down expressions for \(\mathrm { E } ( Y )\) and \(\operatorname { Var } ( Y )\). Obtain the \(\operatorname { mgf }\) of \(Y\).
  4. Find the \(\operatorname { mgf }\) of \(Z\) where \(Z = \frac { Y - \frac { n } { \lambda } } { \frac { \sqrt { n } } { \lambda } }\).
  5. By considering the logarithm of the mgf of \(Z\), show that the distribution of \(Z\) tends to the standard Normal distribution as \(n\) tends to infinity.
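A numerical sanity check of the limit in part 5 can be sketched as follows. It assumes the standard results \(\mathrm{M}_Y(t) = (1 - t/\lambda)^{-n}\) and hence \(\mathrm{M}_Z(t) = \mathrm{e}^{-t\sqrt{n}} (1 - t/\sqrt{n})^{-n}\) for \(t < \sqrt{n}\), which is what parts 3 and 4 should lead to; the function name `log_mgf_z` is purely illustrative.

```python
import math


def log_mgf_z(t: float, n: int) -> float:
    """Logarithm of the assumed mgf of Z = (Y - n/lambda) / (sqrt(n)/lambda).

    Assumes M_Z(t) = exp(-t*sqrt(n)) * (1 - t/sqrt(n))**(-n) for t < sqrt(n).
    The rate lambda cancels in the standardisation, so it is not a parameter.
    """
    root_n = math.sqrt(n)
    if t >= root_n:
        raise ValueError("the mgf only exists for t < sqrt(n)")
    # log1p keeps precision when t / sqrt(n) is small
    return -t * root_n - n * math.log1p(-t / root_n)


t = 1.0
for n in (10, 1_000, 1_000_000):
    # The log-mgf should approach t**2 / 2, the log-mgf of N(0, 1)
    print(n, log_mgf_z(t, n), "target:", t**2 / 2)
```

The printed values should decrease towards \(t^2/2\), consistent with the series expansion \(\ln \mathrm{M}_Z(t) = \tfrac{1}{2}t^2 + \tfrac{t^3}{3\sqrt{n}} + \ldots\)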
OCR MEI S4 2016 June Q3
3 A large department in a university wished to compare the standards of literacy and numeracy of its students. A random sample of 24 students was taken and sub-divided, randomly, into two groups of 12. The students in one group took a literacy assessment (scores denoted by \(x\)); the students in the other group took a numeracy assessment (scores denoted by \(y\)). The two assessments were designed to give the same distributions of scores when taken by random samples from the general population. The scores obtained by the students on the two assessments are shown in the table.
\(x\)  23  42  43  46  48  48  50  54  58  59  62  65
\(y\)  44  36  63  55  53  58  63  80  61  57  83  54
$$\sum x = 598 \quad \sum x ^ { 2 } = 31196 \quad \sum y = 707 \quad \sum y ^ { 2 } = 43543$$
  1. Carry out an appropriate \(t\) test, at the \(5 \%\) level of significance, to compare the standards of literacy and numeracy.
  2. State the distributional assumptions required for the \(t\) test to be valid. Name the test that you would use if the assumptions required for the \(t\) test are thought not to hold. State the hypotheses for this new test. Explain, in general terms, which of the two tests is more powerful, and why.
A statistician at the university looked at the data and commented that a paired sample design would have been better.
  3. Explain how a paired sample design would be applied in this context, and how the data would be analysed. Explain also why it would be better than the design used.
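The arithmetic for part 1 can be sketched from the quoted summary statistics. This is a minimal sketch that assumes the intended test is the two-sided pooled (equal-variance) two-sample \(t\) test with 22 degrees of freedom; the variable names are illustrative and the critical value 2.074 is taken from standard \(t\) tables.

```python
import math

# Summary statistics quoted in the question
n_x, n_y = 12, 12
sum_x, sum_x2 = 598, 31196
sum_y, sum_y2 = 707, 43543

mean_x = sum_x / n_x
mean_y = sum_y / n_y

# Corrected sums of squares, S_xx = sum(x^2) - (sum x)^2 / n
s_xx = sum_x2 - sum_x**2 / n_x
s_yy = sum_y2 - sum_y**2 / n_y

# Pooled variance estimate (assumes equal population variances)
pooled_var = (s_xx + s_yy) / (n_x + n_y - 2)
std_err = math.sqrt(pooled_var * (1 / n_x + 1 / n_y))
t_stat = (mean_x - mean_y) / std_err

T_CRIT = 2.074  # two-tailed 5% point of t with 22 df, from tables
reject = abs(t_stat) > T_CRIT

print(f"pooled variance = {pooled_var:.2f}, t = {t_stat:.3f}, reject H0: {reject}")
```

Under these assumptions the statistic lies inside the acceptance region, so the sketch would not reject equality of the two population means at the 5% level.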
OCR MEI S4 Q4
12 marks
4 An experiment is carried out to compare five industrial paints, A, B, C, D, E, that are intended to be used to protect exterior surfaces in polluted urban environments. Five different types of surface (I, II, III, IV, V) are to be used in the experiment, and five specimens of each type of surface are available. Five different external locations ( \(1,2,3,4,5\) ) are used in the experiment. The paints are applied to the specimens of the surfaces which are then left in the locations for a period of six months. At the end of this period, a "score" is given to indicate how effective the paint has been in protecting the surface.
  1. Name a suitable experimental design for this trial and give an example of an experimental layout.
Initial analysis of the data indicates that any differences between the types of surface are negligible, as also are any differences between the locations. It is therefore decided to analyse the data by one-way analysis of variance.
  2. State the usual model, including the accompanying distributional assumptions, for the one-way analysis of variance. Interpret the terms in the model.
  3. The data for analysis are as follows. Higher scores indicate better performance. The underlying distributions of strengths are assumed to be Normal for both suppliers, with variances 2.45 for supplier A and 1.40 for supplier B.
  4. Test at the \(5 \%\) level of significance whether it is reasonable to assume that the mean strengths from the two suppliers are equal.
  5. Provide a two-sided 90\% confidence interval for the true mean difference.
  6. Show that the test procedure used in part (i), with samples of sizes 7 and 5 and a \(5 \%\) significance level, leads to acceptance of the null hypothesis of equal means if \(- 1.556 < \bar { x } - \bar { y } < 1.556\), where \(\bar { x }\) and \(\bar { y }\) are the observed sample means from suppliers A and B. Hence find the probability of a Type II error for this test procedure if in fact the true mean strength from supplier A is 2.0 units more than that from supplier B.
  7. A manager suggests that the Wilcoxon rank sum test should be used instead, comparing the median strengths for the samples of sizes 7 and 5. Give one reason why this suggestion might be sensible and two why it might not.
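The acceptance region and Type II error probability in part 6 can be checked numerically. The sketch below assumes the test is the two-sided Normal test with known variances 2.45 and 1.40 and sample sizes 7 and 5, as stated in the question; the variable names are illustrative.

```python
import math
from statistics import NormalDist

var_a, var_b = 2.45, 1.40  # known variances quoted in the question
n_a, n_b = 7, 5

# Standard deviation of (x_bar - y_bar) under the Normal model
sd_diff = math.sqrt(var_a / n_a + var_b / n_b)

z = NormalDist()  # standard Normal
half_width = z.inv_cdf(0.975) * sd_diff  # boundary of the acceptance region

# Type II error: H0 is accepted although the true difference is 2.0
true_diff = 2.0
type_ii = z.cdf((half_width - true_diff) / sd_diff) - z.cdf(
    (-half_width - true_diff) / sd_diff
)

print(f"accept H0 if |x_bar - y_bar| < {half_width:.3f}")
print(f"P(Type II error | difference = 2.0) = {type_ii:.4f}")
```

The first printed value should reproduce the 1.556 quoted in the question, which is a useful check that the variance of the difference has been set up correctly.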
OCR MEI S4 2009 June Q1
1 An industrial process produces components. Some of the components contain faults. The number of faults in a component is modelled by the random variable \(X\) with probability function $$\mathrm { P } ( X = x ) = \theta ( 1 - \theta ) ^ { x } \quad \text { for } x = 0,1,2 , \ldots$$ where \(\theta\) is a parameter with \(0 < \theta < 1\). The numbers of faults in different components are independent.
A random sample of \(n\) components is inspected. \(n _ { 0 }\) are found to have no faults, \(n _ { 1 }\) to have one fault and the remainder \(\left( n - n _ { 0 } - n _ { 1 } \right)\) to have two or more faults.
  1. Find \(\mathrm { P } ( X \geqslant 2 )\) and hence show that the likelihood is $$\mathrm { L } ( \theta ) = \theta ^ { n _ { 0 } + n _ { 1 } } ( 1 - \theta ) ^ { 2 n - 2 n _ { 0 } - n _ { 1 } }$$
  2. Find the maximum likelihood estimator \(\hat { \theta }\) of \(\theta\). You are not required to verify that any turning point you locate is a maximum.
  3. Show that \(\mathrm { E } ( X ) = \frac { 1 - \theta } { \theta }\). Deduce that another plausible estimator of \(\theta\) is \(\tilde { \theta } = \frac { 1 } { 1 + \bar { X } }\) where \(\bar { X }\) is the sample mean. What additional information is needed in order to calculate the value of this estimator?
  4. You are given that, in large samples, \(\tilde { \theta }\) may be taken as Normally distributed with mean \(\theta\) and variance \(\theta ^ { 2 } ( 1 - \theta ) / n\). Use this to obtain a \(95 \%\) confidence interval for \(\theta\) for the case when 100 components are inspected and it is found that 92 have no faults, 6 have one fault and the remaining 2 have exactly four faults each.
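The numbers in parts 2 and 4 can be sketched as follows. The closed form \(\hat{\theta} = (n_0 + n_1)/(2n - n_0)\) used here is an assumption: it is what setting the derivative of the log-likelihood in part 1 to zero appears to give. The interval plugs \(\tilde{\theta}\) into the quoted large-sample variance.

```python
import math

n, n0, n1 = 100, 92, 6  # sample size, zero-fault count, one-fault count

# Assumed MLE from d/d(theta) of the log-likelihood in part 1
theta_hat = (n0 + n1) / (2 * n - n0)

# Part 4 gives full information: the remaining 2 components have 4 faults each
total_faults = 0 * 92 + 1 * 6 + 4 * 2
x_bar = total_faults / n
theta_tilde = 1 / (1 + x_bar)

# Large-sample variance theta^2 (1 - theta) / n, evaluated at theta_tilde
var_est = theta_tilde**2 * (1 - theta_tilde) / n
half_width = 1.96 * math.sqrt(var_est)
ci = (theta_tilde - half_width, theta_tilde + half_width)

print(f"theta_hat = {theta_hat:.4f}, theta_tilde = {theta_tilde:.4f}")
print(f"approximate 95% CI: ({ci[0]:.4f}, {ci[1]:.4f})")
```

Note that \(\tilde{\theta}\) needs the actual fault counts for the components with two or more faults, which is the additional information part 3 asks about.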
OCR MEI S4 2009 June Q2
2
  1. The random variable \(Z\) has the standard Normal distribution with probability density function $$\mathrm { f } ( z ) = \frac { 1 } { \sqrt { 2 \pi } } \mathrm { e } ^ { - z ^ { 2 } / 2 } , \quad - \infty < z < \infty$$ Obtain the moment generating function of \(Z\).
  2. Let \(\mathrm { M } _ { Y } ( t )\) denote the moment generating function of the random variable \(Y\). Show that the moment generating function of the random variable \(a Y + b\), where \(a\) and \(b\) are constants, is \(\mathrm { e } ^ { b t } \mathrm { M } _ { Y } ( a t )\).
  3. Use the results in parts (i) and (ii) to obtain the moment generating function \(\mathrm { M } _ { X } ( t )\) of the random variable \(X\) having the Normal distribution with parameters \(\mu\) and \(\sigma ^ { 2 }\).
  4. If \(W = \mathrm { e } ^ { X }\) where \(X\) is as in part (iii), \(W\) is said to have a lognormal distribution. Show that, for any positive integer \(k\), the expected value of \(W ^ { k }\) is \(\mathrm { M } _ { X } ( k )\). Use this result to find the expected value and variance of the lognormal distribution.
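One way the chain of results in parts 1 to 4 might fit together is sketched below. It takes \(\mathrm{M}_Z(t) = \mathrm{e}^{t^2/2}\) as the outcome of part 1 and writes \(X = \sigma Z + \mu\); this is an outline under those assumptions rather than a full solution.

$$\mathrm{M}_X(t) = \mathrm{e}^{\mu t}\, \mathrm{M}_Z(\sigma t) = \exp\left( \mu t + \tfrac{1}{2} \sigma^2 t^2 \right)$$

$$\mathrm{E}\left( W^k \right) = \mathrm{E}\left( \mathrm{e}^{kX} \right) = \mathrm{M}_X(k) = \exp\left( k\mu + \tfrac{1}{2} k^2 \sigma^2 \right)$$

$$\mathrm{E}(W) = \mathrm{e}^{\mu + \sigma^2/2}, \qquad \operatorname{Var}(W) = \mathrm{M}_X(2) - \left\{ \mathrm{M}_X(1) \right\}^2 = \mathrm{e}^{2\mu + \sigma^2} \left( \mathrm{e}^{\sigma^2} - 1 \right)$$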
OCR MEI S4 2009 June Q3
3
  1. At a waste disposal station, two methods for incinerating some of the rubbish are being compared. Of interest is the amount of particulates in the exhaust, which can be measured over the working day in a convenient unit of concentration. It is assumed that the underlying distributions of concentrations of particulates are Normal. It is also assumed that the underlying variances are equal. During a period of several months, measurements are made for method A on a random sample of 10 working days and for method B on a separate random sample of 7 working days, with results, in the convenient unit, as follows.
    Method A  124.8  136.4  116.6  129.1  140.7  120.2  124.6  127.5  111.8  130.3
    Method B  130.4  136.2  119.8  150.6  143.5  126.1  130.7
    Use a \(t\) test at the \(10 \%\) level of significance to examine whether either method is better in resulting, on the whole, in a lower concentration of particulates. State the null and alternative hypotheses under test.
  2. The company's statistician criticises the design of the trial in part (i) on the grounds that it is not paired. Summarise the arguments the statistician will have used. A new trial is set up with a paired design, measuring the concentrations of particulates on a random sample of 9 paired occasions. The results are as follows.
    Pair      I      II     III    IV     V      VI     VII    VIII   IX
    Method A  119.6  127.6  141.3  139.5  141.3  124.1  116.6  136.2  128.8
    Method B  112.2  128.8  130.2  134.0  135.1  120.4  116.9  134.4  125.2
    Use a \(t\) test at the \(5 \%\) level of significance to examine the same hypotheses as in part (i). State the underlying distributional assumption that is needed in this case.
  3. State the names of procedures that could be used in the situations of parts (i) and (ii) if the underlying distributional assumptions could not be made. What hypotheses would be under test?
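The two test statistics in parts 1 and 2 can be computed with the standard library. This sketch assumes two-sided alternatives in both cases (the question asks whether "either method is better") and takes the critical values 1.753 and 2.306 from standard \(t\) tables; the variable names are illustrative.

```python
import math
from statistics import mean, variance

# Part 1: independent samples, pooled-variance t test (15 df)
a = [124.8, 136.4, 116.6, 129.1, 140.7, 120.2, 124.6, 127.5, 111.8, 130.3]
b = [130.4, 136.2, 119.8, 150.6, 143.5, 126.1, 130.7]

pooled_var = ((len(a) - 1) * variance(a) + (len(b) - 1) * variance(b)) / (
    len(a) + len(b) - 2
)
t_unpaired = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
T_CRIT_15 = 1.753  # two-tailed 10% point of t with 15 df, from tables

# Part 2: paired design, t test on the differences (8 df)
pa = [119.6, 127.6, 141.3, 139.5, 141.3, 124.1, 116.6, 136.2, 128.8]
pb = [112.2, 128.8, 130.2, 134.0, 135.1, 120.4, 116.9, 134.4, 125.2]
d = [x - y for x, y in zip(pa, pb)]
t_paired = mean(d) / math.sqrt(variance(d) / len(d))
T_CRIT_8 = 2.306  # two-tailed 5% point of t with 8 df, from tables

print(f"unpaired: t = {t_unpaired:.3f}, reject H0: {abs(t_unpaired) > T_CRIT_15}")
print(f"paired:   t = {t_paired:.3f}, reject H0: {abs(t_paired) > T_CRIT_8}")
```

Under these assumptions the unpaired statistic is not significant while the paired one is, which illustrates the statistician's point about removing day-to-day variation.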
OCR MEI S4 2009 June Q4
4
  1. Describe, with the aid of a specific example, an experimental situation for which a Latin square design is appropriate, indicating carefully the features which show that a completely randomised or randomised blocks design would be inappropriate.
  2. The model for the one-way analysis of variance may be written, in a customary notation, as $$x _ { i j } = \mu + \alpha _ { i } + e _ { i j }$$ State the distributional assumptions underlying \(e _ { i j }\) in this model. What is the interpretation of the term \(\alpha _ { i }\)?
  3. An experiment for comparing 5 treatments is carried out, with a total of 20 observations. A partial one-way analysis of variance table for the analysis of the results is as follows.
    Source of variation | Sums of squares | Degrees of freedom | Mean squares | Mean square ratio
    Between treatments  |                 |                    |              |
    Residual            | 68.76           |                    |              |
    Total               | 161.06          |                    |              |
    Copy and complete the table, and carry out the appropriate test using a \(1 \%\) significance level.
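Completing the table is a matter of subtraction and division, sketched below. The degrees of freedom follow from 5 treatments and 20 observations; the critical value 4.89 for \(F_{4,15}\) at the 1% level is taken from standard tables, and the variable names are illustrative.

```python
total_ss, resid_ss = 161.06, 68.76  # given in the partial table
k, n_obs = 5, 20  # treatments, total observations

between_ss = total_ss - resid_ss
df_between, df_resid, df_total = k - 1, n_obs - k, n_obs - 1

ms_between = between_ss / df_between
ms_resid = resid_ss / df_resid
f_ratio = ms_between / ms_resid

F_CRIT = 4.89  # upper 1% point of F(4, 15), from tables

print(f"Between: SS = {between_ss:.2f}, df = {df_between}, MS = {ms_between:.3f}")
print(f"Residual: SS = {resid_ss:.2f}, df = {df_resid}, MS = {ms_resid:.3f}")
print(f"Total: SS = {total_ss:.2f}, df = {df_total}")
print(f"F = {f_ratio:.3f}, reject H0 at 1%: {f_ratio > F_CRIT}")
```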
OCR MEI S4 2011 June Q1
1 The random variable \(X\) has the Normal distribution with mean 0 and variance \(\theta\), so that its probability density function is $$\mathrm { f } ( x ) = \frac { 1 } { \sqrt { 2 \pi \theta } } \mathrm { e } ^ { - x ^ { 2 } / 2 \theta } , \quad - \infty < x < \infty$$ where \(\theta ( \theta > 0 )\) is unknown. A random sample of \(n\) observations from \(X\) is denoted by \(X _ { 1 } , X _ { 2 } , \ldots , X _ { n }\).
  1. Find \(\hat { \theta }\), the maximum likelihood estimator of \(\theta\).
  2. Show that \(\hat { \theta }\) is an unbiased estimator of \(\theta\).
  3. In large samples, the variance of \(\hat { \theta }\) may be estimated by \(\frac { 2 \hat { \theta } ^ { 2 } } { n }\). Use this and the results of parts (i) and (ii) to find an approximate \(95 \%\) confidence interval for \(\theta\) in the case when \(n = 100\) and \(\Sigma X _ { i } ^ { 2 } = 1000\).
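The interval in part 3 can be sketched numerically. It assumes the estimator from part 1 is \(\hat{\theta} = \frac{1}{n} \sum X_i^2\), which is what maximising the Normal likelihood with known mean 0 gives, and uses the usual Normal approximation with multiplier 1.96.

```python
import math

n = 100
sum_sq = 1000  # given value of sum of X_i squared

theta_hat = sum_sq / n  # assumed MLE: mean of the squared observations
std_err = math.sqrt(2 * theta_hat**2 / n)  # quoted large-sample variance estimate

half_width = 1.96 * std_err
ci = (theta_hat - half_width, theta_hat + half_width)

print(f"theta_hat = {theta_hat}, approximate 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```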
OCR MEI S4 2011 June Q2
2 The random variable \(X\) has the \(\chi _ { n } ^ { 2 }\) distribution. This distribution has moment generating function \(\mathrm { M } ( \theta ) = ( 1 - 2 \theta ) ^ { - \frac { 1 } { 2 } n }\), where \(\theta < \frac { 1 } { 2 }\).
  1. Verify the expression for \(\mathrm { M } ( \theta )\) quoted above for the cases \(n = 2\) and \(n = 4\), given that the probability density functions of \(X\) in these cases are as follows. $$\begin{array} { l l } n = 2 : & \mathrm { f } ( x ) = \frac { 1 } { 2 } \mathrm { e } ^ { - \frac { 1 } { 2 } x } \quad ( x > 0 ) \\ n = 4 : & \mathrm { f } ( x ) = \frac { 1 } { 4 } x \mathrm { e } ^ { - \frac { 1 } { 2 } x } \quad ( x > 0 ) \end{array}$$
  2. For the general case, use \(\mathrm { M } ( \theta )\) to find the mean and variance of \(X\) in terms of \(n\).
  3. \(Y _ { 1 } , Y _ { 2 } , \ldots , Y _ { k }\) are independent random variables, each with the \(\chi _ { 1 } ^ { 2 }\) distribution. Show that \(W = \sum _ { i = 1 } ^ { k } Y _ { i }\) has the \(\chi _ { k } ^ { 2 }\) distribution.
  4. Use the Central Limit Theorem to find an approximation for \(\mathrm { P } ( W < 118.5 )\) for the case \(k = 100\).
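The approximation in part 4 can be sketched as below. It relies on the mean \(n\) and variance \(2n\) of \(\chi_n^2\) that part 2 should produce, so that \(W\) is treated as approximately \(\mathrm{N}(100, 200)\) when \(k = 100\); no continuity correction is applied since \(W\) is continuous.

```python
from statistics import NormalDist

k = 100
# Each chi-squared(1) variable has mean 1 and variance 2, so W has mean k, variance 2k
w_approx = NormalDist(mu=k, sigma=(2 * k) ** 0.5)

p = w_approx.cdf(118.5)
print(f"P(W < 118.5) is approximately {p:.4f}")
```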
OCR MEI S4 2011 June Q3
3
  1. Explain the meaning of the following terms in the context of hypothesis testing: Type I error, Type II error, operating characteristic, power.
  2. A market research organisation is designing a sample survey to investigate whether expenditure on everyday food items has increased in 2011 compared with 2010. For one of the populations being studied, the random variable \(X\) is used to model weekly expenditure, in \(\pounds\), on these items in 2011, where \(X\) is Normally distributed with mean \(\mu\) and variance \(\sigma ^ { 2 }\). As the corresponding mean value in 2010 was 94, the hypotheses to be examined are $$\begin{aligned} & \mathrm { H } _ { 0 } : \mu = 94 \\ & \mathrm { H } _ { 1 } : \mu > 94 \end{aligned}$$ By comparison with the corresponding 2010 value, \(\sigma ^ { 2 }\) is assumed to be 25.
    The following criteria for the survey are laid down.
    • If in fact \(\mu = 94\), the probability of concluding that \(\mu > 94\) must be only \(2 \%\)
    • If in fact \(\mu = 97\), the probability of concluding that \(\mu > 94\) must be \(95 \%\)
    A random sample of size \(n\) is to be taken and the usual Normal test based on \(\bar { X }\) is to be used, with a critical value of \(c\) such that \(\mathrm { H } _ { 0 }\) is rejected if the value of \(\bar { X }\) exceeds \(c\). Find \(c\) and the smallest value of \(n\) that is required.
  3. Sketch the power function of an ideal test for examining the hypotheses in part (ii).
OCR MEI S4 2011 June Q4
4
  1. Provide an example of an experimental situation where there is one factor of primary interest and where a suitable experimental design would be
    1. randomised blocks,
    2. a Latin square.
    In each case, explain carefully why the design is suitable and why the other design would not be appropriate.
  2. An industrial experiment to compare four treatments for increasing the tensile strength of steel is carried out according to a completely randomised design. For various reasons, it is not possible to use the same number of replicates for each treatment. The increases, in a suitable unit of tensile strength, are as follows.
    Treatment A  Treatment B  Treatment C  Treatment D
    10.1  21.1  9.2  22.6
    21.2  20.3  8.8  17.4
    11.6  16.0  15.2  23.1
    13.6  15.0  19.2
    [The sum of these data items is 256.8 and the sum of their squares is 4471.92.] Construct the usual one-way analysis of variance table. Carry out the appropriate test, using a \(5 \%\) significance level.
OCR MEI S4 2013 June Q1
1 Traffic engineers are studying the flow of vehicles along a road. At an initial stage of the investigation, they assume that the average flow remains the same throughout the working day. An automatic counter records the number of vehicles passing a certain point per minute during the working day. A random sample of these records is selected; the sample values are denoted by \(x _ { 1 } , x _ { 2 } , \ldots , x _ { n }\).
  1. The engineers model the underlying random variable \(X\) by a Poisson distribution with unknown parameter \(\theta\). Obtain the likelihood of \(x _ { 1 } , x _ { 2 } , \ldots , x _ { n }\) and hence find the maximum likelihood estimate of \(\theta\).
  2. Write down the maximum likelihood estimate of the probability that no vehicles pass during a minute.
  3. The engineers note that, in a sample of size 1000 with sample mean \(\bar { x } = 5\), there are no observations of zero. Suggest why this might cast some doubt on the investigation.
  4. On checking the automatic counter, the engineers find that, due to a fault, no record at all is made if no vehicle passes in a minute. They therefore model \(X\) as a Poisson random variable, again with an unknown parameter \(\theta\), except that the value \(x = 0\) cannot occur. Show that, under this model, $$\mathrm { P } ( X = x ) = \frac { \theta ^ { x } } { \left( \mathrm { e } ^ { \theta } - 1 \right) x ! } , \quad x = 1,2 , \ldots$$ and hence show that the maximum likelihood estimate of \(\theta\) satisfies the equation $$\frac { \theta \mathrm { e } ^ { \theta } } { \mathrm { e } ^ { \theta } - 1 } = \bar { x }$$
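Two numerical illustrations may help here. The first quantifies part 3 under the ordinary Poisson model with \(\hat{\theta} = \bar{x} = 5\). The second solves the equation in part 4 for \(\bar{x} = 5\) by bisection; the question does not ask for this, so it is only an illustration, and the helper `solve_truncated_mle` is a hypothetical name.

```python
import math

x_bar, n = 5.0, 1000

# Part 3: under Poisson(5), how surprising is a sample with no zeros at all?
p_zero = math.exp(-x_bar)
expected_zeros = n * p_zero
p_no_zeros = (1 - p_zero) ** n
print(f"expected zeros = {expected_zeros:.2f}, P(no zeros in sample) = {p_no_zeros:.5f}")


def solve_truncated_mle(x_bar: float, iterations: int = 200) -> float:
    """Solve theta * e^theta / (e^theta - 1) = x_bar by bisection (needs x_bar > 1)."""

    def g(theta: float) -> float:
        # theta / (1 - e^(-theta)) equals theta * e^theta / (e^theta - 1)
        return theta / (-math.expm1(-theta)) - x_bar

    lo, hi = 1e-9, x_bar  # g(lo) is about 1 - x_bar < 0 and g(x_bar) > 0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


theta_trunc = solve_truncated_mle(x_bar)
print(f"zero-truncated MLE for x_bar = 5: {theta_trunc:.4f}")
```

The truncated-model estimate comes out slightly below the sample mean, as expected, because excluding zeros inflates the observed average.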
OCR MEI S4 2013 June Q2
2 The random variable \(X\) takes values \(- 2\), 0 and 2, each with probability \(\frac { 1 } { 3 }\).
  1. Write down the values of
    (A) \(\mu\), the mean of \(X\),
    (B) \(\mathrm { E } \left( X ^ { 2 } \right)\),
    (C) \(\sigma ^ { 2 }\), the variance of \(X\).
  2. Obtain the moment generating function (mgf) of \(X\).
A random sample of \(n\) independent observations on \(X\) has sample mean \(\bar { X }\), and the standardised mean is denoted by \(Z\) where $$Z = \frac { \bar { X } - \mu } { \frac { \sigma } { \sqrt { n } } }$$
  3. Stating carefully the required general results for mgfs of sums and of linear transformations, show that the mgf of \(Z\) is $$M _ { Z } ( \theta ) = \left\{ \frac { 1 } { 3 } \left( 1 + e ^ { \frac { \theta \sqrt { 3 } } { \sqrt { 2 n } } } + e ^ { - \frac { \theta \sqrt { 3 } } { \sqrt { 2 n } } } \right) \right\} ^ { n } .$$
  4. By expanding the exponential functions in \(\mathrm { M } _ { Z } ( \theta )\), show that, for large \(n\), $$\mathrm { M } _ { Z } ( \theta ) \approx \left( 1 + \frac { \theta ^ { 2 } } { 2 n } \right) ^ { n }$$
  5. Use the result \(\mathrm { e } ^ { y } = \lim _ { n \rightarrow \infty } \left( 1 + \frac { y } { n } \right) ^ { n }\) to find the limit of \(\mathrm { M } _ { Z } ( \theta )\) as \(n \rightarrow \infty\), and deduce the approximate distribution of \(Z\) for large \(n\).
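The limit in part 5 can be checked numerically using the expression for \(\mathrm{M}_Z(\theta)\) quoted in part 3. The sketch below evaluates its logarithm for increasing \(n\) and compares it with \(\theta^2/2\), the log-mgf of the standard Normal distribution; the function name is illustrative.

```python
import math


def log_mgf_standardised_mean(theta: float, n: int) -> float:
    """Log of M_Z(theta) = {(1 + e^a + e^(-a)) / 3}^n with a = theta*sqrt(3)/sqrt(2n)."""
    a = theta * math.sqrt(3) / math.sqrt(2 * n)
    return n * math.log((1 + math.exp(a) + math.exp(-a)) / 3)


theta = 1.0
for n in (5, 50, 10_000):
    # Values should approach theta**2 / 2 as n grows
    print(n, log_mgf_standardised_mean(theta, n), "target:", theta**2 / 2)
```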
OCR MEI S4 2013 June Q3
3
  1. Explain the meaning of the following terms in the context of hypothesis testing: Type I error, Type II error, operating characteristic, power.
  2. A test is to be carried out concerning a parameter \(\theta\). The null hypothesis is that \(\theta\) has the particular value \(\theta _ { 0 }\). The alternative hypothesis is \(\theta \neq \theta _ { 0 }\). Draw a sketch of the operating characteristic for a perfect test that never makes an error.
  3. The random variable \(X\) is distributed as \(\mathrm { N } ( \mu , 9 )\). A random sample of size 25 is available. The null hypothesis \(\mu = 0\) is to be tested against the alternative hypothesis \(\mu \neq 0\). The null hypothesis will be accepted if \(- 1 < \bar { x } < 1\) where \(\bar { x }\) is the value of the sample mean, otherwise it will be rejected. Calculate the probability of a Type I error. Calculate the probability of a Type II error if in fact \(\mu = 0.5\); comment on the value of this probability.
  4. Without carrying out any further calculations, draw a sketch of the operating characteristic for the test in part (iii).
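The two probabilities in part 3 can be sketched with the standard library. The sketch uses the fact that \(\bar{X} \sim \mathrm{N}(\mu, 9/25)\) for a sample of size 25, so the standard deviation of the sample mean is 0.6; the variable names are illustrative.

```python
from statistics import NormalDist

sd_mean = (9 / 25) ** 0.5  # standard deviation of the sample mean

# Type I error: reject H0 (sample mean outside (-1, 1)) when mu = 0
under_null = NormalDist(mu=0.0, sigma=sd_mean)
type_i = 1 - (under_null.cdf(1) - under_null.cdf(-1))

# Type II error: accept H0 (sample mean inside (-1, 1)) when mu = 0.5
under_alt = NormalDist(mu=0.5, sigma=sd_mean)
type_ii = under_alt.cdf(1) - under_alt.cdf(-1)

print(f"P(Type I error) = {type_i:.4f}")
print(f"P(Type II error | mu = 0.5) = {type_ii:.4f}")
```

The large Type II probability at \(\mu = 0.5\) reflects how close that alternative is to the null value relative to the spread of \(\bar{X}\).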
OCR MEI S4 2013 June Q4
4
  1. Explain the advantages of randomisation and replication in a statistically designed experiment.
  2. The usual statistical model underlying the one-way analysis of variance is given, in the usual notation, by $$x _ { i j } = \mu + \alpha _ { i } + e _ { i j }$$ where \(x _ { i j }\) denotes the \(j\) th observation on the \(i\) th treatment. Define carefully all the terms in this model and state the properties of the term that represents experimental error.
  3. A trial of five fertilisers is carried out at an agricultural research station according to a completely randomised design in which each fertiliser is applied to four experimental plots of a crop (so that there are 20 experimental units altogether). The sums of squares in a one-way analysis of variance of the resulting data on yields of the crop are as follows.
    Source of variation  | Sum of squares
    Between fertilisers  | 219.2
    Residual             | 304.5
    Total                | 523.7
    State the customary null and alternative hypotheses that are tested. Provide the degrees of freedom for each sum of squares. Hence copy and complete the analysis of variance table and carry out the test at the 5\% level.
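The completed table for part 3 can be sketched as follows. The degrees of freedom follow from 5 fertilisers and 20 plots; the critical value 3.06 for \(F_{4,15}\) at the 5% level is taken from standard tables, and the variable names are illustrative.

```python
between_ss, resid_ss, total_ss = 219.2, 304.5, 523.7  # given sums of squares
k, n_obs = 5, 20  # fertilisers, experimental units

df_between, df_resid, df_total = k - 1, n_obs - k, n_obs - 1

ms_between = between_ss / df_between
ms_resid = resid_ss / df_resid
f_ratio = ms_between / ms_resid

F_CRIT = 3.06  # upper 5% point of F(4, 15), from tables

print(f"Between fertilisers: df = {df_between}, MS = {ms_between:.2f}")
print(f"Residual: df = {df_resid}, MS = {ms_resid:.2f}")
print(f"Total: df = {df_total}")
print(f"F = {f_ratio:.3f}, reject H0 at 5%: {f_ratio > F_CRIT}")
```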