OCR MEI S4 (Statistics 4) 2016 June

Question 1
1 The random variable \(X\) has a Cauchy distribution centred on \(m\). Its probability density function (pdf) is \(\mathrm { f } ( x )\) where $$\mathrm { f } ( x ) = \frac { 1 } { \pi } \frac { 1 } { 1 + ( x - m ) ^ { 2 } } , \quad \text { for } - \infty < x < \infty$$
  1. Sketch the pdf. Show that the mode and median are at \(x = m\).
  2. A sample of size 1, consisting of the observation \(x _ { 1 }\), is taken from this distribution. Show that the maximum likelihood estimate (MLE) of \(m\) is \(x _ { 1 }\).
  3. Now suppose that a sample of size 2, consisting of observations \(x _ { 1 }\) and \(x _ { 2 }\), is taken from the distribution. By considering the logarithm of the likelihood function or otherwise, show that the MLE, \(\hat { m }\), satisfies the cubic equation $$\left( 2 \hat { m } - \left( x _ { 1 } + x _ { 2 } \right) \right) \left( \hat { m } ^ { 2 } - \left( x _ { 1 } + x _ { 2 } \right) \hat { m } + 1 + x _ { 1 } x _ { 2 } \right) = 0$$
  4. Obtain expressions for the three roots of this equation. Show that if \(\left| x _ { 1 } - x _ { 2 } \right| < 2\) then only one root is real. How do you know, without doing further calculations, that in this case the real root will be the MLE of \(m\)?
  5. Obtain the three possible values of \(\hat { m }\) in the case \(x _ { 1 } = - 2\) and \(x _ { 2 } = 2\). Evaluate the likelihood function for each value of \(\hat { m }\) and comment on your answer.
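
The arithmetic in part 5 can be checked numerically. The following is a minimal Python sketch, assuming the unit-scale Cauchy pdf given in the question; the function names `likelihood` and `cubic_roots` are illustrative and do not come from the paper.

```python
import math

def likelihood(m, xs):
    """Likelihood of location m for observations xs under the unit-scale Cauchy pdf."""
    out = 1.0
    for x in xs:
        out *= 1.0 / (math.pi * (1.0 + (x - m) ** 2))
    return out

def cubic_roots(x1, x2):
    """Real roots of (2m - s)(m^2 - s*m + 1 + x1*x2) = 0, where s = x1 + x2."""
    s = x1 + x2
    roots = [s / 2.0]                      # root of the linear factor
    disc = s * s - 4.0 * (1.0 + x1 * x2)   # algebraically equal to (x1 - x2)^2 - 4
    if disc >= 0:                          # quadratic factor has real roots
        r = math.sqrt(disc)
        roots += [(s - r) / 2.0, (s + r) / 2.0]
    return sorted(roots)

x1, x2 = -2.0, 2.0
roots = cubic_roots(x1, x2)
values = [likelihood(m, (x1, x2)) for m in roots]
for m, v in zip(roots, values):
    print(f"m = {m:+.4f}  L(m) = {v:.6f}")
```

For \(x _ { 1 } = - 2\), \(x _ { 2 } = 2\) the roots are \(0\) and \(\pm \sqrt { 3 }\). The likelihood is \(1 / ( 25 \pi ^ { 2 } )\) at the middle root and \(1 / ( 16 \pi ^ { 2 } )\) at each outer root, so the sketch prints a smaller value at \(0\) than at \(\pm \sqrt { 3 }\).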
Question 2
2 The random variable \(X\) has probability density function \(\mathrm { f } ( x )\) where $$\mathrm { f } ( x ) = \lambda \mathrm { e } ^ { - \lambda x } , \quad x > 0 .$$
  1. Obtain the moment generating function (mgf) of \(X\).
  2. Use the mgf to find \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\).

The random variable \(Y\) is defined as follows: $$Y = X _ { 1 } + \ldots + X _ { n } ,$$ where the \(X _ { i }\) are independently and identically distributed as \(X\).

  3. Write down expressions for \(\mathrm { E } ( Y )\) and \(\operatorname { Var } ( Y )\). Obtain the mgf of \(Y\).
  4. Find the mgf of \(Z\) where \(Z = \frac { Y - \frac { n } { \lambda } } { \frac { \sqrt { n } } { \lambda } }\).
  5. By considering the logarithm of the mgf of \(Z\), show that the distribution of \(Z\) tends to the standard Normal distribution as \(n\) tends to infinity.
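
The limit in part 5 can be illustrated numerically. The sketch below is a hedged illustration, not a proof. It assumes the standard result that the mgf of \(X\) is \(\lambda / ( \lambda - t )\) for \(t < \lambda\), which gives \(\log \mathrm { M } _ { Z } ( t ) = - \sqrt { n } \, t - n \log ( 1 - t / \sqrt { n } )\) for \(t < \sqrt { n }\), with \(\lambda\) cancelling. The function name `log_mgf_z` is hypothetical.

```python
import math

def log_mgf_z(t, n):
    """log M_Z(t) = -sqrt(n)*t - n*log(1 - t/sqrt(n)), defined only for t < sqrt(n)."""
    root_n = math.sqrt(n)
    if t >= root_n:
        raise ValueError("the mgf of Z is only defined for t < sqrt(n)")
    # log1p keeps precision when t/sqrt(n) is small
    return -root_n * t - n * math.log1p(-t / root_n)

t = 1.0
target = t * t / 2  # log of the standard Normal mgf exp(t^2 / 2)
for n in (10, 100, 10_000, 1_000_000):
    print(n, log_mgf_z(t, n), target)
```

As \(n\) grows, the printed values approach \(t ^ { 2 } / 2\), the logarithm of the standard Normal mgf. The leading error term in the series expansion is \(t ^ { 3 } / ( 3 \sqrt { n } )\).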
Question 3
3 A large department in a university wished to compare the standards of literacy and numeracy of its students. A random sample of 24 students was taken and sub-divided, randomly, into two groups of 12. The students in one group took a literacy assessment (scores denoted by \(x\)); the students in the other group took a numeracy assessment (scores denoted by \(y\)). The two assessments were designed to give the same distributions of scores when taken by random samples from the general population. The scores obtained by the students on the two assessments are shown in the table.

| \(x\) | 23 | 42 | 43 | 46 | 48 | 48 | 50 | 54 | 58 | 59 | 62 | 65 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 44 | 36 | 63 | 55 | 53 | 58 | 63 | 80 | 61 | 57 | 83 | 54 |

$$\sum x = 598 \quad \sum x ^ { 2 } = 31196 \quad \sum y = 707 \quad \sum y ^ { 2 } = 43543$$
  1. Carry out an appropriate \(t\) test, at the \(5 \%\) level of significance, to compare the standards of literacy and numeracy.
  2. State the distributional assumptions required for the \(t\) test to be valid. Name the test that you would use if the assumptions required for the \(t\) test are thought not to hold. State the hypotheses for this new test. Explain, in general terms, which of the two tests is more powerful, and why.

A statistician at the university looked at the data and commented that a paired sample design would have been better.

  3. Explain how a paired sample design would be applied in this context, and how the data would be analysed. Explain also why it would be better than the design used.
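
The test statistic for part 1 can be reproduced from the tabulated scores. This is a minimal Python sketch, assuming that the intended test is the two-sample \(t\) test with a pooled variance estimate (that is, assuming equal population variances); the function name `pooled_t` is illustrative.

```python
import math

x = [23, 42, 43, 46, 48, 48, 50, 54, 58, 59, 62, 65]  # literacy scores
y = [44, 36, 63, 55, 53, 58, 63, 80, 61, 57, 83, 54]  # numeracy scores

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; returns (t, degrees of freedom)."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    # corrected sums of squares: sum(v^2) - n * mean^2
    ss_a = sum(v * v for v in a) - na * mean_a ** 2
    ss_b = sum(v * v for v in b) - nb * mean_b ** 2
    df = na + nb - 2
    pooled_var = (ss_a + ss_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df

t_stat, df = pooled_t(x, y)
print(f"t = {t_stat:.4f} on {df} degrees of freedom")
```

Under this assumption the statistic is about \(- 1.82\) on 22 degrees of freedom. It is then compared with the two-tailed 5% critical value of \(t _ { 22 }\) from tables, approximately 2.074.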