OCR S4 2008 June — Question 6 15 marks

Exam Board: OCR
Module: S4 (Statistics 4)
Year: 2008
Session: June
Marks: 15
Topic: Cumulative distribution functions
Type: Distribution of order statistics
Difficulty: Challenging (+1.8). This S4 question requires an understanding of order statistics (showing \(\mathrm{P}(S > s)\) for the minimum), deriving PDFs from CDFs, calculating expectations to verify and construct unbiased estimators, and comparing efficiency via variance calculations. While methodical, it demands multiple advanced statistical concepts and extended multi-step reasoning beyond typical A-level core content, placing it well above average difficulty but not at the extreme end for Further Maths statistics.
Spec: 5.03a Continuous random variables: pdf and cdf; 5.05b Unbiased estimates: of population mean and variance; 5.05c Hypothesis test: normal distribution for population mean

6 The continuous random variable \(Y\) has cumulative distribution function given by $$\mathrm { F } ( y ) = \begin{cases} 0 & y < a , \\ 1 - \frac { a ^ { 3 } } { y ^ { 3 } } & y \geqslant a , \end{cases}$$ where \(a\) is a positive constant. A random sample of 3 observations, \(Y _ { 1 } , Y _ { 2 } , Y _ { 3 }\), is taken, and the smallest is denoted by \(S\).
  1. Show that \(\mathrm { P } ( S > s ) = \left( \frac { a } { s } \right) ^ { 9 }\) and hence obtain the probability density function of \(S\).
  2. Show that \(S\) is not an unbiased estimator of \(a\), and construct an unbiased estimator, \(T _ { 1 }\), based on \(S\).

     It is given that \(T _ { 2 }\), where \(T _ { 2 } = \frac { 2 } { 9 } \left( Y _ { 1 } + Y _ { 2 } + Y _ { 3 } \right)\), is another unbiased estimator of \(a\).
  3. Given that \(\operatorname { Var } ( Y ) = \frac { 3 } { 4 } a ^ { 2 }\) and \(\operatorname { Var } ( S ) = \frac { 9 } { 448 } a ^ { 2 }\), determine which of \(T _ { 1 }\) and \(T _ { 2 }\) is the more efficient estimator.
  4. The values of \(Y\) for a particular sample are 12.8, 4.5 and 7.0. Find the values of \(T _ { 1 }\) and \(T _ { 2 }\) for this sample, and give a reason, unrelated to efficiency, why \(T _ { 1 }\) gives a better estimate of \(a\) than \(T _ { 2 }\) in this case.
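
The arithmetic behind parts 2 to 4 can be checked with a short sketch in exact fractions. It assumes the result of part 1, namely that \(S\) has density \(9a^{9}/s^{10}\) for \(s \geqslant a\), which gives \(\mathrm{E}(S) = \frac{9}{8}a\) and hence \(T_1 = \frac{8}{9}S\). The names `T1_SCALE`, `T2_SCALE` and `estimates` are illustrative only and do not come from the question or its mark scheme.

```python
from fractions import Fraction

# Assumption: E(S) = 9a/8 (from the pdf 9a^9 / s^10 for s >= a),
# so T1 = (8/9) S is unbiased. T2 = (2/9)(Y1 + Y2 + Y3) is given.
T1_SCALE = Fraction(8, 9)
T2_SCALE = Fraction(2, 9)

# Variances expressed as multiples of a^2, using the values given in part 3.
var_S = Fraction(9, 448)
var_Y = Fraction(3, 4)
var_T1 = T1_SCALE**2 * var_S        # Var(cS) = c^2 Var(S)
var_T2 = T2_SCALE**2 * 3 * var_Y    # sum of three independent observations


def estimates(sample):
    """Return (T1, T2) as exact fractions for a sample of observations."""
    values = [Fraction(str(y)) for y in sample]
    return T1_SCALE * min(values), T2_SCALE * sum(values)


t1, t2 = estimates([12.8, 4.5, 7.0])
print(var_T1, var_T2)          # 1/63 1/9
print(float(t1), float(t2))    # 4.0 5.4
```

Under these assumptions \(\operatorname{Var}(T_1) = \frac{1}{63}a^2\) is smaller than \(\operatorname{Var}(T_2) = \frac{1}{9}a^2\), so \(T_1\) is the more efficient estimator. For the sample in part 4, \(T_2 = 5.4\) exceeds the smallest observation, 4.5, which is impossible for \(a\) because every observation satisfies \(Y \geqslant a\); \(T_1 = 4.0\) does not have this problem.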

Answer | Marks | Guidance
Integrate \(k_1 e^{a*x}\) to obtain \(k_2 e^{a*x}\) | M1 | any constants involving π or not; any n
Obtain correct indefinite integral of their \(k_1 e^{a*x}\) | A1 | 
Substitute limits to obtain \(\frac{1}{2}e(e^3 - 1)\) or \(\frac{1}{2}(e^3 - 1)\) | A1 | or exact equiv perhaps involving \(e^0\)
Integrate \(k(2x - 1)^n\) to obtain \(k'(2x - 1)^{n+1}\) | M1 | any constants involving π or not; any n
Obtain correct indefinite integral of their \(k(2x - 1)^n\) | A1 | 
Substitute limits to obtain \(\frac{1}{18}\pi\) or \(\frac{1}{8}\) | A1 | or exact equiv
Apply formula \(\int \pi y^2 dx\) at least once | B1 | for \(y = e^{3x}\) and/or \(y = (2x - 1)^4\)
Subtract, correct way found, attempts at volumes | M1 | allow with π missing but must involve
Obtain \(\frac{1}{3}\pi e^3 - \frac{2}{3}\pi\) | A1 | or similarly simplified exact equiv

\hfill \mbox{\textit{OCR S4 2008 Q6 [15]}}