OCR MEI S4 2012 June — Question 1 24 marks

Exam Board: OCR MEI
Module: S4 (Statistics 4)
Year: 2012
Session: June
Marks: 24
Topic: Central limit theorem
Type: Estimator properties and bias
Difficulty: Standard +0.3. This is a structured multi-part question that guides students through standard probability theory results (mixture distributions, law of total expectation/variance) with clear hints provided. While it requires careful algebraic manipulation and understanding of sampling distributions, each part follows directly from the previous one with no novel insights required. The CLT application in part (iv) is routine for S4 level.
Spec: 5.03a Continuous random variables: pdf and cdf; 5.03b Solve problems: using pdf; 5.04a Linear combinations: E(aX+bY), Var(aX+bY)


**Part (vi)**

| Answer | Marks | Guidance |
| --- | --- | --- |
| $E(\bar{X}) = E(\bar{X}_{st}) = \frac{1}{2}(\mu_B + \mu_G) = E(X)$, i.e. they are unbiased. Clearly $\text{Var}(\bar{X}) > \text{Var}(\bar{X}_{st})$, therefore $\bar{X}_{st}$ is the more efficient. | E1, E1, M1, M1, E1 | For any attempt to compare variances. Candidates are not required to note that the variances are equal in the case $\mu_B = \mu_G$. For deduction that $\text{Var}(\bar{X}) > \text{Var}(\bar{X}_{st})$ [IFT c's variances]. More efficient. |
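
As an arithmetic check on the comparison above, one can take $\text{Var}(\bar{X}) = \text{Var}(X)/(2n)$ with $\text{Var}(X)$ from part (iii), and $\text{Var}(\bar{X}_{st}) = \sigma^2/(2n)$ because the $n$ boys and $n$ girls are independent and each weight has variance $\sigma^2$. The gap is then $(\mu_B - \mu_G)^2/(8n)$, which is zero only when $\mu_B = \mu_G$. The sketch below evaluates both variances with exact fractions. The function names and the numerical values are illustrative assumptions, not taken from the paper or the mark scheme.

```python
from fractions import Fraction


def var_simple_mean(mu_b, mu_g, sigma2, n):
    """Variance of the mean of 2n babies sampled at random (part (iv))."""
    # Var(X) for a randomly selected baby, as given in part (iii).
    var_x = sigma2 + Fraction(1, 4) * (mu_b - mu_g) ** 2
    return var_x / (2 * n)


def var_stratified_mean(sigma2, n):
    """Variance of the mean of n boys and n girls sampled independently (part (v))."""
    # The sum of 2n independent weights has variance 2n * sigma2; divide by (2n)^2.
    return Fraction(sigma2) * (2 * n) / (2 * n) ** 2


# Hypothetical values (kg and kg^2), chosen only for illustration.
mu_b, mu_g = Fraction(7, 2), Fraction(33, 10)
sigma2, n = Fraction(1, 4), 50

v_simple = var_simple_mean(mu_b, mu_g, sigma2, n)
v_strat = var_stratified_mean(sigma2, n)
gap = v_simple - v_strat  # should equal (mu_b - mu_g)^2 / (8n)

print(v_simple, v_strat, gap)
```

Exact fractions are used so that the difference between the two variances is not blurred by floating-point rounding.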

---
1 In a certain country, any baby born is equally likely to be a boy or a girl, independently for all births. The birthweight of a baby boy is given by the continuous random variable $X _ { B }$ with probability density function (pdf) $\mathrm { f } _ { B } ( x )$ and cumulative distribution function (cdf) $\mathrm { F } _ { B } ( x )$. The birthweight of a baby girl is given by the continuous random variable $X _ { G }$ with pdf $\mathrm { f } _ { G } ( x )$ and cdf $\mathrm { F } _ { G } ( x )$.

The continuous random variable $X$ denotes the birthweight of a baby selected at random.

(i) By considering

$$\mathrm { P } ( X \leqslant x ) = \mathrm { P } ( X \leqslant x \mid \text { boy } ) \mathrm { P } ( \text { boy } ) + \mathrm { P } ( X \leqslant x \mid \text { girl } ) \mathrm { P } ( \text { girl } ) ,$$

find the cdf of $X$ in terms of $\mathrm { F } _ { B } ( x )$ and $\mathrm { F } _ { G } ( x )$, and deduce that the pdf of $X$ is

$$\mathrm { f } ( x ) = \frac { 1 } { 2 } \left\{ \mathrm { f } _ { B } ( x ) + \mathrm { f } _ { G } ( x ) \right\} .$$

(ii) The birthweights of baby boys and girls have means $\mu _ { B }$ and $\mu _ { G }$ respectively. Deduce that

$$\mathrm { E } ( X ) = \frac { 1 } { 2 } \left( \mu _ { B } + \mu _ { G } \right) .$$

(iii) The birthweights of baby boys and girls have common variance $\sigma ^ { 2 }$. Find an expression for $\mathrm { E } \left( X ^ { 2 } \right)$ in terms of $\mu _ { B } , \mu _ { G }$ and $\sigma ^ { 2 }$, and deduce that

$$\operatorname { Var } ( X ) = \sigma ^ { 2 } + \frac { 1 } { 4 } \left( \mu _ { B } - \mu _ { G } \right) ^ { 2 } .$$

(iv) A random sample of size $2 n$ is taken from all the babies born in a certain period. The mean birthweight of the babies in this sample is $\bar { X }$. Write down an approximation to the sampling distribution of $\bar { X }$ if $n$ is large.

(v) Suppose instead that a stratified sample of size $2 n$ is taken by selecting $n$ baby boys at random and, independently, $n$ baby girls at random. The mean birthweight of the $2 n$ babies in this sample is $\bar { X } _ { s t }$. Write down the expected value of $\bar { X } _ { s t }$ and find the variance of $\bar { X } _ { s t }$.

(vi) Deduce that both $\bar { X }$ and $\bar { X } _ { s t }$ are unbiased estimators of the population mean birthweight. Find which is the more efficient.

*OCR MEI S4 2012 Q1 [24]*
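
The results in parts (ii) to (vi) can be checked numerically with a small Monte Carlo experiment. The sketch below repeatedly draws a simple random sample of $2n$ babies and a stratified sample of $n$ boys and $n$ girls, then compares the empirical mean and variance of $\bar{X}$ and $\bar{X}_{st}$. It assumes Normal birthweights within each sex, which the question does not state. The parameter values and function names are hypothetical, and the gap between $\mu_B$ and $\mu_G$ is deliberately exaggerated so that the difference in variance is visible.

```python
import random
import statistics


def simple_mean(rng, mu_b, mu_g, sigma, n):
    """Mean weight of 2n babies, each independently a boy or girl with probability 1/2."""
    weights = []
    for _ in range(2 * n):
        mu = mu_b if rng.random() < 0.5 else mu_g
        weights.append(rng.gauss(mu, sigma))
    return statistics.fmean(weights)


def stratified_mean(rng, mu_b, mu_g, sigma, n):
    """Mean weight of exactly n boys and n girls, sampled independently."""
    boys = [rng.gauss(mu_b, sigma) for _ in range(n)]
    girls = [rng.gauss(mu_g, sigma) for _ in range(n)]
    return statistics.fmean(boys + girls)


def simulate(mu_b=3.6, mu_g=3.0, sigma=0.4, n=25, reps=4000, seed=0):
    """Empirical mean and variance of both estimators over `reps` repetitions."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    simple = [simple_mean(rng, mu_b, mu_g, sigma, n) for _ in range(reps)]
    strat = [stratified_mean(rng, mu_b, mu_g, sigma, n) for _ in range(reps)]
    return {
        "mean_simple": statistics.fmean(simple),
        "var_simple": statistics.variance(simple),
        "mean_strat": statistics.fmean(strat),
        "var_strat": statistics.variance(strat),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.5f}")
```

With these assumed values, the formulas in the question give $\mathrm{E}(X) = 3.3$, $\operatorname{Var}(\bar{X}) = (0.16 + 0.09)/50 = 0.005$ and $\operatorname{Var}(\bar{X}_{st}) = 0.16/50 = 0.0032$, so the simulated figures should land close to these, up to sampling noise.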