Estimator properties and bias - Central limit theorem

OCR S4 2011 June Q7

7. The continuous random variable $U$ has unknown mean $\mu$ and known variance $\sigma ^ { 2 }$. In order to estimate $\mu$, two random samples, one of 4 observations of $U$ and the other of 6 observations of $U$, are taken. The sample means are denoted by $\bar { U } _ { 4 }$ and $\bar { U } _ { 6 }$ respectively. One estimator $S$, given by $S = \frac { 1 } { 2 } \left( \bar { U } _ { 4 } + \bar { U } _ { 6 } \right)$, is proposed.

Show that $S$ is unbiased and find $\operatorname { Var } ( S )$ in terms of $\sigma ^ { 2 }$. A second estimator $T$ of the form $a \bar { U } _ { 4 } + b \bar { U } _ { 6 }$ is proposed, where $a$ and $b$ are chosen such that $T$ is an unbiased estimator for $\mu$ with the smallest possible variance.
Find the values of $a$ and $b$ and the corresponding variance of $T$.
State, giving a reason, which of $S$ and $T$ is the better estimator.
Compare the efficiencies of this preferred estimator and the mean of all 10 observations.
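As a rough check on the algebra for this question, the variances can be written as multiples of $\sigma ^ { 2 }$ and compared using exact fractions. The sketch below assumes the two samples are independent, so that $\operatorname { Var } ( a \bar { U } _ { 4 } + b \bar { U } _ { 6 } ) = ( a ^ { 2 } / 4 + b ^ { 2 } / 6 ) \sigma ^ { 2 }$. The helper name `var_coeff` and the grid search are illustrative only; a written solution would minimise the variance by calculus or by completing the square.

```python
from fractions import Fraction

def var_coeff(a, b, n1=4, n2=6):
    """Coefficient of sigma^2 in Var(a*U4bar + b*U6bar), assuming independent samples."""
    return a * a * Fraction(1, n1) + b * b * Fraction(1, n2)

# S uses equal weights a = b = 1/2.
var_S = var_coeff(Fraction(1, 2), Fraction(1, 2))

# Unbiasedness needs a + b = 1. Search a on a grid of step 1/100
# and keep the weight with the smallest variance coefficient.
grid = [Fraction(k, 100) for k in range(101)]
best_a = min(grid, key=lambda a: var_coeff(a, 1 - a))
best_b = 1 - best_a
var_T = var_coeff(best_a, best_b)

# The mean of all 10 observations has variance sigma^2 / 10.
var_pooled = Fraction(1, 10)

print(var_S, best_a, best_b, var_T, var_pooled)
```

The grid minimum weights each sample mean in proportion to its sample size, which is why the resulting variance coefficient can be compared directly with that of the mean of all 10 observations.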

Edexcel S3 2014 June Q6

6. A random sample $X _ { 1 } , X _ { 2 } , \ldots , X _ { n }$ is taken from a population with mean $\mu$.

Show that $\bar { X } = \frac { 1 } { n } \left( X _ { 1 } + X _ { 2 } + \ldots + X _ { n } \right)$ is an unbiased estimator of the population mean $\mu$. A company produces small jars of coffee. Five jars of coffee were taken at random and weighed. The weights, in grams, were as follows $$\begin{array} { l l l l l } 197 & 203 & 205 & 201 & 195 \end{array}$$
Calculate unbiased estimates of the population mean and variance of the weights of the jars produced by the company. It is known from previous results that the weights are normally distributed with standard deviation 4.8 g. The manager is going to take a second random sample. He wishes to ensure that there is at least a $95 \%$ probability that the estimate of the population mean is within 1.25 g of its true value.
Find the minimum sample size required.
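The arithmetic in this question can be sketched with Python's standard `statistics` module. The sketch assumes the usual approach: the sample variance with divisor $n - 1$ as the unbiased estimate, and the condition $z \sigma / \sqrt { n } \leqslant 1.25$ with $z$ the 97.5% point of the standard normal distribution. The variable names are illustrative.

```python
import math
from statistics import NormalDist, mean, variance

weights = [197, 203, 205, 201, 195]  # grams, from the question

x_bar = mean(weights)      # unbiased estimate of the population mean
s_sq = variance(weights)   # sample variance with divisor n - 1

# Require P(|Xbar - mu| < 1.25) >= 0.95 with known sigma = 4.8,
# i.e. z * sigma / sqrt(n) <= 1.25 where z is the 97.5% normal quantile.
sigma, margin = 4.8, 1.25
z = NormalDist().inv_cdf(0.975)
n_min = math.ceil((z * sigma / margin) ** 2)

print(x_bar, s_sq, n_min)
```

Using the tabulated value $z = 1.96$ instead of the computed quantile leads to the same rounded-up sample size.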

Edexcel S4 2007 June Q2

2. The value of orders, in $\pounds$, made to a firm over the internet has distribution $\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)$. A random sample of $n$ orders is taken and $\bar { X }$ denotes the sample mean.

Write down the mean and variance of $\bar { X }$ in terms of $\mu$ and $\sigma ^ { 2 }$. A second sample of $m$ orders is taken and $\bar { Y }$ denotes the mean of this sample.
An estimator of the population mean is given by $$U = \frac { n \bar { X } + m \bar { Y } } { n + m }$$
Show that $U$ is an unbiased estimator for $\mu$.
Show that the variance of $U$ is $\frac { \sigma ^ { 2 } } { n + m }$.
State which of $\bar { X }$ or $U$ is a better estimator for $\mu$. Give a reason for your answer.

Edexcel S4 2008 June Q1

1. A random sample $X _ { 1 } , X _ { 2 } , \ldots , X _ { 10 }$ is taken from a population with mean $\mu$ and variance $\sigma ^ { 2 }$.
Determine the bias, if any, of each of the following estimators of $\mu$.
$$\begin{aligned} & \theta _ { 1 } = \frac { X _ { 3 } + X _ { 4 } + X _ { 5 } } { 3 } \\
& \theta _ { 2 } = \frac { X _ { 10 } - X _ { 1 } } { 3 } \\
& \theta _ { 3 } = \frac { 3 X _ { 1 } + 2 X _ { 2 } + X _ { 10 } } { 6 } \end{aligned}$$
Find the variance of each of these estimators.
State, giving reasons, which of these three estimators for $\mu$ is
1. the best estimator,
2. the worst estimator.
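For linear estimators such as these, the bias and variance depend only on the coefficients: for $\sum c _ { i } X _ { i }$ with independent $X _ { i }$, the expectation is $\mu \sum c _ { i }$ and the variance is $\sigma ^ { 2 } \sum c _ { i } ^ { 2 }$. A minimal sketch of that bookkeeping, with a hypothetical helper `bias_and_var`, might look like this:

```python
from fractions import Fraction as F

def bias_and_var(coeffs):
    """For sum(c_i * X_i) with independent X_i, return
    (coefficient of mu in the bias, coefficient of sigma^2 in the variance)."""
    return sum(coeffs) - 1, sum(c * c for c in coeffs)

theta1 = bias_and_var([F(1, 3), F(1, 3), F(1, 3)])   # (X3 + X4 + X5) / 3
theta2 = bias_and_var([F(1, 3), F(-1, 3)])           # (X10 - X1) / 3
theta3 = bias_and_var([F(3, 6), F(2, 6), F(1, 6)])   # (3X1 + 2X2 + X10) / 6

for name, (b, v) in [("theta1", theta1), ("theta2", theta2), ("theta3", theta3)]:
    print(name, "bias =", b, "* mu,", "variance =", v, "* sigma^2")
```

A bias coefficient of zero means the estimator is unbiased; among unbiased estimators, the one with the smaller variance coefficient is preferred.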

Edexcel S4 2011 June Q6

A random sample $X _ { 1 } , X _ { 2 } , \ldots , X _ { n }$ is taken from a population where each of the $X _ { i }$ has a continuous uniform distribution over the interval $[ 0 , \beta ]$.
The random variable $Y = \max \left\{ X _ { 1 } , X _ { 2 } , \ldots , X _ { n } \right\}$.
The probability density function of $Y$ is given by

$$f ( y ) = \left\{ \begin{array} { c c } \frac { n } { \beta ^ { n } } y ^ { n - 1 } & 0 \leqslant y \leqslant \beta \\
0 & \text { otherwise } \end{array} \right.$$

Show that $\mathrm { E } \left( Y ^ { m } \right) = \frac { n } { n + m } \beta ^ { m }$.
Write down $\mathrm { E } ( Y )$.
Using your answers to parts (a) and (b), or otherwise, show that $$\operatorname { Var } ( Y ) = \frac { n } { ( n + 1 ) ^ { 2 } ( n + 2 ) } \beta ^ { 2 }$$
State, giving your reasons, whether or not $Y$ is a consistent estimator of $\beta$. The random variables $M = 2 \bar { X }$, where $\bar { X } = \frac { 1 } { n } \left( X _ { 1 } + X _ { 2 } + \ldots + X _ { n } \right)$, and $S = k Y$, where $k$ is a constant, are both unbiased estimators of $\beta$.
Find the value of $k$ in terms of $n$.
State, giving your reasons, which of $M$ and $S$ is the better estimator of $\beta$ in this case. Five observations of $X$ are: $\quad \begin{array} { l l l l l } 8.5 & 6.3 & 5.4 & 9.1 & 7.6 \end{array}$
Calculate the better estimate of $\beta$.
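The comparison of $M$ and $S$ can be sketched numerically. The code below assumes $k = \frac { n + 1 } { n }$ (so that $\mathrm { E } ( k Y ) = \beta$), uses the stated $\operatorname { Var } ( Y )$, and uses $\operatorname { Var } ( X ) = \beta ^ { 2 } / 12$ for the uniform distribution. The function names are illustrative, and each returns the coefficient of $\beta ^ { 2 }$.

```python
from fractions import Fraction as F

def var_M_coeff(n):
    # Var(2 * Xbar) = 4 * (beta^2 / 12) / n = beta^2 / (3n)
    return F(1, 3 * n)

def var_S_coeff(n):
    # S = kY with k = (n + 1)/n, so Var(S) = k^2 * n / ((n + 1)^2 (n + 2)) * beta^2
    k = F(n + 1, n)
    return k * k * F(n, (n + 1) ** 2 * (n + 2))

data = [F("8.5"), F("6.3"), F("5.4"), F("9.1"), F("7.6")]
n = len(data)

est_M = 2 * sum(data) / n           # estimate from M = 2 * Xbar
est_S = F(n + 1, n) * max(data)     # estimate from S = kY

print(var_M_coeff(n), var_S_coeff(n), float(est_M), float(est_S))
```

For $n > 1$ the coefficient for $S$ simplifies to $\frac { 1 } { n ( n + 2 ) }$, which is smaller than $\frac { 1 } { 3 n }$, so the estimate based on the sample maximum is the one this sketch would report as better.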

Edexcel S4 2013 June Q8

8. A random sample $W _ { 1 } , W _ { 2 } , \ldots , W _ { n }$ is taken from a distribution with mean $\mu$ and variance $\sigma ^ { 2 }$.

Write down $\mathrm { E } \left( \sum _ { i = 1 } ^ { n } W _ { i } \right)$ and show that $\mathrm { E } \left( \sum _ { i = 1 } ^ { n } W _ { i } ^ { 2 } \right) = n \left( \sigma ^ { 2 } + \mu ^ { 2 } \right)$. An estimator for $\mu$ is $$\bar { X } = \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i }$$
Show that $\bar { X }$ is a consistent estimator for $\mu$. An estimator of $\sigma ^ { 2 }$ is $$U = \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i } ^ { 2 } - \left( \frac { 1 } { n } \sum _ { i = 1 } ^ { n } W _ { i } \right) ^ { 2 }$$
Find the bias of $U$.
Write down an unbiased estimator of $\sigma ^ { 2 }$ in the form $k U$, where $k$ is in terms of $n$.
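One way to sanity-check the bias of $U$ is to compute $\mathrm { E } ( U )$ exactly for a small example by enumerating every possible sample. The distribution below (values 0, 1, 3 with probabilities 1/2, 1/4, 1/4) and the sample size $n = 3$ are hypothetical choices made only for this check; the question itself is about a general distribution.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical discrete distribution used only for this check.
values = [0, 1, 3]
probs = [F(1, 2), F(1, 4), F(1, 4)]
n = 3

mu = sum(p * v for v, p in zip(values, probs))
sigma_sq = sum(p * v * v for v, p in zip(values, probs)) - mu ** 2

def U(sample):
    """(1/n) * sum(W_i^2) - ((1/n) * sum(W_i))^2 for one sample."""
    m = len(sample)
    return F(sum(w * w for w in sample), m) - F(sum(sample), m) ** 2

# E(U) by enumerating every possible ordered sample of size n.
expected_U = F(0)
for idx in product(range(len(values)), repeat=n):
    prob = F(1)
    for i in idx:
        prob *= probs[i]
    expected_U += prob * U([values[i] for i in idx])

bias = expected_U - sigma_sq
print(expected_U, sigma_sq, bias)
```

In this example the bias equals $- \sigma ^ { 2 } / n$, consistent with $\mathrm { E } ( U ) = \frac { n - 1 } { n } \sigma ^ { 2 }$, so rescaling $U$ by $\frac { n } { n - 1 }$ removes the bias.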

Edexcel S4 2014 June Q6

6. Emily is monitoring the level of pollution in a river. Over a period of time she has found that the amount of pollution, $X$, in a 100 ml sample of river water has a continuous distribution with probability density function $\mathrm { f } ( x )$ given by $$f ( x ) = \left\{ \begin{array} { c c } \frac { 2 x } { a ^ { 2 } } & 0 \leqslant x \leqslant a \\
0 & \text { otherwise } \end{array} \right.$$ where $a$ is a constant. Emily takes a random sample $X _ { 1 } , X _ { 2 } , X _ { 3 } , \ldots , X _ { n }$ to try to estimate the value of $a$.

Show that $\mathrm { E } ( \bar { X } ) = \frac { 2 a } { 3 }$ and $\operatorname { Var } ( \bar { X } ) = \frac { a ^ { 2 } } { 18 n }$. The random variable $S = p \bar { X }$, where $p$ is a constant, is an unbiased estimator of $a$.
Write down the value of $p$ and find $\operatorname { Var } ( S )$. Felix suggests using the statistic $M = \max \left\{ X _ { 1 } , X _ { 2 } , X _ { 3 } , \ldots , X _ { n } \right\}$ as an estimator of $a$.
He calculates $\mathrm { E } ( M ) = \frac { 2 n } { 2 n + 1 } a$ and $\operatorname { Var } ( M ) = \frac { n } { ( n + 1 ) ( 2 n + 1 ) ^ { 2 } } a ^ { 2 }$.
State, giving your reasons, whether or not $M$ is a consistent estimator of $a$. The random variable $T = q M$, where $q$ is a constant, is an unbiased estimator of $a$.
Write down, in terms of $n$, the value of $q$ and find $\operatorname { Var } ( T )$.
State, giving your reasons, which of $S$ or $T$ you would recommend Emily use as an estimator of $a$. Emily took a sample of 5 values of $X$ and obtained the following:
$$\begin{array} { l l l l l } 5.3 & 4.3 & 5.7 & 7.8 & 6.9 \end{array}$$
Calculate the estimate of $a$ using your recommended estimator from part (e).
Find the standard error of your estimate, giving your answer to 2 decimal places.
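The final parts can be sketched as follows. The code assumes $p = \frac { 3 } { 2 }$ and $q = \frac { 2 n + 1 } { 2 n }$ (the constants that make $S$ and $T$ unbiased given the stated expectations), takes Felix's expression for $\operatorname { Var } ( M )$ as given, and assumes $T$ is the recommended estimator because its variance coefficient is the smaller one for $n > 1$. Function names are illustrative, and each returns the coefficient of $a ^ { 2 }$.

```python
import math
from fractions import Fraction as F

def var_S_coeff(n):
    # S = (3/2) * Xbar, so Var(S) = (9/4) * a^2 / (18n) = a^2 / (8n)
    return F(9, 4) * F(1, 18 * n)

def var_T_coeff(n):
    # T = qM with q = (2n + 1)/(2n); Var(M) is the expression stated in the question
    q = F(2 * n + 1, 2 * n)
    return q * q * F(n, (n + 1) * (2 * n + 1) ** 2)

sample = [5.3, 4.3, 5.7, 7.8, 6.9]
n = len(sample)

q = (2 * n + 1) / (2 * n)
a_hat = q * max(sample)                       # estimate of a using T
std_err = a_hat * math.sqrt(var_T_coeff(n))   # plug a_hat into sqrt(Var(T))

print(var_S_coeff(n), var_T_coeff(n), round(a_hat, 2), round(std_err, 2))
```

The standard error here replaces the unknown $a$ in $\sqrt { \operatorname { Var } ( T ) }$ by its estimate, which is the usual convention for this kind of question.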

Edexcel S4 2016 June Q6

6. A random sample of size $n$ is taken from the random variable $X$, which has a continuous uniform distribution over the interval $[ 0 , a ]$, $a > 0$. The sample mean is denoted by $\bar { X }$.

Show that $Y = 2 \bar { X }$ is an unbiased estimator of $a$. The maximum value, $M$, in the sample has probability density function $$f ( m ) = \left\{ \begin{array} { c c } \frac { n m ^ { n - 1 } } { a ^ { n } } & 0 \leqslant m \leqslant a \\
0 & \text { otherwise } \end{array} \right.$$
Find $\mathrm { E } ( M )$.
Show that $\operatorname { Var } ( M ) = \frac { n a ^ { 2 } } { ( n + 2 ) ( n + 1 ) ^ { 2 } }$. The estimator $S$ is defined by $S = \frac { n + 1 } { n } M$.
Given that $n > 1$, state which of $Y$ or $S$ is the better estimator for $a$. Give a reason for your answer.

Edexcel S4 Q6

6. A statistics student is trying to estimate the probability, $p$, of rolling a 6 with a particular die. The die is rolled 10 times and the random variable $X _ { 1 }$ represents the number of sixes obtained. The random variable $R _ { 1 } = \frac { X _ { 1 } } { 10 }$ is proposed as an estimator of $p$.

Show that $R _ { 1 }$ is an unbiased estimator of $p$. The student decided to roll the die again $n$ times ( $n > 10$ ) and the random variable $X _ { 2 }$ represents the number of sixes in these $n$ rolls. The random variable $R _ { 2 } = \frac { X _ { 2 } } { n }$ and the random variable $Y = \frac { 1 } { 2 } \left( R _ { 1 } + R _ { 2 } \right)$.
Show that both $R _ { 2 }$ and $Y$ are unbiased estimators of $p$.
Find $\operatorname { Var } \left( R _ { 2 } \right)$ and $\operatorname { Var } ( Y )$.
State, giving a reason, which of the 3 estimators $R _ { 1 } , R _ { 2 }$ and $Y$ are consistent estimators of $p$.
For the case $n = 20$ state, giving a reason, which of the 3 estimators $R _ { 1 } , R _ { 2 }$ and $Y$ you would recommend. The student's teacher pointed out that a better estimator could be found based on the random variable $X _ { 1 } + X _ { 2 }$.
Find a suitable estimator and explain why it is better than $R _ { 1 } , R _ { 2 }$ and $Y$.
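The variances in this question are all multiples of $p ( 1 - p )$, so the estimators can be compared through those multiples alone. The sketch below assumes the two sets of rolls are independent and takes $\frac { X _ { 1 } + X _ { 2 } } { n + 10 }$ as one natural candidate for the teacher's suggestion; that choice and the helper name `var_coeffs` are assumptions, not part of the question.

```python
from fractions import Fraction as F

def var_coeffs(n, n1=10):
    """Coefficients of p(1 - p) in the variance of each estimator,
    assuming the two sets of rolls are independent."""
    r1 = F(1, n1)               # Var(X1 / 10)
    r2 = F(1, n)                # Var(X2 / n)
    y = F(1, 4) * (r1 + r2)     # Var((R1 + R2) / 2)
    pooled = F(1, n1 + n)       # Var((X1 + X2) / (n + 10)), assumed candidate
    return {"R1": r1, "R2": r2, "Y": y, "pooled": pooled}

c = var_coeffs(20)
for name, coeff in c.items():
    print(name, coeff)
```

Only the coefficients that shrink to zero as $n$ grows correspond to consistent estimators; the coefficient for $Y$ is bounded below by the fixed contribution from the first 10 rolls.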