OCR MEI S4 2008 June — Question 2 24 marks

Exam Board: OCR MEI
Module: S4 (Statistics 4)
Year: 2008
Session: June
Marks: 24
Topic: Negative Binomial Distribution
Type: Derive probability generating function
Difficulty: Standard +0.8. A substantial S4 question requiring derivation of a PGF from first principles, manipulation of generating functions to find the mean and variance, use of the sum property for independent random variables, and application to a real-world scenario. The techniques are standard for Further Maths S4 (geometric series, PGF differentiation, CLT approximation), but the multi-part structure, proof elements, and need to connect abstract theory to a practical application make it moderately challenging: above average difficulty, yet within reach of well-prepared S4 students.
Spec: 5.02g Geometric probabilities: P(X=r) = p(1-p)^(r-1); 5.02h Geometric: mean 1/p and variance (1-p)/p^2; 5.04a Linear combinations: E(aX+bY), Var(aX+bY); 5.05a Sample mean distribution: central limit theorem

2 Independent trials, on each of which the probability of a 'success' is \(p ( 0 < p < 1 )\), are being carried out. The random variable \(X\) counts the number of trials up to and including that on which the first success is obtained. The random variable \(Y\) counts the number of trials up to and including that on which the \(n\)th success is obtained.
  1. Write down an expression for \(\mathrm { P } ( X = x )\) for \(x = 1,2 , \ldots\). Show that the probability generating function of \(X\) is $$\mathrm { G } ( t ) = p t ( 1 - q t ) ^ { - 1 }$$ where \(q = 1 - p\), and hence that the mean and variance of \(X\) are $$\mu = \frac { 1 } { p } \quad \text { and } \quad \sigma ^ { 2 } = \frac { q } { p ^ { 2 } }$$ respectively.
  2. Explain why the random variable \(Y\) can be written as $$Y = X _ { 1 } + X _ { 2 } + \ldots + X _ { n }$$ where the \(X _ { i }\) are independent random variables each distributed as \(X\). Hence write down the probability generating function, the mean and the variance of \(Y\).
  3. State an approximation to the distribution of \(Y\) for large \(n\).
  4. The aeroplane used on a certain flight seats 140 passengers. The airline seeks to fill the plane, but its experience is that not all the passengers who buy tickets will turn up for the flight. It uses the random variable \(Y\) to model the situation, with \(p = 0.8\) as the probability that a passenger turns up. Find the probability that it needs to sell at least 160 tickets to get 140 passengers who turn up. Suggest a reason why the model might not be appropriate.

# Question 2:

## Part (i):

| Answer/Working | Mark | Guidance |
|---|---|---|
| $P(X=x) = q^{x-1}p$ | B1 | FT into pgf only |
| Pgf: $G(t) = E(t^X) = \sum_{x=1}^{\infty} pt^x q^{x-1}$ | M1 | |
| $= pt(1 + qt + q^2t^2 + \ldots) = pt(1-qt)^{-1}$ | A1, A1 | BEWARE PRINTED ANSWER [consideration of $|qt|<1$ not required] |
| $\mu = G'(1)$, $\sigma^2 = G''(1) + \mu - \mu^2$ | M1 | For attempt to find $G'(t)$ and/or $G''(t)$ |
| $G'(t) = pqt(1-qt)^{-2} + p(1-qt)^{-1}$ | A1 | |
| $\therefore G'(1) = pq(1-q)^{-2} + p(1-q)^{-1} = \dfrac{q}{p} + 1 = \dfrac{1}{p}$ | A1 | BEWARE PRINTED ANSWER |
| $G''(t) = pqt(-2)(1-qt)^{-3}(-q) + pq(1-qt)^{-2} + p(-1)(1-qt)^{-2}(-q)$ | A1 | |
| $\therefore G''(1) = 2pq^2(1-q)^{-3} + pq(1-q)^{-2} + pq(1-q)^{-2} = \dfrac{2q^2}{p^2} + \dfrac{2q}{p}$ | A1 | |
| $\therefore \sigma^2 = \dfrac{2q^2}{p^2} + \dfrac{2q}{p} + \dfrac{1}{p} - \dfrac{1}{p^2} = \dfrac{2q^2+2pq+p-1}{p^2} = \dfrac{q}{p^2}$ | M1, A1 | For inserting their values. BEWARE PRINTED ANSWER |
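
The algebra in Part (i) can be sanity-checked numerically. The sketch below is not part of the mark scheme: it truncates the series for $E(t^X)$ and for the first two moments, then compares them with the closed forms $pt(1-qt)^{-1}$, $1/p$ and $q/p^2$. The function names, the truncation length of 2000 terms, and the choice $p = 0.8$ (borrowed from part (iv)) are illustrative assumptions.

```python
# Numerical sanity check of the geometric PGF, mean and variance.
# Assumption: p = 0.8 is borrowed from part (iv); any 0 < p < 1 should work.

def geometric_pmf(x, p):
    """P(X = x) = q^(x-1) * p for x = 1, 2, ..."""
    return (1 - p) ** (x - 1) * p

def pgf_series(t, p, terms=2000):
    """Truncated series E(t^X) = sum over x of t^x * P(X = x)."""
    return sum(t ** x * geometric_pmf(x, p) for x in range(1, terms + 1))

def pgf_closed(t, p):
    """Closed form G(t) = pt / (1 - qt), valid for |qt| < 1."""
    return p * t / (1 - (1 - p) * t)

def moments_from_series(p, terms=2000):
    """Mean and variance computed directly from the truncated pmf."""
    mean = sum(x * geometric_pmf(x, p) for x in range(1, terms + 1))
    second = sum(x * x * geometric_pmf(x, p) for x in range(1, terms + 1))
    return mean, second - mean ** 2

p = 0.8
q = 1 - p
mean, var = moments_from_series(p)
print(pgf_series(0.5, p), pgf_closed(0.5, p))  # series vs closed form at t = 0.5
print(mean, 1 / p)                             # series mean vs 1/p
print(var, q / p ** 2)                         # series variance vs q/p^2
```

Each printed pair should agree to within floating-point error, since the neglected tail of the series is far below machine precision for these values.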

## Part (ii):

| Answer/Working | Mark | Guidance |
|---|---|---|
| $X_1$ = number of trials to first success, $X_2$ = next, ..., $X_n$ = $n$th, $\therefore Y = X_1+X_2+\ldots+X_n$ = total no. of trials to $n$th success | E1, E1 | |
| $\therefore$ pgf of $Y = (\text{pgf of } X)^n = p^n t^n (1-qt)^{-n}$ | 1 | |
| $\mu_Y = n\mu_X = \dfrac{n}{p}$ | 1 | |
| $\sigma_Y^2 = n\sigma_X^2 = \dfrac{nq}{p^2}$ | 1 | |

## Part (iii):

| Answer/Working | Mark | Guidance |
|---|---|---|
| $N(\text{candidate's } \mu_Y, \text{ candidate's } \sigma_Y^2)$ | 1 | |

## Part (iv):

| Answer/Working | Mark | Guidance |
|---|---|---|
| $Y$ = no. of tickets to be sold, random variable as in (ii) with $n=140$ and $p=0.8$ | E1 | |
| $\sim \text{Approx } N\left(\dfrac{140}{0.8}=175,\ \dfrac{140\times 0.2}{(0.8)^2}=43.75\right)$ | 1 | |
| $P(Y \geq 160) \approx P\left(N(175,43.75) > 159\tfrac{1}{2}\right)$ | M1 | Do not award if continuity correction absent or wrong, but FT if 160 used $\to -2.268$, 0.9884 |
| $= P(N(0,1) > -2.343) = 0.9905$ | A1, A1 | CAO |
| For any sensible discussion in context (e.g. groups of passengers $\Rightarrow$ not indep.) | E1, E1 | |
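
The Part (iv) calculation can be reproduced with the standard library. The sketch below evaluates the Normal approximation with the continuity correction and, as an added comparison that the mark scheme does not ask for, an exact value based on the observation that $Y \geq k$ exactly when fewer than $n$ successes occur in the first $k-1$ trials. Function names are hypothetical. The computed probability may differ from the tabulated 0.9905 in the fourth decimal place, because the mark scheme works from rounded table values.

```python
import math

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def normal_approx_tail(n, p, k):
    """Approximate P(Y >= k) using N(n/p, nq/p^2) with a continuity correction."""
    mean = n / p
    var = n * (1 - p) / p ** 2
    z = (k - 0.5 - mean) / math.sqrt(var)
    return z, 1 - normal_cdf(z)

def exact_tail(n, p, k):
    """Exact P(Y >= k): fewer than n successes in the first k - 1 trials."""
    trials = k - 1
    return sum(
        math.comb(trials, s) * p ** s * (1 - p) ** (trials - s)
        for s in range(n)
    )

# Values from the question: 140 seats, p = 0.8, at least 160 tickets.
z, approx = normal_approx_tail(140, 0.8, 160)
exact = exact_tail(140, 0.8, 160)
print(z, approx)  # compare with the mark scheme's -2.343 and 0.9905
print(exact)      # exact value under the same independence model
```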
