OCR MEI S3 2009 January — Question 4 18 marks

Exam Board: OCR MEI
Module: S3 (Statistics 3)
Year: 2009
Session: January
Marks: 18
Topic: Chi-squared goodness of fit
Type: Chi-squared goodness of fit: Other continuous
Difficulty: Standard +0.3. This is a straightforward chi-squared goodness of fit test with given probabilities. Part (i) is basic definition recall, part (ii) is simple algebra verifying that a geometric series sums to 1, part (iii) is a standard chi-squared test with p = 0.25 (routine calculation of expected frequencies and test statistic), and part (iv) requires understanding that estimating a parameter reduces the degrees of freedom. All techniques are standard S3 material with no novel problem-solving required, making it slightly easier than average.
Spec: 2.01c Sampling techniques: simple random, opportunity, etc.; 5.06c Fit other distributions: discrete and continuous

4

(i) Explain the meaning of 'opportunity sampling'. Give one reason why it might be used and state one disadvantage of using it.

A market researcher is conducting an 'on-street' survey in a busy city centre, for which he needs to stop and interview 100 people. For each interview the researcher counts the number of people he has to ask until one agrees to be interviewed. The data collected are as follows.

| No. of people asked | 1 | 2 | 3 | 4 | 5 | 6 | 7 or more |
|---------------------|---|---|---|---|---|---|-----------|
| Frequency | 26 | 19 | 17 | 13 | 11 | 8 | 6 |

A model for these data is proposed as follows, where \(p\) (assumed constant throughout) is the probability that a person asked agrees to be interviewed, and \(q = 1 - p\).

| No. of people asked | 1 | 2 | 3 | 4 | 5 | 6 | 7 or more |
|---------------------|---|---|---|---|---|---|-----------|
| Probability | \(p\) | \(pq\) | \(pq^{2}\) | \(pq^{3}\) | \(pq^{4}\) | \(pq^{5}\) | \(q^{6}\) |

(ii) Verify that these probabilities add to 1 whatever the value of \(p\).

(iii) Initially it is thought that on average 1 in 4 people asked agree to be interviewed. Test at the \(10\%\) level of significance whether it is reasonable to suppose that the model applies with \(p = 0.25\).

(iv) Later an estimate of \(p\) obtained from the data is used in the analysis. The value of the test statistic (with no combining of cells) is found to be 9.124. What is the outcome of this new test? Comment on your answer in relation to the outcome of the test in part (iii).

# Question 4:

## Part (i)
| Answer | Mark | Guidance |
|--------|------|----------|
| Sampling which selects from those that are (easily) available. | E1 | |
| Circumstances may mean that it is the only economically viable method available. | E1 | |
| Likely to be neither random nor representative. | E1 | |

**Total: 3 marks**

## Part (ii)
| Answer | Mark | Guidance |
|--------|------|----------|
| $p + pq + pq^2 + pq^3 + pq^4 + pq^5 + q^6$ | | |
| $= \frac{p(1-q^6)}{1-q} + q^6 = \frac{p(1-q^6)}{p} + q^6$ | M1 | Use of GP formula to sum probabilities, or expand in terms of $p$ or in terms of $q$. |
| $= 1 - q^6 + q^6 = 1$ | A1 | Algebra shown convincingly. Beware answer given. |

**Total: 2 marks**
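
The guidance for part (ii) also allows expanding in terms of $q$. A possible version of that route, offered as a sketch rather than as the mark scheme's own working, substitutes $p = 1 - q$ so that the first six terms telescope:

$$
\begin{aligned}
\sum_{k=0}^{5} p q^{k} + q^{6}
  &= \sum_{k=0}^{5} (1 - q)\, q^{k} + q^{6}
   = \sum_{k=0}^{5} \left( q^{k} - q^{k+1} \right) + q^{6} \\
  &= \left( 1 - q^{6} \right) + q^{6} = 1 .
\end{aligned}
$$

The intermediate value $1 - q^{6}$ agrees with the GP-formula line in the table above, so both routes reach the same final step.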

## Part (iii)
| Answer | Mark | Guidance |
|--------|------|----------|
| With $p = 0.25$: Probabilities: $0.25,\ 0.1875,\ 0.140625,\ 0.105469,\ 0.079102,\ 0.059326,\ 0.177979$ | M1, M1 | Probabilities correct to 3 dp or better. |
| Expected frequencies: $25.00,\ 18.75,\ 14.0625,\ 10.5469,\ 7.9102,\ 5.9326,\ 17.7979$ | A1 | $\times 100$ for expected frequencies. All correct and sum to 100. |
| $X^2 = 0.04 + 0.0033 + 0.6136 + 0.5706 + 1.2069 + 0.7204 + 7.8206 = 10.97(54)$ | M1, A1 | c.a.o. |
| Refer to $\chi^2_6$ | M1 | Allow correct df $(= \text{cells} - 1)$ from wrongly grouped table and ft. Otherwise, no ft if wrong. $P(X^2 > 10.975) = 0.0891$. |
| Upper 10% point is 10.64. | A1 | No ft from here if wrong. |
| Significant. | A1 | ft only c's test statistic. |
| Suggests model with $p = 0.25$ does not fit. | A1 | ft only c's test statistic. |

**Total: 9 marks**
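
The arithmetic in part (iii) can be checked with a short script. The following is a minimal sketch using only the Python standard library. The variable names are illustrative, and the critical value 10.64 is taken from the mark scheme above rather than computed.

```python
# Observed counts from the question: cells 1, 2, ..., 6 and "7 or more".
observed = [26, 19, 17, 13, 11, 8, 6]
p = 0.25
q = 1 - p

# Model probabilities: p*q^(k-1) for k = 1..6, then q^6 for the pooled tail.
probs = [p * q ** k for k in range(6)] + [q ** 6]

n = sum(observed)  # 100 interviews
expected = [n * pr for pr in probs]

# Each cell contributes (O - E)^2 / E to the test statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
x2 = sum(contributions)

# Upper 10% point of chi-squared with 7 - 1 = 6 degrees of freedom,
# as quoted in the mark scheme (assumed here, not recomputed).
CRITICAL_10_PCT_DF6 = 10.64
significant = x2 > CRITICAL_10_PCT_DF6

print([round(e, 4) for e in expected])
print([round(c, 4) for c in contributions])
print(round(x2, 4), "significant" if significant else "not significant")
```

The final cell ("7 or more") contributes by far the largest share of the statistic: only 6 such interviews were observed against about 17.8 expected.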

## Part (iv)
| Answer | Mark | Guidance |
|--------|------|----------|
| Now with $X^2 = 9.124$; Refer to $\chi^2_5$ | M1 | Allow correct df $(= \text{cells} - 2)$ from wrongly grouped table and ft. Otherwise, no ft if wrong. $P(X^2 > 9.124) = 0.1042$. |
| Upper 10% point is 9.236. | A1 | No ft from here if wrong. |
| Not significant. (Suggests new model does fit.) | A1 | Correct conclusion. |
| Improvement to the model is due to estimation of $p$ from the data. | E1 | Comment about the effect of estimated $p$, consistent with conclusion in part (iii). |

**Total: 4 marks**
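
The two tail probabilities quoted in the guidance (0.0891 and 0.1042) can be reproduced without a statistics library, because the chi-squared upper tail has a closed form for integer degrees of freedom. The sketch below is illustrative only. `chi2_sf` is a hypothetical helper name, and the test statistics 10.975 and 9.124 are taken from the mark scheme rather than recomputed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X^2 > x) for chi-squared with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2))
    #   + sqrt(2/pi) * exp(-x/2) * sum_{j=1}^{(df-1)/2} x^(j - 1/2) / (1*3*...*(2j-1))
    term, total = math.sqrt(x), 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 / math.pi) * math.exp(-half) * total


CELLS = 7
LEVEL = 0.10

# (test statistic, degrees of freedom): df = cells - 1 - number of estimated parameters.
cases = {
    "(iii) p = 0.25 specified": (10.975, CELLS - 1),
    "(iv) p estimated from data": (9.124, CELLS - 2),
}

results = {}
for label, (x2, df) in cases.items():
    p_value = chi2_sf(x2, df)
    results[label] = (df, p_value, p_value < LEVEL)
    verdict = "significant" if p_value < LEVEL else "not significant"
    print(f"{label}: df = {df}, P(X^2 > {x2}) = {p_value:.4f}, {verdict}")
```

Estimating $p$ costs one degree of freedom, so the reference distribution changes from $\chi^2_6$ to $\chi^2_5$. Both tail probabilities lie close to the 10% level, so the two conclusions differ only narrowly.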
4 (i) Explain the meaning of 'opportunity sampling'. Give one reason why it might be used and state one disadvantage of using it.

A market researcher is conducting an 'on-street' survey in a busy city centre, for which he needs to stop and interview 100 people. For each interview the researcher counts the number of people he has to ask until one agrees to be interviewed. The data collected are as follows.

\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | }
\hline
No. of people asked & 1 & 2 & 3 & 4 & 5 & 6 & 7 or more \\
\hline
Frequency & 26 & 19 & 17 & 13 & 11 & 8 & 6 \\
\hline
\end{tabular}
\end{center}

A model for these data is proposed as follows, where $p$ (assumed constant throughout) is the probability that a person asked agrees to be interviewed, and $q = 1 - p$.

\begin{center}
\begin{tabular}{ | l | c | c | c | c | c | c | c | }
\hline
No. of people asked & 1 & 2 & 3 & 4 & 5 & 6 & 7 or more \\
\hline
Probability & $p$ & $p q$ & $p q ^ { 2 }$ & $p q ^ { 3 }$ & $p q ^ { 4 }$ & $p q ^ { 5 }$ & $q ^ { 6 }$ \\
\hline
\end{tabular}
\end{center}

(ii) Verify that these probabilities add to 1 whatever the value of $p$.\\
(iii) Initially it is thought that on average 1 in 4 people asked agree to be interviewed. Test at the $10 \%$ level of significance whether it is reasonable to suppose that the model applies with $p = 0.25$.\\
(iv) Later an estimate of $p$ obtained from the data is used in the analysis. The value of the test statistic (with no combining of cells) is found to be 9.124. What is the outcome of this new test? Comment on your answer in relation to the outcome of the test in part (iii).

\hfill \mbox{\textit{OCR MEI S3 2009 Q4 [18]}}