Chi-squared goodness of fit: Poisson

A question is of this type if and only if it tests whether observed frequency data fit a Poisson distribution, possibly with the parameter estimated from the data.

21 questions · Standard +0.4

5.06b Fit prescribed distribution: chi-squared test
CAIE Further Paper 4 2022 June Q4
8 marks Standard +0.3
4 A scientist is investigating the numbers of a particular type of butterfly in a certain region. He claims that the numbers of these butterflies found per square metre can be modelled by a Poisson distribution with mean 2.5. He takes a random sample of 120 areas, each of one square metre, and counts the number of these butterflies in each of these areas. The following table shows the observed frequencies together with some of the expected frequencies using the scientist's Poisson distribution.
Number per square metre | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\)
Observed frequency | 12 | 20 | 36 | 32 | 13 | 6 | 1 | 0
Expected frequency | 9.85 | 24.63 | 30.78 | 25.65 | \(p\) | 8.02 | 3.34 | \(q\)
  1. Find the values of \(p\) and \(q\), correct to 2 decimal places.
  2. Carry out a goodness of fit test, at the \(10 \%\) significance level, to test the scientist's claim.
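The expected frequencies in a table like this come from multiplying each Poisson probability by the sample size, with the open-ended final cell taken as the remainder. The sketch below is one way to check the two missing entries, assuming the mean 2.5 and sample size 120 stated in the question; the helper `poisson_pmf` is a name introduced here for illustration, not part of the question.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for X ~ Po(mean)."""
    return math.exp(-mean) * mean**k / math.factorial(k)

n, mean = 120, 2.5  # sample size and claimed mean, both taken from the question

# Expected frequencies for 0..6, then the open-ended ">= 7" cell as the remainder.
expected = [n * poisson_pmf(k, mean) for k in range(7)]
expected.append(n - sum(expected))

p = expected[4]  # cell for 4 butterflies per square metre
q = expected[7]  # cell for 7 or more
print(f"p = {p:.2f}, q = {q:.2f}")
```

Taking the last cell as a remainder guarantees that the expected frequencies sum to the sample size, which the test requires.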
CAIE Further Paper 4 2023 November Q2
8 marks Standard +0.3
2 The number of breakdowns on a particular section of road is recorded each day over a period of 90 days. It is suggested that the number of breakdowns follows a Poisson distribution with mean 3.5. The data is summarised in the table, together with some of the expected frequencies resulting from the suggested Poisson distribution.
Number of breakdowns per day | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 or more
Observed frequency | 0 | 5 | 13 | 17 | 21 | 16 | 9 | 5 | 4
Expected frequency | 2.718 | 9.512 | 16.646 | | 16.993 | 11.895 | | 3.469 | 2.407
  1. Complete the table.
  2. Carry out a goodness of fit test, at the 10\% significance level, to determine whether or not \(\operatorname { Po } ( 3.5 )\) is a good fit to the data.
CAIE Further Paper 4 2024 November Q3
8 marks Standard +0.3
3 A statistician believes that the number of telephone calls received by an advice centre in a 10-minute interval can be modelled by the Poisson distribution \(\mathrm { Po } ( 1.9 )\). The number of calls received in a randomly chosen 10-minute interval was recorded on each of 100 days. The results are summarised in the table, together with some of the expected frequencies corresponding to the distribution \(\operatorname { Po } ( 1.9 )\).
Number of calls | 0 | 1 | 2 | 3 | 4 | 5 | 6 or more
Observed frequency | 10 | 18 | 35 | 21 | 11 | 4 | 1
Expected frequency | 14.957 | 28.418 | 26.997 | | | | 1.322
  1. Complete the table.
  2. Carry out a goodness of fit test, at the \(10 \%\) significance level, to determine whether the statistician's belief is reasonable.
OCR S3 2011 June Q4
9 marks Standard +0.3
4 An experiment by Lord Rutherford at Cambridge in 1909 involved measuring the numbers of \(\alpha\)-particles emitted during radioactive decay. The following table shows emissions during 2608 intervals of 7.5 seconds.
Number of particles emitted, \(x\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | \(\geqslant 11\)
Frequency | 57 | 203 | 383 | 525 | 532 | 408 | 273 | 139 | 45 | 27 | 10 | 6
It is given that the mean number of particles emitted per interval, calculated from the data, is 3.87, correct to 3 significant figures.
  1. Find the contribution to the \(\chi ^ { 2 }\) value of the frequency of 273 corresponding to \(x = 6\) in a goodness of fit test for a Poisson distribution.
  2. Given that no cells need to be combined, state why the number of degrees of freedom is 10.
  3. Given also that the calculated value of \(\chi ^ { 2 }\) is 13.0, correct to 3 significant figures, carry out the test at the 10\% significance level.
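A single cell's contribution to the statistic is \((O - E)^2 / E\), where \(E\) is the sample size times the Poisson probability for that cell. The following is a minimal sketch for the \(x = 6\) cell, assuming the rounded mean 3.87 is used directly (a more precise mean would shift the result slightly). The variable names are illustrative only.

```python
import math

n, mean = 2608, 3.87   # number of intervals and the given (rounded) sample mean
observed_6 = 273       # observed frequency for x = 6

# Expected frequency for x = 6 under Po(3.87), then its chi-squared contribution.
expected_6 = n * math.exp(-mean) * mean**6 / math.factorial(6)
contribution = (observed_6 - expected_6) ** 2 / expected_6

cells = 12             # x = 0, 1, ..., 10 and the ">= 11" cell
df = cells - 1 - 1     # one constraint for the total, one for the estimated mean

print(f"E = {expected_6:.1f}, contribution = {contribution:.2f}, df = {df}")
```

The degrees of freedom drop by two rather than one because the mean was calculated from the data instead of being prescribed in advance.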
OCR MEI S3 2012 January Q3
18 marks Standard +0.3
3
  1. A medical researcher is looking into the delay, in years, between first and second myocardial infarctions (heart attacks). The following table shows the results for a random sample of 225 patients.
    Delay (years) | \(0 -\) | \(1 -\) | \(2 -\) | \(3 -\) | \(4 - 10\)
    Number of patients | 160 | 40 | 13 | 9 | 3
    The mean of this sample is used to construct a model which gives the following expected frequencies.
    Delay (years) | \(0 -\) | \(1 -\) | \(2 -\) | \(3 -\) | \(4 - 10\)
    Number of patients | 142.23 | 52.32 | 19.25 | 7.08 | 4.12
    Carry out a test, using a \(2.5 \%\) level of significance, of the goodness of fit of the model to the data.
  2. A further piece of research compares the incidence of myocardial infarction in men aged 55 to 70 with that in women aged 55 to 70. Incidence is measured by the number of infarctions per 10000 of the population. For a random sample of 8 health authorities across the UK, the following results for the year 2010 were obtained.
    Health authority | A | B | C | D | E | F | G | H
    Incidence in men | 47 | 56 | 15 | 51 | 45 | 54 | 50 | 32
    Incidence in women | 36 | 30 | 30 | 47 | 54 | 55 | 27 | 27
    A Wilcoxon paired sample test, using the hypotheses \(\mathrm { H } _ { 0 } : m = 0\) and \(\mathrm { H } _ { 1 } : m \neq 0\) where \(m\) is the population median difference, is to be carried out to investigate whether there is any difference between men and women on the whole.
    1. Explain why a paired test is being used in this context.
    2. Carry out the test using a \(10 \%\) level of significance.
OCR MEI S3 2011 June Q2
18 marks Standard +0.3
2 Scientists researching into the chemical composition of dust in space collect specimens using a specially designed spacecraft. The craft collects the particles of dust in trays that are made up of a large array of cells containing aerogel. The aerogel traps the particles that penetrate into the cells.
  1. For a random sample of 100 cells, the number of particles of dust in each cell was counted, giving the following results.
    Number of particles | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | \(10 +\)
    Frequency | 4 | 7 | 10 | 20 | 17 | 15 | 10 | 9 | 5 | 3 | 0
    It is thought that the number of particles collected in each cell can be modelled using the distribution Poisson(4.2) since 4.2 is the sample mean for these data. Some of the calculations for a \(\chi ^ { 2 }\) test are shown below. The cells for 8, 9 and \(10 +\) particles have been combined.
    Number of particles | 5 | 6 | 7 | \(8 +\)
    Observed frequency | 15 | 10 | 9 | 8
    Expected frequency | 16.33 | 11.44 | 6.86 | 6.39
    Contribution to \(X ^ { 2 }\) | 0.1083 | 0.1813 | 0.6676 | 0.4056
    Complete the calculations and carry out the test using a \(10 \%\) significance level to see whether the number of particles per cell may be modelled in this way.
  2. The diameters of the dust particles are believed to be distributed symmetrically about a median of 15 micrometres \(( \mu \mathrm { m } )\). For a random sample of 20 particles, the sum of the signed ranks of the diameters of the particles smaller than \(15 \mu \mathrm {~m}\) \(\left( W _ { - } \right)\) is found to be 53. Test at the \(5 \%\) level of significance whether the median diameter appears to be more than \(15 \mu \mathrm {~m}\).
OCR MEI S3 2012 June Q4
18 marks Standard +0.3
4 The numbers of call-outs per day received by a fire station for a random sample of 255 weekdays were recorded as follows.
Number of call-outs | 0 | 1 | 2 | 3 | 4 | 5 or more
Frequency (days) | 145 | 79 | 22 | 6 | 3 | 0
The mean number of call-outs per day for these data is 0.6. A Poisson model, using this sample mean of 0.6, is fitted to the data, and gives the following expected frequencies (correct to 3 decimal places).
Number of call-outs | 0 | 1 | 2 | 3 | 4 | 5 or more
Expected frequency | 139.947 | 83.968 | 25.190 | 5.038 | 0.756 | 0.101
  1. Using a \(5 \%\) significance level, carry out a test to examine the goodness of fit of the model to the data.
The time \(T\), measured in days, that elapses between successive call-outs can be modelled using the exponential distribution for which \(\mathrm { f } ( t )\), the probability density function, is $$\mathrm { f } ( t ) = \begin{cases} 0 & t < 0 , \\ \lambda \mathrm { e } ^ { - \lambda t } & t \geqslant 0 , \end{cases}$$ where \(\lambda\) is a positive constant.
  2. For the distribution above, it can be shown that \(\mathrm { E } ( T ) = \frac { 1 } { \lambda }\). Given that the mean time between successive call-outs is \(\frac { 5 } { 3 }\) days, write down the value of \(\lambda\).
  3. Find \(\mathrm { F } ( t )\), the cumulative distribution function.
  4. Find the probability that the time between successive call-outs is more than 1 day.
  5. Find the median time that elapses between successive call-outs.
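The exponential parts of this question follow from \(\mathrm{E}(T) = 1/\lambda\) and from integrating the density to get \(\mathrm{F}(t) = 1 - \mathrm{e}^{-\lambda t}\) for \(t \geqslant 0\). The sketch below checks the numerical answers under those standard results; `cdf`, `lam` and the other names are introduced here for illustration.

```python
import math

mean_gap = 5 / 3        # mean time between call-outs in days, from the question
lam = 1 / mean_gap      # E(T) = 1/lambda, so lambda = 0.6 per day

def cdf(t, rate=lam):
    """F(t) = 1 - exp(-rate * t) for t >= 0, and 0 for t < 0."""
    return 1 - math.exp(-rate * t) if t >= 0 else 0.0

p_more_than_one_day = 1 - cdf(1)   # P(T > 1) = exp(-lambda)
median = math.log(2) / lam         # solves F(t) = 0.5

print(f"lambda = {lam:.1f}")
print(f"P(T > 1) = {p_more_than_one_day:.4f}")
print(f"median = {median:.3f} days")
```

The rate of 0.6 per day agrees with the Poisson mean of 0.6 call-outs per day used in the first part, as expected when gaps between Poisson events are exponential.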
CAIE FP2 2014 November Q8
9 marks Standard +0.8
8 The numbers of a particular type of laptop computer sold by a store on each of 100 consecutive Saturdays are summarised in the following table.
Number sold | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | \(\geqslant 8\)
Number of Saturdays | 7 | 20 | 39 | 16 | 14 | 2 | 1 | 1 | 0
Fit a Poisson distribution to the data and carry out a goodness of fit test at the \(2.5 \%\) significance level.
CAIE FP2 2015 November Q8
10 marks Standard +0.8
8 The number of goals scored by a certain football team was recorded for each of 100 matches, and the results are summarised in the following table.
Number of goals | 0 | 1 | 2 | 3 | 4 | 5 | 6 or more
Frequency | 12 | 16 | 31 | 25 | 13 | 3 | 0
Fit a Poisson distribution to the data, and test its goodness of fit at the 5\% significance level.
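Fitting a Poisson distribution here means estimating the mean from the data, computing expected frequencies, pooling cells whose expected frequency is below 5, and then summing \((O - E)^2 / E\). The sketch below follows those steps for the goals data, assuming the common "expected frequency at least 5" rule and pooling only from the upper tail. The helpers `poisson_pmf` and `merge_small_tail` are hypothetical names, not part of any mark scheme.

```python
import math

def poisson_pmf(k, mean):
    """P(X = k) for X ~ Po(mean)."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def merge_small_tail(observed, expected, minimum=5.0):
    """Pool cells from the right until the last expected frequency is >= minimum."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < minimum:
        last_e, last_o = exp.pop(), obs.pop()
        exp[-1] += last_e
        obs[-1] += last_o
    return obs, exp

observed = [12, 16, 31, 25, 13, 3, 0]   # goals 0..5 and "6 or more"
n = sum(observed)
mean = sum(k * f for k, f in enumerate(observed)) / n   # estimated from the data

# Expected frequencies for 0..5, with "6 or more" as the remainder.
expected = [n * poisson_pmf(k, mean) for k in range(6)]
expected.append(n - sum(expected))

obs_m, exp_m = merge_small_tail(observed, expected)
statistic = sum((o - e) ** 2 / e for o, e in zip(obs_m, exp_m))
df = len(exp_m) - 1 - 1   # cells - 1, minus 1 more for the estimated mean

print(f"mean = {mean}, cells = {len(exp_m)}, df = {df}, X^2 = {statistic:.2f}")
```

The statistic would then be compared with the tabulated \(\chi^2\) critical value for `df` degrees of freedom at the stated significance level.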
CAIE FP2 2017 Specimen Q8
10 marks Challenging +1.2
8 The number of goals scored by a certain football team was recorded for each of 100 matches, and the results are summarised in the following table.
Number of goals | 0 | 1 | 2 | 3 | 4 | 5 | 6 or more
Frequency | 12 | 16 | 31 | 25 | 13 | 3 | 0
Fit a Poisson distribution to the data, and test its goodness of fit at the 5\% significance level.
OCR Further Statistics 2023 June Q8
16 marks Challenging +1.2
8 A team of researchers have reason to believe that the number of calls received in randomly chosen 10-minute intervals to a call centre can be well modelled by a Poisson distribution. To test this belief the researchers record the number of telephone calls received in 60 randomly chosen 10-minute intervals. The results, together with relevant calculations, are shown in the following table.
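Number of calls, \(r\) | 0 | 1 | 2 | 3 | 4 | \(\geqslant 5\) | Total
Observed frequency, \(f\) | 18 | 13 | 12 | 9 | 8 | 0 | 60
rf | 0 | 13 | 24 | 27 | 32 | 0 | 96
\(\mathrm { r } ^ { 2 } \mathrm { f }\) | 0 | 13 | 48 | 81 | 128 | 0 | 270
Expected frequency | 12.114 | 19.382 | 15.506 | 8.270 | 3.308 | 1.421 | 60
Contribution to test statistic | 2.860 | 2.101 | 0.793 | 1.232 (cells for \(r \geqslant 3\) combined) | | | 6.99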
  1. Calculate the mean of the observed number of calls received.
  2. Calculate the variance of the observed number of calls received.
  3. Comment on what your answers to parts (a) and (b) suggest about the proposed model.
  4. Explain why it is necessary to combine some cells in the table.
  5. Show how the values 15.506 and 0.793 in the table were obtained.
  6. Carry out the test, at the \(5 \%\) significance level.
In the light of the result of the test, the team consider that a different model is appropriate. They propose the following improved model: $$P ( R = r ) = \begin{cases} \frac { 1 } { 60 } ( a + ( 2 - r ) b ) & r = 0,1,2,3,4 \\ 0 & \text { otherwise } \end{cases}$$ where \(a\) and \(b\) are integers.
  7. Use at least three of the observed frequencies to suggest appropriate values for \(a\) and \(b\). You should consider more than one possible pair of values, and explain which pair of values you consider best. (Do not carry out a goodness-of-fit test.)
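The summary rows of the table give the sample mean and variance directly, and the quoted values 15.506 and 0.793 follow from the Poisson probability for \(r = 2\). The sketch below reproduces that arithmetic, assuming the divisor-\(n\) form of the variance (the divisor \(n - 1\) form gives a slightly larger value). All variable names are illustrative.

```python
import math

n = 60
sum_rf, sum_r2f = 96, 270   # column totals taken from the table

mean = sum_rf / n                    # sample mean
variance = sum_r2f / n - mean**2     # divisor-n variance

# Expected frequency and contribution for r = 2 under Po(mean).
observed_2 = 12
expected_2 = n * math.exp(-mean) * mean**2 / math.factorial(2)
contribution_2 = (observed_2 - expected_2) ** 2 / expected_2

print(f"mean = {mean:.2f}, variance = {variance:.2f}")
print(f"E(r=2) = {expected_2:.3f}, contribution = {contribution_2:.3f}")
```

A Poisson model has equal mean and variance, so comparing the two printed values is the informal check that the comment part of the question is asking about.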
Edexcel S3 2022 January Q6
12 marks Standard +0.8
  1. The number of emails per hour received by a helpdesk were recorded. The results for a random sample of 80 one-hour periods are shown in the table.
Number of emails per hour | 0 | 1 | 2 | 3 | 4 | 5 | 6
Frequencies | 1 | 10 | 23 | 15 | 19 | 9 | 3
  1. Show that the mean number of emails per hour in the sample is 3.
The manager believes that the number of emails per hour received could be modelled by a Poisson distribution. The following table shows some of the expected frequencies.
    Number of emails per hour | Expected frequencies
    0 | \(r\)
    1 | 11.949
    2 | 17.923
    3 | 17.923
    4 | 13.443
    5 | \(s\)
    \(\geqslant 6\) | \(t\)
  2. Find the values of \(r , s\) and \(t\), giving your answers to 3 decimal places.
  3. Using a 10\% significance level, test whether or not a Poisson model is reasonable. You should clearly state your hypotheses, test statistic and the critical value used.
Edexcel S3 2024 January Q4
10 marks Standard +0.3
  1. The number of jobs sent to a printer per hour in a small office is recorded for 120 hours. The results are summarised in the following table.
Number of jobs | 0 | 1 | 2 | 3 | 4 | 5
Frequency | 24 | 34 | 28 | 21 | 8 | 5
  1. Show that the mean number of jobs sent to the printer per hour for these data is 1.75.
The office manager believes that the number of jobs sent to the printer per hour can be modelled using a Poisson distribution. The office manager uses the mean given in part (a) to calculate the expected frequencies for this model. Some of the results are given in the following table.
    Number of jobs | 0 | 1 | 2 | 3 | 4 | 5 or more
    Expected frequency | 20.85 | 36.49 | 31.93 | \(r\) | \(s\) | 3.95
  2. Show that the value of \(s\) is 8.15 to 2 decimal places.
  3. Find the value of \(r\) to 2 decimal places.
The value of \(\sum \frac { \left( O _ { i } - E _ { i } \right) ^ { 2 } } { E _ { i } }\) for the first four frequencies in the table is 1.43.
  4. Test, at the \(5 \%\) level of significance, whether or not the number of jobs sent to the printer per hour can be modelled using a Poisson distribution. Show your working clearly, stating your hypotheses, test statistic and critical value.
Edexcel S3 2006 January Q6
13 marks Standard +0.3
6. An area of grass was sampled by placing a \(1 \mathrm {~m} \times 1 \mathrm {~m}\) square randomly in 100 places. The numbers of daisies in each of the squares were counted. It was decided that the resulting data could be modelled by a Poisson distribution with mean 2. The expected frequencies were calculated using the model. The following table shows the observed and expected frequencies.
Number of daisies | Observed frequency | Expected frequency
0 | 8 | 13.53
1 | 32 | 27.07
2 | 27 | \(r\)
3 | 18 | \(s\)
4 | 10 | 9.02
5 | 3 | 3.61
6 | 1 | 1.20
7 | 0 | 0.34
\(\geq 8\) | 1 | \(t\)
  1. Find values for \(r , s\) and \(t\).
  2. Using a \(5 \%\) significance level, test whether or not this Poisson model is suitable. State your hypotheses clearly.
An alternative test might have been to estimate the population mean by using the data given.
  3. Explain how this would have affected the test.
    (2)
WJEC Further Unit 2 2024 June Q3
12 marks Standard +0.3
  1. A company makes bags. The table below shows the number of bags sold on a random sample of 50 days. A manager believes that the number of bags sold per day can be modelled by the Poisson distribution with mean \(2 \cdot 2\).
Number of bags sold | 0 | 1 | 2 | 3 | 4 | 5 or more
Frequency | 7 | 10 | 11 | 9 | 6 | 7
  1. Carry out a chi-squared goodness of fit test, using a \(10 \%\) significance level.
  2. A chi-squared goodness of fit test for the Poisson distribution with mean \(2 \cdot 5\) is conducted. This uses the same number of degrees of freedom as part (a) and gives a test statistic of 1.53. State, with a reason, which of these two Poisson models is a better fit for the data.
Edexcel FS1 AS 2023 June Q4
12 marks Standard +0.3
  1. Table 1 below shows the number of car breakdowns in the Snoreap district in each of 60 months.
Number of car breakdowns | 0 | 1 | 2 | 3 | 4 | 5
Frequency | 12 | 11 | 19 | 14 | 3 | 1
Table 1
Anja believes that the number of car breakdowns per month in Snoreap can be modelled by a Poisson distribution. Table 2 below shows the results of some of her calculations.
Number of car breakdowns | 0 | 1 | 2 | 3 | 4 | \(\geqslant 5\)
Observed frequency (O) | 12 | 11 | 19 | 14 | 3 | 1
Expected frequency ( \(\mathbf { E } _ { \mathbf { i } }\) ) | 9.92 | | | 9.64 | 4.34 |
Table 2
  1. State suitable hypotheses for a test to investigate Anja's belief.
  2. Explain why Anja has changed the label of the final column to \(\geqslant 5\)
  3. Showing your working clearly, complete Table 2
  4. Find the value of \(\frac { \left( O _ { i } - E _ { i } \right) ^ { 2 } } { E _ { i } }\) when the number of car breakdowns is
    1. 1
    2. 3
  5. Explain why Anja used 3 degrees of freedom for her test.
The test statistic for Anja's test is 6.54 to 2 decimal places.
  6. Stating the critical value and using a \(5 \%\) level of significance, complete Anja's test.
Edexcel FS1 AS Specimen Q4
11 marks Standard +0.3
  1. The discrete random variable \(X\) follows a Poisson distribution with mean 1.4
    1. Write down the value of
      1. \(\mathrm { P } ( \mathrm { X } = 1 )\)
      2. \(\mathrm { P } ( \mathrm { X } \leqslant 4 )\)
    The manager of a bank recorded the number of mortgages approved each week over a 40 week period.
    Number of mortgages approved | 0 | 1 | 2 | 3 | 4 | 5 | 6
    Frequency | 10 | 16 | 7 | 4 | 2 | 0 | 1
  2. Show that the mean number of mortgages approved over the 40 week period is 1.4.
    The bank manager believes that the Poisson distribution may be a good model for the number of mortgages approved each week. She uses a Poisson distribution with a mean of 1.4 to calculate expected frequencies as follows.
    Number of mortgages approved | 0 | 1 | 2 | 3 | 4 | 5 or more
    Expected frequency | 9.86 | \(r\) | 9.67 | 4.51 | 1.58 | \(s\)
  3. Find the value of \(r\) and the value of \(s\), giving your answers to 2 decimal places.
    The bank manager will test, at the \(5 \%\) level of significance, whether or not the data can be modelled by a Poisson distribution.
  4. Calculate the test statistic and state the conclusion for this test. State clearly the degrees of freedom and the hypotheses used in the test.
AQA Further Paper 3 Statistics 2024 June Q16
Moderate -0.8
16
2 The random variable \(T\) has an exponential distribution with mean 2. Find \(\mathrm { P } ( T \leq 1.4 )\). Circle your answer.
\(\mathrm { e } ^ { - 2.8 }\)   \(\mathrm { e } ^ { - 0.7 }\)   \(1 - \mathrm { e } ^ { - 0.7 }\)   \(1 - \mathrm { e } ^ { - 2.8 }\)
The continuous random variable \(Y\) has cumulative distribution function $$\mathrm { F } ( y ) = \left\{ \begin{array} { l r } 0 & y < 2 \\ - \frac { 1 } { 9 } y ^ { 2 } + \frac { 10 } { 9 } y - \frac { 16 } { 9 } & 2 \leq y < 5 \\ 1 & y \geq 5 \end{array} \right.$$ Find the median of \(Y\). Circle your answer.
2   \(\frac { 10 - 3 \sqrt { 2 } } { 2 }\)   \(\frac { 7 } { 2 }\)   \(\frac { 10 + 3 \sqrt { 2 } } { 2 }\)
4 Research has shown that the mean number of volcanic eruptions on Earth each day is 20. Sandra records 162 volcanic eruptions during a period of one week. Sandra claims that there has been an increase in the mean number of volcanic eruptions per week. Test Sandra's claim at the \(5 \%\) level of significance.
5 The continuous random variable \(X\) has probability density function $$f ( x ) = \begin{cases} \frac { 1 } { 6 } e ^ { \frac { x } { 3 } } & 0 \leq x \leq \ln 27 \\ 0 & \text { otherwise } \end{cases}$$ Show that the mean of \(X\) is \(\frac { 3 } { 2 } ( \ln 27 - 2 )\).
6 Over time it has been accepted that the mean retirement age for professional baseball players is 29.5 years old. Imran claims that the mean retirement age is no longer 29.5 years old.
He takes a random sample of 5 recently retired professional baseball players and records their retirement ages, \(x\). The results are $$\sum x = 152.1 \quad \text { and } \quad \sum ( x - \bar { x } ) ^ { 2 } = 7.81$$
  1. State an assumption that you should make about the distribution of the retirement ages to investigate Imran's claim.
  2. Investigate Imran's claim, using the 10\% level of significance.
CAIE FP2 2017 June Q10
12 marks Standard +0.3
Roberto owns a small hotel and offers accommodation to guests. Over a period of \(100\) nights, the numbers of rooms, \(x\), that are occupied each night at Roberto's hotel and the corresponding frequencies are shown in the following table.
Number of rooms occupied \((x)\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\)
Number of nights | 4 | 9 | 18 | 26 | 20 | 16 | 7 | 0
  1. Show that the mean number of rooms that are occupied each night is \(3.25\). [1]
The following table shows most of the corresponding expected frequencies, correct to \(2\) decimal places, using a Poisson distribution with mean \(3.25\).
Number of rooms occupied \((x)\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\)
Observed frequency | 4 | 9 | 18 | 26 | 20 | 16 | 7 | 0
Expected frequency | 3.88 | 12.60 | 20.48 | 22.18 | 18.02 | 11.72 | |
  2. Show how the expected value of \(22.18\), for \(x = 3\), is obtained and find the expected values for \(x = 6\) and for \(x \geqslant 7\). [4]
  3. Use a goodness-of-fit test at the \(5\%\) significance level to determine whether the Poisson distribution is a suitable model for the number of rooms occupied each night at Roberto's hotel. [7]
Edexcel S3 2015 June Q3
11 marks Standard +0.3
The number of accidents on a particular stretch of motorway was recorded each day for 200 consecutive days. The results are summarised in the following table.
Number of accidents | 0 | 1 | 2 | 3 | 4 | 5
Frequency | 47 | 57 | 46 | 35 | 9 | 6
  1. Show that the mean number of accidents per day for these data is 1.6 [1]
A motorway supervisor believes that the number of accidents per day on this stretch of motorway can be modelled by a Poisson distribution. She uses the mean found in part (a) to calculate the expected frequencies for this model. Her results are given in the following table.
Number of accidents | 0 | 1 | 2 | 3 | 4 | 5 or more
Frequency | 40.38 | 64.61 | \(r\) | 27.57 | 11.03 | \(s\)
  2. Find the value of \(r\) and the value of \(s\), giving your answers to 2 decimal places. [3]
  3. Stating your hypotheses clearly, use a 10\% level of significance to test the motorway supervisor's belief. Show your working clearly. [7]