Chi-squared goodness of fit: Poisson

A question is this type if and only if it tests whether observed frequency data fits a Poisson distribution, possibly with parameter estimated from data.
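As a hedged reminder of the calculation these questions share (the symbols \(n\), \(k\), \(m\) and \(\nu\) are notation assumed here, not fixed by any one exam board), the expected frequencies, test statistic and degrees of freedom can be written as
$$E_i = n \, \mathrm{P}(X = i) = n \, \frac{\mathrm{e}^{-\lambda} \lambda^{i}}{i!}, \qquad X^2 = \sum_i \frac{\left( O_i - E_i \right)^2}{E_i}, \qquad \nu = k - 1 - m ,$$
where \(n\) is the total observed frequency, \(k\) is the number of cells left after any combining, and \(m\) is the number of parameters estimated from the data (\(m = 0\) when \(\lambda\) is specified, \(m = 1\) when \(\lambda\) is replaced by the sample mean). The open-ended final cell takes the remaining probability, and cells are usually combined until every expected frequency is at least 5.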

33 questions · Standard +0.4

CAIE Further Paper 4 2022 June Q4
8 marks Standard +0.3
4 A scientist is investigating the numbers of a particular type of butterfly in a certain region. He claims that the numbers of these butterflies found per square metre can be modelled by a Poisson distribution with mean 2.5. He takes a random sample of 120 areas, each of one square metre, and counts the number of these butterflies in each of these areas. The following table shows the observed frequencies together with some of the expected frequencies using the scientist's Poisson distribution.
Number per square metre | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\)
Observed frequency | 12 | 20 | 36 | 32 | 13 | 6 | 1 | 0
Expected frequency | 9.85 | 24.63 | 30.78 | 25.65 | \(p\) | 8.02 | 3.34 | \(q\)
  1. Find the values of \(p\) and \(q\), correct to 2 decimal places.
  2. Carry out a goodness of fit test, at the \(10 \%\) significance level, to test the scientist's claim.
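The missing expected frequencies in a table like this can be checked numerically. The following is a minimal sketch, assuming the open-ended final cell takes whatever probability remains; the helper names `poisson_pmf` and `expected_frequencies` are hypothetical and not part of the question.

```python
import math

def poisson_pmf(k: int, mean: float) -> float:
    """P(X = k) for X ~ Po(mean)."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def expected_frequencies(mean: float, n: int, last_cell: int) -> list:
    """Expected counts for cells 0, 1, ..., last_cell - 1 plus an open cell '>= last_cell'."""
    body = [n * poisson_pmf(k, mean) for k in range(last_cell)]
    tail = n - sum(body)  # the open-ended cell takes the remaining probability
    return body + [tail]

# Values from the butterfly question: Po(2.5), 120 areas, final cell ">= 7".
expected = expected_frequencies(mean=2.5, n=120, last_cell=7)
p, q = expected[4], expected[7]
print(f"p = {p:.2f}, q = {q:.2f}")
```

The printed values can be compared with the entries already given in the table (for example `expected[0]` against 9.85) as a check on the method.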
CAIE Further Paper 4 2023 November Q2
8 marks Standard +0.3
2 The number of breakdowns on a particular section of road is recorded each day over a period of 90 days. It is suggested that the number of breakdowns follows a Poisson distribution with mean 3.5. The data is summarised in the table, together with some of the expected frequencies resulting from the suggested Poisson distribution.
Number of breakdowns per day | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 or more
Observed frequency | 0 | 5 | 13 | 17 | 21 | 16 | 9 | 5 | 4
Expected frequency | 2.718 | 9.512 | 16.646 |  | 16.993 | 11.895 |  | 3.469 | 2.407
  1. Complete the table.
  2. Carry out a goodness of fit test, at the 10\% significance level, to determine whether or not \(\operatorname { Po } ( 3.5 )\) is a good fit to the data.
CAIE Further Paper 4 2024 November Q3
8 marks Standard +0.3
3 A statistician believes that the number of telephone calls received by an advice centre in a 10-minute interval can be modelled by the Poisson distribution \(\mathrm { Po } ( 1.9 )\). The number of calls received in a randomly chosen 10-minute interval was recorded on each of 100 days. The results are summarised in the table, together with some of the expected frequencies corresponding to the distribution \(\operatorname { Po } ( 1.9 )\).
Number of calls | 0 | 1 | 2 | 3 | 4 | 5 | 6 or more
Observed frequency | 10 | 18 | 35 | 21 | 11 | 4 | 1
Expected frequency | 14.957 | 28.418 | 26.997 |  |  |  | 1.322
  1. Complete the table.
  2. Carry out a goodness of fit test, at the \(10 \%\) significance level, to determine whether the statistician's belief is reasonable.
OCR S3 2011 June Q4
9 marks Standard +0.3
4 An experiment by Lord Rutherford at Cambridge in 1909 involved measuring the numbers of \(\alpha\)-particles emitted during radioactive decay. The following table shows emissions during 2608 intervals of 7.5 seconds.
Number of particles emitted, \(x\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | \(\geqslant 11\)
Frequency | 57 | 203 | 383 | 525 | 532 | 408 | 273 | 139 | 45 | 27 | 10 | 6
It is given that the mean number of particles emitted per interval, calculated from the data, is 3.87, correct to 3 significant figures.
  1. Find the contribution to the \(\chi ^ { 2 }\) value of the frequency of 273 corresponding to \(x = 6\) in a goodness of fit test for a Poisson distribution.
  2. Given that no cells need to be combined, state why the number of degrees of freedom is 10.
  3. Given also that the calculated value of \(\chi ^ { 2 }\) is 13.0, correct to 3 significant figures, carry out the test at the 10\% significance level.
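The single-cell contribution and the degrees-of-freedom count can be sketched as follows. This is an illustration only, assuming the rounded mean 3.87 quoted in the question is used directly; the variable names are hypothetical.

```python
import math

N_INTERVALS = 2608
SAMPLE_MEAN = 3.87      # given in the question, estimated from the data
OBSERVED_AT_6 = 273

# Expected frequency for x = 6 under Po(3.87).
expected_at_6 = (
    N_INTERVALS * math.exp(-SAMPLE_MEAN) * SAMPLE_MEAN ** 6 / math.factorial(6)
)

# Contribution of this single cell to the chi-squared statistic.
contribution = (OBSERVED_AT_6 - expected_at_6) ** 2 / expected_at_6

# Degrees of freedom: 12 cells (0 to 10 and ">= 11"), minus 1 for the fixed
# total, minus 1 for the mean estimated from the data.
cells = 12
estimated_parameters = 1
degrees_of_freedom = cells - 1 - estimated_parameters

print(f"expected = {expected_at_6:.2f}, contribution = {contribution:.3f}, "
      f"df = {degrees_of_freedom}")
```

For part 3, the quoted statistic of 13.0 would then be compared with the tabulated upper 10% point of \(\chi^2\) with 10 degrees of freedom (about 15.99 in standard tables).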
OCR MEI S3 2008 June Q4
17 marks Standard +0.3
4
  1. A researcher is investigating the feeding habits of bees. She sets up a feeding station some distance from a beehive and, over a long period of time, records the numbers of bees arriving each minute. For a random sample of 100 one-minute intervals she obtains the following results.
    Number of bees | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | \(\geqslant 8\)
    Number of intervals | 6 | 16 | 19 | 18 | 17 | 14 | 6 | 4 | 0
    1. Show that the sample mean is 3.1 and find the sample variance. Do these values support the possibility of a Poisson model for the number of bees arriving each minute? Explain your answer.
    2. Use the mean in part (i) to carry out a test of the goodness of fit of a Poisson model to the data.
  2. The researcher notes the length of time, in minutes, that each bee spends at the feeding station. The times spent are assumed to be Normally distributed. For a random sample of 10 bees, the mean is found to be 1.465 minutes and the standard deviation is 0.3288 minutes. Find a \(95 \%\) confidence interval for the overall mean time.
OCR MEI S3 2012 January Q3
18 marks Standard +0.3
3
  1. A medical researcher is looking into the delay, in years, between first and second myocardial infarctions (heart attacks). The following table shows the results for a random sample of 225 patients.
    Delay (years) | \(0 -\) | \(1 -\) | \(2 -\) | \(3 -\) | \(4 - 10\)
    Number of patients | 160 | 40 | 13 | 9 | 3
    The mean of this sample is used to construct a model which gives the following expected frequencies.
    Delay (years) | \(0 -\) | \(1 -\) | \(2 -\) | \(3 -\) | \(4 - 10\)
    Number of patients | 142.23 | 52.32 | 19.25 | 7.08 | 4.12
    Carry out a test, using a \(2.5 \%\) level of significance, of the goodness of fit of the model to the data.
  2. A further piece of research compares the incidence of myocardial infarction in men aged 55 to 70 with that in women aged 55 to 70. Incidence is measured by the number of infarctions per 10000 of the population. For a random sample of 8 health authorities across the UK, the following results for the year 2010 were obtained.
    Health authority | A | B | C | D | E | F | G | H
    Incidence in men | 47 | 56 | 15 | 51 | 45 | 54 | 50 | 32
    Incidence in women | 36 | 30 | 30 | 47 | 54 | 55 | 27 | 27
    A Wilcoxon paired sample test, using the hypotheses \(\mathrm { H } _ { 0 } : m = 0\) and \(\mathrm { H } _ { 1 } : m \neq 0\) where \(m\) is the population median difference, is to be carried out to investigate whether there is any difference between men and women on the whole.
    1. Explain why a paired test is being used in this context.
    2. Carry out the test using a \(10 \%\) level of significance.
OCR MEI S3 2011 June Q2
18 marks Standard +0.3
2 Scientists researching into the chemical composition of dust in space collect specimens using a specially designed spacecraft. The craft collects the particles of dust in trays that are made up of a large array of cells containing aerogel. The aerogel traps the particles that penetrate into the cells.
  1. For a random sample of 100 cells, the number of particles of dust in each cell was counted, giving the following results.
    Number of particles | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | \(10 +\)
    Frequency | 4 | 7 | 10 | 20 | 17 | 15 | 10 | 9 | 5 | 3 | 0
    It is thought that the number of particles collected in each cell can be modelled using the distribution Poisson(4.2) since 4.2 is the sample mean for these data. Some of the calculations for a \(\chi ^ { 2 }\) test are shown below. The cells for 8, 9 and \(10 +\) particles have been combined.
    Number of particles | 5 | 6 | 7 | \(8 +\)
    Observed frequency | 15 | 10 | 9 | 8
    Expected frequency | 16.33 | 11.44 | 6.86 | 6.39
    Contribution to \(X ^ { 2 }\) | 0.1083 | 0.1813 | 0.6676 | 0.4056
    Complete the calculations and carry out the test using a \(10 \%\) significance level to see whether the number of particles per cell may be modelled in this way.
  2. The diameters of the dust particles are believed to be distributed symmetrically about a median of 15 micrometres \(( \mu \mathrm { m } )\). For a random sample of 20 particles, the sum of the signed ranks of the diameters of the particles smaller than \(15 \mu \mathrm {~m}\) \(\left( W _ { - } \right)\) is found to be 53. Test at the \(5 \%\) level of significance whether the median diameter appears to be more than \(15 \mu \mathrm {~m}\).
OCR MEI S3 2012 June Q4
18 marks Standard +0.3
4 The numbers of call-outs per day received by a fire station for a random sample of 255 weekdays were recorded as follows.
Number of call-outs | 0 | 1 | 2 | 3 | 4 | 5 or more
Frequency (days) | 145 | 79 | 22 | 6 | 3 | 0
The mean number of call-outs per day for these data is 0.6. A Poisson model, using this sample mean of 0.6, is fitted to the data, and gives the following expected frequencies (correct to 3 decimal places).
Number of call-outs | 0 | 1 | 2 | 3 | 4 | 5 or more
Expected frequency | 139.947 | 83.968 | 25.190 | 5.038 | 0.756 | 0.101
  1. Using a \(5 \%\) significance level, carry out a test to examine the goodness of fit of the model to the data.
    The time \(T\), measured in days, that elapses between successive call-outs can be modelled using the exponential distribution for which \(\mathrm { f } ( t )\), the probability density function, is $$\mathrm { f } ( t ) = \begin{cases} 0 & t < 0 , \\ \lambda \mathrm { e } ^ { - \lambda t } & t \geqslant 0 , \end{cases}$$ where \(\lambda\) is a positive constant.
  2. For the distribution above, it can be shown that \(\mathrm { E } ( T ) = \frac { 1 } { \lambda }\). Given that the mean time between successive call-outs is \(\frac { 5 } { 3 }\) days, write down the value of \(\lambda\).
  3. Find \(\mathrm { F } ( t )\), the cumulative distribution function.
  4. Find the probability that the time between successive call-outs is more than 1 day.
  5. Find the median time that elapses between successive call-outs.
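A hedged worked sketch of parts 2 to 5, assuming only the density given in the question and \(\mathrm { E } ( T ) = \frac { 5 } { 3 }\) (the symbol \(m\) for the median is notation introduced here):
$$\lambda = \frac{1}{\mathrm{E}(T)} = \frac{3}{5}, \qquad \mathrm{F}(t) = \int_0^t \lambda \mathrm{e}^{-\lambda u} \, \mathrm{d}u = 1 - \mathrm{e}^{-0.6 t} \quad (t \geqslant 0), \qquad \mathrm{F}(t) = 0 \quad (t < 0),$$
$$\mathrm{P}(T > 1) = 1 - \mathrm{F}(1) = \mathrm{e}^{-0.6} \approx 0.549, \qquad \mathrm{F}(m) = \tfrac{1}{2} \;\Rightarrow\; m = \frac{\ln 2}{0.6} \approx 1.155 \text{ days}.$$
The rate \(\lambda = 0.6\) per day agrees with the sample mean of 0.6 call-outs per day used for the Poisson model.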
CAIE FP2 2011 June Q10 OR
A family was asked to record the number of letters delivered to their house on each of 200 randomly chosen weekdays. The results are summarised in the following table.
Number of letters | 0 | 1 | 2 | 3 | 4 | 5 | \(\geqslant 6\)
Number of days | 57 | 60 | 53 | 25 | 4 | 1 | 0
It is suggested that the number of letters delivered each weekday has a Poisson distribution. By finding the mean and variance for this sample, comment on the appropriateness of this suggestion. The following table includes some of the expected values, correct to 3 decimal places, using a Poisson distribution with mean equal to the sample mean for the above data.
Number of letters | 0 | 1 | 2 | 3 | 4 | 5 | \(\geqslant 6\)
Expected number of days | 53.964 | 70.693 | \(p\) | \(q\) | 6.622 | 1.735 | 0.463
  1. Show that \(p = 46.304\), correct to 3 decimal places, and find \(q\).
  2. Carry out a goodness of fit test at the \(10 \%\) significance level.
CAIE FP2 2017 June Q10
12 marks Standard +0.3
10 Roberto owns a small hotel and offers accommodation to guests. Over a period of 100 nights, the numbers of rooms, \(x\), that are occupied each night at Roberto's hotel and the corresponding frequencies are shown in the following table.
Number of rooms occupied \(( x )\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\)
Number of nights | 4 | 9 | 18 | 26 | 20 | 16 | 7 | 0
  1. Show that the mean number of rooms that are occupied each night is 3.25.
    The following table shows most of the corresponding expected frequencies, correct to 2 decimal places, using a Poisson distribution with mean 3.25.
    Number of rooms occupied \(( x )\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\)
    Observed frequency | 4 | 9 | 18 | 26 | 20 | 16 | 7 | 0
    Expected frequency | 3.88 | 12.60 | 20.48 | 22.18 | 18.02 | 11.72 |  |
  2. Show how the expected value of 22.18, for \(x = 3\), is obtained and find the expected values for \(x = 6\) and for \(x \geqslant 7\).
  3. Use a goodness-of-fit test at the \(5 \%\) significance level to determine whether the Poisson distribution is a suitable model for the number of rooms occupied each night at Roberto's hotel.
CAIE FP2 2014 November Q8
9 marks Standard +0.8
8 The numbers of a particular type of laptop computer sold by a store on each of 100 consecutive Saturdays are summarised in the following table.
Number sold | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | \(\geqslant 8\)
Number of Saturdays | 7 | 20 | 39 | 16 | 14 | 2 | 1 | 1 | 0
Fit a Poisson distribution to the data and carry out a goodness of fit test at the \(2.5 \%\) significance level.
CAIE FP2 2015 November Q8
10 marks Standard +0.8
8 The number of goals scored by a certain football team was recorded for each of 100 matches, and the results are summarised in the following table.
Number of goals | 0 | 1 | 2 | 3 | 4 | 5 | 6 or more
Frequency | 12 | 16 | 31 | 25 | 13 | 3 | 0
Fit a Poisson distribution to the data, and test its goodness of fit at the \(5 \%\) significance level.
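A "fit and test" question of this kind chains four steps: estimate the mean, compute expected frequencies, combine small cells, and form the statistic. The sketch below is one possible way to organise that, not a mark-scheme method. It assumes that cells are pooled from the upper tail until the last expected frequency is at least 5, and that the open-ended cell can be treated as its lowest value when estimating the mean (harmless here because that cell is empty). The function name `fit_poisson_and_test` is hypothetical.

```python
import math

def fit_poisson_and_test(observed):
    """Fit Po(sample mean) to observed[k] = frequency of value k.

    The last entry of `observed` is the open cell '>= len(observed) - 1'.
    Returns (mean, chi-squared statistic, degrees of freedom).
    """
    n = sum(observed)
    mean = sum(k * f for k, f in enumerate(observed)) / n

    last = len(observed) - 1
    expected = [
        n * math.exp(-mean) * mean ** k / math.factorial(k) for k in range(last)
    ]
    expected.append(n - sum(expected))  # open cell takes the remaining probability

    # Pool cells from the upper tail until the final expected frequency is >= 5.
    obs, exp = list(observed), list(expected)
    while len(exp) > 2 and exp[-1] < 5:
        tail_obs, tail_exp = obs.pop(), exp.pop()
        obs[-1] += tail_obs
        exp[-1] += tail_exp

    statistic = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    df = len(exp) - 1 - 1  # one for the fixed total, one for the estimated mean
    return mean, statistic, df

goals = [12, 16, 31, 25, 13, 3, 0]  # frequencies for 0, 1, ..., 5 and "6 or more"
mean, statistic, df = fit_poisson_and_test(goals)
print(f"mean = {mean:.2f}, X^2 = {statistic:.2f}, df = {df}")
```

The resulting statistic would be compared with the tabulated upper 5% point of \(\chi^2\) for the computed degrees of freedom (9.488 for 4 degrees of freedom in standard tables).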
CAIE FP2 2018 November Q10
12 marks Standard +0.3
10 The number of accidents, \(x\), that occur each day on a motorway are recorded over a period of 40 days. The results are shown in the following table.
Number of accidents | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\)
Observed frequency | 3 | 5 | 8 | 10 | 5 | 7 | 2 | 0
  1. Show that the mean number of accidents each day is 2.95 and calculate the variance for this sample. Explain why these values suggest that a Poisson distribution might fit the data.
    A Poisson distribution with mean 2.95, as found from the data, is used to calculate the expected frequencies, correct to 2 decimal places. The results are shown in the following table.
    Number of accidents | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\)
    Observed frequency | 3 | 5 | 8 | 10 | 5 | 7 | 2 | 0
    Expected frequency | 2.09 | 6.18 | 9.11 | 8.96 | 6.61 | 3.90 | 1.92 | 1.23
  2. Show how the expected frequency of 6.61 for \(x = 4\) is obtained.
  3. Test at the \(5 \%\) significance level the goodness of fit of this Poisson distribution to the data.
CAIE FP2 2019 November Q11 OR
Standard +0.3
The number of puncture repairs carried out each week by a small repair shop is recorded over a period of 40 weeks. The results are shown in the following table.
Number of repairs in a week | 0 | 1 | 2 | 3 | 4 | 5 | \(\geqslant 6\)
Number of weeks | 6 | 15 | 9 | 6 | 3 | 1 | 0
  1. Calculate the mean and variance for the number of repairs in a week and comment on the possible suitability of a Poisson distribution to model the data.
    Records over a longer period of time indicate that the mean number of repairs in a week is 1.6. The following table shows some of the expected frequencies, correct to 3 decimal places, for a period of 40 weeks using a Poisson distribution with mean 1.6.
    Number of repairs in a week | 0 | 1 | 2 | 3 | 4 | 5 | \(\geqslant 6\)
    Expected frequency | 8.076 | 12.921 | 10.337 | 5.513 | 2.205 | \(a\) | \(b\)
  2. Show that \(a = 0.706\) and find the value of the constant \(b\).
  3. Carry out a goodness of fit test of a Poisson distribution with mean 1.6, using a \(10 \%\) significance level.
CAIE FP2 2017 Specimen Q8
10 marks Challenging +1.2
8 The number of goals scored by a certain football team was recorded for each of 100 matches, and the results are summarised in the following table.
Number of goals | 0 | 1 | 2 | 3 | 4 | 5 | 6 or more
Frequency | 12 | 16 | 31 | 25 | 13 | 3 | 0
Fit a Poisson distribution to the data, and test its goodness of fit at the 5\% significance level.
OCR Further Statistics 2023 June Q8
16 marks Challenging +1.2
8 A team of researchers have reason to believe that the number of calls received in randomly chosen 10-minute intervals to a call centre can be well modelled by a Poisson distribution. To test this belief the researchers record the number of telephone calls received in 60 randomly chosen 10-minute intervals. The results, together with relevant calculations, are shown in the following table.
Number of calls, \(r\) | 0 | 1 | 2 | 3 | 4 | \(\geqslant 5\) | Total
Observed frequency, \(f\) | 18 | 13 | 12 | 9 | 8 | 0 | 60
\(rf\) | 0 | 13 | 24 | 27 | 32 | 0 | 96
\(r ^ { 2 } f\) | 0 | 13 | 48 | 81 | 128 | 0 | 270
Expected frequency | 12.114 | 19.382 | 15.506 | 8.270 | 3.308 | 1.421 | 60
Contribution to test statistic | 2.860 | 2.101 | 0.793 | 1.232 |  |  | 6.99
  1. Calculate the mean of the observed number of calls received.
  2. Calculate the variance of the observed number of calls received.
  3. Comment on what your answers to parts (a) and (b) suggest about the proposed model.
  4. Explain why it is necessary to combine some cells in the table.
  5. Show how the values 15.506 and 0.793 in the table were obtained.
  6. Carry out the test, at the \(5 \%\) significance level.
    In the light of the result of the test, the team consider that a different model is appropriate. They propose the following improved model: $$P ( R = r ) = \begin{cases} \frac { 1 } { 60 } ( a + ( 2 - r ) b ) & r = 0,1,2,3,4 \\ 0 & \text { otherwise } \end{cases}$$ where \(a\) and \(b\) are integers.
  7. Use at least three of the observed frequencies to suggest appropriate values for \(a\) and \(b\). You should consider more than one possible pair of values, and explain which pair of values you consider best. (Do not carry out a goodness-of-fit test.)
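The arithmetic behind the mean, the variance and the two quoted table entries can be sketched from the column totals. This is an illustration under stated assumptions: the variance uses divisor \(n\) (the question does not say which definition is intended), and all variable names are hypothetical.

```python
import math

n, sum_rf, sum_r2f = 60, 96, 270  # totals taken from the table

mean = sum_rf / n
variance = sum_r2f / n - mean ** 2  # divisor-n variance; an assumed definition

# Expected frequency for r = 2 under Po(mean), and the contribution of that
# cell to the test statistic given its observed frequency of 12.
expected_2 = n * math.exp(-mean) * mean ** 2 / math.factorial(2)
contribution_2 = (12 - expected_2) ** 2 / expected_2

print(f"mean = {mean}, variance = {variance:.2f}")
print(f"E(r=2) = {expected_2:.3f}, contribution = {contribution_2:.3f}")
```

The last two printed values can be compared with the entries 15.506 and 0.793 given in the table.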
OCR Further Statistics 2024 June Q5
12 marks Standard +0.3
5 Some bird-watchers study the song of chaffinches in a particular wood. They investigate whether the number, \(N\), of separate bursts of song in a 5 minute period can be modelled by a Poisson distribution. They assume that a burst of song can be considered as a single event, and that bursts of song occur randomly.
(a) State two further assumptions needed for \(N\) to be well modelled by a Poisson distribution.
The bird-watchers record the value of \(N\) in each of 60 periods of 5 minutes. The mean and variance of the results are 3.55 and 5.6475 respectively.
(b) Explain what this suggests about the validity of a Poisson distribution as a model in this context.
The complete results are shown in the table.
\(n\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | \(\geqslant 9\)
Frequency | 10 | 3 | 7 | 8 | 13 | 6 | 6 | 2 | 5 | 0
The bird-watchers carry out a \(\chi ^ { 2 }\) goodness of fit test at the \(5 \%\) significance level.
(c) State suitable hypotheses for the test.
(d) Determine the contribution to the test statistic for \(n = 3\).
(e) The total value of the test statistic, obtained by combining the cells for \(n \leqslant 1\) and also for \(n \geqslant 6\), is 9.202, correct to 4 significant figures. Complete the goodness of fit test.
(f) It is known that chaffinches are more likely to sing in the presence of other chaffinches. Explain whether this fact affects the validity of a Poisson model for \(N\).
Edexcel S3 2022 January Q6
12 marks Standard +0.8
  1. The number of emails per hour received by a helpdesk were recorded. The results for a random sample of 80 one-hour periods are shown in the table.
Number of emails per hour | 0 | 1 | 2 | 3 | 4 | 5 | 6
Frequencies | 1 | 10 | 23 | 15 | 19 | 9 | 3
  1. Show that the mean number of emails per hour in the sample is 3.
    The manager believes that the number of emails per hour received could be modelled by a Poisson distribution. The following table shows some of the expected frequencies.
    Number of emails per hour | Expected Frequencies
    0 | \(r\)
    1 | 11.949
    2 | 17.923
    3 | 17.923
    4 | 13.443
    5 | \(s\)
    \(\geqslant 6\) | \(t\)
  2. Find the values of \(r , s\) and \(t\), giving your answers to 3 decimal places.
  3. Using a 10\% significance level, test whether or not a Poisson model is reasonable. You should clearly state your hypotheses, test statistic and the critical value used.
Edexcel S3 2024 January Q4
10 marks Standard +0.3
  1. The number of jobs sent to a printer per hour in a small office is recorded for 120 hours. The results are summarised in the following table.
Number of jobs | 0 | 1 | 2 | 3 | 4 | 5
Frequency | 24 | 34 | 28 | 21 | 8 | 5
  1. Show that the mean number of jobs sent to the printer per hour for these data is 1.75.
    The office manager believes that the number of jobs sent to the printer per hour can be modelled using a Poisson distribution. The office manager uses the mean given in part (a) to calculate the expected frequencies for this model. Some of the results are given in the following table.
    Number of jobs | 0 | 1 | 2 | 3 | 4 | 5 or more
    Expected frequency | 20.85 | 36.49 | 31.93 | \(r\) | \(s\) | 3.95
  2. Show that the value of \(s\) is 8.15 to 2 decimal places.
  3. Find the value of \(r\) to 2 decimal places.
    The value of \(\sum \frac { \left( O _ { i } - E _ { i } \right) ^ { 2 } } { E _ { i } }\) for the first four frequencies in the table is 1.43.
  4. Test, at the \(5 \%\) level of significance, whether or not the number of jobs sent to the printer per hour can be modelled using a Poisson distribution. Show your working clearly, stating your hypotheses, test statistic and critical value.
Edexcel S3 2006 January Q6
13 marks Standard +0.3
6. An area of grass was sampled by placing a \(1 \mathrm {~m} \times 1 \mathrm {~m}\) square randomly in 100 places. The numbers of daisies in each of the squares were counted. It was decided that the resulting data could be modelled by a Poisson distribution with mean 2. The expected frequencies were calculated using the model. The following table shows the observed and expected frequencies.
Number of daisies | Observed frequency | Expected frequency
0 | 8 | 13.53
1 | 32 | 27.07
2 | 27 | \(r\)
3 | 18 | \(s\)
4 | 10 | 9.02
5 | 3 | 3.61
6 | 1 | 1.20
7 | 0 | 0.34
\(\geq 8\) | 1 | \(t\)
  1. Find values for \(r , s\) and \(t\).
  2. Using a \(5 \%\) significance level, test whether or not this Poisson model is suitable. State your hypotheses clearly.
    An alternative test might have been to estimate the population mean by using the data given.
  3. Explain how this would have affected the test.
    (2)
Edexcel S3 2005 June Q5
12 marks Standard +0.8
5. The number of times per day a computer fails and has to be restarted is recorded for 200 days. The results are summarised in the table.
Number of restarts | Frequency
0 | 99
1 | 65
2 | 22
3 | 12
4 | 2
Test whether or not a Poisson model is suitable to represent the number of restarts per day. Use a \(5 \%\) level of significance and state your hypothesis clearly.
(Total 12 marks)
Edexcel S3 2011 June Q5
13 marks Standard +0.3
  1. The number of hurricanes per year in a particular region was recorded over 80 years. The results are summarised in Table 1 below.
Number of hurricanes, \(h\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
Frequency | 0 | 2 | 5 | 17 | 20 | 12 | 12 | 12
Table 1
  1. Write down two assumptions that will support modelling the number of hurricanes per year by a Poisson distribution.
  2. Show that the mean number of hurricanes per year from Table 1 is 4.4875
  3. Use the answer in part (b) to calculate the expected frequencies \(r\) and \(s\) given in Table 2 below to 2 decimal places.
    \(h\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 or more
    Expected frequency | 0.90 | 4.04 | \(r\) | 13.55 | \(s\) | 13.65 | 10.21 | 13.39
    Table 2
  4. Test, at the \(5 \%\) level of significance, whether or not the data can be modelled by a Poisson distribution. State your hypotheses clearly.
Edexcel S3 Q4
13 marks Standard +0.3
4. Breakdowns on a certain stretch of motorway were recorded each day for 80 consecutive days. The results are summarised in the table below.
Number of breakdowns | 0 | 1 | 2 | \(> 2\)
Frequency | 38 | 32 | 10 | 0
It is suggested that the number of breakdowns per day can be modelled by a Poisson distribution. Using a \(5 \%\) level of significance, test whether or not the Poisson distribution is a suitable model for these data. State your hypotheses clearly.
(13 marks)