Goodness-of-fit test for Poisson

A question is this type if and only if it involves using a chi-squared goodness-of-fit test to determine whether observed frequency data fits a Poisson distribution.
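As a reminder of the calculation this question type relies on (standard textbook notation, not taken from any one paper below): with observed frequencies \(O_i\), total frequency \(n\) and Poisson mean \(\lambda\), the expected frequencies, test statistic and degrees of freedom are usually written as

$$E_i = n \, \frac{e^{-\lambda} \lambda^{i}}{i!}, \qquad X^{2} = \sum_{i} \frac{(O_i - E_i)^{2}}{E_i}, \qquad \nu = k - 1 - m$$

Here \(k\) is the number of cells after combining any cells with small expected frequency (commonly taken as below 5), and \(m = 1\) if \(\lambda\) was estimated from the sample or \(m = 0\) if it was specified in advance. The final cell uses the tail probability, so that the probabilities sum to 1.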

10 questions

CAIE Further Paper 4 2020 Specimen Q2
2 Each of [illegible] identically biased dice is thrown repeatedly until an even number is obtained. The number of throws needed is recorded and the results are summarised in the following table.
\begin{tabular}{|l|c|c|c|c|c|c|c|}
\hline
Number of throws & 1 & 2 & 3 & 4 & 5 & 6 & \(\geqslant 7\) \\
\hline
Frequency & \multicolumn{7}{c|}{[illegible]} \\
\hline
\end{tabular}
Carry out a goodness of fit test, at the [illegible]\% significance level, to test whether Ge([illegible]) is a satisfactory model for the data.
Edexcel S2 2009 January Q1
  1. A botanist is studying the distribution of daisies in a field. The field is divided into a number of equal sized squares. The mean number of daisies per square is assumed to be 3. The daisies are distributed randomly throughout the field.
Find the probability that, in a randomly chosen square there will be
  1. more than 2 daisies,
  2. either 5 or 6 daisies.
    The botanist decides to count the number of daisies, \(x\), in each of 80 randomly selected squares within the field. The results are summarised below
    $$\sum x = 295 \quad \sum x ^ { 2 } = 1386$$
  3. Calculate the mean and the variance of the number of daisies per square for the 80 squares. Give your answers to 2 decimal places.
  4. Explain how the answers from part (c) support the choice of a Poisson distribution as a model.
  5. Using your mean from part (c), estimate the probability that exactly 4 daisies will be found in a randomly selected square.
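The arithmetic behind the last three parts of this question can be sketched as follows. This is a minimal illustration, not a mark scheme: it assumes the divisor-\(n\) form of the variance (mean of squares minus square of the mean), and the helper name `poisson_pmf` is hypothetical.

```python
import math

n = 80          # number of squares counted (from the question)
sum_x = 295     # sum of x (from the question)
sum_x2 = 1386   # sum of x squared (from the question)

mean = sum_x / n
# Assumption: variance with divisor n. The divisor n - 1 form would be
# (sum_x2 - n * mean ** 2) / (n - 1) and gives a slightly larger value.
variance = sum_x2 / n - mean ** 2


def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


print(f"mean = {mean:.2f}, variance = {variance:.2f}")
# A mean and variance that are close to each other are consistent with a Poisson model.
print(f"P(X = 4) using the rounded mean: {poisson_pmf(4, round(mean, 2)):.4f}")
```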
OCR MEI Further Statistics A AS 2024 June Q3
3 A glassware factory produces a large number of ornaments each week. Just before they leave the factory, all the ornaments are checked and some may be found to be defective. The Quality Assurance Manager of the factory wishes to model the number of defective ornaments that are found each week using a Poisson distribution. The numbers of defective ornaments found each week in a period of 40 weeks are shown in Table 3.1. \begin{table}[h]
\captionsetup{labelformat=empty} \caption{Table 3.1}
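\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
No. of defective ornaments in a week, \(r\) & 0 & 1 & 2 & 3 & 4 & 5 & 6 & \(\geqslant 7\) \\
\hline
No. of weeks with \(r\) defective ornaments, \(f\) & 2 & 14 & 13 & 5 & 3 & 1 & 2 & 0 \\
\hline
\end{tabular}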
\end{table} You are given that summary statistics for the data are \(\sum f = 40\), \(\sum rf = 84\) and \(\sum r ^ { 2 } f = 256\).
  1. By using the summary statistics to determine estimates for the mean and variance of the number of defective ornaments produced by the factory each week, explain how the data support the suggestion that the number of defective ornaments produced each week can be modelled using a Poisson distribution.
    The Quality Assurance Manager is asked by the head office to carry out a chi-squared hypothesis test for goodness of fit based on a \(\operatorname { Po } ( 2 )\) distribution.
  2. Table 3.2, which is incomplete, gives observed frequency, probability, expected frequency and chi-squared contribution. \begin{table}[h]
    \captionsetup{labelformat=empty} \caption{Table 3.2}
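    \begin{tabular}{|c|c|c|c|c|}
    \hline
    No. of defective ornaments in a week, \(r\) & Observed frequency & Probability & Expected frequency & Chi-squared contribution \\
    \hline
    0 & 2 & 0.13534 & 5.4134 & 2.15232 \\
    \hline
    1 & 14 & & & \\
    \hline
    2 & 13 & 0.27067 & & 0.43620 \\
    \hline
    3 & 5 & & 7.2179 & \\
    \hline
    \(\geqslant 4\) & 6 & 0.14288 & & 0.01421 \\
    \hline
    \end{tabular}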
    \end{table}
    1. Complete the copy of the table in the Printed Answer Booklet.
    2. Carry out the test at the \(10 \%\) significance level.
  3. On one occasion a fork-lift truck in the factory drops a crate containing eight ornaments and all of them are subsequently found to be defective. Explain why the Poisson model cannot model defects occurring in this manner.
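The structure of the test based on \(\operatorname{Po}(2)\) can be sketched in a few lines. This is an illustrative outline under stated assumptions rather than the examiners' working: the cells are those of Table 3.2, no parameter is estimated (the mean 2 is specified), and the 10% critical value for 4 degrees of freedom is quoted from standard tables.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


observed = [2, 14, 13, 5, 6]   # cells r = 0, 1, 2, 3 and r >= 4 (Table 3.2)
lam = 2.0                      # mean specified in the question
n = sum(observed)              # 40 weeks

# Probabilities for r = 0..3, then the tail P(r >= 4) so the total is 1.
probs = [poisson_pmf(k, lam) for k in range(len(observed) - 1)]
probs.append(1.0 - sum(probs))

expected = [n * p for p in probs]
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)

# The mean was given, not estimated, so only the total is constrained.
dof = len(observed) - 1

CRITICAL_10_PERCENT_DF4 = 7.779  # standard chi-squared table value

for o, p, e, c in zip(observed, probs, expected, contributions):
    print(f"{o:3d}  {p:.5f}  {e:8.4f}  {c:.5f}")
print(f"X^2 = {statistic:.3f} on {dof} degrees of freedom")
print("reject H0" if statistic > CRITICAL_10_PERCENT_DF4 else "do not reject H0")
```

The printed rows can be compared against the values already given in Table 3.2 as a check on the method.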
OCR MEI Further Statistics A AS 2021 November Q7
7 A biologist is investigating migrating butterflies. Fig. 7.1 shows the numbers of migrating butterflies passing her location in 100 randomly chosen one-minute periods. \begin{table}[h]
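\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|}
\hline
Number of butterflies & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & \(\geqslant 8\) \\
\hline
Frequency & 6 & 9 & 18 & 26 & 13 & 16 & 9 & 3 & 0 \\
\hline
\end{tabular}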
\captionsetup{labelformat=empty} \caption{Fig. 7.1}
\end{table}
    1. Use the data to show that a suitable estimate for the mean number of butterflies passing her location per minute is 3.3.
    2. Explain how the value of the variance estimate calculated from the sample supports the suggestion that a Poisson distribution may be a suitable model for these data.
    The biologist decides to carry out a test to investigate whether a Poisson distribution may be a suitable model for these data.
  1. In this question you must show detailed reasoning. Complete the copy of Fig. 7.2 of expected frequencies and contributions for a chi-squared test in the Printed Answer Booklet. \begin{table}[h]
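    \begin{tabular}{|c|c|c|c|c|}
    \hline
    Number of butterflies & Frequency & Probability & Expected frequency & Chi-squared contribution \\
    \hline
    0 & 6 & 0.0369 & 3.6883 & 1.4489 \\
    \hline
    1 & 9 & 0.1217 & 12.1714 & 0.8264 \\
    \hline
    2 & 18 & & & 0.2160 \\
    \hline
    3 & 26 & & & 0.6916 \\
    \hline
    4 & 13 & 0.1823 & 18.2252 & 1.4981 \\
    \hline
    5 & 16 & 0.1203 & 12.0286 & \\
    \hline
    6 & 9 & 0.0662 & 6.6158 & 0.8593 \\
    \hline
    \(\geqslant 7\) & 3 & 0.0510 & 5.0966 & 0.8625 \\
    \hline
    \end{tabular}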
    \captionsetup{labelformat=empty} \caption{Fig. 7.2}
    \end{table}
  2. Complete the chi-squared test at the \(5 \%\) significance level.
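The mean and variance estimates asked for at the start of this question can be computed directly from the frequencies in Fig. 7.1. The sketch below assumes the unbiased (divisor \(n - 1\)) sample variance, which is one common convention; the variable names are illustrative.

```python
# Values and frequencies from Fig. 7.1 (the ">= 8" cell has frequency 0 and is omitted).
values = [0, 1, 2, 3, 4, 5, 6, 7]
freqs = [6, 9, 18, 26, 13, 16, 9, 3]

n = sum(freqs)
sum_x = sum(v * f for v, f in zip(values, freqs))
sum_x2 = sum(v * v * f for v, f in zip(values, freqs))

mean = sum_x / n
# Assumption: divisor n - 1 (unbiased estimate of the population variance).
sample_variance = (sum_x2 - n * mean ** 2) / (n - 1)

print(f"n = {n}, mean = {mean:.2f}, variance estimate = {sample_variance:.4f}")
# For a Poisson model the mean and variance should be of similar size.
```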
OCR MEI Further Statistics Minor 2019 June Q4
4 Zara uses a metal detector to search for coins on a beach.
She wonders if the numbers of coins that she finds in an area of \(10 \mathrm {~m} ^ { 2 }\) can be modelled by a Poisson distribution. The table below shows the numbers of coins that she finds in randomly chosen areas of \(10 \mathrm {~m} ^ { 2 }\) over a period of months.
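\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
Number of coins found & 0 & 1 & 2 & 3 & 4 & 5 & 6 & \(> 6\) \\
\hline
Frequency & 13 & 28 & 30 & 14 & 10 & 2 & 3 & 0 \\
\hline
\end{tabular}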
  1. Software gives the sample mean as 1.98 and the sample standard deviation as 1.4212. Explain how these values suggest that a Poisson distribution may be an appropriate model for the numbers of coins found.
    Zara decides to carry out a chi-squared test to investigate whether a Poisson distribution is an appropriate model.
    Fig. 4 is a screenshot showing part of the spreadsheet used to analyse the data. Some values in the spreadsheet have been deliberately omitted. \begin{table}[h]
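    \begin{tabular}{|c|c|c|c|c|}
    \hline
     & A & B & C & D \\
    \hline
    1 & Number of coins found & Observed frequency & Expected frequency & Chi-squared contribution \\
    \hline
    2 & 0 & 13 & 13.8069 & 0.0472 \\
    \hline
    3 & 1 & 28 & & \\
    \hline
    4 & 2 & 30 & 27.0643 & 0.3184 \\
    \hline
    5 & 3 & 14 & 17.8625 & 0.8352 \\
    \hline
    6 & 4 & 10 & 8.8419 & 0.1517 \\
    \hline
    7 & \(\geqslant 5\) & 5 & & 0.0015 \\
    \hline
    \end{tabular}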
    \captionsetup{labelformat=empty} \caption{Fig. 4}
    \end{table}
  2. Showing your calculations, find the missing values in each of the following cells.
    • C3
    • C7
    • D3
  3. Explain why the numbers for 5, 6 and more than 6 coins found have been combined into the single category of at least 5 coins found, as shown in the spreadsheet.
  4. Complete the hypothesis test at the \(5 \%\) level of significance.
    For the rest of this question, you should assume that the number of coins that Zara finds in an area of \(10 \mathrm {~m} ^ { 2 }\) can be modelled by a Poisson distribution with mean 1.98.
    Zara also finds pieces of jewellery independently of the coins she finds. The number of pieces of jewellery that she finds per \(10 \mathrm {~m} ^ { 2 }\) area is modelled by a Poisson distribution with mean 0.42.
  5. Find the probability that Zara finds a total of exactly 3 items (coins and/or jewellery) in an area of \(10 \mathrm {~m} ^ { 2 }\).
  6. Find the probability that Zara finds a total of at least 30 items (coins and/or jewellery) in an area of \(100 \mathrm {~m} ^ { 2 }\).
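The part of this question about combining the categories for 5, 6 and more than 6 coins can be illustrated with a short sketch. It assumes the common rule of thumb that every expected frequency should be at least 5, and it only merges cells from the upper tail; other data sets may also need merging at the lower end. The function and variable names are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Return P(X = k) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


lam = 1.98                                  # sample mean given in the question
observed = [13, 28, 30, 14, 10, 2, 3, 0]    # cells 0..6 and "> 6"
n = sum(observed)

# Expected frequencies for 0..6, with the final cell taking the remaining tail.
probs = [poisson_pmf(k, lam) for k in range(len(observed) - 1)]
probs.append(1.0 - sum(probs))
expected = [n * p for p in probs]

# Merge the last cell into its neighbour until the last expected frequency is >= 5.
merged_observed = list(observed)
merged_expected = list(expected)
while len(merged_expected) > 1 and merged_expected[-1] < 5:
    last_e = merged_expected.pop()
    last_o = merged_observed.pop()
    merged_expected[-1] += last_e
    merged_observed[-1] += last_o

for o, e in zip(merged_observed, merged_expected):
    print(f"{o:3d}  {e:8.4f}  {(o - e) ** 2 / e:.4f}")
```

With these data the loop stops once the top cell covers at least 5 coins, which matches the layout of the spreadsheet in Fig. 4.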
OCR MEI Further Statistics Minor 2022 June Q3
3 Jane wonders whether the number of wasps entering a wasp's nest per 5-second interval can be modelled by a Poisson distribution with mean \(\mu\). She counts the number of wasps entering the nest over 60 randomly selected 5-second intervals. The results are shown in Fig. 3.1. \begin{table}[h]
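\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|c|}
\hline
Number of wasps & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & \(\geqslant 10\) \\
\hline
Frequency & 0 & 2 & 5 & 5 & 12 & 10 & 10 & 11 & 1 & 4 & 0 \\
\hline
\end{tabular}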
\captionsetup{labelformat=empty} \caption{Fig. 3.1}
\end{table}
  1. Show that a suitable estimate for the value of \(\mu\) is 5.1.
    Fig. 3.2 shows part of a screenshot for a \(\chi ^ { 2 }\) test to assess the goodness of fit of a Poisson model. The sample mean has been used as an estimate for the population mean. Some of the values in the spreadsheet have been deliberately omitted. \begin{table}[h]
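    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
     & A & B & C & D & E \\
    \hline
    1 & Number of wasps & Observed frequency & Poisson probability & Expected frequency & Chi-squared contribution \\
    \hline
    2 & \(\leqslant 2\) & 7 & 0.1165 & 6.9887 & 0.0000 \\
    \hline
    3 & 3 & 5 & & 8.0874 & 1.1786 \\
    \hline
    4 & 4 & 12 & & & 0.2765 \\
    \hline
    5 & 5 & 10 & & & 0.0255 \\
    \hline
    6 & 6 & 10 & 0.1490 & 8.9400 & 0.1257 \\
    \hline
    7 & 7 & 11 & 0.1086 & 6.5134 & 3.0904 \\
    \hline
    8 & \(\geqslant 8\) & 5 & 0.1440 & 8.6414 & \\
    \hline
    9 & & & & & \\
    \hline
    \end{tabular}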
    \captionsetup{labelformat=empty} \caption{Fig. 3.2}
    \end{table}
  2. Determine the missing values in each of the following cells, giving your answers correct to 4 decimal places.
    • C3
    • D5
    • E8
  3. Explain why some of the frequencies have been combined into the categories \(\leqslant 2\) and \(\geqslant 8\).
  4. In this question you must show detailed reasoning.
    Carry out the hypothesis test at the 5\% significance level.
  5. Jane also carries out a \(\chi ^ { 2 }\) test for the number of wasps leaving another nest. As part of her calculations, she finds that the probability of no wasps leaving the nest in a 5-second period is 0.0053. She finds that a Poisson distribution is also an appropriate model in this case. Find a suitable estimate for the value of the mean number of wasps leaving the nest per 5-second period.
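For the final part, the Poisson probability of zero events is \(e^{-\lambda}\), so the mean can be recovered by taking a logarithm. A minimal sketch, with illustrative variable names:

```python
import math

p_zero = 0.0053                    # P(no wasps leave in a 5-second period), from the question
mean_leaving = -math.log(p_zero)   # solves exp(-mean) = p_zero

print(f"estimated mean = {mean_leaving:.2f} wasps per 5-second period")
```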
OCR MEI Further Statistics Minor 2023 June Q4
4 Eve lives in a narrow lane in the country. She wonders whether the number of vehicles passing her house per minute can be modelled by a Poisson distribution with mean \(\mu\). She counts the number of vehicles passing her house over 100 randomly selected one-minute intervals. The results are shown in Table 4.1. \begin{table}[h]
\captionsetup{labelformat=empty} \caption{Table 4.1}
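\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline
Number of vehicles & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & \(\geqslant 11\) \\
\hline
Frequency & 36 & 33 & 14 & 10 & 4 & 1 & 0 & 0 & 1 & 0 & 1 & 0 \\
\hline
\end{tabular}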
\end{table}
  1. Use the results to find an estimate for \(\mu\).
    The spreadsheet in Fig. 4.2 shows data for a \(\chi ^ { 2 }\) test to assess the goodness of fit of a Poisson model. The sample mean from part (a) has been used as an estimate for the population mean. Some of the values in the spreadsheet have been deliberately omitted. \begin{table}[h]
    \captionsetup{labelformat=empty} \caption{Fig. 4.2}
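    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
     & A & B & C & D & E \\
    \hline
    1 & Number of vehicles & Observed frequency & Poisson probability & Expected frequency & Chi-squared contribution \\
    \hline
    2 & 0 & 36 & 0.2725 & 27.2532 & 2.8073 \\
    \hline
    3 & 1 & 33 & 0.3543 & 35.4291 & \\
    \hline
    4 & 2 & 14 & & & 3.5400 \\
    \hline
    5 & \(\geqslant 3\) & 17 & & & 0.5145 \\
    \hline
    6 & & & & & \\
    \hline
    \end{tabular}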
    \end{table}
  2. Calculate the missing values in each of the following cells, giving your answers correct to 4 decimal places.
    • C4
    • D5
    • E3
  3. In this question you must show detailed reasoning.
    Carry out the \(\chi ^ { 2 }\) test at the 5\% significance level.
  4. Eve checks her data and notices that the two largest numbers of vehicles per minute (8 and 10) occurred when some horses were being ridden along the lane, causing delays to the vehicles. She therefore repeats the analysis, missing out these two items of data. She finds that the value of the \(\chi ^ { 2 }\) test statistic is now 4.748. The number of degrees of freedom of the test is unchanged. Make two comments about this revised test.
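The comparison between the original and revised tests can be sketched as below. The sketch assumes the four cells of Fig. 4.2 and one estimated parameter, giving 2 degrees of freedom. For exactly 2 degrees of freedom the chi-squared upper-tail critical value has the closed form \(-2 \ln \alpha\), which avoids needing statistical tables; that shortcut does not apply to other degrees of freedom.

```python
import math

# Table 4.1: counts 0..10 (the ">= 11" cell has frequency 0 and is omitted).
counts = list(range(11))
freqs = [36, 33, 14, 10, 4, 1, 0, 0, 1, 0, 1]
n = sum(freqs)
mean = sum(c * f for c, f in zip(counts, freqs)) / n

# Cells used in Fig. 4.2: 0, 1, 2 and ">= 3".
observed = freqs[:3] + [sum(freqs[3:])]
probs = [math.exp(-mean) * mean ** k / math.factorial(k) for k in range(3)]
probs.append(1.0 - sum(probs))
expected = [n * p for p in probs]
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Assumption: 4 cells, minus 1 for the total, minus 1 for the estimated mean.
dof = len(observed) - 1 - 1
alpha = 0.05
critical_value = -2.0 * math.log(alpha)   # valid only because dof == 2

revised_statistic = 4.748                 # value quoted in the question

print(f"mean = {mean:.2f}, X^2 = {statistic:.3f}, critical value = {critical_value:.3f}")
print("original test:", "reject H0" if statistic > critical_value else "do not reject H0")
print("revised test: ", "reject H0" if revised_statistic > critical_value else "do not reject H0")
```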
Edexcel FS1 AS 2018 June Q1
  1. A researcher is investigating the distribution of orchids in a field. He believes that the Poisson distribution with a mean of 1.75 may be a good model for the number of orchids in each square metre. He randomly selects 150 non-overlapping areas, each of one square metre, and counts the number of orchids present in each square.
The results are recorded in the table below.
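\begin{tabular}{|l|c|c|c|c|c|c|c|}
\hline
Number of orchids in each square metre & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
\hline
Number of squares & 30 & 42 & 35 & 26 & 11 & 6 & 0 \\
\hline
\end{tabular}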
He calculates the expected frequencies as follows
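\begin{tabular}{|l|c|c|c|c|c|c|c|}
\hline
Number of orchids in each square metre & 0 & 1 & 2 & 3 & 4 & 5 & More than 5 \\
\hline
Number of squares & 26.07 & 45.62 & 39.91 & 23.28 & 10.19 & 3.57 & \(r\) \\
\hline
\end{tabular}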
  1. Find the value of \(r\) giving your answer to 2 decimal places.
    The researcher will test, at the \(5 \%\) level of significance, whether or not the data can be modelled by a Poisson distribution with mean 1.75.
  2. State clearly the hypotheses required to test whether or not this Poisson distribution is a suitable model for these data.
    The test statistic for this test is 2.0 and the number of degrees of freedom to be used is 4.
  3. Explain fully why there are 4 degrees of freedom.
  4. Stating your critical value clearly, determine whether or not these data support the researcher's belief.
    The researcher works in another field where the number of orchids in each square metre is known to have a Poisson distribution with mean 1.5. He randomly selects 200 non-overlapping areas, each of one square metre, in this second field, and counts the number of orchids present in each square.
  5. Using a Poisson approximation, show that the probability that he finds at least one square with exactly 6 orchids in it is 0.506 to 3 decimal places.
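Two of the calculations in this question can be checked with a short sketch: the missing expected frequency \(r\), found by subtraction from the total of 150, and the "show that" value in the final part. The approach for the final part is an assumption about the intended method: the number of squares with exactly 6 orchids is treated as binomial with 200 trials and approximated by a Poisson distribution with the same mean.

```python
import math

# Expected frequencies given in the question for 0..5 orchids.
expected_given = [26.07, 45.62, 39.91, 23.28, 10.19, 3.57]
total_squares = 150
r = total_squares - sum(expected_given)   # expected frequency for "more than 5"

# Second field: X ~ Po(1.5) orchids per square, 200 squares.
p_six = math.exp(-1.5) * 1.5 ** 6 / math.factorial(6)   # P(exactly 6 in one square)
approx_mean = 200 * p_six                                # mean of the approximating Poisson
p_at_least_one = 1.0 - math.exp(-approx_mean)            # P(at least one such square)

print(f"r = {r:.2f}")
print(f"P(at least one square with exactly 6 orchids) = {p_at_least_one:.3f}")
```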
Edexcel FS1 2022 June Q3
  1. During the summer, mountain rescue team \(A\) receives calls for help randomly with a rate of 0.4 per day.
    1. Find the probability that during the summer, mountain rescue team \(A\) receives at least 19 calls for help in 28 randomly selected days.
    The leader of mountain rescue team \(A\) randomly selects 250 summer days from the last few years.
    She records the number of calls for help received on each of these days.
  2. Using a Poisson approximation, estimate the probability of the leader finding at least 20 of these days when more than 1 call for help was received by mountain rescue team \(A\).
    Mountain rescue team \(A\) believes that the number of calls for help per day is lower in the winter than in the summer. The number of calls for help received in 42 randomly selected winter days is 8.
  3. Use a suitable test, at the \(5 \%\) level of significance, to assess whether or not there is evidence that the number of calls for help per day is lower in the winter than in the summer. State your hypotheses clearly.
    During the summer, mountain rescue team \(B\) receives calls for help randomly with a rate of 0.2 per day, independently of calls to mountain rescue team \(A\). The random variable \(C\) is the total number of calls for help received by mountain rescue teams \(A\) and \(B\) during a period of \(n\) days in the summer.
    On a Monday in the summer, mountain rescue teams \(A\) and \(B\) each receive a call for help. Given that over the next \(n\) days \(\mathrm { P } ( C = 0 ) < 0.001\),
  4. calculate the minimum value of \(n\).
  5. Write down an assumption that needs to be made for the model to be appropriate.
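The minimum value of \(n\) can be found by solving \(e^{-0.6 n} < 0.001\). The sketch below assumes that the calls to the two teams are independent Poisson processes, so that \(C \sim \operatorname{Po}(0.6 n)\), and that the calls received on the Monday do not affect later days; the variable names are illustrative.

```python
import math

rate_a = 0.4                 # calls per day for team A (from the question)
rate_b = 0.2                 # calls per day for team B (from the question)
combined_rate = rate_a + rate_b

threshold = 0.001
# P(C = 0) = exp(-combined_rate * n) < threshold  <=>  n > ln(1 / threshold) / combined_rate
n_min = math.ceil(math.log(1.0 / threshold) / combined_rate)

print(f"minimum n = {n_min}")
print(f"P(C = 0) for n = {n_min}: {math.exp(-combined_rate * n_min):.6f}")
```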
OCR FS1 AS 2021 June Q4
30 marks
4 The table shows the results of a random sample drawn from a population which is thought to have the distribution \(\mathrm { U } ( 20 )\).