Chi-squared goodness of fit: Binomial

A question is of this type if and only if it tests whether observed frequency data fit a binomial distribution, possibly with a parameter estimated from the data.

44 questions · Standard +0.4

CAIE Further Paper 4 2020 November Q3
7 marks Standard +0.3
3 Apples are sold in bags of 5. Based on her previous experience, Freya claims that the probability of any apple weighing more than 100 grams is 0.35, independently of other apples in the bag. The apples in a random sample of 150 bags are checked and the number, \(x\), in each bag weighing more than 100 grams is recorded. The results are shown in the following table.
\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\(x\) & 0 & 1 & 2 & 3 & 4 & 5 \\
\hline
Frequency & 12 & 39 & 46 & 37 & 12 & 4 \\
\hline
\end{tabular}
Carry out a goodness of fit test at the \(5 \%\) significance level and hence comment on Freya's claim.
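The calculation behind a test of this kind can be sketched in Python. This is a minimal illustration and not a mark scheme: the helper names `binomial_expected`, `pool_small_cells` and `chi_squared` are invented here, pooling cells whose expected frequency is below 5 is the usual textbook convention rather than something stated in the question, and 9.488 is the tabulated 5% critical value of \(\chi^2\) with 4 degrees of freedom.

```python
from math import comb

def binomial_expected(n, p, total):
    """Expected frequencies total * P(X = k) for X ~ B(n, p), k = 0..n."""
    return [total * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def pool_small_cells(observed, expected, minimum=5.0):
    """Merge an end cell into its neighbour while its expected value is below minimum."""
    obs, exp = list(observed), list(expected)
    while len(exp) > 1 and exp[-1] < minimum:   # upper tail
        e, o = exp.pop(), obs.pop()
        exp[-1] += e
        obs[-1] += o
    while len(exp) > 1 and exp[0] < minimum:    # lower tail
        e, o = exp.pop(0), obs.pop(0)
        exp[0] += e
        obs[0] += o
    return obs, exp

def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over the cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [12, 39, 46, 37, 12, 4]                 # frequencies for x = 0..5
expected = binomial_expected(5, 0.35, sum(observed))
obs, exp = pool_small_cells(observed, expected)
statistic = chi_squared(obs, exp)
degrees_of_freedom = len(exp) - 1                  # p = 0.35 is given, not estimated
CRITICAL_5_PERCENT_4_DF = 9.488                    # from chi-squared tables
reject_h0 = statistic > CRITICAL_5_PERCENT_4_DF
```

Under these assumptions the cells for \(x = 4\) and \(x = 5\) are pooled, leaving 5 cells and 4 degrees of freedom, and the statistic comes out at roughly 14.6, which exceeds the critical value.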
CAIE Further Paper 4 2021 November Q3
8 marks Standard +0.3
3 A supermarket sells pears in packs of 8. Some of the pears in a pack may not be ripe, and the supermarket manager claims that the number of unripe pears in a pack can be modelled by the distribution \(\mathrm { B } ( 8,0.15 )\). A random sample of 150 packs was selected and the number of unripe pears in each pack was recorded. The following table shows the observed frequencies together with some of the expected frequencies using the manager's binomial distribution.
\begin{tabular}{|c|c|c|c|c|c|c|c|}
\hline
Number of unripe pears per pack & 0 & 1 & 2 & 3 & 4 & 5 & \(\geqslant 6\) \\
\hline
Observed frequency & 35 & 48 & 43 & 15 & 6 & 3 & 0 \\
\hline
Expected frequency & 40.874 & \(p\) & 35.641 & 12.579 & 2.775 & 0.392 & \(q\) \\
\hline
\end{tabular}
  1. Find the values of \(p\) and \(q\).
  2. Carry out a goodness of fit test, at the \(5 \%\) significance level, to test whether the manager's claim is justified.
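Missing expected frequencies such as \(p\) and \(q\) come straight from the binomial probabilities. The sketch below is illustrative only; the names `binomial_pmf`, `expected_one`, `q_direct` and `q_by_subtraction` are made up for this example. It shows two routes to the open-ended cell: summing the tail probabilities directly, or subtracting the other expected frequencies from the sample size.

```python
from math import comb

def binomial_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

TOTAL = 150          # number of packs in the sample
N, P = 8, 0.15       # parameters of the manager's model B(8, 0.15)

# Expected frequency for exactly one unripe pear (the cell labelled p).
expected_one = TOTAL * binomial_pmf(N, P, 1)

# Expected frequency for the open-ended cell ">= 6" (the cell labelled q),
# found by summing P(X = 6), P(X = 7) and P(X = 8).
q_direct = TOTAL * sum(binomial_pmf(N, P, k) for k in range(6, N + 1))

# The same cell found by subtraction from the total, using the rounded
# values printed in the table, so it can differ in the last decimal place.
listed = [40.874, 35.641, 12.579, 2.775, 0.392]
q_by_subtraction = TOTAL - expected_one - sum(listed)
```

The subtraction route guarantees that the expected frequencies add up to the sample size, but it inherits rounding error from the tabulated values, which is why the two answers for \(q\) need not agree exactly.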
CAIE Further Paper 4 2022 November Q2
8 marks Standard +0.8
2 An organisation runs courses to train students to become engineers. These students are taught in groups of 8. The director of the organisation claims that on average \(60 \%\) of the students in a group achieve a pass. A random sample of 150 groups of 8 students is chosen. The following table shows the observed frequencies together with some of the expected frequencies using the appropriate binomial distribution.
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
\hline
Number of passes per group & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
\hline
Observed frequency & 0 & 0 & 8 & 24 & 45 & 36 & 26 & 10 & 1 \\
\hline
Expected frequency & \(p\) & 1.180 & 6.193 & 18.579 & 34.836 & \(q\) & \(r\) & 13.437 & 2.519 \\
\hline
\end{tabular}
  1. Find the values of \(p\), \(q\) and \(r\), giving your answers correct to 3 decimal places.
  2. Carry out a goodness of fit test, at the \(10 \%\) significance level, to test whether there is evidence to reject the director's claim.
CAIE Further Paper 4 2024 November Q3
10 marks Standard +0.3
3 Rosie sows 5 seeds in each of 150 plant pots. The number of seeds that germinate is recorded for each pot. The results are summarised in the following table.
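\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
Number of seeds that germinate & 0 & 1 & 2 & 3 & 4 & 5 \\
\hline
Number of pots & 12 & 40 & 43 & 35 & 16 & 4 \\
\hline
\end{tabular}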
Rosie suggests that the number of seeds that germinate follows the binomial distribution \(\mathrm { B } ( 5 , p )\).
  1. Use Rosie's results to show that \(p = 0.42\).
  2. Carry out a goodness of fit test, at the \(10 \%\) significance level, to test whether the distribution \(\mathrm { B } ( 5,0.42 )\) is a good fit for the data.
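When the parameter is estimated from the data, as in this question, one further degree of freedom is lost. The following sketch is illustrative only: the variable names are invented, and pooling the last two cells is hard-coded on the assumption that only the cell for 5 seeds has an expected frequency below 5. It estimates \(p\) from the sample mean and then computes the statistic.

```python
from math import comb

pots = [12, 40, 43, 35, 16, 4]       # frequencies for 0..5 germinating seeds
n = 5                                # seeds per pot
total = sum(pots)

# The mean of B(n, p) is n * p, so p is estimated by (sample mean) / n.
sample_mean = sum(k * f for k, f in enumerate(pots)) / total
p_hat = sample_mean / n

expected = [total * comb(n, k) * p_hat**k * (1 - p_hat)**(n - k)
            for k in range(n + 1)]

# Assumption: only the final cell is below 5, so pool the cells for 4 and 5.
obs = pots[:4] + [pots[4] + pots[5]]
exp = expected[:4] + [expected[4] + expected[5]]

statistic = sum((o - e) ** 2 / e for o, e in zip(obs, exp))

# cells - 1 for the fixed total, and - 1 more because p was estimated.
degrees_of_freedom = len(exp) - 1 - 1
CRITICAL_10_PERCENT_3_DF = 6.251     # from chi-squared tables
reject_h0 = statistic > CRITICAL_10_PERCENT_3_DF
```

With these data the estimate is 0.42, matching the value the question asks to be shown, and the statistic (about 3.9) is below the tabulated critical value for 3 degrees of freedom.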
OCR S3 2015 June Q6
13 marks Standard +0.3
6 In each of 38 randomly selected weeks of the English Premier Football League there were 10 matches. Table 1 summarises the number of home wins in 10 matches, \(X\), and the corresponding number of weeks. \begin{table}[h]
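\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline
Number of home wins & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\
\hline
Number of weeks & 0 & 1 & 2 & 8 & 8 & 9 & 7 & 1 & 2 & 0 & 0 \\
\hline
\end{tabular}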
\captionsetup{labelformat=empty} \caption{Table 1}
\end{table} A researcher investigates whether \(X\) can be modelled by the distribution \(\mathrm { B } ( 10 , p )\). He calculates the expected frequencies using a value of \(p\) obtained from the sample mean.
  1. Show that \(p = 0.45\).
Table 2 shows the observed and expected number of weeks. \begin{table}[h]
\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|c|c|}
\hline
Number of home wins & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & Totals \\
\hline
Observed number of weeks & 0 & 1 & 2 & 8 & 8 & 9 & 7 & 1 & 2 & 0 & 0 & 38 \\
\hline
Expected number of weeks & 0.096 & 0.788 & 2.899 & 6.326 & 9.058 & 8.893 & 6.064 & 2.835 & 0.870 & 0.158 & 0.013 & 38 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty} \caption{Table 2}
\end{table}
  2. Show how the value of 2.835 for 7 home wins is obtained.
The researcher carries out a test, at the \(5 \%\) significance level, of whether the distribution \(\mathrm { B } ( 10 , p )\) fits the data.
  • Explain why it is necessary to combine classes.
  • Carry out the test.
    OCR S3 2009 January Q8
    14 marks Standard +0.3
    8 A soft drinks factory produces lemonade which is sold in packs of 6 bottles. As part of the factory's quality control, random samples of 75 packs are examined at regular intervals. The number of underfilled bottles in a pack of 6 bottles is denoted by the random variable \(X\). The results of one quality control check are shown in the following table.
    \begin{tabular}{|c|c|c|c|c|}
    \hline
    Number of underfilled bottles & 0 & 1 & 2 & 3 \\
    \hline
    Number of packs & 44 & 20 & 8 & 3 \\
    \hline
    \end{tabular}
    A researcher assumes that \(X \sim \mathrm {~B} ( 3 , p )\).
    1. By finding the sample mean, show that an estimate of \(p\) is 0.2.
    2. Show that, at the \(5 \%\) significance level, there is evidence that this binomial distribution does not fit the data.
    3. Another researcher suggests that the goodness of fit test should be for \(\mathrm { B } ( 6 , p )\). She finds that the corresponding value of \(\chi ^ { 2 }\) is 2.74, correct to 3 significant figures. Given that the number of degrees of freedom is the same as in part (ii), state the conclusion of the test at the same significance level.
    OCR MEI S3 2010 January Q1
    17 marks
    1 Coastal wildlife wardens are monitoring populations of herring gulls. Herring gulls usually lay 3 eggs per nest and the wardens wish to model the number of eggs per nest that hatch. They assume that the situation can be modelled by the binomial distribution \(\mathrm { B } ( 3 , p )\) where \(p\) is the probability that an egg hatches. A random sample of 80 nests each containing 3 eggs has been observed with the following results.
    \begin{tabular}{|c|c|c|c|c|}
    \hline
    Number of eggs hatched & 0 & 1 & 2 & 3 \\
    \hline
    Number of nests & 7 & 23 & 29 & 21 \\
    \hline
    \end{tabular}
    1. Initially it is assumed that the value of \(p\) is \(\frac { 1 } { 2 }\). Test at the \(5 \%\) level of significance whether it is reasonable to suppose that the model applies with \(p = \frac { 1 } { 2 }\).
    2. The model is refined by estimating \(p\) from the data. Find the mean of the observed data and hence an estimate of \(p\).
    3. Using the estimated value of \(p\), the value of the test statistic \(X ^ { 2 }\) turns out to be 2.3857 . Is it reasonable to suppose, at the \(5 \%\) level of significance, that this refined model applies?
    4. Discuss the reasons for the different outcomes of the tests in parts (i) and (iii).
    CAIE FP2 2015 June Q11 OR
    Challenging +1.2
    Each of 200 identically biased dice is thrown repeatedly until an even number is obtained. The number of throws, \(x\), needed is recorded and the results are summarised in the following table.
    \begin{tabular}{|c|c|c|c|c|c|c|c|}
    \hline
    \(x\) & 1 & 2 & 3 & 4 & 5 & 6 & \(\geqslant 7\) \\
    \hline
    Frequency & 126 & 43 & 22 & 3 & 5 & 1 & 0 \\
    \hline
    \end{tabular}
    State a type of distribution that could be used to fit the data given in the table above. Fit a distribution of this type in which the probability of throwing an even number for each die is 0.6 and carry out a goodness of fit test at the 5\% significance level.
    For each of these dice, it is known that the probability of obtaining a 6 when it is thrown is 0.25. Ten of these dice are each thrown 5 times. Find the probability that at least one 6 is obtained on exactly 4 of the 10 dice.
    CAIE FP2 2016 June Q9
    10 marks Standard +0.3
    9 Applicants for a national teacher training course are required to pass a mathematics test. Each year, the applicants are tested in groups of 6 and the number of successful applicants in each group is recorded. The overall proportion of successful applicants has remained constant over the years and is equal to \(60 \%\) of the applicants. The results from 150 randomly chosen groups are shown in the following table.
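    \begin{tabular}{|c|c|c|c|c|c|c|c|}
    \hline
    Number of successful applicants & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
    \hline
    Number of groups & 1 & 3 & 25 & 51 & 38 & 30 & 2 \\
    \hline
    \end{tabular}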
    Test, at the \(5 \%\) significance level, the goodness of fit of the distribution \(\mathbf { B } ( 6,0.6 )\) for the number of successful applicants in a group.
    CAIE FP2 2017 June Q11 OR
    Standard +0.3
    A shop is supplied with large quantities of plant pots in packs of six. These pots can be damaged easily if they are not packed carefully. The manager of the shop is a statistician and he believes that the number of damaged pots in a pack of six has a binomial distribution. He chooses a random sample of 250 packs and records the numbers of damaged pots per pack. His results are shown in the following table.
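    \begin{tabular}{|c|c|c|c|c|c|c|c|}
    \hline
    Number of damaged pots per pack \(( x )\) & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
    \hline
    Frequency & 48 & 69 & 78 & 32 & 22 & 1 & 0 \\
    \hline
    \end{tabular}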
    1. Show that the mean number of damaged pots per pack in this sample is 1.656 .
      The following table shows some of the expected frequencies, correct to 2 decimal places, using an appropriate binomial distribution.
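      \begin{tabular}{|c|c|c|c|c|c|c|c|}
      \hline
      Number of damaged pots per pack \(( x )\) & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
      \hline
      Expected frequency & 36.01 & 82.36 & \(a\) & 39.89 & \(b\) & 1.74 & 0.11 \\
      \hline
      \end{tabular}
    2. Find the values of \(a\) and \(b\), correct to 2 decimal places.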
    3. Use a goodness-of-fit test at the \(1 \%\) significance level to determine whether the manager's belief is justified.
    CAIE FP2 2018 June Q11 OR
    Standard +0.8
    A scientist carries out an experiment to investigate the quantity \(X\), which takes the values \(0, 1, 2, 3, 4, 5\) or \(6\). He believes that the values taken by \(X\) follow a binomial distribution. He conducts 250 trials. His results are summarised in the following table.
    \begin{tabular}{|c|c|c|c|c|c|c|c|}
    \hline
    \(x\) & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
    \hline
    Observed frequency & 22 & 83 & 72 & 53 & 17 & 3 & 0 \\
    \hline
    \end{tabular}
    1. Show that unbiased estimates of the mean and variance for these results are 1.876 and 1.266 respectively, correct to 3 decimal places. By evaluating the mean and variance of the distribution B(6, 0.313), explain why \(X\) could have this distribution.
      The expected frequencies corresponding to the distribution \(\mathrm { B } ( 6,0.313 )\) are shown in the following table.
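      \begin{tabular}{|c|c|c|c|c|c|c|c|}
      \hline
      \(x\) & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
      \hline
      Observed frequency & 22 & 83 & 72 & 53 & 17 & 3 & 0 \\
      \hline
      Expected frequency & 26.3 & 71.9 & 81.8 & 49.7 & 17.0 & 3.1 & 0.2 \\
      \hline
      \end{tabular}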
    2. Show how the expected frequency for \(x = 4\) is calculated.
    3. Test at the \(5 \%\) significance level whether the scientist's belief is correct.
    CAIE FP2 2009 November Q9
    10 marks Standard +0.3
    9 It has been found that \(60 \%\) of the computer chips produced in a factory are faulty. As part of quality control, 100 samples of 4 chips are selected at random, and each chip is tested. The number of faulty chips in each sample is recorded, with the results given in the following table.
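    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
    Number of faulty chips & 0 & 1 & 2 & 3 & 4 \\
    \hline
    Number of samples & 2 & 12 & 27 & 49 & 10 \\
    \hline
    \end{tabular}
    The expected values for a binomial distribution with parameters \(n = 4\) and \(p = 0.6\) are given in the following table.
    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
    Number of faulty chips & 0 & 1 & 2 & 3 & 4 \\
    \hline
    Expected value & 2.56 & 15.36 & 34.56 & 34.56 & 12.96 \\
    \hline
    \end{tabular}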
    Show how the expected value 34.56 corresponding to 2 faulty chips is obtained. Carry out a goodness of fit test at the 5\% significance level, and state what can be deduced from the outcome of the test.
    CAIE FP2 2012 November Q8
    9 marks Challenging +1.2
    8 Drinking glasses are sold in packs of 4. The manufacturer conducts a survey to assess the quality of the glasses. The results from a sample of 50 randomly chosen packs are summarised in the following table.
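    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
    Number of perfect glasses & 0 & 1 & 2 & 3 & 4 \\
    \hline
    Number of packs & 1 & 3 & 10 & 17 & 19 \\
    \hline
    \end{tabular}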
    Fit a binomial distribution to the data and carry out a goodness of fit test at the \(10 \%\) significance level.
    CAIE FP2 2013 November Q8
    10 marks Standard +0.8
    8 A factory produces china mugs. Random samples of size 6 are selected at regular intervals, and the mugs are inspected for defects. During one week, 100 samples are selected and the numbers of defective mugs found are summarised in the following table.
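    \begin{tabular}{|c|c|c|c|c|c|c|c|}
    \hline
    Number of defective mugs & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
    \hline
    Number of samples & 11 & 43 & 35 & 8 & 2 & 1 & 0 \\
    \hline
    \end{tabular}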
    Fit a binomial distribution to the data and carry out a goodness of fit test at the 5\% significance level.
    Edexcel S3 2023 January Q4
    14 marks Standard +0.3
    4 A research student is investigating the number of children who are girls in families with 4 children. The table below shows her results for 200 such families.
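    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
    Number of girls & 0 & 1 & 2 & 3 & 4 \\
    \hline
    Frequency & 15 & 68 & 69 & 38 & 10 \\
    \hline
    \end{tabular}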
    The research student suggests that a binomial distribution with \(p = \frac { 1 } { 2 }\) could be a suitable model for the number of children who are girls in a family of 4 children.
    1. Using her results and a \(5 \%\) significance level, test the research student's claim. You should state your hypotheses, expected frequencies, test statistic and the critical value used.
    The research student decides to refine the model and retains the idea of using a binomial distribution but does not specify the probability that the child is a girl.
    2. Use the data in the table to show that the probability that a child is a girl is 0.45.
    The research student uses the probability from part (b) to calculate a new set of expected frequencies, none of which are less than 5.
    The statistic \(\sum \frac { ( O - E ) ^ { 2 } } { E }\) is evaluated and found to be 2.47.
    3. Test, at the \(5 \%\) significance level, whether using a binomial distribution is suitable to model the number of children who are girls in a family of 4 children. You should state your hypotheses and the critical value used.
    Edexcel S3 2014 June Q6
    14 marks Standard +0.3
    6. Eight tasks were given to each of 125 randomly selected job applicants. The number of tasks failed by each applicant is recorded. The results are as follows.
    \begin{tabular}{|c|c|c|c|c|c|c|c|}
    \hline
    Number of tasks failed by an applicant & 0 & 1 & 2 & 3 & 4 & 5 & 6 or more \\
    \hline
    Frequency & 2 & 21 & 45 & 42 & 12 & 3 & 0 \\
    \hline
    \end{tabular}
    1. Show that the probability of a randomly selected task, from this sample, being failed is 0.3.
    An employer believes that a binomial distribution might provide a good model for the number of tasks, out of 8, that an applicant fails. He uses a binomial distribution, with the estimated probability 0.3 of a task being failed. The calculated expected frequencies are as follows.
      \begin{tabular}{|c|c|c|c|c|c|c|c|}
      \hline
      Number of tasks failed by an applicant & 0 & 1 & 2 & 3 & 4 & 5 & 6 or more \\
      \hline
      Expected frequency & 7.21 & 24.71 & 37.06 & \(r\) & 17.02 & 5.83 & \(s\) \\
      \hline
      \end{tabular}
    2. Find the value of \(r\) and the value of \(s\), giving your answers to 2 decimal places.
    3. Test, at the \(5 \%\) level of significance, whether or not a binomial distribution is a suitable model for these data. State your hypotheses and show your working clearly.
    The employer believes that all applicants have the same probability of failing each task.
    4. Use your result from part (c) to comment on this belief.
    Edexcel S3 2018 June Q2
    12 marks Standard +0.3
    2. A random sample of 75 packets of seeds is selected from a production line. Each packet contains 12 seeds. The seeds are planted and the number of seeds that germinate from each packet is recorded. The results are as follows.
    \begin{tabular}{|c|c|c|c|c|c|c|c|}
    \hline
    Number of seeds that germinate from each packet & 6 or fewer & 7 & 8 & 9 & 10 & 11 & 12 \\
    \hline
    Number of packets & 0 & 3 & 5 & 18 & 28 & 17 & 4 \\
    \hline
    \end{tabular}
    1. Show that the probability of a randomly selected seed from this sample germinating is 0.82.
    A gardener suggests that a binomial distribution can be used to model the number of seeds that germinate from a packet of 12 seeds. She uses a binomial distribution with the estimated probability 0.82 of a seed germinating. Some of the calculated expected frequencies are shown in the table below.
      \begin{tabular}{|c|c|c|c|c|c|c|c|}
      \hline
      Number of seeds that germinate from each packet & 6 or fewer & 7 & 8 & 9 & 10 & 11 & 12 \\
      \hline
      Expected frequency & \(s\) & 2.80 & 7.97 & \(r\) & 22.04 & 18.26 & 6.93 \\
      \hline
      \end{tabular}
    2. Calculate the value of \(r\) and the value of \(s\), giving your answers correct to 2 decimal places.
    3. Test, at the \(10 \%\) level of significance, whether or not these data suggest that the binomial distribution is a suitable model for the number of seeds that germinate from a packet of 12 seeds. State your hypotheses clearly and show your working.
    Edexcel S3 2021 June Q5
    16 marks Standard +0.3
    5. A researcher is looking into the effectiveness of a new medicine for the relief of symptoms. He collects random samples of 8 people who are taking the medicine from each of 50 different medical practices. The number of people who say that the medicine is a success, in each sample, is recorded. The results are summarised in the table below.
    \begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
    \hline
    Number of successes & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
    \hline
    Number of practices & 4 & 6 & 3 & 12 & 10 & 7 & 4 & 2 & 2 \\
    \hline
    \end{tabular}
    The researcher decides to model this data using a binomial distribution.
    1. State two necessary assumptions that the researcher made in order to use this model.
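    2. Show that the mean number of successes per sample is 3.54.
    He decides to use this mean to calculate expected frequencies. The results are shown in the table below.
      \begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
      \hline
      Number of successes & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\
      \hline
      Expected frequency & 0.47 & 2.96 & 8.23 & 13.07 & \(f\) & 8.23 & 3.27 & 0.74 & \(g\) \\
      \hline
      \end{tabular}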
    3. Calculate the value of \(f\) and the value of \(g\). Give your answers to 2 decimal places.
    4. Stating your hypotheses clearly, test at the \(10 \%\) level of significance, whether or not the binomial distribution is a suitable model for the number of successes in samples of 8 people.
    Edexcel S3 2002 June Q6
    12 marks Standard +0.3
    6. Data were collected on the number of female puppies born in 200 litters of size 8. It was decided to test whether or not a binomial model with parameters \(n = 8\) and \(p = 0.5\) is a suitable model for these data. The following table shows the observed frequencies and the expected frequencies, to 2 decimal places, obtained in order to carry out this test.
    \begin{tabular}{|c|c|c|}
    \hline
    Number of females & Observed number of litters & Expected number of litters \\
    \hline
    0 & 1 & 0.78 \\
    \hline
    1 & 9 & 6.25 \\
    \hline
    2 & 27 & 21.88 \\
    \hline
    3 & 46 & \(R\) \\
    \hline
    4 & 49 & \(S\) \\
    \hline
    5 & 35 & \(T\) \\
    \hline
    6 & 26 & 21.88 \\
    \hline
    7 & 5 & 6.25 \\
    \hline
    8 & 2 & 0.78 \\
    \hline
    \end{tabular}
    1. Find the values of \(R\), \(S\) and \(T\).
    2. Carry out the test to determine whether or not this binomial model is a suitable one. State your hypotheses clearly and use a \(5 \%\) level of significance.
    An alternative test might have involved estimating \(p\) rather than assuming \(p = 0.5\).
    3. Explain how this would have affected the test.
    Edexcel S3 2004 June Q6
    15 marks Standard +0.3
    6 Three six-sided dice, which were assumed to be fair, were rolled 250 times. On each occasion the number \(X\) of sixes was recorded. The results were as follows.
    \begin{tabular}{|c|c|c|c|c|}
    \hline
    Number of sixes & 0 & 1 & 2 & 3 \\
    \hline
    Frequency & 125 & 109 & 13 & 3 \\
    \hline
    \end{tabular}
    1. Write down a suitable model for \(X\).
    2. Test, at the \(1 \%\) level of significance, the suitability of your model for these data.
    3. Explain how the test would have been modified if it had not been assumed that the dice were fair.
    Edexcel S3 2006 June Q8
    13 marks Standard +0.3
    8. Five coins were tossed 100 times and the number of heads recorded. The results are shown in the table below.
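    \begin{tabular}{|c|c|c|c|c|c|c|}
    \hline
    Number of heads & 0 & 1 & 2 & 3 & 4 & 5 \\
    \hline
    Frequency & 6 & 18 & 29 & 34 & 10 & 3 \\
    \hline
    \end{tabular}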
    1. Suggest a suitable distribution to model the number of heads when five unbiased coins are tossed.
    2. Test, at the \(10 \%\) level of significance, whether or not the five coins are unbiased. State your hypotheses clearly.
    Edexcel S3 2007 June Q4
    13 marks Standard +0.3
    4. A quality control manager regularly samples 20 items from a production line and records the number of defective items \(x\). The results of 100 such samples are given in Table 1 below. \begin{table}[h]
    \begin{tabular}{|c|c|c|c|c|c|c|c|c|}
    \hline
    \(x\) & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 or more \\
    \hline
    Frequency & 17 & 31 & 19 & 14 & 9 & 7 & 3 & 0 \\
    \hline
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 1}
    \end{table}
    1. Estimate the proportion of defective items from the production line.
    The manager claimed that the number of defective items in a sample of 20 can be modelled by a binomial distribution. He used the answer in part (a) to calculate the expected frequencies given in Table 2. \begin{table}[h]
      \begin{tabular}{|c|c|c|c|c|c|c|c|c|}
      \hline
      \(x\) & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 or more \\
      \hline
      Expected frequency & 12.2 & 27.0 & \(r\) & 19.0 & \(s\) & 3.2 & 0.9 & 0.2 \\
      \hline
      \end{tabular}
      \captionsetup{labelformat=empty} \caption{Table 2}
      \end{table}
    2. Find the value of \(r\) and the value of \(s\) giving your answers to 1 decimal place.
    3. Stating your hypotheses clearly, use a \(5 \%\) level of significance to test the manager's claim.
    4. Explain what the analysis in part (c) tells the manager about the occurrence of defective items from this production line.
    Edexcel S3 2008 June Q6
    13 marks Standard +0.3
    6. Ten cuttings were taken from each of 100 randomly selected garden plants. The numbers of cuttings that did not grow were recorded.
    The results are as follows.
    \begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
    \hline
    No. of cuttings which did not grow & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8, 9 or 10 \\
    \hline
    Frequency & 11 & 21 & 30 & 20 & 12 & 3 & 2 & 1 & 0 \\
    \hline
    \end{tabular}
    1. Show that the probability of a randomly selected cutting, from this sample, not growing is 0.223.
    A gardener believes that a binomial distribution might provide a good model for the number of cuttings, out of 10, that do not grow. He uses a binomial distribution, with the probability 0.2 of a cutting not growing. The calculated expected frequencies are as follows.
      \begin{tabular}{|c|c|c|c|c|c|c|}
      \hline
      No. of cuttings which did not grow & 0 & 1 & 2 & 3 & 4 & 5 or more \\
      \hline
      Expected frequency & \(r\) & 26.84 & \(s\) & 20.13 & 8.81 & \(t\) \\
      \hline
      \end{tabular}
    2. Find the values of \(r\), \(s\) and \(t\).
    3. State clearly the hypotheses required to test whether or not this binomial distribution is a suitable model for these data.
    The test statistic for the test is 4.17 and the number of degrees of freedom used is 4.
    4. Explain fully why there are 4 degrees of freedom.
    5. Stating clearly the critical value used, carry out the test using a \(5 \%\) level of significance.
    Edexcel S3 2012 June Q6
    14 marks Standard +0.3
    6. A total of 100 random samples of 6 items are selected from a production line in a factory and the number of defective items in each sample is recorded. The results are summarised in the table below.
    \begin{tabular}{|c|c|c|c|c|c|c|c|}
    \hline
    Number of defective items & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
    \hline
    Number of samples & 6 & 16 & 20 & 23 & 17 & 10 & 8 \\
    \hline
    \end{tabular}
    1. Show that the mean number of defective items per sample is 2.91.
    A factory manager suggests that the data can be modelled by a binomial distribution with \(n = 6\). He uses the mean from the sample above and calculates expected frequencies as shown in the table below.
      \begin{tabular}{|c|c|c|c|c|c|c|c|}
      \hline
      Number of defective items & 0 & 1 & 2 & 3 & 4 & 5 & 6 \\
      \hline
      Expected frequency & 1.87 & 10.54 & 24.82 & \(a\) & 22.01 & 8.29 & \(b\) \\
      \hline
      \end{tabular}
    2. Calculate the value of \(a\) and the value of \(b\), giving your answers to 2 decimal places.
    3. Test, at the \(5 \%\) level, whether or not the binomial distribution is a suitable model for the number of defective items in samples of 6 items. State your hypotheses clearly.
    Edexcel S3 2014 June Q6
    17 marks Standard +0.3
    6. Bags of \(\pounds 1\) coins are paid into a bank. Each bag contains 20 coins. The bank manager believes that \(5 \%\) of the \(\pounds 1\) coins paid into the bank are fakes. He decides to use the distribution \(X \sim \mathrm {~B} ( 20,0.05 )\) to model the random variable \(X\), the number of fake \(\pounds 1\) coins in each bag.
    1. State the assumptions necessary for the binomial distribution to be an appropriate model in this case.
    The bank manager checks a random sample of 150 bags of \(\pounds 1\) coins and records the number of fake coins found in each bag. His results are summarised in Table 1. \begin{table}[h]
      \begin{tabular}{|c|c|c|c|c|c|}
      \hline
      Number of fake coins in each bag & 0 & 1 & 2 & 3 & 4 or more \\
      \hline
      Observed frequency & 43 & 62 & 26 & 13 & 6 \\
      \hline
      Expected frequency & 53.8 & 56.6 & \(r\) & 8.9 & \(s\) \\
      \hline
      \end{tabular}
      \captionsetup{labelformat=empty} \caption{Table 1}
      \end{table}
    2. Calculate the values of \(r\) and \(s\), giving your answers to 1 decimal place.
    3. Carry out a hypothesis test, at the \(5 \%\) significance level, to see if the data supports the bank manager's statistical model. State your hypotheses clearly.
    The assistant manager thinks that a binomial distribution is a good model but suggests that the proportion of fake coins is higher than \(5 \%\). She calculates the actual proportion of fake coins in the sample and uses this value to carry out a new hypothesis test on the data. Her expected frequencies are shown in Table 2. \begin{table}[h]
      \begin{tabular}{|c|c|c|c|c|c|}
      \hline
      Number of fake coins in each bag & 0 & 1 & 2 & 3 & 4 or more \\
      \hline
      Observed frequency & 43 & 62 & 26 & 13 & 6 \\
      \hline
      Expected frequency & 44.5 & 55.7 & 33.2 & 12.5 & 4.1 \\
      \hline
      \end{tabular}
      \captionsetup{labelformat=empty} \caption{Table 2}
      \end{table}
    4. Explain why there are 2 degrees of freedom in this case.
    5. Given that she obtains a \(\chi ^ { 2 }\) test statistic of 2.67, test the assistant manager's hypothesis that the binomial distribution is a good model for the number of fake coins in each bag. Use a \(5 \%\) level of significance and state your hypotheses clearly.
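Several of the questions above ask for the number of degrees of freedom to be justified. A rough sketch of the usual counting rule follows. It assumes the textbook convention of pooling end cells whose expected frequency is below 5, and the function name `degrees_of_freedom` and its parameters are invented for this illustration. It is applied here to the Table 2 expected frequencies from the last question.

```python
def degrees_of_freedom(expected, estimated_parameters, minimum=5.0):
    """Cells remaining after pooling small end cells, minus 1 for the fixed
    total, minus the number of parameters estimated from the data."""
    cells = list(expected)
    while len(cells) > 1 and cells[-1] < minimum:   # pool from the upper tail
        small = cells.pop()
        cells[-1] += small
    while len(cells) > 1 and cells[0] < minimum:    # pool from the lower tail
        small = cells.pop(0)
        cells[0] += small
    return len(cells) - 1 - estimated_parameters

# Table 2: the proportion of fake coins was estimated from the sample,
# so one parameter has been estimated.
table_2_expected = [44.5, 55.7, 33.2, 12.5, 4.1]
df_table_2 = degrees_of_freedom(table_2_expected, estimated_parameters=1)
```

Here the final cell (4.1) is pooled with its neighbour, leaving 4 cells, so the count is \(4 - 1 - 1 = 2\), in agreement with the 2 degrees of freedom quoted in the question.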