5.06b Fit prescribed distribution: chi-squared test

136 questions

OCR Further Statistics 2023 June Q8
16 marks Challenging +1.2
8 A team of researchers have reason to believe that the number of calls received in randomly chosen 10-minute intervals to a call centre can be well modelled by a Poisson distribution. To test this belief the researchers record the number of telephone calls received in 60 randomly chosen 10-minute intervals. The results, together with relevant calculations, are shown in the following table.
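
| Number of calls, \(r\) | 0 | 1 | 2 | 3 | 4 | \(\geqslant 5\) | Total |
|---|---|---|---|---|---|---|---|
| Observed frequency, \(f\) | 18 | 13 | 12 | 9 | 8 | 0 | 60 |
| \(rf\) | 0 | 13 | 24 | 27 | 32 | 0 | 96 |
| \(r^{2}f\) | 0 | 13 | 48 | 81 | 128 | 0 | 270 |
| Expected frequency | 12.114 | 19.382 | 15.506 | 8.270 | 3.308 | 1.421 | 60 |
| Contribution to test statistic | 2.860 | 2.101 | 0.793 | 1.232 (cells \(r \geqslant 3\) combined) | | | 6.99 |
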
  1. Calculate the mean of the observed number of calls received.
  2. Calculate the variance of the observed number of calls received.
  3. Comment on what your answers to parts (a) and (b) suggest about the proposed model.
  4. Explain why it is necessary to combine some cells in the table.
  5. Show how the values 15.506 and 0.793 in the table were obtained.
  6. Carry out the test, at the \(5 \%\) significance level.

    In the light of the result of the test, the team consider that a different model is appropriate. They propose the following improved model:
    $$P ( R = r ) = \begin{cases} \frac { 1 } { 60 } ( a + ( 2 - r ) b ) & r = 0,1,2,3,4 \\ 0 & \text { otherwise } \end{cases}$$
    where \(a\) and \(b\) are integers.

  7. Use at least three of the observed frequencies to suggest appropriate values for \(a\) and \(b\). You should consider more than one possible pair of values, and explain which pair of values you consider best. (Do not carry out a goodness-of-fit test.)
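
The arithmetic behind the table above can be checked with a short script. This is only a sketch, not part of the original question: it assumes the Poisson mean is estimated by the sample mean, that the variance uses divisor \(n\), and that the cells for \(r \geqslant 3\) are combined as in the table. All variable names are illustrative.

```python
import math

# Observed frequencies for r = 0, 1, 2, 3, 4, >=5 (from the table).
observed = [18, 13, 12, 9, 8, 0]
n = sum(observed)  # 60 intervals

# Sample mean and variance from the totals sum(rf) = 96 and sum(r^2 f) = 270.
mean = 96 / n
variance = 270 / n - mean ** 2  # assumption: divisor n, not n - 1

# Poisson expected frequencies for r = 0..4; the last cell takes the remainder.
expected = [n * math.exp(-mean) * mean ** r / math.factorial(r) for r in range(5)]
expected.append(n - sum(expected))

# Combine the cells for r >= 3 (assumption: same pooling as the table).
obs_pooled = observed[:3] + [sum(observed[3:])]
exp_pooled = expected[:3] + [sum(expected[3:])]

contributions = [(o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled)]
statistic = sum(contributions)

print(round(mean, 2), round(variance, 2))
print([round(e, 3) for e in expected])
print([round(c, 3) for c in contributions], round(statistic, 2))
```

Comparing the printed mean and variance is the kind of check parts (a) to (c) ask for, since a Poisson model has equal mean and variance.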
OCR Further Statistics 2024 June Q5
12 marks Standard +0.3
5 Some bird-watchers study the song of chaffinches in a particular wood. They investigate whether the number, \(N\), of separate bursts of song in a 5 minute period can be modelled by a Poisson distribution. They assume that a burst of song can be considered as a single event, and that bursts of song occur randomly.

(a) State two further assumptions needed for \(N\) to be well modelled by a Poisson distribution.

The bird-watchers record the value of \(N\) in each of 60 periods of 5 minutes. The mean and variance of the results are 3.55 and 5.6475 respectively.

(b) Explain what this suggests about the validity of a Poisson distribution as a model in this context.

The complete results are shown in the table.

| \(n\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | \(\geqslant 9\) |
|---|---|---|---|---|---|---|---|---|---|---|
| Frequency | 10 | 3 | 7 | 8 | 13 | 6 | 6 | 2 | 5 | 0 |

The bird-watchers carry out a \(\chi ^ { 2 }\) goodness of fit test at the \(5 \%\) significance level.
(c) State suitable hypotheses for the test.
(d) Determine the contribution to the test statistic for \(n = 3\).
(e) The total value of the test statistic, obtained by combining the cells for \(n \leqslant 1\) and also for \(n \geqslant 6\), is 9.202, correct to 4 significant figures. Complete the goodness of fit test.
(f) It is known that chaffinches are more likely to sing in the presence of other chaffinches. Explain whether this fact affects the validity of a Poisson model for \(N\).
OCR Further Statistics 2021 November Q6
11 marks Standard +0.3
6 A practice examination paper is taken by 500 candidates, and the organiser wishes to know what continuous distribution could be used to model the actual time, \(X\) minutes, taken by candidates to complete the paper. The organiser starts by carrying out a goodness-of-fit test for the distribution \(\mathrm { N } \left( 100,15 ^ { 2 } \right)\) at the \(5 \%\) significance level. The grouped data and the results of some of the calculations are shown in the following table.

| Time | \(0 \leqslant X < 80\) | \(80 \leqslant X < 90\) | \(90 \leqslant X < 100\) | \(100 \leqslant X < 110\) | \(X \geqslant 110\) |
|---|---|---|---|---|---|
| Observed frequency \(O\) | 36 | 95 | 137 | 129 | 103 |
| Expected frequency \(E\) | 45.606 | 80.641 | 123.754 | 123.754 | 126.246 |
| \(\frac { ( O - E ) ^ { 2 } } { E }\) | 2.023 | 2.557 | 1.418 | 0.222 | 4.280 |

  1. State suitable hypotheses for the test.
  2. Show how the figures 123.754 and 0.222 in the column for \(100 \leqslant X < 110\) were obtained. [3]
  3. Carry out the test.

    The organiser now wants to suggest an improved model for the data.

    1. Suggest an aspect of the data that the organiser should take into account in considering an improved model.
    2. The graph of the probability density function for the distribution \(\mathrm { N } \left( 100,15 ^ { 2 } \right)\) is shown in the diagram in the Printed Answer Booklet. On the same diagram sketch the probability density function of an improved model that takes into account the aspect of the data in part (d)(i).
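
As a rough check on the two figures quoted in part (b), the expected frequency and its contribution can be recomputed with the standard library. This is a sketch under the stated model \(\mathrm{N}(100, 15^2)\) with 500 candidates; the variable names are illustrative and not taken from the question.

```python
from statistics import NormalDist

model = NormalDist(mu=100, sigma=15)
n = 500  # number of candidates

# Expected frequency for 100 <= X < 110 under N(100, 15^2).
probability = model.cdf(110) - model.cdf(100)
expected = n * probability

# Contribution (O - E)^2 / E for the same class, using O = 129 from the table.
observed = 129
contribution = (observed - expected) ** 2 / expected

print(round(expected, 3), round(contribution, 3))
```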
OCR Further Statistics Specimen Q8
15 marks Standard +0.3
8 A continuous random variable \(X\) has probability density function given by $$\mathrm { f } ( x ) = \left\{ \begin{array} { c c } 0.8 \mathrm { e } ^ { - 0.8 x } & x \geq 0 \\ 0 & x < 0 \end{array} \right.$$
  1. Find the mean and variance of \(X\).

    The lifetime of a certain organism is thought to have the same distribution as \(X\). The lifetimes in days of a random sample of 60 specimens of the organism were found. The observed frequencies, together with the expected frequencies correct to 3 decimal places, are given in the table.

    | Range | \(0 \leq x < 1\) | \(1 \leq x < 2\) | \(2 \leq x < 3\) | \(3 \leq x < 4\) | \(x \geq 4\) |
    |---|---|---|---|---|---|
    | Observed | 24 | 22 | 10 | 3 | 1 |
    | Expected | 33.040 | 14.846 | 6.671 | 2.997 | 2.446 |

  2. Show how the expected frequency for \(1 \leq x < 2\) is obtained.
  3. Carry out a goodness of fit test at the \(5 \%\) significance level.
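
The expected frequency in part (b) follows from the cumulative distribution function of the given density, \(F(x) = 1 - \mathrm{e}^{-0.8x}\) for \(x \geq 0\). The sketch below is illustrative only (the helper name `cdf` is not from the question) and also evaluates the standard exponential results mean \(= 1/\lambda\) and variance \(= 1/\lambda^2\).

```python
import math

rate = 0.8  # parameter of the exponential density 0.8 e^{-0.8 x}
n = 60      # sample size

def cdf(x):
    """P(X <= x) for the exponential density in the question."""
    return 1 - math.exp(-rate * x) if x >= 0 else 0.0

mean = 1 / rate
variance = 1 / rate ** 2

# Expected frequency for 1 <= x < 2 is n * (F(2) - F(1)).
expected_1_to_2 = n * (cdf(2) - cdf(1))

print(mean, variance, round(expected_1_to_2, 3))
```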
Edexcel S3 2021 January Q5
18 marks Standard +0.3
5. Chrystal is studying the lengths of pine cones that have fallen from a tree. She believes that the length, \(X \mathrm {~cm}\), of the pine cones can be modelled by a normal distribution with mean 6 cm and standard deviation 0.75 cm. She collects a random sample of 80 pine cones and their lengths are recorded in the table below.

| Length, \(x\) cm | \(x < 5\) | \(5 \leqslant x < 5.5\) | \(5.5 \leqslant x < 6\) | \(6 \leqslant x < 6.5\) | \(x \geqslant 6.5\) |
|---|---|---|---|---|---|
| Frequency | 6 | 14 | 24 | 26 | 10 |

  1. Stating your hypotheses clearly and using a \(10 \%\) level of significance, test Chrystal's belief. Show your working clearly and state the expected frequencies, the test statistic and the critical value used.

    Chrystal's friend David asked for more information about the lengths of the 80 pine cones. Chrystal told him that $$\sum x = 464 \quad \text { and } \quad \sum x ^ { 2 } = 2722.59$$

  2. Calculate unbiased estimates of the mean and variance of the lengths of the pine cones.

    David used the calculations from part (b) to test whether or not the lengths of the pine cones are normally distributed using Chrystal's sample. His test statistic was 3.50 (to 3 significant figures) and he did not pool any classes.

  3. Using a \(10 \%\) level of significance, complete David's test stating the critical value and the degrees of freedom used.
  4. Estimate, to 2 significant figures, the proportion of pine cones from the tree that are longer than 7 cm.
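
For part (b), the unbiased estimates come from the usual formulas \(\bar{x} = \sum x / n\) and \(s^2 = \left(\sum x^2 - n\bar{x}^2\right)/(n - 1)\). A minimal sketch using the summary values given in the question (variable names are illustrative):

```python
n = 80
sum_x = 464
sum_x2 = 2722.59

# Unbiased estimate of the mean.
mean = sum_x / n

# Unbiased estimate of the variance (divisor n - 1).
unbiased_variance = (sum_x2 - n * mean ** 2) / (n - 1)

print(round(mean, 2), round(unbiased_variance, 4))
```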
Edexcel S3 2022 January Q6
12 marks Standard +0.8
6. The number of emails per hour received by a helpdesk were recorded. The results for a random sample of 80 one-hour periods are shown in the table.

| Number of emails per hour | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Frequencies | 1 | 10 | 23 | 15 | 19 | 9 | 3 |

  1. Show that the mean number of emails per hour in the sample is 3

    The manager believes that the number of emails per hour received could be modelled by a Poisson distribution. The following table shows some of the expected frequencies.

    | Number of emails per hour | Expected frequencies |
    |---|---|
    | 0 | \(r\) |
    | 1 | 11.949 |
    | 2 | 17.923 |
    | 3 | 17.923 |
    | 4 | 13.443 |
    | 5 | \(s\) |
    | \(\geqslant 6\) | \(t\) |

  2. Find the values of \(r , s\) and \(t\), giving your answers to 3 decimal places.
  3. Using a 10\% significance level, test whether or not a Poisson model is reasonable. You should clearly state your hypotheses, test statistic and the critical value used.
Edexcel S3 2014 June Q6
14 marks Standard +0.3
6. Eight tasks were given to each of 125 randomly selected job applicants. The number of tasks failed by each applicant is recorded. The results are as follows

| Number of tasks failed by an applicant | 0 | 1 | 2 | 3 | 4 | 5 | 6 or more |
|---|---|---|---|---|---|---|---|
| Frequency | 2 | 21 | 45 | 42 | 12 | 3 | 0 |

  1. Show that the probability of a randomly selected task, from this sample, being failed is 0.3

    An employer believes that a binomial distribution might provide a good model for the number of tasks, out of 8, that an applicant fails. He uses a binomial distribution, with the estimated probability 0.3 of a task being failed. The calculated expected frequencies are as follows

    | Number of tasks failed by an applicant | 0 | 1 | 2 | 3 | 4 | 5 | 6 or more |
    |---|---|---|---|---|---|---|---|
    | Expected frequency | 7.21 | 24.71 | 37.06 | \(r\) | 17.02 | 5.83 | \(s\) |

  2. Find the value of \(r\) and the value of \(s\) giving your answers to 2 decimal places.
  3. Test, at the \(5 \%\) level of significance, whether or not a binomial distribution is a suitable model for these data. State your hypotheses and show your working clearly.

    The employer believes that all applicants have the same probability of failing each task.

  4. Use your result from part (c) to comment on this belief.
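
Parts (a) and (b) are direct binomial calculations. The sketch below is illustrative rather than a model answer: it estimates the failure probability from the observed table and then evaluates the two missing expected frequencies, taking the "6 or more" cell as the remaining probability. The helper `binom_pmf` is a hypothetical name.

```python
from math import comb

# Observed frequencies for 0, 1, ..., 5 and "6 or more" tasks failed.
observed = [2, 21, 45, 42, 12, 3, 0]
applicants = sum(observed)  # 125
tasks = 8

# Estimated probability that a task is failed (the "6 or more" cell is empty).
failed = sum(k * f for k, f in enumerate(observed))
p = failed / (applicants * tasks)

def binom_pmf(k):
    """P(X = k) for X ~ B(tasks, p)."""
    return comb(tasks, k) * p ** k * (1 - p) ** (tasks - k)

r = applicants * binom_pmf(3)
s = applicants * (1 - sum(binom_pmf(k) for k in range(6)))

print(round(p, 3), round(r, 2), round(s, 2))
```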
Edexcel S3 2017 June Q4
11 marks Standard +0.3
4. The number of emergency plumbing calls received per day by a local council was recorded over a period of 80 days. The results are summarised in the table below.

| Number of calls, \(\boldsymbol { x }\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Frequency | 3 | 13 | 14 | 15 | 10 | 8 | 8 | 6 | 3 |

  1. Show that the mean number of emergency plumbing calls received per day is 3.5

    A council officer suggests that a Poisson distribution can be used to model the number of emergency plumbing calls received per day. He uses the mean from the sample above and calculates the expected frequencies shown in the table below.

    | \(\boldsymbol { x }\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 or more |
    |---|---|---|---|---|---|---|---|---|---|
    | Expected frequency | 2.42 | 8.46 | 14.80 | \(r\) | 15.10 | 10.57 | 6.17 | 3.08 | \(s\) |

  2. Calculate the value of \(r\) and the value of \(s\), giving your answers correct to 2 decimal places.
  3. Test, at the \(5 \%\) level of significance, whether or not the Poisson distribution is a suitable model for the number of emergency plumbing calls received per day. State your hypotheses clearly.
Edexcel S3 2018 June Q2
12 marks Standard +0.3
2. A random sample of 75 packets of seeds is selected from a production line. Each packet contains 12 seeds. The seeds are planted and the number of seeds that germinate from each packet is recorded. The results are as follows.

| Number of seeds that germinate from each packet | 6 or fewer | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|
| Number of packets | 0 | 3 | 5 | 18 | 28 | 17 | 4 |

  1. Show that the probability of a randomly selected seed from this sample germinating is 0.82

    A gardener suggests that a binomial distribution can be used to model the number of seeds that germinate from a packet of 12 seeds. She uses a binomial distribution with the estimated probability 0.82 of a seed germinating. Some of the calculated expected frequencies are shown in the table below.

    | Number of seeds that germinate from each packet | 6 or fewer | 7 | 8 | 9 | 10 | 11 | 12 |
    |---|---|---|---|---|---|---|---|
    | Expected frequency | \(s\) | 2.80 | 7.97 | \(r\) | 22.04 | 18.26 | 6.93 |

  2. Calculate the value of \(r\) and the value of \(s\), giving your answers correct to 2 decimal places.
  3. Test, at the \(10 \%\) level of significance, whether or not these data suggest that the binomial distribution is a suitable model for the number of seeds that germinate from a packet of 12 seeds. State your hypotheses clearly and show your working.
Edexcel S3 2021 June Q5
16 marks Standard +0.3
5. A researcher is looking into the effectiveness of a new medicine for the relief of symptoms. He collects random samples of 8 people who are taking the medicine from each of 50 different medical practices. The number of people who say that the medicine is a success, in each sample, is recorded. The results are summarised in the table below.

| Number of successes | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Number of practices | 4 | 6 | 3 | 12 | 10 | 7 | 4 | 2 | 2 |

The researcher decides to model this data using a binomial distribution.
  1. State two necessary assumptions that the researcher made in order to use this model.
  2. Show that the mean number of successes per sample is 3.54

    He decides to use this mean to calculate expected frequencies. The results are shown in the table below.

    | Number of successes | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    |---|---|---|---|---|---|---|---|---|---|
    | Expected frequency | 0.47 | 2.96 | 8.23 | 13.07 | \(f\) | 8.23 | 3.27 | 0.74 | \(g\) |

  3. Calculate the value of \(f\) and the value of \(g\). Give your answers to 2 decimal places.
  4. Stating your hypotheses clearly, test at the \(10 \%\) level of significance, whether or not the binomial distribution is a suitable model for the number of successes in samples of 8 people.
Edexcel S3 2022 June Q7
11 marks Challenging +1.2
7 The following table shows observed frequencies, where \(x\) is an integer, from an experiment to test whether or not a six-sided die is biased.

| Number on die | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Observed frequency | \(x + 6\) | \(x - 8\) | \(x + 8\) | \(x - 5\) | \(x + 4\) | \(x - 5\) |

A goodness of fit test is conducted to determine if there is evidence that the die is biased.
  1. Write down suitable null and alternative hypotheses for this test.

    It is found that the null hypothesis is not rejected at the \(5 \%\) significance level.

  2. Hence
    1. find the minimum value of \(x\)
    2. determine the minimum number of times the die was rolled.
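
One way to see the structure of part (b): the deviations \(+6, -8, +8, -5, +4, -5\) sum to zero, so the total number of rolls is \(6x\) and each expected frequency under a fair die is \(x\). The test statistic is then \(230/x\). The sketch below searches for the smallest integer \(x\) for which the statistic stays below the tabulated \(5 \%\) critical value for 5 degrees of freedom (11.070). It is an illustrative check, not a prescribed method, and the names are hypothetical.

```python
# Each observed frequency is x + d for these d (from the table).
deviations = [6, -8, 8, -5, 4, -5]

# Tabulated 5% critical value of chi-squared with 6 - 1 = 5 degrees of freedom.
CRITICAL_5PCT_DF5 = 11.070

def statistic(x):
    """Sum of (O - E)^2 / E when every expected frequency equals x."""
    return sum(d ** 2 for d in deviations) / x

# Start from the smallest x with no negative observed frequency (x = 8).
x = -min(deviations)
while statistic(x) >= CRITICAL_5PCT_DF5:
    x += 1

min_x = x
min_rolls = 6 * min_x
print(min_x, min_rolls)
```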
Edexcel S3 2023 June Q4
11 marks Standard +0.3
4. It is suggested that the delay, in hours, of certain flights from a particular country may be modelled by the continuous random variable, \(T\), with probability density function
$$f ( t ) = \left\{ \begin{array} { c l } \frac { 2 } { 25 } t & 0 \leqslant t < 5 \\ 0 & \text { otherwise } \end{array} \right.$$
  1. Show that for \(0 \leqslant a \leqslant 4\) $$P ( a \leqslant T < a + 1 ) = \frac { 1 } { 25 } ( 2 a + 1 )$$

    A random sample of 150 of these flights is taken. The delays are summarised in the table below.

    | Delay (\(\boldsymbol { t }\) hours) | Frequency |
    |---|---|
    | \(0 \leqslant t < 1\) | 10 |
    | \(1 \leqslant t < 2\) | 13 |
    | \(2 \leqslant t < 3\) | 24 |
    | \(3 \leqslant t < 4\) | 35 |
    | \(4 \leqslant t < 5\) | 68 |

  2. Test, at the \(5 \%\) significance level, whether the given probability density function is a suitable model for these delays.
    You should state your hypotheses, expected frequencies, test statistic and the critical value used.
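
The result in part (a) gives the expected frequencies for part (b) directly. The sketch below is illustrative only: it assumes no classes are combined and no parameters are estimated, so there are \(5 - 1 = 4\) degrees of freedom, and it uses the tabulated \(5 \%\) critical value 9.488.

```python
n = 150
observed = [10, 13, 24, 35, 68]

# P(a <= T < a + 1) = (2a + 1) / 25 for a = 0, 1, 2, 3, 4.
expected = [n * (2 * a + 1) / 25 for a in range(5)]

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Tabulated 5% critical value of chi-squared with 4 degrees of freedom.
CRITICAL_5PCT_DF4 = 9.488
reject_model = statistic > CRITICAL_5PCT_DF4

print(expected, round(statistic, 3), reject_model)
```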
Edexcel S3 2024 June Q4
11 marks Standard +0.3
4. The manager of a company making ice cream believes that the proportions of people in the population who prefer vanilla, chocolate, strawberry and other are in the ratio \(10 : 5 : 2 : 3\)

The manager takes a random sample of 400 customers and records their age and favourite ice cream flavour. The results are shown in the table below.

| Age / Ice cream flavour | Vanilla | Chocolate | Strawberry | Other | Total |
|---|---|---|---|---|---|
| Child | 95 | 25 | 13 | 25 | 158 |
| Teenager | 57 | 20 | 17 | 36 | 130 |
| Adult | 36 | 50 | 10 | 16 | 112 |
| Total | 188 | 95 | 40 | 77 | 400 |

  1. Use the data in the table to test, at the \(5 \%\) level of significance, the manager's belief. You should state your hypotheses, test statistic, critical value and conclusion clearly.

    A researcher wants to investigate whether or not there is a relationship between the age of a customer and their favourite ice cream flavour. In order to test whether favourite ice cream flavour and age are related, the researcher plans to carry out a \(\chi ^ { 2 }\) test.

  2. Use the table to calculate expected frequencies for the group
    1. teenagers whose favourite ice cream flavour is vanilla,
    2. adults whose favourite ice cream flavour is chocolate.
  3. Write down the number of degrees of freedom for this \(\chi ^ { 2 }\) test.
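
Both parts use the same table in different ways: part (a) compares the column totals with the ratio \(10 : 5 : 2 : 3\), while parts (b) and (c) use the rule (row total × column total) / grand total and \((\text{rows} - 1)(\text{columns} - 1)\) degrees of freedom. A minimal sketch, with illustrative names:

```python
flavours = ["Vanilla", "Chocolate", "Strawberry", "Other"]
table = {
    "Child":    [95, 25, 13, 25],
    "Teenager": [57, 20, 17, 36],
    "Adult":    [36, 50, 10, 16],
}

row_totals = {age: sum(row) for age, row in table.items()}
col_totals = [sum(row[j] for row in table.values()) for j in range(len(flavours))]
grand_total = sum(row_totals.values())

# Part (a): goodness of fit of the flavour totals to the ratio 10 : 5 : 2 : 3.
ratio = [10, 5, 2, 3]
gof_expected = [grand_total * w / sum(ratio) for w in ratio]
gof_statistic = sum((o - e) ** 2 / e for o, e in zip(col_totals, gof_expected))

# Parts (b) and (c): expected cell frequencies and degrees of freedom.
def expected_cell(age, flavour):
    """Row total times column total over grand total."""
    return row_totals[age] * col_totals[flavours.index(flavour)] / grand_total

teen_vanilla = expected_cell("Teenager", "Vanilla")
adult_chocolate = expected_cell("Adult", "Chocolate")
degrees_of_freedom = (len(table) - 1) * (len(flavours) - 1)

print(gof_expected, round(gof_statistic, 3))
print(teen_vanilla, adult_chocolate, degrees_of_freedom)
```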
Edexcel S3 2020 October Q4
15 marks Standard +0.3
4. Luka wants to carry out a survey of students at his school. He obtains a list of all 280 students.
  1. Explain how he can use this list to select a systematic sample of 40 students.

    Luka is trying to make his own random number table. He generates 400 digits to put in his table. Figure 1 shows the frequency of each digit in his table.

    | Digit generated | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
    |---|---|---|---|---|---|---|---|---|---|---|
    | Frequency | 36 | 42 | 33 | 41 | 44 | 43 | 48 | 38 | 32 | 43 |

    Figure 1

    A test is carried out at the \(10 \%\) level of significance to see if the digits Luka generates follow a uniform distribution. For this test \(\sum \frac { ( \mathrm { O } - \mathrm { E } ) ^ { 2 } } { \mathrm { E } } = 5.9\)

  2. Determine the conclusion of this test.

    The digits generated by Luka are taken two at a time to form two-digit numbers. Figure 2 shows the frequency of two-digit numbers in his table.

    | Two-digit numbers generated | \(00 - 19\) | \(20 - 39\) | \(40 - 59\) | \(60 - 79\) | \(80 - 99\) |
    |---|---|---|---|---|---|
    | Frequency | 31 | 49 | 30 | 42 | 48 |

    Figure 2

  3. Test, at the \(10 \%\) level of significance, whether the two-digit numbers generated by Luka follow a uniform distribution. You should state the hypotheses, the degrees of freedom and the critical value used for this test. There are 70 students in Year 12 at his school.
  4. State, giving a reason, the advice you would give to Luka regarding the use of his table of numbers for generating a simple random sample of 10 of the Year 12 students.
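
The two uniform tests in this question use the same calculation with different class counts. The sketch below reproduces the quoted statistic 5.9 for the single digits and evaluates the corresponding statistic for the two-digit classes. It is illustrative only, and the function name is hypothetical.

```python
def uniform_chi_squared(observed):
    """Chi-squared statistic against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

digits = [36, 42, 33, 41, 44, 43, 48, 38, 32, 43]   # Figure 1
two_digit = [31, 49, 30, 42, 48]                    # Figure 2

digit_statistic = uniform_chi_squared(digits)         # 10 - 1 = 9 d.f.
two_digit_statistic = uniform_chi_squared(two_digit)  # 5 - 1 = 4 d.f.

# Tabulated 10% critical values of chi-squared.
CRITICAL_10PCT_DF9 = 14.684
CRITICAL_10PCT_DF4 = 7.779

print(round(digit_statistic, 2), digit_statistic > CRITICAL_10PCT_DF9)
print(round(two_digit_statistic, 2), two_digit_statistic > CRITICAL_10PCT_DF4)
```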
Edexcel S3 2021 October Q2
8 marks Standard +0.3
2. Andy has some apple trees. Over many years she has graded each apple from her trees as \(A , B , C , D\) or \(E\) according to the quality of the apple, with \(A\) being the highest quality and \(E\) being the lowest quality. She knows that the proportion of apples in each grade produced by her trees is as follows.

| Grade | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) |
|---|---|---|---|---|---|
| Proportion | \(4 \%\) | \(28 \%\) | \(52 \%\) | \(10 \%\) | \(6 \%\) |

Raj advises Andy to add potassium to the soil around her apple trees. Andy believes that adding potassium will not affect the distribution of grades for the quality of the apples. To test her belief Andy adds potassium to the soil around her apple trees. The following year she counts the number of apples in each grade. The number of apples in each grade is shown in the table below.

| Grade | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) |
|---|---|---|---|---|---|
| Frequency | 9 | 71 | 136 | 21 | 3 |

Test Andy's belief using a \(5 \%\) level of significance. Show your working clearly, stating your hypotheses, expected frequencies and degrees of freedom.
Edexcel S3 2018 Specimen Q3
11 marks Standard +0.3
3. The number of accidents on a particular stretch of motorway was recorded each day for 200 consecutive days. The results are summarised in the following table.

| Number of accidents | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Frequency | 47 | 57 | 46 | 35 | 9 | 6 |

  1. Show that the mean number of accidents per day for these data is 1.6

    A motorway supervisor believes that the number of accidents per day on this stretch of motorway can be modelled by a Poisson distribution. She uses the mean found in part (a) to calculate the expected frequencies for this model. Her results are given in the following table.

    | Number of accidents | 0 | 1 | 2 | 3 | 4 | 5 or more |
    |---|---|---|---|---|---|---|
    | Frequency | 40.38 | 64.61 | \(r\) | 27.57 | 11.03 | \(s\) |

  2. Find the value of \(r\) and the value of \(s\), giving your answers to 2 decimal places.
  3. Stating your hypotheses clearly, use a \(10 \%\) level of significance to test the motorway supervisor's belief. Show your working clearly.
Edexcel S3 2006 January Q6
13 marks Standard +0.3
6. An area of grass was sampled by placing a \(1 \mathrm {~m} \times 1 \mathrm {~m}\) square randomly in 100 places. The numbers of daisies in each of the squares were counted. It was decided that the resulting data could be modelled by a Poisson distribution with mean 2. The expected frequencies were calculated using the model. The following table shows the observed and expected frequencies.
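
| Number of daisies | Observed frequency | Expected frequency |
|---|---|---|
| 0 | 8 | 13.53 |
| 1 | 32 | 27.07 |
| 2 | 27 | \(r\) |
| 3 | 18 | \(s\) |
| 4 | 10 | 9.02 |
| 5 | 3 | 3.61 |
| 6 | 1 | 1.20 |
| 7 | 0 | 0.34 |
| \(\geq 8\) | 1 | \(t\) |
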
  1. Find values for \(r , s\) and \(t\).
  2. Using a \(5 \%\) significance level, test whether or not this Poisson model is suitable. State your hypotheses clearly.

    An alternative test might have been to estimate the population mean by using the data given.

  3. Explain how this would have affected the test.
Edexcel S3 2004 June Q6
15 marks Standard +0.3
6 Three six-sided dice, which were assumed to be fair, were rolled 250 times. On each occasion the number \(X\) of sixes was recorded. The results were as follows.

| Number of sixes | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| Frequency | 125 | 109 | 13 | 3 |

  1. Write down a suitable model for \(X\).
  2. Test, at the \(1 \%\) level of significance, the suitability of your model for these data.
  3. Explain how the test would have been modified if it had not been assumed that the dice were fair.
Edexcel S3 2007 June Q4
13 marks Standard +0.3
4. A quality control manager regularly samples 20 items from a production line and records the number of defective items \(x\). The results of 100 such samples are given in Table 1 below.

| \(x\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 or more |
|---|---|---|---|---|---|---|---|---|
| Frequency | 17 | 31 | 19 | 14 | 9 | 7 | 3 | 0 |

Table 1

  1. Estimate the proportion of defective items from the production line.

    The manager claimed that the number of defective items in a sample of 20 can be modelled by a binomial distribution. He used the answer in part (a) to calculate the expected frequencies given in Table 2.

    | \(x\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 or more |
    |---|---|---|---|---|---|---|---|---|
    | Expected frequency | 12.2 | 27.0 | \(r\) | 19.0 | \(s\) | 3.2 | 0.9 | 0.2 |

    Table 2

  2. Find the value of \(r\) and the value of \(s\) giving your answers to 1 decimal place.
  3. Stating your hypotheses clearly, use a \(5 \%\) level of significance to test the manager's claim.
  4. Explain what the analysis in part (c) tells the manager about the occurrence of defective items from this production line.
Edexcel S3 2008 June Q6
13 marks Standard +0.3
6. Ten cuttings were taken from each of 100 randomly selected garden plants. The numbers of cuttings that did not grow were recorded.
The results are as follows

| No. of cuttings which did not grow | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8, 9 or 10 |
|---|---|---|---|---|---|---|---|---|---|
| Frequency | 11 | 21 | 30 | 20 | 12 | 3 | 2 | 1 | 0 |

  1. Show that the probability of a randomly selected cutting, from this sample, not growing is 0.223

    A gardener believes that a binomial distribution might provide a good model for the number of cuttings, out of 10, that do not grow. He uses a binomial distribution, with the probability 0.2 of a cutting not growing. The calculated expected frequencies are as follows

    | No. of cuttings which did not grow | 0 | 1 | 2 | 3 | 4 | 5 or more |
    |---|---|---|---|---|---|---|
    | Expected frequency | \(r\) | 26.84 | \(s\) | 20.13 | 8.81 | \(t\) |

  2. Find the values of \(r , s\) and \(t\).
  3. State clearly the hypotheses required to test whether or not this binomial distribution is a suitable model for these data.

    The test statistic for the test is 4.17 and the number of degrees of freedom used is 4.

  4. Explain fully why there are 4 degrees of freedom.
  5. Stating clearly the critical value used, carry out the test using a \(5 \%\) level of significance.
Edexcel S3 2010 June Q6
12 marks Standard +0.8
6. A total of 228 items are collected from an archaeological site. The distance from the centre of the site is recorded for each item. The results are summarised in the table below.

| Distance from the centre of the site \(( \mathrm { m } )\) | \(0 - 1\) | \(1 - 2\) | \(2 - 4\) | \(4 - 6\) | \(6 - 9\) | \(9 - 12\) |
|---|---|---|---|---|---|---|
| Number of items | 22 | 15 | 44 | 37 | 52 | 58 |

Test, at the \(5 \%\) level of significance, whether or not the data can be modelled by a continuous uniform distribution. State your hypotheses clearly.
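
Because the classes have unequal widths, the expected frequencies under a continuous uniform model are proportional to class width. The sketch below assumes the model is uniform on \([0, 12]\) metres, taken from the class boundaries in the table, and that no classes need combining. The names are illustrative.

```python
# Class boundaries in metres and observed counts (from the table).
bounds = [0, 1, 2, 4, 6, 9, 12]
observed = [22, 15, 44, 37, 52, 58]

total = sum(observed)            # 228 items
span = bounds[-1] - bounds[0]    # assumption: uniform on [0, 12]

# Expected frequency of each class is total * (class width / span).
expected = [total * (hi - lo) / span for lo, hi in zip(bounds, bounds[1:])]

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Tabulated 5% critical value of chi-squared with 6 - 1 = 5 degrees of freedom.
CRITICAL_5PCT_DF5 = 11.070

print(expected, round(statistic, 3), statistic > CRITICAL_5PCT_DF5)
```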
Edexcel S3 2012 June Q6
14 marks Standard +0.3
6. A total of 100 random samples of 6 items are selected from a production line in a factory and the number of defective items in each sample is recorded. The results are summarised in the table below.

| Number of defective items | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Number of samples | 6 | 16 | 20 | 23 | 17 | 10 | 8 |

  1. Show that the mean number of defective items per sample is 2.91

    A factory manager suggests that the data can be modelled by a binomial distribution with \(n = 6\). He uses the mean from the sample above and calculates expected frequencies as shown in the table below.

    | Number of defective items | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
    |---|---|---|---|---|---|---|---|
    | Expected frequency | 1.87 | 10.54 | 24.82 | \(a\) | 22.01 | 8.29 | \(b\) |

  2. Calculate the value of \(a\) and the value of \(b\) giving your answers to 2 decimal places.
  3. Test, at the \(5 \%\) level, whether or not the binomial distribution is a suitable model for the number of defective items in samples of 6 items. State your hypotheses clearly.
Edexcel S3 2013 June Q4
14 marks Standard +0.3
4. Customers at a post office are timed to see how long they wait until being served at the counter. A random sample of 50 customers is chosen and their waiting times, \(x\) minutes, are summarised in Table 1.

| Waiting time in minutes \(( x )\) | Frequency |
|---|---|
| \(0 - 3\) | 8 |
| \(3 - 5\) | 12 |
| \(5 - 6\) | 13 |
| \(6 - 8\) | 9 |
| \(8 - 12\) | 8 |

Table 1

  1. Show that an estimate of \(\bar { x } = 5.49\) and an estimate of \(s _ { x } ^ { 2 } = 6.88\)

    The post office manager believes that the customers' waiting times can be modelled by a normal distribution.
    Assuming the data is normally distributed, she calculates the expected frequencies for these data and some of these frequencies are shown in Table 2.

    | Waiting Time | \(x < 3\) | \(3 - 5\) | \(5 - 6\) | \(6 - 8\) | \(x > 8\) |
    |---|---|---|---|---|---|
    | Expected Frequency | 8.56 | 12.73 | 7.56 | \(a\) | \(b\) |

    Table 2

  2. Find the value of \(a\) and the value of \(b\).
  3. Test, at the \(5 \%\) level of significance, the manager's belief. State your hypotheses clearly.
Edexcel S3 2014 June Q6
17 marks Standard +0.3
6. Bags of \(\pounds 1\) coins are paid into a bank. Each bag contains 20 coins. The bank manager believes that \(5 \%\) of the \(\pounds 1\) coins paid into the bank are fakes. He decides to use the distribution \(X \sim \mathrm {~B} ( 20,0.05 )\) to model the random variable \(X\), the number of fake \(\pounds 1\) coins in each bag.
  1. State the assumptions necessary for the binomial distribution to be an appropriate model in this case.

    The bank manager checks a random sample of 150 bags of \(\pounds 1\) coins and records the number of fake coins found in each bag. His results are summarised in Table 1.

    | Number of fake coins in each bag | 0 | 1 | 2 | 3 | 4 or more |
    |---|---|---|---|---|---|
    | Observed frequency | 43 | 62 | 26 | 13 | 6 |
    | Expected frequency | 53.8 | 56.6 | \(r\) | 8.9 | \(s\) |

    Table 1

  2. Calculate the values of \(r\) and \(s\), giving your answers to 1 decimal place.
  3. Carry out a hypothesis test, at the \(5 \%\) significance level, to see if the data supports the bank manager's statistical model. State your hypotheses clearly.

    The assistant manager thinks that a binomial distribution is a good model but suggests that the proportion of fake coins is higher than \(5 \%\). She calculates the actual proportion of fake coins in the sample and uses this value to carry out a new hypothesis test on the data. Her expected frequencies are shown in Table 2.

    | Number of fake coins in each bag | 0 | 1 | 2 | 3 | 4 or more |
    |---|---|---|---|---|---|
    | Observed frequency | 43 | 62 | 26 | 13 | 6 |
    | Expected frequency | 44.5 | 55.7 | 33.2 | 12.5 | 4.1 |

    Table 2

  4. Explain why there are 2 degrees of freedom in this case.
  5. Given that she obtains a \(\chi ^ { 2 }\) test statistic of 2.67, test the assistant manager's hypothesis that the binomial distribution is a good model for the number of fake coins in each bag. Use a \(5 \%\) level of significance and state your hypotheses clearly.
Edexcel S3 2014 June Q5
13 marks Standard +0.3
5. A research station is doing some work on the germination of a new variety of genetically modified wheat. They planted 120 rows containing 7 seeds in each row.
The number of seeds germinating in each row was recorded. The results are as follows

| Number of seeds germinating in each row | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Observed number of rows | 2 | 6 | 11 | 19 | 25 | 32 | 16 | 9 |

  1. Write down two reasons why a binomial distribution may be a suitable model.
  2. Show that the probability of a randomly selected seed from this sample germinating is 0.6

    The research station used a binomial distribution with probability 0.6 of a seed germinating. The expected frequencies were calculated to 2 decimal places. The results are as follows

    | Number of seeds germinating in each row | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
    |---|---|---|---|---|---|---|---|---|
    | Expected number of rows | 0.20 | 2.06 | \(s\) | 23.22 | \(t\) | 31.35 | 15.68 | 3.36 |

  3. Find the value of \(s\) and the value of \(t\).
  4. Stating your hypotheses clearly, test, at the \(1 \%\) level of significance, whether or not the data can be modelled by a binomial distribution.