5.06c Fit other distributions: discrete and continuous

72 questions

Edexcel FS1 2021 June Q1
7 marks Moderate -0.3
  1. Kelly throws a tetrahedral die \(n\) times and records the number on which it lands for each throw.
She calculates the expected frequency for each number to be 43 if the die was unbiased.
The table below shows three of the frequencies Kelly records but the fourth one is missing.
| Number | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| Frequency | 47 | 34 | 36 | \(x\) |
  1. Show that \(x = 55\)
Kelly wishes to test, at the \(5 \%\) level of significance, whether or not there is evidence that the tetrahedral die is unbiased.
  2. Explain why there are 3 degrees of freedom for this test.
  3. Stating your hypotheses clearly and the critical value used, carry out the test.
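For the die question above, the arithmetic behind the test statistic can be sketched as follows. This is an illustrative sketch rather than a model answer: the helper name `chi_squared_statistic` is hypothetical, the frequencies are those in the table with \(x = 55\), and 7.815 is the standard tabulated 5% critical value for 3 degrees of freedom.

```python
def chi_squared_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [47, 34, 36, 55]   # frequencies from the table, with x = 55
expected = [43] * 4           # unbiased die: equal expected frequencies
statistic = chi_squared_statistic(observed, expected)
critical_value = 7.815        # tabulated chi-squared value, 3 d.f., 5% level

# The uniform model is rejected only if the statistic exceeds the critical value.
print(round(statistic, 3), statistic > critical_value)
```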
Edexcel FS1 2023 June Q3
15 marks Standard +0.8
  1. In a class experiment, each day for 170 days, a child is chosen at random and spins a large cardboard coin 5 times and the number of heads is recorded.
    The results are summarised in the following table.
| Number of heads | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Frequency | 3 | 10 | 45 | 62 | 38 | 12 |
Marcus believes that a \(\mathrm { B } ( 5,0.5 )\) distribution can be used to model these data and he calculates expected frequencies, to 2 decimal places, as follows
| Number of heads | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Expected frequency | \(r\) | 26.56 | \(s\) | \(s\) | 26.56 | \(r\) |
  1. Find the value of \(r\) and the value of \(s\)
  2. Carry out a suitable test, at the \(5 \%\) level of significance, to determine whether or not the \(\mathrm { B } ( 5,0.5 )\) distribution is a good model for these data.
    You should state clearly your hypotheses, the test statistic and the critical value used.
Nima believes that a better model for these data would be \(\mathrm { B } ( 5 , p )\)
  3. Find a suitable estimate for \(p\)
To test her model, Nima uses this value of \(p\) to calculate expected frequencies as follows
| Number of heads | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Expected frequency | 2.07 | 14.65 | 41.44 | 58.63 | 41.47 | 11.74 |
    The test statistic for Nima's test is 1.62 (to 3 significant figures)
  4. State,
    1. giving your reasons, the degrees of freedom
    2. the critical value
      that Nima should use for a test at the 5\% significance level.
  5. With reference to Marcus' and Nima's test results, comment on
    1. the probability of the coin landing on heads,
    2. the independence of the spins of the coin. Give reasons for your answers.
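The two binomial models in the question above differ only in how \(p\) is chosen. A minimal sketch, assuming the usual estimate of \(p\) as the sample mean number of heads divided by the 5 spins (the function name `binomial_expected` is hypothetical):

```python
from math import comb

def binomial_expected(n_trials, p, total):
    """Expected frequencies total * P(X = k) for X ~ B(n_trials, p)."""
    return [total * comb(n_trials, k) * p**k * (1 - p) ** (n_trials - k)
            for k in range(n_trials + 1)]

observed = [3, 10, 45, 62, 38, 12]           # heads 0..5 over 170 days
total = sum(observed)
marcus = binomial_expected(5, 0.5, total)    # fixed p = 0.5

# Estimate p from the data: mean number of heads per day divided by 5 spins.
mean_heads = sum(k * f for k, f in enumerate(observed)) / total
p_hat = mean_heads / 5
nima = binomial_expected(5, p_hat, total)

print([round(e, 2) for e in marcus])
print(round(p_hat, 4), [round(e, 2) for e in nima])
```

Estimating \(p\) from the data uses up one further degree of freedom compared with the fixed-\(p\) model.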
Edexcel FS1 Specimen Q3
14 marks Standard +0.8
  1. Bags of \(\pounds 1\) coins are paid into a bank. Each bag contains 20 coins.
The bank manager believes that \(5 \%\) of the \(\pounds 1\) coins paid into the bank are fakes. He decides to use the distribution \(X \sim B ( 20,0.05 )\) to model the random variable \(X\), the number of fake \(\pounds 1\) coins in each bag. The bank manager checks a random sample of 150 bags of \(\pounds 1\) coins and records the number of fake coins found in each bag. His results are summarised in Table 1. He then calculates some of the expected frequencies, correct to 1 decimal place.
| Number of fake coins in each bag | 0 | 1 | 2 | 3 | 4 or more |
|---|---|---|---|---|---|
| Observed frequency | 43 | 62 | 26 | 13 | 6 |
| Expected frequency | 53.8 | 56.6 | | 8.9 | |
Table 1
  1. Carry out a hypothesis test, at the \(5 \%\) significance level, to see if the data supports the bank manager's statistical model. State your hypotheses clearly.
The assistant manager thinks that a binomial distribution is a good model but suggests that the proportion of fake coins is higher than \(5 \%\). She calculates the actual proportion of fake coins in the sample and uses this value to carry out a new hypothesis test on the data. Her expected frequencies are shown in Table 2.
| Number of fake coins in each bag | 0 | 1 | 2 | 3 | 4 or more |
|---|---|---|---|---|---|
| Observed frequency | 43 | 62 | 26 | 13 | 6 |
| Expected frequency | 44.5 | 55.7 | 33.2 | 12.5 | 4.1 |
Table 2
  2. Explain why there are 2 degrees of freedom in this case.
  3. Given that she obtains a \(\chi ^ { 2 }\) test statistic of 2.67 , test the assistant manager's hypothesis that the binomial distribution is a good model for the number of fake coins in each bag. Use a \(5 \%\) level of significance and state your hypotheses clearly.
OCR Further Statistics 2018 December Q7
12 marks Standard +0.8
Sasha tends to forget his passwords. He investigates whether the number of attempts he needs to log on to a system with a password can be modelled by a geometric distribution. On 60 occasions he records the number of attempts he needs to log on, and the results are shown in the table.
Number of attempts1234 or more
Frequency2019133
  1. Test at the \(1 \%\) significance level whether the results are consistent with the distribution Geo(0.4).
  2. Suggest which two probabilities should be changed, and in what way, to produce an improved model. (Numerical values are not required.) You should give a reason for your suggestion. [3]
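Expected frequencies for a geometric model with an open-ended final cell, as in the question above, can be sketched as follows. The function name `geometric_expected` is hypothetical; the only values taken from the question are the parameter 0.4, the 60 occasions and the "4 or more" final cell.

```python
def geometric_expected(p, total, last_cell):
    """Expected frequencies for cells 1, ..., last_cell - 1 and 'last_cell or more'."""
    q = 1 - p
    cells = [total * p * q ** (k - 1) for k in range(1, last_cell)]
    cells.append(total * q ** (last_cell - 1))   # P(X >= last_cell) = q^(last_cell - 1)
    return cells

expected = geometric_expected(0.4, 60, 4)
print([round(e, 2) for e in expected])
```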
AQA Further AS Paper 2 Statistics 2024 June Q5
6 marks Easy -1.8
A spinner has 8 equal areas numbered 1 to 8, as shown in the diagram below.
[Diagram of the spinner]
The spinner is spun and lands with one of its edges on the ground.
  1. Assume that the spinner lands on each number with equal probability.
    1. State a distribution that could be used to model the number that the spinner lands on.
    2. Use your distribution from part (a)(i) to find the probability that the spinner lands on a number greater than 5 [1 mark]
  2. Clare spins the spinner 1000 times and records the results in the following table.
| Number landed on | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|
| Frequency | 37 | 64 | 112 | 161 | 308 | 156 | 109 | 53 |
    1. Explain how the data shows that the model used in part (a) may not be valid.
    2. Describe how Clare's results could be used to adjust the model.
CAIE FP2 2019 June Q9
10 marks Standard +0.3
A random sample of 50 observations of the continuous random variable \(X\) was taken and the values are summarised in the following table.
| Interval | \(0 \leqslant x < 0.8\) | \(0.8 \leqslant x < 1.6\) | \(1.6 \leqslant x < 2.4\) | \(2.4 \leqslant x < 3.2\) | \(3.2 \leqslant x < 4\) |
|---|---|---|---|---|---|
| Observed frequency | 18 | 16 | 8 | 6 | 2 |
It is required to test the goodness of fit of the distribution with probability density function f given by $$f(x) = \begin{cases} \frac{3}{16}(4 - x)^{\frac{1}{2}} & 0 \leqslant x < 4, \\ 0 & \text{otherwise}. \end{cases}$$ The relevant expected frequencies, correct to 2 decimal places, are given in the following table.
| Interval | \(0 \leqslant x < 0.8\) | \(0.8 \leqslant x < 1.6\) | \(1.6 \leqslant x < 2.4\) | \(2.4 \leqslant x < 3.2\) | \(3.2 \leqslant x < 4\) |
|---|---|---|---|---|---|
| Expected frequency | 14.22 | 12.54 | 10.59 | 8.18 | 4.47 |
  1. Show how the expected frequency for \(1.6 \leqslant x < 2.4\) is obtained. [3]
  2. Carry out a goodness of fit test at the 5% significance level. [7]
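For a continuous model such as the one above, each expected frequency is the sample size times the probability of the interval. A minimal sketch, assuming the cumulative distribution function \(F(x) = 1 - \frac{1}{8}(4 - x)^{3/2}\) obtained by integrating the given density (the name `cdf` is hypothetical):

```python
def cdf(x):
    """CDF of f(x) = (3/16) * sqrt(4 - x) on 0 <= x < 4, found by integration."""
    return 1 - (4 - x) ** 1.5 / 8

edges = [0, 0.8, 1.6, 2.4, 3.2, 4]
sample_size = 50

# Expected frequency for each interval [a, b) is n * (F(b) - F(a)).
expected = [sample_size * (cdf(b) - cdf(a)) for a, b in zip(edges, edges[1:])]
print([round(e, 2) for e in expected])
```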
CAIE FP2 2018 November Q10
12 marks Standard +0.8
The number of accidents, \(x\), that occur each day on a motorway are recorded over a period of 40 days. The results are shown in the following table.
| Number of accidents | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\) |
|---|---|---|---|---|---|---|---|---|
| Observed frequency | 3 | 5 | 8 | 10 | 5 | 7 | 2 | 0 |
  1. Show that the mean number of accidents each day is 2.95 and calculate the variance for this sample. Explain why these values suggest that a Poisson distribution might fit the data. [3]
  2. A Poisson distribution with mean 2.95, as found from the data, is used to calculate the expected frequencies, correct to 2 decimal places. The results are shown in the following table.
| Number of accidents | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\) |
|---|---|---|---|---|---|---|---|---|
| Observed frequency | 3 | 5 | 8 | 10 | 5 | 7 | 2 | 0 |
| Expected frequency | 2.09 | 6.18 | 9.11 | 8.96 | 6.61 | 3.90 | 1.92 | 1.23 |
    Show how the expected frequency of 6.61 for \(x = 4\) is obtained. [2]
  3. Test at the 5% significance level the goodness of fit of this Poisson distribution to the data. [7]
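The summary statistics and Poisson expected frequencies in the question above can be reproduced with the sketch below. It assumes the variance is computed with divisor \(n\); a mark scheme may instead use divisor \(n - 1\), which gives a slightly larger value.

```python
from math import exp, factorial

observed = [3, 5, 8, 10, 5, 7, 2, 0]     # accidents 0..6 and ">= 7" (empty cell)
days = sum(observed)
mean = sum(k * f for k, f in enumerate(observed)) / days
# Variance with divisor n (assumption; the ">= 7" cell has frequency 0).
variance = sum(k * k * f for k, f in enumerate(observed)) / days - mean ** 2

probs = [exp(-mean) * mean ** k / factorial(k) for k in range(7)]
probs.append(1 - sum(probs))             # tail cell P(X >= 7)
expected = [days * q for q in probs]

print(mean, round(variance, 4), [round(e, 2) for e in expected])
```

A mean and variance that are close to each other are consistent with a Poisson model, since the two are equal for that distribution.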
Edexcel S3 2015 June Q3
11 marks Standard +0.3
The number of accidents on a particular stretch of motorway was recorded each day for 200 consecutive days. The results are summarised in the following table.
| Number of accidents | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Frequency | 47 | 57 | 46 | 35 | 9 | 6 |
  1. Show that the mean number of accidents per day for these data is 1.6 [1]
A motorway supervisor believes that the number of accidents per day on this stretch of motorway can be modelled by a Poisson distribution. She uses the mean found in part (a) to calculate the expected frequencies for this model. Her results are given in the following table.
| Number of accidents | 0 | 1 | 2 | 3 | 4 | 5 or more |
|---|---|---|---|---|---|---|
| Expected frequency | 40.38 | 64.61 | \(r\) | 27.57 | 11.03 | \(s\) |
  1. Find the value of \(r\) and the value of \(s\), giving your answers to 2 decimal places. [3]
  2. Stating your hypotheses clearly, use a 10\% level of significance to test the motorway supervisor's belief. Show your working clearly. [7]
Edexcel S3 Q6
12 marks Standard +0.3
Data were collected on the number of female puppies born in 200 litters of size 8. It was decided to test whether or not a binomial model with parameters \(n = 8\) and \(p = 0.5\) is a suitable model for these data. The following table shows the observed frequencies and the expected frequencies, to 2 decimal places, obtained in order to carry out this test.
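| Number of females | Observed number of litters | Expected number of litters |
|---|---|---|
| 0 | 1 | 0.78 |
| 1 | 9 | 6.25 |
| 2 | 27 | 21.88 |
| 3 | 46 | \(R\) |
| 4 | 49 | \(S\) |
| 5 | 35 | \(T\) |
| 6 | 26 | 21.88 |
| 7 | 5 | 6.25 |
| 8 | 2 | 0.78 |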
  1. Find the values of \(R\), \(S\) and \(T\). [4]
  2. Carry out the test to determine whether or not this binomial model is a suitable one. State your hypotheses clearly and use a 5\% level of significance. [7]
An alternative test might have involved estimating \(p\) rather than assuming \(p = 0.5\).
  1. Explain how this would have affected the test. [1]
Edexcel S3 2002 June Q6
12 marks Standard +0.3
Data were collected on the number of female puppies born in 200 litters of size 8. It was decided to test whether or not a binomial model with parameters \(n = 8\) and \(p = 0.5\) is a suitable model for these data. The following table shows the observed frequencies and the expected frequencies, to 2 decimal places, obtained in order to carry out this test.
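| Number of females | Observed number of litters | Expected number of litters |
|---|---|---|
| 0 | 1 | 0.78 |
| 1 | 9 | 6.25 |
| 2 | 27 | 21.88 |
| 3 | 46 | \(R\) |
| 4 | 49 | \(S\) |
| 5 | 35 | \(T\) |
| 6 | 26 | 21.88 |
| 7 | 5 | 6.25 |
| 8 | 2 | 0.78 |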
  1. Find the values of \(R\), \(S\) and \(T\). [4]
  2. Carry out the test to determine whether or not this binomial model is a suitable one. State your hypotheses clearly and use a 5\% level of significance. [7]
An alternative test might have involved estimating \(p\) rather than assuming \(p = 0.5\).
  1. Explain how this would have affected the test. [1]
Edexcel S3 2005 June Q5
Standard +0.3
The number of times per day a computer fails and has to be restarted is recorded for 200 days. The results are summarised in the table.
| Number of restarts | Frequency |
|---|---|
| 0 | 99 |
| 1 | 65 |
| 2 | 22 |
| 3 | 12 |
| 4 | 2 |
Test whether or not a Poisson model is suitable to represent the number of restarts per day. Use a 5\% level of significance and state your hypothesis clearly. (Total 12 marks)
Edexcel S3 2009 June Q5
12 marks Standard +0.3
The number of goals scored by a football team is recorded for 100 games. The results are summarised in Table 1 below.
| Number of goals | Frequency |
|---|---|
| 0 | 40 |
| 1 | 33 |
| 2 | 14 |
| 3 | 8 |
| 4 | 5 |
Table 1
  1. Calculate the mean number of goals scored per game. [2]
The manager claimed that the number of goals scored per match follows a Poisson distribution. He used the answer in part (a) to calculate the expected frequencies given in Table 2.
| Number of goals | Expected Frequency |
|---|---|
| 0 | 34.994 |
| 1 | \(r\) |
| 2 | \(s\) |
| 3 | 6.752 |
| \(\geqslant 4\) | 2.221 |
Table 2
  1. Find the value of \(r\) and the value of \(s\) giving your answers to 3 decimal places. [3]
  2. Stating your hypotheses clearly, use a 5\% level of significance to test the manager's claim. [7]
Edexcel S3 2016 June Q6
Standard +0.3
An airport manager carries out a survey of families and their luggage. Each family is allowed to check in a maximum of 4 suitcases. She observes 50 families at the check-in desk and counts the total number of suitcases each family checks in. The data are summarised in the table below.
| Number of suitcases | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| Frequency | 6 | 25 | 12 | 6 | 1 |
The manager claims that the data can be modelled by a binomial distribution with \(p = 0.3\)
  1. Test the manager's claim at the 5\% level of significance. State your hypotheses clearly. Show your working clearly and give your expected frequencies to 2 decimal places. (8)
The manager also carries out a survey of the time taken by passengers to check in. She records the number of passengers that check in during each of 100 five-minute intervals. The manager makes a new claim that these data can be modelled by a Poisson distribution. She calculates the expected frequencies given in the table below.
| Number of passengers | 0 | 1 | 2 | 3 | 4 | 5 or more |
|---|---|---|---|---|---|---|
| Observed frequency | 5 | 40 | 31 | 18 | 6 | 0 |
| Expected frequency | 16.53 | 29.75 | \(r\) | \(s\) | 7.23 | 3.64 |
  2. Find the value of \(r\) and the value of \(s\) giving your answers to 2 decimal places. (3)
  3. Stating your hypotheses clearly, use a 1\% level of significance to test the manager's new claim. (6)
Edexcel S3 Specimen Q5
11 marks Moderate -0.3
For a six-sided die it is assumed that each of the sides has an equal chance of landing uppermost when the die is rolled.
  1. Write down the probability function for the random variable \(X\), the number showing on the uppermost side after the die has been rolled. [2]
  2. State the name of the distribution. [1]
A student wishing to check the above assumption rolled the die 300 times and for the sides 1 to 6, obtained the frequencies 41, 49, 52, 58, 37 and 63 respectively.
  1. Analyse these data and comment on whether or not the assumption is valid for this die. Use a 5\% level of significance and state your hypotheses clearly. [8]
Edexcel S3 Q8
20 marks Standard +0.3
A physicist believes that the number of particles emitted by a radioactive source with a long half-life can be modelled by a Poisson distribution. She records the number of particles emitted in 80 successive 5-minute periods and her results are shown in the table below.
| No. of Particles | 0 | 1 | 2 | 3 | 4 | 5 or more |
|---|---|---|---|---|---|---|
| No. of Intervals | 23 | 32 | 14 | 8 | 3 | 0 |
  1. Comment on the suitability of a Poisson distribution for this situation. [3 marks]
  2. Show that an unbiased estimate of the mean number of particles emitted in a 5-minute period is 1.2 and find an unbiased estimate of the variance. [5 marks]
  3. Explain how your answers to part (b) support the fitting of a Poisson distribution. [1 mark]
  4. Stating your hypotheses clearly and using a 5\% level of significance, test whether or not these data can be modelled by a Poisson distribution. [11 marks]
Edexcel S3 Q7
17 marks Standard +0.3
A shoe manufacturer sees a report from another country stating that the length of adult male feet is normally distributed with a mean of 22.4 cm and a standard deviation of 2.8 cm. The manufacturer wishes to see if this model is appropriate for his customers and collects data on the length, correct to the nearest cm, of the right foot of a random sample of 200 males giving the following results:
| Length (cm) | \(\leq 18\) | \(19 - 21\) | \(22 - 24\) | \(25 - 27\) | \(\geq 28\) |
|---|---|---|---|---|---|
| No. of Men | 24 | 48 | 69 | 41 | 18 |
The expected frequencies for the \(\leq 18\) and \(19 - 21\) groups are calculated as 16.46 and 58.44 respectively, correct to 2 decimal places.
  1. Calculate expected frequencies for the other three classes. [7]
  2. Stating your hypotheses clearly, test at the 10\% level of significance whether or not this data can be modelled by the distribution N(22.4, 2.8²). [7]
The manufacturer wishes to refine the model by not assuming a mean and standard deviation.
  1. Explain briefly how the manufacturer should proceed. [3]
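Expected frequencies for grouped measurements under a normal model, as in the question above, can be sketched with the error function. The sketch assumes class boundaries at 18.5, 21.5, 24.5 and 27.5 because lengths are recorded to the nearest cm; `normal_cdf` is a hypothetical helper. The results differ slightly from the quoted 16.46 and 58.44, which appear to come from statistical tables with \(z\) rounded to 2 decimal places.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

mu, sigma, total = 22.4, 2.8, 200
cuts = [18.5, 21.5, 24.5, 27.5]          # assumed class boundaries (nearest cm)

# Cumulative probabilities at the boundaries, with 0 and 1 for the open tails.
cdf_values = [0.0] + [normal_cdf(c, mu, sigma) for c in cuts] + [1.0]
expected = [total * (hi - lo) for lo, hi in zip(cdf_values, cdf_values[1:])]
print([round(e, 2) for e in expected])
```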
OCR MEI Further Statistics Minor Specimen Q6
16 marks Standard +0.3
At a bird feeding station, birds are captured and ringed. If a bird is recaptured, the ring enables it to be identified. The table below shows the number of recaptures, \(x\), during a period of a month, for each bird of a particular species in a random sample of \(40\) birds.
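| Number of recaptures, \(x\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Frequency | 2 | 5 | 5 | 9 | 10 | 4 | 3 | 1 | 0 | 1 | 0 |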
  1. The sample mean of \(x\) is \(3.4\). Calculate the sample variance of \(x\). [2]
  2. Briefly comment on whether the results of part (i) support a suggestion that a Poisson model might be a good fit to the data. [1]
The screenshot below shows part of a spreadsheet for a \(\chi^2\) test to assess the goodness of fit of a Poisson model. The sample mean of \(3.4\) has been used as an estimate of the Poisson parameter. Some values in the spreadsheet have been deliberately omitted. \includegraphics{figure_2}
  1. State the null and alternative hypotheses for the test. [1]
  2. Calculate the missing values in cells
  3. Complete the test at the \(10\%\) significance level. [5]
  4. The screenshot below shows part of a spreadsheet for a \(\chi^2\) test for a different species of bird. Find the value of the Poisson parameter used. \includegraphics{figure_3} [3]
WJEC Further Unit 2 2018 June Q5
12 marks Standard +0.3
A life insurance saleswoman investigates the number of policies she sells per day. The results for a random sample of 50 days are shown in the table below.
| Number of policies sold | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Number of days | 2 | 2 | 9 | 12 | 15 | 9 | 1 |
She sees the same fixed number of clients each day. She would like to know whether the binomial distribution with parameters 6 and 0·6 is a suitable model for the number of policies she sells per day.
  1. State suitable hypotheses for a goodness of fit test. [1]
  2. Here is part of the table for a \(\chi^2\) goodness of fit test on the data.
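| Number of policies sold | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Observed | 2 | 2 | 9 | 12 | 15 | 9 | 1 |
| Expected | 0·205 | 1·843 | 6·912 | \(d\) | \(e\) | 9·331 | 2·333 |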
    1. Calculate the values of \(d\) and \(e\).
    2. Carry out the test using a 10% level of significance and draw a conclusion in context. [10]
  3. What do the parameters 6 and 0·6 mean in this context? [1]
OCR FS1 AS 2021 June Q4
9 marks Challenging +1.2
The table shows the results of a random sample drawn from a population which is thought to have the distribution \(U(20)\).
| Range | \(1 \leq x \leq 8\) | \(9 \leq x \leq 12\) | \(13 \leq x \leq 20\) |
|---|---|---|---|
| Observed frequency | 12 | \(y\) | \(28 - y\) |
Find the range of values of \(y\) for which the data are not consistent with the distribution at the \(5\%\) significance level. [9]
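One way to check an answer to the question above is to scan every possible \(y\) numerically. The sketch assumes \(y\) is an integer from 0 to 28, that \(U(20)\) means the discrete uniform distribution on 1 to 20, and that the tabulated 5% critical value for 2 degrees of freedom, 5.991, applies; the function name `statistic` is hypothetical.

```python
def statistic(y):
    """Chi-squared statistic for observed frequencies 12, y, 28 - y under U(20)."""
    observed = [12, y, 28 - y]
    total = sum(observed)                            # always 40
    expected = [total * w / 20 for w in (8, 4, 8)]   # cells cover 8, 4 and 8 values
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_value = 5.991    # tabulated chi-squared value, 2 d.f., 5% level

# Values of y for which the uniform model would be rejected.
rejected = [y for y in range(29) if statistic(y) > critical_value]
print(rejected)
```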
OCR Further Statistics 2021 June Q4
10 marks Standard +0.3
A biased spinner has five sides, numbered 1 to 5. Elmer spins the spinner repeatedly and counts the number of spins, \(X\), up to and including the first time that the number 2 appears. He carries out this experiment 100 times and records the frequency \(f\) with which each value of \(X\) is obtained. His results are shown in Table 1, together with the values of \(xf\).
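| \(x\) | 1 | 2 | 3 | 4 | 5 | 6 | \(\geq 7\) | Total |
|---|---|---|---|---|---|---|---|---|
| Frequency \(f\) | 20 | 15 | 9 | 13 | 10 | 10 | 23 | 100 |
| \(xf\) | 20 | 30 | 27 | 52 | 50 | 60 | 161 | 400 |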
Table 1
  1. State an appropriate distribution with which to model \(X\), determining the value(s) of any parameter(s). [3]
Elmer carries out a goodness-of-fit test, at the 5\% level, for the distribution in part (a). Table 2 gives some of his calculations, in which numbers that are not exact have been rounded to 3 decimal places.
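| \(x\) | 1 | 2 | 3 | 4 | 5 | 6 | \(\geq 7\) |
|---|---|---|---|---|---|---|---|
| Observed frequency \(O\) | 20 | 15 | 9 | 13 | 10 | 10 | 23 |
| Expected frequency \(E\) | 25 | 18.75 | 14.063 | 10.547 | 7.910 | 5.933 | 17.798 |
| \((O-E)^2/E\) | 1 | 0.75 | 1.823 | 0.571 | 0.552 | 2.789 | 1.520 |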
Table 2
  1. Show how the expected frequency corresponding to \(x \geq 7\) was obtained. [2]
  2. Carry out the test. [5]
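The calculations in Table 2 can be reproduced with the sketch below. It assumes the parameter is estimated from the data as the reciprocal of the sample mean, which is why two is subtracted from the number of cells when counting degrees of freedom; 11.070 is the standard tabulated 5% critical value for 5 degrees of freedom.

```python
observed = [20, 15, 9, 13, 10, 10, 23]   # x = 1..6 and ">= 7"
total = sum(observed)
sum_xf = 400                             # total of the xf row in Table 1
p_hat = total / sum_xf                   # Geo(p) has mean 1/p, so p is estimated by 1/mean
q = 1 - p_hat

expected = [total * p_hat * q ** (x - 1) for x in range(1, 7)]
expected.append(total * q ** 6)          # P(X >= 7) = q^6

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1 - 1   # totals agree, and p was estimated
critical_value = 11.070                      # tabulated chi-squared value, 5 d.f., 5% level
print(p_hat, round(statistic, 3), degrees_of_freedom, statistic > critical_value)
```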
OCR Further Statistics 2017 Specimen Q8
15 marks Standard +0.8
A continuous random variable \(X\) has probability density function given by $$f(x) = \begin{cases} 0.8e^{-0.8x} & x \geq 0, \\ 0 & x < 0. \end{cases}$$
  1. Find the mean and variance of \(X\). [4]
The lifetime of a certain organism is thought to have the same distribution as \(X\). The lifetimes in days of a random sample of 60 specimens of the organism were found. The observed frequencies, together with the expected frequencies correct to 3 decimal places, are given in the table.
| Range | \(0 \leq x < 1\) | \(1 \leq x < 2\) | \(2 \leq x < 3\) | \(3 \leq x < 4\) | \(x \geq 4\) |
|---|---|---|---|---|---|
| Observed | 24 | 22 | 10 | 3 | 1 |
| Expected | 33.040 | 14.846 | 6.671 | 2.997 | 2.446 |
  1. Show how the expected frequency for \(1 \leq x < 2\) is obtained. [4]
  2. Carry out a goodness of fit test at the 5\% significance level. [7]
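The expected frequencies in the table above follow from the exponential cumulative distribution function \(F(x) = 1 - e^{-0.8x}\). A minimal sketch (the names `rate` and `cdf` are hypothetical):

```python
from math import exp

rate, sample_size = 0.8, 60

def cdf(x):
    """CDF of the exponential distribution with the given rate."""
    return 1 - exp(-rate * x)

edges = [0, 1, 2, 3, 4]
expected = [sample_size * (cdf(b) - cdf(a)) for a, b in zip(edges, edges[1:])]
expected.append(sample_size * (1 - cdf(4)))   # open-ended cell x >= 4
print([round(e, 3) for e in expected])
```

The last two expected frequencies are below 5, so under the common convention those cells would be pooled before the test is carried out.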
OCR FS1 AS 2017 Specimen Q7
4 marks Standard +0.3
The discrete random variable \(X\) is equally likely to take values 0, 1 and 2. \(3N\) observations of \(X\) are obtained, and the observed frequencies corresponding to \(X = 0\), \(X = 1\) and \(X = 2\) are given in the following table.
| \(x\) | 0 | 1 | 2 |
|---|---|---|---|
| Observed frequency | \(N - 1\) | \(N - 1\) | \(N + 2\) |
The test statistic for a chi-squared goodness of fit test for the data is 0.3. Find the value of \(N\). [4]
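A sketch of how the statistic in the question above depends on \(N\), assuming an expected frequency of \(N\) in each cell (three equally likely values over \(3N\) observations):

$$\chi^2 = \frac{(N - 1 - N)^2}{N} + \frac{(N - 1 - N)^2}{N} + \frac{(N + 2 - N)^2}{N} = \frac{1 + 1 + 4}{N} = \frac{6}{N}$$

Setting \(\frac{6}{N} = 0.3\) then gives \(N = 20\).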