Chi-squared goodness of fit: Uniform

A question is of this type if and only if it tests whether data fit a uniform (discrete or continuous) distribution, including equal proportions.

18 questions · Standard +0.2

OCR S3 2006 June Q2
6 marks Standard +0.3
2 The manager of a factory with a large number of employees investigated when accidents to employees occurred during 8-hour shifts. An analysis was made of 600 randomly chosen accidents that occurred over a year. The following table shows the numbers of accidents occurring in the four consecutive 2-hour periods of the 8-hour shifts.
\begin{tabular}{|l|c|c|c|c|}
\hline
Period & 1 & 2 & 3 & 4 \\
\hline
Number of accidents & 138 & 127 & 165 & 170 \\
\hline
\end{tabular}
Test, at the \(5 \%\) significance level, whether the proportions of all accidents that occur in the four time periods differ.
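
The calculation this question calls for can be sketched in Python as follows. This is a minimal illustration, assuming equal expected frequencies under the null hypothesis of equal proportions. The helper name `chi_squared_statistic` is hypothetical, and the critical value for 3 degrees of freedom at the 5% level is taken from standard tables.

```python
def chi_squared_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [138, 127, 165, 170]                      # accidents in periods 1 to 4
total = sum(observed)                                # 600 accidents in all
expected = [total / len(observed)] * len(observed)   # 150 per period under H0

statistic = chi_squared_statistic(observed, expected)
critical_value = 7.815   # chi-squared, 3 degrees of freedom, 5% level (tables)

print(f"test statistic = {statistic:.3f}")
print("reject H0" if statistic > critical_value else "do not reject H0")
```

The degrees of freedom are one fewer than the number of periods, because the expected frequencies are constrained to share the observed total.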
OCR Further Statistics AS 2018 June Q8
9 marks Challenging +1.2
8 The table shows the results of a random sample drawn from a population which is thought to have the distribution \(\mathrm { U } ( 20 )\).
\begin{tabular}{|l|c|c|c|}
\hline
Range & \(1 \leqslant x \leqslant 8\) & \(9 \leqslant x \leqslant 12\) & \(13 \leqslant x \leqslant 20\) \\
\hline
Observed frequency & 12 & \(y\) & \(28 - y\) \\
\hline
\end{tabular}
Find the range of values of \(y\) for which the data are not consistent with the distribution at the \(5 \%\) significance level.
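
One way to explore this question numerically is to evaluate the test statistic for every admissible integer \(y\). The sketch below assumes that \(\mathrm{U}(20)\) denotes the discrete uniform distribution on the integers 1 to 20, so that the class probabilities are 8/20, 4/20 and 8/20. The critical value for 2 degrees of freedom comes from standard tables, and the function name `statistic` is illustrative only.

```python
CRITICAL_VALUE = 5.991   # chi-squared, 2 degrees of freedom, 5% level (tables)

n = 40                   # sample size: 12 + y + (28 - y)
# Assumed class probabilities under U(20): 8/20, 4/20 and 8/20.
expected = [n * 8 / 20, n * 4 / 20, n * 8 / 20]   # 16, 8, 16

def statistic(y):
    """Chi-squared statistic for a given middle-class frequency y."""
    observed = [12, y, 28 - y]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Values of y (0 to 28 inclusive) for which H0 would be rejected.
rejected = [y for y in range(0, 29) if statistic(y) > CRITICAL_VALUE]
print(rejected)
```

An algebraic solution would instead solve the quadratic inequality in \(y\) directly; the enumeration is only a check on that working.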
OCR Further Statistics AS 2021 November Q6
9 marks Moderate -0.3
6 A student believes that if you ask people to choose an integer between 1 and 10, not all integers are equally likely to be chosen. The student asks a random sample of 100 people to choose an integer between 1 and 10 inclusive. The observed frequencies \(O\), together with the values of \(\frac { ( O - E ) ^ { 2 } } { E }\) where \(E\) is the corresponding expected frequency, are shown in the table.
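\begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|}
\hline
Integer & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\
\hline
\(O\) & 7 & 8 & 20 & 8 & 7 & 6 & 19 & 7 & 8 & 10 \\
\hline
\(\frac { ( \mathrm { O } - \mathrm { E } ) ^ { 2 } } { \mathrm { E } }\) & 0.9 & 0.4 & 10.0 & 0.4 & 0.9 & 1.6 & 8.1 & 0.9 & 0.4 & 0 \\
\hline
\end{tabular}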
  1. Show how the value of 8.1 for integer 7 is obtained.
  2. Show that there is evidence at the \(1 \%\) significance level that the student’s belief is correct. The student wishes to suggest an alternative model for the probabilities associated with each integer. In this model, two of the integers have the same probability \(p _ { 1 }\) of being chosen and the other eight integers each have probability \(p _ { 2 }\) of being chosen.
  3. Suggest which two integers should have probability \(p _ { 1 }\) and suggest a possible value of \(p _ { 1 }\).
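
The tabulated contributions can be checked with a short Python sketch. It assumes an expected frequency of 10 for each integer (100 people, 10 equally likely integers) and uses the 1% critical value for 9 degrees of freedom from standard tables. The variable names are illustrative.

```python
observed = [7, 8, 20, 8, 7, 6, 19, 7, 8, 10]   # integers 1 to 10
expected = sum(observed) / len(observed)       # 100 / 10 = 10 under H0

# Contribution of each integer to the test statistic.
contributions = [(o - expected) ** 2 / expected for o in observed]
statistic = sum(contributions)
critical_value = 21.666   # chi-squared, 9 degrees of freedom, 1% level (tables)

print("contribution for integer 7:", round(contributions[6], 1))
print("test statistic:", round(statistic, 1))
print("reject H0" if statistic > critical_value else "do not reject H0")
```

The two largest contributions indicate which integers depart most from the uniform model, which is the information part 3 relies on.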
OCR Further Statistics AS Specimen Q7
4 marks Standard +0.3
7 The discrete random variable \(X\) is equally likely to take values 0, 1 and 2. \(3 N\) observations of \(X\) are obtained, and the observed frequencies corresponding to \(X = 0 , X = 1\) and \(X = 2\) are given in the following table.
\begin{tabular}{|l|c|c|c|}
\hline
\(x\) & 0 & 1 & 2 \\
\hline
Observed frequency & \(N - 1\) & \(N - 1\) & \(N + 2\) \\
\hline
\end{tabular}
The test statistic for a chi-squared goodness of fit test for the data is 0.3. Find the value of \(N\).
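
As a check on the algebra, the statistic can be evaluated exactly for a range of candidate values of \(N\). The sketch below assumes each expected frequency is \(N\) (three equally likely values, \(3N\) observations). The search range of 2 to 1000 is an arbitrary choice made for illustration.

```python
from fractions import Fraction

def statistic(n):
    """Exact chi-squared statistic when each expected frequency is n."""
    observed = [n - 1, n - 1, n + 2]
    return sum(Fraction((o - n) ** 2, n) for o in observed)

target = Fraction(3, 10)   # the stated test statistic, 0.3

# Arbitrary search range, chosen only for illustration.
solutions = [n for n in range(2, 1001) if statistic(n) == target]
print(solutions)
```

Using `Fraction` avoids floating-point comparison issues when testing for equality with 0.3.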
Edexcel S3 2021 January Q3
10 marks Standard +0.3
3. The students in a group of schools can choose a club to join. There are 4 clubs available: Music, Art, Sports and Computers. The director collected information about the number of students in each club, using a random sample of 88 students from across the schools. The results are given in Table 1 below. \begin{table}[h]
\begin{tabular}{|l|c|c|c|c|}
\cline{2-5}
\multicolumn{1}{c|}{} & Music & Art & Sports & Computers \\
\hline
No. of students & 14 & 28 & 27 & 19 \\
\hline
\end{tabular}
\captionsetup{labelformat=empty} \caption{Table 1}
\end{table} The director uses a chi-squared test to determine whether or not the students are uniformly distributed across the 4 clubs.
    1. Find the expected frequencies he should use. Given that the test statistic he calculated was 6.09 (to 3 significant figures)
    2. use a \(5 \%\) level of significance to complete the test. You should state the degrees of freedom and the critical value used. The director wishes to examine the situation in more detail and takes a second random sample of 88 students. The director assumes that within each school, students select their clubs independently. The students come from 3 schools and the distribution of the students from each school amongst the clubs is given in Table 2 below. \begin{table}[h]
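      \begin{tabular}{|l|c|c|c|c|}
      \hline
      School / Club & Music & Art & Sports & Computers \\
      \hline
      School \(\boldsymbol { A }\) & 3 & 10 & 9 & 8 \\
      \hline
      School \(\boldsymbol { B }\) & 1 & 11 & 13 & 5 \\
      \hline
      School \(\boldsymbol { C }\) & 11 & 6 & 7 & 4 \\
      \hline
      \end{tabular}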
      \captionsetup{labelformat=empty} \caption{Table 2}
      \end{table} The director wishes to test for an association between a student's school and the club they choose.
  1. State hypotheses suitable for such a test.
  2. Calculate the expected frequency for School \(C\) and the Computers club. The director calculates the test statistic to be 7.29 (to 3 significant figures) with 4 degrees of freedom.
  3. Explain clearly why his test has 4 degrees of freedom.
  4. Complete the test using a \(5 \%\) level of significance and stating clearly your critical value.
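
The expected frequencies and the quoted test statistic for the association test can be reproduced with the Python sketch below. It reads Table 2 as School A: 3, 10, 9, 8; School B: 1, 11, 13, 5; School C: 11, 6, 7, 4. It also assumes that the Music column, which has an expected frequency below 5, is pooled with Computers. That pooling choice is an assumption, made because it reproduces the stated value of 7.29. The helper name `expected_table` is hypothetical.

```python
# Rows: Schools A, B, C. Columns: Music, Art, Sports, Computers.
observed = [
    [3, 10, 9, 8],
    [1, 11, 13, 5],
    [11, 6, 7, 4],
]

def expected_table(rows):
    """Expected frequencies (row total x column total / grand total)."""
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand_total = sum(row_totals)
    return [[rt * ct / grand_total for ct in col_totals] for rt in row_totals]

expected = expected_table(observed)
school_c_computers = expected[2][3]      # 28 * 17 / 88
print(round(school_c_computers, 2))

# Assumption: pool Music with Computers, since Music has an expected count < 5.
pooled = [[r[0] + r[3], r[1], r[2]] for r in observed]
pooled_expected = expected_table(pooled)

statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(pooled, pooled_expected)
    for o, e in zip(obs_row, exp_row)
)
degrees_of_freedom = (len(pooled) - 1) * (len(pooled[0]) - 1)
critical_value = 9.488   # chi-squared, 4 degrees of freedom, 5% level (tables)

print(round(statistic, 2), degrees_of_freedom)
print("reject H0" if statistic > critical_value else "do not reject H0")
```

After pooling, the table has 3 rows and 3 columns, which is where the 4 degrees of freedom come from.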
Edexcel S3 2016 June Q5
9 marks Standard +0.3
5. Kylie used video technology to monitor the direction of flight, as a bearing, \(x\) degrees, for 450 honeybees that left her beehive during a particular morning. Kylie's results are summarised in the table below.
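\begin{tabular}{|c|c|}
\hline
Direction of flight & Frequency \\
\hline
\(0 \leqslant x < 72\) & 78 \\
\hline
\(72 \leqslant x < 140\) & 69 \\
\hline
\(140 \leqslant x < 190\) & 51 \\
\hline
\(190 \leqslant x < 260\) & 108 \\
\hline
\(260 \leqslant x < 360\) & 144 \\
\hline
\end{tabular}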
Kylie believes that a continuous uniform distribution over the interval [0,360] is a suitable model for the direction of flight. Stating your hypotheses clearly, use a 1\% level of significance to test Kylie's belief. Show your working clearly.
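
For a continuous uniform model with unequal class widths, the expected frequencies are proportional to the widths. The Python sketch below illustrates this for Kylie's data. The 1% critical value for 4 degrees of freedom is taken from standard tables, and the variable names are illustrative.

```python
boundaries = [0, 72, 140, 190, 260, 360]   # class boundaries in degrees
observed = [78, 69, 51, 108, 144]
n = sum(observed)                          # 450 honeybees

# Under U[0, 360], P(class) = class width / 360.
widths = [hi - lo for lo, hi in zip(boundaries, boundaries[1:])]
expected = [n * w / 360 for w in widths]

statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
critical_value = 13.277   # chi-squared, 4 degrees of freedom, 1% level (tables)

print(expected)
print(round(statistic, 3))
print("reject H0" if statistic > critical_value else "do not reject H0")
```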
Edexcel S3 2020 October Q4
15 marks Standard +0.3
4. Luka wants to carry out a survey of students at his school. He obtains a list of all 280 students.
  1. Explain how he can use this list to select a systematic sample of 40 students. Luka is trying to make his own random number table. He generates 400 digits to put in his table. Figure 1 shows the frequency of each digit in his table. \begin{table}[h]
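    \begin{tabular}{|l|c|c|c|c|c|c|c|c|c|c|}
    \hline
    Digit generated & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
    \hline
    Frequency & 36 & 42 & 33 & 41 & 44 & 43 & 48 & 38 & 32 & 43 \\
    \hline
    \end{tabular}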
    \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{table} A test is carried out at the \(10 \%\) level of significance to see if the digits Luka generates follow a uniform distribution. For this test \(\sum \frac { ( \mathrm { O } - \mathrm { E } ) ^ { 2 } } { \mathrm { E } } = 5.9\)
  2. Determine the conclusion of this test.
    (3) The digits generated by Luka are taken two at a time to form two-digit numbers. Figure 2 shows the frequency of two-digit numbers in his table. \begin{table}[h]
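    \begin{tabular}{|l|c|c|c|c|c|}
    \hline
    Two-digit numbers generated & \(00 - 19\) & \(20 - 39\) & \(40 - 59\) & \(60 - 79\) & \(80 - 99\) \\
    \hline
    Frequency & 31 & 49 & 30 & 42 & 48 \\
    \hline
    \end{tabular}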
    \captionsetup{labelformat=empty} \caption{Figure 2}
    \end{table}
  3. Test, at the \(10 \%\) level of significance, whether the two-digit numbers generated by Luka follow a uniform distribution. You should state the hypotheses, the degrees of freedom and the critical value used for this test. There are 70 students in Year 12 at his school.
  4. State, giving a reason, the advice you would give to Luka regarding the use of his table of numbers for generating a simple random sample of 10 of the Year 12 students.
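
Both uniformity tests in this question use the same calculation, which the Python sketch below applies to the single digits and to the two-digit classes. The helper name `uniform_statistic` is hypothetical, and the 10% critical values are taken from standard tables.

```python
def uniform_statistic(observed):
    """Chi-squared statistic against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

digit_counts = [36, 42, 33, 41, 44, 43, 48, 38, 32, 43]   # digits 0 to 9
pair_counts = [31, 49, 30, 42, 48]                        # 00-19, ..., 80-99

digit_stat = uniform_statistic(digit_counts)   # expected 40 per digit
pair_stat = uniform_statistic(pair_counts)     # expected 40 per class

# 10% critical values from chi-squared tables.
critical_9_df = 14.684   # 10 classes, so 9 degrees of freedom
critical_4_df = 7.779    # 5 classes, so 4 degrees of freedom

print(round(digit_stat, 2), digit_stat > critical_9_df)
print(round(pair_stat, 2), pair_stat > critical_4_df)
```

The first value agrees with the 5.9 quoted in the question.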
Edexcel S3 Specimen Q6
12 marks Standard +0.8
6. A total of 228 items are collected from an archaeological site. The distance from the centre of the site is recorded for each item. The results are summarised in the table below.
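\begin{tabular}{|l|c|c|c|c|c|c|}
\hline
Distance from the centre of the site (m) & \(0 - 1\) & \(1 - 2\) & \(2 - 4\) & \(4 - 6\) & \(6 - 9\) & \(9 - 12\) \\
\hline
Number of items & 22 & 15 & 44 & 37 & 52 & 58 \\
\hline
\end{tabular}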
Test, at the \(5 \%\) level of significance, whether or not the data can be modelled by a continuous uniform distribution. State your hypotheses clearly.
Edexcel S3 2010 June Q6
12 marks Standard +0.8
  1. A total of 228 items are collected from an archaeological site. The distance from the centre of the site is recorded for each item. The results are summarised in the table below.
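\begin{tabular}{|l|c|c|c|c|c|c|}
\hline
Distance from the centre of the site \(( \mathrm { m } )\) & \(0 - 1\) & \(1 - 2\) & \(2 - 4\) & \(4 - 6\) & \(6 - 9\) & \(9 - 12\) \\
\hline
Number of items & 22 & 15 & 44 & 37 & 52 & 58 \\
\hline
\end{tabular}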
Test, at the \(5 \%\) level of significance, whether or not the data can be modelled by a continuous uniform distribution. State your hypotheses clearly.
Edexcel S3 2015 June Q6
19 marks Standard +0.3
6. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{740f7555-3a9a-4526-9048-39908aa8f8dd-10_684_694_239_625} \captionsetup{labelformat=empty} \caption{Figure 1}
\end{figure} The sketch in Figure 1 represents a target which consists of 4 regions formed from 4 concentric circles of radii \(4 \mathrm {~cm} , 7 \mathrm {~cm} , 9 \mathrm {~cm}\) and 10 cm . The regions are coloured as labelled in Figure 1.
A random sample of 100 children each choose a point on the target and their results are summarised in the table below. (b) Find the value of \(r\) and the value of \(s\). Henry obtained a test statistic of 6.188 and no groups were pooled.
(c) State what conclusion Henry should make about his claim. Phoebe believes that the children chose the region of the target according to colour. She believes that boys and girls would favour different colours and splits the original data by gender to obtain the following table. \section*{Observed frequencies}
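\begin{tabular}{|l|c|c|c|c|c|}
\hline
Colour of region & Green & Red & Blue & Yellow & Total \\
\hline
Boys & 10 & 12 & 10 & 3 & 35 \\
\hline
Girls & 12 & 27 & 15 & 11 & 65 \\
\hline
\end{tabular}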
(d) State suitable hypotheses to test Phoebe's belief. Phoebe calculated the following expected frequencies to carry out a suitable test. \section*{Expected frequencies}
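\begin{tabular}{|l|c|c|c|c|}
\hline
Colour of region & Green & Red & Blue & Yellow \\
\hline
Boys & 7.7 & 13.65 & 8.75 & 4.9 \\
\hline
Girls & 14.3 & 25.35 & 16.25 & 9.1 \\
\hline
\end{tabular}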
(e) Show how the value of 25.35 was obtained. Phoebe carried out the test using 2 degrees of freedom and a \(10 \%\) level of significance. She obtained a test statistic of 1.411
(f) Explain clearly why Phoebe used 2 degrees of freedom.
(g) Stating your critical value clearly, determine whether or not these data support Phoebe's belief.
Edexcel S3 2017 June Q2
10 marks Standard +0.3
2. \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{585de4b0-906e-40c4-9045-966d68505eff-04_430_438_260_753} \captionsetup{labelformat=empty} \caption{Figure 1}
\end{figure} The pointer shown in Figure 1 is spun so that it comes to rest between 0 and 360 degrees.
Linda claims that it is equally likely to come to rest at any point between 0 and 360 degrees. She spins the pointer 100 times and her results are summarised in the table below. She calculates expected frequencies for some of the possible outcomes and these are also given in the table below.
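\begin{tabular}{|l|c|c|c|c|c|}
\hline
Angle (degrees) & \(0 - 45\) & \(45 - 90\) & \(90 - 180\) & \(180 - 315\) & \(315 - 360\) \\
\hline
Frequency & 18 & 16 & 18 & 29 & 19 \\
\hline
Expected frequency & 12.5 & \(a\) & \(b\) & \(c\) & 12.5 \\
\hline
\end{tabular}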
  1. Find the values of the missing expected frequencies \(a , b\) and \(c\).
  2. Stating your hypotheses clearly and using a \(5 \%\) level of significance, test whether or not Linda's claim is supported by these data.
Edexcel S3 Specimen Q5
11 marks Moderate -0.3
5. For a six-sided die it is assumed that each of the sides has an equal chance of landing uppermost when the die is rolled.
  1. Write down the probability function for the random variable \(X\), the number showing on the uppermost side after the die has been rolled.
  2. State the name of the distribution. A student wishing to check the above assumption rolled the die 300 times and for the sides 1 to 6 , obtained the frequencies \(41,49,52,58,37\) and 63 respectively.
  3. Analyse these data and comment on whether or not the assumption is valid for this die. Use a \(5 \%\) level of significance and state your hypotheses clearly.
    (8)
Edexcel S3 Q1
6 marks Standard +0.3
  1. (a) Explain briefly the method of quota sampling.
    (b) Give one disadvantage of quota sampling compared with stratified sampling.
    (c) Describe a situation in which you would choose to use quota sampling rather than stratified sampling and explain why.
    (2 marks)
  2. Commentators on a game of cricket say that a certain batsman is "playing shots all round the ground". A sports statistician wishes to analyse this claim and records the direction of shots played by the batsman during the course of his innings. She divides the \(360 ^ { \circ }\) around the batsman into six sectors, measuring the angle of each shot clockwise from the line between the wickets, and obtains the following results:
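\begin{tabular}{|l|c|c|c|c|c|c|}
\hline
Sector & \(0 ^ { \circ } -\) & \(45 ^ { \circ } -\) & \(90 ^ { \circ } -\) & \(180 ^ { \circ } -\) & \(270 ^ { \circ } -\) & \(315 ^ { \circ } - 360 ^ { \circ }\) \\
\hline
No. of Shots & 18 & 19 & 15 & 20 & 9 & 15 \\
\hline
\end{tabular}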
Stating your hypotheses clearly and using a \(5 \%\) level of significance test whether or not these data can be modelled by a continuous uniform distribution.
(9 marks)
Edexcel S3 Q2
9 marks Standard +0.3
  1. A psychologist is investigating the numbers people choose when asked to pick a number at random in a given interval. He finds that when asked to pick a number between 0 and 100 people are less likely to pick certain numbers, such as multiples of ten. He believes, however that if people are asked to pick an odd number between 0 and 100 they are equally likely to pick a number ending in any of the digits \(1,3,5,7\) or 9 .
To test this theory he asks 80 people to pick an odd number between 0 and 100 and records the last digit of the numbers chosen. His results are shown in the table below.
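\begin{tabular}{|l|c|c|c|c|c|}
\hline
Last Digit & 1 & 3 & 5 & 7 & 9 \\
\hline
Frequency & 16 & 20 & 14 & 17 & 13 \\
\hline
\end{tabular}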
Stating your hypotheses clearly and using a 10\% level of significance test the psychologist’s theory.
(9 marks)
OCR MEI Further Statistics A AS Specimen Q2
6 marks Moderate -0.8
2 The discrete random variable \(Y\) is uniformly distributed over the values \(\{ 12,13 , \ldots , 20 \}\).
  1. Write down \(\mathrm { P } ( Y < 15 )\).
  2. Two independent observations of \(Y\) are taken. Find the probability that one of these values is less than 15 and the other is greater than 15 .
  3. Find \(\mathrm { P } ( Y > \mathrm { E } ( Y ) )\).
WJEC Further Unit 2 2019 June Q5
11 marks Standard +0.3
5. Chris is investigating the distribution of birth months for ice hockey players. He collects data for 869 randomly chosen National Hockey League (NHL) players. He decides to carry out a chi-squared test. Using a spreadsheet, he produces the following output.
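\begin{tabular}{|c|l|r|r|r|}
\hline
 & A & B & C & D \\
\hline
1 & Birth Month & Observed & Expected & Chi-Squared Contributions \\
\hline
2 & Jan-Mar & 259 & 217.25 & 8.023302647 \\
\hline
3 & Apr-June & 232 & 217.25 & 1.001438435 \\
\hline
4 & Jul-Sept & 200 & 217.25 & 1.369677791 \\
\hline
5 & Oct-Dec & 178 & 217.25 & 7.091196778 \\
\hline
6 & Total & 869 & 869 & 17.48561565 \\
\hline
7 & & & & \\
\hline
8 & p value & & & \\
\hline
9 & 0.000561458 & & & \\
\hline
\end{tabular}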
  1. By considering the output, state the null hypothesis that Chris is testing. State what conclusion Chris should reach and explain why. Chris now wonders if Premier League football players' birth months are distributed uniformly throughout the year. He collects the birth months of 75 randomly selected Premier League footballers. This information is shown in the table below.
    JanFebMarAprMayJunJulAugSepOctNovDec
    37114122665856
  2. Carry out the chi-squared goodness of fit test at the 10\% significance level that Chris should use to conduct his investigation.
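
The spreadsheet output for the NHL data can be reproduced with the Python sketch below. It assumes the spreadsheet tested equal expected frequencies across the four quarters, which is what the Expected column shows. The p-value uses the closed-form upper-tail probability of the chi-squared distribution, which holds only for 3 degrees of freedom. The helper name `chi2_sf_3df` is hypothetical.

```python
import math

observed = [259, 232, 200, 178]            # Jan-Mar, Apr-June, Jul-Sept, Oct-Dec
expected = sum(observed) / len(observed)   # 869 / 4 = 217.25 per quarter

contributions = [(o - expected) ** 2 / expected for o in observed]
statistic = sum(contributions)

def chi2_sf_3df(x):
    """P(X > x) for a chi-squared variable with exactly 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2))
            + math.sqrt(2 * x / math.pi) * math.exp(-x / 2))

p_value = chi2_sf_3df(statistic)

print([round(c, 6) for c in contributions])
print(round(statistic, 6), round(p_value, 9))
```

A general-purpose statistics library would normally be used for the p-value; the closed form is used here only to keep the sketch dependency-free.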
Edexcel FS1 2021 June Q1
7 marks Moderate -0.3
  1. Kelly throws a tetrahedral die \(n\) times and records the number on which it lands for each throw.
She calculates the expected frequency for each number to be 43 if the die was unbiased.
The table below shows three of the frequencies Kelly records but the fourth one is missing.
\begin{tabular}{|l|c|c|c|c|}
\hline
Number & 1 & 2 & 3 & 4 \\
\hline
Frequency & 47 & 34 & 36 & \(x\) \\
\hline
\end{tabular}
  1. Show that \(x = 55\). Kelly wishes to test, at the \(5 \%\) level of significance, whether or not there is evidence that the tetrahedral die is unbiased.
  2. Explain why there are 3 degrees of freedom for this test.
  3. Stating your hypotheses clearly and the critical value used, carry out the test.
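
The missing frequency and the test statistic can be checked with a short Python sketch. It assumes that an unbiased four-sided die gives each face the same expected frequency, so the number of throws is four times 43. The critical value for 3 degrees of freedom at the 5% level is taken from standard tables.

```python
expected_per_face = 43
faces = 4
n = faces * expected_per_face        # total number of throws

known = [47, 34, 36]                 # recorded frequencies for faces 1 to 3
x = n - sum(known)                   # the missing frequency for face 4
observed = known + [x]

statistic = sum((o - expected_per_face) ** 2 / expected_per_face
                for o in observed)
critical_value = 7.815   # chi-squared, 3 degrees of freedom, 5% level (tables)

print(n, x)
print(round(statistic, 3))
print("reject H0" if statistic > critical_value else "do not reject H0")
```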
OCR FS1 AS 2017 December Q7
11 marks Standard +0.3
7 Josh is investigating whether sticking pins into a map at random, while blindfolded, provides a random sample of regions of the map. Josh divides the map into 49 squares of equal size and asks each of 98 friends to stick a pin into the map at random, while blindfolded. He then notes the number of pins in each square. To analyse the results he groups the squares as shown in the diagram.
DDDDDDD
DCCCCCD
DCBBBCD
DCBABCD
DCBBBCD
DCCCCCD
DDDDDDD
The results are summarised in the table.
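\begin{tabular}{|l|c|c|c|c|}
\hline
Region & A & B & C & D \\
\hline
Number of squares & 1 & 8 & 16 & 24 \\
\hline
Number of pins & 6 & 21 & 33 & 38 \\
\hline
\end{tabular}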
  1. Test at the 10\% significance level whether the use of pins in this way provides a random sample of regions of the map.
  2. What can be deduced from considering the different contributions to the test statistic?
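
The Python sketch below illustrates one way to set up this test. It assumes that, under the null hypothesis, pins land in regions in proportion to the number of squares. It also assumes that region A, whose expected frequency is below 5, is pooled with the adjacent region B. The pooling choice is an assumption rather than something the question states, and the 10% critical value is taken from standard tables.

```python
squares = {"A": 1, "B": 8, "C": 16, "D": 24}
pins = {"A": 6, "B": 21, "C": 33, "D": 38}

total_squares = sum(squares.values())   # 49
total_pins = sum(pins.values())         # 98

# Expected pins per region, proportional to the region's area.
expected = {r: total_pins * s / total_squares for r, s in squares.items()}

# Assumption: pool A with B because the expected frequency for A is below 5.
groups = [("A", "B"), ("C",), ("D",)]
pooled_observed = [sum(pins[r] for r in g) for g in groups]
pooled_expected = [sum(expected[r] for r in g) for g in groups]

contributions = [(o - e) ** 2 / e
                 for o, e in zip(pooled_observed, pooled_expected)]
statistic = sum(contributions)
critical_value = 4.605   # chi-squared, 2 degrees of freedom, 10% level (tables)

print(expected)
print([round(c, 3) for c in contributions], round(statistic, 3))
print("reject H0" if statistic > critical_value else "do not reject H0")
```

Printing the individual contributions shows which groups of regions account for most of the statistic, which is the comparison part 2 asks about.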