5.06a Chi-squared: contingency tables

179 questions

Edexcel FS1 AS 2020 June Q2
15 marks Standard +0.3
  1. In an experiment, James flips a coin 3 times and records the number of heads. He carries out the experiment 100 times with his left hand and 100 times with his right hand.
| Hand / Number of heads | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| Left hand | 7 | 29 | 42 | 22 |
| Right hand | 13 | 35 | 36 | 16 |
  1. Test, at the \(5 \%\) level of significance, whether or not there is an association between the hand he flips the coin with and the number of heads. You should state your hypotheses, the degrees of freedom and the critical value used for this test.
  2. Assuming the coin is unbiased, write down the distribution of the number of heads in 3 flips.
  3. Carry out a \(\chi ^ { 2 }\) test, at the \(10 \%\) level of significance, to test whether or not the distribution you wrote down in part (b) is a suitable model for the number of heads obtained in the 200 trials of James' experiment. You should state your hypotheses, the degrees of freedom and the critical value used for this test.
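The goodness-of-fit calculation in the last part can be checked numerically. The following is a minimal sketch rather than a model answer: it assumes the left-hand and right-hand results are pooled into 200 trials and that the model is \(\mathrm{B}(3, 0.5)\); the variable names are illustrative only.

```python
from math import comb

# Pooled counts of 0, 1, 2, 3 heads over the 200 trials (left + right hand).
left = [7, 29, 42, 22]
right = [13, 35, 36, 16]
observed = [l + r for l, r in zip(left, right)]

# Expected counts under X ~ B(3, 0.5): 200 * C(3, k) * 0.5**3.
n_trials = sum(observed)
expected = [n_trials * comb(3, k) * 0.5**3 for k in range(4)]

# Goodness-of-fit statistic: sum of (O - E)^2 / E over the four classes.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# No parameter is estimated from the data, so df = 4 classes - 1 = 3.
degrees_of_freedom = len(observed) - 1

print(observed, expected, round(statistic, 3), degrees_of_freedom)
```

The statistic is then compared with the tabulated \(\chi^2\) critical value for 3 degrees of freedom at the 10% level.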
Edexcel FS1 AS 2021 June Q4
7 marks Challenging +1.2
  1. Charlie carried out a survey on the main type of investment people have.
The contingency table below shows the results of a survey of a random sample of people.
| Age / Main type of investment | Bonds | Cash | Stocks |
|---|---|---|---|
| \(25 - 44\) | \(a\) | \(b - e\) | \(e\) |
| \(45 - 75\) | \(c\) | \(d - 59\) | 59 |
  1. Find an expression, in terms of \(a , b , c\) and \(d\), for the difference between the observed and the expected value \(( O - E )\) for the group whose main type of investment is Bonds and who are aged 45-75
    Express your answer as a single fraction in its simplest form.
Given that \(\sum \frac { ( O - E ) ^ { 2 } } { E } = 9.62\) for this information,
  2. test, at the \(5 \%\) level of significance, whether or not there is evidence of an association between the age of a person and the main type of investment they have. You should state your hypotheses, critical value and conclusion clearly. You may assume that no cells need to be combined.
Edexcel FS1 AS 2022 June Q1
7 marks Moderate -0.3
  1. Stuart is investigating a treatment for a disease that affects fruit trees. He has 400 fruit trees and applies the treatment to a random sample of these trees. The remainder of the trees have no treatment. He records the number of years, \(y\), that each fruit tree remains free from this disease.
The results are summarised in the table below.
| Number of years free from this disease | Treatment applied | Treatment not applied |
|---|---|---|
| \(y < 1\) | 15 | 25 |
| \(1 \leqslant y < 2\) | 35 | 61 |
| \(2 \leqslant y\) | 124 | 140 |
The data are to be used to determine whether or not there is an association between the application of the treatment and the number of years that a fruit tree remains free from this disease.
  1. Calculate the expected frequencies for
    1. Applied and \(y < 1\)
    2. Not applied and \(1 \leqslant y < 2\)
The value of \(\sum \frac { ( O - E ) ^ { 2 } } { E }\) for the other four classes is 2.642 to 3 decimal places.
  2. Test, at the \(5 \%\) level of significance, whether or not there is an association between the application of the treatment and the number of years a fruit tree remains free from this disease. You should state your hypotheses, test statistic, critical value and conclusion clearly.
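The expected frequencies and the completed test statistic for this question can be sketched as below. This is an illustrative check, not a mark scheme: it assumes the usual independence formula (row total × column total ÷ grand total), and the helper name `expected` is hypothetical.

```python
# Observed counts: rows are y < 1, 1 <= y < 2, 2 <= y; columns are Applied, Not applied.
observed = [[15, 25], [35, 61], [124, 140]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

def expected(i, j):
    """Expected frequency under independence: row total * column total / grand total."""
    return row_totals[i] * col_totals[j] / grand_total

e_applied_lt1 = expected(0, 0)        # Applied and y < 1
e_not_applied_1to2 = expected(1, 1)   # Not applied and 1 <= y < 2

# Add the two missing contributions to the 2.642 quoted for the other four cells.
statistic = 2.642
statistic += (observed[0][0] - e_applied_lt1) ** 2 / e_applied_lt1
statistic += (observed[1][1] - e_not_applied_1to2) ** 2 / e_not_applied_1to2

degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(e_applied_lt1, e_not_applied_1to2, round(statistic, 3), degrees_of_freedom)
```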
Edexcel FS1 AS 2023 June Q2
6 marks Standard +0.3
  1. A bag contains a large number of balls, all of the same size and weight. The balls are coloured Red, Blue or Yellow.
Jasmine asks each child in a group of 150 children to close their eyes, select a ball from the bag and show it to her. The child then replaces the ball and repeats the process a second time. If both balls are the same colour the child receives a prize.
The results are given in the table below.
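| 2nd colour (rows) / 1st colour (columns) | Red | Blue | Yellow | Total |
|---|---|---|---|---|
| Red | 31 | 11 | 18 | 60 |
| Blue | 8 | 10 | 9 | 27 |
| Yellow | 21 | 9 | 33 | 63 |
| Total | 60 | 30 | 60 | 150 |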
Jasmine carries out a test, at the \(5 \%\) level of significance, to see whether or not the colour of the 2nd ball is independent of the colour of the 1st ball.
  1. Calculate the expected frequencies for the cases where both balls are the same colour.
The test statistic Jasmine obtained was 12.712 to three decimal places.
  2. Use this value to complete the test, stating the critical value and conclusion clearly.
  3. With reference to your calculations in part (a) and the nature of the experiment, give a plausible reason why Jasmine may have obtained her conclusion in part (b).
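The same-colour expected frequencies lie on the diagonal of the table. A small sketch of that calculation follows, assuming the table's rows are the 2nd colour and its columns the 1st colour; the dictionary name `same_colour` is illustrative.

```python
# Rows: 2nd colour (Red, Blue, Yellow); columns: 1st colour (Red, Blue, Yellow).
colours = ["Red", "Blue", "Yellow"]
observed = [[31, 11, 18], [8, 10, 9], [21, 9, 33]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Observed and expected frequency for each "same colour twice" cell on the diagonal.
same_colour = {
    c: (observed[i][i], row_totals[i] * col_totals[i] / n)
    for i, c in enumerate(colours)
}
for colour, (o, e) in same_colour.items():
    print(f"{colour}: observed {o}, expected {e:.1f}")
```

Comparing each observed diagonal count with its expected value is one way to approach the final part.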
Edexcel FS1 AS 2024 June Q1
6 marks Moderate -0.3
  1. Sharma believes that each computer game he sells appeals equally to all age ranges.
To investigate this, he takes a random sample of 100 people who play these games and asks them which of the games \(A , B\) or \(C\) they prefer.
The results are summarised in the table below.
| Age range / Computer game | \(A\) | \(B\) | \(C\) |
|---|---|---|---|
| \(< 20\) | 8 | 15 | 6 |
| \(20 - 30\) | 21 | 12 | 9 |
| \(> 30\) | 6 | 10 | 13 |
  1. Write down hypotheses for a suitable test to assess Sharma's belief.
  2. For the test, calculate the expected frequency for
    1. those players aged under 20 who prefer game \(C\)
    2. those players aged between 20 and 30 who prefer game \(A\)
  3. State the degrees of freedom of the test statistic for this test.
Sharma correctly calculates the test statistic for this test to be 11.542 (to 3 decimal places).
  4. Using a \(5 \%\) significance level, and stating your critical value, comment on Sharma's belief.
Edexcel FS1 AS Specimen Q1
8 marks Standard +0.3
  1. A university foreign language department carried out a survey of prospective students to find out which of three languages they were most interested in studying.
A random sample of 150 prospective students gave the following results.
| Gender / Language | French | Spanish | Mandarin |
|---|---|---|---|
| Male | 23 | 22 | 20 |
| Female | 38 | 32 | 15 |
A test is carried out at the \(1 \%\) level of significance to determine whether or not there is an association between gender and choice of language.
  1. State the null hypothesis for this test.
  2. Show that the expected frequency for females choosing Spanish is 30.6
  3. Calculate the test statistic for this test, stating the expected frequencies you have used.
  4. State whether or not the null hypothesis is rejected. Justify your answer.
  5. Explain whether or not the null hypothesis would be rejected if the test was carried out at the \(10 \%\) level of significance.
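The expected frequencies and test statistic asked for above can be reproduced with a short helper. This is a sketch under the usual independence assumption; `chi_squared` is a hypothetical helper name, not part of any library.

```python
def chi_squared(observed):
    """Return (expected table, sum of (O - E)^2 / E) for a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return expected, statistic

# Rows: Male, Female; columns: French, Spanish, Mandarin.
observed = [[23, 22, 20], [38, 32, 15]]
expected, statistic = chi_squared(observed)
female_spanish = expected[1][1]
print(round(female_spanish, 1), round(statistic, 3))
```

The same statistic serves both the 1% and the 10% parts; only the tabulated critical value for 2 degrees of freedom changes.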
Edexcel FS1 2024 June Q3
6 marks Standard +0.3
  1. Tisam took a survey of students' favourite colours. The results are summarised in the table below.
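| Year group / Colour | Red | Blue | Green | Yellow | Black | Total |
|---|---|---|---|---|---|---|
| 1-5 | 34 | 15 | 14 | 22 | 3 | 88 |
| 6-9 | 23 | 32 | 12 | 9 | 8 | 84 |
| 10-12 | 5 | 28 | 19 | 8 | 8 | 68 |
| Total | 62 | 75 | 45 | 39 | 19 | 240 |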
Tisam carries out a suitable test to see if there is any association between favourite colour and year group.
  1. Write down the hypotheses for a suitable test.
  2. For her table, Tisam only needs to check one cell to show that none of the expected frequencies are less than 5
    1. Identify this cell, giving your reason.
    2. Calculate the expected frequency for this cell.
The test statistic for Tisam's test is 38.449
  3. Using a \(1 \%\) level of significance, complete the test. You should state your critical value and conclusion clearly.
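The "check one cell" idea can be sketched as follows. Because every expected frequency is a row total times a column total divided by the grand total, the smallest one comes from the smallest row total and the smallest column total. The variable names below are illustrative.

```python
year_groups = ["1-5", "6-9", "10-12"]
colours = ["Red", "Blue", "Green", "Yellow", "Black"]
row_totals = [88, 84, 68]          # totals for each year group
col_totals = [62, 75, 45, 39, 19]  # totals for each colour
n = sum(row_totals)

# Every expected frequency is (row total * column total) / n, so the smallest
# one pairs the smallest row total with the smallest column total.
i = row_totals.index(min(row_totals))
j = col_totals.index(min(col_totals))
smallest_cell = (year_groups[i], colours[j])
smallest_expected = row_totals[i] * col_totals[j] / n

degrees_of_freedom = (len(row_totals) - 1) * (len(col_totals) - 1)
print(smallest_cell, round(smallest_expected, 2), degrees_of_freedom)
```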
AQA S2 2011 June Q3
10 marks Standard +0.3
  1. State the null hypothesis that Emily used.
  2. Find the value of the test statistic, \(X ^ { 2 }\), giving your answer to one decimal place.
  3. State, in context, the conclusion that Emily should reach based on the results of her \(\chi ^ { 2 }\) test.
  4. Make one comment on the GCSE performances of 16-year-old students attending Bailey Language School.
  5. Emily's friend, Joanna, used the same data to correctly conduct a \(\chi ^ { 2 }\) test using the \(10 \%\) level of significance. State, with justification, the conclusion that Joanna should reach.
OCR MEI Further Statistics A AS Specimen Q3
10 marks Standard +0.3
3 In this question you must show detailed reasoning. A student is investigating what people think about organic food. She wishes to see if there is any difference between the opinions of females and males. She takes a random sample of 100 people and asks each of them if they think that organic food is better for their health than non-organic food. She will use the data to conduct a hypothesis test. The table below shows the opinions of these 100 people.
| Opinion on organic food / Sex | Female | Male |
|---|---|---|
| Organic better | 35 | 18 |
| Not better | 22 | 25 |
  1. Explain why the student should use a random sample.
  2. Carry out a test at the \(5 \%\) significance level to examine whether there is any association between a person's sex and their opinion on organic food. Show your calculations.
OCR MEI Further Statistics Minor 2020 November Q3
8 marks Standard +0.3
3 In this question you must show detailed reasoning. In a survey into pet ownership, one of the questions was 'Do you own either a cat or a dog (or both)?'. A total of 121 people took part in the survey and you should assume that they form a random sample of people in a particular town. The results, classified by the age of the person being surveyed, are shown in Table 3.
Table 3
| Age / Ownership of cat or dog | Does own | Does not own |
|---|---|---|
| Over 45 years | 38 | 29 |
| Under 45 years | 23 | 31 |
Carry out a test at the \(10 \%\) significance level to investigate whether, for people in this town, there is any association between age and ownership of a cat or dog.
OCR FS1 AS 2017 December Q7
11 marks Standard +0.3
7 Josh is investigating whether sticking pins into a map at random, while blindfolded, provides a random sample of regions of the map. Josh divides the map into 49 squares of equal size and asks each of 98 friends to stick a pin into the map at random, while blindfolded. He then notes the number of pins in each square. To analyse the results he groups the squares as shown in the diagram.
DDDDDDD
DCCCCCD
DCBBBCD
DCBABCD
DCBBBCD
DCCCCCD
DDDDDDD
The results are summarised in the table.
| Region | A | B | C | D |
|---|---|---|---|---|
| Number of squares | 1 | 8 | 16 | 24 |
| Number of pins | 6 | 21 | 33 | 38 |
  1. Test at the \(10 \%\) significance level whether the use of pins in this way provides a random sample of regions of the map.
  2. What can be deduced from considering the different contributions to the test statistic?
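A sketch of the goodness-of-fit calculation for this question is given below. It assumes that, under the null hypothesis, each pin is equally likely to land in any of the 49 squares, and that region A (whose expected frequency is below 5) is pooled with region B; both choices are assumptions of this sketch rather than statements from a mark scheme.

```python
squares = {"A": 1, "B": 8, "C": 16, "D": 24}
pins = {"A": 6, "B": 21, "C": 33, "D": 38}

total_pins = sum(pins.values())
total_squares = sum(squares.values())

# Under H0 each pin is equally likely to land in any of the 49 squares.
expected = {r: total_pins * squares[r] / total_squares for r in squares}

# Region A has expected frequency below 5, so it is pooled with neighbouring region B.
observed_merged = [pins["A"] + pins["B"], pins["C"], pins["D"]]
expected_merged = [expected["A"] + expected["B"], expected["C"], expected["D"]]

contributions = [(o - e) ** 2 / e for o, e in zip(observed_merged, expected_merged)]
statistic = sum(contributions)
degrees_of_freedom = len(observed_merged) - 1
print(expected, [round(c, 3) for c in contributions], round(statistic, 3), degrees_of_freedom)
```

The list of individual contributions is the quantity the second part asks about.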
OCR Further Statistics 2018 March Q6
10 marks Standard +0.3
6 The captain of a sports team analyses the team's results according to the weather conditions, classified as "sunny" and "not sunny". The frequencies are shown in the following table.
| Weather / Results | Win | Draw | Lose |
|---|---|---|---|
| Sunny | 12 | 3 | 5 |
| Not sunny | 8 | 12 | 10 |
  1. Test at the \(5 \%\) significance level whether the team's performances are associated with weather conditions.
  2. (a) Identify the cell that gives the largest contribution to the test statistic.
    (b) Interpret your answer to part (ii)(a).
OCR FS1 AS 2018 March Q7
11 marks Standard +0.3
7 The numbers of students taking A levels in three subjects at a school were classified by the year in which they entered the school as follows.
| Year of entry / Subject | Mathematics | English | Physics |
|---|---|---|---|
| Year 7 | 17 | 16 | 7 |
| Year 12 | 13 | 2 | 5 |
The Head of the school carries out a significance test at the \(10 \%\) level to test whether subjects taken are independent of year of entry.
  1. Show that in carrying out the test it is necessary to combine columns.
  2. Suggest a reason why it is more sensible to combine the columns for Mathematics and Physics than the columns for Physics and English.
  3. Carry out the test.
  4. State which cell gives the largest contribution to the test statistic.
  5. Interpret your answer to part (iv).
AQA S2 2009 January Q1
11 marks Standard +0.3
1 Fortune High School gave its students a wider choice of subjects to study. The table shows the number of students, of each gender, who chose to study each of the additional subjects during the school year 2007/08.
Columns: Bulgarian | Climate Change | Finance | Polish
Male7312540
Female2242219
Assuming that these data form a random sample, use a \(\chi ^ { 2 }\) test, at the \(10 \%\) level of significance, to test whether the choice of these subjects is independent of gender.
(11 marks)
AQA S2 2007 June Q1
10 marks Standard +0.3
1 Two groups of patients, suffering from the same medical condition, took part in a clinical trial of a new drug. One of the groups was given the drug whilst the other group was given a placebo, a drug that has no physical effect on their medical condition. The table shows the number of patients in each group and whether or not their condition improved.
| | Placebo | Drug |
|---|---|---|
| Condition improved | 20 | 46 |
| Condition did not improve | 55 | 29 |
Conduct a \(\chi ^ { 2 }\) test, at the \(5 \%\) level of significance, to determine whether the condition of the patients at the conclusion of the trial is associated with the treatment that they were given.
(10 marks)
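For a \(2 \times 2\) table such as this one, the statistic is often computed with Yates' continuity correction. The sketch below assumes that correction is wanted here (the question itself does not say so) and uses illustrative variable names.

```python
# Rows: condition improved, did not improve; columns: Placebo, Drug.
observed = [[20, 46], [55, 29]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

statistic = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        # Yates' correction: shrink |O - E| by 0.5 before squaring (2x2 tables only).
        statistic += (abs(o - e) - 0.5) ** 2 / e

print(round(statistic, 2))
```

With a \(2 \times 2\) table there is 1 degree of freedom, so the comparison is with the tabulated \(\chi^2\) value for 1 degree of freedom at the 5% level.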
AQA S2 2009 June Q3
12 marks Standard +0.3
3 A sample survey, conducted to determine the attitudes of residents to a proposed reorganisation of local schools, gave the following results.
| Age of resident | Against reorganisation | Not against reorganisation |
|---|---|---|
| 16-17 | 9 | 2 |
| 18-21 | 17 | 10 |
| 22-49 | 115 | 90 |
| 50-65 | 41 | 34 |
| Over 65 | 3 | 4 |
Use a \(\chi ^ { 2 }\) test, at the \(5 \%\) level of significance, to determine whether there is an association between the ages of residents and their attitudes to the proposed reorganisation of local schools.
AQA Further AS Paper 2 Statistics 2018 June Q8
10 marks Standard +0.3
8 An insurance company groups its vehicle insurance policies into two categories, car insurance and motorbike insurance. The number of claims in a random sample of 80 policies was monitored and the results summarised in contingency Table 1.
Table 1
| Type of insurance policy / Number of claims | 0 | 1 | 2 | 3 or more | Total |
|---|---|---|---|---|---|
| Car | 9 | 10 | 11 | 5 | 35 |
| Motorbike | 19 | 13 | 8 | 5 | 45 |
| Total | 28 | 23 | 19 | 10 | 80 |
The insurance company decides to carry out a \(\chi ^ { 2 }\)-test for association between number of claims and type of insurance policy using the information given in Table 1.
  1. The contingency table shown in Table 2 gives some of the exact expected frequencies for this test. Complete Table 2 with the missing exact expected values.
    Table 2
    | Type of insurance policy / Number of claims | 0 | 1 | 2 | 3 or more |
    |---|---|---|---|---|
    | Car | | 10.0625 | | 4.375 |
    | Motorbike | | | 10.6875 | |
  2. Carry out the insurance company's test, using the \(10 \%\) level of significance.
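Exact expected values like those in Table 2 can be produced with rational arithmetic. A minimal sketch, assuming the usual row total × column total ÷ grand total formula; the names `expected` and `small` are illustrative.

```python
from fractions import Fraction

claims = ["0", "1", "2", "3 or more"]
observed = {"Car": [9, 10, 11, 5], "Motorbike": [19, 13, 8, 5]}

col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(col_totals)

# Exact expected values: row total * column total / grand total, kept as fractions.
expected = {
    policy: [Fraction(sum(row) * c, n) for c in col_totals]
    for policy, row in observed.items()
}
for policy, row in expected.items():
    print(policy, [float(e) for e in row])

# Columns where some expected frequency falls below 5 are candidates for merging.
small = [claims[j] for j in range(len(claims)) if min(e[j] for e in expected.values()) < 5]
print(small)
```

Any column flagged in `small` would normally be merged with a neighbouring column before the statistic is calculated, which also reduces the degrees of freedom.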
AQA Further AS Paper 2 Statistics 2022 June Q7
8 marks Standard +0.3
7 Wade and Odelia are investigating whether there is an association between the region where a person lives and the brand of washing powder they use. They decide to conduct a \(\chi ^ { 2 }\)-test for association and survey a random sample of 200 people. The expected frequencies for the test have been calculated and are shown in the contingency table below.
AQA Further AS Paper 2 Statistics 2023 June Q7
10 marks Standard +0.3
7 A theatre has morning, afternoon and evening shows. On one particular day, the theatre asks all of its customers to state whether they enjoyed or did not enjoy the show. The results are summarised in the table.
| | Morning show | Afternoon show | Evening show | Total |
|---|---|---|---|---|
| Enjoyed | 62 | 91 | 172 | 325 |
| Not enjoyed | 25 | 35 | 115 | 175 |
| Total | 87 | 126 | 287 | 500 |
The theatre claims that there is no association between the show that a customer attends and whether they enjoyed the show.
  1. Investigate the theatre's claim, using a \(2.5 \%\) level of significance.
  2. By considering observed and expected frequencies, interpret in context the association between the show that a customer attends and whether they enjoyed the show.
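Both parts rest on the cell-by-cell comparison of observed and expected frequencies. The sketch below prints each cell's contribution; it is an illustrative check with hypothetical variable names, not a model answer.

```python
shows = ["Morning", "Afternoon", "Evening"]
observed = {"Enjoyed": [62, 91, 172], "Not enjoyed": [25, 35, 115]}

col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(col_totals)

statistic = 0.0
for opinion, row in observed.items():
    for show, o, c in zip(shows, row, col_totals):
        e = sum(row) * c / n
        contribution = (o - e) ** 2 / e
        statistic += contribution
        # The sign of O - E shows whether the cell is over- or under-represented.
        print(f"{opinion:12s} {show:10s} O={o:3d} E={e:7.2f} contribution={contribution:.3f}")

print(round(statistic, 3))
```

The table has 2 degrees of freedom, so the total is compared with the tabulated \(\chi^2\) value for 2 degrees of freedom at the 2.5% level.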
AQA Further AS Paper 2 Statistics 2024 June Q2
1 mark Easy -1.2
2 A test for association is to be carried out. The tables below show the observed frequencies and the expected frequencies that are to be used for the test.
| Observed | X | Y | Z |
|---|---|---|---|
| A | 28 | 6 | 66 |
| B | 8 | 8 | 4 |
| C | 54 | 16 | 10 |

| Expected | X | Y | Z |
|---|---|---|---|
| A | 45 | 15 | 40 |
| B | 9 | 3 | 8 |
| C | 36 | 12 | 32 |
It is necessary to merge some rows or columns before the test can be carried out.
Find the entry in the tables that provides evidence for this.
Circle your answer.
[1 mark]
Observed A-Z
Observed B-Z
Expected A-X
Expected B-Y
AQA Further Paper 3 Statistics 2019 June Q6
9 marks Standard +0.3
6 During August, 102 candidates took their driving test at centre \(A\) and 60 passed. During the same month, 110 candidates took their driving test at centre \(B\) and 80 passed.
  1. Test whether the driving test result is independent of the driving test centre using the \(5 \%\) level of significance.
  2. Rebecca claims that if the result of the test in part (a) is to reject the null hypothesis then it is easier to pass a driving test at centre \(B\) than centre \(A\). State, with a reason, whether or not you agree with Rebecca's claim.
AQA Further Paper 3 Statistics 2020 June Q8
6 marks Standard +0.3
8 Ray is conducting a hypothesis test with the hypotheses
\(\mathrm { H } _ { 0 }\) : There is no association between time of day and number of snacks eaten
\(\mathrm { H } _ { 1 }\) : There is an association between time of day and number of snacks eaten
He calculates expected frequencies correct to two decimal places, which are given in the following table.
| Time of day / Number of snacks eaten | 0 | 1 | 2 or more |
|---|---|---|---|
| Day | 23.68 | 21.05 | 5.26 |
| Night | 21.32 | 18.95 | 4.74 |
Ray calculates his test statistic using \(\sum \frac { ( O - E ) ^ { 2 } } { E }\)
  1. State, with a reason, the error Ray has made and describe any changes Ray will need to make to his test.
  2. Having made the necessary corrections as described in part (a), the correct value of the test statistic is 8.74
    Complete Ray's hypothesis test using a \(1 \%\) level of significance.
AQA Further Paper 3 Statistics 2021 June Q6
7 marks Moderate -0.5
6 Danai is investigating the number of speeding offences in different towns in a country. She carries out a hypothesis test to test for association between town and number of speeding offences per year.
  1. State the hypotheses for this test.
  2. The observed frequencies, \(O\), have been collected and the expected frequencies, \(E\), have been calculated in an \(n \times m\) contingency table, where \(n > 3\) and \(m > 3\)
    One of the values of \(E\) is less than 5
    1. Explain what steps Danai should take before calculating the test statistic.
    2. State an expression for the test statistic Danai should calculate.
  3. Danai correctly calculates the value of the test statistic to be 45.22
    The number of degrees of freedom for the test is 25
    Determine the outcome of Danai's test, using the \(1 \%\) level of significance.
AQA Further Paper 3 Statistics 2023 June Q5
8 marks Standard +0.3
5 A school management team oversees 11 different schools.
The school management team allows each student in the schools to choose one enrichment activity from 11 possible activities. The school management team count the number of students in each school choosing each of the possible activities. They then conduct a \(\chi ^ { 2 }\)-test for association with the data they have gathered.
  1. Exactly one of the calculated expected frequencies for the \(\chi ^ { 2 }\)-test is less than 5
    Explain why the number of degrees of freedom for the test is 90
  2. The school management team claims that there is an association between the school a student attends and the activity they choose. The test statistic is 124.8
    Test the claim using the \(1 \%\) level of significance.
  3. During the hypothesis test, the value of \(\frac { ( O - E ) ^ { 2 } } { E }\), where \(O\) is the observed frequency and \(E\) is the expected frequency, was calculated for each group of students. The values for four groups of students are shown in the table below.
    | Group | \(\frac { ( O - E ) ^ { 2 } } { E }\) |
    |---|---|
    | Attends school 3 and chose activity 1 | 0.01 |
    | Attends school 8 and chose activity 3 | 18.5 |
    | Attends school 8 and chose activity 7 | 24.2 |
    | Attends school 11 and chose activity 7 | 49.0 |
    State, with a reason, which of the four groups of students represents the strongest source of association.
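The degrees-of-freedom claim in the first part of this question can be illustrated with a short calculation. This sketch assumes the single small expected frequency is dealt with by merging one school (or, equivalently for the count, one activity) with another; the function name is hypothetical.

```python
def degrees_of_freedom(rows, cols):
    """Degrees of freedom for a test of association on a rows x cols table."""
    return (rows - 1) * (cols - 1)

full_table = degrees_of_freedom(11, 11)    # 11 schools by 11 activities
after_merge = degrees_of_freedom(10, 11)   # one school (or activity) merged with another
print(full_table, after_merge)
```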
AQA Further Paper 3 Statistics 2024 June Q9
11 marks Standard +0.3
9 A company owns three shops, A, B and C, which are based in different towns. Each shop gives a questionnaire to 250 of their customers, and every customer completes the questionnaire. One of the questions asks whether the customer rates the shop as good, satisfactory or poor. For shop A, \(26 \%\) of customers rate the shop as good and \(38 \%\) of customers rate the shop as poor. For shop B, \(32 \%\) of customers rate the shop as good and \(40 \%\) of customers rate the shop as satisfactory. Altogether, there are 210 good ratings and 261 satisfactory ratings.
  1. Complete the following table with the observed frequencies.
    | Shop / Rating | Good | Satisfactory | Poor |
    |---|---|---|---|
    | A | | | |
    | B | | | |
    | C | | | |
  2. Carry out a test for association between shop and rating, using the \(1 \%\) level of significance.
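The observed table can be built up from the percentages and totals given in the question, after which the statistic follows in the usual way. The sketch below is an illustrative check with hypothetical variable names; it assumes each shop's 250 ratings are split only between good, satisfactory and poor.

```python
customers_per_shop = 250

# Counts stated directly as percentages of 250; integer arithmetic avoids rounding.
a_good = 26 * customers_per_shop // 100
a_poor = 38 * customers_per_shop // 100
b_good = 32 * customers_per_shop // 100
b_satisfactory = 40 * customers_per_shop // 100

# Remaining cells follow from the row totals (250 each) and the column totals (210, 261).
a_satisfactory = customers_per_shop - a_good - a_poor
b_poor = customers_per_shop - b_good - b_satisfactory
c_good = 210 - a_good - b_good
c_satisfactory = 261 - a_satisfactory - b_satisfactory
c_poor = customers_per_shop - c_good - c_satisfactory

observed = [
    [a_good, a_satisfactory, a_poor],
    [b_good, b_satisfactory, b_poor],
    [c_good, c_satisfactory, c_poor],
]

col_totals = [sum(col) for col in zip(*observed)]
n = sum(col_totals)
statistic = sum(
    (o - sum(row) * c / n) ** 2 / (sum(row) * c / n)
    for row in observed
    for o, c in zip(row, col_totals)
)
print(observed, round(statistic, 3))
```

A \(3 \times 3\) table has 4 degrees of freedom, so the statistic is compared with the tabulated \(\chi^2\) value for 4 degrees of freedom at the 1% level.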