Chi-squared test of independence

A question belongs to this type if and only if it tests whether two categorical variables are independent, using a contingency table and a chi-squared test.
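As a rough illustration of the calculation these questions call for, the sketch below computes the expected frequencies, the test statistic and the degrees of freedom for a contingency table in plain Python. The function names are hypothetical, no continuity correction is applied (some specifications expect Yates' correction for 2 × 2 tables), and the comparison with a tabulated critical value is left out. The sample data are the observed counts from the first question below.

```python
def expected_frequencies(observed):
    """Return expected counts E = (row total x column total) / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared_statistic(observed):
    """Sum (O - E)^2 / E over every cell (no continuity correction)."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """(rows - 1) x (columns - 1) for an r x c contingency table."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# Observed counts from the salary / university-education table (totals excluded).
salary_table = [[52, 78], [63, 57]]
print(expected_frequencies(salary_table))
print(chi_squared_statistic(salary_table))
print(degrees_of_freedom(salary_table))
```

The margins are recomputed from the cell counts, so the table passed in should contain only the observed cells, not the "Total" row or column.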

157 questions · Standard +0.2

AQA S2 2008 January Q6
11 marks Moderate -0.3
6 A survey is carried out in an attempt to determine whether the salary achieved by the age of 30 is associated with having had a university education. The results of this survey are given in the table.
| | Salary < £30000 | Salary \(\geq\) £30000 | Total |
|---|---|---|---|
| University education | 52 | 78 | 130 |
| No university education | 63 | 57 | 120 |
| Total | 115 | 135 | 250 |
  1. Use a \(\chi ^ { 2 }\) test, at the \(10 \%\) level of significance, to determine whether the salary achieved by the age of 30 is associated with having had a university education.
  2. What do you understand by a Type I error in this context?
AQA S2 2010 January Q4
10 marks Standard +0.3
4 Julie, a driving instructor, believes that the first-time performances of her students in their driving tests are associated with their ages. Julie's records of her students' first-time performances in their driving tests are shown in the table.
AgePassFail
\(\mathbf { 1 7 } - \mathbf { 1 8 }\)2820
\(\mathbf { 1 9 } - \mathbf { 3 0 }\)214
\(\mathbf { 3 1 } - \mathbf { 3 9 }\)1233
\(\mathbf { 4 0 } - \mathbf { 6 0 }\)65
  1. Use a \(\chi ^ { 2 }\) test at the \(1 \%\) level of significance to investigate Julie's belief.
  2. Interpret your result in part (a) as it relates to the 17-18 age group.
AQA S2 2011 January Q2
11 marks Standard +0.3
2 It is claimed that the way in which students voted at a particular general election was independent of their gender. In order to investigate this claim, 480 male and 540 female students who voted at this general election were surveyed. These students may be regarded as a random sample. The percentages of males and females who voted for the different parties are recorded in the table.
| | Conservative | Labour | Liberal Democrat | Other parties |
|---|---|---|---|---|
| Male | 32.5 | 30 | 25 | 12.5 |
| Female | 40 | 25 | 20 | 15 |
  1. Complete the contingency table below.
  2. Hence determine, at the \(1 \%\) level of significance, whether the way in which students voted at this general election was independent of their gender.
    | | Conservative | Labour | Liberal Democrat | Other parties | Total |
    |---|---|---|---|---|---|
    | Male | | | | | 480 |
    | Female | | | | | 540 |
    | Total | | | | | 1020 |
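When a question gives percentages rather than counts, as here, the observed frequencies have to be recovered from the group sizes before any \(\chi ^ { 2 }\) calculation. A minimal sketch of that conversion follows; the function name is hypothetical, and the check for whole-number results is an added safeguard rather than part of the question.

```python
def percentages_to_counts(percentages, group_size):
    """Convert the percentages for one group into observed frequencies."""
    counts = [group_size * p / 100 for p in percentages]
    # Observed frequencies must be whole numbers; flag anything that is not.
    for c in counts:
        if abs(c - round(c)) > 1e-9:
            raise ValueError(f"non-integer frequency: {c}")
    return [round(c) for c in counts]


# Percentages and group sizes as given in the voting question.
male = percentages_to_counts([32.5, 30, 25, 12.5], 480)
female = percentages_to_counts([40, 25, 20, 15], 540)
print(male, sum(male))
print(female, sum(female))
```

Each row of counts should add up to its group size, which gives a quick check on the arithmetic.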
AQA S2 2012 January Q3
13 marks Standard +0.3
3
  1. Table 1 contains the observed frequencies, \(a , b , c\) and \(d\), relating to the two attributes, \(X\) and \(Y\), required to perform a \(\chi ^ { 2 }\) test.
    Table 1
    | | \(Y\) | Not \(Y\) | Total |
    |---|---|---|---|
    | \(X\) | \(a\) | \(b\) | \(m\) |
    | Not \(X\) | \(c\) | \(d\) | \(n\) |
    | Total | \(p\) | \(q\) | \(N\) |
    1. Write down, in terms of \(m , n , p , q\) and \(N\), expressions for the 4 expected frequencies corresponding to \(a , b , c\) and \(d\).
    2. Hence prove that the sum of the expected frequencies is \(N\).
  2. Andy, a tennis player, wishes to investigate the possible effect of wind conditions on the results of his matches. The results of his matches for the 2011 season are represented in Table 2.
    Table 2
    | | Windy | Not windy | Total |
    |---|---|---|---|
    | Won | 15 | 18 | 33 |
    | Lost | 12 | 5 | 17 |
    | Total | 27 | 23 | 50 |
    Conduct a \(\chi ^ { 2 }\) test, at the \(10 \%\) level of significance, to investigate whether there is an association between Andy's results and wind conditions.
    (8 marks)
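For the first part of the question above (Table 1), one way to set out the standard result is sketched here. It assumes the usual rule that each expected frequency is (row total × column total) / grand total; the symbols \(E _ { a } , E _ { b } , E _ { c } , E _ { d }\) are labels introduced only for this sketch.

$$E _ { a } = \frac { m p } { N } , \quad E _ { b } = \frac { m q } { N } , \quad E _ { c } = \frac { n p } { N } , \quad E _ { d } = \frac { n q } { N }$$

$$E _ { a } + E _ { b } + E _ { c } + E _ { d } = \frac { m ( p + q ) + n ( p + q ) } { N } = \frac { ( m + n ) ( p + q ) } { N } = \frac { N \times N } { N } = N$$

The last step uses \(m + n = N\) and \(p + q = N\), both read directly from the margins of Table 1.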
AQA S2 2013 January Q2
12 marks Moderate -0.3
2 A large estate agency would like all the properties that it handles to be sold within three months. A manager wants to know whether the type of property affects the time taken to sell it. The data for a random sample of properties sold are tabulated below.
Type of property
| | Flat | Terraced | Semi-detached | Detached | Total |
|---|---|---|---|---|---|
| Sold within three months | 4 | 34 | 28 | 18 | 84 |
| Sold in more than three months | 9 | 18 | 8 | 6 | 41 |
| Total | 13 | 52 | 36 | 24 | 125 |
  1. Conduct a \(\chi ^ { 2 }\)-test, at the \(10 \%\) level of significance, to determine whether there is an association between the type of property and the time taken to sell it. Explain why it is necessary to combine two columns before carrying out this test.
  2. The manager plans to spend extra money on advertising for one type of property in an attempt to increase the number sold within three months. Explain why the manager might choose:
    1. terraced properties;
    2. flats.
      (2 marks)
AQA S2 2005 June Q2
10 marks Standard +0.3
2 Syd, a snooker player, believes that the outcome of any frame of snooker in which he plays may be influenced by the time of day that the frame takes place. The results of 100 randomly selected frames of snooker, played by Syd, are recorded below.
| | Afternoon | Evening | Total |
|---|---|---|---|
| Win | 30 | 24 | 54 |
| Lose | 18 | 28 | 46 |
| Total | 48 | 52 | 100 |
Use a \(\chi ^ { 2 }\) test, at the \(5 \%\) level of significance, to test Syd's belief.
(10 marks)
AQA S2 2006 June Q4
13 marks Moderate -0.3
4 It is claimed that the area within which a school is situated affects the age profile of the staff employed at that school. In order to investigate this claim, the age profiles of staff employed at two schools with similar academic achievements are compared. Academia High School, situated in a rural community, employs 120 staff whilst Best Manor Grammar School, situated in an inner-city community, employs 80 staff. The percentage of staff within each age group, for each school, is given in the table.
| Age | Academia High School | Best Manor Grammar School |
|---|---|---|
| 22-34 | 17.5 | 40.0 |
| 35-39 | 60.0 | 45.0 |
| 40-59 | 22.5 | 15.0 |
    1. Form the data into a contingency table suitable for analysis using a \(\chi ^ { 2 }\) distribution.
      (2 marks)
    2. Use a \(\chi ^ { 2 }\) test, at the \(1 \%\) level of significance, to determine whether there is an association between the age profile of the staff employed and the area within which the school is situated.
  2. Interpret your result in part (a)(ii) as it relates to the 22-34 age group.
AQA S2 2008 June Q1
9 marks Standard +0.3
1 It is thought that the incidence of asthma in children is associated with the volume of traffic in the area where they live. Two surveys of children were conducted: one in an area where the volume of traffic was heavy and the other in an area where the volume of traffic was light. For each area, the table shows the number of children in the survey who had asthma and the number who did not have asthma.
| | Asthma | No asthma | Total |
|---|---|---|---|
| Heavy traffic | 52 | 58 | 110 |
| Light traffic | 28 | 62 | 90 |
| Total | 80 | 120 | 200 |
  1. Use a \(\chi ^ { 2 }\) test, at the \(5 \%\) level of significance, to determine whether the incidence of asthma in children is associated with the volume of traffic in the area where they live.
  2. Comment on the number of children in the survey who had asthma, given that they lived in an area where the volume of traffic was heavy.
AQA S2 2011 June Q2
11 marks Moderate -0.3
2
  1. The continuous random variable \(X\) has a rectangular distribution defined by the probability density function $$f ( x ) = \begin{cases} 0.01 \pi & u \leqslant x \leqslant 11 u \\ 0 & \text { otherwise } \end{cases}$$ where \(u\) is a constant.
    1. Show that \(u = \frac { 10 } { \pi }\).
    2. Using the formulae for the mean and the variance of a rectangular distribution, find, in terms of \(\pi\), values for \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\).
    3. Calculate exact values for the mean and the variance of the circumferences of circles having diameters of length \(\left( X + \frac { 10 } { \pi } \right)\).
  2. A machine produces circular discs which have an area of \(Y \mathrm {~cm} ^ { 2 }\). The distribution of \(Y\) has mean \(\mu\) and variance 25. A random sample of 100 such discs is selected. The mean area of the discs in this sample is calculated to be \(40.5 \mathrm {~cm} ^ { 2 }\). Calculate a 95\% confidence interval for \(\mu\).
    Emily believed that the performances of 16-year-old students in their GCSEs are associated with the schools that they attend. To investigate her belief, Emily collected data on the GCSE results for 2010 from four schools in her area. The table shows Emily's collected data, denoted by \(O _ { i }\), together with the corresponding expected frequencies, \(E _ { i }\), necessary for a \(\chi ^ { 2 }\) test.
    | | \(\geq 5\) GCSEs: \(O _ { i }\) | \(\geq 5\) GCSEs: \(E _ { i }\) | \(1 \leqslant\) GCSEs \(< 5\): \(O _ { i }\) | \(1 \leqslant\) GCSEs \(< 5\): \(E _ { i }\) | No GCSEs: \(O _ { i }\) | No GCSEs: \(E _ { i }\) |
    |---|---|---|---|---|---|---|
    | Jolliffe College for the Arts | 187 | 193.15 | 93 | 90.62 | 30 | 26.23 |
    | Volpe Science Academy | 175 | 184.43 | 97 | 86.52 | 24 | 25.05 |
    | Radok Music School | 183 | 183.81 | 78 | 86.23 | 34 | 24.96 |
    | Bailey Language School | 265 | 248.61 | 112 | 116.63 | 22 | 33.76 |
    Emily used these values to correctly conduct a \(\chi ^ { 2 }\) test at the \(1 \%\) level of significance.
AQA S2 2012 June Q6
11 marks Standard +0.3
6 Fiona, a lecturer in a school of engineering, believes that there is an association between the class of degree obtained by her students and the grades that they had achieved in A-level Mathematics. In order to investigate her belief, she collected the relevant data on the performances of a random sample of 200 recent graduates who had achieved grades A or B in A-level Mathematics. These data are tabulated below.
Class of degree
| A-level grade | 1 | 2(i) | 2(ii) | 3 | Total |
|---|---|---|---|---|---|
| A | 20 | 36 | 22 | 2 | 80 |
| B | 9 | 55 | 48 | 8 | 120 |
| Total | 29 | 91 | 70 | 10 | 200 |
  1. Conduct a \(\chi ^ { 2 }\) test, at the \(1 \%\) level of significance, to determine whether Fiona's belief is justified.
  2. Make two comments on the degree performance of those students in this sample who achieved a grade B in A-level Mathematics.
AQA S2 2013 June Q2
10 marks Standard +0.3
2 A town council wanted residents to apply for grants that were available for home insulation. In a trial, a random sample of 200 residents was encouraged, either in a letter or by a phone call, to apply for the grants. The outcomes are shown in the table.
| | Applied for grant | Did not apply for grant | Total |
|---|---|---|---|
| Letter | 30 | 130 | 160 |
| Phone call | 14 | 26 | 40 |
| Total | 44 | 156 | 200 |
  1. The council believed that a phone call was more effective than a letter in encouraging people to apply for a grant. Use a \(\chi ^ { 2 }\)-test to investigate this belief at the \(5 \%\) significance level.
  2. After the trial, all the residents in the town were encouraged, either in a letter or by a phone call, to apply for the grants. It was found that there was no association between the method of encouragement and the outcome. State, with a reason, whether a Type I error, a Type II error or neither occurred in carrying out the test in part (a).
    (2 marks)
AQA S2 2014 June Q2
11 marks Moderate -0.3
2 A large multinational company recruits employees from all four countries in the UK. For a sample of 250 recruits, the percentages of males and females from each of the countries are shown in Table 1.
Table 1
| | England | Scotland | Wales | Northern Ireland |
|---|---|---|---|---|
| Male | 22.8 | 17.6 | 10.8 | 6.8 |
| Female | 15.6 | 17.2 | 7.6 | 1.6 |
  1. Add the frequencies to the contingency table, Table 2, below.
  2. Carry out a \(\chi ^ { 2 }\)-test at the \(10 \%\) significance level to investigate whether there is an association between country and gender of recruits.
  3. By comparing observed and expected values, make one comment about the distribution of female recruits.
    [1 mark]
    Table 2
    | | England | Scotland | Wales | Northern Ireland | Total |
    |---|---|---|---|---|---|
    | Male | | | | | 145 |
    | Female | | | | | 105 |
    | Total | | | | | 250 |
AQA S2 2015 June Q5
10 marks Standard +0.3
5 In a particular town, a survey was conducted on a sample of 200 residents aged 41 years to 50 years. The survey questioned these residents to discover the age at which they had left full-time education and the greatest rate of income tax that they were paying at the time of the survey. The summarised data obtained from the survey are shown in the table.
Age when leaving education (years)
| Greatest rate of income tax paid | 16 or less | 17 or 18 | 19 or more | Total |
|---|---|---|---|---|
| Zero | 32 | 3 | 4 | 39 |
| Basic | 102 | 12 | 17 | 131 |
| Higher | 17 | 5 | 8 | 30 |
| Total | 151 | 20 | 29 | 200 |
  1. Use a \(\chi ^ { 2 }\)-test, at the \(5 \%\) level of significance, to investigate whether there is an association between age when leaving education and greatest rate of income tax paid.
  2. It is believed that residents of this town who had left education at a later age were more likely to be paying the higher rate of income tax. Comment on this belief.
    [1 mark]
AQA S2 2016 June Q5
13 marks Standard +0.3
5 A car manufacturer keeps a record of how many of the new cars that it has sold experience mechanical problems during the first year. The manufacturer also records whether the cars have a petrol engine or a diesel engine. Data for a random sample of 250 cars are shown in the table.
| | Problems during first 3 months | Problems during first year but after first 3 months | No problems during first year | Total |
|---|---|---|---|---|
| Petrol engine | 10 | 35 | 170 | 215 |
| Diesel engine | 4 | 8 | 23 | 35 |
| Total | 14 | 43 | 193 | 250 |
  1. Use a \(\chi ^ { 2 }\)-test to investigate, at the \(10 \%\) significance level, whether there is an association between the mechanical problems experienced by a new car from this manufacturer and the type of engine.
  2. Arisa is planning to buy a new car from this manufacturer. She would prefer to buy a car with a diesel engine, but a friend has told her that cars with diesel engines experience more mechanical problems. Based on your answer to part (a), state, with a reason, the advice that you would give to Arisa.
    [2 marks]
Edexcel S3 Q4
11 marks Standard +0.3
4. A group of 40 males and 40 females were asked which of three "Reality TV" shows they liked most - Watched, Stranded or One-2-Win. The results were as follows:
| | Watched | Stranded | One-2-Win |
|---|---|---|---|
| Males | 21 | 6 | 13 |
| Females | 15 | 10 | 15 |
Stating your hypotheses clearly, test at the \(10 \%\) level whether or not there is a significant difference in the preferences of males and females.
Edexcel S3 Q4
11 marks Standard +0.3
4. A hospital administrator is assessing staffing needs for its Accident and Emergency Department at different times of day. The administrator already has data on the number of admissions at different times of day but needs to know if the proportion of the cases that are serious remains constant. Staff are asked to assess whether each person arriving at Accident and Emergency has a "minor" or "serious" problem and the results for three different time periods are shown below.
| | Minor | Serious |
|---|---|---|
| 8 a.m. - 6 p.m. | 45 | 11 |
| 6 p.m. - 2 a.m. | 49 | 22 |
| 2 a.m. - 8 a.m. | 14 | 7 |
Stating your hypotheses clearly, test at the \(5 \%\) level of significance whether or not there is evidence of the proportion of serious injuries being different at different times of day.
(11 marks)
Edexcel S3 Q6
11 marks Standard +0.3
6. Two schools in the same town advertise at the same time for new heads of English and History departments. The number of applicants for each post are shown in the table below.
| | English | History |
|---|---|---|
| Highfield School | 32 | 14 |
| Rowntree School | 48 | 26 |
Stating your hypotheses clearly, test at the \(10 \%\) level of significance whether or not there is evidence of the proportion of applicants for each job being different in the two schools.
(11 marks)
Edexcel S3 Q6
15 marks Standard +0.3
6. A survey found that of the 320 people questioned who had passed their driving test aged under twenty-five, 104 had been involved in an accident in the two years following their test. Of the 80 people in the survey who were aged twenty-five or over when they passed their test, 16 had been involved in an accident in the following two years.
  1. Draw up a contingency table showing this information.
    It is desired to test whether the proportion of drivers having accidents within two years of passing their test is different for those who were aged under twenty-five at the time of passing their test than for those aged twenty-five or over.
    1. Stating your hypotheses clearly, carry out the test at the \(5 \%\) level of significance.
    2. Explain clearly why there is only one degree of freedom.
    It is found that 12 people who were aged under twenty-five when they took their test and had been involved in an accident in the following two years had been omitted from the information given.
  2. Explain why you do not need to repeat the calculation to know the correct result of the test.
    (2 marks)
Edexcel S3 Q6
14 marks Standard +0.3
6. A market researcher recorded the number of adverts for vehicles in each of three categories on ITV, Channel 4 and Channel 5 over a period of time. The results are shown in the table below.
ITVChannel 4Channel 5
Family Saloon693528
Sports Car202818
Off-road Vehicle12228
  1. Stating your hypotheses clearly, test at the \(5 \%\) level of significance whether or not there is evidence of the proportion of adverts for each type of vehicle being dependent on the channel.
  2. Suggest a reason for your result in part (a).
Edexcel S3 Q5
13 marks Standard +0.3
5. A Policy Unit wished to find out whether attitudes to the European Union varied with age. It conducted a survey asking 200 individuals to which of three age groups they belonged and whether they regarded themselves as generally pro-Europe or Eurosceptic. The results are shown in the table below.
| | Pro-Europe | Eurosceptic |
|---|---|---|
| \(18 - 34\) years | 43 | 21 |
| \(35 - 54\) years | 30 | 36 |
| 55 years or over | 27 | 43 |
  1. Stating your hypotheses clearly, test at the \(5 \%\) level of significance whether attitudes to Europe are associated with age.
    (11 marks)
    The survey also asked people if they voted at the last election. When the above test was repeated using only the results from those who had voted, a value of 4.872 was calculated for \(\sum \frac { ( O - E ) ^ { 2 } } { E }\). No classes were combined.
  2. Find if this value leads to a different result.
OCR MEI Further Statistics Minor 2019 June Q5
16 marks Standard +0.3
5 A student wants to know if there is a positive correlation between the amounts of two pollutants, sulphur dioxide and PM10 particulates, on different days in the area of London in which he lives; these amounts, measured in suitable units, are denoted by \(s\) and \(p\) respectively.
He uses a government website to obtain data for a random sample of 15 days on which the amounts of these pollutants were measured simultaneously. Fig. 5.1 is a scatter diagram showing the data. Summary statistics for these 15 values of \(s\) and \(p\) are as follows.
\(\sum s = 155.4 \quad \sum p = 518.9 \quad \sum s ^ { 2 } = 2322.7 \quad \sum p ^ { 2 } = 21270.5 \quad \sum s p = 6009.1\)
[Fig. 5.1: scatter diagram of the data]
  1. Explain why the student might come to the conclusion that a test based on Pearson's product moment correlation coefficient may be valid.
  2. Find the value of Pearson's product moment correlation coefficient.
  3. Carry out a test at the \(5 \%\) significance level to investigate whether there is positive correlation between the amounts of sulphur dioxide and PM10 particulates.
  4. Explain why the student made sure that the sample chosen was a random sample.
    The student also wishes to model the relationship between the amounts of nitrogen dioxide \(n\) and PM10 particulates \(p\).
    He takes a random sample of 54 values of the two variables, both measured at the same times. Fig. 5.2 is a scatter diagram which shows the data, together with the regression line of \(n\) on \(p\), the equation of the regression line and the value of \(r ^ { 2 }\).
    [Fig. 5.2: scatter diagram with the regression line of \(n\) on \(p\)]
  5. Predict the value of \(n\) for \(p = 150\).
  6. Discuss the reliability of your prediction in part (e).
OCR MEI Further Statistics Minor 2024 June Q4
12 marks Moderate -0.3
4 A genetics researcher is investigating whether there is any association between natural hair colour and natural eye colour. A random sample of 800 adults is selected. Each adult can categorise their natural hair colour as blonde, brown, black or red and their natural eye colour as brown, blue or green.
  1. Explain the benefit of using a random sample in this investigation.
    The data collected from the sample are summarised in Table 4.1.
    Table 4.1: observed frequencies (rows: eye colour; columns: hair colour)
    | Eye colour | Blonde | Brown | Black | Red | Total |
    |---|---|---|---|---|---|
    | Brown | 47 | 153 | 196 | 36 | 432 |
    | Blue | 61 | 78 | 115 | 26 | 280 |
    | Green | 19 | 22 | 31 | 16 | 88 |
    | Total | 127 | 253 | 342 | 78 | 800 |
    The researcher decides to carry out a chi-squared test.
  2. Determine the expected frequencies for each eye colour in the blonde hair category.
    You are given that the test statistic is 28.62 to 2 decimal places.
  3. Carry out the chi-squared test at the 10\% significance level.
    Table 4.2 shows the chi-squared contributions for some of the categories. The contributions for the categories relating to green eye colour have been deliberately omitted.
    Table 4.2 (rows: eye colour; columns: hair colour)
    | Eye colour | Blonde | Brown | Black | Red |
    |---|---|---|---|---|
    | Brown | 6.791 | 1.964 | 0.694 | 0.889 |
    | Blue | 6.162 | 1.257 | 0.185 | 0.062 |
    | Green | | | | |
  4. Calculate the chi-squared contribution for the green eye and blonde hair category.
  5. With reference to the values in Table 4.2, discuss what the data suggest about brown eye colour and blue eye colour for people with blonde hair.
  6. A different researcher, carrying out the same investigation, independently takes a different random sample of size 800 and performs the same hypothesis test, but at the 1\% significance level, reaching the same conclusion as the original test. By comparing only the significance level of the two tests, specify which test, the one at the 10\% significance level or the one at the 1\% significance level, provides stronger evidence for the conclusion. Justify your answer.
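The contributions quoted in Table 4.2 of the question above can be reproduced from the observed counts and margins in Table 4.1. The sketch below does this for the two blonde-hair cells that the question already gives; the function name is hypothetical and the omitted green-eye row is deliberately left uncomputed.

```python
def contribution(observed, row_total, col_total, grand_total):
    """Chi-squared contribution (O - E)^2 / E for a single cell."""
    expected = row_total * col_total / grand_total
    return (observed - expected) ** 2 / expected


# Brown eyes / blonde hair and blue eyes / blonde hair, using Table 4.1.
brown_blonde = contribution(47, 432, 127, 800)
blue_blonde = contribution(61, 280, 127, 800)
print(round(brown_blonde, 3), round(blue_blonde, 3))
```

Rounded to three decimal places these agree with the Blonde column of Table 4.2 (6.791 and 6.162), which is a useful check that the expected frequencies have been formed from the right margins.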
OCR MEI Further Statistics Major 2022 June Q10
13 marks Standard +0.3
10 A scientist is researching dietary fat intake and cholesterol level. A random sample of 60 people is selected and their dietary fat intakes and cholesterol levels are measured. Dietary fat intakes are classified as low, medium and high, and cholesterol levels are classified as normal and high. The scientist decides to carry out a chi-squared test to investigate whether there is any association between dietary fat intake and cholesterol level. Tables 10.1 and 10.2 show the data and some of the expected frequencies for the test.
Table 10.1 (rows: cholesterol level; columns: dietary fat intake)
| Cholesterol level | Low | Medium | High | Total |
|---|---|---|---|---|
| Normal | 9 | 18 | 5 | 32 |
| High | 3 | 13 | 12 | 28 |
| Total | 12 | 31 | 17 | 60 |
Table 10.2: expected frequencies (rows: cholesterol level; columns: dietary fat intake)
| Cholesterol level | Low | Medium | High |
|---|---|---|---|
| Normal | | | 9.0667 |
| High | | | 7.9333 |
  1. Complete the table of expected frequencies in the Printed Answer Booklet.
  2. Determine the contribution to the chi-squared test statistic for people with normal cholesterol level and high dietary fat intake, giving your answer to 4 decimal places.
    The contributions to the chi-squared test statistic for the remaining categories are shown in Table 10.3.
    Table 10.3 (rows: cholesterol level; columns: dietary fat intake)
    | Cholesterol level | Low | Medium | High |
    |---|---|---|---|
    | Normal | 1.0563 | 0.1301 | |
    | High | 1.2071 | 0.1487 | 2.0846 |
  3. In this question you must show detailed reasoning. Carry out the test at the 5\% significance level.
  4. For each level of dietary fat intake, give a brief interpretation of what the data suggest about the level of cholesterol.
OCR MEI Further Statistics Major 2023 June Q9
10 marks Standard +0.3
9 A cyclist who lives on an island suspects that car drivers with locally registered number plates allow more space when passing her than those with non-locally registered number plates. She decides to carry out a hypothesis test and so over a period of time selects a random sample of 250 cars which pass her. For each car she estimates whether the car driver allows at least the recommended 1.5 metres when passing her. The table shows the data which she collected.
Where registered
| Passing distance | Local | Non-local |
|---|---|---|
| Under 1.5 m | 12 | 11 |
| At least 1.5 m | 157 | 70 |
  1. In this question you must show detailed reasoning. Carry out the test at the \(5 \%\) significance level to examine whether there is any association between where the car is registered and passing distance.
  2. A friend of the cyclist suggests that there may be a problem with the data, since the cyclist may have introduced some bias in estimating whether cars were allowing the recommended distance. Explain how any bias might have arisen.
OCR MEI Further Statistics Major 2024 June Q9
13 marks Standard +0.3
9 A cyclist has 3 bicycles: a road bike, a gravel bike and an electric bike. She wishes to know if the bicycle which she is riding makes any difference to whether she reaches a speed of 25 mph or greater on a journey. She selects a random sample of 120 journeys and notes the bicycle and whether or not her maximum speed was 25 mph or greater. She decides to carry out a chi-squared test to investigate whether there is any association between bicycle type and whether her maximum speed is 25 mph or greater. Tables 9.1 and 9.2 show the data and some of the expected frequencies for the test.
Table 9.1 (rows: maximum speed; columns: bicycle)
| Maximum speed | Road | Gravel | Electric | Total |
|---|---|---|---|---|
| Less than 25 mph | 2 | 21 | 19 | 42 |
| 25 mph or greater | 13 | 47 | 18 | 78 |
| Total | 15 | 68 | 37 | 120 |
Table 9.2: expected frequencies (rows: maximum speed; columns: bicycle)
| Maximum speed | Road | Gravel | Electric |
|---|---|---|---|
| Less than 25 mph | | | 12.95 |
| 25 mph or greater | | | 24.05 |
  1. Complete the table of expected frequencies in the Printed Answer Booklet.
  2. Determine the contribution to the chi-squared test statistic for the Electric bicycle and maximum speed 25 mph or greater. Give your answer correct to 4 decimal places.
    The contributions to the chi-squared test statistic for the remaining categories are shown in Table 9.3.
    Table 9.3: contributions to the test statistic (rows: maximum speed; columns: bicycle)
    | Maximum speed | Road | Gravel | Electric |
    |---|---|---|---|
    | Less than 25 mph | 2.0119 | 0.3294 | 2.8264 |
    | 25 mph or greater | 1.0833 | 0.1774 | |
  3. In this question you must show detailed reasoning. Carry out the test at the 5\% significance level.
  4. For each type of bicycle, give a brief interpretation of what the data suggest about maximum speed.