5.06a Chi-squared: contingency tables

179 questions

OCR MEI S2 2011 January Q4
18 marks Standard +0.3
4 A researcher is investigating the sizes of pebbles at various locations in a river. Three sites in the river are chosen and each pebble sampled at each site is classified as large, medium or small. The results are as follows.

| Pebble size / Site | A | B | C | Row totals |
|---|---|---|---|---|
| Large | 15 | 12 | 10 | 37 |
| Medium | 28 | 17 | 45 | 90 |
| Small | 47 | 33 | 36 | 116 |
| Column totals | 90 | 62 | 91 | 243 |

  1. Carry out a test at the \(5 \%\) significance level to examine whether there is any association between pebble size and site. Your working should include a table of the contributions of each cell to the test statistic.
  2. By referring to each site, comment briefly on how the size of the pebbles compares with what would be expected if there were no association. You should support your answers by referring to your table of contributions.
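
The working for part 1 can be sketched in Python. This is a minimal sketch, not a model answer: the helper names `expected_frequencies` and `contributions` are illustrative, and the \(5 \%\) critical value for 4 degrees of freedom (9.488) is an assumption copied from standard \(\chi ^ { 2 }\) tables.

```python
# Observed pebble counts: rows are Large, Medium, Small; columns are sites A, B, C.
observed = [
    [15, 12, 10],
    [28, 17, 45],
    [47, 33, 36],
]

def expected_frequencies(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def contributions(table, expected):
    """Contribution of each cell: (observed - expected)^2 / expected."""
    return [
        [(o - e) ** 2 / e for o, e in zip(o_row, e_row)]
        for o_row, e_row in zip(table, expected)
    ]

expected = expected_frequencies(observed)
contrib = contributions(observed, expected)
statistic = sum(sum(row) for row in contrib)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

CRITICAL_5_PERCENT_DF4 = 9.488  # assumption: read from standard tables

for size, row in zip(("Large", "Medium", "Small"), contrib):
    print(size, [round(x, 4) for x in row])
print("X^2 =", round(statistic, 3), "on", dof, "degrees of freedom")
print("Reject H0 at 5%:", statistic > CRITICAL_5_PERCENT_DF4)
```

The largest entries in the printed table of contributions indicate which cells are worth discussing in part 2.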
OCR MEI S2 2012 January Q4
17 marks Moderate -0.3
4 Birds are observed at feeding stations in three different places - woodland, farm and garden. The numbers of finches, thrushes and tits observed at each site are summarised in the table. The birds observed are regarded as a random sample from the population of birds of these species that use these feeding stations.

| Observed frequency: Species / Place | Farm | Garden | Woodland | Totals |
|---|---|---|---|---|
| Thrushes | 11 | 74 | 7 | 92 |
| Tits | 70 | 26 | 88 | 184 |
| Finches | 17 | 2 | 10 | 29 |
| Totals | 98 | 102 | 105 | 305 |

The expected frequencies under the null hypothesis for the usual \(\chi ^ { 2 }\) test are shown in the table below.

| Expected frequency: Species / Place | Farm | Garden | Woodland |
|---|---|---|---|
| Thrushes | 29.5607 | 30.7672 | 31.6721 |
| Tits | 59.1213 | 61.5344 | 63.3443 |
| Finches | 9.3180 | 9.6984 | 9.9836 |

  1. Verify that the entry 9.3180 is correct. The corresponding contributions to the test statistic are shown in the table below.

    | Contribution: Species / Place | Farm | Garden | Woodland |
    |---|---|---|---|
    | Thrushes | 11.6539 | 60.7489 | 19.2192 |
    | Tits | 2.0017 | 20.5201 | 9.5969 |
    | Finches | 6.3332 | 6.1108 | 0.0000 |

  2. Verify that the entry 6.3332 is correct.
  3. Carry out the test at the \(1 \%\) level of significance.
  4. For each place, use the table of contributions to comment briefly on the differences between the observed and expected distributions of species.
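
The two verification parts and the test itself can be checked numerically. The sketch below is hedged: the variable names are illustrative, the contributions are copied from the table in the question, and the \(1 \%\) critical value for 4 degrees of freedom (13.28) is an assumption taken from standard \(\chi ^ { 2 }\) tables.

```python
# Totals read from the observed frequency table.
finches_total = 29
farm_total = 98
grand_total = 305
observed_finches_farm = 17

# Expected frequency: row total * column total / grand total.
expected_finches_farm = finches_total * farm_total / grand_total

# Contribution: (observed - expected)^2 / expected.
contribution_finches_farm = (
    (observed_finches_farm - expected_finches_farm) ** 2 / expected_finches_farm
)

# Contributions as printed in the question
# (rows: Thrushes, Tits, Finches; columns: Farm, Garden, Woodland).
given_contributions = [
    [11.6539, 60.7489, 19.2192],
    [2.0017, 20.5201, 9.5969],
    [6.3332, 6.1108, 0.0000],
]
statistic = sum(sum(row) for row in given_contributions)
dof = (3 - 1) * (3 - 1)

CRITICAL_1_PERCENT_DF4 = 13.28  # assumption: read from standard tables

print("Expected (Finches, Farm):", round(expected_finches_farm, 4))
print("Contribution (Finches, Farm):", round(contribution_finches_farm, 4))
print("X^2 =", round(statistic, 2), "on", dof, "degrees of freedom")
print("Reject H0 at 1%:", statistic > CRITICAL_1_PERCENT_DF4)
```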
OCR MEI S2 2013 January Q4
18 marks Moderate -0.3
4
  1. A random sample of 60 students studying mathematics was selected. Their grades in the Core 1 module are summarised in the table below, classified according to whether they worked less than 5 hours per week or at least 5 hours per week. Test, at the \(5 \%\) significance level, whether there is any association between grade and hours worked.

    | Grade / Hours worked | Less than 5 | At least 5 |
    |---|---|---|
    | A or B | 20 | 11 |
    | C or lower | 13 | 16 |

  2. At a canning factory, cans are filled with tomato purée. The machine which fills the cans is set so that the volume of tomato purée in a can, measured in millilitres, is Normally distributed with mean 420 and standard deviation 3.5. After the machine is recalibrated, a quality control officer wishes to check whether the mean is still 420 millilitres. A random sample of 10 cans of tomato purée is selected and the volumes, measured in millilitres, are as follows. $$\begin{array} { l l l l l l l l l l } 417.2 & 422.6 & 414.3 & 419.6 & 420.4 & 410.0 & 418.3 & 416.9 & 418.9 & 419.7 \end{array}$$ Carry out a test at the \(1 \%\) significance level to investigate whether the mean is still 420 millilitres. You should assume that the volumes are Normally distributed with unchanged standard deviation.
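
For part 2, a two-sided test of the mean with known standard deviation can be sketched as follows. This is an illustration under the question's stated assumptions (Normal volumes, standard deviation still 3.5); the variable names are hypothetical, and `statistics.NormalDist` from the Python standard library stands in for Normal tables.

```python
from math import sqrt
from statistics import NormalDist, mean

volumes = [417.2, 422.6, 414.3, 419.6, 420.4, 410.0, 418.3, 416.9, 418.9, 419.7]
mu_0 = 420.0   # mean under H0
sigma = 3.5    # known standard deviation, assumed unchanged
alpha = 0.01   # significance level

sample_mean = mean(volumes)
standard_error = sigma / sqrt(len(volumes))
z = (sample_mean - mu_0) / standard_error

# Two-sided test: compare |z| with the upper alpha/2 point of N(0, 1).
critical = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * NormalDist().cdf(-abs(z))
reject_h0 = abs(z) > critical

print("sample mean =", round(sample_mean, 2))
print("z =", round(z, 3), "critical value = +/-", round(critical, 3))
print("p-value =", round(p_value, 4), "reject H0:", reject_h0)
```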
OCR MEI S2 2009 June Q4
17 marks Standard +0.3
4 In a traffic survey a random sample of 400 cars passing a particular location during the rush hour is selected. The type of car and the sex of the driver are classified as follows.

| Type of car / Sex | Male | Female | Row totals |
|---|---|---|---|
| Hatchback | 96 | 36 | 132 |
| Saloon | 77 | 35 | 112 |
| People carrier | 38 | 44 | 82 |
| 4WD | 19 | 8 | 27 |
| Sports car | 22 | 25 | 47 |
| Column totals | 252 | 148 | 400 |

  1. Carry out a test at the \(5 \%\) significance level to examine whether there is any association between type of car and sex of driver. State carefully your null and alternative hypotheses. Your working should include a table showing the contributions of each cell to the test statistic.
  2. For each type of car, comment briefly on how the number of drivers of each sex compares with what would be expected if there were no association.
OCR MEI S2 2010 June Q4
18 marks Standard +0.3
4 In a survey a random sample of 63 runners is selected. The category of runner and the type of running are classified as follows.

| Type of running / Category of runner | Junior | Senior | Veteran | Row totals |
|---|---|---|---|---|
| Track | 9 | 8 | 2 | 19 |
| Road | 4 | 8 | 12 | 24 |
| Both | 4 | 10 | 6 | 20 |
| Column totals | 17 | 26 | 20 | 63 |

  1. Carry out a test at the \(5 \%\) significance level to examine whether there is any association between category of runner and the type of running. State carefully your null and alternative hypotheses. Your working should include a table showing the contributions of each cell to the test statistic.
  2. For each category of runner, comment briefly on how the type of running compares with what would be expected if there were no association.
OCR MEI S2 2011 June Q4
18 marks Standard +0.3
4
  1. In a survey on internet usage, a random sample of 200 people is selected. The people are asked how much they have spent on internet shopping during the last three months. The results, classified by amount spent and sex, are shown in the table.

    | Amount spent / Sex | Male | Female | Row totals |
    |---|---|---|---|
    | Nothing | 28 | 34 | 62 |
    | Less than £50 | 17 | 21 | 38 |
    | £50 up to £200 | 22 | 26 | 48 |
    | £200 up to £1000 | 23 | 16 | 39 |
    | £1000 or more | 8 | 5 | 13 |
    | Column totals | 98 | 102 | 200 |

    1. Write down null and alternative hypotheses for a test to examine whether there is any association between amount spent and sex of person. The contributions to the test statistic for the usual \(\chi ^ { 2 }\) test are shown in the table below.

      | Amount spent / Sex | Male | Female |
      |---|---|---|
      | Nothing | 0.1865 | 0.1791 |
      | Less than £50 | 0.1409 | 0.1354 |
      | £50 up to £200 | 0.0982 | 0.0944 |
      | £200 up to £1000 | 0.7918 | 0.7608 |
      | £1000 or more | 0.4171 | 0.4007 |

      The sum of these contributions, correct to 3 decimal places, is 3.205.
    2. Calculate the expected frequency for females spending nothing. Verify the corresponding contribution, 0.1791, to the test statistic.
    3. Carry out the test at the \(5 \%\) level of significance, stating your conclusion clearly.
  2. A bakery sells loaves specified as having a mean weight of 400 grams. It is known that the weights of these loaves are Normally distributed and that the standard deviation is 5.7 grams. An inspector suspects that the true mean weight may be less than 400 grams. In order to test this, the inspector takes a random sample of 6 loaves. Carry out a suitable test at the \(5 \%\) level, given that the weights, in grams, of the 6 loaves are as follows. \(\begin{array} { l l l l l l } 392.1 & 405.8 & 401.3 & 387.4 & 391.8 & 400.6 \end{array}\)
OCR MEI S2 2012 June Q4
17 marks Standard +0.3
4
  1. Mary is opening a cake shop. As part of her market research, she carries out a survey into which type of cake people like best. She offers people 4 types of cake to taste: chocolate, carrot, lemon and ginger. She selects a random sample of 150 people and she classifies the people as children and adults. The results are as follows.

    | Type of cake / Classification of person | Child | Adult | Row totals |
    |---|---|---|---|
    | Chocolate | 34 | 23 | 57 |
    | Carrot | 16 | 18 | 34 |
    | Lemon | 4 | 18 | 22 |
    | Ginger | 13 | 24 | 37 |
    | Column totals | 67 | 83 | 150 |

    The contributions to the test statistic for the usual \(\chi ^ { 2 }\) test are shown in the table below.

    | Type of cake / Classification of person | Child | Adult |
    |---|---|---|
    | Chocolate | 2.8646 | 2.3124 |
    | Carrot | 0.0436 | 0.0352 |
    | Lemon | 3.4549 | 2.7889 |
    | Ginger | 0.7526 | 0.6075 |

    The sum of these contributions, correct to 2 decimal places, is 12.86.
    1. Calculate the expected frequency for children preferring chocolate cake. Verify the corresponding contribution, 2.8646, to the test statistic.
    2. Carry out the test at the \(1 \%\) level of significance.
  2. Mary buys flour in bags which are labelled as containing 5 kg. She suspects that the average contents of these bags may be less than 5 kg. In order to test this, she selects a random sample of 8 bags and weighs their contents. Assuming that weights are Normally distributed with standard deviation 0.0072 kg, carry out a test at the \(5 \%\) level, given that the weights of the 8 bags in kg are as follows. \(\begin{array} { l l l l l l l l } 4.992 & 4.981 & 4.982 & 4.996 & 4.991 & 5.006 & 5.009 & 5.003 \end{array}\)
OCR MEI S2 2013 June Q4
18 marks Standard +0.3
4 An art gallery is holding an exhibition. A random sample of 150 visitors to the exhibition is selected. The visitors are asked which of four artists they prefer. Their preferences, classified according to whether the visitor is female or male, are given in the table.

| Sex / Artist preferred | Monet | Renoir | Degas | Cézanne |
|---|---|---|---|---|
| Male | 8 | 25 | 18 | 19 |
| Female | 18 | 35 | 10 | 17 |

  1. Carry out a test at the \(10 \%\) significance level to examine whether there is any association between artist preferred and sex of visitor. Your working should include a table showing the contributions of each cell to the test statistic.
  2. For each artist, comment briefly on how the preferences of each sex compare with what would be expected if there were no association.
OCR MEI S2 2014 June Q4
18 marks Standard +0.3
4 A researcher at a large company thinks that there may be some relationship between the numbers of working days lost due to illness per year and the ages of the workers in the company. The researcher selects a random sample of 190 workers. The ages of the workers and numbers of days lost for a period of 1 year are summarised below.

| Age / Working days lost | 0 to 4 | 5 to 9 | 10 or more |
|---|---|---|---|
| Under 35 | 31 | 27 | 4 |
| 35 to 50 | 28 | 32 | 8 |
| Over 50 | 16 | 28 | 16 |

  1. Carry out a test at the \(1 \%\) significance level to investigate whether the researcher's belief appears to be true. Your working should include a table showing the contributions of each cell to the test statistic.
  2. For the 'Over 50' age group, comment briefly on how the working days lost compare with what would be expected if there were no association.
  3. A student decides to reclassify the 'working days lost' into two groups, '0 to 4' and '5 or more', but leave the age groups as before. The test statistic with this classification is 7.08. Carry out the test at the \(1 \%\) level with this new classification, using the same hypotheses as for the original test.
  4. Comment on the results of the two tests.
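
The effect of the reclassification in part 3 can be explored with a short sketch. It is illustrative only: `chi_squared` is a hypothetical helper, the merged table is built by adding the '5 to 9' and '10 or more' columns, and the \(1 \%\) critical values (9.210 for 2 degrees of freedom, 13.28 for 4) are assumptions copied from standard \(\chi ^ { 2 }\) tables.

```python
def chi_squared(table):
    """Sum of (observed - expected)^2 / expected over every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    total = 0.0
    for row, r in zip(table, row_totals):
        for observed, c in zip(row, col_totals):
            expected = r * c / grand_total
            total += (observed - expected) ** 2 / expected
    return total

# Rows: Under 35, 35 to 50, Over 50. Columns: 0 to 4, 5 to 9, 10 or more.
original = [
    [31, 27, 4],
    [28, 32, 8],
    [16, 28, 16],
]
# Reclassified table: keep '0 to 4', combine the other two columns into '5 or more'.
merged = [[row[0], row[1] + row[2]] for row in original]

CRITICAL_1_PERCENT = {2: 9.210, 4: 13.28}  # assumption: read from standard tables

for name, table in (("original", original), ("merged", merged)):
    dof = (len(table) - 1) * (len(table[0]) - 1)
    statistic = chi_squared(table)
    print(name, "X^2 =", round(statistic, 2), "dof =", dof,
          "reject H0 at 1%:", statistic > CRITICAL_1_PERCENT[dof])

stat_original = chi_squared(original)
stat_merged = chi_squared(merged)
```

Merging columns changes both the statistic and the degrees of freedom, which is why the two tests need not agree.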
OCR MEI S2 2015 June Q4
20 marks Standard +0.3
4
  1. As part of an investigation into smoking, a random sample of 120 students was selected. The students were asked whether they were smokers, and also whether either of their parents were smokers. The results are summarised in the table below. Test, at the \(5 \%\) significance level, whether there is any association between the smoking habits of the students and their parents.

    | | At least one parent smokes | Neither parent smokes |
    |---|---|---|
    | Student smokes | 21 | 27 |
    | Student does not smoke | 17 | 55 |

  2. The manufacturer of a particular brand of cigarette claims that the nicotine content of these cigarettes is Normally distributed with mean 0.87 mg. A researcher suspects that the mean nicotine content of this brand is higher than the value claimed by the manufacturer. The nicotine content, \(x \mathrm { mg }\), is measured for a random sample of 100 cigarettes. The data are summarised as follows. $$\sum x = 88.20 \quad \sum x ^ { 2 } = 78.68$$ Carry out a test at the \(1 \%\) significance level to investigate the researcher's belief.
OCR MEI S2 2016 June Q4
20 marks Moderate -0.3
4
  1. A random sample of 80 GCSE students was selected to take part in an investigation into whether attitudes to mathematics differ between girls and boys. The students were asked if they agreed with the statement 'Mathematics is one of my favourite subjects'. They were given three options 'Agree', 'Disagree', 'Neither agree nor disagree'. The results, classified according to sex, are summarised in the table below.

    | | Agree | Disagree | Neither |
    |---|---|---|---|
    | Male | 17 | 13 | 8 |
    | Female | 12 | 11 | 19 |

    The contributions to the test statistic for the usual \(\chi ^ { 2 }\) test are shown in the table below.

    | | Agree | Disagree | Neither |
    |---|---|---|---|
    | Male | 0.7550 | 0.2246 | 1.8153 |
    | Female | 0.6831 | 0.2032 | 1.6424 |

    1. Calculate the expected frequency for females who agree. Verify the corresponding contribution, 0.6831, to the test statistic.
    2. Carry out the test at the \(5 \%\) level of significance.
  2. The level of radioactivity in limpets (a type of shellfish) in the sea near to a nuclear power station is regularly monitored. Over a period of years it has been found that the level (measured in suitable units) is Normally distributed with mean 5.64. Following an incident at the power station, a researcher suspects that the mean level of radioactivity in limpets may have increased. The researcher selects a random sample of 60 limpets. Their levels of radioactivity, \(x\) (measured in the same units), are summarised as follows. $$\sum x = 373 \quad \sum x ^ { 2 } = 2498$$ Carry out a test at the \(5 \%\) significance level to investigate the researcher's belief.
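
For part 2, the test statistic has to be built from the summary totals. The sketch below is a hedged illustration: it assumes the population standard deviation is unknown and estimated by the sample standard deviation, and that the sample of 60 is large enough to use the Normal distribution for the test statistic. Variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

n = 60
sum_x = 373
sum_x2 = 2498
mu_0 = 5.64    # mean level under H0
alpha = 0.05   # one-tailed test: H1 is that the mean has increased

sample_mean = sum_x / n
# Unbiased estimate of the variance from the summary totals.
sample_variance = (sum_x2 - sum_x ** 2 / n) / (n - 1)
z = (sample_mean - mu_0) / sqrt(sample_variance / n)

critical = NormalDist().inv_cdf(1 - alpha)  # upper 5% point of N(0, 1)
reject_h0 = z > critical

print("mean =", round(sample_mean, 3), "variance estimate =", round(sample_variance, 3))
print("z =", round(z, 3), "critical value =", round(critical, 3))
print("Reject H0 at 5%:", reject_h0)
```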
OCR S3 2010 January Q7
14 marks Standard +0.3
7 A chef wished to ascertain her customers' preference for certain vegetables. She asked a random sample of 120 customers for their preferred vegetable from asparagus, broad beans and cauliflower. The responses, classified according to the gender of the customer, are shown in the table.
  1. Test, at the \(5 \%\) significance level, whether vegetable preference and gender are independent.
  2. Determine whether, at the \(10 \%\) significance level, the vegetables are equally preferred.
OCR S3 2013 January Q6
7 marks Standard +0.3
6 A large population of plants consists of five species \(A , B , C , D\) and \(E\) in the proportions \(p _ { A } , p _ { B } , p _ { C } , p _ { D }\) and \(p _ { E }\) respectively. A random sample of 120 plants consisted of \(23,14,24,27\) and 32 of \(A , B , C , D\) and \(E\) respectively. Carry out a test at the \(10 \%\) significance level of the null hypothesis that the proportions are \(p _ { \mathrm { A } } = p _ { \mathrm { B } } = 0.15 , p _ { \mathrm { C } } = p _ { \mathrm { D } } = 0.25\) and \(p _ { \mathrm { E } } = 0.2\).
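
A goodness-of-fit calculation for this question can be sketched as below. It is a minimal illustration, not a mark-scheme solution: the names are hypothetical and the \(10 \%\) critical value for 4 degrees of freedom (7.779) is an assumption copied from standard \(\chi ^ { 2 }\) tables.

```python
species = ("A", "B", "C", "D", "E")
observed = [23, 14, 24, 27, 32]
proportions = [0.15, 0.15, 0.25, 0.25, 0.20]  # proportions under H0

n = sum(observed)
expected = [n * p for p in proportions]

# Test statistic: sum of (O - E)^2 / E over the five species.
statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
# No parameters are estimated from the data, so dof = categories - 1.
dof = len(observed) - 1

CRITICAL_10_PERCENT_DF4 = 7.779  # assumption: read from standard tables

for name, o, e in zip(species, observed, expected):
    print(name, o, round(e, 1))
print("X^2 =", round(statistic, 3), "on", dof, "degrees of freedom")
print("Reject H0 at 10%:", statistic > CRITICAL_10_PERCENT_DF4)
```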
OCR S3 2013 January Q8
15 marks Standard +0.3
8 After contracting a particular disease, patients from a hospital are advised to have their blood tested monthly for a year. In order to test whether patients comply with this advice the hospital management commissioned a survey of 100 patients. A hospital statistician selected the patients randomly from records and asked the patients whether or not they had complied with the advice. The results classified by gender are as follows.

| Comply / Gender | Female | Male |
|---|---|---|
| Yes | 34 | 30 |
| No | 11 | 25 |

  1. Test at the \(5 \%\) significance level whether compliance with the advice is independent of gender.
  2. A manager believed that a greater proportion of female patients than male patients comply with the advice. Carry out an appropriate test of proportions at the \(10 \%\) significance level.
OCR S3 2010 June Q3
9 marks Standard +0.3
3 The developers of a shopping mall sponsored a study of the shopping habits of its users. Each of a random sample of 100 users was asked whether their weekend shopping was mainly on Saturday or mainly on Sunday. The results, classified according to whether the user lived in the city or the country, are shown in the table.

| | City dweller | Country dweller |
|---|---|---|
| Saturday shopper | 23 | 19 |
| Sunday shopper | 42 | 16 |

  1. Test, at the \(10 \%\) significance level, whether there is an association between the area in which shoppers live and the day on which they shop at the weekend.
  2. State, with a reason, whether the conclusion of the test would be different at the \(3 \%\) significance level.
OCR S3 2012 June Q7
16 marks Standard +0.3
7 A study was carried out into whether patients suffering from a certain respiratory disorder would benefit from particular treatments. Each of 90 patients who agreed to take part was given one of three treatments \(A\), \(B\) or \(C\) as shown in the table.

| Treatment | \(A\) | \(B\) | \(C\) |
|---|---|---|---|
| Number in group | 31 | 25 | 34 |

  1. It is claimed that each patient was equally likely to have been given any of the treatments. Test at the \(5 \%\) significance level whether the numbers given each treatment are consistent with this claim.
  2. After 3 months the numbers of patients showing improvement for treatments \(A , B\) and \(C\) were 14, 18 and 25 respectively. By setting up a \(2 \times 3\) contingency table, test whether the outcome is dependent on the treatment. Use a \(5 \%\) significance level.
  3. If one of the treatments is abandoned, explain briefly which it should be.
OCR S3 2013 June Q6
13 marks Standard +0.3
6 A random sample of 80 students who had all studied Biology, Chemistry and Art at a college was each asked which they enjoyed most. The results, classified according to gender, are given in the table.

| Gender / Subject | Biology | Chemistry | Art |
|---|---|---|---|
| Male | 13 | 4 | 11 |
| Female | 37 | 8 | 7 |

It is required to carry out a test of independence between subject most enjoyed and gender at the \(2 \frac { 1 } { 2 } \%\) significance level.
  1. Calculate the expected values for the cells.
  2. Explain why it is necessary to combine cells, and choose a suitable combination.
  3. Carry out the test.
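
Parts 1 and 2 can be explored with a short sketch that computes the expected values and flags any below 5, the usual threshold for the \(\chi ^ { 2 }\) approximation. The helper name is illustrative, and the combination shown (Biology with Chemistry) is only one possible choice, made here as an assumption; the question leaves the choice to the candidate.

```python
genders = ("Male", "Female")
subjects = ("Biology", "Chemistry", "Art")
observed = [
    [13, 4, 11],
    [37, 8, 7],
]

def expected_frequencies(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_frequencies(observed)

# Cells whose expected value falls below 5.
small_cells = [
    (gender, subject, round(value, 2))
    for gender, row in zip(genders, expected)
    for subject, value in zip(subjects, row)
    if value < 5
]
print("Expected values:", [[round(v, 2) for v in row] for row in expected])
print("Cells below 5:", small_cells)

# One possible combination (an assumption): merge Biology and Chemistry.
combined = [[row[0] + row[1], row[2]] for row in observed]
combined_expected = expected_frequencies(combined)
print("Combined expected values:",
      [[round(v, 2) for v in row] for row in combined_expected])
```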
OCR S3 2016 June Q2
7 marks Standard +0.3
2 A random sample of 200 American voters were asked about which political party they supported and their attitude to a proposed new form of taxation. The voters' responses are summarised in the table.

| Party / Attitude | In favour | Neutral | Against |
|---|---|---|---|
| Democrat | 58 | 16 | 16 |
| Independent | 25 | 4 | 11 |
| Republican | 17 | 20 | 33 |

Carry out a \(\chi ^ { 2 }\) test, at the \(1 \%\) level of significance, to investigate whether there is an association between party supported and attitude to the proposed form of taxation.
OCR MEI S3 2011 January Q2
18 marks Standard +0.3
2
    1. What is stratified sampling? Why would it be used?
    2. A local authority official wishes to conduct a survey of households in the borough. He decides to select a stratified sample of 2000 households using Council Tax property bands as the strata. At the time of the survey there are 79368 households in the borough. The table shows the numbers of households in the different tax bands.

      | Tax band | A - B | C - D | E - F | G - H |
      |---|---|---|---|---|
      | Number of households | 32298 | 33211 | 9739 | 4120 |

      Calculate the number of households that the official should choose from each stratum in order to obtain his sample of 2000 households so that each stratum is represented proportionally.
    1. What assumption needs to be made when using a Wilcoxon single sample test?
    2. As part of an investigation into trends in local authority spending, one of the categories of expenditure considered was 'Highways and the Environment'. For a random sample of 10 local authorities, the percentages of their total expenditure spent on Highways and the Environment in 1999 and then in 2009 are shown in the table.
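
      | Local authority | A | B | C | D | E | F | G | H | I | J |
      |---|---|---|---|---|---|---|---|---|---|---|
      | 1999 | 9.60 | 8.40 | 8.67 | 9.32 | 9.89 | 9.35 | 7.91 | 8.08 | 9.61 | 8.55 |
      | 2009 | 8.94 | 8.42 | 7.87 | 8.41 | 10.17 | 10.11 | 8.31 | 9.76 | 9.54 | 9.67 |
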
      Use a Wilcoxon test, with a significance level of \(10 \%\), to determine whether there appears to be any change to the average percentage of total expenditure spent on Highways and the Environment between 1999 and 2009.
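
The proportional allocation asked for in the stratified sampling part of this question can be sketched as follows. The dictionary and variable names are illustrative; rounding each share to the nearest whole household is an assumption about how the official would proceed.

```python
households = {
    "A - B": 32298,
    "C - D": 33211,
    "E - F": 9739,
    "G - H": 4120,
}
sample_size = 2000
population = sum(households.values())

# Each stratum receives a share of the sample proportional to its size.
allocation = {
    band: round(sample_size * count / population)
    for band, count in households.items()
}

for band, chosen in allocation.items():
    print(band, chosen)
print("Total sampled:", sum(allocation.values()))
```

Independent rounding does not always reproduce the intended total, so the final sum is worth checking, as the last line does.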
OCR MEI S3 2016 June Q2
18 marks Standard +0.3
2
  1. A genetic model involving body colour and eye colour of fruit flies predicts that offspring will consist of four phenotypes in the ratio \(9 : 3 : 3 : 1\). A random sample of 200 such offspring is taken. Their phenotypes are found to be as follows.

    | Phenotype | Brown body, red eye | Brown body, brown eye | Black body, red eye | Black body, brown eye |
    |---|---|---|---|---|
    | Frequency | 125 | 37 | 32 | 6 |
    | Relative proportion from model | 9 | 3 | 3 | 1 |

    Carry out a test, using a \(2.5 \%\) level of significance, of the goodness of fit of the genetic model to these data.
  2. The median length of European fruit flies is 2.5 mm. South American fruit flies are believed to be larger than European fruit flies. A random sample of 12 South American fruit flies is taken. The flies are found to have the following lengths (in mm). \(\begin{array} { l l l l l l l l l l l l } 1.7 & 1.4 & 3.1 & 3.5 & 3.8 & 4.2 & 2.2 & 2.9 & 4.4 & 2.6 & 3.9 & 3.2 \end{array}\) Carry out a Wilcoxon signed rank test, using a \(5 \%\) level of significance, to test this belief.
CAIE FP2 2009 June Q9
9 marks Standard +0.3
9 The proportions of blood types \(\mathrm { A } , \mathrm { B } , \mathrm { AB }\) and O in the Australian population are \(38 \% , 10 \% , 3 \%\) and \(49 \%\) respectively. In order to test whether the population in Sydney conforms to these figures, a random sample of 200 residents is selected. The table shows the observed frequencies of these types in the sample.

| Blood type | A | B | AB | O |
|---|---|---|---|---|
| Frequency | 57 | 24 | 9 | 110 |

Carry out a suitable test at the 5\% significance level. Find the smallest sample size that could be used for the test.
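
Both parts of this question can be sketched numerically. The sketch is hedged: the names are illustrative, the \(5 \%\) critical value for 3 degrees of freedom (7.815) is an assumption copied from standard \(\chi ^ { 2 }\) tables, and the smallest sample size is found under the usual rule that every expected frequency should be at least 5 without combining categories.

```python
from math import ceil

observed = {"A": 57, "B": 24, "AB": 9, "O": 110}
proportions = {"A": 0.38, "B": 0.10, "AB": 0.03, "O": 0.49}  # Australian figures

n = sum(observed.values())
expected = {blood_type: n * p for blood_type, p in proportions.items()}

statistic = sum(
    (observed[blood_type] - expected[blood_type]) ** 2 / expected[blood_type]
    for blood_type in observed
)
dof = len(observed) - 1

CRITICAL_5_PERCENT_DF3 = 7.815  # assumption: read from standard tables

# Smallest n for which the rarest type still has expected frequency >= 5.
min_sample_size = ceil(5 / min(proportions.values()))

print("X^2 =", round(statistic, 3), "on", dof, "degrees of freedom")
print("Reject H0 at 5%:", statistic > CRITICAL_5_PERCENT_DF3)
print("Smallest sample size:", min_sample_size)
```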
CAIE FP2 2010 June Q10
13 marks Standard +0.3
10 Three new flu vaccines, \(A , B\) and \(C\), were tested on 500 volunteers. The vaccines were assigned randomly to the volunteers and 178 received \(A , 149\) received \(B\) and 173 received \(C\). During the following year, 30 of the volunteers given \(A\) caught flu, 29 of the volunteers given \(B\) caught flu, and 16 of the volunteers given \(C\) caught flu. Carry out a suitable test for independence at the 5\% significance level. Without using a statistical test, decide which of the vaccines appears to be most effective.
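
The first step here is to turn the description into a \(2 \times 3\) contingency table. The sketch below does that and then computes the test statistic and the observed flu rates. Names are illustrative, and the \(5 \%\) critical value for 2 degrees of freedom (5.991) is an assumption copied from standard \(\chi ^ { 2 }\) tables.

```python
# (number vaccinated, number who caught flu) for each vaccine.
given = {"A": (178, 30), "B": (149, 29), "C": (173, 16)}

vaccines = list(given)
flu = [given[v][1] for v in vaccines]
no_flu = [given[v][0] - given[v][1] for v in vaccines]
observed = [flu, no_flu]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

statistic = 0.0
for row, r in zip(observed, row_totals):
    for o, c in zip(row, col_totals):
        e = r * c / grand_total          # expected frequency under independence
        statistic += (o - e) ** 2 / e    # contribution of this cell

dof = (len(observed) - 1) * (len(vaccines) - 1)
CRITICAL_5_PERCENT_DF2 = 5.991  # assumption: read from standard tables

# Observed proportion catching flu, used for the informal comparison.
flu_rates = {v: given[v][1] / given[v][0] for v in vaccines}
lowest_rate = min(flu_rates, key=flu_rates.get)

print("Observed table:", observed)
print("X^2 =", round(statistic, 3), "on", dof, "degrees of freedom")
print("Reject H0 at 5%:", statistic > CRITICAL_5_PERCENT_DF2)
print("Flu rates:", {v: round(rate, 3) for v, rate in flu_rates.items()})
print("Lowest flu rate:", lowest_rate)
```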
CAIE FP2 2011 June Q6
7 marks Standard +0.3
6 A random sample of residents in a town took part in a survey. They were asked whether they would prefer the local council to spend money on improving the local bus service or on improving the quality of road surfaces. The responses are shown in the following table, classified according to the area of the town in which the residents live.

| | Area 1 | Area 2 | Area 3 |
|---|---|---|---|
| Local bus service | 73 | 36 | 30 |
| Road surfaces | 47 | 44 | 20 |

Using a \(5 \%\) significance level, test whether there is an association between the area lived in and preference for improving the local bus service or improving the quality of road surfaces.
CAIE FP2 2012 June Q8
9 marks Standard +0.3
8 Residents of three towns \(A , B\) and \(C\) were asked to grade the reliability of their digital television signal as good, satisfactory or poor. A random sample of responses from each town is taken and the numbers in each category are given in the following table.

| | Good | Satisfactory | Poor |
|---|---|---|---|
| Town \(A\) | 24 | 34 | 14 |
| Town \(B\) | 58 | 60 | 26 |
| Town \(C\) | 20 | 34 | 30 |

Test, at the 2.5\% significance level, whether grade of reliability is independent of town. Identify which town makes the greatest contribution to the test statistic and relate your answer to the context of the question.
CAIE FP2 2013 June Q11 OR
Challenging +1.8
A researcher is investigating the relationship between the political allegiance of university students and their childhood environment. He chooses a random sample of 100 students and finds that 60 have political allegiance to the Alliance party. He also classifies their childhood environment as rural or urban, and finds that 45 had a rural childhood. The researcher carries out a test, at the \(10 \%\) significance level, on this data and finds that political allegiance is independent of childhood environment. Given that \(A\) is the number of students in the sample who both support the Alliance party and have a rural childhood, find the greatest and least possible values of \(A\). A second random sample of size \(100 N\), where \(N\) is an integer, is taken from the university student population. It is found that the proportions supporting the Alliance party from urban and rural childhoods are the same as in the first sample. Given that the value of \(A\) in the first sample was 29, find the greatest possible value of \(N\) that would lead to the same conclusion (that political allegiance is independent of childhood environment) from a test, at the \(10 \%\) significance level, on this second set of data.