Chi-squared test of independence

A question is of this type if and only if it involves testing whether two categorical variables are independent using a contingency table and a chi-squared test.

157 questions · Standard +0.2

OCR S3 2014 June Q7
9 marks Standard +0.3
7 A random sample of 100 adults with a chronic disease was chosen. Each adult was randomly assigned to one of three different treatments. After six months of treatment, each adult was then assessed and classified as 'much improved', 'improved', 'slightly improved' or 'no change'. The results are summarised in Table 1. \begin{table}[h]
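\begin{tabular}{l|ccc}
 & Treatment \(A\) & Treatment \(B\) & Treatment \(C\) \\
\hline
Much improved & 12 & 16 & 4 \\
Improved & 13 & 12 & 6 \\
Slightly improved & 7 & 6 & 7 \\
No change & 5 & 3 & 9 \\
\end{tabular}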
\captionsetup{labelformat=empty} \caption{Table 1}
\end{table} A \(\chi ^ { 2 }\) test, at the \(5 \%\) significance level, is to be carried out.
  1. State suitable hypotheses. Combining the last two rows of Table 1 gives Table 2. \begin{table}[h]
    \begin{tabular}{l|ccc}
     & Treatment \(A\) & Treatment \(B\) & Treatment \(C\) \\
    \hline
    Much improved & 12 & 16 & 4 \\
    Improved & 13 & 12 & 6 \\
    Slightly improved / No change & 12 & 9 & 16 \\
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 2}
    \end{table}
  2. By considering the expected frequencies for Treatment \(C\) in Table 1, explain why it was necessary to combine rows.
  3. Show that the contribution to the \(\chi ^ { 2 }\) value for the cell 'slightly improved/no change, Treatment \(C\)' is 4.231, correct to 3 decimal places. You are given that the \(\chi ^ { 2 }\) test statistic is 10.51, correct to 2 decimal places.
  4. Carry out the test.
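The arithmetic behind parts 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration, not the mark scheme's method: the helper name `chi_squared_contributions` and the variables `table1` and `table2` are assumptions introduced here. Each expected frequency is row total times column total divided by the grand total, and each cell contributes (observed minus expected) squared over expected.

```python
def chi_squared_contributions(observed):
    """Return (expected, contributions, statistic) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected frequency under independence: row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Cell contribution: (observed - expected)^2 / expected.
    contributions = [
        [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]
    statistic = sum(sum(row) for row in contributions)
    return expected, contributions, statistic


# Table 1 (four outcome rows) and Table 2 (last two rows combined).
table1 = [[12, 16, 4], [13, 12, 6], [7, 6, 7], [5, 3, 9]]
table2 = [[12, 16, 4], [13, 12, 6], [12, 9, 16]]

expected1, _, _ = chi_squared_contributions(table1)
expected2, contributions2, statistic2 = chi_squared_contributions(table2)

print([round(row[2], 2) for row in expected1])  # Treatment C expected counts, Table 1
print(round(contributions2[2][2], 3))           # combined row, Treatment C
print(round(statistic2, 2))                     # overall statistic for Table 2
```

In Table 1 the 'no change' expected count for Treatment \(C\) comes out below 5, which is the usual reason for combining rows. The last two printed values should agree with the 4.231 and 10.51 quoted in the question.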
OCR S2 2011 January Q10
Moderate -0.5
    OCR S2 2011 January Q12
    Moderate -0.5
    OCR S2 2011 January Q13
    Moderate -0.5
    OCR MEI S2 2009 January Q4
    17 marks Standard +0.3
    4 A gardening research organisation is running a trial to examine the growth and the size of flowers of various plants.
    1. In the trial, seeds of three types of plant are sown. The growth of each plant is classified as good, average or poor. The results are shown in the table.
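      \begin{tabular}{ll|ccc|c}
       & & \multicolumn{3}{c|}{Growth} & \multirow{2}{*}{Row totals} \\
       & & Good & Average & Poor & \\
      \hline
      \multirow{3}{*}{Type of plant} & Coriander & 12 & 28 & 15 & 55 \\
       & Aster & 7 & 18 & 23 & 48 \\
       & Fennel & 14 & 22 & 11 & 47 \\
      \hline
       & Column totals & 33 & 68 & 49 & 150 \\
      \end{tabular}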
      Carry out a test at the \(5 \%\) significance level to examine whether there is any association between growth and type of plant. State carefully your null and alternative hypotheses. Include a table of the contributions of each cell to the test statistic.
    2. It is known that the diameter of marigold flowers is Normally distributed with mean 47 mm and standard deviation 8.5 mm. A certain fertiliser is expected to cause flowers to have a larger mean diameter, but without affecting the standard deviation. A large number of marigolds are grown using this fertiliser. The diameters of a random sample of 50 of the flowers are measured and the mean diameter is found to be 49.2 mm. Carry out a hypothesis test at the \(1 \%\) significance level to check whether flowers grown with this fertiliser appear to be larger on average. Use hypotheses \(\mathrm { H } _ { 0 } : \mu = 47 , \mathrm { H } _ { 1 } : \mu > 47\), where \(\mu \mathrm { mm }\) represents the mean diameter of all marigold flowers grown with this fertiliser.
    OCR MEI S2 2010 January Q4
    18 marks Moderate -0.3
    4 A council provides waste paper recycling services for local businesses. Some businesses use the standard service for recycling paper, others use a special service for dealing with confidential documents, and others use both. Businesses are classified as small or large. A survey of a random sample of 285 businesses gives the following data for size of business and recycling service.
    \begin{tabular}{ll|ccc}
     & & \multicolumn{3}{c}{Recycling Service} \\
     & & Standard & Special & Both \\
    \hline
    \multirow{2}{*}{Size of business} & Small & 35 & 26 & 44 \\
     & Large & 55 & 52 & 73 \\
    \end{tabular}
    1. Write down null and alternative hypotheses for a test to examine whether there is any association between size of business and recycling service used. The contributions to the test statistic for the usual \(\chi ^ { 2 }\) test are shown in the table below.
      \begin{tabular}{ll|ccc}
       & & \multicolumn{3}{c}{Recycling Service} \\
       & & Standard & Special & Both \\
      \hline
      \multirow{2}{*}{Size of business} & Small & 0.1023 & 0.2607 & 0.0186 \\
       & Large & 0.0597 & 0.1520 & 0.0108 \\
      \end{tabular}
      The sum of these contributions is 0.6041.
    2. Calculate the expected frequency for large businesses using the special service. Verify the corresponding contribution 0.1520 to the test statistic.
    3. Carry out the test at the \(5 \%\) level of significance, stating your conclusion clearly. The council is also investigating the weight of rubbish in domestic dustbins. In 2008 the average weight of rubbish in bins was 32.8 kg. The council has now started a recycling initiative and wishes to determine whether there has been a reduction in the weight of rubbish in bins. A random sample of 50 domestic dustbins is selected and it is found that the mean weight of rubbish per bin is now 30.9 kg, and the standard deviation is 3.4 kg.
    4. Carry out a test at the \(5 \%\) level to investigate whether the mean weight of rubbish has been reduced in comparison with 2008. State carefully your null and alternative hypotheses.
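    As a hedged worked sketch of the verification asked for in part 2 of the recycling question (not taken from any mark scheme): the row total for large businesses is \(55 + 52 + 73 = 180\), the column total for the special service is \(26 + 52 = 78\), and the sample size is 285. Writing \(O\) for the observed and \(E\) for the expected frequency (notation introduced here for illustration), $$E = \frac { 180 \times 78 } { 285 } \approx 49.2632 , \qquad \frac { ( O - E ) ^ { 2 } } { E } = \frac { ( 52 - 49.2632 ) ^ { 2 } } { 49.2632 } \approx 0.1520 .$$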
    OCR MEI S2 2011 January Q4
    18 marks Standard +0.3
    4 A researcher is investigating the sizes of pebbles at various locations in a river. Three sites in the river are chosen and each pebble sampled at each site is classified as large, medium or small. The results are as follows.
    \begin{tabular}{ll|ccc|c}
     & & \multicolumn{3}{c|}{Site} & \multirow{2}{*}{Row totals} \\
     & & A & B & C & \\
    \hline
    \multirow{3}{*}{Pebble size} & Large & 15 & 12 & 10 & 37 \\
     & Medium & 28 & 17 & 45 & 90 \\
     & Small & 47 & 33 & 36 & 116 \\
    \hline
     & Column totals & 90 & 62 & 91 & 243 \\
    \end{tabular}
    1. Carry out a test at the \(5 \%\) significance level to examine whether there is any association between pebble size and site. Your working should include a table of the contributions of each cell to the test statistic.
    2. By referring to each site, comment briefly on how the size of the pebbles compares with what would be expected if there were no association. You should support your answers by referring to your table of contributions.
    OCR MEI S2 2012 January Q4
    17 marks Moderate -0.3
    4 Birds are observed at feeding stations in three different places - woodland, farm and garden. The numbers of finches, thrushes and tits observed at each site are summarised in the table. The birds observed are regarded as a random sample from the population of birds of these species that use these feeding stations.
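    \begin{tabular}{ll|ccc|c}
    \multicolumn{2}{l|}{Observed Frequency} & \multicolumn{3}{c|}{Place} & \\
     & & Farm & Garden & Woodland & Totals \\
    \hline
    \multirow{4}{*}{Species} & Thrushes & 11 & 74 & 7 & 92 \\
     & Tits & 70 & 26 & 88 & 184 \\
     & Finches & 17 & 2 & 10 & 29 \\
    \cline{2-6}
     & Totals & 98 & 102 & 105 & 305 \\
    \end{tabular}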
    The expected frequencies under the null hypothesis for the usual \(\chi ^ { 2 }\) test are shown in the table below.
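    \begin{tabular}{ll|ccc}
    \multicolumn{2}{l|}{Expected Frequency} & \multicolumn{3}{c}{Place} \\
     & & Farm & Garden & Woodland \\
    \hline
    \multirow{3}{*}{Species} & Thrushes & 29.5607 & 30.7672 & 31.6721 \\
     & Tits & 59.1213 & 61.5344 & 63.3443 \\
     & Finches & 9.3180 & 9.6984 & 9.9836 \\
    \end{tabular}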
    1. Verify that the entry 9.3180 is correct. The corresponding contributions to the test statistic are shown in the table below.
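      \begin{tabular}{ll|ccc}
      \multicolumn{2}{l|}{Contribution} & \multicolumn{3}{c}{Place} \\
       & & Farm & Garden & Woodland \\
      \hline
      \multirow{3}{*}{Species} & Thrushes & 11.6539 & 60.7489 & 19.2192 \\
       & Tits & 2.0017 & 20.5201 & 9.5969 \\
       & Finches & 6.3332 & 6.1108 & 0.0000 \\
      \end{tabular}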
    2. Verify that the entry 6.3332 is correct.
    3. Carry out the test at the \(1 \%\) level of significance.
    4. For each place, use the table of contributions to comment briefly on the differences between the observed and expected distributions of species.
    OCR MEI S2 2013 January Q4
    18 marks Moderate -0.3
    4
    1. A random sample of 60 students studying mathematics was selected. Their grades in the Core 1 module are summarised in the table below, classified according to whether they worked less than 5 hours per week or at least 5 hours per week. Test, at the \(5 \%\) significance level, whether there is any association between grade and hours worked.
      \begin{tabular}{ll|cc}
       & & \multicolumn{2}{c}{Hours worked} \\
       & & Less than 5 & At least 5 \\
      \hline
      \multirow{2}{*}{Grade} & A or B & 20 & 11 \\
       & C or lower & 13 & 16 \\
      \end{tabular}
    2. At a canning factory, cans are filled with tomato purée. The machine which fills the cans is set so that the volume of tomato purée in a can, measured in millilitres, is Normally distributed with mean 420 and standard deviation 3.5. After the machine is recalibrated, a quality control officer wishes to check whether the mean is still 420 millilitres. A random sample of 10 cans of tomato purée is selected and the volumes, measured in millilitres, are as follows. $$\begin{array} { l l l l l l l l l l } 417.2 & 422.6 & 414.3 & 419.6 & 420.4 & 410.0 & 418.3 & 416.9 & 418.9 & 419.7 \end{array}$$ Carry out a test at the \(1 \%\) significance level to investigate whether the mean is still 420 millilitres. You should assume that the volumes are Normally distributed with unchanged standard deviation.
    OCR MEI S2 2009 June Q4
    17 marks Standard +0.3
    4 In a traffic survey a random sample of 400 cars passing a particular location during the rush hour is selected. The type of car and the sex of the driver are classified as follows.
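    \begin{tabular}{ll|cc|c}
     & & \multicolumn{2}{c|}{Sex} & \multirow{2}{*}{Row totals} \\
     & & Male & Female & \\
    \hline
    \multirow{5}{*}{Type of car} & Hatchback & 96 & 36 & 132 \\
     & Saloon & 77 & 35 & 112 \\
     & People carrier & 38 & 44 & 82 \\
     & 4WD & 19 & 8 & 27 \\
     & Sports car & 22 & 25 & 47 \\
    \hline
     & Column totals & 252 & 148 & 400 \\
    \end{tabular}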
    1. Carry out a test at the \(5 \%\) significance level to examine whether there is any association between type of car and sex of driver. State carefully your null and alternative hypotheses. Your working should include a table showing the contributions of each cell to the test statistic.
    2. For each type of car, comment briefly on how the number of drivers of each sex compares with what would be expected if there were no association.
      OCR is committed to seeking permission to reproduce all third-party content that it uses in its assessment materials. OCR has attempted to identify and contact all copyright holders whose work is used in this paper. To avoid the issue of disclosure of answer-related information to candidates, all copyright acknowledgements are reproduced in the OCR Copyright Acknowledgements Booklet. This is produced for each series of examinations, is given to all schools that receive assessment material and is freely available to download from our public website (\href{http://www.ocr.org.uk}{www.ocr.org.uk}) after the live examination series.
      If OCR has unwittingly failed to correctly acknowledge or clear any third-party content in this assessment material, OCR will be happy to correct its mistake at the earliest possible opportunity. For queries or further information please contact the Copyright Team, First Floor, 9 Hills Road, Cambridge CB2 1PB.
      OCR is part of the Cambridge Assessment Group; Cambridge Assessment is the brand name of University of Cambridge Local Examinations Syndicate (UCLES), which is itself a department of the University of Cambridge.
    OCR MEI S2 2010 June Q4
    18 marks Standard +0.3
    4 In a survey a random sample of 63 runners is selected. The category of runner and the type of running are classified as follows.
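    \begin{tabular}{ll|ccc|c}
     & & \multicolumn{3}{c|}{Category of runner} & \multirow{2}{*}{Row totals} \\
     & & Junior & Senior & Veteran & \\
    \hline
    \multirow{3}{*}{Type of running} & Track & 9 & 8 & 2 & 19 \\
     & Road & 4 & 8 & 12 & 24 \\
     & Both & 4 & 10 & 6 & 20 \\
    \hline
     & Column totals & 17 & 26 & 20 & 63 \\
    \end{tabular}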
    1. Carry out a test at the \(5 \%\) significance level to examine whether there is any association between category of runner and the type of running. State carefully your null and alternative hypotheses. Your working should include a table showing the contributions of each cell to the test statistic.
    2. For each category of runner, comment briefly on how the type of running compares with what would be expected if there were no association.
    OCR MEI S2 2011 June Q4
    18 marks Standard +0.3
    4
    1. In a survey on internet usage, a random sample of 200 people is selected. The people are asked how much they have spent on internet shopping during the last three months. The results, classified by amount spent and sex, are shown in the table.
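      \begin{tabular}{ll|cc|c}
       & & \multicolumn{2}{c|}{Sex} & \multirow{2}{*}{Row totals} \\
       & & Male & Female & \\
      \hline
      \multirow{5}{*}{Amount spent} & Nothing & 28 & 34 & 62 \\
       & Less than £50 & 17 & 21 & 38 \\
       & £50 up to £200 & 22 & 26 & 48 \\
       & £200 up to £1000 & 23 & 16 & 39 \\
       & £1000 or more & 8 & 5 & 13 \\
      \hline
       & Column totals & 98 & 102 & 200 \\
      \end{tabular}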
      1. Write down null and alternative hypotheses for a test to examine whether there is any association between amount spent and sex of person. The contributions to the test statistic for the usual \(\chi ^ { 2 }\) test are shown in the table below.
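        \begin{tabular}{ll|cc}
         & & \multicolumn{2}{c}{Sex} \\
         & & Male & Female \\
        \hline
        \multirow{5}{*}{Amount spent} & Nothing & 0.1865 & 0.1791 \\
         & Less than £50 & 0.1409 & 0.1354 \\
         & £50 up to £200 & 0.0982 & 0.0944 \\
         & £200 up to £1000 & 0.7918 & 0.7608 \\
         & £1000 or more & 0.4171 & 0.4007 \\
        \end{tabular}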
        The sum of these contributions, correct to 3 decimal places, is 3.205.
      2. Calculate the expected frequency for females spending nothing. Verify the corresponding contribution, 0.1791, to the test statistic.
      3. Carry out the test at the \(5 \%\) level of significance, stating your conclusion clearly.
    2. A bakery sells loaves specified as having a mean weight of 400 grams. It is known that the weights of these loaves are Normally distributed and that the standard deviation is 5.7 grams. An inspector suspects that the true mean weight may be less than 400 grams. In order to test this, the inspector takes a random sample of 6 loaves. Carry out a suitable test at the \(5 \%\) level, given that the weights, in grams, of the 6 loaves are as follows. \(\begin{array} { l l l l l l } 392.1 & 405.8 & 401.3 & 387.4 & 391.8 & 400.6 \end{array}\)
    OCR MEI S2 2012 June Q4
    17 marks Standard +0.3
    4
    1. Mary is opening a cake shop. As part of her market research, she carries out a survey into which type of cake people like best. She offers people 4 types of cake to taste: chocolate, carrot, lemon and ginger. She selects a random sample of 150 people and she classifies the people as children and adults. The results are as follows.
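      \begin{tabular}{ll|cc|c}
       & & \multicolumn{2}{c|}{Classification of person} & \multirow{2}{*}{Row totals} \\
       & & Child & Adult & \\
      \hline
      \multirow{4}{*}{Type of cake} & Chocolate & 34 & 23 & 57 \\
       & Carrot & 16 & 18 & 34 \\
       & Lemon & 4 & 18 & 22 \\
       & Ginger & 13 & 24 & 37 \\
      \hline
       & Column totals & 67 & 83 & 150 \\
      \end{tabular}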
      The contributions to the test statistic for the usual \(\chi ^ { 2 }\) test are shown in the table below.
      \begin{tabular}{ll|cc}
       & & \multicolumn{2}{c}{Classification of person} \\
       & & Child & Adult \\
      \hline
      \multirow{4}{*}{Type of cake} & Chocolate & 2.8646 & 2.3124 \\
       & Carrot & 0.0436 & 0.0352 \\
       & Lemon & 3.4549 & 2.7889 \\
       & Ginger & 0.7526 & 0.6075 \\
      \end{tabular}
      The sum of these contributions, correct to 2 decimal places, is 12.86.
      1. Calculate the expected frequency for children preferring chocolate cake. Verify the corresponding contribution, 2.8646, to the test statistic.
      2. Carry out the test at the \(1 \%\) level of significance.
    2. Mary buys flour in bags which are labelled as containing 5 kg. She suspects that the average contents of these bags may be less than 5 kg. In order to test this, she selects a random sample of 8 bags and weighs their contents. Assuming that weights are Normally distributed with standard deviation 0.0072 kg, carry out a test at the \(5 \%\) level, given that the weights of the 8 bags in kg are as follows. \(\begin{array} { l l l l l l l l } 4.992 & 4.981 & 4.982 & 4.996 & 4.991 & 5.006 & 5.009 & 5.003 \end{array}\)
    OCR MEI S2 2013 June Q4
    18 marks Standard +0.3
    4 An art gallery is holding an exhibition. A random sample of 150 visitors to the exhibition is selected. The visitors are asked which of four artists they prefer. Their preferences, classified according to whether the visitor is female or male, are given in the table.
    \begin{tabular}{ll|cccc}
     & & \multicolumn{4}{c}{Artist preferred} \\
     & & Monet & Renoir & Degas & Cézanne \\
    \hline
    \multirow{2}{*}{Sex} & Male & 8 & 25 & 18 & 19 \\
     & Female & 18 & 35 & 10 & 17 \\
    \end{tabular}
    1. Carry out a test at the \(10 \%\) significance level to examine whether there is any association between artist preferred and sex of visitor. Your working should include a table showing the contributions of each cell to the test statistic.
    2. For each artist, comment briefly on how the preferences of each sex compare with what would be expected if there were no association.
    OCR MEI S2 2014 June Q4
    18 marks Standard +0.3
    4 A researcher at a large company thinks that there may be some relationship between the numbers of working days lost due to illness per year and the ages of the workers in the company. The researcher selects a random sample of 190 workers. The ages of the workers and numbers of days lost for a period of 1 year are summarised below.
    \begin{tabular}{ll|ccc}
     & & \multicolumn{3}{c}{Working days lost} \\
     & & 0 to 4 & 5 to 9 & 10 or more \\
    \hline
    \multirow{3}{*}{Age} & Under 35 & 31 & 27 & 4 \\
     & 35 to 50 & 28 & 32 & 8 \\
     & Over 50 & 16 & 28 & 16 \\
    \end{tabular}
    1. Carry out a test at the \(1 \%\) significance level to investigate whether the researcher's belief appears to be true. Your working should include a table showing the contributions of each cell to the test statistic.
    2. For the 'Over 50' age group, comment briefly on how the working days lost compare with what would be expected if there were no association.
    3. A student decides to reclassify the 'working days lost' into two groups, '0 to 4' and '5 or more', but leave the age groups as before. The test statistic with this classification is 7.08. Carry out the test at the \(1 \%\) level with this new classification, using the same hypotheses as for the original test.
    4. Comment on the results of the two tests.
    OCR MEI S2 2015 June Q4
    20 marks Standard +0.3
    4
    1. As part of an investigation into smoking, a random sample of 120 students was selected. The students were asked whether they were smokers, and also whether either of their parents were smokers. The results are summarised in the table below. Test, at the \(5 \%\) significance level, whether there is any association between the smoking habits of the students and their parents.
      \begin{tabular}{l|cc}
       & At least one parent smokes & Neither parent smokes \\
      \hline
      Student smokes & 21 & 27 \\
      Student does not smoke & 17 & 55 \\
      \end{tabular}
    2. The manufacturer of a particular brand of cigarette claims that the nicotine content of these cigarettes is Normally distributed with mean 0.87 mg. A researcher suspects that the mean nicotine content of this brand is higher than the value claimed by the manufacturer. The nicotine content, \(x \mathrm { mg }\), is measured for a random sample of 100 cigarettes. The data are summarised as follows. $$\sum x = 88.20 \quad \sum x ^ { 2 } = 78.68$$ Carry out a test at the \(1 \%\) significance level to investigate the researcher's belief.
    OCR MEI S2 2016 June Q4
    20 marks Moderate -0.3
    4
    1. A random sample of 80 GCSE students was selected to take part in an investigation into whether attitudes to mathematics differ between girls and boys. The students were asked if they agreed with the statement 'Mathematics is one of my favourite subjects'. They were given three options 'Agree', 'Disagree', 'Neither agree nor disagree'. The results, classified according to sex, are summarised in the table below.
      \begin{tabular}{l|ccc}
       & Agree & Disagree & Neither \\
      \hline
      Male & 17 & 13 & 8 \\
      Female & 12 & 11 & 19 \\
      \end{tabular}
      The contributions to the test statistic for the usual \(\chi ^ { 2 }\) test are shown in the table below.
      \begin{tabular}{l|ccc}
       & Agree & Disagree & Neither \\
      \hline
      Male & 0.7550 & 0.2246 & 1.8153 \\
      Female & 0.6831 & 0.2032 & 1.6424 \\
      \end{tabular}
      1. Calculate the expected frequency for females who agree. Verify the corresponding contribution, 0.6831, to the test statistic.
      2. Carry out the test at the \(5 \%\) level of significance.
    2. The level of radioactivity in limpets (a type of shellfish) in the sea near to a nuclear power station is regularly monitored. Over a period of years it has been found that the level (measured in suitable units) is Normally distributed with mean 5.64. Following an incident at the power station, a researcher suspects that the mean level of radioactivity in limpets may have increased. The researcher selects a random sample of 60 limpets. Their levels of radioactivity, \(x\) (measured in the same units), are summarised as follows. $$\sum x = 373 \quad \sum x ^ { 2 } = 2498$$ Carry out a test at the \(5 \%\) significance level to investigate the researcher's belief.
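    For the \(\chi ^ { 2 }\) part of this question, the final comparison can be sketched in Python using only the contributions printed in the question. This is an illustrative sketch under stated assumptions, not a model answer: the variable names are invented here, and the closed-form tail probability \(e ^ { - x / 2 }\) is valid only because a \(2 \times 3\) table has 2 degrees of freedom. For other table sizes a statistical table or library would be needed.

```python
import math

# Contributions as printed in the question (rows: Male, Female;
# columns: Agree, Disagree, Neither).
contributions = [
    [0.7550, 0.2246, 1.8153],
    [0.6831, 0.2032, 1.6424],
]

statistic = sum(sum(row) for row in contributions)
df = (len(contributions) - 1) * (len(contributions[0]) - 1)  # (2-1)*(3-1) = 2

# Special case: with exactly 2 degrees of freedom the upper-tail
# probability of chi-squared is exp(-x/2), so no tables are needed.
assert df == 2
p_value = math.exp(-statistic / 2)
critical_5pct = -2 * math.log(0.05)  # about 5.991, the usual tabulated value

reject_h0 = statistic > critical_5pct
print(round(statistic, 4), round(critical_5pct, 3), round(p_value, 3), reject_h0)
```

    Comparing `statistic` with `critical_5pct` mirrors the table look-up expected in a written answer; `p_value` is an extra check that the exam does not ask for.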
    OCR S3 2010 January Q7
    14 marks Standard +0.3
    7 A chef wished to ascertain her customers' preference for certain vegetables. She asked a random sample of 120 customers for their preferred vegetable from asparagus, broad beans and cauliflower. The responses, classified according to the gender of the customer, are shown in the table.
    1. Test, at the \(5 \%\) significance level, whether vegetable preference and gender are independent.
    2. Determine whether, at the \(10 \%\) significance level, the vegetables are equally preferred.
    OCR S3 2013 January Q8
    15 marks Standard +0.3
    8 After contracting a particular disease, patients from a hospital are advised to have their blood tested monthly for a year. In order to test whether patients comply with this advice the hospital management commissioned a survey of 100 patients. A hospital statistician selected the patients randomly from records and asked the patients whether or not they had complied with the advice. The results classified by gender are as follows.
    \begin{tabular}{ll|cc}
     & & \multicolumn{2}{c}{Gender} \\
     & & Female & Male \\
    \hline
    \multirow{2}{*}{Comply} & Yes & 34 & 30 \\
     & No & 11 & 25 \\
    \end{tabular}
    1. Test at the \(5 \%\) significance level whether compliance with the advice is independent of gender.
    2. A manager believed that a greater proportion of female patients than male patients comply with the advice. Carry out an appropriate test of proportions at the \(10 \%\) significance level.
    OCR S3 2010 June Q3
    9 marks Standard +0.3
    3 The developers of a shopping mall sponsored a study of the shopping habits of its users. Each of a random sample of 100 users was asked whether their weekend shopping was mainly on Saturday or mainly on Sunday. The results, classified according to whether the user lived in the city or the country, are shown in the table.
    \begin{tabular}{l|cc}
     & City dweller & Country dweller \\
    \hline
    Saturday shopper & 23 & 19 \\
    Sunday shopper & 42 & 16 \\
    \end{tabular}
    1. Test, at the \(10 \%\) significance level, whether there is an association between the area in which shoppers live and the day on which they shop at the weekend.
    2. State, with a reason, whether the conclusion of the test would be different at the \(3 \%\) significance level.
    OCR S3 2012 June Q7
    16 marks Standard +0.3
    7 A study was carried out into whether patients suffering from a certain respiratory disorder would benefit from particular treatments. Each of 90 patients who agreed to take part was given one of three treatments \(A\), \(B\) or \(C\) as shown in the table.
    \begin{tabular}{l|ccc}
    Treatment & \(A\) & \(B\) & \(C\) \\
    \hline
    Number in group & 31 & 25 & 34 \\
    \end{tabular}
    1. It is claimed that each patient was equally likely to have been given any of the treatments. Test at the \(5 \%\) significance level whether the numbers given each treatment are consistent with this claim.
    2. After 3 months the numbers of patients showing improvement for treatments \(A , B\) and \(C\) were 14, 18 and 25 respectively. By setting up a \(2 \times 3\) contingency table, test whether the outcome is dependent on the treatment. Use a \(5 \%\) significance level.
    3. If one of the treatments is abandoned, explain briefly which it should be.
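    Setting up the \(2 \times 3\) contingency table in part 2 of this question needs one step the question leaves implicit: the 'not improved' counts come from subtracting the improved counts from the group sizes. A minimal Python sketch of that step and the resulting statistic follows. The variable names are assumptions made for illustration, and this is not taken from the mark scheme.

```python
group_sizes = [31, 25, 34]  # patients given treatments A, B, C
improved = [14, 18, 25]     # patients showing improvement after 3 months

# The "not improved" row follows by subtraction from the group sizes.
not_improved = [n - k for n, k in zip(group_sizes, improved)]
observed = [improved, not_improved]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected frequency under independence: row total * column total / total.
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Test statistic: sum of (observed - expected)^2 / expected over all cells.
statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(col_totals) - 1)

print(not_improved, round(statistic, 2), df)
```

    The statistic would then be compared with the tabulated \(5 \%\) critical value of \(\chi ^ { 2 }\) for `df` degrees of freedom.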
    OCR S3 2013 June Q6
    13 marks Standard +0.3
    6 A random sample of 80 students who had all studied Biology, Chemistry and Art at a college was each asked which they enjoyed most. The results, classified according to gender, are given in the table.
    \begin{tabular}{ll|ccc}
     & & \multicolumn{3}{c}{Subject} \\
     & & Biology & Chemistry & Art \\
    \hline
    \multirow{2}{*}{Gender} & Male & 13 & 4 & 11 \\
     & Female & 37 & 8 & 7 \\
    \end{tabular}
    It is required to carry out a test of independence between subject most enjoyed and gender at the \(2 \frac { 1 } { 2 } \%\) significance level.
    1. Calculate the expected values for the cells.
    2. Explain why it is necessary to combine cells, and choose a suitable combination.
    3. Carry out the test.
    OCR S3 2016 June Q2
    7 marks Standard +0.3
    2 A random sample of 200 American voters were asked about which political party they supported and their attitude to a proposed new form of taxation. The voters' responses are summarised in the table.
    \begin{tabular}{ll|ccc}
     & & \multicolumn{3}{c}{Attitude} \\
     & & In favour & Neutral & Against \\
    \hline
    \multirow{3}{*}{Party} & Democrat & 58 & 16 & 16 \\
     & Independent & 25 & 4 & 11 \\
     & Republican & 17 & 20 & 33 \\
    \end{tabular}
    Carry out a \(\chi ^ { 2 }\) test, at the \(1 \%\) level of significance, to investigate whether there is an association between party supported and attitude to the proposed form of taxation.
    OCR MEI S3 2014 June Q3
    19 marks Standard +0.3
    3
    1. A personal trainer believes that drinking a glass of beetroot juice an hour before exercising enables endurance tests to be completed more quickly. To test his belief he takes a random sample of 12 of his trainees and, on two occasions, asks them to carry out 100 repetitions of a particular exercise as quickly as possible. Each trainee drinks a glass of water on one occasion and a glass of beetroot juice on the other occasion. The times in seconds taken by the trainees are given in the table.
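      \begin{tabular}{l|cc}
      Trainee & Water & Beetroot juice \\
      \hline
      A & 75.1 & 72.9 \\
      B & 86.2 & 79.9 \\
      C & 77.3 & 71.6 \\
      D & 89.1 & 90.2 \\
      E & 67.9 & 68.2 \\
      F & 101.5 & 95.2 \\
      G & 82.5 & 76.5 \\
      H & 83.3 & 80.2 \\
      I & 102.5 & 99.1 \\
      J & 91.3 & 82.2 \\
      K & 92.5 & 90.1 \\
      L & 77.2 & 77.9 \\
      \end{tabular}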
      The trainer wishes to test his belief using a paired \(t\) test at the \(1 \%\) level of significance. Assuming any necessary assumptions are valid, carry out a test of the hypotheses \(\mathrm { H } _ { 0 } : \mu _ { D } = 0 , \mathrm { H } _ { 1 } : \mu _ { D } < 0\), where \(\mu _ { D }\) is the population mean difference in times (time with beetroot juice minus time with water).
    2. An ornithologist believes that the number of birds landing on the bird feeding station in her garden in a given interval of time during the morning should follow a Poisson distribution. In order to test her belief, she makes the following observations in 60 randomly chosen minutes one morning.
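      \begin{tabular}{l|cccccccc}
      Number of birds & 0 & 1 & 2 & 3 & 4 & 5 & 6 & \(\geqslant 7\) \\
      \hline
      Frequency & 2 & 5 & 10 & 17 & 14 & 7 & 4 & 1 \\
      \end{tabular}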
      Given that the data in the table have a mean value of 3.3, use a goodness of fit test, with a significance level of \(5 \%\), to investigate whether the ornithologist is justified in her belief.
    OCR MEI S4 Q4
    12 marks Standard +0.8
    4 An experiment is carried out to compare five industrial paints, A, B, C, D, E, that are intended to be used to protect exterior surfaces in polluted urban environments. Five different types of surface (I, II, III, IV, V) are to be used in the experiment, and five specimens of each type of surface are available. Five different external locations ( \(1,2,3,4,5\) ) are used in the experiment. The paints are applied to the specimens of the surfaces which are then left in the locations for a period of six months. At the end of this period, a "score" is given to indicate how effective the paint has been in protecting the surface.
    1. Name a suitable experimental design for this trial and give an example of an experimental layout. Initial analysis of the data indicates that any differences between the types of surface are negligible, as also are any differences between the locations. It is therefore decided to analyse the data by one-way analysis of variance.
    2. State the usual model, including the accompanying distributional assumptions, for the one-way analysis of variance. Interpret the terms in the model.
    3. The data for analysis are as follows. Higher scores indicate better performance. The underlying distributions of strengths are assumed to be Normal for both suppliers, with variances 2.45 for supplier A and 1.40 for supplier B.
    4. Test at the \(5 \%\) level of significance whether it is reasonable to assume that the mean strengths from the two suppliers are equal.
    5. Provide a two-sided 90\% confidence interval for the true mean difference.
    6. Show that the test procedure used in part (i), with samples of sizes 7 and 5 and a \(5 \%\) significance level, leads to acceptance of the null hypothesis of equal means if \(- 1.556 < \bar { x } - \bar { y } < 1.556\), where \(\bar { x }\) and \(\bar { y }\) are the observed sample means from suppliers A and B. Hence find the probability of a Type II error for this test procedure if in fact the true mean strength from supplier A is 2.0 units more than that from supplier B.
    7. A manager suggests that the Wilcoxon rank sum test should be used instead, comparing the median strengths for the samples of sizes 7 and 5. Give one reason why this suggestion might be sensible and two why it might not.