Standard 2×3 contingency table

A question is of this type if and only if the data form a 2-row by 3-column (or 3-row by 2-column) contingency table requiring a chi-squared test of independence with 2 degrees of freedom, with no need to combine cells.
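For reference, the standard quantities behind such a test can be written as below. The symbols \(O_{ij}\), \(E_{ij}\), \(R_i\), \(C_j\), \(N\) and \(\nu\) are notation chosen here for illustration and do not come from the questions themselves. Cells conventionally need no combining when every expected frequency is at least 5.

```latex
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
X^2 = \sum_{i} \sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
\nu = (r - 1)(c - 1) = (2 - 1)(3 - 1) = 2
```

Here \(R_i\) and \(C_j\) are the row and column totals, \(N\) is the grand total, and \(r\) and \(c\) are the numbers of rows and columns.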

36 questions · Standard +0.1

5.06a Chi-squared: contingency tables
Edexcel S3 Q4
11 marks Standard +0.3
4. A group of 40 males and 40 females were asked which of three "Reality TV" shows they liked most - Watched, Stranded or One-2-Win. The results were as follows:
\begin{tabular}{l|c|c|c}
 & Watched & Stranded & One-2-Win \\
\hline
Males & 21 & 6 & 13 \\
Females & 15 & 10 & 15 \\
\end{tabular}
Stating your hypotheses clearly, test at the \(10 \%\) level whether or not there is a significant difference in the preferences of males and females.
OCR MEI Further Statistics Minor 2019 June Q5
16 marks Standard +0.3
5 A student wants to know if there is a positive correlation between the amounts of two pollutants, sulphur dioxide and PM10 particulates, on different days in the area of London in which he lives; these amounts, measured in suitable units, are denoted by \(s\) and \(p\) respectively.
He uses a government website to obtain data for a random sample of 15 days on which the amounts of these pollutants were measured simultaneously. Fig. 5.1 is a scatter diagram showing the data. Summary statistics for these 15 values of \(s\) and \(p\) are as follows.
\(\sum s = 155.4 \quad \sum p = 518.9 \quad \sum s ^ { 2 } = 2322.7 \quad \sum p ^ { 2 } = 21270.5 \quad \sum s p = 6009.1\)
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{4a4d5816-5b53-49a1-b72f-f8bcf3b4e8bc-4_935_1134_683_260} \captionsetup{labelformat=empty} \caption{Fig. 5.1}
\end{figure}
  1. Explain why the student might come to the conclusion that a test based on Pearson's product moment correlation coefficient may be valid.
  2. Find the value of Pearson's product moment correlation coefficient.
  3. Carry out a test at the \(5 \%\) significance level to investigate whether there is positive correlation between the amounts of sulphur dioxide and PM10 particulates.
  4. Explain why the student made sure that the sample chosen was a random sample.
    The student also wishes to model the relationship between the amounts of nitrogen dioxide \(n\) and PM10 particulates \(p\). He takes a random sample of 54 values of the two variables, both measured at the same times. Fig. 5.2 is a scatter diagram which shows the data, together with the regression line of \(n\) on \(p\), the equation of the regression line and the value of \(r ^ { 2 }\).
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{4a4d5816-5b53-49a1-b72f-f8bcf3b4e8bc-5_824_1230_495_258} \captionsetup{labelformat=empty} \caption{Fig. 5.2}
    \end{figure}
  5. Predict the value of \(n\) for \(p = 150\).
  6. Discuss the reliability of your prediction in part (e).
Edexcel FS1 AS 2022 June Q1
7 marks Moderate -0.3
  1. Stuart is investigating a treatment for a disease that affects fruit trees. He has 400 fruit trees and applies the treatment to a random sample of these trees. The remainder of the trees have no treatment. He records the number of years, \(y\), that each fruit tree remains free from this disease.
The results are summarised in the table below.
\begin{tabular}{l|l|c|c}
\multicolumn{2}{c|}{} & \multicolumn{2}{c}{Treatment} \\
\cline{3-4}
\multicolumn{2}{c|}{} & Applied & Not applied \\
\hline
\multirow{3}{*}{Number of years free from this disease} & \(y < 1\) & 15 & 25 \\
\cline{2-4}
 & \(1 \leqslant y < 2\) & 35 & 61 \\
\cline{2-4}
 & \(2 \leqslant y\) & 124 & 140 \\
\end{tabular}
The data are to be used to determine whether or not there is an association between the application of the treatment and the number of years that a fruit tree remains free from this disease.
  1. Calculate the expected frequencies for
    1. Applied and \(y < 1\)
    2. Not applied and \(1 \leqslant y < 2\)
  The value of \(\sum \frac { ( O - E ) ^ { 2 } } { E }\) for the other four classes is 2.642 to 3 decimal places.
  2. Test, at the \(5 \%\) level of significance, whether or not there is an association between the application of the treatment and the number of years a fruit tree remains free from this disease. You should state your hypotheses, test statistic, critical value and conclusion clearly.
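The expected-frequency calculation this question relies on can be sketched in Python as follows. This is a minimal illustration, not a model answer: the names `observed`, `expected_frequencies`, `named_cells` and `other_four` are chosen here, and the counts are read from the table in the question. It recomputes the sum over the four cells not named in part (a), which can be compared with the 2.642 quoted in the question.

```python
# Observed counts from the table in the question.
# Rows: y < 1, 1 <= y < 2, 2 <= y. Columns: Applied, Not applied.
observed = [
    [15, 25],
    [35, 61],
    [124, 140],
]


def expected_frequencies(table):
    """Return E_ij = (row total * column total) / grand total for each cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_frequencies(observed)

# Per-cell contributions (O - E)^2 / E to the test statistic.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# The two cells named in part (a), as (row, column) indices:
# (Applied, y < 1) and (Not applied, 1 <= y < 2).
named_cells = {(0, 0), (1, 1)}

# Sum of the contributions from the remaining four cells.
other_four = sum(
    contributions[i][j]
    for i in range(len(observed))
    for j in range(len(observed[0]))
    if (i, j) not in named_cells
)

print(f"other four cells: {other_four:.3f}")
```

Adding the contributions of the two named cells to this partial sum gives the full test statistic, which is then compared with the tabulated critical value for 2 degrees of freedom.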
Edexcel FS1 AS Specimen Q1
8 marks Standard +0.3
  1. A university foreign language department carried out a survey of prospective students to find out which of three languages they were most interested in studying.
A random sample of 150 prospective students gave the following results.
\begin{tabular}{l|l|c|c|c}
\multicolumn{2}{c|}{} & \multicolumn{3}{c}{Language} \\
\cline{3-5}
\multicolumn{2}{c|}{} & French & Spanish & Mandarin \\
\hline
\multirow{2}{*}{Gender} & Male & 23 & 22 & 20 \\
\cline{2-5}
 & Female & 38 & 32 & 15 \\
\end{tabular}
A test is carried out at the \(1 \%\) level of significance to determine whether or not there is an association between gender and choice of language.
  1. State the null hypothesis for this test.
  2. Show that the expected frequency for females choosing Spanish is 30.6
  3. Calculate the test statistic for this test, stating the expected frequencies you have used.
  4. State whether or not the null hypothesis is rejected. Justify your answer.
  5. Explain whether or not the null hypothesis would be rejected if the test was carried out at the \(10 \%\) level of significance.
AQA Further AS Paper 2 Statistics 2019 June Q7
9 marks Standard +0.3
7 Mohammed is conducting a medical trial to study the effect of two drugs, \(A\) and \(B\), on the amount of time it takes to recover from a particular illness. Drug \(A\) is used by one group of 60 patients and drug \(B\) is used by a second group of 60 patients. The results are summarised in the table:
CAIE FP2 2010 November Q8
7 marks Standard +0.3
The owner of three driving schools, \(A\), \(B\) and \(C\), wished to assess whether there was an association between passing the driving test and the school attended. He selected a random sample of learner drivers from each of his schools and recorded the numbers of passes and failures at each school. The results that he obtained are shown in the table below.
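\begin{tabular}{l|c|c|c}
 & \multicolumn{3}{c}{Driving school attended} \\
 & \(A\) & \(B\) & \(C\) \\
\hline
Passes & 23 & 15 & 17 \\
Failures & 27 & 25 & 43 \\
\end{tabular}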
Using a \(\chi^2\)-test and a 5% level of significance, test whether there is an association between passing or failing the driving test and the driving school attended. [7]
Edexcel S3 2006 June Q6
11 marks Standard +0.3
A research worker studying colour preference and the age of a random sample of 50 children obtained the results shown below.
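\begin{tabular}{l|c|c|c}
Age in years & Red & Blue & Totals \\
\hline
4 & 12 & 6 & 18 \\
8 & 10 & 7 & 17 \\
12 & 6 & 9 & 15 \\
\hline
Totals & 28 & 22 & 50 \\
\end{tabular}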
Using a 5\% significance level, carry out a test to decide whether or not there is an association between age and colour preference. State your hypotheses clearly. [11]
Edexcel S3 2011 June Q3
10 marks Standard +0.3
A factory manufactures batches of an electronic component. Each component is manufactured in one of three shifts. A component may have one of two types of defect, \(D_1\) or \(D_2\), at the end of the manufacturing process. A production manager believes that the type of defect is dependent upon the shift that manufactured the component. He examines 200 randomly selected defective components and classifies them by defect type and shift. The results are shown in the table below.
\begin{tabular}{l|c|c}
 & \(D_1\) & \(D_2\) \\
\hline
First shift & 45 & 18 \\
Second shift & 55 & 20 \\
Third shift & 50 & 12 \\
\end{tabular}
Stating your hypotheses, test, at the 10\% level of significance, whether or not there is evidence to support the manager's belief. Show your working clearly. [10]
Edexcel S3 2016 June Q2
Standard +0.3
A new drug to vaccinate against influenza was given to 110 randomly chosen volunteers. The volunteers were given the drug in one of 3 different concentrations, \(A\), \(B\) and \(C\), and then were monitored to see if they caught influenza. The results are shown in the table below.
\begin{tabular}{l|c|c|c}
 & \(A\) & \(B\) & \(C\) \\
\hline
Influenza & 12 & 29 & 9 \\
No influenza & 15 & 23 & 22 \\
\end{tabular}
Test, at the 10\% level of significance, whether or not there is an association between catching influenza and the concentration of the new drug. State your hypotheses and show your working clearly. You should state your expected frequencies to 2 decimal places. (10)
Edexcel S3 Q4
11 marks Standard +0.3
A hospital administrator is assessing staffing needs for its Accident and Emergency Department at different times of day. The administrator already has data on the number of admissions at different times of day but needs to know if the proportion of the cases that are serious remains constant. Staff are asked to assess whether each person arriving at Accident and Emergency has a "minor" or "serious" problem and the results for three different time periods are shown below.
\begin{tabular}{l|c|c}
 & Minor & Serious \\
\hline
8 a.m. – 6 p.m. & 45 & 11 \\
6 p.m. – 2 a.m. & 49 & 22 \\
2 a.m. – 8 a.m. & 14 & 7 \\
\end{tabular}
Stating your hypotheses clearly, test at the 5% level of significance whether or not there is evidence of the proportion of serious injuries being different at different times of day. [11]
WJEC Further Unit 2 Specimen Q7
12 marks Moderate -0.5
The Pew Research Center's Internet Project offers scholars access to raw data sets from their research. One of the Pew Research Center's projects was on teenagers and technology. A random sample of American families was selected to complete a questionnaire. For each of their children, between and including the ages of 13 and 15, parents of these families were asked: Do you know your child's password for any of [his/her] social media accounts? Responses to this question were received from 493 families. The table below provides a summary of their responses.
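\begin{tabular}{l|c|c|c|c}
 & \multicolumn{3}{c|}{Age (years)} & \multirow{2}{*}{Total} \\
\cline{2-4}
Parent knows password & 13 & 14 & 15 & \\
\hline
Yes & 76 & 75 & 67 & 218 \\
No & 66 & 103 & 106 & 275 \\
\hline
Total & 142 & 178 & 173 & 493 \\
\end{tabular}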
  1. A test for significance is to be undertaken to see whether there is an association between whether a parent knows any of their child's social media passwords and the age of the child.
    1. Clearly state the null and alternative hypotheses.
    2. Obtain the expected value that is missing from the table below, indicating clearly how it is calculated from the data values given in the table above.
    Expected values:
    \begin{tabular}{l|c|c|c}
     & \multicolumn{3}{c}{Age (years)} \\
    \cline{2-4}
    Parent knows password & 13 & 14 & 15 \\
    \hline
    Yes & 62.79 & 78.71 & 76.50 \\
    No & & 99.29 & 96.50 \\
    \end{tabular}
    3. Obtain the two chi-squared contributions that are missing from the table below.
    Chi-squared contributions:
    \begin{tabular}{l|c|c|c}
     & \multicolumn{3}{c}{Age (years)} \\
    \cline{2-4}
    Parent knows password & 13 & 14 & 15 \\
    \hline
    Yes & & 0.175 & 1.180 \\
    No & 2.203 & & 0.935 \\
    \end{tabular}
    The following output was obtained from the statistical package that was used to undertake the analysis:
    Pearson chi-squared (2) = 7.409, \(p\)-value = 0.0305
    4. Indicate how the degrees of freedom have been calculated for the chi-squared statistic.
    5. Interpret the output obtained from the statistical test in terms of the initial hypotheses. [10]
  2. Comment on the nature of the association observed, based on the contributions to the test statistic calculated in (a). [2]
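The expected values and chi-squared contributions in this question can be reproduced with a short Python sketch. It assumes the observed counts read from the table in the question. The variable names (`observed`, `expected`, `contributions`, `statistic`, `degrees_of_freedom`) are chosen here for illustration and are not the output format of any particular statistical package.

```python
# Observed counts from the question: rows are Yes / No,
# columns are ages 13, 14 and 15.
observed = {
    "Yes": [76, 75, 67],
    "No": [66, 103, 106],
}
ages = [13, 14, 15]

row_totals = {label: sum(counts) for label, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
grand_total = sum(row_totals.values())

# Expected value under independence: row total * column total / grand total.
expected = {
    label: [row_totals[label] * c / grand_total for c in col_totals]
    for label in observed
}

# Chi-squared contribution of each cell: (O - E)^2 / E.
contributions = {
    label: [(o - e) ** 2 / e for o, e in zip(observed[label], expected[label])]
    for label in observed
}

# The test statistic is the sum of all six contributions.
statistic = sum(sum(row) for row in contributions.values())

# Degrees of freedom: (rows - 1) * (columns - 1).
degrees_of_freedom = (len(observed) - 1) * (len(col_totals) - 1)

for label in observed:
    for age, e, c in zip(ages, expected[label], contributions[label]):
        print(f"{label:>3}, age {age}: expected {e:.2f}, contribution {c:.3f}")
print(f"statistic = {statistic:.3f} on {degrees_of_freedom} degrees of freedom")
```

The printed entries can be checked against the values already given in the two tables, and the total against the Pearson chi-squared value in the package output. The cells with the largest contributions are the natural starting point for the comment asked for in the final part.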