Two-tailed test for any correlation

A question is this type if and only if it asks to test whether there is any correlation (non-zero correlation) between two variables using a two-tailed hypothesis test.

14 questions · Standard +0.0

CAIE FP2 2018 June Q6
6 marks Standard +0.3
6 A random sample of 15 observations of pairs of values of two variables gives a product moment correlation coefficient of 0.430.
  1. Test at the \(10 \%\) significance level whether there is evidence of non-zero correlation between the variables.
    A second random sample of \(N\) observations gives a product moment correlation coefficient of 0.615. Using a \(5 \%\) significance level, there is evidence of positive correlation between the variables.
  2. Find the least possible value of \(N\), justifying your answer.
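Both parts of the question above rest on critical values of the product moment correlation coefficient. The following is a minimal sketch of how such values can be reproduced without tables, assuming a bivariate Normal population so that, under \(\mathrm{H}_0: \rho = 0\), the sample coefficient has density proportional to \((1 - r^2)^{(n-4)/2}\) on \([-1, 1]\). The function names `null_density`, `upper_tail` and `critical_value` are illustrative and do not come from the question; in an exam the values would be read from the printed tables.

```python
import math


def null_density(u: float, n: int) -> float:
    """Density of the sample PMCC at u when rho = 0 (bivariate Normal, n >= 4)."""
    log_beta = math.lgamma(0.5) + math.lgamma((n - 2) / 2) - math.lgamma((n - 1) / 2)
    return max(0.0, 1.0 - u * u) ** ((n - 4) / 2) / math.exp(log_beta)


def upper_tail(r: float, n: int, steps: int = 2000) -> float:
    """P(R >= r) under H0, integrating the null density over [r, 1] by Simpson's rule."""
    h = (1.0 - r) / steps
    total = null_density(r, n) + null_density(1.0, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * null_density(r + i * h, n)
    return total * h / 3


def critical_value(n: int, tail_prob: float) -> float:
    """Value c with P(R >= c) = tail_prob under H0, found by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if upper_tail(mid, n) > tail_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Part 1: two-tailed test at 10%, so 5% in each tail, with n = 15.
crit_15 = critical_value(15, 0.05)
reject_part_1 = abs(0.430) > crit_15

# Part 2: one-tailed test at 5%; smallest n at which r = 0.615 is significant.
least_n = next(n for n in range(4, 200) if 0.615 > critical_value(n, 0.05))

print(f"n = 15 critical value: {crit_15:.4f}, reject H0: {reject_part_1}")
print(f"least N: {least_n}")
```

Because the critical value falls as \(n\) grows, the first \(n\) at which 0.615 exceeds it is the least possible sample size; the justification asked for in part 2 is the comparison with the critical values either side of that \(n\).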
OCR MEI Paper 2 Specimen Q9
4 marks Moderate -0.8
9 A geyser is a hot spring which erupts from time to time. For two geysers, the duration of each eruption, \(x\) minutes, and the waiting time until the next eruption, \(y\) minutes, are recorded.
  1. For a random sample of 50 eruptions of the first geyser, the correlation coefficient between \(x\) and \(y\) is 0.758.
    The critical value for a 2-tailed hypothesis test for correlation at the \(5 \%\) level is 0.279. Explain whether or not there is evidence of correlation in the population of eruptions.
    The scatter diagram in Fig. 9 shows the data from a random sample of 50 eruptions of the second geyser.
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e9f3a5f3-210b-453d-9ff5-8518340f5689-07_794_1298_383_251} \captionsetup{labelformat=empty} \caption{Fig. 9}
    \end{figure}
  2. Stella claims the scatter diagram shows evidence of correlation between duration of eruption and waiting time. Make two comments about Stella's claim.
OCR Further Statistics AS 2019 June Q5
7 marks Standard +0.3
5 Sixteen candidates took an examination paper in mechanics and an examination paper in statistics.
  1. For all sixteen candidates, the value of the product moment correlation coefficient \(r\) for the marks on the two papers was 0.701 correct to 3 significant figures. Test whether there is evidence, at the \(5 \%\) significance level, of association between the marks on the two papers.
  2. A teacher decided to omit the marks of the candidates who were in the top three places in mechanics and the candidates who were in the bottom three places in mechanics. The marks for the remaining 10 candidates can be summarised by \(n = 10 , \sum x = 750 , \sum y = 690 , \sum x ^ { 2 } = 57690 , \sum y ^ { 2 } = 49676 , \sum x y = 50829\).
    1. Calculate the value of \(r\) for these 10 candidates.
    2. What do the two values of \(r\), in parts (a) and (b)(i), tell you about the scores of the sixteen candidates?
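The calculation in part (b)(i) can be sketched as follows, using the usual identities \(S_{xx} = \sum x^2 - (\sum x)^2 / n\), \(S_{yy} = \sum y^2 - (\sum y)^2 / n\), \(S_{xy} = \sum xy - \sum x \sum y / n\) and \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\). The helper name `pmcc_from_sums` is hypothetical; the numbers are the summary statistics quoted in the question.

```python
import math


def pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Product moment correlation coefficient from summary statistics."""
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    return s_xy / math.sqrt(s_xx * s_yy)


# Summary statistics for the 10 remaining candidates.
r = pmcc_from_sums(10, 750, 690, 57690, 49676, 50829)
print(f"r = {r:.3f}")
```

The sign of the result matters for part (b)(ii): it can be set against the value 0.701 for all sixteen candidates.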
Edexcel S3 2024 January Q3
12 marks Standard +0.3
3. The table shows the annual tea consumption, \(t\) (kg/person), and population, \(p\) (millions), for a random sample of 7 European countries.
\begin{tabular}{|l|c|c|c|c|c|c|c|}
\hline
Country & A & B & C & D & E & F & G \\
\hline
Annual tea consumption, \(\boldsymbol { t }\) (kg/person) & 0.27 & 0.15 & 0.42 & 0.06 & 1.94 & 0.78 & 0.44 \\
\hline
Population, \(\boldsymbol { p }\) (millions) & 5.4 & 5.8 & 9 & 10.2 & 67.9 & 17.1 & 8.7 \\
\hline
\end{tabular}
$$\text { (You may use } \mathrm { S } _ { t t } = 2.486 \quad \mathrm {~S} _ { p p } = 3026.234 \quad \mathrm {~S} _ { p t } = 83.634 \text { ) }$$
Angela suggests using the product moment correlation coefficient to calculate the correlation between annual tea consumption and population.
  1. Use Angela's suggestion to test, at the \(5 \%\) level of significance, whether or not there is evidence of any correlation between annual tea consumption and population. State your hypotheses clearly and the critical value used.
    Johan suggests using Spearman's rank correlation coefficient to calculate the correlation between the rank of annual tea consumption and the rank of population.
  2. Calculate Spearman's rank correlation coefficient between the rank of annual tea consumption and the rank of population.
  3. Use Johan's suggestion to test, at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between annual tea consumption and population.
    State your hypotheses clearly and the critical value used.
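The two coefficients in this question can be sketched as below: the product moment coefficient from the given \(S\) values, and Spearman's coefficient from \(r_s = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)}\). The data lists are read off the table in the question (7 values per row), the helper names `ranks` and `spearman` are illustrative, and the rank formula assumes no tied values, which holds for these data.

```python
import math


def ranks(values):
    """Rank 1 for the smallest value upward; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman(x, y):
    """Spearman's rank correlation via the sum of squared rank differences."""
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))


tea = [0.27, 0.15, 0.42, 0.06, 1.94, 0.78, 0.44]   # countries A..G
population = [5.4, 5.8, 9, 10.2, 67.9, 17.1, 8.7]  # countries A..G

# Product moment coefficient from the S values quoted in the question.
r = 83.634 / math.sqrt(2.486 * 3026.234)
r_s = spearman(tea, population)

print(f"PMCC r = {r:.4f}")
print(f"Spearman r_s = {r_s:.4f}")
```

The gap between the two values reflects country E, whose population is far larger than the rest: it dominates the product moment coefficient but counts only as one rank in Spearman's.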
Edexcel S3 Q7
16 marks Standard +0.3
7. For one of the activities at a gymnastics competition, 8 gymnasts were awarded marks out of 10 for each of artistic performance and technical ability. The results were as follows.
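\begin{tabular}{|l|c|c|c|c|c|c|c|c|}
\hline
Gymnast & \(A\) & \(B\) & \(C\) & \(D\) & \(E\) & \(F\) & \(G\) & \(H\) \\
\hline
Technical ability & 8.5 & 8.6 & 9.5 & 7.5 & 6.8 & 9.1 & 9.4 & 9.2 \\
\hline
Artistic performance & 6.2 & 7.5 & 8.2 & 6.7 & 6.0 & 7.2 & 8.0 & 9.1 \\
\hline
\end{tabular}
The value of the product moment correlation coefficient for these data is 0.774.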
  1. Stating your hypotheses clearly and using a \(1 \%\) level of significance, interpret this value.
  2. Calculate the value of the rank correlation coefficient for these data.
  3. Stating your hypotheses clearly and using a \(1 \%\) level of significance, interpret this coefficient.
  4. Explain why the rank correlation coefficient might be the better one to use with these data.
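As a check on the quoted value 0.774, and as one possible route to part 2, the sketch below computes the product moment coefficient from the raw marks and then obtains the rank coefficient by applying the same formula to the ranks (equivalent to the \(\sum d^2\) formula when there are no ties, as here). The function names `pmcc` and `ranks` are illustrative, not taken from the question.

```python
import math


def pmcc(x, y):
    """Product moment correlation coefficient from paired raw data."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


def ranks(values):
    """Rank 1 for the smallest value; assumes no ties (true for these data)."""
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]


technical = [8.5, 8.6, 9.5, 7.5, 6.8, 9.1, 9.4, 9.2]  # gymnasts A..H
artistic = [6.2, 7.5, 8.2, 6.7, 6.0, 7.2, 8.0, 9.1]   # gymnasts A..H

r = pmcc(technical, artistic)
r_s = pmcc(ranks(technical), ranks(artistic))

print(f"PMCC r = {r:.3f}")
print(f"rank coefficient r_s = {r_s:.3f}")
```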
OCR MEI Further Statistics A AS 2019 June Q4
8 marks Moderate -0.3
4 A student is investigating correlations between various personality traits, two of which are conscientiousness and openness to new experiences.
She selects a random sample of 10 students at her university and uses standard tests to measure their conscientiousness and their openness. The product moment correlation coefficient between these two variables for the 10 students is 0.476.
  1. Assuming that the underlying population has a bivariate Normal distribution, carry out a hypothesis test at the \(10 \%\) significance level to investigate whether there is any correlation between openness and conscientiousness in students.
    Table 4.1 below shows the values of the product moment correlation coefficients between 5 different personality traits for a much larger sample of students. Those correlations that are significant at the \(5 \%\) level are denoted by a * after the value of the correlation.
    \begin{table}[h]
    \begin{tabular}{|l|c|c|c|c|c|}
    \hline
     & Neuroticism & Extroversion & Openness & Agreeableness & Conscientiousness \\
    \hline
    Neuroticism & 1 & & & & \\
    Extroversion & -0.296* & 1 & & & \\
    Openness & -0.044 & 0.405* & 1 & & \\
    Agreeableness & -0.190* & 0.061 & 0.042 & 1 & \\
    Conscientiousness & -0.485* & 0.145 & 0.235* & 0.112 & 1 \\
    \hline
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 4.1}
    \end{table}
    The student analyses these factors for effect size.
    Guidelines often used when considering effect size are given in Table 4.2 below.
    \begin{table}[h]
    \begin{tabular}{|c|c|}
    \hline
    Product moment correlation coefficient & Effect size \\
    \hline
    0.1 & Small \\
    0.3 & Medium \\
    0.5 & Large \\
    \hline
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 4.2}
    \end{table}
  2. The student notes that, despite the result of the test in part (a), the correlation between openness and conscientiousness is significant at the \(5 \%\) level with this second sample. Comment briefly on why this may be the case.
  3. The student intends to summarise her findings about relationships between these factors, including effect sizes, in a report.
    Use the information in Tables 4.1 and 4.2 to identify two summary points the student could make.
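One way to see how sample size enters the tests in this question is through the equivalent \(t\) statistic. This is a sketch only, assuming a bivariate Normal population and the standard result that, under \(\mathrm{H}_0: \rho = 0\), the statistic below has a \(t\) distribution with \(n - 2\) degrees of freedom; the symbol \(t\) does not appear in the original question.

$$t = r \sqrt { \frac { n - 2 } { 1 - r ^ { 2 } } } , \qquad \text { and for } r = 0.476 , \ n = 10 : \quad t = 0.476 \sqrt { \frac { 8 } { 1 - 0.476 ^ { 2 } } } \approx 1.53$$

The two-tailed \(10 \%\) critical value of \(t\) with 8 degrees of freedom is 1.860, so this value of \(t\) is not in the critical region. For a fixed \(r\), \(t\) grows roughly in proportion to \(\sqrt{n - 2}\), which is why a smaller coefficient such as 0.235 can still be significant when it comes from a much larger sample.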
OCR MEI Further Statistics A AS 2024 June Q5
10 marks Moderate -0.3
5 A student is investigating possible association between the amount of coffee that an adult drinks each day and the number of hours that they remain awake each day. In an initial investigation, a random sample of 8 adults is selected. The student obtains the following information from each of these adults: the amount of coffee that they drink each day and the number of hours that they remain awake each day. The student analyses the data and finds that the associated product moment correlation coefficient is 0.6030.
  1. State one assumption that must be made for a hypothesis test based on the product moment correlation coefficient to be carried out. For the remainder of this question you may assume that this assumption is true.
  2. Carry out a test at the \(5 \%\) significance level to investigate whether there is any correlation between amount of coffee drunk and number of hours awake.
    The student conducts a second investigation which is similar to the first but this time based on a random sample of 30 adults. The product moment correlation coefficient for the new data is 0.5487. The student carries out an equivalent hypothesis test to the one carried out in part (b), again using a \(5 \%\) significance level.
  3. Identify any differences between the two tests and their results. You do not need to restate the hypotheses or explain the conclusion in context.
  4. You may assume the following guidelines for considering effect size.
    \begin{tabular}{|c|c|}
    \hline
    Product moment correlation coefficient & Effect size \\
    \hline
    0.1 & Small \\
    0.3 & Medium \\
    0.5 & Large \\
    \hline
    \end{tabular}
    Explain briefly why the results of the student's second investigation are likely to be more reliable than the results of the initial investigation.
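The contrast between the two investigations can be illustrated with two-tailed p-values. The sketch below assumes the bivariate Normal condition from part (a), under which the sample coefficient has null density proportional to \((1 - r^2)^{(n-4)/2}\), and integrates that density numerically; `two_tailed_p` is a hypothetical helper name, and an exam answer would compare each coefficient with a tabulated critical value instead.

```python
import math


def two_tailed_p(r: float, n: int, steps: int = 2000) -> float:
    """P(|R| >= |r|) when rho = 0, for a bivariate Normal sample of size n >= 4."""
    log_beta = math.lgamma(0.5) + math.lgamma((n - 2) / 2) - math.lgamma((n - 1) / 2)
    norm = math.exp(log_beta)

    def density(u: float) -> float:
        return max(0.0, 1.0 - u * u) ** ((n - 4) / 2) / norm

    # Simpson's rule on [|r|, 1], doubled for the two tails.
    a = abs(r)
    h = (1.0 - a) / steps
    total = density(a) + density(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(a + i * h)
    return 2 * total * h / 3


p_first = two_tailed_p(0.6030, 8)    # initial investigation
p_second = two_tailed_p(0.5487, 30)  # second investigation

for label, p in (("n = 8", p_first), ("n = 30", p_second)):
    print(f"{label}: p = {p:.4f}, significant at 5%: {p < 0.05}")
```

The two coefficients are of similar size, so the difference in outcome comes mainly from the sample size, which is the point parts 3 and 4 ask about.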
OCR MEI Further Statistics A AS 2020 November Q2
12 marks Standard +0.3
2 A researcher is investigating the concentration of bacteria and fungi in the air in buildings. The researcher selects a random sample of 12 buildings and measures the concentrations of bacteria, \(x\), and fungi, \(y\), in the air in each building. Both concentrations are measured in the same standard units. Fig. 2 illustrates the data collected. The researcher wishes to test for a relationship between \(x\) and \(y\). \begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{ba3fcd3c-6834-4116-be0e-d5b27aed0a7e-3_595_844_513_255} \captionsetup{labelformat=empty} \caption{Fig. 2}
\end{figure}
  1. Explain why a test based on the product moment correlation coefficient is likely to be appropriate for these data.
    Summary statistics for the data are as follows.
    \(n = 12 \quad \sum x = 18030 \quad \sum y = 15550 \quad \sum x ^ { 2 } = 31458700 \quad \sum y ^ { 2 } = 21980500 \quad \sum x y = 25626800\)
  2. In this question you must show detailed reasoning. Calculate the product moment correlation coefficient between \(x\) and \(y\).
  3. Carry out a test at the \(5 \%\) significance level based on the product moment correlation coefficient to investigate whether there is any correlation between concentrations of bacteria and fungi.
  4. Explain why, in order for proper inference to be undertaken, the sample should be chosen randomly.
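The detailed reasoning asked for in part 2 might be laid out as follows. This is a sketch using the standard definitions of \(S_{xx}\), \(S_{yy}\) and \(S_{xy}\), which the question itself does not state, with the summary statistics given above.

$$\begin{aligned} S _ { x x } & = \sum x ^ { 2 } - \frac { \left( \sum x \right) ^ { 2 } } { n } = 31458700 - \frac { 18030 ^ { 2 } } { 12 } = 4368625 \\ S _ { y y } & = \sum y ^ { 2 } - \frac { \left( \sum y \right) ^ { 2 } } { n } = 21980500 - \frac { 15550 ^ { 2 } } { 12 } \approx 1830291.67 \\ S _ { x y } & = \sum x y - \frac { \sum x \sum y } { n } = 25626800 - \frac { 18030 \times 15550 } { 12 } = 2262925 \\ r & = \frac { S _ { x y } } { \sqrt { S _ { x x } S _ { y y } } } = \frac { 2262925 } { \sqrt { 4368625 \times 1830291.67 } } \approx 0.800 \end{aligned}$$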
OCR MEI Further Statistics Minor 2021 November Q4
14 marks Standard +0.3
4 A scientist is investigating sea salinity (the level of salt in the sea) in a particular area. She wishes to check whether satellite measurements, \(y\), of salinity are similar to those directly measured, \(x\). Both variables are measured in parts per thousand in suitable units. The scientist obtains a random sample of 10 values of \(x\) and the related values of \(y\). Below is a screenshot of a scatter diagram to illustrate the data. She decides to carry out a hypothesis test to check if there is any correlation between direct measurement, \(x\), and satellite measurement, \(y\). \includegraphics[max width=\textwidth, alt={}, center]{691e8b55-e9a1-4fff-b9ee-a71ff1f73ead-5_830_837_589_246}
  1. Explain why the scientist might decide to carry out a test based on the product moment correlation coefficient.
    Summary statistics for \(x\) and \(y\) are as follows.
    \(n = 10 \quad \sum x = 351.9 \quad \sum y = 350.0 \quad \sum x ^ { 2 } = 12384.5 \quad \sum y ^ { 2 } = 12251.2 \quad \sum x y = 12317.2\)
  2. In this question you must show detailed reasoning. Calculate the product moment correlation coefficient.
  3. Carry out a hypothesis test at the \(5 \%\) significance level to investigate whether there is positive correlation between directly measured and satellite measured salinity levels.
  4. Explain why it would be preferable to use a larger sample.
    The scientist is also interested in whether there is any correlation between salinity and numbers of a particular species of shrimp in the water. She takes a large sample and finds that the product moment correlation coefficient for this sample is 0.165. The result of a test based on this sample is to reject the null hypothesis and conclude that there is correlation between salinity and numbers of shrimp.
  5. Comment on the outcome of the hypothesis test with reference to the effect size of 0.165 .
OCR MEI Further Statistics Major 2023 June Q6
12 marks Standard +0.3
6 A student wonders if there is any correlation between download and upload speeds of data to and from the internet. The student decides to carry out a hypothesis test to investigate this and so measures the download speed \(x\) and upload speed \(y\) in suitable units on 20 randomly chosen occasions. The scatter diagram below illustrates the data which the student collected. \includegraphics[max width=\textwidth, alt={}, center]{c692fb20-436f-4bc1-89bd-10fdba41ceba-07_824_1411_440_246}
  1. Explain why the student decides to carry out a test based on the product moment correlation coefficient.
    Summary statistics for the 20 occasions are as follows.
    $$\sum x = 342.10 \quad \sum y = 273.65 \quad \sum x ^ { 2 } = 5989.53 \quad \sum y ^ { 2 } = 3919.53 \quad \sum x y = 4713.62$$
  2. In this question you must show detailed reasoning. Calculate the product moment correlation coefficient.
  3. Carry out a hypothesis test at the \(5 \%\) significance level to investigate whether there is any correlation between download speed and upload speed.
  4. Both of the variables, download speed and upload speed, are random. Explain why, if download speed had been a non-random variable, the student could not have carried out the hypothesis test to investigate whether there was any correlation between download speed and upload speed.
OCR MEI Further Statistics Major 2020 November Q6
10 marks Standard +0.3
6 A pollution control officer is investigating a possible link between the levels of various pollutants in the air and the speed of the wind at various sites. A random sample of 60 values of the wind-speed together with the levels of a variety of pollutants is taken at a particular site. The product moment correlation coefficient between wind-speed and nitrogen dioxide level is 0.3231.
  1. Carry out a hypothesis test at the \(10 \%\) significance level to investigate whether there is any correlation between wind-speed and nitrogen dioxide level.
  2. State the condition required for the test carried out in part (a) to be valid.
    Table 6.1 shows the values of the product moment correlation coefficient between 5 different measures of pollution and also wind-speed for a very large random sample of values at another site. Those correlations that are significant at the \(10 \%\) level are denoted by a * after the value of the correlation.
    \begin{table}[h]
    \begin{tabular}{|l|c|c|c|c|c|c|}
    \hline
    Correlations & PM10 & SPEED & \(\mathrm { NO } _ { 2 }\) & \(\mathrm { O } _ { 3 }\) & PM25 & \(\mathrm { SO } _ { 2 }\) \\
    \hline
    PM10 & 1.00 & & & & & \\
    SPEED & 0.08* & 1.00 & & & & \\
    \(\mathrm { NO } _ { 2 }\) & 0.59* & 0.25* & 1.00 & & & \\
    \(\mathrm { O } _ { 3 }\) & -0.05* & -0.04* & -0.30* & 1.00 & & \\
    PM25 & 0.85* & -0.01 & 0.56* & -0.02 & 1.00 & \\
    \(\mathrm { SO } _ { 2 }\) & 0.42* & 0.15* & 0.73* & -0.63* & 0.40* & 1.00 \\
    \hline
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 6.1}
    \end{table}
    Table 6.2 shows standard guidelines for effect sizes.
    \begin{table}[h]
    \begin{tabular}{|c|c|}
    \hline
    Product moment correlation coefficient & Effect size \\
    \hline
    0.1 & Small \\
    0.3 & Medium \\
    0.5 & Large \\
    \hline
    \end{tabular}
    \captionsetup{labelformat=empty} \caption{Table 6.2}
    \end{table}
    The officer analyses these data for effect size.
  3. Explain how the very large sample size relates to the interpretation of the correlation coefficients shown in Table 6.1.
  4. Comment briefly on what the pollution control officer might conclude from these tables, relevant to her investigation into wind-speed and pollutant levels.
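For parts 3 and 4, it can help to set significance alongside effect size for the wind-speed correlations in Table 6.1. The sketch below is illustrative only: it treats the Table 6.2 guideline values as lower bounds of bands (an assumption, since the table gives single reference values rather than ranges), and the names `effect_size` and `speed_correlations` are hypothetical.

```python
def effect_size(r: float) -> str:
    """Label |r| using the Table 6.2 guideline values as band lower bounds (assumed)."""
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "below small"


# Correlations with SPEED from Table 6.1: (value, starred as significant at 10%).
speed_correlations = {
    "PM10": (0.08, True),
    "NO2": (0.25, True),
    "O3": (-0.04, True),
    "PM25": (-0.01, False),
    "SO2": (0.15, True),
}

for pollutant, (r, significant) in speed_correlations.items():
    print(f"{pollutant}: r = {r:+.2f}, significant: {significant}, effect: {effect_size(r)}")
```

With a very large sample, even coefficients close to zero can be flagged as significant, so the effect-size label is the more informative column when judging how strongly wind-speed is related to each pollutant.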
WJEC Further Unit 2 2022 June Q2
11 marks Standard +0.3
2. An economist suggested the rate of unemployment and the rate of wage inflation are independent. Amy sets about investigating this suggestion. She collects unemployment data and wage inflation data from a random sample of regions in the UK and decides that it is appropriate to carry out a significance test on Pearson's product moment correlation coefficient. Amy's summary statistics for percentage unemployment, \(x\), and percentage wage inflation, \(y\), are shown below. $$\begin{array} { l l l } \sum x = 62 \cdot 8 & \sum y = 19 \cdot 4 & n = 10 \\ \sum x ^ { 2 } = 413 \cdot 44 & \sum y ^ { 2 } = 46 \cdot 16 & \sum x y = 113 \cdot 16 \end{array}$$
  1. Calculate Pearson's product moment correlation coefficient for these data.
  2. Carry out Amy's test at the \(5 \%\) level of significance and state whether the economist's suggestion is reasonable.
    Amy also collects unemployment data and wage inflation data from a random sample of 10 regions in Spain and calculates Pearson's product moment correlation coefficient to be \(-0.2525\).
  3. Should this change Amy's opinion on the economist's suggestion above? What could she do to improve her investigation?
  4. What assumption has Amy made in deciding that it is appropriate to carry out a significance test on Pearson's product moment correlation coefficient?
OCR Stats 1 2018 December Q11
6 marks Moderate -0.8
11 Laxmi wishes to test whether there is linear correlation between the mass and the height of adult males.
  1. State, with a reason, whether Laxmi should use a 1-tail or a 2-tail test.
    Laxmi chooses a random sample of 40 adult males and calculates Pearson's product-moment correlation coefficient, \(r\). She finds that \(r = 0.2705\).
  2. Use the table below to carry out the test at the \(5 \%\) significance level.
    \begin{table}[h]
    \captionsetup{labelformat=empty} \caption{Critical values of Pearson's product-moment correlation coefficient.}
    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
    \multirow{2}{*}{} & 1-tail test & 5\% & 2.5\% & 1\% & 0.5\% \\
     & 2-tail test & 10\% & 5\% & 2.5\% & 1\% \\
    \hline
    \multirow{4}{*}{\(n\)} & 38 & 0.2709 & 0.3202 & 0.3760 & 0.4128 \\
     & 39 & 0.2673 & 0.3160 & 0.3712 & 0.4076 \\
     & 40 & 0.2638 & 0.3120 & 0.3665 & 0.4026 \\
     & 41 & 0.2605 & 0.3081 & 0.3621 & 0.3978 \\
    \hline
    \end{tabular}
    \end{table}
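The table lookup in part 2 can be sketched as below. The dictionary copies the four rows of the table above, keyed by the 1-tail column levels; the function name `two_tail_critical` is hypothetical. The only reasoning step is that a 2-tail test at a given level uses the 1-tail column for half that level.

```python
import math

# 1-tail significance levels for the four columns of the table.
ONE_TAIL_LEVELS = (0.05, 0.025, 0.01, 0.005)

# n -> critical values, in the same column order as ONE_TAIL_LEVELS.
CRITICAL_VALUES = {
    38: (0.2709, 0.3202, 0.3760, 0.4128),
    39: (0.2673, 0.3160, 0.3712, 0.4076),
    40: (0.2638, 0.3120, 0.3665, 0.4026),
    41: (0.2605, 0.3081, 0.3621, 0.3978),
}


def two_tail_critical(n: int, level: float) -> float:
    """Critical value for a 2-tail test: use the 1-tail column for level / 2."""
    for column, tail in enumerate(ONE_TAIL_LEVELS):
        if math.isclose(tail, level / 2):
            return CRITICAL_VALUES[n][column]
    raise ValueError(f"no column for a 2-tail level of {level}")


r, n = 0.2705, 40
critical = two_tail_critical(n, 0.05)
reject_h0 = abs(r) > critical
print(f"critical value = {critical}, reject H0: {reject_h0}")
```

Reading the wrong row (n = 38 or 39) or the 1-tail 5% column would change the conclusion here, so the choice of row and column is the step worth stating explicitly in an answer.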
WJEC Unit 4 Specimen Q5
7 marks Moderate -0.3
5. A hotel owner in Cardiff is interested in what factors hotel guests think are important when staying at a hotel. From a hotel booking website he collects the ratings for 'Cleanliness', 'Location', 'Comfort' and 'Value for money' for a random sample of 17 Cardiff hotels.
(Each rating is the average of all scores awarded by guests who have contributed reviews using a scale from 1 to 10 , where 10 is 'Excellent'.) The scatter graph shows the relationship between 'Value for money' and 'Cleanliness' for the sample of Cardiff hotels. \includegraphics[max width=\textwidth, alt={}, center]{b35e94ab-a426-4fca-9ecb-c659e0143ed7-4_693_1033_749_516}
  1. The product moment correlation coefficient for 'Value for money' and 'Cleanliness' for the sample of 17 Cardiff hotels is 0.895. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether this correlation is significant. State your conclusion in context.
  2. The hotel owner also wishes to investigate whether 'Value for money' has a significant correlation with 'Cost per night'. He used a statistical analysis package which provided the following output which includes the Pearson correlation coefficient of interest and the corresponding \(p\)-value.
    \begin{tabular}{|l|c|c|}
    \hline
     & Value for money & Cost per night \\
    \hline
    Value for money & 1 & \\
    \hline
    Cost per night & 0.047 \(( 0.859 )\) & 1 \\
    \hline
    \end{tabular}
    Comment on the correlation between 'Value for money' and 'Cost per night'.