Hypothesis test of Pearson’s product-moment correlation coefficient

62 questions · 18 question types identified

Calculate PMCC from summary statistics

A question is this type if and only if it asks to calculate Pearson's product moment correlation coefficient given summary statistics (Σx, Σy, Σx², Σy², Σxy, n) or Sxx, Syy, Sxy.

19 Standard +0.1
30.6% of questions
Example
1 A wildlife expert measured the neck lengths, \(x\) metres, and the tail lengths, \(y\) metres, of a sample of 12 mature male giraffes as part of a study into their physical characteristics. The results are shown in the table.
Easiest question: Moderate -0.8
1 A wildlife expert measured the neck lengths, \(x\) metres, and the tail lengths, \(y\) metres, of a sample of 12 mature male giraffes as part of a study into their physical characteristics. The results are shown in the table.
Hardest question: Standard +0.3
3 A student is investigating the relationship between the length \(x \mathrm {~mm}\) and circumference \(y \mathrm {~mm}\) of plums from a large crop. The student measures the dimensions of a random sample of 10 plums from this crop. Summary statistics for these dimensions are as follows. $$\begin{aligned} & \sum x = 4715 \quad \sum y = 13175 \quad \sum x ^ { 2 } = 2237725 \\ & \sum y ^ { 2 } = 17455825 \quad \sum x y = 6235575 \quad n = 10 \end{aligned}$$
  1. Calculate the sample product moment correlation coefficient.
  2. Carry out a hypothesis test at the \(5 \%\) significance level to determine whether there is any correlation between length and circumference of plums from this crop. State your hypotheses clearly, defining any symbols which you use.
  3. (A) Explain the meaning of a 5\% significance level.
    (B) State one advantage and one disadvantage of using a \(1 \%\) significance level rather than a \(5 \%\) significance level in a hypothesis test.
    The student decides to take another random sample of 10 plums. Using the same hypotheses as in part 2, the correlation coefficient for this second sample is significant at the \(5 \%\) level. The student decides to ignore the first result and concludes that there is correlation between the length and circumference of plums in the crop.
  4. Comment on the student's decision to ignore the first result. Suggest a better way in which the student could proceed.
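The calculation asked for in part 1 of the plum question can be sketched in a few lines. This is a minimal illustration, not a mark scheme: the function name `pmcc_from_sums` is hypothetical, and it uses the standard identities \(S_{xx} = \sum x^2 - (\sum x)^2/n\), \(S_{yy} = \sum y^2 - (\sum y)^2/n\), \(S_{xy} = \sum xy - \sum x \sum y / n\) and \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\).

```python
from math import sqrt


def pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Return (Sxx, Syy, Sxy, r) computed from raw summary statistics."""
    sxx = sum_x2 - sum_x ** 2 / n
    syy = sum_y2 - sum_y ** 2 / n
    sxy = sum_xy - sum_x * sum_y / n
    r = sxy / sqrt(sxx * syy)
    return sxx, syy, sxy, r


# Summary statistics quoted in the plum question above (n = 10).
sxx, syy, sxy, r = pmcc_from_sums(
    n=10, sum_x=4715, sum_y=13175,
    sum_x2=2237725, sum_y2=17455825, sum_xy=6235575,
)
print(f"Sxx = {sxx}, Syy = {syy}, Sxy = {sxy}")
print(f"r = {r:.4f}")
```

With these inputs the intermediate values are \(S_{xx} = 14602.5\), \(S_{yy} = 97762.5\) and \(S_{xy} = 23562.5\), giving \(r \approx 0.624\).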
One-tailed test for positive correlation

A question is this type if and only if it asks to test whether there is positive correlation between two variables using a one-tailed hypothesis test with H₁: ρ > 0.

17 Standard +0.3
27.4% of questions
Example
2 A shopper estimates the cost, \(\pounds X\) per item, of each of 12 items in a supermarket. The shopper's estimates are compared with the actual cost, \(\pounds Y\) per item, of each item. The results are summarised as follows. $$\begin{aligned} & n = 12 \quad \sum x = 399 \quad \sum y = 623.88 \\ & \sum x ^ { 2 } = 28127 \quad \sum y ^ { 2 } = 116509.0212 \quad \sum x y = 45006.01 \end{aligned}$$ Test at the 1\% significance level whether the shopper's estimates are positively correlated with the actual cost of the items.
Easiest question: Moderate -0.3
1 The best performances of a random sample of 20 junior athletes in the long jump, \(x\) metres, and in the high jump, \(y\) metres, were recorded. The following statistics were calculated from the results. $$S _ { x x } = 7.0036 \quad S _ { y y } = 0.8464 \quad S _ { x y } = 1.3781$$
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    (2 marks)
  2. Assuming that these data come from a bivariate normal distribution, investigate, at the \(1 \%\) level of significance, the claim that for junior athletes there is a positive correlation between \(x\) and \(y\).
  3. Interpret your conclusion in the context of this question.
Hardest question: Standard +0.8
9 The land areas \(x\) (in suitable units) and populations \(y\) (in millions) for a sample of 8 randomly chosen cities are given in the following table.
$$\begin{array}{l|cccccccc} \text{Land area } (x) & 1.0 & 4.5 & 2.4 & 1.6 & 3.8 & 8.6 & 7.5 & 6.5 \\ \text{Population } (y) & 0.8 & 8.4 & 4.2 & 1.6 & 2.2 & 10.2 & 4.2 & 5.2 \end{array}$$
$$\left[ \Sigma x = 35.9 , \Sigma x ^ { 2 } = 216.47 , \Sigma y = 36.8 , \Sigma y ^ { 2 } = 244.96 , \Sigma x y = 212.62 . \right]$$
  1. Find, showing all necessary working, the value of the product moment correlation coefficient for this sample.
  2. Using a \(1 \%\) significance level, test whether there is positive correlation between land area and population of cities.
    The land areas and populations for another randomly chosen sample of cities, this time of size \(n\), give a product moment correlation coefficient of 0.651. Using a test at the \(1 \%\) significance level, there is evidence of non-zero correlation between the variables.
  3. Find the least possible value of \(n\), justifying your answer.
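In an exam, the final part of the question above is answered by reading down a table of critical values. As a cross-check, the same search can be sketched numerically. The sketch below is an assumption-laden illustration: it assumes bivariate normal data, uses the null density of the sample coefficient, \((1 - u^2)^{(n-4)/2} / B(\tfrac12, \tfrac{n-2}{2})\), and integrates it with Simpson's rule. The names `upper_tail` and `least_n` are hypothetical.

```python
from math import gamma


def upper_tail(r, n, steps=2000):
    """P(R >= r) for the sample PMCC R when rho = 0 (bivariate normal data).

    Integrates the null density over [r, 1] with Simpson's rule.
    Requires n >= 4 and an even number of steps.
    """
    power = (n - 4) / 2
    beta = gamma(0.5) * gamma((n - 2) / 2) / gamma((n - 1) / 2)
    h = (1.0 - r) / steps

    def density(u):
        # max() guards against tiny negative values caused by rounding near u = 1.
        return max(0.0, 1.0 - u * u) ** power / beta

    total = density(r) + density(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(r + i * h)
    return total * h / 3


def least_n(r, alpha, two_tailed, n_max=100):
    """Smallest n (searching from 4 upwards) at which r is significant at level alpha."""
    for n in range(4, n_max + 1):
        p = upper_tail(abs(r), n)
        if two_tailed:
            p *= 2
        if p < alpha:
            return n
    return None


# Second sample in the question above: r = 0.651, 1% level, non-zero correlation.
print(least_n(0.651, 0.01, two_tailed=True))
```

For these values the search should return 15. The search starts at \(n = 4\) because the density has an endpoint singularity for \(n = 3\) that this simple integrator does not handle.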
Two-tailed test for any correlation

A question is this type if and only if it asks to test whether there is any correlation (non-zero correlation) between two variables using a two-tailed hypothesis test.

14 Standard +0.0
22.6% of questions
Example
6 A random sample of 15 observations of pairs of values of two variables gives a product moment correlation coefficient of 0.430.
  1. Test at the \(10 \%\) significance level whether there is evidence of non-zero correlation between the variables.
    A second random sample of \(N\) observations gives a product moment correlation coefficient of 0.615. Using a 5\% significance level, there is evidence of positive correlation between the variables.
  2. Find the least possible value of \(N\), justifying your answer.
Easiest question: Easy -1.2
3. Laxmi wishes to test whether there is linear correlation between the mass and the height of adult males.
  1. State, with a reason, whether Laxmi should use a 1-tail or a 2-tail test.
    Laxmi chooses a random sample of 40 adult males and calculates Pearson's product-moment correlation coefficient, \(r\). She finds that \(r = 0.2705\).
  2. Use the table below to carry out the test at the \(5 \%\) significance level.
    Critical values of Pearson's product-moment correlation coefficient.
    $$\begin{array}{c|cccc} \text{1-tail test} & 5\% & 2.5\% & 1\% & 0.5\% \\ \text{2-tail test} & 10\% & 5\% & 2\% & 1\% \\ \hline n = 38 & 0.2709 & 0.3202 & 0.3760 & 0.4128 \\ n = 39 & 0.2673 & 0.3160 & 0.3712 & 0.4076 \\ n = 40 & 0.2638 & 0.3120 & 0.3665 & 0.4026 \\ n = 41 & 0.2605 & 0.3081 & 0.3621 & 0.3978 \end{array}$$
Hardest question: Standard +0.3
6 A random sample of 15 observations of pairs of values of two variables gives a product moment correlation coefficient of 0.430.
  1. Test at the \(10 \%\) significance level whether there is evidence of non-zero correlation between the variables.
    A second random sample of \(N\) observations gives a product moment correlation coefficient of 0.615. Using a 5\% significance level, there is evidence of positive correlation between the variables.
  2. Find the least possible value of \(N\), justifying your answer.
One-tailed test for negative correlation

A question is this type if and only if it asks to test whether there is negative correlation between two variables using a one-tailed hypothesis test with H₁: ρ < 0.

4 Standard +0.0
6.5% of questions
Example
5 For a random sample of 12 observations of pairs of values \(( x , y )\), the product moment correlation coefficient is \(-0.456\). Test, at the \(5 \%\) significance level, whether there is evidence of negative correlation between the variables.
Describe correlation from scatter diagram

A question is this type if and only if it shows a scatter diagram and asks to describe the correlation (strength and direction) without calculation.

3 Moderate -0.4
4.8% of questions
Example
  1. A random sample of 15 days is taken from the large data set for Perth in June and July 1987. The scatter diagram in Figure 1 displays the values of two of the variables for these 15 days.
[Figure 1: scatter diagram of two of the variables for these 15 days]
  1. Describe the correlation.
    The variable on the \(x\)-axis is Daily Mean Temperature measured in \({ } ^ { \circ } \mathrm { C }\).
  2. Using your knowledge of the large data set,
    1. suggest which variable is on the \(y\)-axis,
    2. state the units that are used in the large data set for this variable.
    Stav believes that there is a correlation between Daily Total Sunshine and Daily Maximum Relative Humidity at Heathrow. He calculates the product moment correlation coefficient between these two variables for a random sample of 30 days and obtains \(r = - 0.377\).
  3. Carry out a suitable test to investigate Stav's belief at a \(5 \%\) level of significance. State clearly
    • your hypotheses
    • your critical value
    On a random day at Heathrow the Daily Maximum Relative Humidity was 97\%
  4. Comment on the number of hours of sunshine you would expect on that day, giving a reason for your answer.
Interpret p-value for correlation test

A question is this type if and only if it provides a p-value and asks to interpret it or use it to reach a conclusion about correlation.

1 Moderate -0.5
1.6% of questions
Example
13 The pre-release material contains information concerning median house prices, recycling rates and employment rates. Fig. 13.1 shows a scatter diagram of recycling rate against employment rate for a random sample of 33 regions.
[Fig. 13.1: scatter diagram of recycling rate against employment rate]
The product moment correlation coefficient for this sample is 0.37154 and the associated \(p\)-value is 0.033. Lee conducts a hypothesis test at the \(5 \%\) level to test whether there is any evidence to suggest there is positive correlation between recycling rate and employment rate. He concludes that there is no evidence to suggest positive correlation because \(0.033 \approx 0\) and \(0.37154 > 0.05\).
  1. Explain whether Lee's reasoning is correct.
    Fig. 13.2 shows a scatter diagram of recycling rate against median house price for a random sample of 33 regions.
    [Fig. 13.2: scatter diagram of recycling rate against median house price]
    The product moment correlation coefficient for this sample is \(-0.33278\) and the associated \(p\)-value is 0.058. Fig. 13.3 shows summary statistics for the median house prices for the data in this sample.
    $$\begin{array}{l|r} \text{Statistics} & \\ \hline n & 33 \\ \text{Mean} & 465467.9697 \\ \sigma & 201236.1345 \\ s & 204356.2606 \\ \Sigma x & 15360443 \\ \Sigma x ^ { 2 } & 8486161617387 \\ \text{Min} & 243500 \\ \text{Q1} & 342500 \\ \text{Median} & 410000 \\ \text{Q3} & 521000 \\ \text{Max} & 1200000 \end{array}$$
    Fig. 13.3
  2. Use the information in Fig. 13.3 and Fig. 13.2 to show that there are at least two outliers.
  3. Describe the effect of removing the outliers on
    • the product moment correlation coefficient between recycling rate and median house price,
    • the \(p\)-value associated with this correlation coefficient,
      in each case explaining your answer.
      [2]
    All 33 items in the sample are areas in London. A student suggests that it is very unlikely that only areas in London would be selected in a random sample.
  4. Use your knowledge of the pre-release material to explain whether you think the student's suggestion is reasonable.
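The \(p\)-values quoted in this question can be approximately reproduced from \(r\) and \(n\) alone. The sketch below is illustrative only. It assumes bivariate normal data, converts \(r\) to the statistic \(t = r\sqrt{(n-2)/(1-r^2)}\) with \(n - 2\) degrees of freedom, and evaluates the Student \(t\) tail by Simpson's rule after the substitution \(T = \sqrt{\nu}\tan\varphi\). The function name `two_sided_p_value` is hypothetical.

```python
from math import atan, cos, gamma, pi, sqrt


def two_sided_p_value(r, n, steps=2000):
    """Two-sided p-value for H0: rho = 0, assuming bivariate normal data."""
    df = n - 2
    t = abs(r) * sqrt(df / (1.0 - r * r))
    # With T = sqrt(df) * tan(phi), P(T > t) equals the integral of
    # cos(phi)^(df - 1) / B(1/2, df/2) over phi from atan(t / sqrt(df)) to pi / 2.
    beta = gamma(0.5) * gamma(df / 2) / gamma((df + 1) / 2)
    lower = atan(t / sqrt(df))
    h = (pi / 2 - lower) / steps

    def integrand(phi):
        return max(0.0, cos(phi)) ** (df - 1)

    total = integrand(lower) + integrand(pi / 2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(lower + i * h)
    one_sided = total * h / (3 * beta)
    return 2 * one_sided


# Values quoted in the question above; both samples have n = 33.
for label, r in [("employment rate", 0.37154), ("median house price", -0.33278)]:
    print(f"{label}: r = {r}, two-sided p = {two_sided_p_value(r, 33):.4f}")
```

Under these assumptions the results come out at roughly 0.033 and 0.058, which suggests the quoted \(p\)-values are two-tailed. On that reading, a one-tailed test for positive correlation would use half of 0.033. In either case the comparison that decides the test is the \(p\)-value against 0.05, not the \(p\)-value against 0 or \(r\) against 0.05.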
Use critical value table directly

A question is this type if and only if it provides a table of critical values and asks to carry out a hypothesis test by comparing the calculated r to the critical value.

1 Moderate -0.8
1.6% of questions
Example
10 A researcher plans to carry out a statistical investigation to test whether there is linear correlation between the time ( \(T\) weeks) from conception to birth, and the birth weight ( \(W\) grams) of new-born babies.
  1. Explain why a 1-tail test is appropriate in this context.
    The researcher records the values of \(T\) and \(W\) for a random sample of 11 babies. They calculate Pearson's product-moment correlation coefficient for the sample and find that the value is 0.722.
  2. Use the table below to carry out the test at the \(1 \%\) significance level.
    Critical values of Pearson's product-moment correlation coefficient.
    $$\begin{array}{c|cccc} \text{1-tail test} & 5\% & 2.5\% & 1\% & 0.5\% \\ \text{2-tail test} & 10\% & 5\% & 2\% & 1\% \\ \hline n = 10 & 0.5494 & 0.6319 & 0.7155 & 0.7646 \\ n = 11 & 0.5214 & 0.6021 & 0.6851 & 0.7348 \\ n = 12 & 0.4973 & 0.5760 & 0.6581 & 0.7079 \\ n = 13 & 0.4762 & 0.5529 & 0.6339 & 0.6835 \end{array}$$
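The table-lookup step can be sketched as a small comparison routine. The critical values below are copied from the table in this question. The function name `one_tail_test` and the choice of a positive alternative for the birth-weight data are assumptions made for illustration.

```python
# Critical values copied from the table above, keyed by sample size n.
# Columns are the 1-tail significance levels 5%, 2.5%, 1%, 0.5%.
ONE_TAIL_LEVELS = (0.05, 0.025, 0.01, 0.005)
CRITICAL_VALUES = {
    10: (0.5494, 0.6319, 0.7155, 0.7646),
    11: (0.5214, 0.6021, 0.6851, 0.7348),
    12: (0.4973, 0.5760, 0.6581, 0.7079),
    13: (0.4762, 0.5529, 0.6339, 0.6835),
}


def one_tail_test(r, n, level, direction):
    """Return (critical value, reject H0?) for H1: rho > 0 or H1: rho < 0."""
    critical = CRITICAL_VALUES[n][ONE_TAIL_LEVELS.index(level)]
    if direction == "positive":
        return critical, r > critical
    if direction == "negative":
        return critical, r < -critical
    raise ValueError("direction must be 'positive' or 'negative'")


# Birth-weight question: n = 11, r = 0.722, 1-tail test at the 1% level.
print(one_tail_test(0.722, 11, 0.01, "positive"))   # (0.6851, True)
# The n = 12 row also covers the earlier example with r = -0.456 at the 5% level.
print(one_tail_test(-0.456, 12, 0.05, "negative"))  # (0.4973, False)
```

In the first case \(r\) exceeds the critical value, so \(H_0: \rho = 0\) is rejected. In the second, \(-0.456\) does not fall below \(-0.4973\), so there is insufficient evidence of negative correlation.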
Compare PMCC with Spearman's rank

A question is this type if and only if it asks to test both product moment and Spearman's rank correlation coefficients and compare results or explain which is more appropriate.

1 Standard +0.3
1.6% of questions
Example
2. A random sample of 8 students sat examinations in Geography and Statistics. The product moment correlation coefficient between their results was 0.572 and the Spearman rank correlation coefficient was 0.655 .
  1. Test both of these values for positive correlation. Use a \(5 \%\) level of significance.
  2. Comment on your results.
Comment on causation vs correlation

A question is this type if and only if it asks to comment on a claim about causation or to explain why correlation does not imply causation.

1 Moderate -0.5
1.6% of questions
Example
10
  1. (i) State appropriate hypotheses for Shona to use in her test.
    (ii) Determine if there is sufficient evidence to reject the null hypothesis.
    Fully justify your answer.
    [1 mark]
  2. Shona's teacher tells her to remove calculation \(D\) from the table as it is incorrect.
    Explain how the teacher knew it was incorrect.
    [1 mark]
  3. Before performing calculation B, Shona cleaned the data. She removed all cars from the Large Data Set that had incorrect masses. Using your knowledge of the large data set, explain what was incorrect about the masses which were removed from the calculation.
    [1 mark]
  4. Apart from CO₂ and CO emissions, state one other type of emission that Shona could investigate using the Large Data Set.
  5. Wesley claims that calculation C shows that a heavier car causes higher CO₂ emissions. Give two reasons why Wesley's claim may be incorrect.
Identify outliers and their effect

A question is this type if and only if it asks to identify outliers from a scatter diagram or discuss how removing outliers affects correlation or test conclusions.

0
0.0% of questions
Effect of coding on correlation

A question is this type if and only if it asks whether the correlation coefficient would change under linear transformations or coding of variables.

0
0.0% of questions
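As a small illustration of this question type, the sketch below checks numerically that linear coding leaves the coefficient unchanged when both scale factors are positive, and only flips its sign when one scale factor is negative. The data are the land areas and populations from the cities question quoted earlier on this page. The codings themselves and the helper name `pmcc` are hypothetical.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient of paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Land areas and populations of the 8 cities from the question quoted earlier.
land_area = [1.0, 4.5, 2.4, 1.6, 3.8, 8.6, 7.5, 6.5]
population = [0.8, 8.4, 4.2, 1.6, 2.2, 10.2, 4.2, 5.2]

r = pmcc(land_area, population)

# Hypothetical codings, chosen only for illustration.
coded_area = [(x - 1.0) / 0.1 for x in land_area]  # positive scale factor
coded_pop = [10 * y + 3 for y in population]       # positive scale factor
flipped_pop = [5 - 2 * y for y in population]      # negative scale factor

print(f"original:          {r:.4f}")
print(f"both coded:        {pmcc(coded_area, coded_pop):.4f}")
print(f"one variable flip: {pmcc(land_area, flipped_pop):.4f}")
```

The first two printed values should agree up to rounding, and the third should be their negative.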
Justify one-tailed vs two-tailed choice

A question is this type if and only if it asks to state with a reason whether a one-tailed or two-tailed test should be used in a given context.

0
0.0% of questions
Explain why PMCC test appropriate

A question is this type if and only if it asks to explain why a test based on product moment correlation coefficient is appropriate or suitable for given data.

0
0.0% of questions
Comment on scatter diagram validity

A question is this type if and only if it asks to use a scatter diagram to check whether the bivariate normal assumption is likely to be valid or to comment on appropriateness of the test.

0
0.0% of questions
Find minimum sample size for significance

A question is this type if and only if it asks to find the least/minimum value of n (sample size) given a correlation coefficient value and significance level.

0
0.0% of questions
Interpret PMCC value contextually

A question is this type if and only if it asks to interpret what a calculated correlation coefficient value tells you about the relationship or scatter diagram appearance.

0
0.0% of questions
Interpret significance level meaning

A question is this type if and only if it asks to explain the meaning of a significance level (e.g., 5%) or compare advantages/disadvantages of different significance levels.

0
0.0% of questions
State distributional assumption for test

A question is this type if and only if it asks to state the assumption required for the correlation test to be valid (bivariate normal distribution).

0
0.0% of questions