Hypothesis test for correlation

A question is this type if and only if it asks to perform a formal hypothesis test to determine if correlation is significant (positive, negative, or non-zero).

8 questions · Moderate -0.1

CAIE Further Paper 4 2020 Specimen Q4
7 marks Standard +0.8
4 The number, \(x\), of a certain type of sea shell was counted at 60 randomly chosen sites, each one metre square, along the coastline in country \(A\). The number, \(y\), of the same type of sea shell was counted at 50 randomly chosen sites, each one metre square, along the coastline in country \(B\). The results are summarised as follows, where \(\bar{x}\) and \(\bar{y}\) denote the sample means of \(x\) and \(y\) respectively. $$\bar{x} = 29.2 \quad \Sigma(x - \bar{x})^{2} = 4341.6 \quad \bar{y} = 24.4 \quad \Sigma(y - \bar{y})^{2} = 3732.0$$ Find a \(95\%\) confidence interval for the difference between the mean number of sea shells, per square metre, on the coastlines in country \(A\) and in country \(B\).
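A minimal sketch of one way to obtain the interval from the summary statistics, assuming both samples are large enough for a normal approximation and that the two variances are estimated separately rather than pooled. The function name `mean_difference_ci` and its parameters are illustrative and not part of the question.

```python
from math import sqrt
from statistics import NormalDist


def mean_difference_ci(n_x, mean_x, sxx, n_y, mean_y, syy, level=0.95):
    """Large-sample confidence interval for mu_x - mu_y.

    sxx and syy are the sums of squared deviations about each sample mean.
    Assumption: normal approximation, variances estimated separately.
    """
    var_x = sxx / (n_x - 1)  # unbiased variance estimate for x
    var_y = syy / (n_y - 1)  # unbiased variance estimate for y
    std_error = sqrt(var_x / n_x + var_y / n_y)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    diff = mean_x - mean_y
    return diff - z * std_error, diff + z * std_error


lower, upper = mean_difference_ci(60, 29.2, 4341.6, 50, 24.4, 3732.0)
print(f"95% CI for the difference in means: ({lower:.2f}, {upper:.2f})")
```

Because the whole interval lies above zero under these assumptions, it points to a higher mean count in country \(A\).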
CAIE FP2 2013 November Q10
11 marks Standard +0.3
10 The lengths, \(x \mathrm{~m}\), and masses, \(y \mathrm{~kg}\), of 12 randomly chosen babies born at a particular hospital last year are summarised as follows. $$\Sigma x = 7.50 \quad \Sigma x^{2} = 4.73 \quad \Sigma y = 38.6 \quad \Sigma y^{2} = 124.84 \quad \Sigma xy = 24.25$$ Find the value of the product moment correlation coefficient for this sample. Obtain an estimate for the mass of a baby, born last year at the hospital, whose length is 0.64 m. Test, at the \(2\%\) significance level, whether there is non-zero correlation between the two variables.
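The three requests in this question can be sketched numerically as follows. This is only an illustration under stated assumptions: the estimate uses the regression line of \(y\) on \(x\), and the test uses the \(t\)-statistic form of the correlation test with a critical value taken from standard \(t\) tables (10 degrees of freedom, upper 1% tail), rather than a table of critical values for \(r\). The helper name `corrected_sums` is hypothetical.

```python
from math import sqrt


def corrected_sums(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy):
    """Return S_xx, S_yy and S_xy from raw summary sums."""
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    return s_xx, s_yy, s_xy


n = 12
sum_x, sum_y = 7.50, 38.6
s_xx, s_yy, s_xy = corrected_sums(n, sum_x, 4.73, sum_y, 124.84, 24.25)

# Product moment correlation coefficient
r = s_xy / sqrt(s_xx * s_yy)

# Regression line of y on x (assumed, since mass is estimated from length)
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n
mass_at_064 = a + b * 0.64

# Two-tailed test at the 2% level: H0 rho = 0 against H1 rho != 0
t_stat = r * sqrt((n - 2) / (1 - r ** 2))
T_CRIT = 2.764  # from t tables, 10 d.f., 1% in each tail (assumed table value)
reject_h0 = abs(t_stat) > T_CRIT

print(f"r = {r:.3f}, estimated mass = {mass_at_064:.2f} kg")
print(f"t = {t_stat:.2f}, reject H0: {reject_h0}")
```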
OCR MEI AS Paper 2 2021 November Q9
5 marks Moderate -0.5
9 Arun, Beth and Charlie are investigating whether there is any association between death rate per 1000 and physician density per 1000. They each collect a random sample of size 10. Arun's sample is shown in Fig. 9.1.

| | death rate per 1000 | physician density per 1000 |
|---|---|---|
| Canberra | 7.2 | 3.62 |
| Dhaka | 5.3 | 0.49 |
| Brasilia | 6.8 | 2.23 |
| Yaounde | 9.3 | 0.08 |
| Zagreb | 12.5 | 3.08 |
| Tehran | 5.4 | 1.16 |
| Rome | 10.7 | 4.14 |
| Tripoli | 3.8 | 2.09 |
| Oslo | 7.9 | 4.51 |
| Abuja | 9.7 | 0.35 |

Fig. 9.1

  1. Explain whether or not Arun collected his data from the pre-release material, or whether it is not possible to say.

    Beth and Charlie collected their samples from the pre-release material. Each of them drew a scatter diagram for their samples. The samples and scatter diagrams are shown in Figs. 9.2 and 9.3.

    | Beth's sample | death rate per 1000 | physician density per 1000 |
    |---|---|---|
    | Sudan | 6.7 | 0.41 |
    | Cambodia | 7.4 | 0.17 |
    | Gabon | 6.2 | 0.36 |
    | Seychelles | 7 | 0.95 |
    | Mexico | 5.4 | 2.25 |
    | Kuwait | 2.3 | 2.58 |
    | Haiti | 7.5 | 0.23 |
    | Maldives | 4 | 1.04 |
    | Nauru | 5.9 | 1.24 |
    | Jordan | 3.4 | 2.34 |

    \includegraphics[max width=\textwidth, alt={}]{2b9ce212-84e2-4817-be94-98e2adff12a3-08_545_1024_340_918}

    | Charlie's sample | death rate per 1000 | physician density per 1000 |
    |---|---|---|
    | Vanuata | 4 | 0.17 |
    | Solomon Islands | 3.8 | 0.2 |
    | N. Mariana Islands | 4.9 | 0.36 |
    | Nauru | 5.9 | 1.24 |
    | United Kingdom | 9.4 | 2.81 |
    | Portugal | 10.6 | 3.34 |
    | North Macedonia | 9.6 | 2.87 |
    | Faroe Islands | 8.8 | 2.62 |
    | Bulgaria | 14.5 | 3.99 |
    | St. Kitts and Nevis | 7.2 | 2.52 |

    Fig. 9.3

    \begin{figure}[h]
    \captionsetup{labelformat=empty} \caption{Fig. 9.2} \includegraphics[alt={},max width=\textwidth]{2b9ce212-84e2-4817-be94-98e2adff12a3-08_572_899_1400_1041}
    \end{figure}

    Arun states that Charlie's sample and Beth's sample cannot both be random for the following reasons.

    Kofi collects a sample of 10 African countries and 10 European countries. The scatter diagram for his results is shown in Fig. 9.4.

    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{2b9ce212-84e2-4817-be94-98e2adff12a3-09_485_903_902_260} \captionsetup{labelformat=empty} \caption{Fig. 9.4}
    \end{figure}
  2. On the copy of Fig. 9.4 in the Printed Answer Booklet, use your knowledge of the pre-release material to identify the points representing the 10 European countries, justifying your choice.
Edexcel Paper 3 2018 June Q2
7 marks Standard +0.3
  1. Tessa owns a small clothes shop in a seaside town. She records the weekly sales figures, \(\pounds w\), and the average weekly temperature, \(t ^ { \circ } \mathrm { C }\), for 8 weeks during the summer.
    The product moment correlation coefficient for these data is \(-0.915\)
    1. Stating your hypotheses clearly and using a \(5 \%\) level of significance, test whether or not the correlation between sales figures and average weekly temperature is negative.
    2. Suggest a possible reason for this correlation.
    Tessa suggests that a linear regression model could be used to model these data.
  2. State, giving a reason, whether or not the correlation coefficient is consistent with Tessa's suggestion.
  3. State, giving a reason, which variable would be the explanatory variable.
    Tessa calculated the linear regression equation as \(w = 10755 - 171 t\)
  4. Give an interpretation of the gradient of this regression equation.
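One way to set out the significance test in this question is sketched below. It assumes the usual bivariate normal model and uses the \(t\)-statistic form of the test instead of a table of critical values for \(r\); the symbols \(\rho\), \(n\) and \(t\), and the quoted \(t\)-table value, are not part of the question.

$$H_0: \rho = 0 \qquad H_1: \rho < 0$$

$$t = r\sqrt{\frac{n-2}{1-r^{2}}} = -0.915\sqrt{\frac{8-2}{1-0.915^{2}}} \approx -5.56$$

With \(8 - 2 = 6\) degrees of freedom, the lower \(5\%\) point of the \(t\) distribution is about \(-1.943\). Since \(-5.56 < -1.943\), \(H_0\) is rejected, giving evidence at the \(5\%\) level that the correlation between sales and temperature is negative.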
Edexcel S3 2005 June Q4
13 marks Standard +0.3
Over a period of time, researchers took 10 blood samples from one patient with a blood disease. For each sample, they measured the levels of serum magnesium, \(s\) mg/dl, in the blood and the corresponding level of the disease protein, \(d\) mg/dl. The results are shown in the table.
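| \(s\) | 1.2 | 1.9 | 3.2 | 3.9 | 2.5 | 4.5 | 5.7 | 4.0 | 1.1 | 5.9 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(d\) | 3.8 | 7.0 | 11.0 | 12.0 | 9.0 | 12.0 | 13.5 | 12.2 | 2.0 | 13.9 |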
[Use \(\sum s^2 = 141.51\), \(\sum d^2 = 1081.74\) and \(\sum sd = 386.32\)]
  1. Draw a scatter diagram to represent these data. [3]
  2. State what is measured by the product moment correlation coefficient. [1]
  3. Calculate \(S_{ss}\), \(S_{dd}\) and \(S_{sd}\). [3]
  4. Calculate the value of the product moment correlation coefficient \(r\) between \(s\) and \(d\). [2]
  5. Stating your hypotheses clearly, test, at the 1\% significance level, whether or not the correlation coefficient is greater than zero. [3]
  6. With reference to your scatter diagram, comment on your result in part (e). [1]
(Total 13 marks)
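The calculations of \(S_{ss}\), \(S_{dd}\), \(S_{sd}\) and \(r\) can be checked with a short script such as the sketch below. The variable names are illustrative, and the critical value in the last step is taken from standard tables of the product moment correlation coefficient (sample size 10, one-tailed, 1%), so it should be confirmed against the tables supplied with the paper.

```python
from math import sqrt

# Data from the table: serum magnesium s and disease protein d (mg/dl)
s = [1.2, 1.9, 3.2, 3.9, 2.5, 4.5, 5.7, 4.0, 1.1, 5.9]
d = [3.8, 7.0, 11.0, 12.0, 9.0, 12.0, 13.5, 12.2, 2.0, 13.9]
n = len(s)

# Corrected sums of squares and products
s_ss = sum(v * v for v in s) - sum(s) ** 2 / n
s_dd = sum(v * v for v in d) - sum(d) ** 2 / n
s_sd = sum(a * b for a, b in zip(s, d)) - sum(s) * sum(d) / n

# Product moment correlation coefficient
r = s_sd / sqrt(s_ss * s_dd)

# One-tailed test at the 1% level: H0 rho = 0 against H1 rho > 0
R_CRIT = 0.7155  # assumed table value for n = 10, 1% one-tailed
reject_h0 = r > R_CRIT

print(f"S_ss = {s_ss:.3f}, S_dd = {s_dd:.3f}, S_sd = {s_sd:.3f}")
print(f"r = {r:.3f}, reject H0: {reject_h0}")
```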
AQA Paper 3 Specimen Q10
7 marks Moderate -0.8
Shona calculated four correlation coefficients using data from the Large Data Set. In each case she calculated the correlation coefficient between the masses of the cars and the CO₂ emissions for varying sample sizes. A summary of these calculations, labelled A to D, is listed in the table below.
| | Sample size | Correlation coefficient |
|---|---|---|
| A | 3827 | 0.088 |
| B | 3735 | 0.246 |
| C | 24 | 0.400 |
| D | 1250 | -1.183 |
  1. Shona would like to use calculation A to test whether there is evidence of positive correlation between mass and CO₂ emissions. She finds the critical value for a one-tailed test at the 5% level for a sample of size 3827 is 0.027
    1. State appropriate hypotheses for Shona to use in her test. [1 mark]
    2. Determine if there is sufficient evidence to reject the null hypothesis. Fully justify your answer. [1 mark]
  2. Shona's teacher tells her to remove calculation D from the table as it is incorrect. Explain how the teacher knew it was incorrect. [1 mark]
  3. Before performing calculation B, Shona cleaned the data. She removed all cars from the Large Data Set that had incorrect masses. Using your knowledge of the Large Data Set, explain what was incorrect about the masses which were removed from the calculation. [1 mark]
  4. Apart from CO₂ and CO emissions, state one other type of emission that Shona could investigate using the Large Data Set. [1 mark]
  5. Wesley claims that calculation C shows that a heavier car causes higher CO₂ emissions. Give two reasons why Wesley's claim may be incorrect. [2 marks]
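The two numerical checks in this question, whether a reported coefficient can be a correlation at all and whether calculation A exceeds the stated critical value, can be sketched as below. The function names are hypothetical; the values come from the table and the question text.

```python
def is_valid_pmcc(r):
    """A correlation coefficient must lie between -1 and 1 inclusive."""
    return -1.0 <= r <= 1.0


def positive_correlation_test(r, critical_value):
    """One-tailed test for positive correlation.

    Returns True when H0 (rho = 0) is rejected in favour of H1 (rho > 0).
    """
    if not is_valid_pmcc(r):
        raise ValueError(f"{r} is not a possible correlation coefficient")
    return r > critical_value


# (sample size, correlation coefficient) for calculations A to D
calculations = {
    "A": (3827, 0.088),
    "B": (3735, 0.246),
    "C": (24, 0.400),
    "D": (1250, -1.183),
}

valid = {label: is_valid_pmcc(r) for label, (_, r) in calculations.items()}
reject_h0_for_a = positive_correlation_test(calculations["A"][1], 0.027)

print(valid)
print("Reject H0 for calculation A:", reject_h0_for_a)
```

Calculation D fails the range check, and for calculation A the coefficient 0.088 exceeds 0.027, so the null hypothesis is rejected even though the correlation itself is weak.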
OCR MEI Paper 2 Specimen Q9
4 marks Moderate -0.8
A geyser is a hot spring which erupts from time to time. For two geysers, the duration of each eruption, \(x\) minutes, and the waiting time until the next eruption, \(y\) minutes, are recorded.
  1. For a random sample of 50 eruptions of the first geyser, the correlation coefficient between \(x\) and \(y\) is 0.758. The critical value for a 2-tailed hypothesis test for correlation at the 5% level is 0.279. Explain whether or not there is evidence of correlation in the population of eruptions. [2]
The scatter diagram in Fig. 9 shows the data from a random sample of 50 eruptions of the second geyser. \includegraphics{figure_9}
  2. Stella claims the scatter diagram shows evidence of correlation between duration of eruption and waiting time. Make two comments about Stella's claim. [2]
Pre-U Pre-U 9794/1 2010 June Q13
10 marks Moderate -0.3
A survey was conducted into the annual salary offered for 19 different jobs in 2008. The results were as follows, in thousands of pounds.
15 16 18 19 21 36 36 38 41 41
43 47 51 55 56 60 62 64 110
It was decided to undertake a further study to see if self-esteem was correlated with level of annual salary. A random sample of 11 employees was taken and self-esteem was rated on a scale of 1 to 10 with the highest self-esteem being 10. The results were as follows.
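| Salary in £10 000's | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Self-esteem | 4 | 3 | 5 | 1 | 7 | 7 | 8 | 5 | 10 | 7 | 9 |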