5.08d Hypothesis test: Pearson correlation

109 questions

CAIE FP2 2019 June Q10
11 marks Standard +0.3
10 The values from a random sample of five pairs \(( x , y )\) taken from a bivariate distribution are shown below.

| \(x\) | 3 | 4 | 4 | 6 | 8 |
|---|---|---|---|---|---|
| \(y\) | 5 | 7 | \(q\) | 6 | 7 |

The equation of the regression line of \(x\) on \(y\) is given by \(x = \frac { 5 } { 4 } y + c\).
  1. Given that \(q\) is an integer, find its value.
  2. Find the value of \(c\).
  3. Find the value of the product moment correlation coefficient.
CAIE FP2 2019 June Q10
12 marks Moderate -0.3
10 The means and variances for a random sample of 8 pairs of values of \(x\) and \(y\) taken from a bivariate distribution are given in the following table.

| | Mean | Variance |
|---|---|---|
| \(x\) | 3.3125 | 3.3086 |
| \(y\) | 6.7375 | 7.9473 |

The product moment correlation coefficient for the sample is 0.5815, correct to 4 decimal places.
  1. Find the equation of the regression line of \(y\) on \(x\).
  2. Test at the \(5 \%\) significance level whether there is evidence of positive correlation between \(x\) and \(y\). [4]
  3. Calculate an estimate of \(y\) when \(x = 6.0\) and comment on the reliability of your estimate.
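As a hedged sketch of part 1, assuming the standard least-squares identity that the gradient of the line of \(y\) on \(x\) is \(b = r s _ { y } / s _ { x }\) and that the line passes through \(( \bar { x } , \bar { y } )\) (textbook results, not stated in the question), the tabulated values give approximately:

$$b = r \sqrt { \frac { \operatorname { Var } ( y ) } { \operatorname { Var } ( x ) } } = 0.5815 \sqrt { \frac { 7.9473 } { 3.3086 } } \approx 0.901 , \qquad a = \bar { y } - b \bar { x } \approx 6.7375 - 0.901 \times 3.3125 \approx 3.75$$

so the line would be roughly \(y = 3.75 + 0.901 x\). Only the ratio of the two variances enters, so it does not matter whether they were computed with divisor \(n\) or \(n - 1\).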
CAIE FP2 2008 November Q8
9 marks Moderate -0.3
8 The equations of the regression lines for a random sample of 25 pairs of data \(( x , y )\) from a bivariate population are $$\begin{array} { c c } y \text { on } x : & y = 1.28 - 0.425 x , \\ x \text { on } y : & x = 1.05 - 0.516 y . \end{array}$$
  1. Find the sample means, \(\bar { x }\) and \(\bar { y }\).
  2. Find the product moment correlation coefficient for the sample.
  3. Test at the \(5 \%\) significance level whether the population correlation coefficient differs from zero.
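The first two parts can be sketched numerically. This is a minimal illustration, assuming the standard identities that the product of the two regression gradients equals \(r ^ { 2 }\) (with \(r\) taking the sign of the gradients) and that both lines pass through \(( \bar { x } , \bar { y } )\); the variable names are illustrative only.

```python
import math

# Gradients and intercepts copied from the two regression lines in the question.
b_yx, a_yx = -0.425, 1.28   # y = a_yx + b_yx * x
b_xy, a_xy = -0.516, 1.05   # x = a_xy + b_xy * y

# Assumed identity: b_yx * b_xy = r^2, and r has the same sign as the gradients.
r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

# Both lines pass through the point of means, so substitute one line into
# the other and solve the resulting linear equation for x_bar.
x_bar = (a_xy + b_xy * a_yx) / (1 - b_xy * b_yx)
y_bar = a_yx + b_yx * x_bar

print(f"r = {r:.4f}, x_bar = {x_bar:.3f}, y_bar = {y_bar:.3f}")
```

For part 3, \(| r |\) would then be compared with the two-tail \(5 \%\) critical value for \(n = 25\) from a table of critical values.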
CAIE FP2 2012 November Q8
11 marks Moderate -0.8
8 The yield of a particular crop on a farm is thought to depend principally on the amount of sunshine during the growing season. For a random sample of 8 years, the average yield, \(y\) kilograms per square metre, and the average amount of sunshine per day, \(x\) hours, are recorded. The results are given in the following table.

| \(x\) | 12.2 | 10.4 | 5.2 | 6.3 | 11.8 | 10.0 | 14.2 | 2.3 |
|---|---|---|---|---|---|---|---|---|
| \(y\) | 15 | 9 | 10 | 7 | 8 | 11 | 12 | 6 |

$$\left[ \Sigma x = 72.4 , \Sigma x ^ { 2 } = 769.9 , \Sigma y = 78 , \Sigma y ^ { 2 } = 820 , \Sigma x y = 761.3 . \right]$$
  1. Find the equation of the regression line of \(y\) on \(x\).
  2. Find the product moment correlation coefficient.
  3. Test, at the \(5 \%\) significance level, whether there is positive correlation between the average yield and the average amount of sunshine per day.
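A minimal sketch of how the quoted sums lead to the regression line and the correlation coefficient, assuming the usual definitions \(S _ { x x } = \Sigma x ^ { 2 } - ( \Sigma x ) ^ { 2 } / n\), \(S _ { y y } = \Sigma y ^ { 2 } - ( \Sigma y ) ^ { 2 } / n\) and \(S _ { x y } = \Sigma x y - \Sigma x \Sigma y / n\). The function name `corrected_sums` is hypothetical.

```python
import math

def corrected_sums(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy):
    """Return (S_xx, S_yy, S_xy) computed from the raw summary sums."""
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    return s_xx, s_yy, s_xy

# Summary values quoted in the question (n = 8 years).
n, sum_x, sum_y = 8, 72.4, 78
s_xx, s_yy, s_xy = corrected_sums(n, sum_x, 769.9, sum_y, 820, 761.3)

b = s_xy / s_xx                   # gradient of the line of y on x
a = sum_y / n - b * sum_x / n     # intercept, using the point of means
r = s_xy / math.sqrt(s_xx * s_yy) # product moment correlation coefficient

print(f"y = {a:.3f} + {b:.4f} x, r = {r:.4f}")
```

The value of `r` would then be compared with the one-tail \(5 \%\) critical value for \(n = 8\) to complete part 3.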
CAIE FP2 2013 November Q9
11 marks Standard +0.3
9 For a random sample of 10 observations of pairs of values \(( x , y )\), the equations of the regression lines of \(y\) on \(x\) and of \(x\) on \(y\) are $$y = 4.21 x - 0.862 \quad \text { and } \quad x = 0.043 y + 6.36$$ respectively.
  1. Find the value of the product moment correlation coefficient for the sample.
  2. Test, at the \(10 \%\) significance level, whether there is evidence of non-zero correlation between the variables.
  3. Find the mean values of \(x\) and \(y\) for this sample.
  4. Estimate the value of \(x\) when \(y = 2.3\) and comment on the reliability of your answer.
CAIE FP2 2013 November Q10
11 marks Standard +0.3
10 The lengths, \(x \mathrm {~m}\), and masses, \(y \mathrm {~kg}\), of 12 randomly chosen babies born at a particular hospital last year are summarised as follows. $$\Sigma x = 7.50 \quad \Sigma x ^ { 2 } = 4.73 \quad \Sigma y = 38.6 \quad \Sigma y ^ { 2 } = 124.84 \quad \Sigma x y = 24.25$$ Find the value of the product moment correlation coefficient for this sample. Obtain an estimate for the mass of a baby, born last year at the hospital, whose length is 0.64 m. Test, at the \(2 \%\) significance level, whether there is non-zero correlation between the two variables.
CAIE FP2 2014 November Q9
11 marks Standard +0.8
9 A random sample of 10 pairs of values of \(x\) and \(y\) is given in the following table.
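
| \(x\) | 4 | 6 | 6 | 8 | 2 | 7 | 12 | 14 | 9 | 5 |
|---|---|---|---|---|---|---|---|---|---|---|
| \(y\) | 2 | 4 | 6 | 8 | 6 | 10 | 9 | 8 | 6 | 5 |
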
  1. Find the equation of the regression line of \(y\) on \(x\).
  2. Find the product moment correlation coefficient for the sample.
  3. Find the estimated value of \(y\) when \(x = 10\), and comment on the reliability of this estimate.
  4. Another sample of \(N\) pairs of data from the same population has the same product moment correlation coefficient as the first sample given. A test, at the \(1 \%\) significance level, on this second sample indicates that there is sufficient evidence to conclude that there is positive correlation. Find the set of possible values of \(N\).
CAIE FP2 2016 November Q10 OR
Challenging +1.2
For a random sample, \(A\), of 5 pairs of values of \(x\) and \(y\), the equations of the regression lines of \(y\) on \(x\) and \(x\) on \(y\) are respectively \(y = 4.5 + 0.3 x\) and \(x = 3 y - 13\). Four of the five pairs of data are given in the following table.

| \(x\) | 1 | 5 | 7 | 9 |
|---|---|---|---|---|
| \(y\) | 5 | 6 | 7 | 7 |

Find
  1. the fifth pair of values of \(x\) and \(y\),
  2. the value of the product moment correlation coefficient.
    A second random sample, \(B\), of 5 pairs of values of \(x\) and \(y\) is summarised as follows. $$\Sigma x = 20 \quad \Sigma x ^ { 2 } = 100 \quad \Sigma y = 17 \quad \Sigma y ^ { 2 } = 69 \quad \Sigma x y = 75$$ The two samples, \(A\) and \(B\), are combined to form a single random sample of size 10.
  3. Use this combined sample to test, at the \(5 \%\) significance level, whether the population product moment correlation coefficient is different from zero.
CAIE FP2 2017 November Q9
9 marks Standard +0.8
9 The land areas \(x\) (in suitable units) and populations \(y\) (in millions) for a sample of 8 randomly chosen cities are given in the following table.

| Land area \(( x )\) | 1.0 | 4.5 | 2.4 | 1.6 | 3.8 | 8.6 | 7.5 | 6.5 |
|---|---|---|---|---|---|---|---|---|
| Population \(( y )\) | 0.8 | 8.4 | 4.2 | 1.6 | 2.2 | 10.2 | 4.2 | 5.2 |

$$\left[ \Sigma x = 35.9 , \Sigma x ^ { 2 } = 216.47 , \Sigma y = 36.8 , \Sigma y ^ { 2 } = 244.96 , \Sigma x y = 212.62 . \right]$$
  1. Find, showing all necessary working, the value of the product moment correlation coefficient for this sample.
  2. Using a \(1 \%\) significance level, test whether there is positive correlation between land area and population of cities.
    The land areas and populations for another randomly chosen sample of cities, this time of size \(n\), give a product moment correlation coefficient of 0.651. Using a test at the \(1 \%\) significance level, there is evidence of non-zero correlation between the variables.
  3. Find the least possible value of \(n\), justifying your answer.
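Part 3 amounts to scanning a critical-value table for the smallest sample size at which 0.651 is significant. The sketch below assumes the two-tail \(1 \%\) column of a standard table of critical values for the product moment correlation coefficient (the entries are transcribed from such a table); `least_significant_n` is a hypothetical helper name.

```python
# Two-tail 1% critical values, keyed by sample size n (assumed transcription
# from a standard critical-value table for Pearson's coefficient).
TWO_TAIL_1PCT = {
    10: 0.7646, 11: 0.7348, 12: 0.7079, 13: 0.6835, 14: 0.6614,
    15: 0.6411, 16: 0.6226, 17: 0.6055, 18: 0.5897,
}

def least_significant_n(r, critical):
    """Return the smallest tabulated n for which |r| exceeds the critical value."""
    for n in sorted(critical):
        if abs(r) > critical[n]:
            return n
    return None  # not significant for any tabulated n

least_n = least_significant_n(0.651, TWO_TAIL_1PCT)
print(least_n)
```

Because the critical values decrease as \(n\) increases, the first \(n\) that passes is the least possible value; the justification in a written answer would quote the critical values on either side of it.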
CAIE FP2 2017 Specimen Q9
11 marks Standard +0.8
9 A random sample of 8 students is chosen from those sitting examinations in both Mathematics and French. Their marks in Mathematics, \(x\), and in French, \(y\), are summarised as follows. $$\Sigma x = 472 \quad \Sigma x ^ { 2 } = 29950 \quad \Sigma y = 400 \quad \Sigma y ^ { 2 } = 21226 \quad \Sigma x y = 24879$$ Another student scored 72 marks in the Mathematics examination but was unable to sit the French examination.
  1. Estimate the mark that this student would have obtained in the French examination.
  2. Test, at the \(5 \%\) significance level, whether there is non-zero correlation between marks in Mathematics and marks in French.
OCR H240/02 2019 June Q11
8 marks Moderate -0.8
11 A trainer was asked to give a lecture on population profiles in different Local Authorities (LAs) in the UK. Using data from the 2011 census, he created the following scatter diagram for 17 selected LAs.
[Figure: scatter diagram, "17 Selected Local Authorities"]
He selected the 17 LAs using the following method. The proportions of people aged 18 to 24 and aged 65+ in any Local Authority are denoted by \(P _ { \text {young } }\) and \(P _ { \text {senior } }\) respectively. The trainer used a spreadsheet to calculate the value of \(k = \frac { P _ { \text {young } } } { P _ { \text {senior } } }\) for each of the 348 LAs in the UK. He then used specific ranges of values of \(k\) to select the 17 LAs.
  1. Estimate the ranges of values of \(k\) that he used to select these 17 LAs.
  2. Using the 17 LAs the trainer carried out a hypothesis test with the following hypotheses. \(\mathrm { H } _ { 0 }\) : There is no linear correlation in the population between \(P _ { \text {young } }\) and \(P _ { \text {senior } }\). \(\mathrm { H } _ { 1 }\) : There is negative linear correlation in the population between \(P _ { \text {young } }\) and \(P _ { \text {senior } }\).
    He found that the value of Pearson's product-moment correlation coefficient for the 17 LAs is \(-0.797\), correct to 3 significant figures.
    1. Use the table on page 9 to show that this value is significant at the \(1 \%\) level.
       The trainer concluded that there is evidence of negative linear correlation between \(P _ { \text {young } }\) and \(P _ { \text {senior } }\) in the population.
    2. Use the diagram to comment on the reliability of this conclusion.
  3. Describe one outstanding feature of the population in the areas represented by the points in the bottom right hand corner of the diagram.
  4. The trainer's audience included representatives from several universities. Suggest a reason why the diagram might be of particular interest to these people.

Critical values of Pearson's product-moment correlation coefficient

| 1-tail test | 5\% | 2.5\% | 1\% | 0.5\% |
|---|---|---|---|---|
| 2-tail test | 10\% | 5\% | 2\% | 1\% |
| \(n\) | | | | |
| 1 | - | - | - | - |
| 2 | - | - | - | - |
| 3 | 0.9877 | 0.9969 | 0.9995 | 0.9999 |
| 4 | 0.9000 | 0.9500 | 0.9800 | 0.9900 |
| 5 | 0.8054 | 0.8783 | 0.9343 | 0.9587 |
| 6 | 0.7293 | 0.8114 | 0.8822 | 0.9172 |
| 7 | 0.6694 | 0.7545 | 0.8329 | 0.8745 |
| 8 | 0.6215 | 0.7067 | 0.7887 | 0.8343 |
| 9 | 0.5822 | 0.6664 | 0.7498 | 0.7977 |
| 10 | 0.5494 | 0.6319 | 0.7155 | 0.7646 |
| 11 | 0.5214 | 0.6021 | 0.6851 | 0.7348 |
| 12 | 0.4973 | 0.5760 | 0.6581 | 0.7079 |
| 13 | 0.4762 | 0.5529 | 0.6339 | 0.6835 |
| 14 | 0.4575 | 0.5324 | 0.6120 | 0.6614 |
| 15 | 0.4409 | 0.5140 | 0.5923 | 0.6411 |
| 16 | 0.4259 | 0.4973 | 0.5742 | 0.6226 |
| 17 | 0.4124 | 0.4821 | 0.5577 | 0.6055 |
| 18 | 0.4000 | 0.4683 | 0.5425 | 0.5897 |
| 19 | 0.3887 | 0.4555 | 0.5285 | 0.5751 |
| 20 | 0.3783 | 0.4438 | 0.5155 | 0.5614 |
| 21 | 0.3687 | 0.4329 | 0.5034 | 0.5487 |
| 22 | 0.3598 | 0.4227 | 0.4921 | 0.5368 |
| 23 | 0.3515 | 0.4132 | 0.4815 | 0.5256 |
| 24 | 0.3438 | 0.4044 | 0.4716 | 0.5151 |
| 25 | 0.3365 | 0.3961 | 0.4622 | 0.5052 |
| 26 | 0.3297 | 0.3882 | 0.4534 | 0.4958 |
| 27 | 0.3233 | 0.3809 | 0.4451 | 0.4869 |
| 28 | 0.3172 | 0.3739 | 0.4372 | 0.4785 |
| 29 | 0.3115 | 0.3673 | 0.4297 | 0.4705 |
| 30 | 0.3061 | 0.3610 | 0.4226 | 0.4629 |

OCR H240/02 2021 November Q10
6 marks Moderate -0.8
10 A researcher plans to carry out a statistical investigation to test whether there is linear correlation between the time ( \(T\) weeks) from conception to birth, and the birth weight ( \(W\) grams) of new-born babies.
  1. Explain why a 1-tail test is appropriate in this context.
    The researcher records the values of \(T\) and \(W\) for a random sample of 11 babies. They calculate Pearson's product-moment correlation coefficient for the sample and find that the value is 0.722.
  2. Use the table below to carry out the test at the \(1 \%\) significance level.

Critical values of Pearson's product-moment correlation coefficient

| 1-tail test | 5\% | 2.5\% | 1\% | 0.5\% |
|---|---|---|---|---|
| 2-tail test | 10\% | 5\% | 2\% | 1\% |
| \(n = 10\) | 0.5494 | 0.6319 | 0.7155 | 0.7646 |
| \(n = 11\) | 0.5214 | 0.6021 | 0.6851 | 0.7348 |
| \(n = 12\) | 0.4973 | 0.5760 | 0.6581 | 0.7079 |
| \(n = 13\) | 0.4762 | 0.5529 | 0.6339 | 0.6835 |

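The mechanics of part 2 are a table lookup followed by a comparison. The sketch below is an illustration under the assumption that the alternative hypothesis is positive correlation, so the 1-tail column applies; the helper `one_tail_test` is hypothetical.

```python
# Critical values from the extract in the question: n -> values for the
# 1-tail significance levels listed in ONE_TAIL_LEVELS, in the same order.
ONE_TAIL_LEVELS = (0.05, 0.025, 0.01, 0.005)
CRITICAL = {
    10: (0.5494, 0.6319, 0.7155, 0.7646),
    11: (0.5214, 0.6021, 0.6851, 0.7348),
    12: (0.4973, 0.5760, 0.6581, 0.7079),
    13: (0.4762, 0.5529, 0.6339, 0.6835),
}

def one_tail_test(r, n, level):
    """Return (critical value, reject H0?) for H1: positive correlation."""
    crit = CRITICAL[n][ONE_TAIL_LEVELS.index(level)]
    return crit, r > crit

crit, reject = one_tail_test(0.722, 11, 0.01)
print(crit, reject)
```

A written answer would also state the hypotheses in terms of the population correlation coefficient and give the conclusion in context.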
OCR MEI Paper 2 2024 June Q11
5 marks Moderate -0.8
11 A householder is investigating whether there is any relationship between his monthly cost of gas and his monthly cost of electricity, both measured in pounds ( \(\pounds\) ). The householder collects a random sample of monthly costs and presents them in the scatter diagram below.
[Figure: scatter diagram of the monthly gas and electricity costs]
One of the points on the diagram represents the energy costs in a month when the householder was away on holiday for three weeks. The other points represent the energy costs in months when the householder did not go away on holiday.
  1. On the copy of the diagram in the Printed Answer Booklet, circle the point which represents the month when the householder was most likely to have been away on holiday for three weeks.
  2. With reference to the diagram, describe the relationship between the cost of gas and the cost of electricity.
    The householder decides to test whether there is evidence to suggest that there is any association between the monthly cost of gas and the monthly cost of electricity. The value of Spearman's rank correlation coefficient for this sample is 0.4359 and the associated \(p\)-value is 0.09195.
  3. Determine whether there is any evidence to suggest, at the \(5 \%\) level, that there is any association between the monthly cost of gas and the monthly cost of electricity.
OCR MEI Paper 2 2020 November Q13
7 marks Moderate -0.5
13 The pre-release material contains information concerning median house prices, recycling rates and employment rates. Fig. 13.1 shows a scatter diagram of recycling rate against employment rate for a random sample of 33 regions.
[Figure: Fig. 13.1, scatter diagram of recycling rate against employment rate]
The product moment correlation coefficient for this sample is 0.37154 and the associated \(p\)-value is 0.033. Lee conducts a hypothesis test at the \(5 \%\) level to test whether there is any evidence to suggest there is positive correlation between recycling rate and employment rate. He concludes that there is no evidence to suggest positive correlation because \(0.033 \approx 0\) and \(0.37154 > 0.05\).
  1. Explain whether Lee's reasoning is correct.
    Fig. 13.2 shows a scatter diagram of recycling rate against median house price for a random sample of 33 regions.
    [Figure: Fig. 13.2, scatter diagram of recycling rate against median house price]
    The product moment correlation coefficient for this sample is \(-0.33278\) and the associated \(p\)-value is 0.058. Fig. 13.3 shows summary statistics for the median house prices for the data in this sample.

    | Statistics | |
    |---|---|
    | \(n\) | 33 |
    | Mean | 465467.9697 |
    | \(\sigma\) | 201236.1345 |
    | \(s\) | 204356.2606 |
    | \(\Sigma x\) | 15360443 |
    | \(\Sigma x ^ { 2 }\) | 8486161617387 |
    | Min | 243500 |
    | Q1 | 342500 |
    | Median | 410000 |
    | Q3 | 521000 |
    | Max | 1200000 |

    Fig. 13.3

  2. Use the information in Fig. 13.3 and Fig. 13.2 to show that there are at least two outliers.
  3. Describe the effect of removing the outliers on
OCR Further Statistics AS 2018 June Q4
8 marks Standard +0.3
4 Judith believes that mathematical ability and chess-playing ability are related. She asks 20 randomly chosen chess players, with known British Chess Federation (BCF) ratings \(X\), to take a mathematics aptitude test, with scores \(Y\). The results are summarised as follows. $$n = 20 , \sum x = 3600 , \sum x ^ { 2 } = 660500 , \sum y = 1440 , \sum y ^ { 2 } = 105280 , \sum x y = 260990$$
  1. Calculate the value of Pearson's product-moment correlation coefficient \(r\).
  2. State an assumption needed to be able to carry out a significance test on the value of \(r\).
  3. Assume now that the assumption in part 2 is valid. Test at the \(5 \%\) significance level whether there is evidence that chess players with higher BCF ratings are better at mathematics.
  4. There are two different grading systems for chess players, the BCF system and the international ELO system. The two sets of ratings are related by $$\text { ELO rating } = 8 \times \text { BCF rating } + 650$$ Magnus says that the experiment should have used ELO ratings instead of BCF ratings. Comment on Magnus's suggestion.
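Parts 1 and 4 can be explored together. The sketch below computes \(r\) from the quoted sums and then rebuilds the sums under the coding \(u = 8 x + 650\) to check numerically that \(r\) does not change; this relies on the general fact that a linear coding with a positive multiplier leaves the product moment correlation coefficient unchanged. The function name `pmcc_from_sums` is hypothetical.

```python
import math

def pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Product moment correlation coefficient from raw summary sums."""
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    return s_xy / math.sqrt(s_xx * s_yy)

# Sums quoted in the question (x = BCF rating, y = aptitude score).
n, sum_x, sum_x2 = 20, 3600, 660500
sum_y, sum_y2, sum_xy = 1440, 105280, 260990
r_bcf = pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy)

# Recode u = k*x + c (ELO = 8 * BCF + 650) and rebuild each sum algebraically.
k, c = 8, 650
sum_u = k * sum_x + c * n
sum_u2 = k ** 2 * sum_x2 + 2 * k * c * sum_x + c ** 2 * n
sum_uy = k * sum_xy + c * sum_y
r_elo = pmcc_from_sums(n, sum_u, sum_y, sum_u2, sum_y2, sum_uy)

print(f"r (BCF) = {r_bcf:.4f}, r (ELO) = {r_elo:.4f}")
```

The two printed values agree, which is the numerical counterpart of the comment expected on Magnus's suggestion.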
OCR Further Statistics AS 2019 June Q5
7 marks Standard +0.3
5 Sixteen candidates took an examination paper in mechanics and an examination paper in statistics.
  1. For all sixteen candidates, the value of the product moment correlation coefficient \(r\) for the marks on the two papers was 0.701 correct to 3 significant figures. Test whether there is evidence, at the \(5 \%\) significance level, of association between the marks on the two papers.
  2. A teacher decided to omit the marks of the candidates who were in the top three places in mechanics and the candidates who were in the bottom three places in mechanics. The marks for the remaining 10 candidates can be summarised by \(n = 10 , \sum x = 750 , \sum y = 690 , \sum x ^ { 2 } = 57690 , \sum y ^ { 2 } = 49676 , \sum x y = 50829\).
    1. Calculate the value of \(r\) for these 10 candidates.
    2. What do the two values of \(r\), in parts 1 and 2.1, tell you about the scores of the sixteen candidates?
OCR Further Statistics AS 2023 June Q5
9 marks Standard +0.3
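5 A psychologist investigates the relationship between 'openness' and 'creativity' in adults. Each member of a random sample of 15 adults is given two tests, one on openness and one on creativity. Each test has a maximum score of 75. The results are given in the table.

| Adult | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Openness, \(x\) | 39 | 34 | 29 | 20 | 40 | 35 | 20 | 36 | 55 | 31 | 41 | 43 | 33 | 30 | 33 |
| Creativity, \(y\) | 59 | 34 | 17 | 29 | 49 | 46 | 45 | 54 | 60 | 38 | 46 | 35 | 43 | 56 | 34 |
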
\(n = 15 \quad \sum x = 519 \quad \sum y = 645 \quad \sum x ^ { 2 } = 19033 \quad \sum y ^ { 2 } = 29751 \quad \sum x y = 23034\)
  1. Use Pearson's product-moment correlation coefficient to test, at the \(5 \%\) significance level, whether there is positive association between openness and creativity.
  2. State what the value of Pearson's product-moment correlation coefficient shows about a scatter diagram illustrating the data.
  3. A student suggests that there is a way to obtain a more accurate measure of the correlation. Before carrying out the test it would be better to standardise the test scores so that they have the same mean and variance. Explain whether you agree with this suggestion.
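The student's suggestion in part 3 can be checked numerically. The sketch below is a minimal illustration using the fifteen pairs of scores from the table: it computes \(r\) from the raw data and again after standardising both variables to mean 0 and variance 1. The helper names `pmcc` and `standardise` are hypothetical.

```python
import math

# Scores for adults A to O, read from the table in the question.
openness   = [39, 34, 29, 20, 40, 35, 20, 36, 55, 31, 41, 43, 33, 30, 33]
creativity = [59, 34, 17, 29, 49, 46, 45, 54, 60, 38, 46, 35, 43, 56, 34]

def pmcc(xs, ys):
    """Product moment correlation coefficient from paired raw data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    s_xx = sum((x - mx) ** 2 for x in xs)
    s_yy = sum((y - my) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)

def standardise(values):
    """Rescale to mean 0 and variance 1 (divisor n)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

r_raw = pmcc(openness, creativity)
r_std = pmcc(standardise(openness), standardise(creativity))
print(f"raw r = {r_raw:.4f}, standardised r = {r_std:.4f}")
```

Both values agree to within rounding error, which suggests that standardising cannot make the measure of correlation any more accurate.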
OCR Further Statistics AS 2021 November Q2
7 marks Standard +0.3
2 A shopper estimates the cost, \(\pounds X\) per item, of each of 12 items in a supermarket. The shopper's estimates are compared with the actual cost, \(\pounds Y\) per item, of each item. The results are summarised as follows. \(n = 12\) \(\sum x = 399\) \(\sum y = 623.88\) \(\sum x ^ { 2 } = 28127\) \(\sum y ^ { 2 } = 116509.0212\) \(\sum x y = 45006.01\) Test at the 1\% significance level whether the shopper's estimates are positively correlated with the actual cost of the items.
OCR Further Statistics 2022 June Q2
11 marks Moderate -0.8
2 The directors of a large company believe that there are more computer failures in the Head Office when temperatures are higher. They obtain data for the Head Office for the maximum temperature, \(T ^ { \circ } \mathrm { C }\), and the number of computer failures, \(X\), on each of 12 randomly chosen days.
  1. State which of the following words can be applied to \(T\).
    Dependent, Independent, Controlled, Response
    The data is summarised as follows. \(n = 12 \quad \sum t = 261 \quad \sum x = 41 \quad \sum t ^ { 2 } = 5869 \quad \sum x ^ { 2 } = 311 \quad \sum t x = 1021\)
  2. Calculate the value of the product moment correlation coefficient \(r\).
  3. The directors wish to investigate their belief using a significance test at the \(1 \%\) level.
    1. Explain why a 1-tail test is appropriate in this situation.
    2. Carry out the test.
  4. One of the directors prefers the temperatures to be given in Fahrenheit ( \({ } ^ { \circ } \mathrm { F }\) ), rather than Centigrade ( \({ } ^ { \circ } \mathrm { C }\) ). The relationship between F and C is \(\mathrm { F } = \frac { 9 } { 5 } \mathrm { C } + 32\).
    State the value of \(r\) that would result from using temperatures in Fahrenheit in the calculation.
OCR Further Statistics 2024 June Q2
9 marks Standard +0.3
2 A newspaper article claimed that "taller dog owners have taller dogs as pets". Alex investigated this claim and obtained data from a random sample of 16 fellow students who owned exactly one dog. The results are summarised as follows, where the height of the student, in cm, is denoted by \(h\) and the height, in cm, of their dog is denoted by \(d\). \(n = 16 \quad \sum h = 2880 \quad \sum d = 660 \quad \sum h ^ { 2 } = 519276 \quad \sum d ^ { 2 } = 30000 \quad \sum h d = 119425\)
  1. Calculate the value of Pearson's product moment correlation coefficient for the data.
  2. State what your answer tells you about a scatter diagram illustrating the data.
  3. Use the data to test, at the \(5 \%\) significance level, the claim of the newspaper article.
  4. Explain whether the answer to part 1 would be likely to be different if the dogs' weights had been used instead of their heights.
Edexcel S1 2008 January Q1
7 marks Moderate -0.3
  1. A personnel manager wants to find out if a test carried out during an employee's interview and a skills assessment at the end of basic training is a guide to performance after working for the company for one year.
The table below shows the results of the interview test of 10 employees and their performance after one year.
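
| Employee | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) | \(J\) |
|---|---|---|---|---|---|---|---|---|---|---|
| Interview test, \(x\) \% | 65 | 71 | 79 | 77 | 85 | 78 | 85 | 90 | 81 | 62 |
| Performance after one year, \(y\) \% | 65 | 74 | 82 | 64 | 87 | 78 | 61 | 65 | 79 | 69 |
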
$$\text { [You may use } \sum x ^ { 2 } = 60475 , \sum y ^ { 2 } = 53122 , \sum x y = 56076 \text { ] }$$
  1. Showing your working clearly, calculate the product moment correlation coefficient between the interview test and the performance after one year.
    The product moment correlation coefficient between the skills assessment and the performance after one year is \(-0.156\) to 3 significant figures.
  2. Use your answer to part 1 to comment on whether or not the interview test and skills assessment are a guide to the performance after one year. Give clear reasons for your answers.
Edexcel S1 2002 June Q7
16 marks Moderate -0.8
7. An ice cream seller believes that there is a relationship between the temperature on a summer day and the number of ice creams sold. Over a period of 10 days he records the temperature at 1 p.m., \(t ^ { \circ } \mathrm { C }\), and the number of ice creams sold, \(c\), in the next hour. The data he collects is summarised in the table below.

| \(t\) | \(c\) |
|---|---|
| 13 | 24 |
| 22 | 55 |
| 17 | 35 |
| 20 | 45 |
| 10 | 20 |
| 15 | 30 |
| 19 | 39 |
| 12 | 19 |
| 18 | 36 |
| 23 | 54 |

[Use \(\Sigma t ^ { 2 } = 3025 , \Sigma c ^ { 2 } = 14245 , \Sigma c t = 6526\).]
  1. Calculate the value of the product moment correlation coefficient between \(t\) and \(c\).
  2. State whether or not your value supports the use of a regression equation to predict the number of ice creams sold. Give a reason for your answer.
  3. Find the equation of the least squares regression line of \(c\) on \(t\) in the form \(c = a + b t\).
  4. Interpret the value of \(b\).
  5. Estimate the number of ice creams sold between 1 p.m. and 2 p.m. when the temperature at 1 p.m. is \(16 ^ { \circ } \mathrm { C }\). [3]
  6. At 1 p.m. on a particular day, the highest temperature for 50 years was recorded. Give a reason why you should not use the regression equation to predict ice cream sales on that day. [1]
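Parts 3, 5 and 6 can be sketched from the ten recorded pairs. This is an illustration only, assuming the usual least-squares formulae \(b = S _ { c t } / S _ { t t }\) and \(a = \bar { c } - b \bar { t }\); the helper `predict` is hypothetical and also reports whether the temperature lies inside the range of the recorded data.

```python
# The ten (t, c) pairs recorded by the ice cream seller.
temps = [13, 22, 17, 20, 10, 15, 19, 12, 18, 23]   # t, degrees C at 1 p.m.
sales = [24, 55, 35, 45, 20, 30, 39, 19, 36, 54]   # c, ice creams sold

n = len(temps)
t_bar, c_bar = sum(temps) / n, sum(sales) / n
s_tt = sum(t * t for t in temps) - n * t_bar ** 2
s_ct = sum(t * c for t, c in zip(temps, sales)) - n * t_bar * c_bar

b = s_ct / s_tt          # gradient of the line of c on t
a = c_bar - b * t_bar    # intercept

def predict(t):
    """Return (estimated sales, True if t lies within the observed range)."""
    return a + b * t, min(temps) <= t <= max(temps)

estimate_16, within_range_16 = predict(16)
print(f"c = {a:.2f} + {b:.3f} t; estimate at 16 C = {estimate_16:.1f}")
```

A record-breaking temperature would fall outside the observed range, so `predict` would flag it as extrapolation, which is the point of part 6.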
Edexcel S1 2016 June Q1
11 marks Moderate -0.8
  1. A biologist is studying the behaviour of bees in a hive. Once a bee has located a source of food, it returns to the hive and performs a dance to indicate to the other bees how far away the source of the food is. The dance consists of a series of wiggles. The biologist records the distance, \(d\) metres, of the food source from the hive and the average number of wiggles, \(w\), in the dance.

| Distance, \(d\) m | 30 | 50 | 80 | 100 | 150 | 400 | 500 | 650 |
|---|---|---|---|---|---|---|---|---|
| Average number of wiggles, \(w\) | 0.725 | 1.210 | 1.775 | 2.250 | 3.518 | 6.382 | 8.185 | 9.555 |

[You may use \(\sum w = 33.6 \quad \sum d w = 13833 \quad \mathrm { S } _ { d d } = 394600 \quad \mathrm { S } _ { w w } = 80.481\) (to 3 decimal places)]
  1. Show that \(\mathrm { S } _ { d w } = 5601\)
  2. State, giving a reason, which is the response variable.
  3. Calculate the product moment correlation coefficient for these data.
  4. Calculate the equation of the regression line of \(w\) on \(d\), giving your answer in the form \(w = a + b d\).
    A new source of food is located 350 m from the hive.
    1. Use your regression equation to estimate the average number of wiggles in the corresponding dance.
    2. Comment, giving a reason, on the reliability of your estimate.
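As a hedged outline of the first part, assuming the usual definition \(\mathrm { S } _ { d w } = \sum d w - \frac { \sum d \sum w } { n }\) and taking \(\sum d = 1960\) from adding the eight recorded distances:

$$\mathrm { S } _ { d w } = \sum d w - \frac { \sum d \sum w } { n } = 13833 - \frac { 1960 \times 33.6 } { 8 } = 13833 - 8232 = 5601$$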
Edexcel S1 2017 June Q1
14 marks Moderate -0.5
  1. A clothes shop manager records the weekly sales figures, \(\pounds s\), and the average weekly temperature, \(t ^ { \circ } \mathrm { C }\), for 6 weeks during the summer. The sales figures were coded so that \(w = \frac { s } { 1000 }\)
The data are summarised as follows $$\mathrm { S } _ { w w } = 50 \quad \sum w t = 784 \quad \sum t ^ { 2 } = 2435 \quad \sum t = 119 \quad \sum w = 42$$
  1. Find \(\mathrm { S } _ { w t }\) and \(\mathrm { S } _ { t t }\)
  2. Write down the value of \(\mathrm { S } _ { s s }\) and the value of \(\mathrm { S } _ { s t }\)
  3. Find the product moment correlation coefficient between \(s\) and \(t\).
    The manager of the clothes shop believes that a linear regression model may be appropriate to describe these data.
  4. State, giving a reason, whether or not your value of the correlation coefficient supports the manager's belief.
  5. Find the equation of the regression line of \(w\) on \(t\), giving your answer in the form \(w = a + b t\)
  6. Hence find the equation of the regression line of \(s\) on \(t\), giving your answer in the form \(s = c + d t\), where \(c\) and \(d\) are correct to 3 significant figures.
  7. Using your equation in part 6, interpret the effect of a \(1 ^ { \circ } \mathrm { C }\) increase in average weekly temperature on weekly sales during the summer.
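The effect of the coding \(s = 1000 w\) on the summary statistics can be sketched as follows. This is an outline under the standard assumptions that multiplying a variable by a constant \(k\) multiplies its own sum of squares by \(k ^ { 2 }\) and a cross-product sum by \(k\), and that the correlation coefficient is unchanged when \(k > 0\):

$$\mathrm { S } _ { w t } = \sum w t - \frac { \sum w \sum t } { n } = 784 - \frac { 42 \times 119 } { 6 } = - 49 , \qquad \mathrm { S } _ { t t } = \sum t ^ { 2 } - \frac { ( \sum t ) ^ { 2 } } { n } = 2435 - \frac { 119 ^ { 2 } } { 6 } \approx 74.83$$

$$\mathrm { S } _ { s s } = 1000 ^ { 2 } \, \mathrm { S } _ { w w } = 50\,000\,000 , \qquad \mathrm { S } _ { s t } = 1000 \, \mathrm { S } _ { w t } = - 49\,000 , \qquad r _ { s t } = r _ { w t } = \frac { - 49 } { \sqrt { 50 \times 74.83 } } \approx - 0.801$$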
Edexcel S3 2022 January Q3
12 marks Moderate -0.3
  1. A medical research team carried out an investigation into the metabolic rate, MR, of men aged between 30 years and 60 years.
A random sample of 10 men was taken from this age group.
The table below shows for each man his MR and his body mass index, BMI. The table also shows the rank for the level of daily physical activity, DPA, which was assessed by the medical research team. Rank 1 was assigned to the man with the highest level of daily physical activity.
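
| Man | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) | \(J\) |
|---|---|---|---|---|---|---|---|---|---|---|
| MR ( \(x\) ) | 6.24 | 5.94 | 6.83 | 6.53 | 6.31 | 7.44 | 7.32 | 8.70 | 7.88 | 7.78 |
| BMI ( \(y\) ) | 19.6 | 19.2 | 23.6 | 21.4 | 20.2 | 20.8 | 22.9 | 25.5 | 23.3 | 25.1 |
| DPA rank | 10 | 7 | 9 | 8 | 6 | 3 | 1 | 4 | 5 | 2 |
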
$$\text { [You may use } \quad \mathrm { S } _ { x y } = 15.1608 \quad \mathrm {~S} _ { x x } = 6.90181 \quad \mathrm {~S} _ { y y } = 45.304 \text { ] }$$
  1. Calculate the value of the product moment correlation coefficient between MR and BMI for these 10 men.
  2. Use your value of the product moment correlation coefficient to test, at the 5\% significance level, whether or not there is evidence of a positive correlation between MR and BMI.
    State your hypotheses clearly.
  3. State an assumption that must be made to carry out the test in part 2.
  4. Calculate the value of Spearman's rank correlation coefficient between MR and DPA for these 10 men.
  5. Use a two-tailed test and a \(5 \%\) level of significance to assess whether or not there is evidence of a correlation between MR and DPA.
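Part 4 can be sketched in code. The block below is a minimal illustration, assuming there are no tied values (true for these data) so that the formula \(r _ { s } = 1 - \frac { 6 \sum d ^ { 2 } } { n ( n ^ { 2 } - 1 ) }\) applies, and assuming MR is ranked with 1 for the highest value to match the direction of the DPA ranks. The helper `rank_descending` is hypothetical.

```python
# MR values and DPA ranks for men A to J, read from the table in the question.
mr = [6.24, 5.94, 6.83, 6.53, 6.31, 7.44, 7.32, 8.70, 7.88, 7.78]
dpa_rank = [10, 7, 9, 8, 6, 3, 1, 4, 5, 2]

def rank_descending(values):
    """Rank 1 = largest value. Assumes all values are distinct (no ties)."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

mr_rank = rank_descending(mr)
n = len(mr)

# Sum of squared rank differences, then Spearman's formula for untied ranks.
sum_d2 = sum((a - b) ** 2 for a, b in zip(mr_rank, dpa_rank))
r_s = 1 - 6 * sum_d2 / (n * (n * n - 1))

print(mr_rank, sum_d2, round(r_s, 4))
```

Ranking MR in the opposite direction would reverse the sign of `r_s`; either way, its magnitude is what gets compared with the two-tailed \(5 \%\) critical value for \(n = 10\) in part 5.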