5.08a Pearson correlation: calculate pmcc

246 questions

Edexcel S1 2012 June Q2
6 marks Moderate -0.8
2. A bank reviews its customer records at the end of each month to find out how many customers have become unemployed, \(u\), and how many have had their house repossessed, \(h\), during that month. The bank codes the data using variables \(x = \frac { u - 100 } { 3 }\) and \(y = \frac { h - 20 } { 7 }\) The results for the 12 months of 2009 are summarised below. $$\sum x = 477 \quad S _ { x x } = 5606.25 \quad \sum y = 480 \quad S _ { y y } = 4244 \quad \sum x y = 23070$$
  1. Calculate the value of the product moment correlation coefficient for \(x\) and \(y\).
  2. Write down the product moment correlation coefficient for \(u\) and \(h\).
    The bank claims that an increase in unemployment among its customers is associated with an increase in house repossessions.
  3. State, with a reason, whether or not the bank's claim is supported by these data.
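As a rough illustration of the calculation this question asks for, the sketch below uses only the summary statistics quoted above, together with the standard identities \(S_{xy} = \sum xy - \frac{\sum x \sum y}{n}\) and \(r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}\). The variable names are illustrative and this is not the official mark scheme.

```python
from math import sqrt

# Summary statistics quoted in the question (coded variables x and y).
n = 12
sum_x, sum_y, sum_xy = 477, 480, 23070
s_xx, s_yy = 5606.25, 4244

# S_xy = sum(xy) - sum(x) * sum(y) / n
s_xy = sum_xy - sum_x * sum_y / n

# r = S_xy / sqrt(S_xx * S_yy)
r_xy = s_xy / sqrt(s_xx * s_yy)

# Both codings subtract a constant and divide by a positive constant
# (3 and 7), so the coefficient for u and h equals the one for x and y.
r_uh = r_xy

print(f"S_xy = {s_xy:.0f}, r = {r_xy:.3f}")
```

The last step relies on the fact that the product moment correlation coefficient is unchanged by linear coding with positive scale factors.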
Edexcel S1 2013 June Q5
11 marks Moderate -0.3
5. A researcher believes that parents with a short family name tended to give their children a long first name. A random sample of 10 children was selected and the number of letters in their family name, \(x\), and the number of letters in their first name, \(y\), were recorded. The data are summarised as: $$\sum x = 60 , \quad \sum y = 61 , \quad \sum y ^ { 2 } = 393 , \quad \sum x y = 382 , \quad \mathrm {~S} _ { x x } = 28$$
  1. Find \(\mathrm { S } _ { y y }\) and \(\mathrm { S } _ { x y }\)
  2. Calculate the product moment correlation coefficient, \(r\), between \(x\) and \(y\).
  3. State, giving a reason, whether or not these data support the researcher's belief.
    The researcher decides to add a child with family name "Turner" to the sample.
  4. Using the definition \(\mathrm { S } _ { x x } = \sum ( x - \bar { x } ) ^ { 2 }\), state the new value of \(\mathrm { S } _ { x x }\) giving a reason for your answer.
    Given that the addition of the child with family name "Turner" to the sample leads to an increase in \(\mathrm { S } _ { y y }\),
  5. use the definition \(\mathrm { S } _ { x y } = \sum ( x - \bar { x } ) ( y - \bar { y } )\) to determine whether or not the value of \(r\) will increase, decrease or stay the same. Give a reason for your answer.
Edexcel S1 2013 June Q1
13 marks Moderate -0.8
1. A meteorologist believes that there is a relationship between the height above sea level, \(h \mathrm {~m}\), and the air temperature, \(t ^ { \circ } \mathrm { C }\). Data is collected at the same time from 9 different places on the same mountain. The data is summarised in the table below.

| \(h\) | 1400 | 1100 | 260 | 840 | 900 | 550 | 1230 | 100 | 770 |
|---|---|---|---|---|---|---|---|---|---|
| \(t\) | 3 | 10 | 20 | 9 | 10 | 13 | 5 | 24 | 16 |

[You may assume that \(\sum h = 7150 , \sum t = 110 , \sum h ^ { 2 } = 7171500 , \sum t ^ { 2 } = 1716\), \(\sum t h = 64980\) and \(\mathrm { S } _ { t t } = 371.56\) ]
  1. Calculate \(\mathrm { S } _ { t h }\) and \(\mathrm { S } _ { h h }\). Give your answers to 3 significant figures.
  2. Calculate the product moment correlation coefficient for this data.
  3. State whether or not your value supports the use of a regression equation to predict the air temperature at different heights on this mountain. Give a reason for your answer.
  4. Find the equation of the regression line of \(t\) on \(h\) giving your answer in the form \(t = a + b h\).
  5. Interpret the value of \(b\).
  6. Estimate the difference in air temperature between a height of 500 m and a height of 1000 m.
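The chain of calculations in this question (the \(S\) values, then \(r\), then the regression line of \(t\) on \(h\)) can be sketched as follows. The helper `s` is a hypothetical name introduced here, and the code assumes the nine data pairs read from the table above. It is an illustration of the standard formulae, not a model answer.

```python
from math import sqrt

# Heights (m) and temperatures (deg C) for the 9 places on the mountain.
h = [1400, 1100, 260, 840, 900, 550, 1230, 100, 770]
t = [3, 10, 20, 9, 10, 13, 5, 24, 16]
n = len(h)


def s(u, v):
    """Return S_uv = sum(uv) - sum(u) * sum(v) / n."""
    return sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v) / len(u)


s_th = s(t, h)
s_hh = s(h, h)
s_tt = s(t, t)

# Product moment correlation coefficient.
r = s_th / sqrt(s_hh * s_tt)

# Regression line of t on h: t = a + b * h.
b = s_th / s_hh
a = sum(t) / n - b * sum(h) / n

# Estimated change in temperature from 500 m up to 1000 m.
change_500_to_1000 = b * (1000 - 500)

print(f"S_th = {s_th:.0f}, S_hh = {s_hh:.0f}, r = {r:.3f}")
print(f"t = {a:.1f} + ({b:.4f}) h, change = {change_500_to_1000:.1f}")
```

Note that the difference between the two heights depends only on the gradient \(b\), because the intercept \(a\) cancels.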
Edexcel S1 2014 June Q3
16 marks Moderate -0.8
3. A large company is analysing how much money it spends on paper in its offices every year. The number of employees, \(x\), and the amount of money spent on paper, \(p\) ( \(\pounds\) hundreds), in 8 randomly selected offices are given in the table below.

| \(x\) | 8 | 9 | 12 | 14 | 7 | 3 | 16 | 19 |
|---|---|---|---|---|---|---|---|---|
| \(p\) (£ hundreds) | 40.5 | 36.1 | 30.4 | 39.4 | 32.6 | 31.1 | 43.4 | 45.7 |

$$\text { (You may use } \sum x ^ { 2 } = 1160 \quad \sum p = 299.2 \quad \sum p ^ { 2 } = 11422 \quad \sum x p = 3449.5 \text { ) }$$
  1. Show that \(S _ { p p } = 231.92\) and find the value of \(S _ { x x }\) and the value of \(S _ { x p }\)
  2. Calculate the product moment correlation coefficient between \(x\) and \(p\).
    The equation of the regression line of \(p\) on \(x\) is given in the form \(p = a + b x\).
  3. Show that, to 3 significant figures, \(b = 0.824\) and find the value of \(a\).
  4. Estimate the amount of money spent on paper in an office with 10 employees.
  5. Explain the effect each additional employee has on the amount of money spent on paper.
    Later the company realised it had made a mistake in adding up its costs, \(p\). The true costs were actually half of the values recorded. The product moment correlation coefficient and the equation of the linear regression line are recalculated using this information.
  6. Write down the new value of
    1. the product moment correlation coefficient,
    2. the gradient of the regression line.
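The final part turns on how rescaling one variable affects \(r\) and the gradient. A minimal numerical check, assuming the eight data pairs from the table above and using the hypothetical helper names `s`, `pmcc` and `gradient`, might look like this:

```python
from math import sqrt

# Number of employees and paper cost (in hundreds of pounds) for 8 offices.
x = [8, 9, 12, 14, 7, 3, 16, 19]
p = [40.5, 36.1, 30.4, 39.4, 32.6, 31.1, 43.4, 45.7]


def s(u, v):
    """Return S_uv = sum(uv) - sum(u) * sum(v) / n."""
    return sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v) / len(u)


def pmcc(u, v):
    """Product moment correlation coefficient between u and v."""
    return s(u, v) / sqrt(s(u, u) * s(v, v))


def gradient(u, v):
    """Gradient b of the regression line of v on u."""
    return s(u, v) / s(u, u)


# The corrected costs are half of the recorded values.
p_half = [value / 2 for value in p]

r_original, r_halved = pmcc(x, p), pmcc(x, p_half)
b_original, b_halved = gradient(x, p), gradient(x, p_half)

print(f"r: {r_original:.3f} -> {r_halved:.3f}")
print(f"b: {b_original:.3f} -> {b_halved:.3f}")
```

Halving \(p\) halves \(S_{xp}\) and quarters \(S_{pp}\), so the factor cancels in \(r\) but not in \(b = S_{xp} / S_{xx}\).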
Edexcel S1 2014 June Q3
13 marks Easy -1.2
3. The table shows data on the number of visitors to the UK in a month, \(v\) (1000s), and the amount of money they spent, \(m\) ( \(\pounds\) millions), for each of 8 months.

| Number of visitors, \(v\) (1000s) | 2450 | 2480 | 2540 | 2420 | 2350 | 2290 | 2400 | 2460 |
|---|---|---|---|---|---|---|---|---|
| Amount of money spent, \(m\) (£ millions) | 1370 | 1350 | 1400 | 1330 | 1270 | 1210 | 1330 | 1350 |

You may use \(S _ { v v } = 42587.5 \quad S _ { v m } = 31512.5 \quad S _ { m m } = 25187.5 \quad \sum v = 19390 \quad \sum m = 10610\)
  1. Find the product moment correlation coefficient between \(m\) and \(v\).
  2. Give a reason to support fitting a regression model of the form \(m = a + b v\) to these data.
  3. Find the value of \(b\) correct to 3 decimal places.
  4. Find the equation of the regression line of \(m\) on \(v\).
  5. Interpret your value of \(b\).
  6. Use your answer to part (d) to estimate the amount of money spent when the number of visitors to the UK in a month is 2500000
  7. Comment on the reliability of your estimate in part (f). Give a reason for your answer.
Edexcel S1 2015 June Q2
8 marks Moderate -0.8
2. An estate agent recorded the price per square metre, \(p \pounds / \mathrm { m } ^ { 2 }\), for 7 two-bedroom houses. He then coded the data using the coding \(q = \frac { p - a } { b }\), where \(a\) and \(b\) are positive constants. His results are shown in the table below.

| \(p\) | 1840 | 1848 | 1830 | 1824 | 1819 | 1834 | 1850 |
|---|---|---|---|---|---|---|---|
| \(q\) | 4.0 | 4.8 | 3.0 | 2.4 | 1.9 | 3.4 | 5.0 |

  1. Find the value of \(a\) and the value of \(b\)
    The estate agent also recorded the distance, \(d \mathrm {~km}\), of each house from the nearest train station. The results are summarised below. $$\mathrm { S } _ { d d } = 1.02 \quad \mathrm {~S} _ { q q } = 8.22 \quad \mathrm {~S} _ { d q } = - 2.17$$
  2. Calculate the product moment correlation coefficient between \(d\) and \(q\)
  3. Write down the value of the product moment correlation coefficient between \(d\) and \(p\)
    The estate agent records the price and size of 2 additional two-bedroom houses, \(H\) and \(J\).

    | House | Price (£) | Size (\(\mathrm { m } ^ { 2 }\)) |
    |---|---|---|
    | \(H\) | 156400 | 85 |
    | \(J\) | 172900 | 95 |

  4. Suggest which house is most likely to be closer to a train station. Justify your answer.
Edexcel S1 2016 June Q1
11 marks Moderate -0.8
1. A biologist is studying the behaviour of bees in a hive. Once a bee has located a source of food, it returns to the hive and performs a dance to indicate to the other bees how far away the source of the food is. The dance consists of a series of wiggles. The biologist records the distance, \(d\) metres, of the food source from the hive and the average number of wiggles, \(w\), in the dance.

| Distance, \(d\) m | 30 | 50 | 80 | 100 | 150 | 400 | 500 | 650 |
|---|---|---|---|---|---|---|---|---|
| Average number of wiggles, \(w\) | 0.725 | 1.210 | 1.775 | 2.250 | 3.518 | 6.382 | 8.185 | 9.555 |

[You may use \(\sum w = 33.6 \quad \sum d w = 13833 \quad \mathrm { S } _ { d d } = 394600 \quad \mathrm { S } _ { w w } = 80.481\) (to 3 decimal places)]
  1. Show that \(\mathrm { S } _ { d w } = 5601\)
  2. State, giving a reason, which is the response variable.
  3. Calculate the product moment correlation coefficient for these data.
  4. Calculate the equation of the regression line of \(w\) on \(d\), giving your answer in the form \(w = a + b d\)
    A new source of food is located 350 m from the hive.
    1. Use your regression equation to estimate the average number of wiggles in the corresponding dance.
    2. Comment, giving a reason, on the reliability of your estimate.
Edexcel S1 2016 June Q3
10 marks Moderate -0.8
3. Before going on holiday to Seapron, Tania records the weekly rainfall ( \(x \mathrm {~mm}\) ) at Seapron for 8 weeks during the summer. Her results are summarised as $$\sum x = 86.8 \quad \sum x ^ { 2 } = 985.88$$
  1. Find the standard deviation, \(\sigma _ { x }\), for these data.
    Tania also records the number of hours of sunshine ( \(y\) hours) per week at Seapron for these 8 weeks and obtains the following $$\bar { y } = 58 \quad \sigma _ { y } = 9.461 \text { (correct to } 4 \text { significant figures) } \quad \sum x y = 4900.5$$
  2. Show that \(\mathrm { S } _ { y y } = 716\) (correct to 3 significant figures)
  3. Find \(\mathrm { S } _ { x y }\)
  4. Calculate the product moment correlation coefficient, \(r\), for these data.
    During Tania's week-long holiday at Seapron there are 14 mm of rain and 70 hours of sunshine.
  5. State, giving a reason, what the effect of adding this information to the above data would be on the value of the product moment correlation coefficient.
Edexcel S1 2017 June Q1
14 marks Moderate -0.5
1. A clothes shop manager records the weekly sales figures, \(\pounds s\), and the average weekly temperature, \(t ^ { \circ } \mathrm { C }\), for 6 weeks during the summer. The sales figures were coded so that \(w = \frac { s } { 1000 }\)
The data are summarised as follows $$\mathrm { S } _ { w w } = 50 \quad \sum w t = 784 \quad \sum t ^ { 2 } = 2435 \quad \sum t = 119 \quad \sum w = 42$$
  1. Find \(\mathrm { S } _ { w t }\) and \(\mathrm { S } _ { t t }\)
  2. Write down the value of \(\mathrm { S } _ { s s }\) and the value of \(\mathrm { S } _ { s t }\)
  3. Find the product moment correlation coefficient between \(s\) and \(t\).
    The manager of the clothes shop believes that a linear regression model may be appropriate to describe these data.
  4. State, giving a reason, whether or not your value of the correlation coefficient supports the manager's belief.
  5. Find the equation of the regression line of \(w\) on \(t\), giving your answer in the form \(w = a + b t\)
  6. Hence find the equation of the regression line of \(s\) on \(t\), giving your answer in the form \(s = c + d t\), where \(c\) and \(d\) are correct to 3 significant figures.
  7. Using your equation in part (f), interpret the effect of a \(1 ^ { \circ } \mathrm { C }\) increase in average weekly temperature on weekly sales during the summer.
Edexcel S1 2018 June Q6
14 marks Moderate -0.8
6. A group of climbers collected information about the height above sea level, \(h\) metres, and the air temperature, \(t ^ { \circ } \mathrm { C }\), at the same time at 8 different points on the same mountain. The data are summarised by $$\sum h = 6370 \quad \sum t = 61 \quad \sum t h = 31070 \quad \sum t ^ { 2 } = 693$$
  1. Show that \(\mathrm { S } _ { t h } = - 17501.25\) and \(\mathrm { S } _ { t t } = 227.875\)
    The product moment correlation coefficient for these data is \(-0.985\)
  2. State, giving a reason, whether or not this value supports the use of a regression equation to predict the air temperature at different heights on this mountain.
  3. Find the equation of the regression line of \(t\) on \(h\), giving your answer in the form \(t = a + b h\). Give the value of your coefficients to 3 significant figures.
  4. Give an interpretation of your value of \(a\).
    One of the climbers has just stopped for a short break before climbing the next 150 metres.
  5. Estimate the drop in temperature over this 150 metre climb.
Edexcel S1 2004 November Q6
18 marks Easy -1.2
6. Students in Mr Brawn's exercise class have to do press-ups and sit-ups. The number of press-ups \(x\) and the number of sit-ups \(y\) done by a random sample of 8 students are summarised below. $$\begin{array} { l l } \Sigma x = 272 , & \Sigma x ^ { 2 } = 10164 , \quad \Sigma x y = 11222 , \\ \Sigma y = 320 , & \Sigma y ^ { 2 } = 13464 . \end{array}$$
  1. Evaluate \(S _ { x x } , S _ { y y }\) and \(S _ { x y }\).
  2. Calculate, to 3 decimal places, the product moment correlation coefficient between \(x\) and \(y\).
  3. Give an interpretation of your coefficient.
  4. Calculate the mean and the standard deviation of the number of press-ups done by these students.
    Mr Brawn assumes that the number of press-ups that can be done by any student can be modelled by a normal distribution with mean \(\mu\) and standard deviation \(\sigma\). Assuming that \(\mu\) and \(\sigma\) take the same values as those calculated in part (d),
  5. find the value of \(a\) such that \(\mathrm { P } ( \mu - a < X < \mu + a ) = 0.95\).
  6. Comment on Mr Brawn's assumption of normality.
Edexcel S3 2022 January Q3
12 marks Moderate -0.3
1. A medical research team carried out an investigation into the metabolic rate, MR, of men aged between 30 years and 60 years.
A random sample of 10 men was taken from this age group.
The table below shows for each man his MR and his body mass index, BMI. The table also shows the rank for the level of daily physical activity, DPA, which was assessed by the medical research team. Rank 1 was assigned to the man with the highest level of daily physical activity.
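
| Man | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) | \(J\) |
|---|---|---|---|---|---|---|---|---|---|---|
| MR (\(x\)) | 6.24 | 5.94 | 6.83 | 6.53 | 6.31 | 7.44 | 7.32 | 8.70 | 7.88 | 7.78 |
| BMI (\(y\)) | 19.6 | 19.2 | 23.6 | 21.4 | 20.2 | 20.8 | 22.9 | 25.5 | 23.3 | 25.1 |
| DPA rank | 10 | 7 | 9 | 8 | 6 | 3 | 1 | 4 | 5 | 2 |
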
$$\text { [You may use } \quad \mathrm { S } _ { x y } = 15.1608 \quad \mathrm {~S} _ { x x } = 6.90181 \quad \mathrm {~S} _ { y y } = 45.304 \text { ] }$$
  1. Calculate the value of the product moment correlation coefficient between MR and BMI for these 10 men.
  2. Use your value of the product moment correlation coefficient to test, at the 5\% significance level, whether or not there is evidence of a positive correlation between MR and BMI.
    State your hypotheses clearly.
  3. State an assumption that must be made to carry out the test in part (b).
  4. Calculate the value of Spearman's rank correlation coefficient between MR and DPA for these 10 men.
  5. Use a two-tailed test and a \(5 \%\) level of significance to assess whether or not there is evidence of a correlation between MR and DPA.
Edexcel S3 2023 January Q2
12 marks Standard +0.3
2 The table shows the season's best times, \(x\) seconds, for the 8 athletes who took part in the 200 m final in the 2021 Tokyo Olympics. It also shows their finishing position in the race.
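
| Athlete | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
|---|---|---|---|---|---|---|---|---|
| Season's best time | 19.89 | 19.83 | 19.74 | 19.84 | 19.91 | 19.99 | 20.13 | 20.10 |
| Finishing position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
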
Given that the fastest season's best time is ranked number 1
  1. calculate the value of the Spearman's rank correlation coefficient for these data.
  2. Stating your hypotheses clearly, test, at the \(1 \%\) level of significance, whether or not there is evidence of a positive correlation between the rank of the season's best time and the finishing position for these athletes.
    Chris suggests that it would be better to use the actual finishing time, \(y\) seconds, of these athletes rather than their finishing position. Given that $$S _ { x x } = 0.1286875 \quad S _ { y y } = 0.55275 \quad S _ { x y } = 0.225175$$
  3. calculate the product moment correlation coefficient between the season's best time and the finishing time for these athletes.
    Give your answer correct to 3 decimal places.
  4. Use your value of the product moment correlation coefficient to test, at the \(1 \%\) level of significance, whether or not there is evidence of a positive correlation between the season's best time and the finishing time for these athletes.
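To show how the two coefficients in this question are obtained, here is a minimal sketch. It assumes the data in the table above, ranks the season's best times with the fastest as rank 1 (there are no ties), and applies the usual formula \(r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}\). The dictionary names are illustrative. The hypothesis tests themselves need critical values from statistical tables, which are not reproduced here.

```python
from math import sqrt

# Season's best times (seconds) and finishing positions for the 8 athletes.
best = {"A": 19.89, "B": 19.83, "C": 19.74, "D": 19.84,
        "E": 19.91, "F": 19.99, "G": 20.13, "H": 20.10}
finish = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "H": 8}

# Rank the season's best times, fastest = rank 1.
ordered = sorted(best, key=best.get)
rank_best = {athlete: i + 1 for i, athlete in enumerate(ordered)}

# Spearman's rank correlation coefficient from the rank differences d.
n = len(best)
sum_d2 = sum((rank_best[k] - finish[k]) ** 2 for k in best)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Product moment correlation coefficient from the quoted summary values.
s_xx, s_yy, s_xy = 0.1286875, 0.55275, 0.225175
r = s_xy / sqrt(s_xx * s_yy)

print(f"sum d^2 = {sum_d2}, r_s = {r_s:.3f}, r = {r:.3f}")
```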
Edexcel S3 2024 January Q3
12 marks Standard +0.3
1. The table shows the annual tea consumption, \(t\) (kg/person), and population, \(p\) (millions), for a random sample of 7 European countries.

| Country | A | B | C | D | E | F | G |
|---|---|---|---|---|---|---|---|
| Annual tea consumption, \(t\) (kg/person) | 0.27 | 0.15 | 0.42 | 0.06 | 1.94 | 0.78 | 0.44 |
| Population, \(p\) (millions) | 5.4 | 5.8 | 9 | 10.2 | 67.9 | 17.1 | 8.7 |

$$\text { (You may use } \mathrm { S } _ { t t } = 2.486 \quad \mathrm {~S} _ { p p } = 3026.234 \quad \mathrm {~S} _ { p t } = 83.634 \text { ) }$$
Angela suggests using the product moment correlation coefficient to calculate the correlation between annual tea consumption and population.
  1. Use Angela's suggestion to test, at the \(5 \%\) level of significance, whether or not there is evidence of any correlation between annual tea consumption and population. State your hypotheses clearly and the critical value used.
    Johan suggests using Spearman's rank correlation coefficient to calculate the correlation between the rank of annual tea consumption and the rank of population.
  2. Calculate Spearman's rank correlation coefficient between the rank of annual tea consumption and the rank of population.
  3. Use Johan's suggestion to test, at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between annual tea consumption and population.
    State your hypotheses clearly and the critical value used.
Edexcel S3 2014 June Q4
12 marks Standard +0.3
4. In a survey 10 randomly selected men had their systolic blood pressure, \(x\), and weight, \(w\), measured. Their results are as follows

| Man | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) | \(J\) |
|---|---|---|---|---|---|---|---|---|---|---|
| \(x\) | 123 | 128 | 137 | 143 | 149 | 153 | 154 | 159 | 162 | 168 |
| \(w\) | 78 | 93 | 85 | 83 | 75 | 98 | 88 | 87 | 95 | 99 |

  1. Calculate the value of Spearman's rank correlation coefficient between \(x\) and \(w\).
  2. Stating your hypotheses clearly, test at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between systolic blood pressure and weight.
    The product moment correlation coefficient for these data is 0.5114
  3. Use the value of the product moment correlation coefficient to test, at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between systolic blood pressure and weight.
  4. Using your conclusions to part (b) and part (c), describe the relationship between systolic blood pressure and weight.
Edexcel S3 2018 June Q1
12 marks Standard +0.3
1. A random sample of 9 footballers is chosen to participate in an obstacle course. The time taken, \(y\) seconds, for each footballer to complete the obstacle course is recorded, together with the footballer's Body Mass Index, \(x\). The results are shown in the table below.

| Footballer | Body Mass Index, \(x\) | Time taken to complete the obstacle course, \(y\) seconds |
|---|---|---|
| A | 18.7 | 690 |
| B | 19.5 | 801 |
| C | 20.2 | 723 |
| D | 20.4 | 633 |
| E | 20.8 | 660 |
| F | 21.9 | 655 |
| G | 23.2 | 711 |
| H | 24.3 | 642 |
| I | 24.8 | 607 |

Russell claims, that for footballers, as Body Mass Index increases the time taken to complete the obstacle course tends to decrease.
  1. Find, to 3 decimal places, Spearman's rank correlation coefficient between \(x\) and \(y\).
  2. Use your value of Spearman's rank correlation coefficient to test Russell's claim. Use a 5\% significance level and state your hypotheses clearly.
    The product moment correlation coefficient for these data is \(-0.5594\)
  3. Use the value of the product moment correlation coefficient to test for evidence of a negative correlation between Body Mass Index and the time taken to complete the obstacle course. Use a 5\% significance level.
  4. Using your conclusions to part (b) and part (c), describe the relationship between Body Mass Index and the time taken to complete the obstacle course.
Edexcel S3 2023 June Q1
9 marks Standard +0.3
  1. State two conditions under which it might be more appropriate to use Spearman's rank correlation coefficient rather than the product moment correlation coefficient.
    A random sample of 10 melons was taken from a market stall. The length, in centimetres, and maximum diameter, in centimetres, of each melon were recorded. The Spearman's rank correlation coefficient between the results was \(-0.673\)
  2. Test, at the \(5 \%\) level of significance, whether or not there is evidence of a correlation. State clearly your hypotheses and the critical value used.
    The product moment correlation coefficient between the results was \(-0.525\)
  3. Test, at the \(5 \%\) level of significance, whether or not there is evidence of a negative correlation.
    State clearly your hypotheses and the critical value used.
Edexcel S3 2021 October Q3
14 marks Standard +0.3
3. A cafe owner wishes to know whether the price of strawberry jam is related to the taste of the jam. He finds a website that lists the price per 100 grams and a mark for the taste, out of 100, awarded by a judge, for 9 different strawberry jams \(A , B , C , D , E , F , G , H\) and \(I\). He then ranks the marks for taste and the prices. The ranks are shown in the table below.
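
| Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Price | \(A\) | \(B\) | \(E\) | \(C\) | \(D\) | \(F\) | \(G\) | \(H\) | \(I\) |
| Taste | \(A\) | \(B\) | \(F\) | \(E\) | \(H\) | \(G\) | \(I\) | \(C\) | \(D\) |
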
  1. Calculate Spearman's rank correlation coefficient for these data.
  2. Test, at the \(5 \%\) level of significance, whether or not there is a relationship between the price and the taste of these strawberry jams. State your hypotheses clearly.
    A friend suggests that it would be better to use the price per 100 grams, \(c\), and the mark for the taste, \(m\), for each strawberry jam rather than rank them. Given that $$\mathrm { S } _ { c c } = 2.0455 \quad \mathrm {~S} _ { m m } = 243.5556 \quad \mathrm {~S} _ { c m } = 16.4943$$
  3. calculate the product moment correlation coefficient between the price and the mark for taste of these strawberry jams, giving your answer correct to 3 decimal places.
  4. Use your value of the product moment correlation coefficient to test, at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between the price and the mark for taste of these 9 strawberry jams. State your hypotheses clearly.
  5. State which of the tests in parts (b) and (d) is more appropriate for the cafe owner to use. Give a reason for your answer.
Edexcel S3 2008 June Q3
14 marks Standard +0.3
  1. The product moment correlation coefficient is denoted by \(r\) and Spearman's rank correlation coefficient is denoted by \(r _ { s }\).
    1. Sketch separate scatter diagrams, with five points on each diagram, to show
      1. \(r = 1\),
      2. \(r _ { s } = - 1\) but \(r > - 1\).
    Two judges rank seven collie dogs in a competition. The collie dogs are labelled \(A\) to \(G\) and the rankings are as follows
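
    | Rank | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
    |---|---|---|---|---|---|---|---|
    | Judge 1 | \(A\) | \(C\) | \(D\) | \(B\) | \(E\) | \(F\) | \(G\) |
    | Judge 2 | \(A\) | \(B\) | \(D\) | \(C\) | \(E\) | \(G\) | \(F\) |
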
    1. Calculate Spearman's rank correlation coefficient for these data.
    2. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not the judges are generally in agreement.
Edexcel S3 2013 June Q3
13 marks Standard +0.3
3. The table below shows the population and the number of council employees for different towns and villages.
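
| Town or village | Population | Number of council employees |
|---|---|---|
| A | 211 | 10 |
| B | 356 | 2 |
| C | 1047 | 12 |
| D | 2463 | 21 |
| E | 4892 | 16 |
| F | 6479 | 25 |
| G | 6571 | 67 |
| H | 6573 | 45 |
| I | 9845 | 48 |
| J | 14784 | 34 |
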
  1. Find, to 3 decimal places, Spearman's rank correlation coefficient between the population and the number of council employees.
  2. Use your value of Spearman's rank correlation coefficient to test for evidence of a positive correlation between the population and the number of council employees. Use a \(2.5 \%\) significance level. State your hypotheses clearly.
    It is suggested that a product moment correlation coefficient would be a more suitable calculation in this case. The product moment correlation coefficient for these data is 0.627 to 3 decimal places.
  3. Use the value of the product moment correlation coefficient to test for evidence of a positive correlation between the population and the number of council employees. Use a \(2.5 \%\) significance level.
  4. Interpret and comment on your results from part(b) and part(c).
Edexcel S3 2014 June Q8
16 marks Standard +0.3
8. The heights, in metres, and weights, in kilograms, of a random sample of 9 men are shown in the table below
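
| Man | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) |
|---|---|---|---|---|---|---|---|---|---|
| Height (\(x\)) | 1.68 | 1.74 | 1.75 | 1.76 | 1.78 | 1.82 | 1.84 | 1.88 | 1.98 |
| Weight (\(y\)) | 75 | 76 | 100 | 77 | 90 | 95 | 110 | 96 | 120 |
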
  1. Given that \(\mathrm { S } _ { x x } = 0.0632 , \mathrm {~S} _ { y y } = 1957.5556\) and \(\mathrm { S } _ { x y } = 9.3433\) calculate, to 3 decimal places, the product moment correlation coefficient between height and weight for these men.
  2. Use your value of the product moment correlation coefficient to test whether or not there is evidence of a positive correlation between the height and weight of men. Use a \(5 \%\) significance level. State your hypotheses clearly.
    Peter does not know the heights or weights of the 9 men. He is given photographs of them and asked to put them in order of increasing weight. He puts them in the order $$A C E B G D I F H$$
  3. Find, to 3 decimal places, Spearman's rank correlation coefficient between Peter's order and the actual order.
  4. Use your value of Spearman's rank correlation coefficient to test for evidence of Peter's ability to correctly order men, by their weight, from their photographs. Use a 5\% significance level and state your hypotheses clearly.
Edexcel S3 2018 June Q1
13 marks Standard +0.3
1. Phil measures the concentration of a radioactive element, \(c\), and the amount of dissolved solids, \(a\), of 8 random samples of groundwater. His results are shown in the table below.

| Sample | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
|---|---|---|---|---|---|---|---|---|
| \(c\) | 625 | 700 | 650 | 645 | 720 | 600 | 825 | 665 |
| \(a\) | 1.28 | 1.30 | 1.00 | 1.20 | 1.55 | 1.15 | 1.40 | 1.45 |

Given that $$\mathrm { S } _ { c c } = 34787.5 \quad \mathrm {~S} _ { a a } = 0.2172875 \quad \mathrm {~S} _ { c a } = 47.7625$$
  1. calculate, to 3 decimal places, the product moment correlation coefficient between the concentration of the radioactive element and the amount of dissolved solids for these groundwater samples.
  2. Use your value of the product moment correlation coefficient to test whether or not there is evidence of a positive correlation between the concentration of this radioactive element and the amount of dissolved solids in groundwater. Use a \(5 \%\) significance level. State your hypotheses clearly.
  3. Calculate, to 3 decimal places, Spearman's rank correlation coefficient between the concentration of the radioactive element and the amount of dissolved solids.
  4. Use your value of Spearman's rank correlation coefficient to test for evidence of a positive correlation between the concentration of the radioactive element and the amount of dissolved solids. Use a \(5 \%\) significance level. State your hypotheses clearly.
  5. Using your conclusions in part (b) and part (d), comment on the possible relationship between these variables.
AQA S1 2006 January Q5
11 marks Easy -1.2
5 [Figure 1, printed on the insert, is provided for use in this question.]
The table shows the times, in seconds, taken by a random sample of 10 boys from a junior swimming club to swim 50 metres freestyle and 50 metres backstroke.
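
| Boy | A | B | C | D | E | F | G | H | I | J |
|---|---|---|---|---|---|---|---|---|---|---|
| Freestyle (\(x\) seconds) | 30.2 | 32.8 | 25.1 | 31.8 | 31.2 | 35.6 | 32.4 | 38.0 | 36.1 | 34.1 |
| Backstroke (\(y\) seconds) | 33.5 | 35.4 | 37.4 | 27.2 | 34.7 | 38.2 | 37.7 | 41.4 | 42.3 | 38.4 |
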
  1. On Figure 1, complete the scatter diagram for these data.
  2. Hence:
    1. give two distinct comments on what your scatter diagram reveals;
    2. state, without calculation, which of the following 3 values is most likely to be the value of the product moment correlation coefficient for the data in your scatter diagram. $$0.912 \quad 0.088 \quad 0.462$$
  3. In the sample of 10 boys, one boy is a junior-champion freestyle swimmer and one boy is a junior-champion backstroke swimmer. Identify the two most likely boys.
  4. Removing the data for the two boys whom you identified in part (c):
    1. calculate the value of the product moment correlation coefficient for the remaining 8 pairs of values of \(x\) and \(y\);
    2. comment, in context, on the value that you obtain.
AQA S1 2008 January Q2
7 marks Moderate -0.8
2 The head and body length, \(x\) millimetres, and tail length, \(y\) millimetres, of each of a sample of 20 adult dormice were measured. The following statistics are derived from the results. $$S _ { x x } = 1280.55 \quad S _ { y y } = 281.8 \quad S _ { x y } = 416.3$$
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret your value in the context of this question.
  3. Write down the value of the product moment correlation coefficient if the measurements had been recorded in centimetres.
  4. Give a reason why it is not generally advisable to calculate the value of the product moment correlation coefficient without first viewing a scatter diagram of the data. Illustrate your answer with a sketch.
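A short sketch of the calculation, using only the three summary values quoted in this question, is given below. It also checks numerically that re-expressing both measurements in centimetres leaves the coefficient unchanged. The scale factor `k` and the variable names are illustrative assumptions, not part of the question.

```python
from math import sqrt

# Summary statistics quoted in the question (measurements in millimetres).
s_xx, s_yy, s_xy = 1280.55, 281.8, 416.3

# r = S_xy / sqrt(S_xx * S_yy)
r_mm = s_xy / sqrt(s_xx * s_yy)

# Converting mm to cm divides every x and y by k = 10, so each of
# S_xx, S_yy and S_xy is divided by k squared.
k = 10
r_cm = (s_xy / k**2) / sqrt((s_xx / k**2) * (s_yy / k**2))

print(f"r (mm) = {r_mm:.3f}, r (cm) = {r_cm:.3f}")
```

The factor \(k^2\) appears once in the numerator and once under the square root in the denominator, so it cancels.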
AQA S1 2009 January Q2
7 marks Moderate -0.3
2 A greengrocer sells bunches of 9 carrots at his Saturday market stall. Tom and Geri are two Statistics students who work on the stall. Each selects a bunch of carrots at random.
  1. At home, Tom measures the length, \(x\) centimetres, and the maximum diameter, \(y\) centimetres, of each carrot in his selected bunch with the following results.

    | \(x\) | 16.2 | 13.1 | 10.4 | 12.1 | 14.6 | 9.7 | 11.8 | 13.6 | 17.3 |
    |---|---|---|---|---|---|---|---|---|---|
    | \(y\) | 4.2 | 3.9 | 4.7 | 3.3 | 3.7 | 2.4 | 3.1 | 3.5 | 2.7 |

    1. Calculate the value of the product moment correlation coefficient.
    2. Interpret your value in context.
  2. At her home, Geri measures the length, in centimetres, and the weight, in grams, of each carrot in her selected bunch and then obtains a value of \(-0.986\) for the product moment correlation coefficient. Comment, with a reason, on the likely validity of Geri's value.
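For Tom's data, where no summary statistics are supplied, the coefficient can be computed directly from the definitions \(S_{xx} = \sum (x - \bar{x})^2\), \(S_{yy} = \sum (y - \bar{y})^2\) and \(S_{xy} = \sum (x - \bar{x})(y - \bar{y})\). The sketch below assumes the nine pairs read from the table in part 1. It is an illustration rather than a model answer.

```python
from math import sqrt

# Length x (cm) and maximum diameter y (cm) of the 9 carrots in Tom's bunch.
x = [16.2, 13.1, 10.4, 12.1, 14.6, 9.7, 11.8, 13.6, 17.3]
y = [4.2, 3.9, 4.7, 3.3, 3.7, 2.4, 3.1, 3.5, 2.7]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of squares and products about the means.
s_xx = sum((a - mean_x) ** 2 for a in x)
s_yy = sum((b - mean_y) ** 2 for b in y)
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))

r = s_xy / sqrt(s_xx * s_yy)

print(f"S_xx = {s_xx:.2f}, S_yy = {s_yy:.2f}, S_xy = {s_xy:.2f}, r = {r:.3f}")
```

The value that results is very close to zero, which is the starting point for the interpretation asked for in context.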