5.08a Pearson correlation: calculate pmcc

246 questions

AQA S1 2011 January Q1
7 marks Easy -1.3
1
  1. Estimate, without undertaking any calculations, the value of the product moment correlation coefficient between the variables \(x\) and \(y\) for each of the two scatter diagrams.
    [Two scatter diagrams, the first captioned (i); images not reproduced.]
  2. The table gives the circumference, \(x\) centimetres, and the weight, \(y\) grams, of each of 12 new cricket balls.
    | \(\boldsymbol { x }\) | 22.5 | 22.7 | 22.6 | 22.4 | 22.5 | 22.8 | 22.6 | 22.7 | 22.8 | 22.4 | 22.9 | 22.6 |
    | \(\boldsymbol { y }\) | 160.3 | 159.4 | 157.8 | 158.0 | 157.3 | 159.8 | 158.3 | 159.6 | 161.3 | 156.4 | 162.5 | 161.2 |
    1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Assuming that the 12 balls may be considered to be a random sample, interpret your value in context.
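A minimal Python sketch for checking a hand or calculator value of the coefficient from raw paired data, using \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\). The helper name `pmcc` and the variable names are introduced here for illustration and are not part of the question; the data are the 12 cricket-ball readings as read from the table.

```python
from math import sqrt


def pmcc(xs, ys):
    """Return r = Sxy / sqrt(Sxx * Syy) for paired data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Corrected sums of squares and products about the sample means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


# Circumference (cm) and weight (g) of the 12 cricket balls.
circumference = [22.5, 22.7, 22.6, 22.4, 22.5, 22.8,
                 22.6, 22.7, 22.8, 22.4, 22.9, 22.6]
weight = [160.3, 159.4, 157.8, 158.0, 157.3, 159.8,
          158.3, 159.6, 161.3, 156.4, 162.5, 161.2]

r = pmcc(circumference, weight)
print(f"r = {r:.3f}")
```

With the values as transcribed, this should give a coefficient of about 0.757. The same helper can be reused for any of the raw-data tables in this list.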
AQA S1 2013 January Q4
12 marks Moderate -0.3
4 Ashok is a work-experience student with an organisation that offers two separate professional examination papers, I and II. For each of a random sample of 12 students, A to L , he records the mark, \(x\) per cent, achieved on Paper I, and the mark, \(y\) per cent, achieved on Paper II.
| Student | A | B | C | D | E | F | G | H | I | J | K | L |
| \(\boldsymbol { x }\) | 34 | 46 | 53 | 62 | 67 | 72 | 60 | 54 | 70 | 71 | 82 | 85 |
| \(\boldsymbol { y }\) | 61 | 66 | 72 | 78 | 88 | 81 | 49 | 60 | 54 | 44 | 49 | 36 |
    1. Calculate the value of the product moment correlation coefficient, \(r\), between \(x\) and \(y\).
    2. Interpret your value of \(r\) in the context of this question.
    1. Give two possible advantages of plotting data on a graph before calculating the value of a product moment correlation coefficient.
    2. Complete the plotting of Ashok's data on the scatter diagram on page 5.
    3. State what is now revealed by the scatter diagram.
  1. Ashok subsequently discovers that students A to F have a more scientific background than students G to L. With reference to your scatter diagram, estimate the value of the product moment correlation coefficient for each of the two groups of students. You are not expected to calculate the two values.
    | Student | G | H | I | J | K | L |
    | \(\boldsymbol { x }\) | 60 | 54 | 70 | 71 | 82 | 85 |
    | \(\boldsymbol { y }\) | 49 | 60 | 54 | 44 | 49 | 36 |
    [Scatter diagram titled "Examination Marks"; image not reproduced.]
AQA S1 2007 June Q1
5 marks Moderate -0.8
1 The table shows the length, in centimetres, and maximum diameter, in centimetres, of each of 10 honeydew melons selected at random from those on display at a market stall.
| Length | 24 | 25 | 19 | 28 | 27 | 21 | 35 | 23 | 32 | 26 |
| Maximum diameter | 18 | 14 | 16 | 11 | 13 | 14 | 12 | 16 | 15 | 14 |
  1. Calculate the value of the product moment correlation coefficient.
  2. Interpret your value in the context of this question.
AQA S1 2008 June Q3
10 marks Easy -1.3
3 [Figure 1, printed on the insert, is provided for use in this question.]
The table shows, for each of a sample of 12 handmade decorative ceramic plaques, the length, \(x\) millimetres, and the width, \(y\) millimetres.
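| Plaque | \(\boldsymbol { x }\) | \(\boldsymbol { y }\) |
| A | 232 | 109 |
| B | 235 | 112 |
| C | 236 | 114 |
| D | 234 | 118 |
| E | 230 | 117 |
| F | 230 | 113 |
| G | 246 | 121 |
| H | 240 | 125 |
| I | 244 | 128 |
| J | 241 | 122 |
| K | 246 | 126 |
| L | 245 | 123 |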
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret your value in the context of this question.
  3. On Figure 1, complete the scatter diagram for these data.
  4. In fact, the 6 plaques \(\mathrm { A } , \mathrm { B } , \ldots , \mathrm { F }\) are from a different source to the 6 plaques \(\mathrm { G } , \mathrm { H } , \ldots , \mathrm { L }\). With reference to your scatter diagram, but without further calculations, estimate the value of the product moment correlation coefficient between \(x\) and \(y\) for each source of plaque.
AQA S1 2010 June Q1
5 marks Moderate -0.5
1 The weight, \(x \mathrm {~kg}\), and the engine power, \(y \mathrm { bhp }\), of each car in a random sample of 10 hatchback cars are shown in the table.
| \(\boldsymbol { x }\) | 1196 | 1062 | 1335 | 1429 | 1012 | 1355 | 1145 | 1417 | 1275 | 1284 |
| \(\boldsymbol { y }\) | 123 | 88 | 150 | 158 | 69 | 120 | 94 | 143 | 107 | 128 |
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret your value in the context of the question.
AQA S1 2011 June Q7
9 marks Moderate -0.3
7
  1. Three airport management trainees, Ryan, Sunil and Tim, were each instructed to select a random sample of 12 suitcases from those waiting to be loaded onto aircraft. Each trainee also had to measure the volume, \(x\), and the weight, \(y\), of each of the 12 suitcases in his sample, and then calculate the value of the product moment correlation coefficient, \(r\), between \(x\) and \(y\).
    • Ryan obtained a value of -0.843.
    • Sunil obtained a value of +0.007.
    Explain why neither of these two values is likely to be correct.
  2. Peggy, a supervisor with many years' experience, measured the volume, \(x\) cubic feet, and the weight, \(y\) pounds, of each suitcase in a random sample of 6 suitcases, and then obtained a value of 0.612 for \(r\).
    • Ryan and Sunil each claimed that Peggy's value was different from their values because she had measured the volumes in cubic feet and the weights in pounds, whereas they had measured the volumes in cubic metres and the weights in kilograms.
    • Tim claimed that Peggy's value was almost exactly half his calculated value because she had used a sample of size 6 whereas he had used one of size 12.
    Explain why neither of these two claims is valid.
  3. Quentin, a manager, recorded the volumes, \(v\), and the weights, \(w\), of a random sample of 8 suitcases as follows.
    | \(\boldsymbol { v }\) | 28.1 | 19.7 | 46.4 | 23.6 | 31.1 | 17.5 | 35.8 | 13.8 |
    | \(\boldsymbol { w }\) | 14.9 | 12.1 | 21.1 | 18.0 | 19.8 | 19.2 | 16.2 | 14.7 |
    1. Calculate the value of \(r\) between \(v\) and \(w\).
    2. Interpret your value in the context of this question.
AQA S1 2012 June Q1
4 marks Easy -1.2
1 A production line in a rolling mill produces lengths of steel.
A random sample of 20 lengths of steel from the production line was selected. The minimum width, \(x\) centimetres, and the minimum thickness, \(y\) millimetres, of each selected length was recorded. The following summarised information was then calculated from these records. $$S _ { x x } = 2.030 \quad S _ { y y } = 1.498 \quad S _ { x y } = - 0.410$$
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret your value in the context of the question.
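When \(S_{xx}\), \(S_{yy}\) and \(S_{xy}\) are quoted directly, the coefficient follows in one step from \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\). A short Python sketch of that step, using the steel-length figures above; the function name `pmcc_from_s` is a label chosen here, not something defined by the question.

```python
from math import sqrt


def pmcc_from_s(sxx, syy, sxy):
    """r = Sxy / sqrt(Sxx * Syy); Sxx and Syy must both be positive."""
    if sxx <= 0 or syy <= 0:
        raise ValueError("Sxx and Syy must be positive")
    return sxy / sqrt(sxx * syy)


# Summary values quoted for the 20 lengths of steel.
r = pmcc_from_s(sxx=2.030, syy=1.498, sxy=-0.410)
print(f"r = {r:.3f}")
```

The sign of the result is the sign of \(S_{xy}\), so a negative \(S_{xy}\) gives a negative coefficient here.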
AQA S1 2013 June Q1
7 marks Moderate -0.8
1 The average maximum monthly temperatures, \(u\) degrees Fahrenheit, and the average minimum monthly temperatures, \(v\) degrees Fahrenheit, in New York City are as follows.
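| Month | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
| Maximum (\(u\)) | 39 | 40 | 48 | 61 | 71 | 81 | 85 | 83 | 77 | 67 | 54 | 41 |
| Minimum (\(v\)) | 26 | 27 | 34 | 44 | 53 | 63 | 68 | 66 | 60 | 51 | 41 | 30 |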
    1. Calculate, to one decimal place, the mean and the standard deviation of the 12 values of the average maximum monthly temperature.
    2. For comparative purposes with a UK city, it was necessary to convert the temperatures from degrees Fahrenheit ( \({ } ^ { \circ } \mathrm { F }\) ) to degrees Celsius ( \({ } ^ { \circ } \mathrm { C }\) ). The formula used to convert \(f ^ { \circ } \mathrm { F }\) to \(c ^ { \circ } \mathrm { C }\) is: $$c = \frac { 5 } { 9 } ( f - 32 )$$ Use this formula and your answers in part (a)(i) to calculate, in \({ } ^ { \circ } \mathbf { C }\), the mean and the standard deviation of the 12 values of the average maximum monthly temperature.
      (3 marks)
  1. The value of the product moment correlation coefficient, \(r _ { u v }\), between the above 12 values of \(u\) and \(v\) is 0.997, correct to three decimal places. State, giving a reason, the corresponding value of \(r _ { x y }\), where \(x\) and \(y\) are the exact equivalent temperatures in \({ } ^ { \circ } \mathrm { C }\) of \(u\) and \(v\) respectively.
    (2 marks)
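The last part rests on the fact that the coefficient is unchanged by a linear change of units with a positive scale factor, such as \(c = \frac{5}{9}(f - 32)\). The Python sketch below illustrates this numerically with the New York figures as read from the table; `pmcc` and `to_celsius` are helper names introduced here for illustration only.

```python
from math import sqrt


def pmcc(xs, ys):
    """Return r = Sxy / sqrt(Sxx * Syy) for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return 5 / 9 * (f - 32)


# Average maximum (u) and minimum (v) monthly temperatures in degrees F.
u = [39, 40, 48, 61, 71, 81, 85, 83, 77, 67, 54, 41]
v = [26, 27, 34, 44, 53, 63, 68, 66, 60, 51, 41, 30]

# The same temperatures expressed in degrees C.
x = [to_celsius(f) for f in u]
y = [to_celsius(f) for f in v]

r_uv = pmcc(u, v)
r_xy = pmcc(x, y)
print(f"r_uv = {r_uv:.3f}, r_xy = {r_xy:.3f}")
```

The two printed values agree up to floating-point rounding, because the scale factor \(\frac{5}{9}\) cancels between \(S_{xy}\) and \(\sqrt{S_{xx} S_{yy}}\) and the shift of 32 drops out of every deviation from the mean.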
AQA S1 2013 June Q4
17 marks Standard +0.3
4 The girth, \(g\) metres, the length, \(l\) metres, and the weight, \(y\) kilograms, of each of a sample of 20 pigs were measured. The data collected is summarised as follows. $$S _ { g g } = 0.1196 \quad S _ { l l } = 0.0436 \quad S _ { y y } = 5880 \quad S _ { g y } = 24.15 \quad S _ { l y } = 10.25$$
  1. Calculate the value of the product moment correlation coefficient between:
    1. girth and weight;
    2. length and weight.
  2. Interpret, in context, each of the values that you obtained in part (a).
  3. Weighing pigs requires expensive equipment, whereas measuring their girths and lengths simply requires a tape measure. With this in mind, the following formula is proposed to make an estimate of a pig's weight, \(x\) kilograms, from its girth and length. $$x = 69.3 \times g ^ { 2 } \times l$$ Applying this formula to the relevant data on the 20 pigs resulted in $$S _ { x x } = 5656.15 \quad S _ { x y } = 5662.97$$
    1. By calculating a third value of the product moment correlation coefficient, state which of \(g , l\) or \(x\) is the most strongly correlated with \(y\), the weight.
    2. Estimate the weight of a pig that has a girth of 1.25 metres and a length of 1.15 metres.
    3. Given the additional information that \(\bar { x } = 115.4\) and \(\bar { y } = 116.0\), calculate the equation of the least squares regression line of \(y\) on \(x\), in the form \(y = a + b x\).
    4. Comment on the likely accuracy of the estimated weight found in part (c)(ii). Your answer should make reference to the value of the product moment correlation coefficient found in part (c)(i) and to the values of \(b\) and \(a\) found in part (c)(iii).
      (4 marks)
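A Python sketch of the arithmetic behind this question, assuming only the summary statistics quoted above. It compares the three coefficients, evaluates the proposed formula for the pig in part (c)(ii), and forms the regression line of \(y\) on \(x\) from \(b = S_{xy} / S_{xx}\) and \(a = \bar{y} - b\bar{x}\). All variable and function names are chosen here for illustration.

```python
from math import sqrt

# Summary statistics quoted in the question (sample of 20 pigs).
S_GG, S_LL, S_YY = 0.1196, 0.0436, 5880
S_GY, S_LY = 24.15, 10.25
S_XX, S_XY = 5656.15, 5662.97
X_BAR, Y_BAR = 115.4, 116.0


def pmcc_from_s(sxx, syy, sxy):
    """r = Sxy / sqrt(Sxx * Syy)."""
    return sxy / sqrt(sxx * syy)


# Correlation of weight with girth, with length, and with the formula estimate.
r_gy = pmcc_from_s(S_GG, S_YY, S_GY)
r_ly = pmcc_from_s(S_LL, S_YY, S_LY)
r_xy = pmcc_from_s(S_XX, S_YY, S_XY)
strongest = max([("g", r_gy), ("l", r_ly), ("x", r_xy)], key=lambda t: abs(t[1]))

# Formula estimate x = 69.3 * g^2 * l for girth 1.25 m and length 1.15 m.
x_est = 69.3 * 1.25 ** 2 * 1.15

# Least squares regression line of y on x: y = a + b x.
b = S_XY / S_XX
a = Y_BAR - b * X_BAR

print(f"r_gy = {r_gy:.3f}, r_ly = {r_ly:.3f}, r_xy = {r_xy:.3f}")
print(f"most strongly correlated with y: {strongest[0]}")
print(f"estimated weight = {x_est:.1f} kg")
print(f"y = {a:.3f} + {b:.4f} x")
```

A slope close to 1 and an intercept close to 0, together with a coefficient close to 1, are what the final part asks the reader to comment on.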
AQA S1 2014 June Q5
13 marks Moderate -0.5
5 As part of a study of charity shops in a small market town, two such shops, \(X\) and \(Y\), were each asked to provide details of its takings on 12 randomly selected days. The table shows, for each of the 12 days, the day's takings, \(\pounds x\), of charity shop \(X\) and the day's takings, \(\pounds y\), of charity shop \(Y\).
| Day | A | B | C | D | E | F | G | H | I | J | K | L |
| \(\boldsymbol { x }\) | 46 | 57 | 39 | 116 | 62 | 77 | 41 | 61 | 15 | 53 | 68 | 61 |
| \(\boldsymbol { y }\) | 78 | 102 | 66 | 214 | 98 | 72 | 98 | 134 | 21 | 67 | 95 | 83 |
    1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Interpret your value in the context of this question.
  1. Complete the scatter diagram shown on the opposite page.
  2. The investigator realised subsequently that one of the 12 selected days was a particularly popular town market day and another was a day on which the weather was extremely severe. Identify each of these days giving a reason for each choice.
  3. Removing the two days described in part (c) from the data gives the following information. $$S _ { x x } = 1292.5 \quad S _ { y y } = 3850.1 \quad S _ { x y } = 407.5$$
    1. Use this information to recalculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    2. Hence revise, as necessary, your interpretation in part (a)(ii).
      [3 marks]
      [Scatter diagram titled "Charity Shops", with an axis labelled "Shop \(X\) takings (£)"; image not reproduced.]
AQA S1 2014 June Q4
7 marks Moderate -0.3
4 Every year, usually during early June, the Isle of Man hosts motorbike races. Each race consists of three consecutive laps of the island's course. To compete in a race, a rider must first complete at least one qualifying lap. The data refer to the lightweight motorbike class in 2012 and show, for each of a random sample of 10 riders, values of $$u = x - 100 \quad \text { and } \quad v = y - 100$$ where \(x\) denotes the average speed, in mph, for the rider's fastest qualifying lap and \(y\) denotes the average speed, in mph, for the rider's three laps of the race.
| Rider | A | B | C | D | E | F | G | H | I | J |
| \(\boldsymbol { u }\) | 7.88 | 13.02 | 4.29 | 2.88 | 6.26 | 7.03 | 3.60 | 11.78 | 13.15 | 11.69 |
| \(\boldsymbol { v }\) | 6.63 | 10.16 | 3.63 | 0.47 | 5.70 | 8.01 | 3.30 | 7.31 | 13.08 | 11.82 |
    1. Calculate the value of \(r _ { u v }\), the product moment correlation coefficient between \(u\) and \(v\).
    2. Hence state the value of \(r _ { x y }\), giving a reason for your answer.
  1. Interpret your value of \(r _ { x y }\) in the context of this question.
AQA S1 2016 June Q1
5 marks Moderate -0.8
1 The table shows the heights, \(x \mathrm {~cm}\), and the arm spans, \(y \mathrm {~cm}\), of a random sample of 12 men aged between 21 years and 40 years.
| \(\boldsymbol { x }\) | 152 | 166 | 154 | 159 | 179 | 167 | 155 | 168 | 174 | 182 | 161 | 163 |
| \(\boldsymbol { y }\) | 143 | 154 | 151 | 153 | 168 | 160 | 146 | 163 | 170 | 175 | 155 | 158 |
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
  2. Interpret, in context, your value calculated in part (a).
Edexcel S1 Q5
12 marks Moderate -0.8
  1. The table shows the numbers of cars and vans in a company's fleet having registrations with the prefix letters shown.
| Registration letter | \(K\) | \(L\) | \(M\) | \(N\) | \(P\) | \(R\) | \(S\) | \(T\) | \(V\) |
| Number of cars \(( x )\) | 6 | 7 | 9 | 11 | 15 | 14 | 12 | 10 | 7 |
| Number of vans \(( y )\) | 8 | 10 | 14 | 13 | 13 | 15 | 14 | 9 | 8 |
  1. Plot a scatter graph of this data, with the number of cars on the horizontal axis and the number of vans on the vertical axis.
  2. If there were 4 \(J\)-registered cars, estimate the number of \(J\)-registered vans.
  Given that \(\sum x ^ { 2 } = 1001 , \sum y ^ { 2 } = 1264\) and \(\sum x y = 1106\),
  3. calculate the product-moment correlation coefficient between \(x\) and \(y\). Give a brief interpretation of your answer.
Edexcel S1 Q2
7 marks Moderate -0.8
2. A tennis coach believes that taller players are generally capable of hitting faster serves. To investigate this hypothesis he collects data on the 20 adult male players he coaches. The height, \(h\), in metres and the speed of each player's fastest serve, \(v\), in miles per hour were recorded and summarised as follows: $$\Sigma h = 36.22 , \quad \Sigma v = 2275 , \quad \Sigma h ^ { 2 } = 65.7396 , \quad \Sigma v ^ { 2 } = 259853 , \quad \Sigma h v = 4128.03 .$$
  1. Calculate the product moment correlation coefficient for these data.
  2. Comment on the coach's hypothesis.
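Questions that quote \(\Sigma x\), \(\Sigma y\), \(\Sigma x^2\), \(\Sigma y^2\) and \(\Sigma xy\) need one extra step: form \(S_{xx} = \Sigma x^2 - (\Sigma x)^2 / n\), \(S_{yy} = \Sigma y^2 - (\Sigma y)^2 / n\) and \(S_{xy} = \Sigma xy - \Sigma x \, \Sigma y / n\) before dividing. A Python sketch of that route with the tennis figures above; the function and parameter names are illustrative labels, not notation from the question.

```python
from math import sqrt


def pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Return r computed from raw summary sums for n paired observations."""
    sxx = sum_x2 - sum_x ** 2 / n
    syy = sum_y2 - sum_y ** 2 / n
    sxy = sum_xy - sum_x * sum_y / n
    return sxy / sqrt(sxx * syy)


# Height h (m) and fastest serve v (mph) for the 20 players.
r = pmcc_from_sums(
    n=20,
    sum_x=36.22,
    sum_y=2275,
    sum_x2=65.7396,
    sum_y2=259853,
    sum_xy=4128.03,
)
print(f"r = {r:.3f}")
```

Because \(S_{xx}\) is a small difference of two similar numbers here, rounding the intermediate values early in a hand calculation can noticeably change the final coefficient; keeping full precision until the last step avoids that.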
Edexcel S1 Q6
17 marks Moderate -0.8
6. A school introduced a new programme of support lessons in 1994 with a view to improving grades in GCSE English. The table below shows the number of years since 1994, \(n\), and the corresponding percentage of students achieving A to C grades in GCSE English, \(p\), for each year.
| \(n\) | 1 | 2 | 3 | 4 | 5 | 6 |
| \(p ( \% )\) | 35.2 | 37.1 | 40.6 | 39.0 | 43.4 | 44.8 |
  1. Represent these data on a scatter diagram.
  You may use the following values. $$\Sigma n = 21 , \quad \Sigma p = 240.1 , \quad \Sigma n ^ { 2 } = 91 , \quad \Sigma p ^ { 2 } = 9675.41 , \quad \Sigma n p = 873 .$$
  2. Find an equation of the regression line of \(p\) on \(n\) and draw it on your graph.
  3. Calculate the product moment correlation coefficient for these data and comment on the suitability of a linear model for the relationship between \(n\) and \(p\) during this period.
Edexcel S1 Q2
8 marks Easy -1.2
2. A supermarket manager believes that those of her staff on lower rates of pay tend to work more hours of overtime.
  1. Suggest why this might be the case.
  To investigate her theory the manager recorded the number of hours of overtime, \(h\), worked by each of the store's 18 full-time staff during one week. She also recorded each employee's hourly rate of pay, \(\pounds p\), and summarised her results as follows: $$\Sigma p = 86 , \quad \Sigma h = 104.5 , \quad \Sigma p ^ { 2 } = 420.58 , \quad \Sigma h ^ { 2 } = 830.25 , \quad \Sigma p h = 487.3$$
  2. Calculate the product moment correlation coefficient for these data.
  3. Comment on the manager's hypothesis.
Edexcel S1 Q2
10 marks Moderate -0.3
2. A statistics student gave a questionnaire to a random sample of 50 pupils at his school. The sample included pupils aged from 11 to 18 years old. The student summarised the data on age in completed years, \(A\), and the number of hours spent doing homework in the previous week, \(H\), giving the following: $$\Sigma A = 703 , \quad \Sigma H = 217 , \quad \Sigma A ^ { 2 } = 10131 , \quad \Sigma H ^ { 2 } = 1338.5 , \quad \Sigma A H = 3253.5$$
  1. Calculate the product moment correlation coefficient for these data and explain what is shown by your result.
    (6 marks)
    The student also asked each pupil how many hours of paid work they had done in the previous week. He then calculated the product moment correlation coefficient for the data on hours doing homework and hours doing paid work, giving a value of \(r = 0.5213\). The student concluded that paid work did not interfere with homework as pupils doing more paid work also tended to do more homework.
  2. Explain why this conclusion may not be valid.
  3. Explain briefly how the student could more effectively investigate the effect of paid work on homework.
    (2 marks)
Edexcel S1 Q1
7 marks Moderate -0.8
  1. A shop recorded the number of pairs of gloves, \(n\), that it sold and the average daytime temperature, \(T ^ { \circ } \mathrm { C }\), for each month over a 12-month period.
The data was then summarised as follows: $$\Sigma T = 124 , \quad \Sigma n = 384 , \quad \Sigma T ^ { 2 } = 1802 , \quad \Sigma n ^ { 2 } = 18518 , \quad \Sigma T n = 2583 .$$
  1. Calculate the product moment correlation coefficient for these data.
  2. Comment on what your value shows and suggest a reason for this.
AQA S3 2008 June Q1
7 marks Moderate -0.3
1 The best performances of a random sample of 20 junior athletes in the long jump, \(x\) metres, and in the high jump, \(y\) metres, were recorded. The following statistics were calculated from the results. $$S _ { x x } = 7.0036 \quad S _ { y y } = 0.8464 \quad S _ { x y } = 1.3781$$
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(y\).
    (2 marks)
  2. Assuming that these data come from a bivariate normal distribution, investigate, at the \(1 \%\) level of significance, the claim that for junior athletes there is a positive correlation between \(x\) and \(y\).
  3. Interpret your conclusion in the context of this question.
AQA S3 2012 June Q1
6 marks Moderate -0.8
1 A wildlife expert measured the neck lengths, \(x\) metres, and the tail lengths, \(y\) metres, of a sample of 12 mature male giraffes as part of a study into their physical characteristics. The results are shown in the table.
AQA S3 2015 June Q1
6 marks Moderate -0.8
1 A demographer measured the length of the right foot, \(x\) millimetres, and the length of the right hand, \(y\) millimetres, of each of a sample of 12 males aged between 19 years and 25 years. The results are given in the table.
Edexcel S3 Q2
6 marks Standard +0.3
2. A Geography teacher is interested in the link between mathematical ability and the ability to visualise three-dimensional situations. He gives a group of 15 students a test and records each student's score, \(m\), on the mathematics questions and each student's score, \(v\), on the visiospatial questions. He calculates the following summary statistics: $$S _ { m m } = 3747.73 , \quad S _ { v v } = 2791.33 , \quad S _ { m v } = 2564.33$$
  1. Calculate the product moment correlation coefficient for these data.
  2. Stating your hypotheses clearly and using a \(5 \%\) level of significance, test the theory that students who are good at Mathematics tend to have better visio-spatial awareness.
    (4 marks)
OCR MEI Further Statistics A AS 2018 June Q6
9 marks Standard +0.3
6 A researcher is investigating various bodily characteristics of frogs of various species. She collects data on length, \(x \mathrm {~mm}\), and head width, \(y \mathrm {~mm}\), of a random sample of 14 frogs of a particular species. A scatter diagram of the data is shown in Fig. 6, together with the equation of the regression line of \(y\) on \(x\) and also the value of \(r ^ { 2 }\).
[Fig. 6: scatter diagram; image not reproduced.]
  1. (A) Use the equation of the regression line to estimate the mean head width for frogs of each of the following lengths.
OCR MEI Further Statistics A AS 2019 June Q4
8 marks Moderate -0.3
4 A student is investigating correlations between various personality traits, two of which are conscientiousness and openness to new experiences.
She selects a random sample of 10 students at her university and uses standard tests to measure their conscientiousness and their openness. The product moment correlation coefficient between these two variables for the 10 students is 0.476.
  1. Assuming that the underlying population has a bivariate Normal distribution, carry out a hypothesis test at the \(10 \%\) significance level to investigate whether there is any correlation between openness and conscientiousness in students.
    Table 4.1 below shows the values of the product moment correlation coefficients between 5 different personality traits for a much larger sample of students. Those correlations that are significant at the \(5 \%\) level are denoted by a * after the value of the correlation.
    |  | Neuroticism | Extroversion | Openness | Agreeableness | Conscientiousness |
    | Neuroticism | 1 |  |  |  |  |
    | Extroversion | -0.296* | 1 |  |  |  |
    | Openness | -0.044 | 0.405* | 1 |  |  |
    | Agreeableness | -0.190* | 0.061 | 0.042 | 1 |  |
    | Conscientiousness | -0.485* | 0.145 | 0.235* | 0.112 | 1 |
    Table 4.1
    The student analyses these factors for effect size.
    Guidelines often used when considering effect size are given in Table 4.2 below.
    | Product moment correlation coefficient | Effect size |
    | 0.1 | Small |
    | 0.3 | Medium |
    | 0.5 | Large |
    Table 4.2
  2. The student notes that, despite the result of the test in part (a), the correlation between openness and conscientiousness is significant at the \(5 \%\) level with this second sample. Comment briefly on why this may be the case.
  3. The student intends to summarise her findings about relationships between these factors, including effect sizes, in a report.
    Use the information in Tables 4.1 and 4.2 to identify two summary points the student could make.
OCR MEI Further Statistics A AS 2023 June Q5
10 marks Standard +0.3
5 Two practice GCSE examinations in mathematics are given to all of the students in a large year group. A teacher wants to check whether there is a positive relationship between the marks obtained by the students in the two examinations. She selects a random sample of 20 students. Summary data for the marks obtained in the first and second practice examinations, \(x\) and \(y\) respectively, are as follows. $$\sum x = 565 \quad \sum y = 724 \quad \sum x ^ { 2 } = 17103 \quad \sum y ^ { 2 } = 29286 \quad \sum x y = 21635$$ The teacher decides to carry out a hypothesis test based on Pearson's product moment correlation coefficient.
  1. In this question you must show detailed reasoning. Calculate the value of Pearson's product moment correlation coefficient.
  2. Carry out the test at the \(5 \%\) significance level.
  3. Given that the teacher did not draw a scatter diagram before carrying out the test, comment on the validity of the test.