Interpret correlation coefficient value

A question is this type if and only if it asks to interpret the meaning or context of a given or calculated correlation coefficient value.

5 questions

AQA S1 2011 June Q7
  1. Three airport management trainees, Ryan, Sunil and Tim, were each instructed to select a random sample of 12 suitcases from those waiting to be loaded onto aircraft. Each trainee also had to measure the volume, \(x\), and the weight, \(y\), of each of the 12 suitcases in his sample, and then calculate the value of the product moment correlation coefficient, \(r\), between \(x\) and \(y\).
    • Ryan obtained a value of -0.843.
    • Sunil obtained a value of +0.007.
    Explain why neither of these two values is likely to be correct.
  2. Peggy, a supervisor with many years' experience, measured the volume, \(x\) cubic feet, and the weight, \(y\) pounds, of each suitcase in a random sample of 6 suitcases, and then obtained a value of 0.612 for \(r\).
    • Ryan and Sunil each claimed that Peggy's value was different from their values because she had measured the volumes in cubic feet and the weights in pounds, whereas they had measured the volumes in cubic metres and the weights in kilograms.
    • Tim claimed that Peggy's value was almost exactly half his calculated value because she had used a sample of size 6 whereas he had used one of size 12.
    Explain why neither of these two claims is valid.
  3. Quentin, a manager, recorded the volumes, \(v\), and the weights, \(w\), of a random sample of 8 suitcases as follows.
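    | \(\boldsymbol { v }\) | 28.1 | 19.7 | 46.4 | 23.6 | 31.1 | 17.5 | 35.8 | 13.8 |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | \(\boldsymbol { w }\) | 14.9 | 12.1 | 21.1 | 18.0 | 19.8 | 19.2 | 16.2 | 14.7 |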
    1. Calculate the value of \(r\) between \(v\) and \(w\).
    2. Interpret your value in the context of this question.
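As an illustrative check on part 3 of the suitcase question, the sketch below computes the product moment correlation coefficient directly from Quentin's eight pairs using \(r = S_{vw} / \sqrt{S_{vv} S_{ww}}\). The function name `pmcc` is a hypothetical helper, not something taken from the paper.

```python
from math import sqrt

# Quentin's sample of 8 suitcases, copied from the question.
v = [28.1, 19.7, 46.4, 23.6, 31.1, 17.5, 35.8, 13.8]  # volumes
w = [14.9, 12.1, 21.1, 18.0, 19.8, 19.2, 16.2, 14.7]  # weights


def pmcc(xs, ys):
    """Product moment correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return s_xy / sqrt(s_xx * s_yy)


r = pmcc(v, w)
print(f"r = {r:.3f}")
```

A value of this size would usually be described as moderate positive correlation: in this sample, suitcases with larger volumes tend to be heavier, but the relationship is far from exact.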
SPS SPS ASFM Statistics 2025 January Q1
  1. \(\mathrm { E } ( a X + b Y + c ) = a \mathrm { E } ( X ) + b \mathrm { E } ( Y ) + c\),
  2. if \(X\) and \(Y\) are independent then \(\operatorname { Var } ( a X + b Y + c ) = a ^ { 2 } \operatorname { Var } ( X ) + b ^ { 2 } \operatorname { Var } ( Y )\).
\section*{Non-parametric tests}
Goodness-of-fit test and contingency tables: \(\sum \frac { \left( O _ { i } - E _ { i } \right) ^ { 2 } } { E _ { i } } \sim \chi _ { \nu } ^ { 2 }\)
Approximate distributions for large samples
Wilcoxon Signed Rank test: \(T \sim \mathrm {~N} \left( \frac { 1 } { 4 } n ( n + 1 ) , \frac { 1 } { 24 } n ( n + 1 ) ( 2 n + 1 ) \right)\)
Wilcoxon Rank Sum test (samples of sizes \(m\) and \(n\), with \(m \leq n\)):
$$W \sim \mathrm {~N} \left( \frac { 1 } { 2 } m ( m + n + 1 ) , \frac { 1 } { 12 } m n ( m + n + 1 ) \right)$$
\section*{Discrete distributions}
\(X\) is a random variable taking values \(x _ { i }\) in a discrete distribution with \(\mathrm { P } \left( X = x _ { i } \right) = p _ { i }\)
Expectation: \(\mu = \mathrm { E } ( X ) = \sum x _ { i } p _ { i }\)
Variance: \(\sigma ^ { 2 } = \operatorname { Var } ( X ) = \sum \left( x _ { i } - \mu \right) ^ { 2 } p _ { i } = \sum x _ { i } ^ { 2 } p _ { i } - \mu ^ { 2 }\)
\(n = 8 \quad \sum p = 28.5 \quad \sum q = 26.7 \quad \sum p ^ { 2 } = 136.35 \quad \sum q ^ { 2 } = 116.35 \quad \sum p q = 116.70\)
\includegraphics[max width=\textwidth, alt={}, center]{76f751ed-394d-41cb-b98f-bc8efcf3365e-08_705_1164_1139_267}
  1. State which, if either, of the variables \(p\) and \(q\) is independent.
  2. Calculate the equation of the regression line of \(q\) on \(p\).
    1. Use the regression line to estimate the value of \(q\) for an investment account for which \(p = 2.5\).
    2. Give two reasons why this estimate could be considered reliable.
  3. Comment on the reliability of using the regression line to predict the value of \(q\) when \(p = 7.0\).
Total: \(\_\_\_\_\) / 9 marks
\section*{Question 4}
After a holiday organised for a group, the company organising the holiday obtained scores out of 10 for six different aspects of the holiday. The company obtained responses from 100 couples and 100 single travellers. The total scores for each of the aspects are given in the following table.
After further investigation, the statistician decides to use a different model for the distribution of \(F\). In this model it is now assumed that \(\mathrm { P } ( F = 0 )\) is still 0.200, but that if one failure occurs, there is an increased probability that further failures occur.
  4. Explain the effect of this assumption on the value of \(\mathrm { P } ( F = 1 )\).
Total: \(\_\_\_\_\) / 10 marks
\section*{Question 6}
In a fashion competition, two judges gave marks to a large number of contestants.
    The value of Spearman's rank correlation coefficient, \(r _ { s }\), between the marks given to 7 randomly chosen contestants is \(\frac { 27 } { 28 }\).
  5. An excerpt from the table of critical values of \(r _ { s }\) is shown below.
\section*{Critical values of Spearman's rank correlation coefficient}
    | 1-tail test | 5\% | 2.5\% | 1\% | 0.5\% |
    | --- | --- | --- | --- | --- |
    | 2-tail test | 10\% | 5\% | 2\% | 1\% |
    | \(n = 6\) | 0.8286 | 0.8857 | 0.9429 | 1.0000 |
    | \(n = 7\) | 0.7143 | 0.7857 | 0.8929 | 0.9286 |
    | \(n = 8\) | 0.6429 | 0.7381 | 0.8333 | 0.8810 |
    Test whether there is evidence, at the \(1 \%\) significance level, that the judges agree with each other.
    The marks given by the two judges to the 7 randomly chosen contestants were as follows, where \(x\) is an integer.
    | Contestant | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | Judge 1 | 64 | 65 | 67 | 78 | 79 | 80 | 86 |
    | Judge 2 | 61 | 63 | 78 | 80 | 81 | 90 | \(x\) |
  6. Use the value \(r _ { s } = \frac { 27 } { 28 }\) to determine the range of possible values of \(x\).
  7. Give a reason why it might be preferable to use the product moment correlation coefficient rather than Spearman's rank correlation coefficient in this context.
Total: \(\_\_\_\_\) / 9 marks
\section*{Question 7}
A bag contains \(2 m\) yellow and \(m\) green counters. Three counters are chosen at random, without replacement. The probability that exactly two of the three counters are yellow is \(\frac { 28 } { 55 }\). Determine the value of \(m\).
Total: \(\_\_\_\_\)
End of Paper
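For the Spearman's rank question on the two judges (Question 6 above), one way to sanity-check the range of \(x\) is a brute-force search. The sketch below assumes, purely for illustration, that marks are integers from 0 to 100 and that \(x\) does not tie with another of Judge 2's marks; neither assumption is stated in the paper. The helper names `ranks` and `spearman` are hypothetical.

```python
from fractions import Fraction

judge1 = [64, 65, 67, 78, 79, 80, 86]       # contestants A..G
judge2_known = [61, 63, 78, 80, 81, 90]     # contestants A..F; G's mark is x


def ranks(values):
    """Rank 1 = smallest value; assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """r_s = 1 - 6 * sum(d^2) / (n (n^2 - 1)), kept exact with Fraction."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - Fraction(6 * d2, n * (n * n - 1))


target = Fraction(27, 28)

# Assumed search range 0..100; tied values of x are skipped.
possible_x = [
    x for x in range(0, 101)
    if x not in judge2_known
    and spearman(judge1, judge2_known + [x]) == target
]
print(possible_x)

# One-tailed 1% critical value for n = 7, read from the excerpt in the question.
critical_value = Fraction(8929, 10000)
print(target > critical_value)
```

With \(n = 7\), \(r_s = \frac{27}{28}\) corresponds to \(\sum d^2 = 2\), which can only arise from swapping two adjacent ranks; the search simply confirms which values of \(x\) produce that swap.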
Edexcel S1 Q7
7. In a school there are 148 students in Years 12 and 13 studying Science, Humanities or Arts subjects. Of these students, 89 wear glasses and the others do not. There are 30 Science students of whom 18 wear glasses. The corresponding figures for the Humanities students are 68 and 44 respectively. A student is chosen at random. Find the probability that this student
  1. is studying Arts subjects,
  2. does not wear glasses, given that the student is studying Arts subjects. Amongst the Science students, \(80 \%\) are right-handed. Corresponding percentages for Humanities and Arts students are 75\% and 70\% respectively. A student is again chosen at random.
  3. Find the probability that this student is right-handed.
  4. Given that this student is right-handed, find the probability that the student is studying Science subjects.
    1. (a) Describe the main features and uses of a box plot.
    Children from schools \(A\) and \(B\) took part in a fun run for charity. The times, to the nearest minute, taken by the children from school \(A\) are summarised in Figure 1. \begin{figure}[h]
    \captionsetup{labelformat=empty} \caption{Figure 1} \includegraphics[alt={},max width=\textwidth]{3d4f7bfb-b235-418a-9411-a4d0b3188254-015_398_1045_946_461}
    \end{figure}
    1. Write down the time by which \(75 \%\) of the children in school \(A\) had completed the run.
    2. State the name given to this value.
  5. Explain what you understand by the two crosses ( X ) on Figure 1.
    For school \(B\) the least time taken by any of the children was 25 minutes and the longest time was 55 minutes. The three quartiles were 30,37 and 50 respectively.
  6. Draw a box plot to represent the data from school \(B\).
    \includegraphics[max width=\textwidth, alt={}, center]{3d4f7bfb-b235-418a-9411-a4d0b3188254-016_798_1196_580_372}
  7. Compare and contrast these two box plots.
    2. Sunita and Shelley talk to one another once a week on the telephone. Over many weeks they recorded, to the nearest minute, the number of minutes spent in conversation on each occasion. The following table summarises their results.
    1. As part of a statistics project, Gill collected data relating to the length of time, to the nearest minute, spent by shoppers in a supermarket and the amount of money they spent. Her data for a random sample of 10 shoppers are summarised in the table below, where \(t\) represents time and \(\pounds m\) the amount spent over \(\pounds 20\).
    1. A young family were looking for a new 3 bedroom semi-detached house. A local survey recorded the price \(x\), in \(\pounds 1000\), and the distance \(y\), in miles, from the station of such houses. The following summary statistics were provided
    $$S _ { x x } = 113573 , \quad S _ { y y } = 8.657 , \quad S _ { x y } = - 808.917$$
  8. Use these values to calculate the product moment correlation coefficient.
  9. Give an interpretation of your answer to part (a). Another family asked for the distances to be measured in km rather than miles.
  10. State the value of the product moment correlation coefficient in this case.
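For the house price question above, the sketch below evaluates \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\) from the given summary statistics and then repeats the calculation with distances converted to kilometres. The conversion factor and the helper name `r_from_summary` are assumptions introduced for illustration.

```python
from math import sqrt

# Summary statistics from the question (x: price in £1000, y: distance in miles).
s_xx, s_yy, s_xy = 113573, 8.657, -808.917


def r_from_summary(s_xx, s_yy, s_xy):
    """Product moment correlation coefficient from S_xx, S_yy and S_xy."""
    return s_xy / sqrt(s_xx * s_yy)


r_miles = r_from_summary(s_xx, s_yy, s_xy)

# Rescaling y by a positive constant k multiplies S_yy by k**2 and S_xy by k,
# so the factor k cancels in the ratio.
k = 1.609344  # kilometres per mile
r_km = r_from_summary(s_xx, s_yy * k**2, s_xy * k)

print(f"r (miles) = {r_miles:.3f}")
print(f"r (km)    = {r_km:.3f}")
```

The two printed values agree, which is the point of the final part: the coefficient is unaffected by a change of units. The negative sign indicates that, in this survey, more expensive houses tend to be closer to the station.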
    2. The box plot in Figure 1 shows a summary of the weights of the luggage, in kg, for each musician in an orchestra on an overseas tour. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{3d4f7bfb-b235-418a-9411-a4d0b3188254-045_346_1452_324_228} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure} The airline's recommended weight limit for each musician's luggage was 45 kg . Given that none of the musicians' luggage weighed exactly 45 kg ,
  11. state the proportion of the musicians whose luggage was below the recommended weight limit. A quarter of the musicians had to pay a charge for taking heavy luggage.
  12. State the smallest weight for which the charge was made.
  13. Explain what you understand by the + on the box plot in Figure 1, and suggest an instrument that the owner of this luggage might play.
  14. Describe the skewness of this distribution. Give a reason for your answer. One musician of the orchestra suggests that the weights of luggage, in kg, can be modelled by a normal distribution with quartiles as given in Figure 1.
  15. Find the standard deviation of this normal distribution.
    3. A student is investigating the relationship between the price ( \(y\) pence) of 100 g of chocolate and the percentage ( \(x \%\) ) of cocoa solids in the chocolate.
    The following data is obtained
    advancing learning, changing lives
    1. A personnel manager wants to find out if a test carried out during an employee's interview and a skills assessment at the end of basic training is a guide to performance after working for the company for one year.
    The table below shows the results of the interview test of 10 employees and their performance after one year.
    1. A disease is known to be present in \(2 \%\) of a population. A test is developed to help determine whether or not someone has the disease.
    Given that a person has the disease, the test is positive with probability 0.95
    Given that a person does not have the disease, the test is positive with probability 0.03
  16. Draw a tree diagram to represent this information. A person is selected at random from the population and tested for this disease.
  17. Find the probability that the test is positive. A doctor randomly selects a person from the population and tests him for the disease. Given that the test is positive,
  18. find the probability that he does not have the disease.
  19. Comment on the usefulness of this test.
    2. The ages, in years, of the residents of two hotels, Abbey Hotel and Balmoral Hotel, are shown in the back to back stem and leaf diagram below, where \(8 | 5 | 0\) means 58 years in Abbey Hotel and 50 years in Balmoral Hotel.
    1. A teacher is monitoring the progress of students using a computer based revision course. The improvement in performance, \(y\) marks, is recorded for each student along with the time, \(x\) hours, that the student spent using the revision course. The results for a random sample of 10 students are recorded below.
    1. The volume of a sample of gas is kept constant. The gas is heated and the pressure, \(p\), is measured at 10 different temperatures, \(t\). The results are summarised below.
      \(\sum p = 445 \quad \sum p ^ { 2 } = 38125 \quad \sum t = 240 \quad \sum t ^ { 2 } = 27520 \quad \sum p t = 26830\)
    2. Find \(\mathrm { S } _ { p p }\) and \(\mathrm { S } _ { p t }\).
    Given that \(\mathrm { S } _ { t t } = 21760\),
  20. calculate the product moment correlation coefficient.
  21. Give an interpretation of your answer to part (b).
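The gas pressure question above works from raw sums rather than ready-made \(S\) values. A minimal sketch of that route, using \(S_{pp} = \sum p^2 - (\sum p)^2 / n\) and the analogous formulas, is given below; the variable names are illustrative only.

```python
from math import sqrt

# Sums given in the question for n = 10 temperature/pressure readings.
n = 10
sum_p, sum_p2 = 445, 38125
sum_t, sum_t2 = 240, 27520
sum_pt = 26830

# S values from the standard summary-statistic formulas.
s_pp = sum_p2 - sum_p**2 / n
s_tt = sum_t2 - sum_t**2 / n
s_pt = sum_pt - sum_p * sum_t / n

r = s_pt / sqrt(s_pp * s_tt)
print(s_pp, s_pt, s_tt)
print(f"r = {r:.3f}")
```

The computed `s_tt` matches the value \(\mathrm{S}_{tt} = 21760\) quoted in the question, which is a useful check that the sums have been read correctly.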
    2. On a randomly chosen day the probability that Bill travels to school by car, by bicycle or on foot is \(\frac { 1 } { 2 } , \frac { 1 } { 6 }\) and \(\frac { 1 } { 3 }\) respectively. The probability of being late when using these methods of travel is \(\frac { 1 } { 5 } , \frac { 2 } { 5 }\) and \(\frac { 1 } { 10 }\) respectively.
  22. Draw a tree diagram to represent this information.
  23. Find the probability that on a randomly chosen day
    1. Bill travels by foot and is late,
    2. Bill is not late.
  24. Given that Bill is late, find the probability that he did not travel on foot.
    3. The variable \(x\) was measured to the nearest whole number. Forty observations are given in the table below.
    | \(x\) | \(10 - 15\) | \(16 - 18\) | \(19 -\) |
    | --- | --- | --- | --- |
    | Frequency | 15 | 9 | 16 |
    A histogram was drawn and the bar representing the \(10 - 15\) class has a width of 2 cm and a height of 5 cm . For the \(16 - 18\) class find
  25. the width,
  26. the height
    of the bar representing this class.
    4. A researcher measured the foot lengths of a random sample of 120 ten-year-old children. The lengths are summarised in the table below.
    | Foot length, \(l\), (cm) | Number of children |
    | --- | --- |
    | \(10 \leqslant l < 12\) | 5 |
    | \(12 \leqslant l < 17\) | 53 |
    | \(17 \leqslant l < 19\) | 29 |
    | \(19 \leqslant l < 21\) | 15 |
    | \(21 \leqslant l < 23\) | 11 |
    | \(23 \leqslant l < 25\) | 7 |
  27. Use interpolation to estimate the median of this distribution.
  28. Calculate estimates for the mean and the standard deviation of these data. One measure of skewness is given by $$\text { Coefficient of skewness } = \frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }$$
  29. Evaluate this coefficient and comment on the skewness of these data. Greg suggests that a normal distribution is a suitable model for the foot lengths of ten-year-old children.
  30. Using the value found in part (c), comment on Greg's suggestion, giving a reason for your answer.
    5. The weight, \(w\) grams, and the length, \(l \mathrm {~mm}\), of 10 randomly selected newborn turtles are given in the table below.
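    | \(l\) | 49.0 | 52.0 | 53.0 | 54.5 | 54.1 | 53.4 | 50.0 | 51.6 | 49.5 | 51.2 |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | \(w\) | 29 | 32 | 34 | 39 | 38 | 35 | 30 | 31 | 29 | 30 |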
    $$\text { (You may use } \mathrm { S } _ { l l } = 33.381 \quad \mathrm {~S} _ { w l } = 59.99 \quad \mathrm {~S} _ { w w } = 120.1 \text { ) }$$
  31. Find the equation of the regression line of \(w\) on \(l\) in the form \(w = a + b l\).
  32. Use your regression line to estimate the weight of a newborn turtle of length 60 mm .
  33. Comment on the reliability of your estimate giving a reason for your answer.
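For the newborn turtle question above, the sketch below fits the regression line of \(w\) on \(l\) from the raw data using \(b = S_{wl} / S_{ll}\) and \(a = \bar{w} - b\bar{l}\), then flags whether a prediction at 60 mm lies inside the observed range of lengths. The function name `predict` and the flag `in_range` are illustrative, not part of the question.

```python
# Data for the 10 newborn turtles, copied from the question.
lengths = [49.0, 52.0, 53.0, 54.5, 54.1, 53.4, 50.0, 51.6, 49.5, 51.2]  # l, mm
weights = [29, 32, 34, 39, 38, 35, 30, 31, 29, 30]                      # w, grams

n = len(lengths)
mean_l = sum(lengths) / n
mean_w = sum(weights) / n

s_ll = sum((l - mean_l) ** 2 for l in lengths)
s_wl = sum((l - mean_l) * (w - mean_w) for l, w in zip(lengths, weights))

# Least-squares regression line of w on l: w = a + b * l.
b = s_wl / s_ll
a = mean_w - b * mean_l


def predict(length_mm):
    """Estimated weight in grams for a turtle of the given length."""
    return a + b * length_mm


estimate = predict(60)
in_range = min(lengths) <= 60 <= max(lengths)

print(f"w = {a:.2f} + {b:.3f} l")
print(f"estimate at 60 mm: {estimate:.1f} g, inside data range: {in_range}")
```

The computed `s_ll` and `s_wl` agree with the values the question allows candidates to use. Because 60 mm lies outside the observed lengths, the estimate is an extrapolation, which is the usual reason given for treating it as unreliable.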
    6. The discrete random variable \(X\) has probability function $$\mathrm { P } ( X = x ) = \left\{ \begin{array} { c l } a ( 3 - x ) & x = 0,1,2 \\ b & x = 3 \end{array} \right.$$
  34. Find \(\mathrm { P } ( X = 2 )\) and complete the table below.
    | \(x\) | 0 | 1 | 2 | 3 |
    | --- | --- | --- | --- | --- |
    | \(\mathrm { P } ( X = x )\) | \(3 a\) | \(2 a\) | | \(b\) |
    Given that \(\mathrm { E } ( X ) = 1.6\)
  35. Find the value of \(a\) and the value of \(b\). Find
  36. \(\mathrm { P } ( 0.5 < X < 3 )\),
  37. \(\mathrm { E } ( 3 X - 2 )\).
  38. Show that the \(\operatorname { Var } ( X ) = 1.64\)
  39. Calculate \(\operatorname { Var } ( 3 X - 2 )\).
    7. (a) Given that \(\mathrm { P } ( A ) = a\) and \(\mathrm { P } ( B ) = b\) express \(\mathrm { P } ( A \cup B )\) in terms of \(a\) and \(b\) when
    1. \(A\) and \(B\) are mutually exclusive,
    2. \(A\) and \(B\) are independent. Two events \(R\) and \(Q\) are such that
      \(\mathrm { P } \left( R \cap Q ^ { \prime } \right) = 0.15 , \quad \mathrm { P } ( Q ) = 0.35\) and \(\mathrm { P } ( R \mid Q ) = 0.1\)
      Find the value of
  40. \(\mathrm { P } ( R \cup Q )\),
  41. \(\mathrm { P } ( R \cap Q )\),
  42. \(\mathrm { P } ( R )\).
Edexcel S1 Q8
8. The lifetimes of bulbs used in a lamp are normally distributed. A company \(X\) sells bulbs with a mean lifetime of 850 hours and a standard deviation of 50 hours.
  1. Find the probability of a bulb, from company \(X\), having a lifetime of less than 830 hours.
  2. In a box of 500 bulbs, from company \(X\), find the expected number having a lifetime of less than 830 hours. A rival company \(Y\) sells bulbs with a mean lifetime of 860 hours and \(20 \%\) of these bulbs have a lifetime of less than 818 hours.
  3. Find the standard deviation of the lifetimes of bulbs from company \(Y\). Both companies sell the bulbs for the same price.
  4. State which company you would recommend. Give reasons for your answer.
    Paper Reference: 6683/01 (Edexcel GCE)
    1. Gary compared the total attendance, \(x\), at home matches and the total number of goals, \(y\), scored at home during a season for each of 12 football teams playing in a league. He correctly calculated:
    $$S _ { x x } = 1022500 \quad S _ { y y } = 130.9 \quad S _ { x y } = 8825$$
  5. Calculate the product moment correlation coefficient for these data.
  6. Interpret the value of the correlation coefficient. Helen was given the same data to analyse. In view of the large numbers involved she decided to divide the attendance figures by 100 . She then calculated the product moment correlation coefficient between \(\frac { x } { 100 }\) and \(y\).
  7. Write down the value Helen should have obtained.
    2. An experiment consists of selecting a ball from a bag and spinning a coin. The bag contains 5 red balls and 7 blue balls. A ball is selected at random from the bag, its colour is noted and then the ball is returned to the bag. When a red ball is selected, a biased coin with probability \(\frac { 2 } { 3 }\) of landing heads is spun.
    When a blue ball is selected a fair coin is spun.
  8. Complete the tree diagram below to show the possible outcomes and associated probabilities.
    \includegraphics[max width=\textwidth, alt={}, center]{3d4f7bfb-b235-418a-9411-a4d0b3188254-129_787_395_734_548} \section*{Coin}
    \includegraphics[max width=\textwidth, alt={}]{3d4f7bfb-b235-418a-9411-a4d0b3188254-129_1007_488_808_950}
    Shivani selects a ball and spins the appropriate coin.
  9. Find the probability that she obtains a head. Given that Tom selected a ball at random and obtained a head when he spun the appropriate coin,
  10. find the probability that Tom selected a red ball. Shivani and Tom each repeat this experiment.
  11. Find the probability that the colour of the ball Shivani selects is the same as the colour of the ball Tom selects.
    3. The discrete random variable \(X\) has probability distribution given by
    1. A random sample of 50 salmon was caught by a scientist. He recorded the length \(l \mathrm {~cm}\) and weight \(w \mathrm {~kg}\) of each salmon.
    The following summary statistics were calculated from these data.
    \(\sum l = 4027 \quad \sum l ^ { 2 } = 327754.5 \quad \sum w = 357.1 \quad \sum l w = 29330.5 \quad S _ { w w } = 289.6\)
  12. Find \(S _ { l l }\) and \(S _ { l w }\)
  13. Calculate, to 3 significant figures, the product moment correlation coefficient between \(l\) and \(w\).
  14. Give an interpretation of your coefficient.
    1. Keith records the amount of rainfall, in mm , at his school, each day for a week. The results are given below.
      0.0, 0.5, 1.8, 2.8, 2.3, 5.6, 9.4
    Jenny then records the amount of rainfall, \(x \mathrm {~mm}\), at the school each day for the following 21 days. The results for the 21 days are summarised below. $$\sum x = 84.6$$
  15. Calculate the mean amount of rainfall during the whole 28 days. Keith realises that he has transposed two of his figures. The number 9.4 should have been 4.9 and the number 0.5 should have been 5.0. Keith corrects these figures.
  16. State, giving your reason, the effect this will have on the mean.
    3. Over a long period of time a small company recorded the amount it received in sales per month. The results are summarised below.
    1. On a particular day the height above sea level, \(x\) metres, and the mid-day temperature, \(y ^ { \circ } \mathrm { C }\), were recorded in 8 north European towns. These data are summarised below
    $$\mathrm { S } _ { x x } = 3535237.5 \quad \sum y = 181 \quad \sum y ^ { 2 } = 4305 \quad \mathrm {~S} _ { x y } = - 23726.25$$
  17. Find \(\mathrm { S } _ { y y }\)
  18. Calculate, to 3 significant figures, the product moment correlation coefficient for these data.
  19. Give an interpretation of your coefficient. A student thought that the calculations would be simpler if the height above sea level, \(h\), was measured in kilometres and used the variable \(h = \frac { x } { 1000 }\) instead of \(x\).
  20. Write down the value of \(\mathrm { S } _ { h h }\)
  21. Write down the value of the correlation coefficient between \(h\) and \(y\).
    1. The random variable \(X \sim \mathrm {~N} \left( \mu , 5 ^ { 2 } \right)\) and \(\mathrm { P } ( X < 23 ) = 0.9192\)
    2. Find the value of \(\mu\).
    3. Write down the value of \(\mathrm { P } ( \mu < X < 23 )\).
    4. The discrete random variable \(Y\) has probability distribution
    1. The histogram in Figure 1 shows the time, to the nearest minute, that a random sample of 100 motorists were delayed by roadworks on a stretch of motorway.
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{3d4f7bfb-b235-418a-9411-a4d0b3188254-171_1312_673_349_639} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure}
  22. Complete the table.
    1. A discrete random variable \(X\) has the probability function
    $$\mathrm { P } ( X = x ) = \begin{cases} k ( 1 - x ) ^ { 2 } & x = - 1,0,1 \text { and } 2 \\ 0 & \text { otherwise } \end{cases}$$
  23. Show that \(k = \frac { 1 } { 6 }\)
  24. Find \(\mathrm { E } ( X )\)
  25. Show that \(\mathrm { E } \left( X ^ { 2 } \right) = \frac { 4 } { 3 }\)
  26. Find \(\operatorname { Var } ( 1 - 3 X )\)
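The results requested for the probability function \(k(1 - x)^2\) above can be checked with exact arithmetic. The sketch below is one possible check, using Python's `fractions` module and the rule \(\operatorname{Var}(a + bX) = b^2 \operatorname{Var}(X)\); the variable names are illustrative.

```python
from fractions import Fraction

values = [-1, 0, 1, 2]
weights = [(1 - x) ** 2 for x in values]   # unnormalised probabilities

# The probabilities must sum to 1, which fixes k.
k = Fraction(1, sum(weights))
pmf = {x: k * w for x, w in zip(values, weights)}

e_x = sum(x * p for x, p in pmf.items())
e_x2 = sum(x * x * p for x, p in pmf.items())
var_x = e_x2 - e_x**2

# Var(1 - 3X) = (-3)^2 * Var(X); the constant 1 has no effect.
var_1_minus_3x = (-3) ** 2 * var_x

print(k, e_x, e_x2, var_x, var_1_minus_3x)
```

Keeping everything as fractions avoids rounding, so the printed values can be compared directly with the "show that" targets in the question.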
    2. A bank reviews its customer records at the end of each month to find out how many customers have become unemployed, \(u\), and how many have had their house repossessed, \(h\), during that month. The bank codes the data using variables \(x = \frac { u - 100 } { 3 }\) and \(y = \frac { h - 20 } { 7 }\) The results for the 12 months of 2009 are summarised below. $$\sum x = 477 \quad S _ { x x } = 5606.25 \quad \sum y = 480 \quad S _ { y y } = 4244 \quad \sum x y = 23070$$
  27. Calculate the value of the product moment correlation coefficient for \(x\) and \(y\).
  28. Write down the product moment correlation coefficient for \(u\) and \(h\). The bank claims that an increase in unemployment among its customers is associated with an increase in house repossessions.
  29. State, with a reason, whether or not the bank's claim is supported by these data.
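For the bank's coded data above, the sketch below computes \(r\) for \(x\) and \(y\) and then undoes the coding to confirm that the coefficient for \(u\) and \(h\) is the same. It relies on \(u = 3x + 100\) and \(h = 7y + 20\), which follow from the coding in the question; the variable names are illustrative.

```python
from math import sqrt

# Coded summary statistics for the 12 months of 2009.
n = 12
sum_x, sum_y, sum_xy = 477, 480, 23070
s_xx, s_yy = 5606.25, 4244

s_xy = sum_xy - sum_x * sum_y / n
r_xy = s_xy / sqrt(s_xx * s_yy)

# Undo the coding: u = 3x + 100 and h = 7y + 20.
# Adding a constant leaves each S value unchanged; scaling multiplies it.
s_uu = 3**2 * s_xx
s_hh = 7**2 * s_yy
s_uh = 3 * 7 * s_xy
r_uh = s_uh / sqrt(s_uu * s_hh)

print(f"r for x and y = {r_xy:.3f}")
print(f"r for u and h = {r_uh:.3f}")
```

Since both scale factors are positive, the coefficient is unchanged by the coding. A value this close to 1 is consistent with the bank's claim of a positive association, though correlation alone does not show that one change causes the other.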
    3. A scientist is researching whether or not birds of prey exposed to pollutants lay eggs with thinner shells. He collects a random sample of egg shells from each of 6 different nests and tests for pollutant level, \(p\), and measures the thinning of the shell, \(t\). The results are shown in the table below.
    1. A teacher asked a random sample of 10 students to record the number of hours of television, \(t\), they watched in the week before their mock exam. She then calculated their grade, \(g\), in their mock exam. The results are summarised as follows.
    $$\sum t = 258 \quad \sum t ^ { 2 } = 8702 \quad \sum g = 63.6 \quad \mathrm {~S} _ { g g } = 7.864 \quad \sum g t = 1550.2$$
  30. Find \(\mathrm { S } _ { t t }\) and \(\mathrm { S } _ { g t }\)
  31. Calculate, to 3 significant figures, the product moment correlation coefficient between \(t\) and \(g\). The teacher also recorded the number of hours of revision, \(v\), these 10 students completed during the week before their mock exam. The correlation coefficient between \(t\) and \(v\) was -0.753
  32. Describe, giving a reason, the nature of the correlation you would expect to find between \(v\) and \(g\).
    2. The discrete random variable \(X\) can take only the values 1,2 and 3 . For these values the cumulative distribution function is defined by $$\mathrm { F } ( x ) = \frac { x ^ { 3 } + k } { 40 } \quad x = 1,2,3$$
  33. Show that \(k = 13\)
  34. Find the probability distribution of \(X\). Given that \(\operatorname { Var } ( X ) = \frac { 259 } { 320 }\)
  35. find the exact value of \(\operatorname { Var } ( 4 X - 5 )\).
    3. A biologist is comparing the intervals ( \(m\) seconds) between the mating calls of a certain species of tree frog and the surrounding temperature ( \(t { } ^ { \circ } \mathrm { C }\) ). The following results were obtained.
    1. Sammy is studying the number of units of gas, \(g\), and the number of units of electricity, \(e\), used in her house each week. A random sample of 10 weeks use was recorded and the data for each week were coded so that \(x = \frac { g - 60 } { 4 }\) and \(y = \frac { e } { 10 }\). The results for the coded data are summarised below
    $$\sum x = 48.0 \quad \sum y = 58.0 \quad \mathrm {~S} _ { x x } = 312.1 \quad \mathrm {~S} _ { y y } = 2.10 \quad \mathrm {~S} _ { x y } = 18.35$$
  36. Find the equation of the regression line of \(y\) on \(x\) in the form \(y = a + b x\). Give the values of \(a\) and \(b\) correct to 3 significant figures.
  37. Hence find the equation of the regression line of \(e\) on \(g\) in the form \(e = c + d g\). Give the values of \(c\) and \(d\) correct to 2 significant figures.
  38. Use your regression equation to estimate the number of units of electricity used in a week when 100 units of gas were used.
    (a) Find the probability distribution of \(X\).
    (b) Write down the value of \(\mathrm { F } ( 1.8 )\).
    2. The discrete random variable \(X\) takes the values 1, 2 and 3 and has cumulative distribution function \(\mathrm { F } ( x )\) given by
    1. A meteorologist believes that there is a relationship between the height above sea level, \(h \mathrm {~m}\), and the air temperature, \(t ^ { \circ } \mathrm { C }\). Data is collected at the same time from 9 different places on the same mountain. The data is summarised in the table below.
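    | \(h\) | 1400 | 1100 | 260 | 840 | 900 | 550 | 1230 | 100 | 770 |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | \(t\) | 3 | 10 | 20 | 9 | 10 | 13 | 5 | 24 | 16 |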
    [You may assume that \(\sum h = 7150 , \sum t = 110 , \sum h ^ { 2 } = 7171500 , \sum t ^ { 2 } = 1716\), \(\sum t h = 64980\) and \(\mathrm { S } _ { t t } = 371.56\) ]
  39. Calculate \(\mathrm { S } _ { t h }\) and \(\mathrm { S } _ { h h }\). Give your answers to 3 significant figures.
  40. Calculate the product moment correlation coefficient for this data.
  41. State whether or not your value supports the use of a regression equation to predict the air temperature at different heights on this mountain. Give a reason for your answer.
  42. Find the equation of the regression line of \(t\) on \(h\) giving your answer in the form \(t = a + b h\).
  43. Interpret the value of \(b\).
  44. Estimate the difference in air temperature between a height of 500 m and a height of 1000 m .
    1. The marks of a group of female students in a statistics test are summarised in Figure 1
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{3d4f7bfb-b235-418a-9411-a4d0b3188254-227_629_1102_342_429} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure}
  45. Write down the mark which is exceeded by \(75 \%\) of the female students. The marks of a group of male students in the same statistics test are summarised by the stem and leaf diagram below.
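    | Mark | (2\|6 means 26) | Totals |
    | --- | --- | --- |
    | 1 | 4 | (1) |
    | 2 | 6 | (1) |
    | 3 | 4 4 7 | (3) |
    | 4 | 0 6 6 7 7 8 | (6) |
    | 5 | 0 0 1 1 1 3 6 7 7 | (9) |
    | 6 | 2 2 3 3 3 8 | (6) |
    | 7 | 0 0 8 | (3) |
    | 8 | 5 | (1) |
    | 9 | 0 | (1) |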
  46. Find the median and interquartile range of the marks of the male students. An outlier is a mark that is
    either more than \(1.5 \times\) interquartile range above the upper quartile or more than \(1.5 \times\) interquartile range below the lower quartile.
  47. In the space provided on Figure 1 draw a box plot to represent the marks of the male students, indicating clearly any outliers.
  48. Compare and contrast the marks of the male and the female students.
    3. In a company the 200 employees are classified as full-time workers, part-time workers or contractors.
    The table below shows the number of employees in each category and whether they walk to work or use some form of transport.
    | | Walk | Transport |
    | --- | --- | --- |
    | Full-time worker | 2 | 8 |
    | Part-time worker | 35 | 75 |
    | Contractor | 30 | 50 |
    The events \(F , H\) and \(C\) are that an employee is a full-time worker, part-time worker or contractor respectively. Let \(W\) be the event that an employee walks to work. An employee is selected at random.
    Find
  49. \(\mathrm { P } ( H )\)
  50. \(\mathrm { P } \left( [ F \cap W ] ^ { \prime } \right)\)
  51. \(\mathrm { P } ( W \mid C )\) Let \(B\) be the event that an employee uses the bus.
    Given that \(10 \%\) of full-time workers use the bus, \(30 \%\) of part-time workers use the bus and \(20 \%\) of contractors use the bus,
  52. draw a Venn diagram to represent the events \(F , H , C\) and \(B\),
  53. find the probability that a randomly selected employee uses the bus to travel to work.
    4. The following table summarises the times, \(t\) minutes to the nearest minute, recorded for a group of students to complete an exam.
    Time (minutes) \(t\) | \(11 - 20\) | \(21 - 25\) | \(26 - 30\) | \(31 - 35\) | \(36 - 45\) | \(46 - 60\)
    Number of students f | 62 | 88 | 16 | 13 | 11 | 10
    $$\text { [You may use } \sum \mathrm { f } t ^ { 2 } = 134281.25 \text { ] }$$
  54. Estimate the mean and standard deviation of these data.
  55. Use linear interpolation to estimate the value of the median.
  56. Show that the estimated value of the lower quartile is 18.6 to 3 significant figures.
  57. Estimate the interquartile range of this distribution.
  58. Give a reason why the mean and standard deviation are not the most appropriate summary statistics to use with these data.
    The person timing the exam made an error and each student actually took 5 minutes less than the times recorded above. The table below summarises the actual times.
    Time (minutes) \(t\) | \(6 - 15\) | \(16 - 20\) | \(21 - 25\) | \(26 - 30\) | \(31 - 40\) | \(41 - 55\)
    Number of students f | 62 | 88 | 16 | 13 | 11 | 10
  59. Without further calculations, explain the effect this would have on each of the estimates found in parts (a), (b), (c) and (d).
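The grouped-data estimates asked for in this question can be sketched in Python. This is a minimal illustration, not a model answer: it assumes the frequency row reads 62, 88, 16, 13, 11, 10 (a reading that reproduces the quoted \(\sum \mathrm{f} t^{2} = 134281.25\)), that class boundaries sit at the half-minute because times were recorded to the nearest minute, and that the quartile position is taken as \(n/4\). The names `summary` and `interpolate` are illustrative helpers.

```python
from math import sqrt

# Class boundaries assume times were rounded to the nearest minute,
# so the class "11 - 20" covers 10.5 <= t < 20.5, and so on.
bounds = [(10.5, 20.5), (20.5, 25.5), (25.5, 30.5),
          (30.5, 35.5), (35.5, 45.5), (45.5, 60.5)]
freqs = [62, 88, 16, 13, 11, 10]  # assumed reading of the frequency row


def summary(bounds, freqs):
    """Return (n, sum f*t^2, mean, sd) using class midpoints."""
    mids = [(lo + hi) / 2 for lo, hi in bounds]
    n = sum(freqs)
    sum_ft = sum(f * t for f, t in zip(freqs, mids))
    sum_ft2 = sum(f * t * t for f, t in zip(freqs, mids))
    mean = sum_ft / n
    sd = sqrt(sum_ft2 / n - mean ** 2)
    return n, sum_ft2, mean, sd


def interpolate(bounds, freqs, position):
    """Linear interpolation for the value at a cumulative position."""
    cumulative = 0
    for (lo, hi), f in zip(bounds, freqs):
        if cumulative + f >= position:
            return lo + (position - cumulative) / f * (hi - lo)
        cumulative += f
    raise ValueError("position exceeds total frequency")


n, sum_ft2, mean, sd = summary(bounds, freqs)
lower_quartile = interpolate(bounds, freqs, n / 4)

# Subtracting 5 minutes from every time shifts each class by 5.
shifted = [(lo - 5, hi - 5) for lo, hi in bounds]
_, _, shifted_mean, shifted_sd = summary(shifted, freqs)

print(sum_ft2)                        # 134281.25, the value quoted in the question
print(round(lower_quartile, 1))       # 18.6, as the question states
print(round(mean - shifted_mean, 6))  # the mean drops by 5
print(round(sd - shifted_sd, 6))      # the spread is unchanged
```

The last two lines show the general point behind the final part: subtracting a constant moves measures of location by that constant and leaves measures of spread alone.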
    1. A biased die with six faces is rolled. The discrete random variable \(X\) represents the score on the uppermost face. The probability distribution of \(X\) is shown in the table below.
    \(x\) | 1 | 2 | 3 | 4 | 5 | 6
    \(\mathrm { P } ( X = x )\) | \(a\) | \(a\) | \(a\) | \(b\) | \(b\) | 0.3
  60. Given that \(\mathrm { E } ( X ) = 4.2\) find the value of \(a\) and the value of \(b\).
  61. Show that \(\mathrm { E } \left( X ^ { 2 } \right) = 20.4\)
  62. Find \(\operatorname { Var } ( 5 - 3 X )\)
    A biased die with five faces is rolled. The discrete random variable \(Y\) represents the score which is uppermost. The cumulative distribution function of \(Y\) is shown in the table below.
    \(y\) | 1 | 2 | 3 | 4 | 5
    \(\mathrm { F } ( y )\) | \(\frac { 1 } { 10 }\) | \(\frac { 2 } { 10 }\) | \(3 k\) | \(4 k\) | \(5 k\)
  63. Find the value of \(k\).
  64. Find the probability distribution of \(Y\).
    Each die is rolled once. The scores on the two dice are independent.
  65. Find the probability that the sum of the two scores equals 2
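As a check on the six-faced die, the two conditions on \(a\) and \(b\) can be solved exactly and the quoted value \(\mathrm{E}(X^{2}) = 20.4\) confirmed. The sketch below is one possible route using exact fractions. The elimination step and the variable names are illustrative choices, not part of the question.

```python
from fractions import Fraction as F

# Conditions on the unknown probabilities a and b:
#   probabilities sum to 1:  3a + 2b + 0.3 = 1      ->  3a + 2b = 7/10
#   E(X) = 4.2:  (1+2+3)a + (4+5)b + 6(0.3) = 4.2   ->  6a + 9b = 12/5
# Doubling the first equation and subtracting it from the second gives 5b.
b = (F(12, 5) - 2 * F(7, 10)) / 5
a = (F(7, 10) - 2 * b) / 3

values = [1, 2, 3, 4, 5, 6]
probs = [a, a, a, b, b, F(3, 10)]

mean = sum(x * p for x, p in zip(values, probs))
mean_sq = sum(x * x * p for x, p in zip(values, probs))
variance = mean_sq - mean ** 2

# Var(c + kX) = k^2 Var(X): the constant 5 has no effect, the factor -3 is squared.
var_transformed = (-3) ** 2 * variance

print(a, b)             # 1/10 1/5
print(float(mean_sq))   # 20.4
print(var_transformed)  # 621/25
```

Working in `Fraction` avoids rounding, so the comparison with 20.4 is exact.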
    1. The weight, in grams, of beans in a tin is normally distributed with mean \(\mu\) and standard deviation 7.8
    Given that \(10 \%\) of tins contain less than 200 g , find
  66. the value of \(\mu\)
  67. the percentage of tins that contain more than 225 g of beans.
    The machine settings are adjusted so that the weight, in grams, of beans in a tin is normally distributed with mean 205 and standard deviation \(\sigma\).
  68. Given that \(98 \%\) of tins contain between 200 g and 210 g find the value of \(\sigma\).
    \section*{Probability} $$\begin{aligned} & \mathrm { P } ( A \cup B ) = \mathrm { P } ( A ) + \mathrm { P } ( B ) - \mathrm { P } ( A \cap B ) \\
    & \mathrm { P } ( A \cap B ) = \mathrm { P } ( A ) \mathrm { P } ( B \mid A ) \\
    & \mathrm { P } ( A \mid B ) = \frac { \mathrm { P } ( B \mid A ) \mathrm { P } ( A ) } { \mathrm { P } ( B \mid A ) \mathrm { P } ( A ) + \mathrm { P } \left( B \mid A ^ { \prime } \right) \mathrm { P } \left( A ^ { \prime } \right) } \end{aligned}$$ \section*{Discrete distributions} For a discrete random variable \(X\) taking values \(x _ { i }\) with probabilities \(\mathrm { P } \left( X = x _ { i } \right)\)
    Expectation (mean): \(\mathrm { E } ( X ) = \mu = \Sigma x _ { i } \mathrm { P } \left( X = x _ { i } \right)\)
    Variance: \(\operatorname { Var } ( X ) = \sigma ^ { 2 } = \Sigma \left( x _ { i } - \mu \right) ^ { 2 } \mathrm { P } \left( X = x _ { i } \right) = \Sigma x _ { i } ^ { 2 } \mathrm { P } \left( X = x _ { i } \right) - \mu ^ { 2 }\)
    For a function \(\mathrm { g } ( X ) : \mathrm { E } ( \mathrm { g } ( X ) ) = \Sigma \mathrm { g } \left( x _ { i } \right) \mathrm { P } \left( X = x _ { i } \right)\) \section*{Continuous distributions} Standard continuous distribution:
    Distribution of \(X\) | P.D.F. | Mean | Variance
    Normal \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\) | \(\frac { 1 } { \sigma \sqrt { 2 \pi } } \mathrm { e } ^ { - \frac { 1 } { 2 } \left( \frac { x - \mu } { \sigma } \right) ^ { 2 } }\) | \(\mu\) | \(\sigma ^ { 2 }\)
    \section*{Correlation and regression} For a set of \(n\) pairs of values ( \(x _ { i } , y _ { i }\) ) $$\begin{aligned} & S _ { x x } = \Sigma \left( x _ { i } - \bar { x } \right) ^ { 2 } = \Sigma x _ { i } ^ { 2 } - \frac { \left( \Sigma x _ { i } \right) ^ { 2 } } { n } \\
    & S _ { y y } = \Sigma \left( y _ { i } - \bar { y } \right) ^ { 2 } = \Sigma y _ { i } ^ { 2 } - \frac { \left( \Sigma y _ { i } \right) ^ { 2 } } { n } \\
    & S _ { x y } = \Sigma \left( x _ { i } - \bar { x } \right) \left( y _ { i } - \bar { y } \right) = \Sigma x _ { i } y _ { i } - \frac { \left( \Sigma x _ { i } \right) \left( \Sigma y _ { i } \right) } { n } \end{aligned}$$ The product moment correlation coefficient is $$r = \frac { S _ { x y } } { \sqrt { S _ { x x } S _ { y y } } } = \frac { \Sigma \left( x _ { i } - \bar { x } \right) \left( y _ { i } - \bar { y } \right) } { \sqrt { \left\{ \Sigma \left( x _ { i } - \bar { x } \right) ^ { 2 } \right\} \left\{ \Sigma \left( y _ { i } - \bar { y } \right) ^ { 2 } \right\} } } = \frac { \Sigma x _ { i } y _ { i } - \frac { \left( \Sigma x _ { i } \right) \left( \Sigma y _ { i } \right) } { n } } { \sqrt { \left( \Sigma x _ { i } ^ { 2 } - \frac { \left( \Sigma x _ { i } \right) ^ { 2 } } { n } \right) \left( \Sigma y _ { i } ^ { 2 } - \frac { \left( \Sigma y _ { i } \right) ^ { 2 } } { n } \right) } }$$ The regression coefficient of \(y\) on \(x\) is \(b = \frac { S _ { x y } } { S _ { x x } } = \frac { \Sigma \left( x _ { i } - \bar { x } \right) \left( y _ { i } - \bar { y } \right) } { \Sigma \left( x _ { i } - \bar { x } \right) ^ { 2 } }\) Least squares regression line of \(y\) on \(x\) is \(y = a + b x\) where \(a = \bar { y } - b \bar { x }\) \section*{THE NORMAL DISTRIBUTION FUNCTION} The function tabulated below is \(\Phi ( z )\), defined as \(\Phi ( z ) = \frac { 1 } { \sqrt { 2 \pi } } \int _ { - \infty } ^ { z } e ^ { - \frac { 1 } { 2 } t ^ { 2 } } \mathrm {~d} t\).
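    \(z\) | \(\Phi ( z )\) | \(z\) | \(\Phi ( z )\) | \(z\) | \(\Phi ( z )\) | \(z\) | \(\Phi ( z )\) | \(z\) | \(\Phi ( z )\)
    0.00 | 0.5000 | 0.50 | 0.6915 | 1.00 | 0.8413 | 1.50 | 0.9332 | 2.00 | 0.9772
    0.01 | 0.5040 | 0.51 | 0.6950 | 1.01 | 0.8438 | 1.51 | 0.9345 | 2.02 | 0.9783
    0.02 | 0.5080 | 0.52 | 0.6985 | 1.02 | 0.8461 | 1.52 | 0.9357 | 2.04 | 0.9793
    0.03 | 0.5120 | 0.53 | 0.7019 | 1.03 | 0.8485 | 1.53 | 0.9370 | 2.06 | 0.9803
    0.04 | 0.5160 | 0.54 | 0.7054 | 1.04 | 0.8508 | 1.54 | 0.9382 | 2.08 | 0.9812
    0.05 | 0.5199 | 0.55 | 0.7088 | 1.05 | 0.8531 | 1.55 | 0.9394 | 2.10 | 0.9821
    0.06 | 0.5239 | 0.56 | 0.7123 | 1.06 | 0.8554 | 1.56 | 0.9406 | 2.12 | 0.9830
    0.07 | 0.5279 | 0.57 | 0.7157 | 1.07 | 0.8577 | 1.57 | 0.9418 | 2.14 | 0.9838
    0.08 | 0.5319 | 0.58 | 0.7190 | 1.08 | 0.8599 | 1.58 | 0.9429 | 2.16 | 0.9846
    0.09 | 0.5359 | 0.59 | 0.7224 | 1.09 | 0.8621 | 1.59 | 0.9441 | 2.18 | 0.9854
    0.10 | 0.5398 | 0.60 | 0.7257 | 1.10 | 0.8643 | 1.60 | 0.9452 | 2.20 | 0.9861
    0.11 | 0.5438 | 0.61 | 0.7291 | 1.11 | 0.8665 | 1.61 | 0.9463 | 2.22 | 0.9868
    0.12 | 0.5478 | 0.62 | 0.7324 | 1.12 | 0.8686 | 1.62 | 0.9474 | 2.24 | 0.9875
    0.13 | 0.5517 | 0.63 | 0.7357 | 1.13 | 0.8708 | 1.63 | 0.9484 | 2.26 | 0.9881
    0.14 | 0.5557 | 0.64 | 0.7389 | 1.14 | 0.8729 | 1.64 | 0.9495 | 2.28 | 0.9887
    0.15 | 0.5596 | 0.65 | 0.7422 | 1.15 | 0.8749 | 1.65 | 0.9505 | 2.30 | 0.9893
    0.16 | 0.5636 | 0.66 | 0.7454 | 1.16 | 0.8770 | 1.66 | 0.9515 | 2.32 | 0.9898
    0.17 | 0.5675 | 0.67 | 0.7486 | 1.17 | 0.8790 | 1.67 | 0.9525 | 2.34 | 0.9904
    0.18 | 0.5714 | 0.68 | 0.7517 | 1.18 | 0.8810 | 1.68 | 0.9535 | 2.36 | 0.9909
    0.19 | 0.5753 | 0.69 | 0.7549 | 1.19 | 0.8830 | 1.69 | 0.9545 | 2.38 | 0.9913
    0.20 | 0.5793 | 0.70 | 0.7580 | 1.20 | 0.8849 | 1.70 | 0.9554 | 2.40 | 0.9918
    0.21 | 0.5832 | 0.71 | 0.7611 | 1.21 | 0.8869 | 1.71 | 0.9564 | 2.42 | 0.9922
    0.22 | 0.5871 | 0.72 | 0.7642 | 1.22 | 0.8888 | 1.72 | 0.9573 | 2.44 | 0.9927
    0.23 | 0.5910 | 0.73 | 0.7673 | 1.23 | 0.8907 | 1.73 | 0.9582 | 2.46 | 0.9931
    0.24 | 0.5948 | 0.74 | 0.7704 | 1.24 | 0.8925 | 1.74 | 0.9591 | 2.48 | 0.9934
    0.25 | 0.5987 | 0.75 | 0.7734 | 1.25 | 0.8944 | 1.75 | 0.9599 | 2.50 | 0.9938
    0.26 | 0.6026 | 0.76 | 0.7764 | 1.26 | 0.8962 | 1.76 | 0.9608 | 2.55 | 0.9946
    0.27 | 0.6064 | 0.77 | 0.7794 | 1.27 | 0.8980 | 1.77 | 0.9616 | 2.60 | 0.9953
    0.28 | 0.6103 | 0.78 | 0.7823 | 1.28 | 0.8997 | 1.78 | 0.9625 | 2.65 | 0.9960
    0.29 | 0.6141 | 0.79 | 0.7852 | 1.29 | 0.9015 | 1.79 | 0.9633 | 2.70 | 0.9965
    0.30 | 0.6179 | 0.80 | 0.7881 | 1.30 | 0.9032 | 1.80 | 0.9641 | 2.75 | 0.9970
    0.31 | 0.6217 | 0.81 | 0.7910 | 1.31 | 0.9049 | 1.81 | 0.9649 | 2.80 | 0.9974
    0.32 | 0.6255 | 0.82 | 0.7939 | 1.32 | 0.9066 | 1.82 | 0.9656 | 2.85 | 0.9978
    0.33 | 0.6293 | 0.83 | 0.7967 | 1.33 | 0.9082 | 1.83 | 0.9664 | 2.90 | 0.9981
    0.34 | 0.6331 | 0.84 | 0.7995 | 1.34 | 0.9099 | 1.84 | 0.9671 | 2.95 | 0.9984
    0.35 | 0.6368 | 0.85 | 0.8023 | 1.35 | 0.9115 | 1.85 | 0.9678 | 3.00 | 0.9987
    0.36 | 0.6406 | 0.86 | 0.8051 | 1.36 | 0.9131 | 1.86 | 0.9686 | 3.05 | 0.9989
    0.37 | 0.6443 | 0.87 | 0.8078 | 1.37 | 0.9147 | 1.87 | 0.9693 | 3.10 | 0.9990
    0.38 | 0.6480 | 0.88 | 0.8106 | 1.38 | 0.9162 | 1.88 | 0.9699 | 3.15 | 0.9992
    0.39 | 0.6517 | 0.89 | 0.8133 | 1.39 | 0.9177 | 1.89 | 0.9706 | 3.20 | 0.9993
    0.40 | 0.6554 | 0.90 | 0.8159 | 1.40 | 0.9192 | 1.90 | 0.9713 | 3.25 | 0.9994
    0.41 | 0.6591 | 0.91 | 0.8186 | 1.41 | 0.9207 | 1.91 | 0.9719 | 3.30 | 0.9995
    0.42 | 0.6628 | 0.92 | 0.8212 | 1.42 | 0.9222 | 1.92 | 0.9726 | 3.35 | 0.9996
    0.43 | 0.6664 | 0.93 | 0.8238 | 1.43 | 0.9236 | 1.93 | 0.9732 | 3.40 | 0.9997
    0.44 | 0.6700 | 0.94 | 0.8264 | 1.44 | 0.9251 | 1.94 | 0.9738 | 3.50 | 0.9998
    0.45 | 0.6736 | 0.95 | 0.8289 | 1.45 | 0.9265 | 1.95 | 0.9744 | 3.60 | 0.9998
    0.46 | 0.6772 | 0.96 | 0.8315 | 1.46 | 0.9279 | 1.96 | 0.9750 | 3.70 | 0.9999
    0.47 | 0.6808 | 0.97 | 0.8340 | 1.47 | 0.9292 | 1.97 | 0.9756 | 3.80 | 0.9999
    0.48 | 0.6844 | 0.98 | 0.8365 | 1.48 | 0.9306 | 1.98 | 0.9761 | 3.90 | 1.0000
    0.49 | 0.6879 | 0.99 | 0.8389 | 1.49 | 0.9319 | 1.99 | 0.9767 | 4.00 | 1.0000
    0.50 | 0.6915 | 1.00 | 0.8413 | 1.50 | 0.9332 | 2.00 | 0.9772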
    \section*{PERCENTAGE POINTS OF THE NORMAL DISTRIBUTION} The values \(z\) in the table are those which a random variable \(Z \sim \mathrm { N } ( 0,1 )\) exceeds with probability \(p\); that is, \(\mathrm { P } ( Z > z ) = 1 - \Phi ( z ) = p\).
    \(p\) | \(z\) | \(p\) | \(z\)
    0.5000 | 0.0000 | 0.0500 | 1.6449
    0.4000 | 0.2533 | 0.0250 | 1.9600
    0.3000 | 0.5244 | 0.0100 | 2.3263
    0.2000 | 0.8416 | 0.0050 | 2.5758
    0.1500 | 1.0364 | 0.0010 | 3.0902
    0.1000 | 1.2816 | 0.0005 | 3.2905
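Entries in both normal tables can be reproduced numerically. The sketch below assumes Python 3.8 or later (for `statistics.NormalDist`) and uses the standard identity \(\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)\). The function names `phi` and `percentage_point` are illustrative.

```python
from math import erf, sqrt
from statistics import NormalDist


def phi(z):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def percentage_point(p):
    """Value z that Z ~ N(0, 1) exceeds with probability p."""
    return NormalDist().inv_cdf(1.0 - p)


print(round(phi(1.00), 4))                # 0.8413
print(round(phi(1.96), 4))                # 0.975
print(round(percentage_point(0.1), 4))    # 1.2816
print(round(percentage_point(0.025), 4))  # 1.96
```

The two functions are inverses in the sense that `phi(percentage_point(p))` returns approximately `1 - p`.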
OCR FS1 AS 2021 June Q2
23 marks
2 In the manufacture of fibre optical cable (FOC), flaws occur randomly. Whether any point on a cable is flawed is independent of whether any other point is flawed. The number of flaws in 100 m of FOC of standard diameter is denoted by \(X\).
  1. State a further assumption needed for \(X\) to be well modelled by a Poisson distribution.
    Assume now that \(X\) can be well modelled by the distribution \(\operatorname { Po } ( 0.7 )\).
  2. Find the probability that in 300 m of FOC of standard diameter there are exactly 3 flaws.
    The number of flaws in 100 m of FOC of a larger diameter has the distribution \(\mathrm { Po } ( 1.6 )\).
  3. Find the probability that in 200 m of FOC of standard diameter and 100 m of FOC of the larger diameter the total number of flaws is at least 4.
    Judith believes that mathematical ability and chess-playing ability are related. She asks 20 randomly chosen chess players, with known British Chess Federation (BCF) ratings \(X\), to take a mathematics aptitude test, with scores \(Y\). The results are summarised as follows. $$n = 20 , \Sigma x = 3600 , \Sigma x ^ { 2 } = 660500 , \Sigma y = 1440 , \Sigma y ^ { 2 } = 105280 , \Sigma x y = 260990$$
  4. Calculate the value of Pearson's product-moment correlation coefficient \(r\).
  5. State an assumption needed to be able to carry out a significance test on the value of \(r\).
  6. Assume now that the assumption in part (ii) is valid. Test at the \(5 \%\) significance level whether there is evidence that chess players with higher BCF ratings are better at mathematics.
  7. There are two different grading systems for chess players, the BCF system and the international ELO system. The two sets of ratings are related by $$\text { ELO rating } = 8 \times \text { BCF rating } + 650$$ Magnus says that the experiment should have used ELO ratings instead of BCF ratings. Comment on Magnus's suggestion.
    An environmentalist measures the mean concentration, \(c\) milligrams per litre, of a particular chemical in a group of rivers, and the mean mass, \(m\) pounds, of fish of a certain species found in those rivers. The results are given in the table.
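For Judith's chess data, the product moment correlation coefficient follows directly from the quoted totals, and the effect of switching to ELO ratings can be checked by rebuilding the totals under ELO = 8 × BCF + 650. The sketch below is an illustration only. The helper `pmcc` and the variable names are assumptions, and the transformed totals are derived algebraically because the raw ratings are not given.

```python
from math import sqrt


def pmcc(n, sx, sxx_raw, sy, syy_raw, sxy_raw):
    """Product moment correlation coefficient from summary totals."""
    s_xx = sxx_raw - sx ** 2 / n
    s_yy = syy_raw - sy ** 2 / n
    s_xy = sxy_raw - sx * sy / n
    return s_xy / sqrt(s_xx * s_yy)


# Summary statistics quoted in the question (x = BCF rating, y = test score).
n, sum_x, sum_x2 = 20, 3600, 660500
sum_y, sum_y2, sum_xy = 1440, 105280, 260990

r_bcf = pmcc(n, sum_x, sum_x2, sum_y, sum_y2, sum_xy)

# Re-express each x as ELO = 8x + 650 and rebuild the totals algebraically.
k, c = 8, 650
elo_sum = k * sum_x + c * n
elo_sum2 = k * k * sum_x2 + 2 * k * c * sum_x + c * c * n
elo_sum_xy = k * sum_xy + c * sum_y

r_elo = pmcc(n, elo_sum, elo_sum2, sum_y, sum_y2, elo_sum_xy)

print(round(r_bcf, 3))  # 0.4
print(round(r_elo, 3))  # 0.4
```

Both values agree because a linear change of scale with positive gradient multiplies \(S_{xy}\) by 8 and \(S_{xx}\) by 64, so the factor cancels in \(r\). A value near 0.4 indicates a fairly weak positive linear association between rating and test score in this sample.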
    Question | Answer | Marks | AO | Guidance
    1(a) | \(0.25 + 0.36 + x + x ^ { 2 } = 1\) | M1 | 3.1a | Equation using \(\Sigma p = 1\)
    | \(x ^ { 2 } + x - 0.39 = 0\) | A1 | 1.1b | Correct simplified quadratic
    | \(x = 0.3\) (or \(- 1.3\)) | A1 | 1.1b | Correctly obtain \(x = 0.3\)
    | \(x\) cannot be negative | B1ft | 2.3 | Explicitly reject other solution
    | \(\mathrm { E } ( W ) = 2.23\) | B1 | 1.1b | 2.23 or exact equivalent only
    | \(\mathrm { E } \left( W ^ { 2 } \right) = \Sigma w ^ { 2 } \mathrm { p } ( w ) \quad [ = 5.83 ]\) | M1 | 1.1 | Use \(\Sigma w ^ { 2 } \mathrm { p } ( w )\)
    | Subtract \([ \mathrm { E } ( W ) ] ^ { 2 }\) to get \(\mathbf { 0.8571 }\) | A1 [7] | 2.1 | Correctly obtain given answer, www
    Additional guidance for 1(a): Can be implied; Method needed ft on their quadratic; Allow for \(\mathrm { E } ( W ) ^ { 2 } = 4.9729\); Need 2.23 or 4.9729 and 5.83 or full numerical \(\Sigma w ^ { 2 } \mathrm { p } ( w )\)
    1(b) | \(9 \times 0.8571 = 7.7139\) | B1 [1] | 1.1b | Allow 7.71 or 7.714
    2(a) | Flaws must occur at constant average rate (uniform rate) | B1 [1] | 1.2 | Context (e.g. "flaws") needed. Extra answers, e.g. "singly": B0. Not "constant rate" or "average constant rate".
    2(b) | \(\operatorname { Po } ( 2.1 )\) or \(e ^ { - \lambda } \frac { \lambda ^ { 3 } } { 3 ! }\) | M1 | 1.1 | Po(2.1) stated or implied, or formula with \(\lambda = 2.1\) stated
    | | A1 [2] | 1.1b | Awrt 0.189
    2(c) | \(\operatorname { Po } ( 3 )\) | M1 | 1.1 | \(\operatorname { Po } ( 2 \times 0.7 + 1.6 )\) stated or implied
    | \(1 - \mathrm { P } ( \leq 3 )\) | M1 | 1.1 | Allow \(1 - \mathrm { P } ( \leq 4 ) = 0.1847\), or from wrong \(\lambda\)
    | | A1 [3] | 1.1b | Awrt 0.353
    Additional guidance for 2(c): Or all combinations \(\leq 3\); \(1 -\) above, not just \(= 3\)
    3(a) | 0.4(00) | B2 [2] | 1.1, 1.1b | SC: if B0, give SC B1 for two of \(S _ { x x } = 12500 , S _ { y y } = 1600 , S _ { x y } = 1790\) and \(S _ { x y } / \sqrt { S _ { x x } S _ { y y } }\). Also allow SC B1 for equivalent methods using Covariance/SDs
    3(b) | Data needs to have a bivariate normal distribution | B1 [1] | 1.2 | Needs "bivariate normal" or clear equivalent. Not just "both normally distributed". Allow "scatter diagram forms ellipse"
    3(c) | \(\mathrm { H } _ { 0 }\) : higher maths scores are not associated with higher BCF grading; \(\mathrm { H } _ { 1 }\) : positively associated | B1 | 2.5 | Needs context and clearly one-tailed, OR \(\rho\) used and defined. Not "evidence that ..."
    | CV 0.3783 | B1 | 1.1b | Allow 0.378
    | \(0.400 > 0.3783\) so reject \(\mathrm { H } _ { 0 }\) | M1ft | 2.2b | Reject/do not reject \(\mathrm { H } _ { 0 }\)
    | Significant evidence that higher maths scores are associated with higher BCF grading | A1ft [4] | 3.5a | Contextualised, not too definite. Needn't say "positive" if \(\mathrm { H } _ { 1 }\) OK
    SC 2-tail: B0; 0.4438, or 0.3783 B1; then M1A0
    Additional guidance for 3(c): \(\mathrm { H } _ { 0 } : \rho = 0 , \mathrm { H } _ { 1 } : \rho > 0\) where \(\rho\) is population pmcc (not \(r\)); FT on their \(r\), but not CV; Not "scores are associated ..."; FT on their \(r\) only
    3(d) | It makes no difference as this is a linear transformation | B1 [1] | 2.2a | Need both "unchanged" oe and reason, need "linear" or exact equivalent. "oe" includes "their 0.4"
    4(a) | Neither | B1 [1] | 2.5 | OE. Not "neither is independent of the other"
    4(b) | \(c = 2.848 - 0.1567 m \quad \mathbf { BC }\) | B1 | 1.1 | Correct \(a\), awrt 2.85
    | | B1 | 1.1 | Correct \(b\), awrt 0.157
    | | B1 [3] | 1.1 | Letters correct from correct method
    (If both wrongly rounded, e.g. \(c = 2.84 - 0.156 m\), give B2)
    SC: \(m\) on \(c\): \(m = 15.65 - 4.832 c\): B2; \(y = 15.65 - 4.832 x\): B1; \(c = 15.65 - 4.832 m\): B1
    If B0B0, give B1 for correct letters from valid working
    4(c) | \(a\) unchanged, \(b\) multiplied by 2.2 (allow "\(a\) unchanged, \(b\) increases", etc) | B1 [1] | 2.2a | oe, e.g. \(c = 2.848 - 0.345 m\); \(m = 7.114 - 2.196 c\). SC: \(m\) on \(c\) in (b): Both divided by 2.2 B1
    4(d) | Draw approximate line of best fit | M1 | 1.1 |
    | Draw at least one vertical from line to point | M1 | 2.4 |
    | Say that "Best fit" line minimises the sum of squares of these distances | A1 [3] | 2.4 | Needs M2 and "minimises" and "sums of squares" oe
    SC: Horizontal(s): full marks (indept of (b))
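The ideas in 4(c) and 4(d) can be illustrated numerically. The sketch below uses hypothetical data, since the environmentalist's table is not reproduced here. It fits the least squares line with \(b = S_{xy} / S_{xx}\) and \(a = \bar{y} - b \bar{x}\), checks that nudging either coefficient increases the sum of squared vertical distances, and then rescales the explanatory variable. The rescaling assumes the mark scheme's factor of 2.2 corresponds to dividing the mass values by 2.2 (for instance pounds to kilograms), because the wording of that question part is not given. All function and variable names are illustrative.

```python
def fit_line(xs, ys):
    """Least squares line y = a + b x, using b = S_xy / S_xx."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


def sum_sq_vertical(xs, ys, a, b):
    """Sum of squared vertical distances from the points to y = a + b x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, NOT the environmentalist's table.
mass_lb = [2.0, 4.0, 6.0, 8.0, 10.0]
conc = [2.6, 2.1, 2.0, 1.5, 1.3]

a, b = fit_line(mass_lb, conc)
best = sum_sq_vertical(mass_lb, conc, a, b)

# Nudging either coefficient away from the fitted value increases the sum.
for da, db in [(0.05, 0.0), (-0.05, 0.0), (0.0, 0.01), (0.0, -0.01)]:
    assert sum_sq_vertical(mass_lb, conc, a + da, b + db) > best

# Assumed change of units: mass in kg = mass in lb / 2.2.
mass_kg = [m / 2.2 for m in mass_lb]
a_kg, b_kg = fit_line(mass_kg, conc)
print(abs(a_kg - a) < 1e-9)        # True: intercept unchanged
print(abs(b_kg - 2.2 * b) < 1e-9)  # True: gradient multiplied by 2.2
```

Dividing every \(x\) value by 2.2 divides \(S_{xy}\) by 2.2 and \(S_{xx}\) by \(2.2^{2}\), which is why the gradient scales by 2.2 while the intercept stays the same.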