Identify outliers or unusual points

A question is this type if and only if it asks to identify outliers, errors, or unusual data points in bivariate data or scatter diagrams.

4 questions

OCR MEI AS Paper 2 2024 June Q5
5 The pre-release material contains information for countries in the world concerning real GDP per capita in US\$ and mobile phone subscribers per 100 population. In an investigation into the relationship between these two variables, a student takes a sample of 20 countries in Africa. The student draws a scatter diagram for the data, which is shown in Fig. 5.1.
Fig. 5.1: Africa 1st sample
\includegraphics[max width=\textwidth, alt={}, center]{ce94c1ea-ffe5-42d0-8f8a-43c47105d6bf-4_433_1043_842_244}
  1. What does Fig. 5.1 suggest about the relationship between real GDP per capita and the number of mobile phone subscribers per 100 population?
    Another student collects a different sample of 20 countries from Africa, and draws a scatter diagram for the data, which is shown in Fig. 5.2.
    Fig. 5.2: Africa 2nd sample
    \includegraphics[max width=\textwidth, alt={}]{ce94c1ea-ffe5-42d0-8f8a-43c47105d6bf-4_273_1084_1818_244}
    Axis label: Mobile phone subscribers per 100 population
  2. What does Fig. 5.2 suggest about the relationship between real GDP per capita and the number of mobile phone subscribers per 100 population?
  3. Explain whether either of the two scatter diagrams is likely to be representative of the true relationship between real GDP per capita and the number of mobile phone subscribers per 100 population, for countries in Africa.
OCR FS1 AS 2017 December Q5
5 A shop manager recorded the maximum daytime temperature \(T ^ { \circ } \mathrm { C }\) and the number \(C\) of ice creams sold on 9 summer days. The results are given in the table and illustrated in the scatter diagram.
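| \(T\) | 17 | 21 | 25 | 26 | 27 | 27 | 29 | 30 | 30 |
|---|---|---|---|---|---|---|---|---|---|
| \(C\) | 21 | 16 | 20 | 38 | 32 | 37 | 35 | 39 | 42 |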
\includegraphics[max width=\textwidth, alt={}]{64d7ed6d-fadd-4c59-afb0-97d1788ba369-3_661_1189_1320_431}
$$n = 9 , \Sigma t = 232 , \Sigma c = 280 , \Sigma t ^ { 2 } = 6130 , \Sigma c ^ { 2 } = 9444 , \Sigma t c = 7489$$
  1. State, with a reason, whether one of the variables \(C\) or \(T\) is likely to be dependent upon the other.
  2. Calculate Pearson's product-moment correlation coefficient \(r\) for the data.
  3. State with a reason what the value of \(r\) would have been if the temperature had been measured in \({ } ^ { \circ } \mathrm { F }\) rather than \({ } ^ { \circ } \mathrm { C }\).
  4. Calculate the equation of the least squares regression line of \(c\) on \(t\).
  5. The regression line is drawn on the copy of the scatter diagram in the Printed Answer Booklet. Use this diagram to explain what is meant by "least squares".
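The calculations asked for in parts 2 to 4 can be checked numerically. The following is a minimal sketch, not a model answer: it assumes the standard summary-statistic formulas \(S_{tt} = \Sigma t^2 - (\Sigma t)^2 / n\), \(S_{cc} = \Sigma c^2 - (\Sigma c)^2 / n\) and \(S_{tc} = \Sigma tc - \Sigma t \, \Sigma c / n\), and the helper name `pmcc` is hypothetical. It also recomputes \(r\) from the tabulated data with the temperatures converted to \({ } ^ { \circ } \mathrm { F }\), to illustrate that a linear change of units leaves \(r\) unchanged.

```python
from math import sqrt

# Summary statistics quoted in the question.
n = 9
sum_t, sum_c = 232, 280
sum_t2, sum_c2, sum_tc = 6130, 9444, 7489

# Corrected sums of squares and products.
s_tt = sum_t2 - sum_t ** 2 / n
s_cc = sum_c2 - sum_c ** 2 / n
s_tc = sum_tc - sum_t * sum_c / n

# Product-moment correlation coefficient.
r = s_tc / sqrt(s_tt * s_cc)

# Least squares regression line of c on t: c = a + b * t.
b = s_tc / s_tt
a = sum_c / n - b * sum_t / n


def pmcc(xs, ys):
    """Correlation coefficient computed directly from paired data (hypothetical helper)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Data from the table, with temperatures also converted to Fahrenheit.
temps_c = [17, 21, 25, 26, 27, 27, 29, 30, 30]
sales = [21, 16, 20, 38, 32, 37, 35, 39, 42]
temps_f = [1.8 * t + 32 for t in temps_c]

r_celsius = pmcc(temps_c, sales)
r_fahrenheit = pmcc(temps_f, sales)

print(f"r = {r:.4f}")
print(f"c = {a:.2f} + {b:.3f} t")
print(f"r (Celsius data) = {r_celsius:.4f}, r (Fahrenheit data) = {r_fahrenheit:.4f}")
```

Working from the summary statistics mirrors what the question supplies, while the direct calculation from the table acts as an independent check on the same value.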
AQA AS Paper 2 2018 June Q18
18 Jennie is a piano teacher who teaches nine pupils. She records how many hours per week they practice the piano along with their most recent practical exam score.
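| Student | Practice (hours per week) | Practical exam score (out of 100) |
|---|---|---|
| Donovan | 50 | 64 |
| Vazquez | 6 | 71 |
| Higgins | 3 | 55 |
| Begum | 2.5 | 47 |
| Collins | 1 | 80 |
| Coldbridge | 4 | 61 |
| Nedbalek | 4.5 | 65 |
| Carter | 8 | 83 |
| White | 11 | 92 |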
She plots a scatter diagram of this data, as shown below.
\includegraphics[max width=\textwidth, alt={}, center]{8d9ace4b-0c15-48bd-9b0d-302f57ea9759-20_862_1516_1434_262}
  1. Identify two possible outliers by name, giving a possible explanation for the position on the scatter diagram of each outlier.
    First outlier \(\_\_\_\_\)
    Possible reason \(\_\_\_\_\)
    Second outlier \(\_\_\_\_\)
    Possible reason \(\_\_\_\_\)
  2. Jennie discards the two outliers.
    1. Describe the correlation shown by the scatter diagram for the remaining points.
    2. Interpret this correlation in the context of the question.
    In the past, he has found that \(70 \%\) of all seeds successfully germinate and grow into cucumber plants. He decides to try out a new brand of seed.
    The producer of this brand claims that these seeds are more likely to successfully germinate than other brands of seeds. Martin sows 20 of this new brand of seed and 18 successfully germinate.
    Carry out a hypothesis test at the \(5 \%\) level of significance to investigate the producer's claim.
AQA AS Paper 2 2021 June Q15
1 marks
15 The number of hours of sunshine and the daily maximum temperature were recorded over a 9-day period in June at an English seaside town. A scatter diagram representing the recorded data is shown below.
\includegraphics[max width=\textwidth, alt={}, center]{f87d1b36-26db-4a0b-b9ec-d7d82a396aba-20_872_1511_488_264}
  1. One of the points on the scatter diagram is an error.
    1. Write down the letter that identifies this point.
    2. Suggest one possible action that could be taken to deal with this error.
  2. It is claimed that the scatter diagram proves that longer hours of sunshine cause higher maximum daily temperatures. Comment on the validity of this claim. [1 mark]