Edexcel S1 (Statistics 1)

Question 1
  1. The students in a class were each asked to write down how many CDs they owned. The student with the least number of CDs had 14 and all but one of the others owned 60 or fewer. The remaining student owned 65. The quartiles for the class were 30, 34 and 42 respectively.
Outliers are defined to be any values outside the limits of \(1.5 \left( Q _ { 3 } - Q _ { 1 } \right)\) below the lower quartile or above the upper quartile. On graph paper draw a box plot to represent these data, indicating clearly any outliers.
(7 marks)
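
As a rough illustration of the outlier rule quoted in Question 1, the limits can be computed directly from the quartiles given. This is only a sketch: the helper name `outlier_fences` is an assumption for illustration and is not part of the paper, and only the two extreme values stated in the question are tested.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return (lower, upper) limits: k * (Q3 - Q1) below Q1 and above Q3."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Quartiles quoted in Question 1 (Q1 = 30, Q3 = 42)
lower, upper = outlier_fences(30, 42)

# The smallest value and the largest value stated in the question
extremes = [14, 65]
outliers = [v for v in extremes if v < lower or v > upper]
print(lower, upper, outliers)
```

Under this rule the value 14 lies inside the limits, while 65 lies above the upper limit and would be marked separately on the box plot.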
Question 2
2. The random variable \(X\) is normally distributed with mean 177.0 and standard deviation 6.4.
  1. Find \(\mathrm { P } ( 166 < X < 185 )\).
    (4 marks)
    It is suggested that \(X\) might be a suitable random variable to model the height, in cm, of adult males.
  2. Give two reasons why this is a sensible suggestion.
    (2 marks)
  3. Explain briefly why mathematical models can help to improve our understanding of real-world problems.
    (2 marks)
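
A numerical check of part 1 of Question 2 can be sketched with Python's standard-library `statistics.NormalDist`. The variable names are illustrative. Because the exam expects values read from statistical tables with \(z\) rounded to 2 decimal places, a table-based answer may differ slightly in the last digit.

```python
from statistics import NormalDist

# Model from Question 2: X ~ N(177.0, 6.4^2)
heights = NormalDist(mu=177.0, sigma=6.4)

# P(166 < X < 185) = F(185) - F(166)
p = heights.cdf(185) - heights.cdf(166)
print(round(p, 4))
```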
Question 4
4. The employees of a company are classified as management, administration or production. The following table shows the number employed in each category and whether they live close to the company or some distance away.

| | Live close | Live some distance away |
|---|---|---|
| Management | 6 | 14 |
| Administration | 25 | 10 |
| Production | 45 | 25 |

An employee is chosen at random.
Find the probability that this employee
  1. is an administrator,
  2. lives close to the company, given that the employee is a manager.
    Of the managers, \(90 \%\) are married, as are \(60 \%\) of the administrators and \(80 \%\) of the production employees.
  3. Construct a tree diagram containing all the probabilities.
  4. Find the probability that an employee chosen at random is married. (3 marks)
    An employee is selected at random and found to be married.
  5. Find the probability that this employee is in production.
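
The probabilities in Question 4 can be sketched with exact fractions. This assumes the table reads Management 6 and 14, Administration 25 and 10, Production 45 and 25 (live close, live some distance away); the dictionary and variable names are illustrative only.

```python
from fractions import Fraction

# Assumed reading of the table: (live close, live some distance away)
staff = {
    "management": (6, 14),
    "administration": (25, 10),
    "production": (45, 25),
}
# Proportion married in each category, as stated in the question
married_rate = {
    "management": Fraction(9, 10),
    "administration": Fraction(6, 10),
    "production": Fraction(8, 10),
}

total = sum(close + away for close, away in staff.values())
p_category = {k: Fraction(close + away, total) for k, (close, away) in staff.items()}

p_admin = p_category["administration"]
p_close_given_manager = Fraction(staff["management"][0], sum(staff["management"]))

# Law of total probability over the three categories
p_married = sum(p_category[k] * married_rate[k] for k in staff)

# Bayes' rule: P(production | married)
p_production_given_married = (
    p_category["production"] * married_rate["production"] / p_married
)
print(p_admin, p_close_given_manager, p_married, p_production_given_married)
```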
Question 5
5. The following grouped frequency distribution summarises the number of minutes, to the nearest minute, that a random sample of 200 motorists were delayed by roadworks on a stretch of motorway.

| Delay (mins) | Number of motorists |
|---|---|
| \(4 - 6\) | 15 |
| \(7 - 8\) | 28 |
| 9 | 49 |
| 10 | 53 |
| \(11 - 12\) | 30 |
| \(13 - 15\) | 15 |
| \(16 - 20\) | 10 |

  1. Using graph paper represent these data by a histogram.
  2. Give a reason to justify the use of a histogram to represent these data.
  3. Use interpolation to estimate the median of this distribution.
  4. Calculate an estimate of the mean and an estimate of the standard deviation of these data.
    One coefficient of skewness is given by $$\frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } } .$$
  5. Evaluate this coefficient for the above data.
  6. Explain why the normal distribution may not be suitable to model the number of minutes that motorists are delayed by these roadworks.
Question 6
6. A local authority is investigating the cost of reconditioning its incinerators. Data from 10 randomly chosen incinerators were collected. The variables monitored were the operating time \(x\) (in thousands of hours) since last reconditioning and the reconditioning cost \(y\) (in \(\pounds 1000\) ). None of the incinerators had been used for more than 3000 hours since last reconditioning. The data are summarised below, $$\Sigma x = 25.0 , \Sigma x ^ { 2 } = 65.68 , \Sigma y = 50.0 , \Sigma y ^ { 2 } = 260.48 , \Sigma x y = 130.64$$
  1. Find \(\mathrm { S } _ { x x } , \mathrm {~S} _ { x y } , \mathrm {~S} _ { y y }\).
  2. Calculate the product moment correlation coefficient between \(x\) and \(y\).
  3. Explain why this value might support the fitting of a linear regression model of the form \(y = a + b x\).
  4. Find the values of \(a\) and \(b\).
  5. Give an interpretation of \(a\).
  6. Estimate
    1. the reconditioning cost for an operating time of 2400 hours,
    2. the financial effect of an increase of 1500 hours in operating time.
  7. Suggest why the authority might be cautious about making a prediction of the reconditioning cost of an incinerator which had been operating for 4500 hours since its last reconditioning.
    Materials required for examination
    Answer Book (AB16)
    Graph Paper (ASG2)
    Mathematical Formulae (Lilac)
    Items included with question papers
    Nil
    Paper Reference(s)
    6683
    Edexcel GCE
    Statistics S1
    (New Syllabus)
    Advanced/Advanced Subsidiary
    Tuesday 12 June 2001 - Afternoon
    Time: 1 hour 30 minutes
    Candidates may use any calculator EXCEPT those with the facility for symbolic algebra, differentiation and/or integration. Thus candidates may NOT use calculators such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP 48G.
    In the boxes on the answer book, write the name of the examining body (Edexcel), your centre number, candidate number, the unit title (Statistics S1), the paper reference (6683), your surname, other name and signature.
    Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.
    Full marks may be obtained for answers to ALL questions.
    This paper has seven questions. Pages 6, 7 and 8 are blank. You must ensure that your answers to parts of questions are clearly labelled.
    You must show sufficient working to make your methods clear to the Examiner. Answers without working may gain no credit.
    1. Each of the 25 students on a computer course recorded the number of minutes \(x\), to the nearest minute, spent surfing the internet during a given day. The results are summarised below.
    $$\Sigma x = 1075 , \Sigma x ^ { 2 } = 46625$$
  8. Find \(\mu\) and \(\sigma\) for these data.
    Two other students surfed the internet on the same day for 35 and 51 minutes respectively.
  9. Without further calculation, explain the effect on the mean of including these two students.
    2. On a particular day in summer 1993 at 0800 hours the height above sea level, \(x\) metres, and the temperature, \(y ^ { \circ } \mathrm { C }\), were recorded in 10 Mediterranean towns. The following summary statistics were calculated from the results. $$\Sigma x = 7300 , \Sigma x ^ { 2 } = 6599600 , S _ { x y } = - 13060 , S _ { y y } = 140.9 .$$
  10. Find \(S _ { x x }\).
  11. Calculate, to 3 significant figures, the product moment correlation coefficient between \(x\) and \(y\).
  12. Give an interpretation of your coefficient.
    3. The continuous random variable \(Y\) is normally distributed with mean 100 and variance 256 .
  13. Find \(\mathrm { P } ( Y < 80 )\).
  14. Find \(k\) such that \(\mathrm { P } ( 100 - k \leq Y \leq 100 + k ) = 0.516\).
    4. The discrete random variable \(X\) has the probability function shown in the table below.

    | \(x\) | \(-2\) | \(-1\) | 0 | 1 | 2 | 3 |
    |---|---|---|---|---|---|---|
    | \(\mathrm { P } ( X = x )\) | 0.1 | \(\alpha\) | 0.3 | 0.2 | 0.1 | 0.1 |

    Find
  15. \(\alpha\),
  16. \(\mathrm { P } ( - 1 < X \leq 2 )\),
  17. \(\mathrm { F } ( - 0.4 )\),
  18. \(\mathrm { E } ( 3 X + 4 )\),
  19. \(\operatorname { Var } ( 2 X + 3 )\).
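
The quantities asked for in this question can be sketched with exact fractions. The code assumes the table reads \(x = -2, -1, 0, 1, 2, 3\) with probabilities \(0.1, \alpha, 0.3, 0.2, 0.1, 0.1\); the names `pmf` and `expect` are illustrative.

```python
from fractions import Fraction

# Known probabilities; P(X = -1) = alpha is found from the total of 1
known = {
    -2: Fraction(1, 10),
    0: Fraction(3, 10),
    1: Fraction(2, 10),
    2: Fraction(1, 10),
    3: Fraction(1, 10),
}
alpha = 1 - sum(known.values())
pmf = {**known, -1: alpha}

def expect(g):
    """E[g(X)] for the discrete distribution above."""
    return sum(g(x) * p for x, p in pmf.items())

p_interval = sum(p for x, p in pmf.items() if -1 < x <= 2)  # P(-1 < X <= 2)
cdf_value = sum(p for x, p in pmf.items() if x <= -0.4)     # F(-0.4)

mean = expect(lambda x: x)
var = expect(lambda x: x * x) - mean ** 2

e_3x_plus_4 = 3 * mean + 4    # E(aX + b) = a E(X) + b
var_2x_plus_3 = 4 * var       # Var(aX + b) = a^2 Var(X)
print(alpha, p_interval, cdf_value, e_3x_plus_4, var_2x_plus_3)
```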
    5. A market researcher asked 100 adults which of the three newspapers \(A , B , C\) they read. The results showed that 30 read \(A\), 26 read \(B\), 21 read \(C\), 5 read both \(A\) and \(B\), 7 read both \(B\) and \(C\), 6 read both \(C\) and \(A\) and 2 read all three.
  20. Draw a Venn diagram to represent these data.
    One of the adults is then selected at random.
    Find the probability that she reads
  21. at least one of the newspapers,
  22. only \(A\),
  23. only one of the newspapers,
  24. \(A\) given that she reads only one newspaper.
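
The regions of the Venn diagram for the newspaper question follow from inclusion-exclusion. The sketch below uses the counts stated in the question; the variable names are illustrative.

```python
from fractions import Fraction

n = 100
A, B, C = 30, 26, 21          # readers of each newspaper
AB, BC, CA, ABC = 5, 7, 6, 2  # pairwise and triple overlaps

# Inclusion-exclusion for the union
at_least_one = A + B + C - AB - BC - CA + ABC

# "Only" regions: remove both pairwise overlaps, add back the triple overlap
only_A = A - AB - CA + ABC
only_B = B - AB - BC + ABC
only_C = C - BC - CA + ABC
only_one = only_A + only_B + only_C

p_at_least_one = Fraction(at_least_one, n)
p_only_A = Fraction(only_A, n)
p_only_one = Fraction(only_one, n)
p_A_given_only_one = Fraction(only_A, only_one)
print(p_at_least_one, p_only_A, p_only_one, p_A_given_only_one)
```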
    6. Three swimmers Alan, Diane and Gopal record the number of lengths of the swimming pool they swim during each practice session over several weeks. The stem and leaf diagram below shows the results for Alan.

    | Lengths | \(2 \mid 0\) means 20 | |
    |---|---|---|
    | 2 | 0 1 2 2 | (4) |
    | 2 | 5 5 6 7 7 8 9 | (7) |
    | 3 | 0 1 2 2 4 | (5) |
    | 3 | 5 6 6 7 9 | (5) |
    | 4 | 0 1 3 3 3 3 3 4 4 4 | (10) |
    | 4 | 5 5 6 6 6 7 7 8 8 9 9 9 | (12) |
    | 5 | 0 0 0 | (3) |

  25. Find the three quartiles for Alan's results.
    The table below summarises the results for Diane and Gopal.

    | | Diane | Gopal |
    |---|---|---|
    | Smallest value | 35 | 25 |
    | Lower quartile | 37 | 34 |
    | Median | 42 | 42 |
    | Upper quartile | 53 | 50 |
    | Largest value | 65 | 57 |

  26. Using the same scale and on the same sheet of graph paper draw box plots to represent the data for Alan, Diane and Gopal.
  27. Compare and contrast the three box plots.
Question 7
7. A music teacher monitored the sight-reading ability of one of her pupils over a 10 week period. At the end of each week, the pupil was given a new piece to sight-read and the teacher noted the number of errors \(y\). She also recorded the number of hours \(x\) that the pupil had practised each week. The data are shown in the table below.
  1. Given that \(\mathrm { E } ( X ) = - 0.2\), find the value of \(\alpha\) and the value of \(\beta\).
  2. Write down \(\mathrm { F } ( 0.8 )\).
  3. Evaluate \(\operatorname { Var } ( X )\).
    Find the value of
  4. \(\mathrm { E } ( 3 X - 2 )\),
  5. \(\operatorname { Var } ( 2 X + 6 )\).
    7. The following stem and leaf diagram shows the aptitude scores \(x\) obtained by all the applicants for a particular job.

    | Aptitude score | \(3 \mid 1\) means 31 |
    |---|---|
    | 3 | 1 2 9 |
    | 4 | 2 4 6 8 9 |
    | 5 | 1 3 3 5 6 7 9 |
    | 6 | 0 1 3 3 3 5 6 8 8 9 |
    | 7 | 1 2 2 2 4 5 5 5 6 8 8 8 8 9 |
    | 8 | 0 1 2 3 5 8 8 9 |
    | 9 | 0 1 2 |

  6. Write down the modal aptitude score.
  7. Find the three quartiles for these data.
    Outliers can be defined to be outside the limits \(\mathrm { Q } _ { 1 } - 1.0 \left( \mathrm { Q } _ { 3 } - \mathrm { Q } _ { 1 } \right)\) and \(\mathrm { Q } _ { 3 } + 1.0 \left( \mathrm { Q } _ { 3 } - \mathrm { Q } _ { 1 } \right)\).
  8. On graph paper, draw a box plot to represent these data.
    For these data, \(\Sigma x = 3363\) and \(\Sigma x ^ { 2 } = 238305\).
  9. Calculate, to 2 decimal places, the mean and the standard deviation for these data.
  10. Use two different methods to show that these data are negatively skewed.
    END
    Advanced/Advanced Subsidiary
    Wednesday 15 January 2003 - Morning
    1. The total amount of time a secretary spent on the telephone in a working day was recorded to the nearest minute. The data collected over 40 days are summarised in the table below.
    Draw a histogram to illustrate these data.
    2. The lifetimes of batteries used for a computer game have a mean of 12 hours and a standard deviation of 3 hours. Battery lifetimes may be assumed to be normally distributed. Find the lifetime, \(t\) hours, of a battery such that 1 battery in 5 will have a lifetime longer than \(t\).
    3. A company owns two petrol stations \(P\) and \(Q\) along a main road. Total daily sales in the same week for \(P ( \pounds p )\) and for \(Q ( \pounds q )\) are summarised in the table below.
    A science teacher believes that students' marks in physics depend upon their mathematical ability. The teacher decides to investigate this relationship using the test marks.
  11. Write down which is the explanatory variable in this investigation.
  12. Draw a scatter diagram to illustrate these data.
  13. Showing your working, find the equation of the regression line of \(p\) on \(m\).
  14. Draw the regression line on your scatter diagram.
    A ninth student was absent for the physics test, but she sat the mathematics test and scored 15.
  15. Using this model, estimate the mark she would have scored in the physics test.
    END
    Paper Reference(s)
    6683
    Advanced/Advanced Subsidiary
    Tuesday 4 November 2003 - Morning
    This paper has six questions. You must ensure that your answers to parts of questions are clearly labelled.
    1. A company wants to pay its employees according to their performance at work. The performance score \(x\) and the annual salary, \(y\) in \(\pounds 100\) s, for a random sample of 10 of its employees for last year were recorded. The results are shown in the table below.
    [You may use \(\Sigma h ^ { 2 } = 272094 , \Sigma c ^ { 2 } = 2878966 , \Sigma h c = 884484\)]
  16. Draw a scatter diagram to illustrate these data.
  17. Find exact values of \(S _ { h c }\), \(S _ { h h }\) and \(S _ { c c }\).
  18. Calculate the value of the product moment correlation coefficient for these data.
  19. Give an interpretation of your correlation coefficient.
  20. Calculate the equation of the regression line of \(c\) on \(h\) in the form \(c = a + b h\).
  21. Estimate the level of confidence of a person of height 180 cm.
  22. State the range of values of \(h\) for which estimates of \(c\) are reliable.
    3. A discrete random variable \(X\) has a probability function as shown in the table below, where \(a\) and \(b\) are constants. Key: \(0 \quad | 18 | 4\) means 180 for Keith and 184 for Asif
    The quartiles for these two distributions are summarised in the table below. \(\quad 00\) means 10 Totals
  23. Find the three quartiles of these data.
    During the same month, the least number of caravans on Northcliffe caravan site was 31. The maximum number of caravans on this site on any night that month was 72. The three quartiles for this site were 38, 45 and 52 respectively.
  24. On graph paper and using the same scale, draw box plots to represent the data for both caravan sites. You may assume that there are no outliers.
  25. Compare and contrast these two box plots.
  26. Give an interpretation to the upper quartiles of these two distributions.
    3. The following table shows the height \(x\), to the nearest cm, and the weight \(y\), to the nearest kg, of a random sample of 12 students.
  27. Give a reason to justify the use of a histogram to represent these data.
  28. Calculate the frequency densities needed to draw a histogram for these data.
    (DO NOT DRAW THE HISTOGRAM)
  29. Use interpolation to estimate the median \(Q _ { 2 }\), the lower quartile \(Q _ { 1 }\), and the upper quartile \(Q _ { 3 }\) of these data.
    The mid-point of each class is represented by \(x\) and the corresponding frequency by \(f\). Calculations then give the following values $$\sum f x = 8379.5 \text { and } \sum f x ^ { 2 } = 557489.75$$
  30. Calculate an estimate of the mean and an estimate of the standard deviation for these data.
    One coefficient of skewness is given by $$\frac { Q _ { 3 } - 2 Q _ { 2 } + Q _ { 1 } } { Q _ { 3 } - Q _ { 1 } } .$$
  31. Evaluate this coefficient and comment on the skewness of these data.
  32. Give another justification of your comment in part (e).
    3. A long distance lorry driver recorded the distance travelled, \(m\) miles, and the amount of fuel used, \(f\) litres, each day. Summarised below are data from the driver's records for a random sample of 8 days. The data are coded such that \(x = m - 250\) and \(y = f - 100\). $$\sum x = 130 \quad \sum y = 48 \quad \sum x y = 8880 \quad S _ { x x } = 20487.5$$
  33. Find the equation of the regression line of \(y\) on \(x\) in the form \(y = a + b x\).
  34. Hence find the equation of the regression line of \(f\) on \(m\).
  35. Predict the amount of fuel used on a journey of 235 miles.
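
The coded regression in the lorry-driver question can be sketched from the summary statistics alone. The code uses \(S_{xy} = \sum xy - \frac{\sum x \sum y}{n}\), \(b = S_{xy} / S_{xx}\) and \(a = \bar{y} - b \bar{x}\), then decodes with \(x = m - 250\) and \(y = f - 100\). The function name `fuel_for` is an assumption for illustration.

```python
n = 8
sum_x, sum_y, sum_xy, s_xx = 130, 48, 8880, 20487.5

# Regression of y on x in the coded variables
s_xy = sum_xy - sum_x * sum_y / n
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

def fuel_for(miles):
    """Predict f by decoding y = a + b x with x = m - 250 and y = f - 100."""
    return 100 + a + b * (miles - 250)

# Intercept of the decoded line f = f_intercept + b m
f_intercept = 100 + a - 250 * b
print(round(a, 3), round(b, 3), round(f_intercept, 3), round(fuel_for(235), 1))
```

The gradient is unchanged by the coding because both variables are only shifted, not scaled.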
    4. Aeroplanes fly from City \(A\) to City \(B\). Over a long period of time the number of minutes delay in take-off from City \(A\) was recorded. The minimum delay was 5 minutes and the maximum delay was 63 minutes. A quarter of all delays were at most 12 minutes, half were at most 17 minutes and \(75 \%\) were at most 28 minutes. Only one of the delays was longer than 45 minutes. An outlier is an observation that falls either \(1.5 \times\) (interquartile range) above the upper quartile or \(1.5 \times\) (interquartile range) below the lower quartile.
  36. On graph paper, draw a box plot to represent these data.
  37. Comment on the distribution of delays. Justify your answer.
  38. Suggest how the distribution might be interpreted by a passenger who frequently flies from City \(A\) to City \(B\).
    5. The random variable \(X\) has probability function $$\mathrm { P } ( X = x ) = \begin{cases} k x , & x = 1,2,3 \\ k ( x + 1 ) , & x = 4,5 \end{cases}$$ where \(k\) is a constant.
  39. Find the value of \(k\).
  40. Find the exact value of \(\mathrm { E } ( X )\).
  41. Show that, to 3 significant figures, \(\operatorname { Var } ( X ) = 1.47\).
  42. Find, to 1 decimal place, \(\operatorname { Var } ( 4 - 3 X )\).
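
A short check of this question with exact fractions, assuming the probability function is \(kx\) for \(x = 1, 2, 3\) and \(k(x+1)\) for \(x = 4, 5\). The helper name `weight` is illustrative.

```python
from fractions import Fraction

def weight(x):
    """Coefficient of k in P(X = x)."""
    return x if x in (1, 2, 3) else x + 1

support = [1, 2, 3, 4, 5]

# The probabilities must sum to 1, which fixes k
k = Fraction(1, sum(weight(x) for x in support))
pmf = {x: k * weight(x) for x in support}

mean = sum(x * p for x, p in pmf.items())
var = sum(x * x * p for x, p in pmf.items()) - mean ** 2
var_4_minus_3x = 9 * var  # Var(aX + b) = a^2 Var(X)
print(k, mean, round(float(var), 2), round(float(var_4_minus_3x), 1))
```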
    6. A scientist found that the time taken, \(M\) minutes, to carry out an experiment can be modelled by a normal random variable with mean 155 minutes and standard deviation 3.5 minutes. Find
  43. \(\mathrm { P } ( M > 160 )\),
  44. \(\mathrm { P } ( 150 \leq M \leq 157 )\),
  45. the value of \(m\), to 1 decimal place, such that \(\mathrm { P } ( M \leq m ) = 0.30\).
    7. In a school there are 148 students in Years 12 and 13 studying Science, Humanities or Arts subjects. Of these students, 89 wear glasses and the others do not. There are 30 Science students of whom 18 wear glasses. The corresponding figures for the Humanities students are 68 and 44 respectively. A student is chosen at random.
    Find the probability that this student
  46. is studying Arts subjects,
  47. does not wear glasses, given that the student is studying Arts subjects.
    Amongst the Science students, \(80 \%\) are right-handed. Corresponding percentages for Humanities and Arts students are \(75 \%\) and \(70 \%\) respectively. A student is again chosen at random.
  48. Find the probability that this student is right-handed.
  49. Given that this student is right-handed, find the probability that the student is studying Science subjects.
    TOTAL FOR PAPER: 75 MARKS
    Materials required for examination
    Mathematical Formulae (Green or Lilac)
    Items included with question papers
    Nil
    Paper Reference(s)
    6683/01
    Advanced/Advanced Subsidiary
    Monday 16 January 2006 - Morning
    Time: 1 hour 30 minutes
    The marks for individual questions and the parts of questions are shown in round brackets: e.g. (2). There are 7 questions on this paper. The total mark for this paper is 75 . You must ensure that your answers to parts of questions are clearly labelled.
    1. Over a period of time, the number of people \(x\) leaving a hotel each morning was recorded. These data are summarised in the stem and leaf diagram below.
    Two of the conversations were chosen at random.
  50. Find the probability that both of them were longer than 24.5 minutes.
    The mid-point of each class was represented by \(x\) and its corresponding frequency by \(f\), giving \(\sum f x = 1060\).
  51. Calculate an estimate of the mean time spent on their conversations.
    During the following 25 weeks they monitored their weekly conversation and found that at the end of the 80 weeks their overall mean length of conversation was 21 minutes.
  52. Find the mean time spent in conversation during these 25 weeks.
  53. Comment on these two mean values.
    3. A metallurgist measured the length, \(l \mathrm {~mm}\), of a copper rod at various temperatures, \(t ^ { \circ } \mathrm { C }\), and recorded the following results. (You may use: \(\sum x = 315 , \sum x ^ { 2 } = 15225 , \sum y = 620 , \sum y ^ { 2 } = 56550 , \sum x y = 28750\) )
  54. Draw a scatter diagram to represent these data.
  55. Show that \(S _ { x y } = 4337.5\) and find \(S _ { x x }\).
    The student believes that a linear relationship of the form \(y = a + b x\) could be used to describe these data.
  56. Use linear regression to find the value of \(a\) and the value of \(b\), giving your answers to 1 decimal place.
  57. Draw the regression line on your diagram.
    The student believes that one brand of chocolate is overpriced.
  58. Use the scatter diagram to
    1. state which brand is overpriced,
    2. suggest a fair price for this brand.
    Give reasons for both your answers.
    4. A survey of the reading habits of some students revealed that, on a regular basis, \(25 \%\) read quality newspapers, \(45 \%\) read tabloid newspapers and \(40 \%\) do not read newspapers at all.
  59. Find the proportion of students who read both quality and tabloid newspapers.
  60. Draw a Venn diagram to represent this information.
    A student is selected at random. Given that this student reads newspapers on a regular basis,
  61. find the probability that this student only reads quality newspapers.
    5. [Figure 2: histogram of \(t\)]
    Figure 2 shows a histogram for the variable \(t\) which represents the time taken, in minutes, by a group of people to swim 500 m.
  62. Copy and complete the frequency table for \(t\).
    [You may use \(\sum x ^ { 2 } = 60475 , \sum y ^ { 2 } = 53122 , \sum x y = 56076\)]
  63. Showing your working clearly, calculate the product moment correlation coefficient between the interview test and the performance after one year.
    The product moment correlation coefficient between the skills assessment and the performance after one year is \(-0.156\) to 3 significant figures.
  64. Use your answer to part (a) to comment on whether or not the interview test and skills assessment are a guide to the performance after one year. Give clear reasons for your answers.
    2. Cotinine is a chemical that is made by the body from nicotine which is found in cigarette smoke. A doctor tested the blood of 12 patients, who claimed to smoke a packet of cigarettes a day, for cotinine. The results, in appropriate units, are shown below.
    For the Balmoral Hotel,
  65. write down the mode of the age of the residents,
  66. find the values of the lower quartile, the median and the upper quartile.
    1. Find the mean, \(\bar { x }\), of the age of the residents.
    2. Given that \(\sum x ^ { 2 } = 81213\), find the standard deviation of the age of the residents.
      One measure of skewness is found using $$\frac { \text { mean } - \text { mode } } { \text { standard deviation } }$$
  67. Evaluate this measure for the Balmoral Hotel.
    For the Abbey Hotel, the mode is 39, the mean is 33.2, the standard deviation is 12.7 and the measure of skewness is \(-0.454\).
  68. Compare the two age distributions of the residents of each hotel.
    3. The random variable \(X\) has probability distribution given in the table below.
    A histogram was drawn and the bar representing the \(10 - 15\) class has a width of 2 cm and a height of 5 cm. For the 16-18 class find
  69. the width,
  70. the height
    of the bar representing this class.
    4. A researcher measured the foot lengths of a random sample of 120 ten-year-old children. The lengths are summarised in the table below.

    | Foot length, \(l\) (cm) | Number of children |
    |---|---|
    | \(10 \leq l < 12\) | 5 |
    | \(12 \leq l < 17\) | 53 |
    | \(17 \leq l < 19\) | 29 |
    | \(19 \leq l < 21\) | 15 |
    | \(21 \leq l < 23\) | 11 |
    | \(23 \leq l < 25\) | 7 |

  71. Use interpolation to estimate the median of this distribution.
  72. Calculate estimates for the mean and the standard deviation of these data.
    One measure of skewness is given by $$\text { Coefficient of skewness } = \frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }$$
  73. Evaluate this coefficient and comment on the skewness of these data.
    Greg suggests that a normal distribution is a suitable model for the foot lengths of ten-year-old children.
  74. Using the value found in part (c), comment on Greg's suggestion, giving a reason for your answer.
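
The grouped-data estimates for the foot-length question can be sketched as follows. The code assumes the class boundaries and frequencies shown in the table, uses class mid-points for the mean and standard deviation, and divides by \(n\) (not \(n - 1\)) for the variance, which is the usual convention at this level. The function name `interpolated_median` is illustrative.

```python
from math import sqrt

# (lower boundary, upper boundary, frequency) for each foot-length class
classes = [(10, 12, 5), (12, 17, 53), (17, 19, 29),
           (19, 21, 15), (21, 23, 11), (23, 25, 7)]
n = sum(f for _, _, f in classes)

def interpolated_median(classes, n):
    """Linear interpolation inside the class containing the (n/2)th value."""
    target, cumulative = n / 2, 0
    for lo, hi, f in classes:
        if cumulative + f >= target:
            return lo + (target - cumulative) / f * (hi - lo)
        cumulative += f

median = interpolated_median(classes, n)

# Mid-point estimates of the mean and standard deviation
mean = sum(f * (lo + hi) / 2 for lo, hi, f in classes) / n
mean_sq = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes) / n
sd = sqrt(mean_sq - mean ** 2)

skew = 3 * (mean - median) / sd
print(round(median, 2), round(mean, 2), round(sd, 2), round(skew, 3))
```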
    5. The weight, \(w\) grams, and the length, \(l \mathrm {~mm}\), of 10 randomly selected newborn turtles are given in the table below.

    | \(l\) | 49.0 | 52.0 | 53.0 | 54.5 | 54.1 | 53.4 | 50.0 | 51.6 | 49.5 | 51.2 |
    |---|---|---|---|---|---|---|---|---|---|---|
    | \(w\) | 29 | 32 | 34 | 39 | 38 | 35 | 30 | 31 | 29 | 30 |

    (You may use \(S _ { l l } = 33.381\), \(S _ { w l } = 59.99\), \(S _ { w w } = 120.1\))
  75. Find the equation of the regression line of \(w\) on \(l\) in the form \(w = a + b l\).
  76. Use your regression line to estimate the weight of a newborn turtle of length 60 mm .
  77. Comment on the reliability of your estimate giving a reason for your answer.
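
For the turtle data, the regression line and the reliability check can be sketched as below. The lists assume the table reads as ten lengths and ten weights in the order given, and the code uses the quoted values of \(S_{ll}\) and \(S_{wl}\). The variable names are illustrative.

```python
lengths = [49.0, 52.0, 53.0, 54.5, 54.1, 53.4, 50.0, 51.6, 49.5, 51.2]
weights = [29, 32, 34, 39, 38, 35, 30, 31, 29, 30]
s_ll, s_wl = 33.381, 59.99  # summary values quoted in the question

# Regression of w on l: b = S_wl / S_ll, a = mean(w) - b * mean(l)
b = s_wl / s_ll
a = sum(weights) / len(weights) - b * sum(lengths) / len(lengths)

estimate_60 = a + b * 60

# An estimate is an extrapolation if the length lies outside the observed range
in_range = min(lengths) <= 60 <= max(lengths)
print(round(a, 1), round(b, 2), round(estimate_60, 1), in_range)
```

A length of 60 mm lies outside the observed lengths (49.0 to 54.5 mm), so the estimate is an extrapolation.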
    6. The discrete random variable \(X\) has probability function $$\mathrm { P } ( X = x ) = \left\{ \begin{array} { c l } a ( 3 - x ) & x = 0,1,2 \\ b & x = 3 \end{array} \right.$$
  78. Find \(\mathrm { P } ( X = 2 )\) and copy and complete the table below.

    | \(x\) | 0 | 1 | 2 | 3 |
    |---|---|---|---|---|
    | \(\mathrm { P } ( X = x )\) | \(3 a\) | \(2 a\) | | \(b\) |

    Given that \(\mathrm { E } ( X ) = 1.6\),
  79. find the value of \(a\) and the value of \(b\).
    Find
  80. \(\mathrm { P } ( 0.5 < X < 3 )\),
  81. \(\mathrm { E } ( 3 X - 2 )\).
  82. Show that the \(\operatorname { Var } ( X ) = 1.64\)
  83. Calculate \(\operatorname { Var } ( 3 X - 2 )\).
    7. (a) Given that \(\mathrm { P } ( A ) = a\) and \(\mathrm { P } ( B ) = b\) express \(\mathrm { P } ( A \cup B )\) in terms of \(a\) and \(b\) when
    1. \(A\) and \(B\) are mutually exclusive,
    2. \(A\) and \(B\) are independent.
    Two events \(R\) and \(Q\) are such that $$\mathrm { P } ( R \cap Q ^ { \prime } ) = 0.15 , \quad \mathrm { P } ( Q ) = 0.35 \quad \text { and } \quad \mathrm { P } ( R \mid Q ) = 0.1$$ Find the value of
  84. \(\mathrm { P } ( R \cup Q )\),
  85. \(\mathrm { P } ( R \cap Q )\),
  86. \(\mathrm { P } ( R )\).
Question 8
8. The lifetimes of bulbs used in a lamp are normally distributed. A company \(X\) sells bulbs with a mean lifetime of 850 hours and a standard deviation of 50 hours.
  1. Find the probability of a bulb, from company \(X\), having a lifetime of less than 830 hours.
  2. In a box of 500 bulbs, from company \(X\), find the expected number having a lifetime of less than 830 hours.
    A rival company \(Y\) sells bulbs with a mean lifetime of 860 hours and \(20 \%\) of these bulbs have a lifetime of less than 818 hours.
  3. Find the standard deviation of the lifetimes of bulbs from company \(Y\).
    Both companies sell the bulbs for the same price.
  4. State which company you would recommend. Give reasons for your answer.
    Mathematical Formulae (Pink or Green)
    Nil
    Wednesday 13 January 2010 - Afternoon
    Candidates may use any calculator allowed by the regulations of the Joint Council for Qualifications. Calculators must not have the facility for symbolic algebra manipulation, differentiation and integration, or have retrievable mathematical formulae stored in them.
    1. A jar contains 2 red, 1 blue and 1 green bead. Two beads are drawn at random from the jar without replacement.
    2. Draw a tree diagram to illustrate all the possible outcomes and associated probabilities. State your probabilities clearly.
    3. Find the probability that a blue bead and a green bead are drawn from the jar.
    4. The 19 employees of a company take an aptitude test. The scores out of 40 are illustrated in the stem and leaf diagram below.
    where \(a\) is a constant.
  5. Find the value of \(a\).
  6. Write down \(\mathrm { E } ( X )\).
  7. Find \(\operatorname { Var } ( X )\).
    The random variable \(Y = 6 - 2 X\).
  8. Find \(\operatorname { Var } ( Y )\).
  9. Calculate \(\mathrm { P } ( X \geq Y )\).
    Shivani selects a ball and spins the appropriate coin.
  10. Find the probability that she obtains a head.
    Given that Tom selected a ball at random and obtained a head when he spun the appropriate coin,
  11. find the probability that Tom selected a red ball.
    Shivani and Tom each repeat this experiment.
  12. Find the probability that the colour of the ball Shivani selects is the same as the colour of the ball Tom selects.
    4. The Venn diagram in Figure 1 shows the number of students in a class who read any of 3 popular magazines \(A , B\) and \(C\). \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e8afd947-55ac-424b-8db5-d5aa856ef4d7-064_264_615_360_443} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure} One of these students is selected at random.
  13. Show that the probability that the student reads more than one magazine is \(\frac { 1 } { 6 }\).
  14. Find the probability that the student reads \(A\) or \(B\) (or both).
  15. Write down the probability that the student reads both \(A\) and \(C\). Given that the student reads at least one of the magazines,
  16. find the probability that the student reads \(C\).
  17. Determine whether or not reading magazine \(B\) and reading magazine \(C\) are statistically independent.
    5. A teacher selects a random sample of 56 students and records, to the nearest hour, the time spent watching television in a particular week. where \(a , b\) and \(c\) are constants.
    The cumulative distribution function \(\mathrm { F } ( y )\) of \(Y\) is given in the following table.
  18. Estimate the number of motorists who were delayed between 8.5 and 13.5 minutes by the roadworks.
    (2)
    2. (a) State in words the relationship between two events \(R\) and \(S\) when \(\mathrm { P } ( R \cap S ) = 0\). The events \(A\) and \(B\) are independent with \(\mathrm { P } ( A ) = \frac { 1 } { 4 }\) and \(\mathrm { P } ( A \cup B ) = \frac { 2 } { 3 }\).
    Find
  19. \(\mathrm { P } ( B )\),
  20. \(\mathrm { P } \left( A ^ { \prime } \cap B \right)\),
  21. \(\mathrm { P } \left( B ^ { \prime } \mid A \right)\).
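One possible way to check the three probabilities for the independent events \(A\) and \(B\) is with exact fractions. This sketch assumes only the two values stated in the question; the variable names are illustrative.

```python
from fractions import Fraction

p_a = Fraction(1, 4)
p_a_or_b = Fraction(2, 3)

# Independence gives P(A or B) = P(A) + P(B) - P(A)P(B); solve for P(B).
p_b = (p_a_or_b - p_a) / (1 - p_a)

# A' and B are also independent, so the intersection is a product.
p_not_a_and_b = (1 - p_a) * p_b

# Conditioning on A does not change the probability of B'.
p_not_b_given_a = 1 - p_b

print(p_b, p_not_a_and_b, p_not_b_given_a)  # 5/9 5/12 4/9
```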
    3. The discrete random variable \(X\) can take only the values \(2,3,4\) or 6. For these values the probability distribution function is given by
    [You may use \(\sum p ^ { 2 } = 1967\) and \(\sum p t = 694\)]
  22. On graph paper, draw a scatter diagram to represent these data.
  23. Explain why a linear regression model may be appropriate to describe the relationship between \(p\) and \(t\).
  24. Calculate the value of \(S _ { p t }\) and the value of \(S _ { p p }\).
  25. Find the equation of the regression line of \(t\) on \(p\), giving your answer in the form \(t = a + b p\).
  26. Plot the point \(( \bar { p } , \bar { t } )\) and draw the regression line on your scatter diagram. The scientist reviews similar studies and finds that pollutant levels above 16 are likely to result in the death of a chick soon after hatching.
  27. Estimate the minimum thinning of the shell that is likely to result in the death of a chick.
    4. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e8afd947-55ac-424b-8db5-d5aa856ef4d7-077_401_741_296_1761} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure} Figure 1 shows how 25 people travelled to work.
    Their travel to work is represented by the events
    B bicycle
    \(T \quad\) train
    \(W\) walk
  28. Write down 2 of these events that are mutually exclusive. Give a reason for your answer.
  29. Determine whether or not \(B\) and \(T\) are independent events. One person is chosen at random.
    Find the probability that this person
  30. walks to work,
  31. travels to work by bicycle and train. Given that this person travels to work by bicycle,
  32. find the probability that they will also take the train.
    5. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e8afd947-55ac-424b-8db5-d5aa856ef4d7-078_618_812_301_315} \captionsetup{labelformat=empty} \caption{Figure 2}
    \end{figure} A policeman records the speed of the traffic on a busy road with a 30 mph speed limit.
    He records the speeds of a sample of 450 cars. The histogram in Figure 2 represents the results.
  33. Calculate the number of cars that were exceeding the speed limit by at least 5 mph in the sample.
  34. Estimate the value of the mean speed of the cars in the sample.
  35. Estimate, to 1 decimal place, the value of the median speed of the cars in the sample.
  36. Comment on the shape of the distribution. Give a reason for your answer.
  37. State, with a reason, whether the estimate of the mean or the median is a better representation of the average speed of the traffic on the road.
    6. The heights of an adult female population are normally distributed with mean 162 cm and standard deviation 7.5 cm .
  38. Find the probability that a randomly chosen adult female is taller than 150 cm .
    (3) Sarah is a young girl. She visits her doctor and is told that she is at the 60th percentile for height.
  39. Assuming that Sarah remains at the 60th percentile, estimate her height as an adult.
    (3) The heights of an adult male population are normally distributed with standard deviation 9.0 cm .
    Given that \(90 \%\) of adult males are taller than the mean height of adult females,
  40. find the mean height of an adult male.
    (4)
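The three normal-distribution parts above can be sketched with Python's standard-library `statistics.NormalDist`. This is a numerical check rather than the table-based method the paper expects, so the values may differ slightly from answers obtained with rounded z-values.

```python
from statistics import NormalDist

female = NormalDist(mu=162, sigma=7.5)

# P(X > 150) for a randomly chosen adult female.
p_taller_150 = 1 - female.cdf(150)

# Height at the 60th percentile of the adult female distribution.
sarah_height = female.inv_cdf(0.60)

# 90% of males are taller than 162 cm, so 162 cm is the 10th percentile
# of the male distribution: 162 = mu_male + z_0.10 * 9.0.
z_10 = NormalDist().inv_cdf(0.10)
mu_male = 162 - z_10 * 9.0

print(round(p_taller_150, 4), round(sarah_height, 1), round(mu_male, 1))
```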
    7. A manufacturer carried out a survey of the defects in their soft toys. It is found that the probability of a toy having poor stitching is 0.03 and that a toy with poor stitching has a probability of 0.7 of splitting open. A toy without poor stitching has a probability of 0.02 of splitting open.
  41. Draw a tree diagram to represent this information.
    (3)
  42. Find the probability that a randomly chosen soft toy has exactly one of the two defects, poor stitching or splitting open.
    (3) The manufacturer also finds that soft toys can become faded with probability 0.05 and that this defect is independent of poor stitching or splitting open. A soft toy is chosen at random.
  43. Find the probability that the soft toy has none of these 3 defects.
  44. Find the probability that the soft toy has exactly one of these 3 defects.
    \section*{Advanced Level} \section*{Friday 18 January 2013 - Afternoon}
    \section*{P41805A}
    1. A teacher asked a random sample of 10 students to record the number of hours of television, \(t\), they watched in the week before their mock exam. She then calculated their grade, \(g\), in their mock exam. The results are summarised as follows.
    $$\sum t = 258 \quad \sum t ^ { 2 } = 8702 \quad \sum g = 63.6 \quad \mathrm {~S} _ { g g } = 7.864 \quad \sum g t = 1550.2$$
  45. Find \(\mathrm { S } _ { t t }\) and \(\mathrm { S } _ { g t }\).
  46. Calculate, to 3 significant figures, the product moment correlation coefficient between \(t\) and \(g\). The teacher also recorded the number of hours of revision, \(v\), these 10 students completed during the week before their mock exam. The correlation coefficient between \(t\) and \(v\) was - 0.753 .
  47. Describe, giving a reason, the nature of the correlation you would expect to find between \(v\) and \(g\).
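The summary statistics in this question are enough to compute \(\mathrm{S}_{tt}\), \(\mathrm{S}_{gt}\) and the correlation coefficient directly. A minimal sketch, using the standard formulae \(\mathrm{S}_{tt} = \sum t^2 - (\sum t)^2 / n\) and \(\mathrm{S}_{gt} = \sum gt - \sum g \sum t / n\) with \(n = 10\):

```python
from math import sqrt

n = 10
sum_t, sum_t2 = 258, 8702
sum_g, s_gg, sum_gt = 63.6, 7.864, 1550.2

s_tt = sum_t2 - sum_t**2 / n          # sum of squares for t
s_gt = sum_gt - sum_g * sum_t / n     # sum of products for g and t
r = s_gt / sqrt(s_tt * s_gg)          # product moment correlation coefficient

print(round(s_tt, 1), round(s_gt, 2), round(r, 3))  # 2045.6 -90.68 -0.715
```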
    2. The discrete random variable \(X\) can take only the values 1,2 and 3 . For these values the cumulative distribution function is defined by $$\mathrm { F } ( x ) = \frac { x ^ { 3 } + k } { 40 } , \quad x = 1,2,3 .$$
  48. Show that \(k = 13\).
  49. Find the probability distribution of \(X\). Given that \(\operatorname { Var } ( X ) = \frac { 259 } { 320 }\),
  50. find the exact value of \(\operatorname { Var } ( 4 X - 5 )\).
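The cumulative distribution function above determines both \(k\) and the probability distribution. The sketch below uses exact fractions; the helper name `F` mirrors the question, and the other names are illustrative.

```python
from fractions import Fraction

def F(x, k):
    """Cumulative distribution function F(x) = (x^3 + k) / 40."""
    return Fraction(x**3 + k, 40)

# F(3) must equal 1 because 3 is the largest value X can take.
k = 40 - 3**3

# P(X = x) is the jump in F at each value.
pmf = {1: F(1, k), 2: F(2, k) - F(1, k), 3: F(3, k) - F(2, k)}

# Var(aX + b) = a^2 Var(X), with Var(X) = 259/320 given in the question.
var_x = Fraction(259, 320)
var_4x_minus_5 = 4**2 * var_x

print(k, pmf, var_4x_minus_5)
```

As a cross-check, computing the variance from `pmf` reproduces the stated value of 259/320.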
    3. A biologist is comparing the intervals ( \(m\) seconds) between the mating calls of a certain species of tree frog and the surrounding temperature ( \(t ^ { \circ } \mathrm { C }\) ). The following results were obtained. [You may assume that \(\sum h = 7150 , \sum t = 110 , \sum h ^ { 2 } = 7171500 , \sum t ^ { 2 } = 1716 , \sum t h = 64980\) and \(\mathrm { S } _ { t t } = 371.56\) ]
  51. Calculate \(\mathrm { S } _ { t h }\) and \(\mathrm { S } _ { h h }\). Give your answers to 3 significant figures.
  52. Calculate the product moment correlation coefficient for this data.
  53. State whether or not your value supports the use of a regression equation to predict the air temperature at different heights on this mountain. Give a reason for your answer.
  54. Find the equation of the regression line of \(t\) on \(h\) giving your answer in the form \(t = a + b h\).
  55. Interpret the value of \(b\).
  56. Estimate the difference in air temperature between a height of 500 m and a height of 1000 m .
    2. The marks of a group of female students in a statistics test are summarised in Figure 1. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e8afd947-55ac-424b-8db5-d5aa856ef4d7-083_412_723_331_390} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure} An outlier is a mark that is
    either more than \(1.5 \times\) interquartile range above the upper quartile
    or more than \(1.5 \times\) interquartile range below the lower quartile.
  57. On graph paper draw a box plot to represent the marks of the male students, indicating clearly any outliers.
  58. Compare and contrast the marks of the male and the female students.
    (2) \begin{table}[h]
    \captionsetup{labelformat=empty} \caption{(a) Write down the mark which is exceeded by \(75 \%\) of the female students. (1)
    The marks of a group of male students in the same statistics test are summarised by the stem and leaf diagram below.}
    \begin{tabular}{r|l|r}
    Mark & (\(2 \mid 6\) means 26) & Totals \\
    \hline
    1 & 4 & (1) \\
    2 & 6 & (1) \\
    3 & 4 4 7 & (3) \\
    4 & 0 6 6 7 7 8 & (6) \\
    5 & 0 0 1 1 1 3 6 7 7 & (9) \\
    6 & 2 2 3 3 3 8 & (6) \\
    7 & 0 0 8 & (3) \\
    8 & 5 & (1) \\
    9 & 0 & (1) \\
    \end{tabular}
\end{table}
  • Find the median and interquartile range of the marks of the male students.
    3. In a company the 200 employees are classified as full-time workers, part-time workers or contractors. The table below shows the number of employees in each category and whether they walk to work or use some form of transport.
  • Find the probability distribution of \(X\).
  • Write down the value of \(\mathrm { F } ( 1.8 )\).
    3. An agriculturalist is studying the yields, \(y \mathrm {~kg}\), from tomato plants. The data from a random sample of 70 tomato plants are summarised below. [You may assume that \(\Sigma h = 7150 , \Sigma t = 110 , \Sigma h ^ { 2 } = 7171500 , \Sigma t ^ { 2 } = 1716 , \Sigma t h = 64980\) and \(\mathrm { S } _ { t t } = 371.56\) ]
  • Calculate \(\mathrm { S } _ { t h }\) and \(\mathrm { S } _ { h h }\). Give your answers to 3 significant figures.
  • Calculate the product moment correlation coefficient for this data.
  • State whether or not your value supports the use of a regression equation to predict the air temperature at different heights on this mountain. Give a reason for your answer.
  • Find the equation of the regression line of \(t\) on \(h\) giving your answer in the form \(t = a + b h\).
  • Interpret the value of \(b\).
  • Estimate the difference in air temperature between a height of 500 m and a height of 1000 m .
    (2)
    2. The marks of a group of female students in a statistics test are summarised in Figure 1. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e8afd947-55ac-424b-8db5-d5aa856ef4d7-090_417_732_328_1802} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure}
  • Write down the mark which is exceeded by \(75 \%\) of the female students. The marks of a group of male students in the same statistics test are summarised by the stem and leaf diagram below. (You may use \(\Sigma c = 111 , \Sigma c ^ { 2 } = 2375 , \Sigma s = 21 , \Sigma s ^ { 2 } = 79 , \Sigma c s = 380 , \mathrm {~S} _ { c c } = 321.5\).)
  • Calculate the value of \(\mathrm { S } _ { c s }\) and the value of \(\mathrm { S } _ { s s }\).
  • Calculate the product moment correlation coefficient for these data. Brad is not satisfied with his current internet service and decides to change his provider. He decides to pay a lot more for his new internet service.
  • On the basis of your calculation in part (b), comment on Brad's decision. Give a reason for your answer.
    2. A rugby club coach uses club records to take a random sample of 15 players from 1990 and an independent random sample of 15 players from 2010. The body weight of each player was recorded to the nearest kg and the results from 2010 are summarised in the table below. [You may use \(\Sigma x = 370 , \mathrm {~S} _ { x x } = 2587.5 , \Sigma y = 560 , \Sigma y ^ { 2 } = 39418 , \mathrm {~S} _ { x y } = - 710\) ]
  • Calculate \(\mathrm { S } _ { y y }\).
  • Calculate the product moment correlation coefficient for these data.
  • Interpret your value of the correlation coefficient. The researcher believes that a linear regression model may be appropriate to describe these data.
  • State, giving a reason, whether or not your value of the correlation coefficient supports the researcher's belief.
    (1)
  • Find the equation of the regression line of \(y\) on \(x\), giving your answer in the form \(y = a + b x\). Jack is a 40-year-old patient.
    1. Use your regression line to estimate the volume of blood pumped by each contraction of Jack's heart.
    2. Comment, giving a reason, on the reliability of your estimate.
      2. The table below shows the distances (to the nearest km ) travelled to work by the 50 employees in an office.
  • Show that \(p = 0.2\). Find
  • \(\mathrm { E } ( X )\)
  • \(\mathrm { F } ( 0 )\)
  • \(\mathrm { P } ( 3 X + 2 > 5 )\) Given that \(\operatorname { Var } ( X ) = 13.35\),
  • find the possible values of \(a\) such that \(\operatorname { Var } ( a X + 3 ) = 53.4\).
    2. The discrete random variable \(X\) has probability distribution $$\mathrm { P } ( X = x ) = \frac { 1 } { 10 } \quad x = 1,2,3 , \ldots 10$$
  • Write down the name given to this distribution.
  • Write down the value of
    1. \(\mathrm { P } ( X = 10 )\)
    2. \(\mathrm { P } ( X < 10 )\) The continuous random variable \(Y\) has the normal distribution \(\mathrm { N } \left( 10,2 ^ { 2 } \right)\).
  • Write down the value of
    1. \(\mathrm { P } ( Y = 10 )\)
    2. \(\mathrm { P } ( Y < 10 )\)
      3. A large company is analysing how much money it spends on paper in its offices every year. The number of employees, \(x\), and the amount of money spent on paper, \(p\) ( \(\pounds\) hundreds), in 8 randomly selected offices are given in the table below.
      Key: 7|3|1 means 37 years for Greenslax and 31 years for Penville
      Some of the quartiles for these two distributions are given in the table below. You may use
      \(S _ { v v } = 42587.5\)
      \(S _ { v m } = 31512.5\)
      \(S _ { m m } = 25187.5\)
      \(\Sigma v = 19390\)
      \(\Sigma m = 10610\)
  • Find the product moment correlation coefficient between \(m\) and \(v\).
  • Give a reason to support fitting a regression model of the form \(m = a + b v\) to these data.
  • Find the value of \(b\) correct to 3 decimal places.
  • Find the equation of the regression line of \(m\) on \(v\).
  • Interpret your value of \(b\).
  • Use your answer to part (d) to estimate the amount of money spent when the number of visitors to the UK in a month is 2500000.
  • Comment on the reliability of your estimate in part (f). Give a reason for your answer.
    4. In a factory, three machines, \(J , K\) and \(L\), are used to make biscuits. Machine \(J\) makes \(25 \%\) of the biscuits.
    Machine \(K\) makes \(45 \%\) of the biscuits.
    The rest of the biscuits are made by machine \(L\).
    It is known that \(2 \%\) of the biscuits made by machine \(J\) are broken, \(3 \%\) of the biscuits made by machine \(K\) are broken and \(5 \%\) of the biscuits made by machine \(L\) are broken.
  • Draw a tree diagram to illustrate all the possible outcomes and associated probabilities. A biscuit is selected at random.
  • Calculate the probability that the biscuit is made by machine \(J\) and is not broken.
  • Calculate the probability that the biscuit is broken.
  • Given that the biscuit is broken, find the probability that it was not made by machine \(K\).
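The biscuit question is a direct application of the law of total probability and Bayes' theorem. A sketch with exact fractions follows; the dictionary names `share` and `broken` are illustrative, and machine \(L\)'s share is taken as the remainder, as the question states.

```python
from fractions import Fraction

# Share of biscuits made by each machine; L makes the rest.
share = {"J": Fraction(25, 100), "K": Fraction(45, 100)}
share["L"] = 1 - share["J"] - share["K"]

# Probability that a biscuit from each machine is broken.
broken = {"J": Fraction(2, 100), "K": Fraction(3, 100), "L": Fraction(5, 100)}

p_j_not_broken = share["J"] * (1 - broken["J"])

# Total probability of a broken biscuit, summed over the three branches.
p_broken = sum(share[m] * broken[m] for m in share)

# Bayes' theorem: broken biscuits not from K, as a fraction of all broken ones.
p_not_k_given_broken = (p_broken - share["K"] * broken["K"]) / p_broken

print(p_j_not_broken, p_broken, p_not_k_given_broken)  # 49/200 67/2000 40/67
```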
    5. The discrete random variable \(X\) has the probability function $$\mathrm { P } ( X = x ) = \begin{cases} k x & x = 2,4,6 \\ k ( x - 2 ) & x = 8 \\ 0 & \text { otherwise } \end{cases}$$ where \(k\) is a constant.
  • Show that \(k = \frac { 1 } { 18 }\).
  • Find the exact value of \(\mathrm { F } ( 5 )\).
  • Find the exact value of \(\mathrm { E } ( X )\).
  • Find the exact value of \(\mathrm { E } \left( X ^ { 2 } \right)\).
  • Calculate \(\operatorname { Var } ( 3 - 4 X )\) giving your answer to 3 significant figures.
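The parts of this question can be checked with exact fractions. The sketch assumes the probability function is \(kx\) for \(x = 2, 4, 6\) and \(k(x - 2)\) for \(x = 8\), a reading that is consistent with the stated value \(k = \frac{1}{18}\); the variable names are illustrative.

```python
from fractions import Fraction

# Probability of each value as a multiple of k.
weights = {2: 2, 4: 4, 6: 6, 8: 8 - 2}

# The probabilities must sum to 1, which fixes k.
k = Fraction(1, sum(weights.values()))
pmf = {x: k * w for x, w in weights.items()}

f_5 = sum(p for x, p in pmf.items() if x <= 5)     # F(5) = P(X <= 5)
e_x = sum(x * p for x, p in pmf.items())           # E(X)
e_x2 = sum(x * x * p for x, p in pmf.items())      # E(X^2)

# Var(3 - 4X) = (-4)^2 Var(X).
var_3_minus_4x = (-4) ** 2 * (e_x2 - e_x**2)

print(k, f_5, e_x, e_x2, var_3_minus_4x)  # 1/18 1/3 52/9 112/3 5120/81
```

To 3 significant figures, 5120/81 is 63.2.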
    6. The times, in seconds, spent in a queue at a supermarket by 85 randomly selected customers, are summarised in the table below.
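    \begin{tabular}{|c|c|}
    \hline
    Time (seconds) & Number of customers, \(f\) \\
    \hline
    \(0 - 30\) & 2 \\
    \(30 - 60\) & 10 \\
    \(60 - 70\) & 17 \\
    \(70 - 80\) & 25 \\
    \(80 - 100\) & 25 \\
    \(100 - 150\) & 6 \\
    \hline
    \end{tabular}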
    A histogram was drawn to represent these data. The \(30 - 60\) group was represented by a bar of width 1.5 cm and height 1 cm .
  • Find the width and the height of the \(70 - 80\) group.
  • Use linear interpolation to estimate the median of this distribution. Given that \(x\) denotes the midpoint of each group in the table and $$\Sigma f x = 6460 \quad \Sigma f x ^ { 2 } = 529400$$
  • calculate an estimate for
    1. the mean,
    2. the standard deviation,
      for the above data. One measure of skewness is given by $$\text { coefficient of skewness } = \frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }$$
  • Evaluate this coefficient and comment on the skewness of these data.
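The queue-time calculations can be sketched as below. Two assumptions are made here: the median is located at the \(n/2\)-th value (a common convention for grouped data), and the histogram scale is inferred from the \(30 - 60\) bar. All names are illustrative.

```python
from math import sqrt

# (lower bound, upper bound, frequency) for each group in the table.
groups = [(0, 30, 2), (30, 60, 10), (60, 70, 17),
          (70, 80, 25), (80, 100, 25), (100, 150, 6)]
n = sum(f for _, _, f in groups)

# Histogram scale from the 30-60 bar: 30 s maps to 1.5 cm, area 1.5 cm^2 for f = 10.
cm_per_second = 1.5 / 30
area_per_customer = (1.5 * 1.0) / 10
width_70_80 = (80 - 70) * cm_per_second
height_70_80 = 25 * area_per_customer / width_70_80

# Median by linear interpolation within the group containing the n/2-th value.
target, cumulative = n / 2, 0
for low, high, f in groups:
    if cumulative + f >= target:
        median = low + (target - cumulative) / f * (high - low)
        break
    cumulative += f

# Mean and standard deviation from group midpoints.
sum_fx = sum(f * (low + high) / 2 for low, high, f in groups)
sum_fx2 = sum(f * ((low + high) / 2) ** 2 for low, high, f in groups)
mean = sum_fx / n
sd = sqrt(sum_fx2 / n - mean**2)

skewness = 3 * (mean - median) / sd
print(width_70_80, height_70_80, median, mean, round(sd, 2), round(skewness, 3))
```

The midpoint sums computed here reproduce the values 6460 and 529400 quoted in the question. The coefficient comes out small and positive, suggesting only slight positive skew.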
    7. The heights of adult females are normally distributed with mean 160 cm and standard deviation 8 cm .
  • Find the probability that a randomly selected adult female has a height greater than 170 cm . Any adult female whose height is greater than 170 cm is defined as tall.
    An adult female is chosen at random. Given that she is tall,
  • find the probability that she has a height greater than 180 cm . Half of tall adult females have a height greater than \(h \mathrm {~cm}\).
  • Find the value of \(h\).
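The conditional-probability parts of this question can be checked numerically with the standard-library `statistics.NormalDist`. This is a sketch rather than the table-based method; answers found with rounded z-values may differ in the last digit.

```python
from statistics import NormalDist

height = NormalDist(mu=160, sigma=8)

# P(X > 170): the probability of being "tall".
p_tall = 1 - height.cdf(170)

# P(X > 180 | X > 170) = P(X > 180) / P(X > 170), since X > 180 implies X > 170.
p_over_180_given_tall = (1 - height.cdf(180)) / p_tall

# Half of tall females exceed h, so P(X > h) = P(X > 170) / 2.
h = height.inv_cdf(1 - p_tall / 2)

print(round(p_tall, 4), round(p_over_180_given_tall, 4), round(h, 1))
```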
    8. For the events \(A\) and \(B\), $$\mathrm { P } \left( A ^ { \prime } \cap B \right) = 0.22 \text { and } \mathrm { P } \left( A ^ { \prime } \cap B ^ { \prime } \right) = 0.18$$
  • Find \(\mathrm { P } ( A )\).
  • Find \(\mathrm { P } ( A \cup B )\). Given that \(\mathrm { P } ( A \mid B ) = 0.6\),
  • find \(\mathrm { P } ( A \cap B )\).
  • Determine whether or not \(A\) and \(B\) are independent.
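A short sketch of the final question using exact fractions. Writing \(x = \mathrm{P}(A \cap B)\), the condition \(\mathrm{P}(A \mid B) = 0.6\) gives \(x = 0.6\,(x + 0.22)\), which is solved below; the variable names are illustrative.

```python
from fractions import Fraction

p_not_a_and_b = Fraction(22, 100)
p_not_a_and_not_b = Fraction(18, 100)

# P(A') is the sum of the two disjoint pieces of A'.
p_a = 1 - (p_not_a_and_b + p_not_a_and_not_b)

# A or B is the complement of "neither A nor B".
p_a_or_b = 1 - p_not_a_and_not_b

# P(A | B) = c with P(B) = x + P(A' and B) gives x = c * P(A' and B) / (1 - c).
c = Fraction(6, 10)
p_a_and_b = c * p_not_a_and_b / (1 - c)
p_b = p_a_and_b + p_not_a_and_b

# A and B are independent exactly when P(A and B) = P(A) P(B).
independent = p_a_and_b == p_a * p_b

print(p_a, p_a_or_b, p_a_and_b, p_b, independent)  # 3/5 41/50 33/100 11/20 True
```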