Questions — Edexcel (9670 questions)

Edexcel S1 Q5
5. The following grouped frequency distribution summarises the number of minutes, to the nearest minute, that a random sample of 200 motorists were delayed by roadworks on a stretch of motorway.
| Delay (mins) | Number of motorists |
| --- | --- |
| \(4 - 6\) | 15 |
| \(7 - 8\) | 28 |
| 9 | 49 |
| 10 | 53 |
| \(11 - 12\) | 30 |
| \(13 - 15\) | 15 |
| \(16 - 20\) | 10 |
  1. Using graph paper represent these data by a histogram.
  2. Give a reason to justify the use of a histogram to represent these data.
  3. Use interpolation to estimate the median of this distribution.
  4. Calculate an estimate of the mean and an estimate of the standard deviation of these data. One coefficient of skewness is given by $$\frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } } .$$
  5. Evaluate this coefficient for the above data.
  6. Explain why the normal distribution may not be suitable to model the number of minutes that motorists are delayed by these roadworks.
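Parts 3 to 5 can be checked numerically with a short sketch. It assumes class boundaries at the half-minute (3.5, 6.5, and so on) because delays are recorded to the nearest minute, and it uses class midpoints for the mean and standard deviation. The function and variable names are illustrative only and are not part of the question.

```python
from math import sqrt

# (lower boundary, upper boundary, frequency); boundaries assume
# "to the nearest minute", so the class 4-6 covers 3.5 to 6.5.
classes = [
    (3.5, 6.5, 15),
    (6.5, 8.5, 28),
    (8.5, 9.5, 49),
    (9.5, 10.5, 53),
    (10.5, 12.5, 30),
    (12.5, 15.5, 15),
    (15.5, 20.5, 10),
]

def grouped_median(classes):
    """Estimate the median by linear interpolation inside the median class."""
    n = sum(f for _, _, f in classes)
    target = n / 2
    cumulative = 0
    for lower, upper, f in classes:
        if cumulative + f >= target:
            return lower + (target - cumulative) / f * (upper - lower)
        cumulative += f
    raise ValueError("empty distribution")

def grouped_mean_sd(classes):
    """Estimate the mean and standard deviation from class midpoints."""
    n = sum(f for _, _, f in classes)
    sum_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)
    mean = sum_fx / n
    return mean, sqrt(sum_fx2 / n - mean ** 2)

median = grouped_median(classes)
mean, sd = grouped_mean_sd(classes)
skew = 3 * (mean - median) / sd
print(round(median, 2), round(mean, 3), round(sd, 2), round(skew, 2))
```

Under these assumptions the coefficient comes out positive, which is the kind of evidence part 6 asks about: a symmetric normal model would give a value close to zero.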
Edexcel S1 Q6
6. A local authority is investigating the cost of reconditioning its incinerators. Data from 10 randomly chosen incinerators were collected. The variables monitored were the operating time \(x\) (in thousands of hours) since last reconditioning and the reconditioning cost \(y\) (in \(\pounds 1000\) ). None of the incinerators had been used for more than 3000 hours since last reconditioning. The data are summarised below, $$\Sigma x = 25.0 , \Sigma x ^ { 2 } = 65.68 , \Sigma y = 50.0 , \Sigma y ^ { 2 } = 260.48 , \Sigma x y = 130.64$$
  1. Find \(\mathrm { S } _ { x x } , \mathrm {~S} _ { x y } , \mathrm {~S} _ { y y }\).
  2. Calculate the product moment correlation coefficient between \(x\) and \(y\).
  3. Explain why this value might support the fitting of a linear regression model of the form \(y = a + b x\).
  4. Find the values of \(a\) and \(b\).
  5. Give an interpretation of \(a\).
  6. Estimate
    1. the reconditioning cost for an operating time of 2400 hours,
    2. the financial effect of an increase of 1500 hours in operating time.
  7. Suggest why the authority might be cautious about making a prediction of the reconditioning cost of an incinerator which had been operating for 4500 hours since its last reconditioning.

    Paper Reference(s): 6683
    \section*{Edexcel GCE
    Statistics S1
    (New Syllabus)
    Advanced/Advanced Subsidiary
    Tuesday 12 June 2001 - Afternoon
    Time: 1 hour 30 minutes}
    Materials required for examination: Answer Book (AB16), Graph Paper (ASG2), Mathematical Formulae (Lilac)
    Items included with question papers: Nil
    Candidates may use any calculator EXCEPT those with the facility for symbolic algebra, differentiation and/or integration. Thus candidates may NOT use calculators such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP 48G.
    In the boxes on the answer book, write the name of the examining body (Edexcel), your centre number, candidate number, the unit title (Statistics S1), the paper reference (6683), your surname, other name and signature.
    Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.
    Full marks may be obtained for answers to ALL questions.
    This paper has seven questions. Pages 6, 7 and 8 are blank. You must ensure that your answers to parts of questions are clearly labelled.
    You must show sufficient working to make your methods clear to the Examiner. Answers without working may gain no credit.
    1. Each of the 25 students on a computer course recorded the number of minutes \(x\), to the nearest minute, spent surfing the internet during a given day. The results are summarised below.
    $$\Sigma x = 1075 , \Sigma x ^ { 2 } = 46625$$
  8. Find \(\mu\) and \(\sigma\) for these data. Two other students surfed the internet on the same day for 35 and 51 minutes respectively.
  9. Without further calculation, explain the effect on the mean of including these two students.
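The summary statistics in this question can be turned into \(\mu\) and \(\sigma\) with a couple of lines. This is a minimal sketch that treats the 25 students as the whole population (dividing by \(n\)); the variable names are illustrative.

```python
from math import sqrt

n = 25
sum_x = 1075
sum_x2 = 46625

mu = sum_x / n                        # mean
sigma = sqrt(sum_x2 / n - mu ** 2)    # standard deviation, dividing by n

# The two extra students surfed for 35 and 51 minutes.
new_mean = (sum_x + 35 + 51) / (n + 2)
print(mu, sigma, new_mean)
```

The two extra times average to the existing mean, which is why the mean is unchanged when they are included.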
    2. On a particular day in summer 1993 at 0800 hours the height above sea level, \(x\) metres, and the temperature, \(y ^ { \circ } \mathrm { C }\), were recorded in 10 Mediterranean towns. The following summary statistics were calculated from the results. $$\Sigma x = 7300 , \Sigma x ^ { 2 } = 6599600 , S _ { x y } = - 13060 , S _ { y y } = 140.9 .$$
  10. Find \(S _ { x x }\).
  11. Calculate, to 3 significant figures, the product moment correlation coefficient between \(x\) and \(y\).
  12. Give an interpretation of your coefficient.
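A small sketch of the calculation for this question, using only the summary statistics given. It applies \(S_{xx} = \Sigma x^2 - (\Sigma x)^2 / n\) and \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\); the variable names are illustrative.

```python
from math import sqrt

n = 10
sum_x, sum_x2 = 7300, 6599600
s_xy, s_yy = -13060, 140.9

s_xx = sum_x2 - sum_x ** 2 / n
r = s_xy / sqrt(s_xx * s_yy)
print(s_xx, round(r, 3))
```

A value of \(r\) close to \(-1\) indicates a strong negative linear relationship: in this sample, towns at greater height tended to have lower temperatures.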
    3. The continuous random variable \(Y\) is normally distributed with mean 100 and variance 256 .
  13. Find \(\mathrm { P } ( Y < 80 )\).
  14. Find \(k\) such that \(\mathrm { P } ( 100 - k \leq Y \leq 100 + k ) = 0.516\).
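Both parts can be checked with `statistics.NormalDist` from the Python standard library (available from Python 3.8). The sketch below is a numerical check rather than the table-based method an examiner would expect; the variable names are illustrative.

```python
from statistics import NormalDist

Y = NormalDist(mu=100, sigma=16)    # variance 256, so sigma = 16

p_below_80 = Y.cdf(80)

# The interval 100 - k to 100 + k is symmetric about the mean, so
# P(Y <= 100 + k) = 0.5 + 0.516 / 2 = 0.758.
k = Y.inv_cdf(0.5 + 0.516 / 2) - 100
print(round(p_below_80, 4), round(k, 1))
```

Answers read from statistical tables may differ slightly in the last digit because \(z\) is rounded to two decimal places there.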
    4. The discrete random variable \(X\) has the probability function shown in the table below.
    | \(x\) | \(-2\) | \(-1\) | 0 | 1 | 2 | 3 |
    | --- | --- | --- | --- | --- | --- | --- |
    | \(\mathrm { P } ( X = x )\) | 0.1 | \(\alpha\) | 0.3 | 0.2 | 0.1 | 0.1 |
    Find
  15. \(\alpha\),
  16. \(\mathrm { P } ( - 1 < X \leq 2 )\),
  17. \(\mathrm { F } ( - 0.4 )\),
  18. \(\mathrm { E } ( 3 X + 4 )\),
  19. \(\operatorname { Var } ( 2 X + 3 )\).
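The parts of this question follow directly from the probability table, and exact fractions avoid rounding noise. The sketch below assumes the table reads \(x = -2, -1, 0, 1, 2, 3\) with probabilities \(0.1, \alpha, 0.3, 0.2, 0.1, 0.1\); the helper `expect` and the other names are illustrative.

```python
from fractions import Fraction as Frac

xs = [-2, -1, 0, 1, 2, 3]
known = {-2: Frac(1, 10), 0: Frac(3, 10), 1: Frac(2, 10),
         2: Frac(1, 10), 3: Frac(1, 10)}

alpha = 1 - sum(known.values())     # probabilities must sum to 1
p = {**known, -1: alpha}

def expect(g):
    """E[g(X)] for the discrete distribution p."""
    return sum(g(x) * p[x] for x in xs)

prob_range = sum(p[x] for x in xs if -1 < x <= 2)    # P(-1 < X <= 2)
cdf_at = sum(p[x] for x in xs if x <= -0.4)          # F(-0.4) = P(X <= -0.4)

mean = expect(lambda x: x)
var = expect(lambda x: x * x) - mean ** 2

e_lin = 3 * mean + 4       # E(3X + 4) = 3E(X) + 4
var_lin = 2 ** 2 * var     # Var(2X + 3) = 4Var(X)
print(alpha, prob_range, cdf_at, e_lin, var_lin)
```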
    5. A market researcher asked 100 adults which of the three newspapers \(A\), \(B\), \(C\) they read. The results showed that 30 read \(A\), 26 read \(B\), 21 read \(C\), 5 read both \(A\) and \(B\), 7 read both \(B\) and \(C\), 6 read both \(C\) and \(A\) and 2 read all three.
  20. Draw a Venn diagram to represent these data. One of the adults is then selected at random.
    Find the probability that she reads
  21. at least one of the newspapers,
  22. only \(A\),
  23. only one of the newspapers,
  24. \(A\) given that she reads only one newspaper.
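The Venn diagram regions can be filled in from the centre outwards, and the probabilities then follow by counting. The sketch below does this with the figures from the question; the region names such as `only_ab` are illustrative.

```python
from fractions import Fraction

total = 100
n_a, n_b, n_c = 30, 26, 21
n_ab, n_bc, n_ca, n_abc = 5, 7, 6, 2

# Start from the centre: readers of exactly two papers.
only_ab = n_ab - n_abc
only_bc = n_bc - n_abc
only_ca = n_ca - n_abc

# Readers of exactly one paper.
only_a = n_a - only_ab - only_ca - n_abc
only_b = n_b - only_ab - only_bc - n_abc
only_c = n_c - only_bc - only_ca - n_abc

at_least_one = (only_a + only_b + only_c
                + only_ab + only_bc + only_ca + n_abc)
exactly_one = only_a + only_b + only_c

p_at_least_one = Fraction(at_least_one, total)
p_only_a = Fraction(only_a, total)
p_exactly_one = Fraction(exactly_one, total)
p_a_given_one = Fraction(only_a, exactly_one)    # conditional probability
print(p_at_least_one, p_only_a, p_exactly_one, p_a_given_one)
```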
    6. Three swimmers Alan, Diane and Gopal record the number of lengths of the swimming pool they swim during each practice session over several weeks. The stem and leaf diagram below shows the results for Alan.
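    Key: \(2 \mid 0\) means 20

    | Stem | Lengths (leaves) | Total |
    | --- | --- | --- |
    | 2 | 0 1 2 2 | (4) |
    | 2 | 5 5 6 7 7 8 9 | (7) |
    | 3 | 0 1 2 2 4 | (5) |
    | 3 | 5 6 6 7 9 | (5) |
    | 4 | 0 1 3 3 3 3 3 4 4 4 | (10) |
    | 4 | 5 5 6 6 6 7 7 8 8 9 9 9 | (12) |
    | 5 | 0 0 0 | (3) |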
  25. Find the three quartiles for Alan's results. The table below summarises the results for Diane and Gopal.
    | | Diane | Gopal |
    | --- | --- | --- |
    | Smallest value | 35 | 25 |
    | Lower quartile | 37 | 34 |
    | Median | 42 | 42 |
    | Upper quartile | 53 | 50 |
    | Largest value | 65 | 57 |
  26. Using the same scale and on the same sheet of graph paper draw box plots to represent the data for Alan, Diane and Gopal.
  27. Compare and contrast the three box plots.
Edexcel S1 Q7
7. A music teacher monitored the sight-reading ability of one of her pupils over a 10 week period. At the end of each week, the pupil was given a new piece to sight-read and the teacher noted the number of errors \(y\). She also recorded the number of hours \(x\) that the pupil had practised each week. The data are shown in the table below.
  1. Given that \(\mathrm { E } ( X ) = - 0.2\), find the value of \(\alpha\) and the value of \(\beta\).
  2. Write down \(\mathrm { F } ( 0.8 )\).
  3. Evaluate \(\operatorname { Var } ( X )\). Find the value of
  4. \(\mathrm { E } ( 3 X - 2 )\),
  5. \(\operatorname { Var } ( 2 X + 6 )\).
    7. The following stem and leaf diagram shows the aptitude scores \(x\) obtained by all the applicants for a particular job. Key: 3 | 1 means 31
    \begin{table}[h]
    \captionsetup{labelformat=empty} \caption{Aptitude score}
    | Stem | Leaves |
    | --- | --- |
    | 3 | 1 2 9 |
    | 4 | 2 4 6 8 9 |
    | 5 | 1 3 3 5 6 7 9 |
    | 6 | 0 1 3 3 3 5 6 8 8 9 |
    | 7 | 1 2 2 2 4 5 5 5 6 8 8 8 8 9 |
    | 8 | 0 1 2 3 5 8 8 9 |
    | 9 | 0 1 2 |
    \end{table}
  6. Write down the modal aptitude score.
  7. Find the three quartiles for these data. Outliers can be defined to be outside the limits \(\mathrm { Q } _ { 1 } - 1.0 \left( \mathrm { Q } _ { 3 } - \mathrm { Q } _ { 1 } \right)\) and \(\mathrm { Q } _ { 3 } + 1.0 \left( \mathrm { Q } _ { 3 } - \mathrm { Q } _ { 1 } \right)\).
  8. On a graph paper, draw a box plot to represent these data. For these data, \(\Sigma x = 3363\) and \(\Sigma x ^ { 2 } = 238305\).
  9. Calculate, to 2 decimal places, the mean and the standard deviation for these data.
  10. Use two different methods to show that these data are negatively skewed.

    END

    \section*{Advanced/Advanced Subsidiary} \section*{Wednesday 15 January 2003 - Morning}
    1. The total amount of time a secretary spent on the telephone in a working day was recorded to the nearest minute. The data collected over 40 days are summarised in the table below.
    Draw a histogram to illustrate these data.
    2. The lifetimes of batteries used for a computer game have a mean of 12 hours and a standard deviation of 3 hours. Battery lifetimes may be assumed to be normally distributed. Find the lifetime, \(t\) hours, of a battery such that 1 battery in 5 will have a lifetime longer than \(t\).
    3. A company owns two petrol stations \(P\) and \(Q\) along a main road. Total daily sales in the same week for \(P ( \pounds p )\) and for \(Q ( \pounds q )\) are summarised in the table below. A science teacher believes that students' marks in physics depend upon their mathematical ability. The teacher decides to investigate this relationship using the test marks.
  11. Write down which is the explanatory variable in this investigation.
  12. Draw a scatter diagram to illustrate these data.
  13. Showing your working, find the equation of the regression line of \(p\) on \(m\).
  14. Draw the regression line on your scatter diagram. A ninth student was absent for the physics test, but she sat the mathematics test and scored 15 .
  15. Using this model, estimate the mark she would have scored in the physics test.

    \section*{END}

    Paper Reference(s): 6683
    \section*{Advanced/Advanced Subsidiary} \section*{Tuesday 4 November 2003 - Morning}
    This paper has six questions.
    1. A company wants to pay its employees according to their performance at work. The performance score \(x\) and the annual salary, \(y\) in \(\pounds 100\) s, for a random sample of 10 of its employees for last year were recorded. The results are shown in the table below.
    [You may use \(\Sigma h ^ { 2 } = 272094 , \Sigma c ^ { 2 } = 2878966 , \Sigma h c = 884484\)]
  16. Draw a scatter diagram to illustrate these data.
  17. Find exact values of \(S _ { h c }\), \(S _ { h h }\) and \(S _ { c c }\).
  18. Calculate the value of the product moment correlation coefficient for these data.
  19. Give an interpretation of your correlation coefficient.
  20. Calculate the equation of the regression line of \(c\) on \(h\) in the form \(c = a + b h\).
  21. Estimate the level of confidence of a person of height 180 cm .
  22. State the range of values of \(h\) for which estimates of \(c\) are reliable.
    3. A discrete random variable \(X\) has a probability function as shown in the table below, where \(a\) and \(b\) are constants. Key: \(0 \quad | 18 | 4\) means 180 for Keith and 184 for Asif
    The quartiles for these two distributions are summarised in the table below. \(\quad 00\) means 10 Totals
  23. Find the three quartiles of these data. During the same month, the least number of caravans on Northcliffe caravan site was 31. The maximum number of caravans on this site on any night that month was 72 . The three quartiles for this site were 38, 45 and 52 respectively.
  24. On graph paper and using the same scale, draw box plots to represent the data for both caravan sites. You may assume that there are no outliers.
  25. Compare and contrast these two box plots.
  26. Give an interpretation to the upper quartiles of these two distributions.
    3. The following table shows the height \(x\), to the nearest cm , and the weight \(y\), to the nearest kg , of a random sample of 12 students.
  27. Give a reason to justify the use of a histogram to represent these data.
  28. Calculate the frequency densities needed to draw a histogram for these data.
    (DO NOT DRAW THE HISTOGRAM)
  29. Use interpolation to estimate the median \(Q _ { 2 }\), the lower quartile \(Q _ { 1 }\), and the upper quartile \(Q _ { 3 }\) of these data. The mid-point of each class is represented by \(x\) and the corresponding frequency by \(f\). Calculations then give the following values $$\sum f x = 8379.5 \text { and } \sum f x ^ { 2 } = 557489.75$$
  30. Calculate an estimate of the mean and an estimate of the standard deviation for these data. One coefficient of skewness is given by $$\frac { Q _ { 3 } - 2 Q _ { 2 } + Q _ { 1 } } { Q _ { 3 } - Q _ { 1 } } .$$
  31. Evaluate this coefficient and comment on the skewness of these data.
  32. Give another justification of your comment in part (e).
    3. A long distance lorry driver recorded the distance travelled, \(m\) miles, and the amount of fuel used, \(f\) litres, each day. Summarised below are data from the driver's records for a random sample of 8 days. The data are coded such that \(x = m - 250\) and \(y = f - 100\). $$\sum x = 130 \quad \sum y = 48 \quad \sum x y = 8880 \quad S _ { x x } = 20487.5$$
  33. Find the equation of the regression line of \(y\) on \(x\) in the form \(y = a + b x\).
  34. Hence find the equation of the regression line of \(f\) on \(m\).
  35. Predict the amount of fuel used on a journey of 235 miles.
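The coded regression in this question can be worked through as a sketch: fit \(y = a + bx\) from the summary statistics, then substitute \(x = m - 250\) and \(y = f - 100\) to recover the line of \(f\) on \(m\). The variable names below are illustrative.

```python
n = 8
sum_x, sum_y, sum_xy = 130, 48, 8880
s_xx = 20487.5

# Regression of y on x in coded units.
s_xy = sum_xy - sum_x * sum_y / n
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

# Decode: f - 100 = a + b(m - 250), so f = (100 + a - 250b) + b m.
slope_fm = b
intercept_fm = 100 + a - 250 * b

fuel_235 = intercept_fm + slope_fm * 235
print(round(a, 3), round(b, 3), round(intercept_fm, 3), round(fuel_235, 1))
```

Subtracting a constant from each variable shifts the intercept but leaves the gradient unchanged, which is why `slope_fm` equals `b`.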
    4. Aeroplanes fly from City \(A\) to City \(B\). Over a long period of time the number of minutes delay in take-off from City \(A\) was recorded. The minimum delay was 5 minutes and the maximum delay was 63 minutes. A quarter of all delays were at most 12 minutes, half were at most 17 minutes and \(75 \%\) were at most 28 minutes. Only one of the delays was longer than 45 minutes. An outlier is an observation that falls either \(1.5 \times\) (interquartile range) above the upper quartile or \(1.5 \times\) (interquartile range) below the lower quartile.
  36. On graph paper, draw a box plot to represent these data.
  37. Comment on the distribution of delays. Justify your answer.
  38. Suggest how the distribution might be interpreted by a passenger who frequently flies from City \(A\) to City \(B\).
    5. The random variable \(X\) has probability function $$\mathrm { P } ( X = x ) = \begin{cases} k x , & x = 1,2,3 \\ k ( x + 1 ) , & x = 4,5 \end{cases}$$ where \(k\) is a constant.
  39. Find the value of \(k\).
  40. Find the exact value of \(\mathrm { E } ( X )\).
  41. Show that, to 3 significant figures, \(\operatorname { Var } ( X ) = 1.47\).
  42. Find, to 1 decimal place, \(\operatorname { Var } ( 4 - 3 X )\).
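A sketch of this question using exact fractions: the constant \(k\) comes from requiring the probabilities to sum to 1, and the variance of a linear function uses \(\operatorname{Var}(aX + b) = a^2 \operatorname{Var}(X)\). The helper `weight` is an illustrative name for the multiplier of \(k\) in each case.

```python
from fractions import Fraction

def weight(x):
    """Multiplier of k in P(X = x): x for x = 1, 2, 3 and x + 1 for x = 4, 5."""
    return x if x in (1, 2, 3) else x + 1

xs = [1, 2, 3, 4, 5]
k = Fraction(1, sum(weight(x) for x in xs))    # probabilities sum to 1
p = {x: k * weight(x) for x in xs}

mean = sum(x * p[x] for x in xs)
var = sum(x * x * p[x] for x in xs) - mean ** 2
var_lin = (-3) ** 2 * var                      # Var(4 - 3X) = 9 Var(X)
print(k, mean, round(float(var), 2), round(float(var_lin), 1))
```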
    6. A scientist found that the time taken, \(M\) minutes, to carry out an experiment can be modelled by a normal random variable with mean 155 minutes and standard deviation 3.5 minutes. Find
  43. \(\mathrm { P } ( M > 160 )\),
  44. \(\mathrm { P } ( 150 \leq M \leq 157 )\),
  45. the value of \(m\), to 1 decimal place, such that \(\mathrm { P } ( M \leq m ) = 0.30\).
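The three parts of this question can be checked numerically with `statistics.NormalDist` from the standard library. This is a sketch for verification only; table-based working rounds \(z\) to two decimal places, so the third or fourth decimal place may differ slightly.

```python
from statistics import NormalDist

M = NormalDist(mu=155, sigma=3.5)

p_over_160 = 1 - M.cdf(160)              # P(M > 160)
p_between = M.cdf(157) - M.cdf(150)      # P(150 <= M <= 157)
m = M.inv_cdf(0.30)                      # value with P(M <= m) = 0.30
print(round(p_over_160, 4), round(p_between, 4), round(m, 1))
```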
    7. In a school there are 148 students in Years 12 and 13 studying Science, Humanities or Arts subjects. Of these students, 89 wear glasses and the others do not. There are 30 Science students of whom 18 wear glasses. The corresponding figures for the Humanities students are 68 and 44 respectively. A student is chosen at random.
    Find the probability that this student
  46. is studying Arts subjects,
  47. does not wear glasses, given that the student is studying Arts subjects. Amongst the Science students, \(80 \%\) are right-handed. Corresponding percentages for Humanities and Arts students are \(75 \%\) and \(70 \%\) respectively. A student is again chosen at random.
  48. Find the probability that this student is right-handed.
  49. Given that this student is right-handed, find the probability that the student is studying Science subjects.

    \section*{TOTAL FOR PAPER:75 MARKS}

    Paper Reference(s): 6683/01
    \section*{Advanced/Advanced Subsidiary} Monday 16 January 2006 - Morning
    Time: 1 hour 30 minutes
    Materials required for examination: Mathematical Formulae (Green or Lilac)
    Items included with question papers: Nil
    The marks for individual questions and the parts of questions are shown in round brackets: e.g. (2). There are 7 questions on this paper. The total mark for this paper is 75. You must ensure that your answers to parts of questions are clearly labelled.
    1. Over a period of time, the number of people \(x\) leaving a hotel each morning was recorded. These data are summarised in the stem and leaf diagram below.
    Two of the conversations were chosen at random.
  50. Find the probability that both of them were longer than 24.5 minutes. The mid-point of each class was represented by \(x\) and its corresponding frequency by \(f\), giving \(\sum f x = 1060\).
  51. Calculate an estimate of the mean time spent on their conversations. During the following 25 weeks they monitored their weekly conversation and found that at the end of the 80 weeks their overall mean length of conversation was 21 minutes.
  52. Find the mean time spent in conversation during these 25 weeks.
  53. Comment on these two mean values.
    3. A metallurgist measured the length, \(l \mathrm {~mm}\), of a copper rod at various temperatures, \(t ^ { \circ } \mathrm { C }\), and recorded the following results. (You may use: \(\sum x = 315 , \sum x ^ { 2 } = 15225 , \sum y = 620 , \sum y ^ { 2 } = 56550 , \sum x y = 28750\) )
  54. Draw a scatter diagram to represent these data.
  55. Show that \(S _ { x y } = 4337.5\) and find \(S _ { x x }\). The student believes that a linear relationship of the form \(y = a + b x\) could be used to describe these data.
  56. Use linear regression to find the value of \(a\) and the value of \(b\), giving your answers to 1 decimal place.
  57. Draw the regression line on your diagram. The student believes that one brand of chocolate is overpriced.
  58. Use the scatter diagram to
    1. state which brand is overpriced,
    2. suggest a fair price for this brand. Give reasons for both your answers.
    4. A survey of the reading habits of some students revealed that, on a regular basis, \(25 \%\) read quality newspapers, \(45 \%\) read tabloid newspapers and \(40 \%\) do not read newspapers at all.
  59. Find the proportion of students who read both quality and tabloid newspapers.
  60. Draw a Venn diagram to represent this information. A student is selected at random. Given that this student reads newspapers on a regular basis,
  61. find the probability that this student only reads quality newspapers.
    5. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e8afd947-55ac-424b-8db5-d5aa856ef4d7-046_494_926_258_1706} \captionsetup{labelformat=empty} \caption{Figure 2}
    \end{figure} Figure 2 shows a histogram for the variable \(t\) which represents the time taken, in minutes, by a group of people to swim 500 m .
  62. Copy and complete the frequency table for \(t\). $$\text { [You may use } \sum x ^ { 2 } = 60475 , \sum y ^ { 2 } = 53122 , \sum x y = 56076 \text { ] }$$
  63. Showing your working clearly, calculate the product moment correlation coefficient between the interview test and the performance after one year. The product moment correlation coefficient between the skills assessment and the performance after one year is - 0.156 to 3 significant figures.
  64. Use your answer to part (a) to comment on whether or not the interview test and skills assessment are a guide to the performance after one year. Give clear reasons for your answers.
    2. Cotinine is a chemical that is made by the body from nicotine which is found in cigarette smoke. A doctor tested the blood of 12 patients, who claimed to smoke a packet of cigarettes a day, for cotinine. The results, in appropriate units, are shown below. For the Balmoral Hotel,
  65. write down the mode of the age of the residents,
  66. find the values of the lower quartile, the median and the upper quartile.
    1. Find the mean, \(\bar { x }\), of the age of the residents.
    2. Given that \(\sum x ^ { 2 } = 81213\), find the standard deviation of the age of the residents. One measure of skewness is found using $$\frac { \text { mean } - \text { mode } } { \text { standard deviation } }$$
  67. Evaluate this measure for the Balmoral Hotel. For the Abbey Hotel, the mode is 39 , the mean is 33.2 , the standard deviation is 12.7 and the measure of skewness is - 0.454 .
  68. Compare the two age distributions of the residents of each hotel.
    3. The random variable \(X\) has probability distribution given in the table below. A histogram was drawn and the bar representing the \(10 - 15\) class has a width of 2 cm and a height of 5 cm . For the 16-18 class find
  69. the width,
  70. the height
    of the bar representing this class.
    4. A researcher measured the foot lengths of a random sample of 120 ten-year-old children. The lengths are summarised in the table below.
    | Foot length, \(l\) (cm) | Number of children |
    | --- | --- |
    | \(10 \leq l < 12\) | 5 |
    | \(12 \leq l < 17\) | 53 |
    | \(17 \leq l < 19\) | 29 |
    | \(19 \leq l < 21\) | 15 |
    | \(21 \leq l < 23\) | 11 |
    | \(23 \leq l < 25\) | 7 |
  71. Use interpolation to estimate the median of this distribution.
  72. Calculate estimates for the mean and the standard deviation of these data. One measure of skewness is given by $$\text { Coefficient of skewness } = \frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }$$
  73. Evaluate this coefficient and comment on the skewness of these data. Greg suggests that a normal distribution is a suitable model for the foot lengths of ten-year-old children.
  74. Using the value found in part (c), comment on Greg's suggestion, giving a reason for your answer.
    5. The weight, \(w\) grams, and the length, \(l \mathrm {~mm}\), of 10 randomly selected newborn turtles are given in the table below.
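    | \(l\) | 49.0 | 52.0 | 53.0 | 54.5 | 54.1 | 53.4 | 50.0 | 51.6 | 49.5 | 51.2 |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | \(w\) | 29 | 32 | 34 | 39 | 38 | 35 | 30 | 31 | 29 | 30 |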
    $$\text { (You may use } S _ { l l } = 33.381 \quad S _ { w l } = 59.99 \quad S _ { w w } = 120.1 \text { ) }$$
  75. Find the equation of the regression line of \(w\) on \(l\) in the form \(w = a + b l\).
  76. Use your regression line to estimate the weight of a newborn turtle of length 60 mm .
  77. Comment on the reliability of your estimate giving a reason for your answer.
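The turtle regression can be reproduced from the raw table as a sketch. It recomputes \(S_{ll}\) and \(S_{wl}\) from the ten data pairs, which should agree with the values quoted in the question, then fits \(w = a + bl\) and checks whether 60 mm lies inside the observed range of lengths. The variable names are illustrative.

```python
lengths = [49.0, 52.0, 53.0, 54.5, 54.1, 53.4, 50.0, 51.6, 49.5, 51.2]
weights = [29, 32, 34, 39, 38, 35, 30, 31, 29, 30]
n = len(lengths)

l_bar = sum(lengths) / n
w_bar = sum(weights) / n

# Summary statistics recomputed from the raw data.
s_ll = sum(l * l for l in lengths) - sum(lengths) ** 2 / n
s_wl = (sum(w * l for w, l in zip(weights, lengths))
        - sum(weights) * sum(lengths) / n)

b = s_wl / s_ll              # gradient of w on l
a = w_bar - b * l_bar        # intercept

estimate_60 = a + b * 60
in_range = min(lengths) <= 60 <= max(lengths)
print(round(a, 2), round(b, 3), round(estimate_60, 1), in_range)
```

Because 60 mm is outside the range of lengths in the sample, the estimate is an extrapolation, which bears on the reliability comment asked for in the final part.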
    6. The discrete random variable \(X\) has probability function $$\mathrm { P } ( X = x ) = \left\{ \begin{array} { c l } a ( 3 - x ) & x = 0,1,2 \\ b & x = 3 \end{array} \right.$$
  78. Find \(\mathrm { P } ( X = 2 )\) and copy and complete the table below.
    | \(x\) | 0 | 1 | 2 | 3 |
    | --- | --- | --- | --- | --- |
    | \(\mathrm { P } ( X = x )\) | \(3 a\) | \(2 a\) | | \(b\) |
    Given that \(\mathrm { E } ( X ) = 1.6\),
  79. find the value of \(a\) and the value of \(b\). Find
  80. \(\mathrm { P } ( 0.5 < X < 3 )\),
  81. \(\mathrm { E } ( 3 X - 2 )\).
  82. Show that the \(\operatorname { Var } ( X ) = 1.64\)
  83. Calculate \(\operatorname { Var } ( 3 X - 2 )\).
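A sketch of how the two constants can be found: the probabilities sum to 1, giving \(6a + b = 1\), and \(\mathrm{E}(X) = 1.6\) gives \(4a + 3b = 1.6\). The code below solves this pair by Cramer's rule with exact fractions and then evaluates the remaining parts; the names are illustrative.

```python
from fractions import Fraction

# 6a + b = 1          (probabilities sum to 1)
# 4a + 3b = 8/5       (E(X) = 0*3a + 1*2a + 2*a + 3*b = 1.6)
a11, a12, c1 = 6, 1, Fraction(1)
a21, a22, c2 = 4, 3, Fraction(8, 5)

det = a11 * a22 - a12 * a21
a = (c1 * a22 - a12 * c2) / det
b = (a11 * c2 - c1 * a21) / det

p = {0: 3 * a, 1: 2 * a, 2: a, 3: b}

mean = sum(x * px for x, px in p.items())
var = sum(x * x * px for x, px in p.items()) - mean ** 2

p_mid = sum(px for x, px in p.items() if 0.5 < x < 3)   # P(0.5 < X < 3)
e_lin = 3 * mean - 2                                     # E(3X - 2)
var_lin = 3 ** 2 * var                                   # Var(3X - 2)
print(a, b, p_mid, e_lin, float(var), float(var_lin))
```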
    7. (a) Given that \(\mathrm { P } ( A ) = a\) and \(\mathrm { P } ( B ) = b\) express \(\mathrm { P } ( A \cup B )\) in terms of \(a\) and \(b\) when
    1. \(A\) and \(B\) are mutually exclusive,
    2. \(A\) and \(B\) are independent. Two events \(R\) and \(Q\) are such that $$\mathrm { P } ( R \cap Q ^ { \prime } ) = 0.15 , \quad \mathrm { P } ( Q ) = 0.35 \quad \text { and } \quad \mathrm { P } ( R \mid Q ) = 0.1$$ Find the value of
  84. \(\mathrm { P } ( R \cup Q )\),
  85. \(\mathrm { P } ( R \cap Q )\),
  86. \(\mathrm { P } ( R )\).
Edexcel S1 Q8
8. The lifetimes of bulbs used in a lamp are normally distributed. A company \(X\) sells bulbs with a mean lifetime of 850 hours and a standard deviation of 50 hours.
  1. Find the probability of a bulb, from company \(X\), having a lifetime of less than 830 hours.
  2. In a box of 500 bulbs, from company \(X\), find the expected number having a lifetime of less than 830 hours. A rival company \(Y\) sells bulbs with a mean lifetime of 860 hours and \(20 \%\) of these bulbs have a lifetime of less than 818 hours.
  3. Find the standard deviation of the lifetimes of bulbs from company \(Y\). Both companies sell the bulbs for the same price.
  4. State which company you would recommend. Give reasons for your answer.

    Materials required for examination: Mathematical Formulae (Pink or Green)
    Items included with question papers: Nil
    \section*{Wednesday 13 January 2010 - Afternoon}
    Candidates may use any calculator allowed by the regulations of the Joint Council for Qualifications. Calculators must not have the facility for symbolic algebra manipulation, differentiation and integration, or have retrievable mathematical formulae stored in them.
    1. A jar contains 2 red, 1 blue and 1 green bead. Two beads are drawn at random from the jar without replacement.
    2. Draw a tree diagram to illustrate all the possible outcomes and associated probabilities. State your probabilities clearly.
    3. Find the probability that a blue bead and a green bead are drawn from the jar.
    4. The 19 employees of a company take an aptitude test. The scores out of 40 are illustrated in the stem and leaf diagram below.
    where \(a\) is a constant.
  5. Find the value of \(a\).
  6. Write down \(\mathrm { E } ( X )\).
  7. Find \(\operatorname { Var } ( X )\). The random variable \(Y = 6 - 2 X\).
  8. Find \(\operatorname { Var } ( Y )\).
  9. Calculate \(\mathrm { P } ( X \geq Y )\). Shivani selects a ball and spins the appropriate coin.
  10. Find the probability that she obtains a head. Given that Tom selected a ball at random and obtained a head when he spun the appropriate coin,
  11. find the probability that Tom selected a red ball. Shivani and Tom each repeat this experiment.
  12. Find the probability that the colour of the ball Shivani selects is the same as the colour of the ball Tom selects.
    4. The Venn diagram in Figure 1 shows the number of students in a class who read any of 3 popular magazines \(A , B\) and \(C\). \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e8afd947-55ac-424b-8db5-d5aa856ef4d7-064_264_615_360_443} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure} One of these students is selected at random.
  13. Show that the probability that the student reads more than one magazine is \(\frac { 1 } { 6 }\).
  14. Find the probability that the student reads \(A\) or \(B\) (or both).
  15. Write down the probability that the student reads both \(A\) and \(C\). Given that the student reads at least one of the magazines,
  16. find the probability that the student reads \(C\).
  17. Determine whether or not reading magazine \(B\) and reading magazine \(C\) are statistically independent.
    5. A teacher selects a random sample of 56 students and records, to the nearest hour, the time spent watching television in a particular week. where \(a , b\) and \(c\) are constants.
    The cumulative distribution function \(\mathrm { F } ( y )\) of \(Y\) is given in the following table.
  18. Estimate the number of motorists who were delayed between 8.5 and 13.5 minutes by the roadworks.
    (2)
    2. (a) State in words the relationship between two events \(R\) and \(S\) when \(\mathrm { P } ( R \cap S ) = 0\). The events \(A\) and \(B\) are independent with \(\mathrm { P } ( A ) = \frac { 1 } { 4 }\) and \(\mathrm { P } ( A \cup B ) = \frac { 2 } { 3 }\).
    Find
  19. \(\mathrm { P } ( B )\),
  20. \(\mathrm { P } \left( A ^ { \prime } \cap B \right)\),
  21. \(\mathrm { P } \left( B ^ { \prime } \mid A \right)\).
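One way to check answers to this kind of question is to work in exact fractions. The sketch below is only an illustration; the variable names are not from the paper, and it assumes the standard identity P(A or B) = P(A) + P(B) - P(A)P(B) for independent events.

```python
from fractions import Fraction

p_a = Fraction(1, 4)
p_a_or_b = Fraction(2, 3)

# Independence gives P(A or B) = P(A) + P(B) - P(A)P(B); rearrange for P(B).
p_b = (p_a_or_b - p_a) / (1 - p_a)

# A' and B are also independent, so their probabilities multiply.
p_not_a_and_b = (1 - p_a) * p_b

# Conditioning on A does not change the probability of B'.
p_not_b_given_a = 1 - p_b

print(p_b, p_not_a_and_b, p_not_b_given_a)
```

Working with `Fraction` keeps the results exact, which is convenient when the expected answers are fractions.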
    3. The discrete random variable \(X\) can take only the values \(2,3,4\) or 6 . For these values the probability distribution function is given by [You may use \(\sum p ^ { 2 } = 1967\) and \(\sum p t = 694\) ]
  22. On graph paper, draw a scatter diagram to represent these data.
  23. Explain why a linear regression model may be appropriate to describe the relationship between \(p\) and \(t\).
  24. Calculate the value of \(S _ { p t }\) and the value of \(S _ { p p }\).
  25. Find the equation of the regression line of \(t\) on \(p\), giving your answer in the form \(t = a + b p\).
  26. Plot the point \(( \bar { p } , \bar { t } )\) and draw the regression line on your scatter diagram. The scientist reviews similar studies and finds that pollutant levels above 16 are likely to result in the death of a chick soon after hatching.
  27. Estimate the minimum thinning of the shell that is likely to result in the death of a chick.
    4. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e8afd947-55ac-424b-8db5-d5aa856ef4d7-077_401_741_296_1761} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure} Figure 1 shows how 25 people travelled to work.
    Their travel to work is represented by the events
    B bicycle
    \(T \quad\) train
    \(W\) walk
  28. Write down 2 of these events that are mutually exclusive. Give a reason for your answer.
  29. Determine whether or not \(B\) and \(T\) are independent events. One person is chosen at random.
    Find the probability that this person
  30. walks to work,
  31. travels to work by bicycle and train. Given that this person travels to work by bicycle,
  32. find the probability that they will also take the train.
    5. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e8afd947-55ac-424b-8db5-d5aa856ef4d7-078_618_812_301_315} \captionsetup{labelformat=empty} \caption{Figure 2}
    \end{figure} A policeman records the speed of the traffic on a busy road with a 30 mph speed limit.
    He records the speeds of a sample of 450 cars. The histogram in Figure 2 represents the results.
  33. Calculate the number of cars that were exceeding the speed limit by at least 5 mph in the sample.
  34. Estimate the value of the mean speed of the cars in the sample.
  35. Estimate, to 1 decimal place, the value of the median speed of the cars in the sample.
  36. Comment on the shape of the distribution. Give a reason for your answer.
  37. State, with a reason, whether the estimate of the mean or the median is a better representation of the average speed of the traffic on the road.
    6. The heights of an adult female population are normally distributed with mean 162 cm and standard deviation 7.5 cm .
  38. Find the probability that a randomly chosen adult female is taller than 150 cm .
    (3) Sarah is a young girl. She visits her doctor and is told that she is at the 60th percentile for height.
  39. Assuming that Sarah remains at the 60th percentile, estimate her height as an adult.
    (3) The heights of an adult male population are normally distributed with standard deviation 9.0 cm .
    Given that \(90 \%\) of adult males are taller than the mean height of adult females,
  40. find the mean height of an adult male.
    (4)
    7. A manufacturer carried out a survey of the defects in their soft toys. It is found that the probability of a toy having poor stitching is 0.03 and that a toy with poor stitching has a probability of 0.7 of splitting open. A toy without poor stitching has a probability of 0.02 of splitting open.
  41. Draw a tree diagram to represent this information.
    (3)
  42. Find the probability that a randomly chosen soft toy has exactly one of the two defects, poor stitching or splitting open.
    (3) The manufacturer also finds that soft toys can become faded with probability 0.05 and that this defect is independent of poor stitching or splitting open. A soft toy is chosen at random.
  43. Find the probability that the soft toy has none of these 3 defects.
  44. Find the probability that the soft toy has exactly one of these 3 defects.
    \section*{Advanced Level}
    \section*{Friday 18 January 2013 - Afternoon}
    Items included with question papers: Nil
    Candidates may use any calculator allowed by the regulations of the Joint Council for Qualifications. Calculators must not have the facility for symbolic algebra manipulation, differentiation and integration, or have retrievable mathematical formulas stored in them.
    In the boxes on the answer book, write the name of the examining body (Edexcel), your centre number, candidate number, the unit title (Statistics S1), the paper reference (6683), your surname, other name and signature.
    This paper has 7 questions.
    The total mark for this paper is 75 . You must ensure that your answers to parts of questions are clearly labelled.
    You must show sufficient working to make your methods clear to the Examiner.
    Answers without working may not gain full credit. \section*{P41805A}
    1. A teacher asked a random sample of 10 students to record the number of hours of television, \(t\), they watched in the week before their mock exam. She then calculated their grade, \(g\), in their mock exam. The results are summarised as follows.
    $$\sum t = 258 \quad \sum t ^ { 2 } = 8702 \quad \sum g = 63.6 \quad \mathrm {~S} _ { g g } = 7.864 \quad \sum g t = 1550.2$$
  45. Find \(\mathrm { S } _ { t t }\) and \(\mathrm { S } _ { g t }\).
  46. Calculate, to 3 significant figures, the product moment correlation coefficient between \(t\) and \(g\). The teacher also recorded the number of hours of revision, \(v\), these 10 students completed during the week before their mock exam. The correlation coefficient between \(t\) and \(v\) was - 0.753 .
  47. Describe, giving a reason, the nature of the correlation you would expect to find between \(v\) and \(g\).
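The summary statistics in this question can be turned into \(S_{tt}\), \(S_{gt}\) and the correlation coefficient with a few lines of arithmetic. This is a minimal sketch using the standard formulas \(S_{tt} = \sum t^2 - (\sum t)^2/n\) and \(S_{gt} = \sum gt - \sum g \sum t/n\); the variable names are illustrative.

```python
from math import sqrt

n = 10
sum_t, sum_t2 = 258, 8702
sum_g, s_gg = 63.6, 7.864
sum_gt = 1550.2

# Corrected sums of squares and products
s_tt = sum_t2 - sum_t**2 / n
s_gt = sum_gt - sum_g * sum_t / n

# Product moment correlation coefficient
r = s_gt / sqrt(s_tt * s_gg)

print(round(s_tt, 1), round(s_gt, 2), round(r, 3))
```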
    2. The discrete random variable \(X\) can take only the values 1,2 and 3 . For these values the cumulative distribution function is defined by $$\mathrm { F } ( x ) = \frac { x ^ { 3 } + k } { 40 } , \quad x = 1,2,3 .$$
  48. Show that \(k = 13\).
  49. Find the probability distribution of \(X\). Given that \(\operatorname { Var } ( X ) = \frac { 259 } { 320 }\),
  50. find the exact value of \(\operatorname { Var } ( 4 X - 5 )\).
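The steps in this question (fix \(k\) from \(\mathrm{F}(3) = 1\), difference the cumulative distribution function to get the probabilities, then scale the variance) can be sketched as follows. The helper `cdf` and the variable names are illustrative only.

```python
from fractions import Fraction

values = [1, 2, 3]

# F(3) must equal 1 because 3 is the largest value X can take.
k = 40 - values[-1] ** 3

def cdf(x):
    return Fraction(x**3 + k, 40)

# P(X = x) is the jump in F at x.
pmf = {}
previous = Fraction(0)
for x in values:
    pmf[x] = cdf(x) - previous
    previous = cdf(x)

mean = sum(x * p for x, p in pmf.items())
variance = sum(x * x * p for x, p in pmf.items()) - mean**2

# Var(aX + b) = a^2 Var(X)
var_4x_minus_5 = 4**2 * variance

print(k, pmf, variance, var_4x_minus_5)
```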
    3. A biologist is comparing the intervals ( \(m\) seconds) between the mating calls of a certain species of tree frog and the surrounding temperature ( \(t ^ { \circ } \mathrm { C }\) ). The following results were obtained. [You may assume that \(\sum h = 7150 , \sum t = 110 , \sum h ^ { 2 } = 7171500 , \sum t ^ { 2 } = 1716 , \sum t h = 64980\) and \(\mathrm { S } _ { t t } = 371.56\) ]
  51. Calculate \(\mathrm { S } _ { t h }\) and \(\mathrm { S } _ { h h }\). Give your answers to 3 significant figures.
  52. Calculate the product moment correlation coefficient for this data.
  53. State whether or not your value supports the use of a regression equation to predict the air temperature at different heights on this mountain. Give a reason for your answer.
  54. Find the equation of the regression line of \(t\) on \(h\) giving your answer in the form \(t = a + b h\).
  55. Interpret the value of \(b\).
  56. Estimate the difference in air temperature between a height of 500 m and a height of 1000 m .
    2. The marks of a group of female students in a statistics test are summarised in Figure 1. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e8afd947-55ac-424b-8db5-d5aa856ef4d7-083_412_723_331_390} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure} An outlier is a mark that is
    either more than \(1.5 \times\) interquartile range above the upper quartile
    or more than \(1.5 \times\) interquartile range below the lower quartile.
  57. On graph paper draw a box plot to represent the marks of the male students, indicating clearly any outliers.
  58. Compare and contrast the marks of the male and the female students.
    (2) \begin{table}[h]
    \captionsetup{labelformat=empty} \caption{(a) Write down the mark which is exceeded by \(75 \%\) of the female students.
    (1)
    The marks of a group of male students in the same statistics test are summarised by the stem and leaf diagram below.}
    \begin{tabular}{r|l|l}
    Mark & (\(2 \mid 6\) means 26) & Totals \\
    1 & 4 & (1) \\
    2 & 6 & (1) \\
    3 & 4 4 7 & (3) \\
    4 & 0 6 6 7 7 8 & (6) \\
    5 & 0 0 1 1 1 3 6 7 7 & (9) \\
    6 & 2 2 3 3 3 8 & (6) \\
    7 & 0 0 8 & (3) \\
    8 & 5 & (1) \\
    9 & 0 & (1) \\
    \end{tabular}
\end{table}
  • Find the median and interquartile range of the marks of the male students.
    3. In a company the 200 employees are classified as full-time workers, part-time workers or contractors. The table below shows the number of employees in each category and whether they walk to work or use some form of transport.
  • Find the probability distribution of \(X\).
  • Write down the value of \(\mathrm { F } ( 1.8 )\).
    3. An agriculturalist is studying the yields, \(y \mathrm {~kg}\), from tomato plants. The data from a random sample of 70 tomato plants are summarised below. [You may assume that \(\Sigma h = 7150 , \Sigma t = 110 , \Sigma h ^ { 2 } = 7171500 , \Sigma t ^ { 2 } = 1716 , \Sigma t h = 64980\) and \(\mathrm { S } _ { t t } = 371.56\) ]
  • Calculate \(\mathrm { S } _ { t h }\) and \(\mathrm { S } _ { h h }\). Give your answers to 3 significant figures.
  • Calculate the product moment correlation coefficient for this data.
  • State whether or not your value supports the use of a regression equation to predict the air temperature at different heights on this mountain. Give a reason for your answer.
  • Find the equation of the regression line of \(t\) on \(h\) giving your answer in the form \(t = a + b h\).
  • Interpret the value of \(b\).
  • Estimate the difference in air temperature between a height of 500 m and a height of 1000 m .
    (2)
    2. The marks of a group of female students in a statistics test are summarised in Figure 1. \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{e8afd947-55ac-424b-8db5-d5aa856ef4d7-090_417_732_328_1802} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure}
  • Write down the mark which is exceeded by \(75 \%\) of the female students. The marks of a group of male students in the same statistics test are summarised by the stem and leaf diagram below. (You may use \(\Sigma c = 111 , \Sigma c ^ { 2 } = 2375 , \Sigma s = 21 , \Sigma s ^ { 2 } = 79 , \Sigma c s = 380 , \mathrm {~S} _ { c c } = 321.5\).)
  • Calculate the value of \(\mathrm { S } _ { c s }\) and the value of \(\mathrm { S } _ { s s }\).
  • Calculate the product moment correlation coefficient for these data. Brad is not satisfied with his current internet service and decides to change his provider. He decides to pay a lot more for his new internet service.
  • On the basis of your calculation in part (b), comment on Brad's decision. Give a reason for your answer.
    2. A rugby club coach uses club records to take a random sample of 15 players from 1990 and an independent random sample of 15 players from 2010. The body weight of each player was recorded to the nearest kg and the results from 2010 are summarised in the table below. [You may use \(\Sigma x = 370 , \mathrm {~S} _ { x x } = 2587.5 , \Sigma y = 560 , \Sigma y ^ { 2 } = 39418 , \mathrm {~S} _ { x y } = - 710\) ]
  • Calculate \(\mathrm { S } _ { y y }\).
  • Calculate the product moment correlation coefficient for these data.
  • Interpret your value of the correlation coefficient. The researcher believes that a linear regression model may be appropriate to describe these data.
  • State, giving a reason, whether or not your value of the correlation coefficient supports the researcher's belief.
    (1)
  • Find the equation of the regression line of \(y\) on \(x\), giving your answer in the form \(y = a + b x\). Jack is a 40-year-old patient.
    1. Use your regression line to estimate the volume of blood pumped by each contraction of Jack's heart.
    2. Comment, giving a reason, on the reliability of your estimate.
      2. The table below shows the distances (to the nearest km ) travelled to work by the 50 employees in an office.
  • Show that \(p = 0.2\). Find
  • \(\mathrm { E } ( X )\)
  • \(\mathrm { F } ( 0 )\)
  • \(\mathrm { P } ( 3 X + 2 > 5 )\) Given that \(\operatorname { Var } ( X ) = 13.35\),
  • find the possible values of \(a\) such that \(\operatorname { Var } ( a X + 3 ) = 53.4\).
    2. The discrete random variable \(X\) has probability distribution $$\mathrm { P } ( X = x ) = \frac { 1 } { 10 } \quad x = 1,2,3 , \ldots 10$$
  • Write down the name given to this distribution.
  • Write down the value of
    1. \(\mathrm { P } ( X = 10 )\)
    2. \(\mathrm { P } ( X < 10 )\) The continuous random variable \(Y\) has the normal distribution \(\mathrm { N } \left( 10,2 ^ { 2 } \right)\).
  • Write down the value of
    1. \(\mathrm { P } ( Y = 10 )\)
    2. \(\mathrm { P } ( Y < 10 )\)
      3. A large company is analysing how much money it spends on paper in its offices every year. The number of employees, \(x\), and the amount of money spent on paper, \(p\) ( \(\pounds\) hundreds), in 8 randomly selected offices are given in the table below. Key: 7|3| means 37 years for Greenslax and 31 years for Penville
      Some of the quartiles for these two distributions are given in the table below. You may use
      \(S _ { v v } = 42587.5\)
      \(S _ { v m } = 31512.5\)
      \(S _ { m m } = 25187.5\)
      \(\Sigma v = 19390\)
      \(\Sigma m = 10610\)
  • Find the product moment correlation coefficient between \(m\) and \(v\).
  • Give a reason to support fitting a regression model of the form \(m = a + b v\) to these data.
  • Find the value of \(b\) correct to 3 decimal places.
  • Find the equation of the regression line of \(m\) on \(v\).
  • Interpret your value of \(b\).
  • Use your answer to part (d) to estimate the amount of money spent when the number of visitors to the UK in a month is 2500000.
  • Comment on the reliability of your estimate in part (f). Give a reason for your answer.
    4. In a factory, three machines, \(J , K\) and \(L\), are used to make biscuits. Machine \(J\) makes \(25 \%\) of the biscuits.
    Machine \(K\) makes \(45 \%\) of the biscuits.
    The rest of the biscuits are made by machine \(L\).
    It is known that \(2 \%\) of the biscuits made by machine \(J\) are broken, \(3 \%\) of the biscuits made by machine \(K\) are broken and \(5 \%\) of the biscuits made by machine \(L\) are broken.
  • Draw a tree diagram to illustrate all the possible outcomes and associated probabilities. A biscuit is selected at random.
  • Calculate the probability that the biscuit is made by machine \(J\) and is not broken.
  • Calculate the probability that the biscuit is broken.
  • Given that the biscuit is broken, find the probability that it was not made by machine \(K\).
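The tree diagram for the three machines amounts to multiplying along branches and adding across them. The sketch below is one way to check the three probabilities with exact fractions; the dictionary names are illustrative.

```python
from fractions import Fraction

# Share of production per machine; L makes whatever J and K do not.
share = {"J": Fraction(25, 100), "K": Fraction(45, 100)}
share["L"] = 1 - share["J"] - share["K"]

# Probability that a biscuit from each machine is broken
broken_rate = {"J": Fraction(2, 100), "K": Fraction(3, 100), "L": Fraction(5, 100)}

# Multiply along the J branch: made by J and not broken
p_j_and_intact = share["J"] * (1 - broken_rate["J"])

# Add across all branches that end in "broken"
p_broken = sum(share[m] * broken_rate[m] for m in share)

# Conditional probability: not made by K, given broken
p_not_k_given_broken = 1 - share["K"] * broken_rate["K"] / p_broken

print(p_j_and_intact, p_broken, p_not_k_given_broken)
```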
    5. The discrete random variable \(X\) has the probability function $$P ( X = x ) = \begin{cases} k x & x = 2,4,6 \\ k ( x - 2 ) & x = 8 \\ 0 & \text { otherwise } \end{cases}$$ where \(k\) is a constant.
  • Show that \(k = \frac { 1 } { 18 }\).
  • Find the exact value of \(\mathrm { F } ( 5 )\).
  • Find the exact value of \(\mathrm { E } ( X )\).
  • Find the exact value of \(\mathrm { E } \left( X ^ { 2 } \right)\).
  • Calculate \(\operatorname { Var } ( 3 - 4 X )\) giving your answer to 3 significant figures.
    6. The times, in seconds, spent in a queue at a supermarket by 85 randomly selected customers, are summarised in the table below.
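    \begin{tabular}{|c|c|}
    \hline
    Time (seconds) & Number of customers, \(f\) \\
    \hline
    \(0 - 30\) & 2 \\
    \(30 - 60\) & 10 \\
    \(60 - 70\) & 17 \\
    \(70 - 80\) & 25 \\
    \(80 - 100\) & 25 \\
    \(100 - 150\) & 6 \\
    \hline
    \end{tabular}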
    A histogram was drawn to represent these data. The \(30 - 60\) group was represented by a bar of width 1.5 cm and height 1 cm .
  • Find the width and the height of the \(70 - 80\) group.
  • Use linear interpolation to estimate the median of this distribution. Given that \(x\) denotes the midpoint of each group in the table and $$\Sigma f x = 6460 \quad \Sigma f x ^ { 2 } = 529400$$
  • calculate an estimate for
    1. the mean,
    2. the standard deviation,
      for the above data. One measure of skewness is given by $$\text { coefficient of skewness } = \frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }$$
  • Evaluate this coefficient and comment on the skewness of these data.
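The calculations in this question can be sketched in a few lines. The sketch assumes the class boundaries and frequencies as read from the table, treats the two given sums as \(\Sigma fx\) and \(\Sigma fx^2\), and assumes bar height is proportional to frequency density; all variable names are illustrative.

```python
from math import sqrt

bounds = [(0, 30), (30, 60), (60, 70), (70, 80), (80, 100), (100, 150)]
freqs = [2, 10, 17, 25, 25, 6]
n = sum(freqs)

# Histogram scale from the 30-60 bar: 30 s wide is 1.5 cm, density 10/30 is 1 cm high.
cm_per_second = 1.5 / 30
cm_per_density = 1 / (10 / 30)
width_70_80 = (80 - 70) * cm_per_second
height_70_80 = (25 / (80 - 70)) * cm_per_density

# Linear interpolation for the median at the (n/2)th value
target = n / 2
cumulative = 0
for (lo, hi), f in zip(bounds, freqs):
    if cumulative + f >= target:
        median = lo + (target - cumulative) / f * (hi - lo)
        break
    cumulative += f

# Mean and standard deviation from the given sums
sum_fx, sum_fx2 = 6460, 529400
mean = sum_fx / n
sd = sqrt(sum_fx2 / n - mean**2)

# Coefficient of skewness as defined in the question
skew = 3 * (mean - median) / sd

print(width_70_80, height_70_80, median, mean, round(sd, 2), round(skew, 3))
```

A coefficient this close to zero suggests very little skew, which is consistent with the mean and median nearly coinciding.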
    7. The heights of adult females are normally distributed with mean 160 cm and standard deviation 8 cm .
  • Find the probability that a randomly selected adult female has a height greater than 170 cm . Any adult female whose height is greater than 170 cm is defined as tall.
    An adult female is chosen at random. Given that she is tall,
  • find the probability that she has a height greater than 180 cm . Half of tall adult females have a height greater than \(h \mathrm {~cm}\).
  • Find the value of \(h\).
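In an exam these values come from the statistical tables, but the same steps can be checked numerically. This is a sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=160, sigma=8)

# P(X > 170): the probability of being "tall"
p_tall = 1 - heights.cdf(170)

# P(X > 180 | X > 170) = P(X > 180) / P(X > 170)
p_over_180_given_tall = (1 - heights.cdf(180)) / p_tall

# Half of tall women exceed h, so P(X > h) = p_tall / 2.
h = heights.inv_cdf(1 - p_tall / 2)

print(round(p_tall, 4), round(p_over_180_given_tall, 4), round(h, 1))
```

Table-based answers may differ slightly in the last digit because tabulated z-values are rounded.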
    8. For the events \(A\) and \(B\), $$\mathrm { P } \left( A ^ { \prime } \cap B \right) = 0.22 \text { and } \mathrm { P } \left( A ^ { \prime } \cap B ^ { \prime } \right) = 0.18$$
  • Find \(\mathrm { P } ( A )\).
  • Find \(\mathrm { P } ( A \cup B )\). Given that \(\mathrm { P } ( A \mid B ) = 0.6\),
  • find \(\mathrm { P } ( A \cap B )\).
  • Determine whether or not \(A\) and \(B\) are independent.
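The four parts of this question follow from splitting the sample space into the regions of a two-event Venn diagram. A minimal sketch with exact fractions, using illustrative variable names:

```python
from fractions import Fraction

p_notA_B = Fraction(22, 100)
p_notA_notB = Fraction(18, 100)
p_A_given_B = Fraction(6, 10)

# The two given regions cover everything outside A.
p_A = 1 - p_notA_B - p_notA_notB
p_A_or_B = 1 - p_notA_notB

# P(A and B) = P(A|B) * P(B) and P(B) = P(A and B) + P(A' and B); solve for P(A and B).
p_A_and_B = p_A_given_B * p_notA_B / (1 - p_A_given_B)
p_B = p_A_and_B + p_notA_B

# Independence test: P(A and B) equals P(A) * P(B)
independent = p_A_and_B == p_A * p_B

print(p_A, p_A_or_B, p_A_and_B, independent)
```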
  • Edexcel S1 Q4
    4. Aeroplanes fly from City \(A\) to City \(B\). Over a long period of time the number of minutes delay in take-off from City \(A\) was recorded. The minimum delay was 5 minutes and the maximum delay was 63 minutes. A quarter of all delays were at most 12 minutes, half were at most 17 minutes and \(75 \%\) were at most 28 minutes. Only one of the delays was longer than 45 minutes. An outlier is an observation that falls either \(1.5 \times\) (interquartile range) above the upper quartile or \(1.5 \times\) (interquartile range) below the lower quartile.
    1. On the graph paper opposite draw a box plot to represent these data.
    2. Comment on the distribution of delays. Justify your answer.
    3. Suggest how the distribution might be interpreted by a passenger who frequently flies from City \(A\) to City \(B\).
      \includegraphics[max width=\textwidth, alt={}, center]{3d4f7bfb-b235-418a-9411-a4d0b3188254-008_1190_1487_278_223}
    Edexcel S1 Q7
    7. In a school there are 148 students in Years 12 and 13 studying Science, Humanities or Arts subjects. Of these students, 89 wear glasses and the others do not. There are 30 Science students of whom 18 wear glasses. The corresponding figures for the Humanities students are 68 and 44 respectively. A student is chosen at random. Find the probability that this student
    1. is studying Arts subjects,
    2. does not wear glasses, given that the student is studying Arts subjects. Amongst the Science students, \(80 \%\) are right-handed. Corresponding percentages for Humanities and Arts students are 75\% and 70\% respectively. A student is again chosen at random.
    3. Find the probability that this student is right-handed.
    4. Given that this student is right-handed, find the probability that the student is studying Science subjects.
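The counts in this question fit a small two-way table, and the Arts figures follow by subtraction. The sketch below is one way to check the four probabilities; the dictionary names are illustrative, and it assumes the right-handed percentages apply to the whole of each subject group.

```python
from fractions import Fraction

total, total_glasses = 148, 89
students = {"Science": 30, "Humanities": 68}
glasses = {"Science": 18, "Humanities": 44}

# Arts figures are whatever remains after Science and Humanities.
students["Arts"] = total - sum(students.values())
glasses["Arts"] = total_glasses - sum(glasses.values())

p_arts = Fraction(students["Arts"], total)
p_no_glasses_given_arts = Fraction(students["Arts"] - glasses["Arts"], students["Arts"])

# Expected number of right-handed students in each group
right_handed_rate = {
    "Science": Fraction(80, 100),
    "Humanities": Fraction(75, 100),
    "Arts": Fraction(70, 100),
}
right_handed = {s: students[s] * right_handed_rate[s] for s in students}

p_right = sum(right_handed.values()) / total
p_science_given_right = right_handed["Science"] / sum(right_handed.values())

print(p_arts, p_no_glasses_given_arts, p_right, p_science_given_right)
```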
      1. (a) Describe the main features and uses of a box plot.
      Children from schools \(A\) and \(B\) took part in a fun run for charity. The times, to the nearest minute, taken by the children from school \(A\) are summarised in Figure 1. \begin{figure}[h]
      \captionsetup{labelformat=empty} \caption{Figure 1} \includegraphics[alt={},max width=\textwidth]{3d4f7bfb-b235-418a-9411-a4d0b3188254-015_398_1045_946_461}
      \end{figure}
      1. Write down the time by which \(75 \%\) of the children in school \(A\) had completed the run.
      2. State the name given to this value.
    5. Explain what you understand by the two crosses ( X ) on Figure 1.
      For school \(B\) the least time taken by any of the children was 25 minutes and the longest time was 55 minutes. The three quartiles were 30,37 and 50 respectively.
    6. Draw a box plot to represent the data from school \(B\).
      \includegraphics[max width=\textwidth, alt={}, center]{3d4f7bfb-b235-418a-9411-a4d0b3188254-016_798_1196_580_372}
    7. Compare and contrast these two box plots.
      2. Sunita and Shelley talk to one another once a week on the telephone. Over many weeks they recorded, to the nearest minute, the number of minutes spent in conversation on each occasion. The following table summarises their results.
      1. As part of a statistics project, Gill collected data relating to the length of time, to the nearest minute, spent by shoppers in a supermarket and the amount of money they spent. Her data for a random sample of 10 shoppers are summarised in the table below, where \(t\) represents time and \(\pounds m\) the amount spent over \(\pounds 20\).
      1. A young family were looking for a new 3 bedroom semi-detached house. A local survey recorded the price \(x\), in \(\pounds 1000\), and the distance \(y\), in miles, from the station of such houses. The following summary statistics were provided
      $$S _ { x x } = 113573 , \quad S _ { y y } = 8.657 , \quad S _ { x y } = - 808.917$$
    8. Use these values to calculate the product moment correlation coefficient.
    9. Give an interpretation of your answer to part (a). Another family asked for the distances to be measured in km rather than miles.
    10. State the value of the product moment correlation coefficient in this case.
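The effect of changing units on the correlation coefficient can be seen numerically. The sketch below computes \(r\) from the given summary statistics and then rescales the distances; the function name `pmcc` is illustrative, and the conversion factor of 1.609344 km per mile is supplied here rather than taken from the question.

```python
from math import sqrt

s_xx, s_yy, s_xy = 113573, 8.657, -808.917

def pmcc(sxx, syy, sxy):
    """Product moment correlation coefficient from summary statistics."""
    return sxy / sqrt(sxx * syy)

r_miles = pmcc(s_xx, s_yy, s_xy)

# Rescaling y by a positive constant c multiplies S_yy by c^2 and S_xy by c,
# so the factor cancels in the ratio.
c = 1.609344  # km per mile
r_km = pmcc(s_xx, c**2 * s_yy, c * s_xy)

print(round(r_miles, 3), round(r_km, 3))
```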
      2. The box plot in Figure 1 shows a summary of the weights of the luggage, in kg, for each musician in an orchestra on an overseas tour. \begin{figure}[h]
      \includegraphics[alt={},max width=\textwidth]{3d4f7bfb-b235-418a-9411-a4d0b3188254-045_346_1452_324_228} \captionsetup{labelformat=empty} \caption{Figure 1}
      \end{figure} The airline's recommended weight limit for each musician's luggage was 45 kg . Given that none of the musicians' luggage weighed exactly 45 kg ,
    11. state the proportion of the musicians whose luggage was below the recommended weight limit. A quarter of the musicians had to pay a charge for taking heavy luggage.
    12. State the smallest weight for which the charge was made.
    13. Explain what you understand by the + on the box plot in Figure 1, and suggest an instrument that the owner of this luggage might play.
    14. Describe the skewness of this distribution. Give a reason for your answer. One musician of the orchestra suggests that the weights of luggage, in kg, can be modelled by a normal distribution with quartiles as given in Figure 1.
    15. Find the standard deviation of this normal distribution.
      3. A student is investigating the relationship between the price ( \(y\) pence) of 100 g of chocolate and the percentage ( \(x \%\) ) of cocoa solids in the chocolate.
      The following data is obtained
      1. A personnel manager wants to find out if a test carried out during an employee's interview and a skills assessment at the end of basic training is a guide to performance after working for the company for one year.
      The table below shows the results of the interview test of 10 employees and their performance after one year.
      1. A disease is known to be present in \(2 \%\) of a population. A test is developed to help determine whether or not someone has the disease.
      Given that a person has the disease, the test is positive with probability 0.95
      Given that a person does not have the disease, the test is positive with probability 0.03
    16. Draw a tree diagram to represent this information. A person is selected at random from the population and tested for this disease.
    17. Find the probability that the test is positive. A doctor randomly selects a person from the population and tests him for the disease. Given that the test is positive,
    18. find the probability that he does not have the disease.
    19. Comment on the usefulness of this test.
      2. The ages in years of the residents of two hotels are shown in the back to back stem and leaf diagram below.
      Abbey Hotel | Balmoral Hotel (key: \(8 | 5 | 0\) means 58 years in Abbey Hotel and 50 years in Balmoral Hotel)
      1. A teacher is monitoring the progress of students using a computer based revision course. The improvement in performance, \(y\) marks, is recorded for each student along with the time, \(x\) hours, that the student spent using the revision course. The results for a random sample of 10 students are recorded below.
      1. The volume of a sample of gas is kept constant. The gas is heated and the pressure, \(p\), is measured at 10 different temperatures, \(t\). The results are summarised below.
        \(\sum p = 445 \quad \sum p ^ { 2 } = 38125 \quad \sum t = 240 \quad \sum t ^ { 2 } = 27520 \quad \sum p t = 26830\)
      2. Find \(\mathrm { S } _ { p p }\) and \(\mathrm { S } _ { p t }\).
      Given that \(\mathrm { S } _ { t t } = 21760\),
    20. calculate the product moment correlation coefficient.
    21. Give an interpretation of your answer to part (b).
      2. On a randomly chosen day the probability that Bill travels to school by car, by bicycle or on foot is \(\frac { 1 } { 2 } , \frac { 1 } { 6 }\) and \(\frac { 1 } { 3 }\) respectively. The probability of being late when using these methods of travel is \(\frac { 1 } { 5 } , \frac { 2 } { 5 }\) and \(\frac { 1 } { 10 }\) respectively.
    22. Draw a tree diagram to represent this information.
    23. Find the probability that on a randomly chosen day
      1. Bill travels by foot and is late,
      2. Bill is not late.
    24. Given that Bill is late, find the probability that he did not travel on foot.
      3. The variable \(x\) was measured to the nearest whole number. Forty observations are given in the table below.
      \begin{tabular}{|c|c|c|c|}
      \hline
      \(x\) & \(10 - 15\) & \(16 - 18\) & \(19 -\) \\
      \hline
      Frequency & 15 & 9 & 16 \\
      \hline
      \end{tabular}
      A histogram was drawn and the bar representing the \(10 - 15\) class has a width of 2 cm and a height of 5 cm . For the \(16 - 18\) class find
    25. the width,
    26. the height
      of the bar representing this class.
      4. A researcher measured the foot lengths of a random sample of 120 ten-year-old children. The lengths are summarised in the table below.
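      \begin{tabular}{|c|c|}
      \hline
      Foot length, \(l\), (cm) & Number of children \\
      \hline
      \(10 \leqslant l < 12\) & 5 \\
      \(12 \leqslant l < 17\) & 53 \\
      \(17 \leqslant l < 19\) & 29 \\
      \(19 \leqslant l < 21\) & 15 \\
      \(21 \leqslant l < 23\) & 11 \\
      \(23 \leqslant l < 25\) & 7 \\
      \hline
      \end{tabular}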
    27. Use interpolation to estimate the median of this distribution.
    28. Calculate estimates for the mean and the standard deviation of these data. One measure of skewness is given by $$\text { Coefficient of skewness } = \frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }$$
    29. Evaluate this coefficient and comment on the skewness of these data. Greg suggests that a normal distribution is a suitable model for the foot lengths of ten-year-old children.
    30. Using the value found in part (c), comment on Greg's suggestion, giving a reason for your answer.
      5. The weight, \(w\) grams, and the length, \(l \mathrm {~mm}\), of 10 randomly selected newborn turtles are given in the table below.
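      \begin{tabular}{|c|c|c|c|c|c|c|c|c|c|c|}
      \hline
      \(l\) & 49.0 & 52.0 & 53.0 & 54.5 & 54.1 & 53.4 & 50.0 & 51.6 & 49.5 & 51.2 \\
      \hline
      \(w\) & 29 & 32 & 34 & 39 & 38 & 35 & 30 & 31 & 29 & 30 \\
      \hline
      \end{tabular}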
      $$\text { (You may use } \mathrm { S } _ { l l } = 33.381 \quad \mathrm {~S} _ { w l } = 59.99 \quad \mathrm {~S} _ { w w } = 120.1 \text { ) }$$
    31. Find the equation of the regression line of \(w\) on \(l\) in the form \(w = a + b l\).
    32. Use your regression line to estimate the weight of a newborn turtle of length 60 mm .
    33. Comment on the reliability of your estimate giving a reason for your answer.
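The regression line in this question follows from \(b = S_{wl} / S_{ll}\) and \(a = \bar{w} - b\bar{l}\). The sketch below applies those formulas to the ten data pairs as read from the table; `predict_weight` and the other names are illustrative.

```python
lengths = [49.0, 52.0, 53.0, 54.5, 54.1, 53.4, 50.0, 51.6, 49.5, 51.2]
weights = [29, 32, 34, 39, 38, 35, 30, 31, 29, 30]
s_ll, s_wl = 33.381, 59.99  # summary statistics given in the question

n = len(lengths)
mean_l = sum(lengths) / n
mean_w = sum(weights) / n

# Gradient and intercept of the regression line of w on l
b = s_wl / s_ll
a = mean_w - b * mean_l

def predict_weight(length_mm):
    return a + b * length_mm

estimate = predict_weight(60)

# A prediction is an extrapolation if the length lies outside the observed data.
in_range = min(lengths) <= 60 <= max(lengths)

print(round(a, 2), round(b, 3), round(estimate, 1), in_range)
```

Because 60 mm lies above the largest observed length, the estimate is an extrapolation, which is the usual reason to treat it as unreliable.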
      6. The discrete random variable \(X\) has probability function $$\mathrm { P } ( X = x ) = \left\{ \begin{array} { c l } a ( 3 - x ) & x = 0,1,2 \\ b & x = 3 \end{array} \right.$$
    34. Find \(\mathrm { P } ( X = 2 )\) and complete the table below.
      \begin{tabular}{|c|c|c|c|c|}
      \hline
      \(x\) & 0 & 1 & 2 & 3 \\
      \hline
      \(\mathrm { P } ( X = x )\) & \(3 a\) & \(2 a\) & & \(b\) \\
      \hline
      \end{tabular}
      Given that \(\mathrm { E } ( X ) = 1.6\)
    35. Find the value of \(a\) and the value of \(b\). Find
    36. \(\mathrm { P } ( 0.5 < X < 3 )\),
    37. \(\mathrm { E } ( 3 X - 2 )\).
    38. Show that the \(\operatorname { Var } ( X ) = 1.64\)
    39. Calculate \(\operatorname { Var } ( 3 X - 2 )\).
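The two conditions in this question (probabilities sum to 1, and \(\mathrm{E}(X) = 1.6\)) give a pair of linear equations in \(a\) and \(b\). The sketch below solves them by substitution and then checks the remaining parts with exact fractions; variable names are illustrative.

```python
from fractions import Fraction

# Probabilities sum to 1:  3a + 2a + a + b = 1, so 6a + b = 1.
# E(X) = 1.6:              0(3a) + 1(2a) + 2(a) + 3b = 8/5, so 4a + 3b = 8/5.
# Substituting b = 1 - 6a into the second equation gives 3 - 14a = 8/5.
a = (3 - Fraction(8, 5)) / 14
b = 1 - 6 * a

pmf = {0: 3 * a, 1: 2 * a, 2: a, 3: b}
mean = sum(x * p for x, p in pmf.items())
variance = sum(x * x * p for x, p in pmf.items()) - mean**2

p_between = sum(p for x, p in pmf.items() if 0.5 < x < 3)
mean_3x_minus_2 = 3 * mean - 2      # E(aX + b) = aE(X) + b
var_3x_minus_2 = 3**2 * variance    # Var(aX + b) = a^2 Var(X)

print(a, b, p_between, mean_3x_minus_2, variance, var_3x_minus_2)
```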
      7. (a) Given that \(\mathrm { P } ( A ) = a\) and \(\mathrm { P } ( B ) = b\) express \(\mathrm { P } ( A \cup B )\) in terms of \(a\) and \(b\) when
      1. \(A\) and \(B\) are mutually exclusive,
      2. \(A\) and \(B\) are independent. Two events \(R\) and \(Q\) are such that
        \(\mathrm { P } \left( R \cap Q ^ { \prime } \right) = 0.15 , \quad \mathrm { P } ( Q ) = 0.35\) and \(\mathrm { P } ( R \mid Q ) = 0.1\)
        Find the value of
    40. \(\mathrm { P } ( R \cup Q )\),
    41. \(\mathrm { P } ( R \cap Q )\),
    42. \(\mathrm { P } ( R )\).
    Edexcel S1 Q8
    8. The lifetimes of bulbs used in a lamp are normally distributed. A company \(X\) sells bulbs with a mean lifetime of 850 hours and a standard deviation of 50 hours.
    1. Find the probability of a bulb, from company \(X\), having a lifetime of less than 830 hours.
    2. In a box of 500 bulbs, from company \(X\), find the expected number having a lifetime of less than 830 hours. A rival company \(Y\) sells bulbs with a mean lifetime of 860 hours and \(20 \%\) of these bulbs have a lifetime of less than 818 hours.
    3. Find the standard deviation of the lifetimes of bulbs from company \(Y\). Both companies sell the bulbs for the same price.
    4. State which company you would recommend. Give reasons for your answer.
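The first three parts of this question can be checked numerically with Python's standard-library `statistics.NormalDist`. This is a sketch with illustrative variable names; answers found from printed tables may differ slightly because tabulated z-values are rounded.

```python
from statistics import NormalDist

company_x = NormalDist(mu=850, sigma=50)

# P(lifetime < 830) for company X, and the expected count in a box of 500
p_short = company_x.cdf(830)
expected_short = 500 * p_short

# Company Y: P(Y < 818) = 0.20 with mean 860, so (818 - 860) / sigma
# equals the z-value that cuts off the lowest 20%.
z_20 = NormalDist().inv_cdf(0.20)
sigma_y = (818 - 860) / z_20

print(round(p_short, 4), round(expected_short, 1), round(sigma_y, 1))
```

The two standard deviations come out almost equal while company \(Y\) has the higher mean, which is the kind of comparison the final part asks for.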
      Paper Reference(s)
      6683/01
      \section*{Edexcel GCE}
      Materials required for examination: Mathematical Formulae (Pink)
      Items included with question papers: Nil
      Candidates may use any calculator allowed by the regulations of the Joint Council for Qualifications. Calculators must not have the facility for symbolic algebra manipulation, differentiation and integration, or have retrievable mathematical formulae stored in them.
      In the boxes above, write your centre number, candidate number, your surname, initials and signature.
      Check that you have the correct question paper.
      Answer ALL the questions.
      You must write your answer to each question in the space following the question.
      Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.
      Full marks may be obtained for answers to ALL questions.
      The marks for individual questions and the parts of questions are shown in round brackets: e.g. (2).
      There are 7 questions in this question paper. The total mark for this paper is 75.
      There are 28 pages in this question paper. Any blank pages are indicated. You must ensure that your answers to parts of questions are clearly labelled.
      You should show sufficient working to make your methods clear to the Examiner.
      Answers without working may not gain full credit.
      1. Gary compared the total attendance, \(x\), at home matches and the total number of goals, \(y\), scored at home during a season for each of 12 football teams playing in a league. He correctly calculated:
      $$S _ { x x } = 1022500 \quad S _ { y y } = 130.9 \quad S _ { x y } = 8825$$
    5. Calculate the product moment correlation coefficient for these data.
    6. Interpret the value of the correlation coefficient. Helen was given the same data to analyse. In view of the large numbers involved she decided to divide the attendance figures by 100. She then calculated the product moment correlation coefficient between \(\frac { x } { 100 }\) and \(y\).
    7. Write down the value Helen should have obtained.
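The following sketch, offered as an illustration rather than a model answer, evaluates \(r = S_{xy} / \sqrt{S_{xx} S_{yy}}\) from Gary's summary values and shows why Helen's linear coding leaves it unchanged. The helper name `pmcc` is an assumption.

```python
from math import sqrt

def pmcc(s_xy, s_xx, s_yy):
    """Product moment correlation coefficient from summary statistics."""
    return s_xy / sqrt(s_xx * s_yy)

r = pmcc(8825, 1022500, 130.9)

# Dividing x by 100 divides S_xy by 100 and S_xx by 100^2; S_yy is unchanged,
# so the scale factors cancel in the ratio.
r_coded = pmcc(8825 / 100, 1022500 / 100**2, 130.9)

print(round(r, 3), round(r_coded, 3))
```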
      2. An experiment consists of selecting a ball from a bag and spinning a coin. The bag contains 5 red balls and 7 blue balls. A ball is selected at random from the bag, its colour is noted and then the ball is returned to the bag. When a red ball is selected, a biased coin with probability \(\frac { 2 } { 3 }\) of landing heads is spun.
      When a blue ball is selected a fair coin is spun.
    8. Complete the tree diagram below to show the possible outcomes and associated probabilities.
      \includegraphics[max width=\textwidth, alt={}, center]{3d4f7bfb-b235-418a-9411-a4d0b3188254-129_787_395_734_548} \section*{Coin}
      \includegraphics[max width=\textwidth, alt={}]{3d4f7bfb-b235-418a-9411-a4d0b3188254-129_1007_488_808_950}
      Shivani selects a ball and spins the appropriate coin.
    9. Find the probability that she obtains a head. Given that Tom selected a ball at random and obtained a head when he spun the appropriate coin,
    10. find the probability that Tom selected a red ball. Shivani and Tom each repeat this experiment.
    11. Find the probability that the colour of the ball Shivani selects is the same as the colour of the ball Tom selects. 3. The discrete random variable \(X\) has probability distribution given by
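For the ball-and-coin experiment in question 2, the tree-diagram probabilities can be sketched with exact fractions. This is an illustrative check under the stated probabilities, with assumed variable names.

```python
from fractions import Fraction

p_red, p_blue = Fraction(5, 12), Fraction(7, 12)
p_head_given_red = Fraction(2, 3)    # biased coin after a red ball
p_head_given_blue = Fraction(1, 2)   # fair coin after a blue ball

# Total probability of a head (sum along both branches of the tree)
p_head = p_red * p_head_given_red + p_blue * p_head_given_blue
# Conditional probability P(red | head)
p_red_given_head = p_red * p_head_given_red / p_head
# Two independent selections with replacement: both red or both blue
p_same_colour = p_red**2 + p_blue**2

print(p_head, p_red_given_head, p_same_colour)
```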
      1. A random sample of 50 salmon was caught by a scientist. He recorded the length \(l \mathrm {~cm}\) and weight \(w \mathrm {~kg}\) of each salmon.
      The following summary statistics were calculated from these data.
      \(\sum l = 4027 \quad \sum l ^ { 2 } = 327754.5 \quad \sum w = 357.1 \quad \sum l w = 29330.5 \quad S _ { w w } = 289.6\)
    12. Find \(S _ { l l }\) and \(S _ { l w }\)
    13. Calculate, to 3 significant figures, the product moment correlation coefficient between \(l\) and \(w\).
    14. Give an interpretation of your coefficient.
      1. Keith records the amount of rainfall, in mm, at his school, each day for a week. The results are given below.
        0.0, 0.5, 1.8, 2.8, 2.3, 5.6, 9.4
      Jenny then records the amount of rainfall, \(x \mathrm {~mm}\), at the school each day for the following 21 days. The results for the 21 days are summarised below. $$\sum x = 84.6$$
    15. Calculate the mean amount of rainfall during the whole 28 days. Keith realises that he has transposed two of his figures. The number 9.4 should have been 4.9 and the number 0.5 should have been 5.0. Keith corrects these figures.
    16. State, giving your reason, the effect this will have on the mean.
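A short sketch of the rainfall calculation, assuming Keith's seven daily values are 0.0, 0.5, 1.8, 2.8, 2.3, 5.6 and 9.4 as listed in the question. It also checks the effect of the two transposed figures on the mean.

```python
keith = [0.0, 0.5, 1.8, 2.8, 2.3, 5.6, 9.4]   # first 7 days
jenny_total = 84.6                             # sum over the next 21 days
n_days = len(keith) + 21

mean_before = (sum(keith) + jenny_total) / n_days

# Apply Keith's corrections: 9.4 -> 4.9 and 0.5 -> 5.0
corrected = keith.copy()
corrected[corrected.index(9.4)] = 4.9
corrected[corrected.index(0.5)] = 5.0
mean_after = (sum(corrected) + jenny_total) / n_days

print(round(mean_before, 2), round(mean_after, 2))
```

The two corrections change the total by \(-4.5\) and \(+4.5\), so the sum, and hence the mean, is unaffected.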
      3. Over a long period of time a small company recorded the amount it received in sales per month. The results are summarised below.
      1. On a particular day the height above sea level, \(x\) metres, and the mid-day temperature, \(y ^ { \circ } \mathrm { C }\), were recorded in 8 north European towns. These data are summarised below
      $$\mathrm { S } _ { x x } = 3535237.5 \quad \sum y = 181 \quad \sum y ^ { 2 } = 4305 \quad \mathrm {~S} _ { x y } = - 23726.25$$
    17. Find \(\mathrm { S } _ { y y }\)
    18. Calculate, to 3 significant figures, the product moment correlation coefficient for these data.
    19. Give an interpretation of your coefficient. A student thought that the calculations would be simpler if the height above sea level, \(h\), was measured in kilometres and used the variable \(h = \frac { x } { 1000 }\) instead of \(x\).
    20. Write down the value of \(\mathrm { S } _ { h h }\)
    21. Write down the value of the correlation coefficient between \(h\) and \(y\).
      1. The random variable \(X \sim \mathrm {~N} \left( \mu , 5 ^ { 2 } \right)\) and \(\mathrm { P } ( X < 23 ) = 0.9192\)
      2. Find the value of \(\mu\).
      3. Write down the value of \(\mathrm { P } ( \mu < X < 23 )\).
      4. The discrete random variable \(Y\) has probability distribution
      1. The histogram in Figure 1 shows the time, to the nearest minute, that a random sample of 100 motorists were delayed by roadworks on a stretch of motorway.
      \begin{figure}[h]
      \includegraphics[alt={},max width=\textwidth]{3d4f7bfb-b235-418a-9411-a4d0b3188254-171_1312_673_349_639} \captionsetup{labelformat=empty} \caption{Figure 1}
      \end{figure}
    22. Complete the table.
      1. A discrete random variable \(X\) has the probability function
      $$\mathrm { P } ( X = x ) = \begin{cases} k ( 1 - x ) ^ { 2 } & x = - 1,0,1 \text { and } 2 \\
      0 & \text { otherwise } \end{cases}$$
    23. Show that \(k = \frac { 1 } { 6 }\)
    24. Find \(\mathrm { E } ( X )\)
    25. Show that \(\mathrm { E } \left( X ^ { 2 } \right) = \frac { 4 } { 3 }\)
    26. Find \(\operatorname { Var } ( 1 - 3 X )\)
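The four parts above can be checked with exact arithmetic. This is a sketch under the probability function \(\mathrm{P}(X = x) = k(1 - x)^2\) for \(x = -1, 0, 1, 2\); the variable names are assumptions.

```python
from fractions import Fraction

support = [-1, 0, 1, 2]
weights = [(1 - x) ** 2 for x in support]       # 4, 1, 0, 1
k = Fraction(1, sum(weights))                   # probabilities must sum to 1
pmf = {x: k * w for x, w in zip(support, weights)}

e_x = sum(x * p for x, p in pmf.items())        # E(X)
e_x2 = sum(x * x * p for x, p in pmf.items())   # E(X^2)
var_x = e_x2 - e_x**2                           # Var(X) = E(X^2) - E(X)^2
var_1_minus_3x = (-3) ** 2 * var_x              # Var(a + bX) = b^2 Var(X)

print(k, e_x, e_x2, var_1_minus_3x)
```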
      2. A bank reviews its customer records at the end of each month to find out how many customers have become unemployed, \(u\), and how many have had their house repossessed, \(h\), during that month. The bank codes the data using variables \(x = \frac { u - 100 } { 3 }\) and \(y = \frac { h - 20 } { 7 }\). The results for the 12 months of 2009 are summarised below. $$\sum x = 477 \quad S _ { x x } = 5606.25 \quad \sum y = 480 \quad S _ { y y } = 4244 \quad \sum x y = 23070$$
    27. Calculate the value of the product moment correlation coefficient for \(x\) and \(y\).
    28. Write down the product moment correlation coefficient for \(u\) and \(h\). The bank claims that an increase in unemployment among its customers is associated with an increase in house repossessions.
    29. State, with a reason, whether or not the bank's claim is supported by these data.
      3. A scientist is researching whether or not birds of prey exposed to pollutants lay eggs with thinner shells. He collects a random sample of egg shells from each of 6 different nests and tests for pollutant level, \(p\), and measures the thinning of the shell, \(t\). The results are shown in the table below.
      1. A teacher asked a random sample of 10 students to record the number of hours of television, \(t\), they watched in the week before their mock exam. She then calculated their grade, \(g\), in their mock exam. The results are summarised as follows.
      $$\sum t = 258 \quad \sum t ^ { 2 } = 8702 \quad \sum g = 63.6 \quad \mathrm {~S} _ { g g } = 7.864 \quad \sum g t = 1550.2$$
    30. Find \(\mathrm { S } _ { t t }\) and \(\mathrm { S } _ { g t }\)
    31. Calculate, to 3 significant figures, the product moment correlation coefficient between \(t\) and \(g\). The teacher also recorded the number of hours of revision, \(v\), these 10 students completed during the week before their mock exam. The correlation coefficient between \(t\) and \(v\) was -0.753
    32. Describe, giving a reason, the nature of the correlation you would expect to find between \(v\) and \(g\).
      2. The discrete random variable \(X\) can take only the values 1,2 and 3 . For these values the cumulative distribution function is defined by $$\mathrm { F } ( x ) = \frac { x ^ { 3 } + k } { 40 } \quad x = 1,2,3$$
    33. Show that \(k = 13\)
    34. Find the probability distribution of \(X\). Given that \(\operatorname { Var } ( X ) = \frac { 259 } { 320 }\)
    35. find the exact value of \(\operatorname { Var } ( 4 X - 5 )\).
      3. A biologist is comparing the intervals ( \(m\) seconds) between the mating calls of a certain species of tree frog and the surrounding temperature ( \(t { } ^ { \circ } \mathrm { C }\) ). The following results were obtained.
      1. Sammy is studying the number of units of gas, \(g\), and the number of units of electricity, \(e\), used in her house each week. A random sample of 10 weeks use was recorded and the data for each week were coded so that \(x = \frac { g - 60 } { 4 }\) and \(y = \frac { e } { 10 }\). The results for the coded data are summarised below
      $$\sum x = 48.0 \quad \sum y = 58.0 \quad \mathrm {~S} _ { x x } = 312.1 \quad \mathrm {~S} _ { y y } = 2.10 \quad \mathrm {~S} _ { x y } = 18.35$$
    36. Find the equation of the regression line of \(y\) on \(x\) in the form \(y = a + b x\). Give the values of \(a\) and \(b\) correct to 3 significant figures.
    37. Hence find the equation of the regression line of \(e\) on \(g\) in the form \(e = c + d g\). Give the values of \(c\) and \(d\) correct to 2 significant figures.
    38. Use your regression equation to estimate the number of units of electricity used in a week when 100 units of gas were used.
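A hedged sketch of the coded regression calculation: it finds \(y = a + bx\) from the summary statistics, then substitutes \(x = \frac{g - 60}{4}\) and \(y = \frac{e}{10}\) to obtain \(e = c + dg\). Unrounded values are carried through, so the final estimate may differ slightly from one based on the rounded coefficients.

```python
n = 10
sum_x, sum_y = 48.0, 58.0
s_xx, s_xy = 312.1, 18.35

# Regression of y on x: b = S_xy / S_xx, a = ybar - b * xbar
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

# Decode: e / 10 = a + b (g - 60) / 4  =>  e = (10a - 150b) + 2.5 b g
d = 10 * b / 4
c = 10 * a - 60 * d

estimate_at_100 = c + d * 100   # units of electricity when g = 100

print(round(a, 2), round(b, 4), round(c), round(d, 2), round(estimate_at_100, 1))
```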
      2. The discrete random variable \(X\) takes the values 1, 2 and 3 and has cumulative distribution function \(\mathrm { F } ( x )\) given by
      (a) Find the probability distribution of \(X\).
      (b) Write down the value of \(\mathrm { F } ( 1.8 )\).
      1. A meteorologist believes that there is a relationship between the height above sea level, \(h \mathrm {~m}\), and the air temperature, \(t ^ { \circ } \mathrm { C }\). Data is collected at the same time from 9 different places on the same mountain. The data is summarised in the table below.
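      \begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
      \hline
      \(h\) & 1400 & 1100 & 260 & 840 & 900 & 550 & 1230 & 100 & 770 \\
      \hline
      \(t\) & 3 & 10 & 20 & 9 & 10 & 13 & 5 & 24 & 16 \\
      \hline
      \end{tabular}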
      [You may assume that \(\sum h = 7150 , \sum t = 110 , \sum h ^ { 2 } = 7171500 , \sum t ^ { 2 } = 1716\), \(\sum t h = 64980\) and \(\mathrm { S } _ { t t } = 371.56\) ]
    39. Calculate \(\mathrm { S } _ { t h }\) and \(\mathrm { S } _ { h h }\). Give your answers to 3 significant figures.
    40. Calculate the product moment correlation coefficient for this data.
    41. State whether or not your value supports the use of a regression equation to predict the air temperature at different heights on this mountain. Give a reason for your answer.
    42. Find the equation of the regression line of \(t\) on \(h\) giving your answer in the form \(t = a + b h\).
    43. Interpret the value of \(b\).
    44. Estimate the difference in air temperature between a height of 500 m and a height of 1000 m .
      1. The marks of a group of female students in a statistics test are summarised in Figure 1
      \begin{figure}[h]
      \includegraphics[alt={},max width=\textwidth]{3d4f7bfb-b235-418a-9411-a4d0b3188254-227_629_1102_342_429} \captionsetup{labelformat=empty} \caption{Figure 1}
      \end{figure}
    45. Write down the mark which is exceeded by \(75 \%\) of the female students. The marks of a group of male students in the same statistics test are summarised by the stem and leaf diagram below.
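      \begin{tabular}{r|l|r}
      Mark & (2|6 means 26) & Totals \\
      \hline
      1 & 4 & (1) \\
      2 & 6 & (1) \\
      3 & 4 4 7 & (3) \\
      4 & 0 6 6 7 7 8 & (6) \\
      5 & 0 0 1 1 1 3 6 7 7 & (9) \\
      6 & 2 2 3 3 3 8 & (6) \\
      7 & 0 0 8 & (3) \\
      8 & 5 & (1) \\
      9 & 0 & (1) \\
      \end{tabular}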
    46. Find the median and interquartile range of the marks of the male students. An outlier is a mark that is
      either more than \(1.5 \times\) interquartile range above the upper quartile or more than \(1.5 \times\) interquartile range below the lower quartile.
    47. In the space provided on Figure 1 draw a box plot to represent the marks of the male students, indicating clearly any outliers.
    48. Compare and contrast the marks of the male and the female students.
      3. In a company the 200 employees are classified as full-time workers, part-time workers or contractors.
      The table below shows the number of employees in each category and whether they walk to work or use some form of transport.
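      \begin{tabular}{|l|c|c|}
      \cline { 2 - 3 } \multicolumn{1}{c|}{} & Walk & Transport \\
      \hline
      Full-time worker & 2 & 8 \\
      \hline
      Part-time worker & 35 & 75 \\
      \hline
      Contractor & 30 & 50 \\
      \hline
      \end{tabular}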
      The events \(F , H\) and \(C\) are that an employee is a full-time worker, part-time worker or contractor respectively. Let \(W\) be the event that an employee walks to work. An employee is selected at random.
      Find
    49. \(\mathrm { P } ( H )\)
    50. \(\mathrm { P } \left( [ F \cap W ] ^ { \prime } \right)\)
    51. \(\mathrm { P } ( W \mid C )\) Let \(B\) be the event that an employee uses the bus.
      Given that \(10 \%\) of full-time workers use the bus, \(30 \%\) of part-time workers use the bus and \(20 \%\) of contractors use the bus,
    52. draw a Venn diagram to represent the events \(F , H , C\) and \(B\),
    53. find the probability that a randomly selected employee uses the bus to travel to work. 4. The following table summarises the times, \(t\) minutes to the nearest minute, recorded for a group of students to complete an exam.
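      \begin{tabular}{|l|c|c|c|c|c|c|}
      \hline
      Time (minutes) \(t\) & \(11 - 20\) & \(21 - 25\) & \(26 - 30\) & \(31 - 35\) & \(36 - 45\) & \(46 - 60\) \\
      \hline
      Number of students f & 62 & 88 & 16 & 13 & 11 & 10 \\
      \hline
      \end{tabular}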
      $$\text { [You may use } \sum \mathrm { f } t ^ { 2 } = 134281.25 \text { ] }$$
    54. Estimate the mean and standard deviation of these data.
    55. Use linear interpolation to estimate the value of the median.
    56. Show that the estimated value of the lower quartile is 18.6 to 3 significant figures.
    57. Estimate the interquartile range of this distribution.
    58. Give a reason why the mean and standard deviation are not the most appropriate summary statistics to use with these data. The person timing the exam made an error and each student actually took 5 minutes less than the times recorded above. The table below summarises the actual times.
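      \begin{tabular}{|l|c|c|c|c|c|c|}
      \hline
      Time (minutes) \(t\) & \(6 - 15\) & \(16 - 20\) & \(21 - 25\) & \(26 - 30\) & \(31 - 40\) & \(41 - 55\) \\
      \hline
      Number of students f & 62 & 88 & 16 & 13 & 11 & 10 \\
      \hline
      \end{tabular}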
    59. Without further calculations, explain the effect this would have on each of the estimates found in parts (a), (b), (c) and (d).
      1. A biased die with six faces is rolled. The discrete random variable \(X\) represents the score on the uppermost face. The probability distribution of \(X\) is shown in the table below.
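      \begin{tabular}{|c|c|c|c|c|c|c|}
      \hline
      \(x\) & 1 & 2 & 3 & 4 & 5 & 6 \\
      \hline
      \(\mathrm { P } ( X = x )\) & \(a\) & \(a\) & \(a\) & \(b\) & \(b\) & 0.3 \\
      \hline
      \end{tabular}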
    60. Given that \(\mathrm { E } ( X ) = 4.2\) find the value of \(a\) and the value of \(b\).
    61. Show that \(\mathrm { E } \left( X ^ { 2 } \right) = 20.4\)
    62. Find \(\operatorname { Var } ( 5 - 3 X )\) A biased die with five faces is rolled. The discrete random variable \(Y\) represents the score which is uppermost. The cumulative distribution function of \(Y\) is shown in the table below.
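      \begin{tabular}{|c|c|c|c|c|c|}
      \hline
      \(y\) & 1 & 2 & 3 & 4 & 5 \\
      \hline
      \(\mathrm {~F} ( y )\) & \(\frac { 1 } { 10 }\) & \(\frac { 2 } { 10 }\) & \(3 k\) & \(4 k\) & \(5 k\) \\
      \hline
      \end{tabular}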
    63. Find the value of \(k\).
    64. Find the probability distribution of \(Y\). Each die is rolled once. The scores on the two dice are independent.
    65. Find the probability that the sum of the two scores equals 2
      1. The weight, in grams, of beans in a tin is normally distributed with mean \(\mu\) and standard deviation 7.8
      Given that \(10 \%\) of tins contain less than 200 g , find
    66. the value of \(\mu\)
    67. the percentage of tins that contain more than 225 g of beans. The machine settings are adjusted so that the weight, in grams, of beans in a tin is normally distributed with mean 205 and standard deviation \(\sigma\).
    68. Given that \(98 \%\) of tins contain between 200 g and 210 g find the value of \(\sigma\).
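The three parts of the beans question can be sketched with `statistics.NormalDist`, assumed here as a stand-in for the percentage-points table. Results may differ from table-based answers in the last decimal place.

```python
from statistics import NormalDist

std = NormalDist()                       # Z ~ N(0, 1)

# P(X < 200) = 0.10 with sigma = 7.8  =>  (200 - mu) / 7.8 = z_10
z_10 = std.inv_cdf(0.10)                 # about -1.2816
mu = 200 - z_10 * 7.8

# Percentage of tins containing more than 225 g
pct_over_225 = 100 * (1 - NormalDist(mu, 7.8).cdf(225))

# Adjusted machine: mean 205 and P(200 < X < 210) = 0.98.
# By symmetry about 205, P(X < 210) = 0.99.
z_99 = std.inv_cdf(0.99)                 # about 2.3263
sigma = (210 - 205) / z_99

print(round(mu, 1), round(pct_over_225, 1), round(sigma, 2))
```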
      \section*{Probability} $$\begin{aligned} & \mathrm { P } ( A \cup B ) = \mathrm { P } ( A ) + \mathrm { P } ( B ) - \mathrm { P } ( A \cap B ) \\
      & \mathrm { P } ( A \cap B ) = \mathrm { P } ( A ) \mathrm { P } ( B \mid A ) \\
      & \mathrm { P } ( A \mid B ) = \frac { \mathrm { P } ( B \mid A ) \mathrm { P } ( A ) } { \mathrm { P } ( B \mid A ) \mathrm { P } ( A ) + \mathrm { P } \left( B \mid A ^ { \prime } \right) \mathrm { P } \left( A ^ { \prime } \right) } \end{aligned}$$
      \section*{Discrete distributions} For a discrete random variable \(X\) taking values \(x _ { i }\) with probabilities \(\mathrm { P } \left( X = x _ { i } \right)\)
      Expectation (mean): \(\mathrm { E } ( X ) = \mu = \Sigma x _ { i } \mathrm { P } \left( X = x _ { i } \right)\)
      Variance: \(\operatorname { Var } ( X ) = \sigma ^ { 2 } = \Sigma \left( x _ { i } - \mu \right) ^ { 2 } \mathrm { P } \left( X = x _ { i } \right) = \Sigma x _ { i } ^ { 2 } \mathrm { P } \left( X = x _ { i } \right) - \mu ^ { 2 }\)
      For a function \(\mathrm { g } ( X ) : \mathrm { E } ( \mathrm { g } ( X ) ) = \Sigma \mathrm { g } \left( x _ { i } \right) \mathrm { P } \left( X = x _ { i } \right)\) \section*{Continuous distributions} Standard continuous distribution:
      \begin{tabular}{|l|c|c|c|}
      \hline
      Distribution of \(X\) & P.D.F. & Mean & Variance \\
      \hline
      Normal \(\mathrm { N } \left( \mu , \sigma ^ { 2 } \right)\) & \(\frac { 1 } { \sigma \sqrt { 2 \pi } } \mathrm { e } ^ { - \frac { 1 } { 2 } \left( \frac { x - \mu } { \sigma } \right) ^ { 2 } }\) & \(\mu\) & \(\sigma ^ { 2 }\) \\
      \hline
      \end{tabular}
      \section*{Correlation and regression} For a set of \(n\) pairs of values \(( x _ { i } , y _ { i } )\) $$\begin{aligned} & S _ { x x } = \Sigma \left( x _ { i } - \bar { x } \right) ^ { 2 } = \Sigma x _ { i } ^ { 2 } - \frac { \left( \Sigma x _ { i } \right) ^ { 2 } } { n } \\
      & S _ { y y } = \Sigma \left( y _ { i } - \bar { y } \right) ^ { 2 } = \Sigma y _ { i } ^ { 2 } - \frac { \left( \Sigma y _ { i } \right) ^ { 2 } } { n } \\
      & S _ { x y } = \Sigma \left( x _ { i } - \bar { x } \right) \left( y _ { i } - \bar { y } \right) = \Sigma x _ { i } y _ { i } - \frac { \left( \Sigma x _ { i } \right) \left( \Sigma y _ { i } \right) } { n } \end{aligned}$$
      The product moment correlation coefficient is $$r = \frac { S _ { x y } } { \sqrt { S _ { x x } S _ { y y } } } = \frac { \Sigma \left( x _ { i } - \bar { x } \right) \left( y _ { i } - \bar { y } \right) } { \sqrt { \left\{ \Sigma \left( x _ { i } - \bar { x } \right) ^ { 2 } \right\} \left\{ \Sigma \left( y _ { i } - \bar { y } \right) ^ { 2 } \right\} } } = \frac { \Sigma x _ { i } y _ { i } - \frac { \left( \Sigma x _ { i } \right) \left( \Sigma y _ { i } \right) } { n } } { \sqrt { \left( \Sigma x _ { i } ^ { 2 } - \frac { \left( \Sigma x _ { i } \right) ^ { 2 } } { n } \right) \left( \Sigma y _ { i } ^ { 2 } - \frac { \left( \Sigma y _ { i } \right) ^ { 2 } } { n } \right) } }$$
      The regression coefficient of \(y\) on \(x\) is \(b = \frac { S _ { x y } } { S _ { x x } } = \frac { \Sigma \left( x _ { i } - \bar { x } \right) \left( y _ { i } - \bar { y } \right) } { \Sigma \left( x _ { i } - \bar { x } \right) ^ { 2 } }\)
      Least squares regression line of \(y\) on \(x\) is \(y = a + b x\) where \(a = \bar { y } - b \bar { x }\)
      \section*{THE NORMAL DISTRIBUTION FUNCTION} The function tabulated below is \(\Phi ( z )\), defined as \(\Phi ( z ) = \frac { 1 } { \sqrt { 2 \pi } } \int _ { - \infty } ^ { z } e ^ { - \frac { 1 } { 2 } t ^ { 2 } } \mathrm {~d} t\).
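      \begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
      \hline
      \(z\) & \(\Phi ( z )\) & \(z\) & \(\Phi ( z )\) & \(z\) & \(\Phi ( z )\) & \(z\) & \(\Phi ( z )\) & \(z\) & \(\Phi ( z )\) \\
      \hline
      0.00 & 0.5000 & 0.50 & 0.6915 & 1.00 & 0.8413 & 1.50 & 0.9332 & 2.00 & 0.9772 \\
      0.01 & 0.5040 & 0.51 & 0.6950 & 1.01 & 0.8438 & 1.51 & 0.9345 & 2.02 & 0.9783 \\
      0.02 & 0.5080 & 0.52 & 0.6985 & 1.02 & 0.8461 & 1.52 & 0.9357 & 2.04 & 0.9793 \\
      0.03 & 0.5120 & 0.53 & 0.7019 & 1.03 & 0.8485 & 1.53 & 0.9370 & 2.06 & 0.9803 \\
      0.04 & 0.5160 & 0.54 & 0.7054 & 1.04 & 0.8508 & 1.54 & 0.9382 & 2.08 & 0.9812 \\
      0.05 & 0.5199 & 0.55 & 0.7088 & 1.05 & 0.8531 & 1.55 & 0.9394 & 2.10 & 0.9821 \\
      0.06 & 0.5239 & 0.56 & 0.7123 & 1.06 & 0.8554 & 1.56 & 0.9406 & 2.12 & 0.9830 \\
      0.07 & 0.5279 & 0.57 & 0.7157 & 1.07 & 0.8577 & 1.57 & 0.9418 & 2.14 & 0.9838 \\
      0.08 & 0.5319 & 0.58 & 0.7190 & 1.08 & 0.8599 & 1.58 & 0.9429 & 2.16 & 0.9846 \\
      0.09 & 0.5359 & 0.59 & 0.7224 & 1.09 & 0.8621 & 1.59 & 0.9441 & 2.18 & 0.9854 \\
      0.10 & 0.5398 & 0.60 & 0.7257 & 1.10 & 0.8643 & 1.60 & 0.9452 & 2.20 & 0.9861 \\
      0.11 & 0.5438 & 0.61 & 0.7291 & 1.11 & 0.8665 & 1.61 & 0.9463 & 2.22 & 0.9868 \\
      0.12 & 0.5478 & 0.62 & 0.7324 & 1.12 & 0.8686 & 1.62 & 0.9474 & 2.24 & 0.9875 \\
      0.13 & 0.5517 & 0.63 & 0.7357 & 1.13 & 0.8708 & 1.63 & 0.9484 & 2.26 & 0.9881 \\
      0.14 & 0.5557 & 0.64 & 0.7389 & 1.14 & 0.8729 & 1.64 & 0.9495 & 2.28 & 0.9887 \\
      0.15 & 0.5596 & 0.65 & 0.7422 & 1.15 & 0.8749 & 1.65 & 0.9505 & 2.30 & 0.9893 \\
      0.16 & 0.5636 & 0.66 & 0.7454 & 1.16 & 0.8770 & 1.66 & 0.9515 & 2.32 & 0.9898 \\
      0.17 & 0.5675 & 0.67 & 0.7486 & 1.17 & 0.8790 & 1.67 & 0.9525 & 2.34 & 0.9904 \\
      0.18 & 0.5714 & 0.68 & 0.7517 & 1.18 & 0.8810 & 1.68 & 0.9535 & 2.36 & 0.9909 \\
      0.19 & 0.5753 & 0.69 & 0.7549 & 1.19 & 0.8830 & 1.69 & 0.9545 & 2.38 & 0.9913 \\
      0.20 & 0.5793 & 0.70 & 0.7580 & 1.20 & 0.8849 & 1.70 & 0.9554 & 2.40 & 0.9918 \\
      0.21 & 0.5832 & 0.71 & 0.7611 & 1.21 & 0.8869 & 1.71 & 0.9564 & 2.42 & 0.9922 \\
      0.22 & 0.5871 & 0.72 & 0.7642 & 1.22 & 0.8888 & 1.72 & 0.9573 & 2.44 & 0.9927 \\
      0.23 & 0.5910 & 0.73 & 0.7673 & 1.23 & 0.8907 & 1.73 & 0.9582 & 2.46 & 0.9931 \\
      0.24 & 0.5948 & 0.74 & 0.7704 & 1.24 & 0.8925 & 1.74 & 0.9591 & 2.48 & 0.9934 \\
      0.25 & 0.5987 & 0.75 & 0.7734 & 1.25 & 0.8944 & 1.75 & 0.9599 & 2.50 & 0.9938 \\
      0.26 & 0.6026 & 0.76 & 0.7764 & 1.26 & 0.8962 & 1.76 & 0.9608 & 2.55 & 0.9946 \\
      0.27 & 0.6064 & 0.77 & 0.7794 & 1.27 & 0.8980 & 1.77 & 0.9616 & 2.60 & 0.9953 \\
      0.28 & 0.6103 & 0.78 & 0.7823 & 1.28 & 0.8997 & 1.78 & 0.9625 & 2.65 & 0.9960 \\
      0.29 & 0.6141 & 0.79 & 0.7852 & 1.29 & 0.9015 & 1.79 & 0.9633 & 2.70 & 0.9965 \\
      0.30 & 0.6179 & 0.80 & 0.7881 & 1.30 & 0.9032 & 1.80 & 0.9641 & 2.75 & 0.9970 \\
      0.31 & 0.6217 & 0.81 & 0.7910 & 1.31 & 0.9049 & 1.81 & 0.9649 & 2.80 & 0.9974 \\
      0.32 & 0.6255 & 0.82 & 0.7939 & 1.32 & 0.9066 & 1.82 & 0.9656 & 2.85 & 0.9978 \\
      0.33 & 0.6293 & 0.83 & 0.7967 & 1.33 & 0.9082 & 1.83 & 0.9664 & 2.90 & 0.9981 \\
      0.34 & 0.6331 & 0.84 & 0.7995 & 1.34 & 0.9099 & 1.84 & 0.9671 & 2.95 & 0.9984 \\
      0.35 & 0.6368 & 0.85 & 0.8023 & 1.35 & 0.9115 & 1.85 & 0.9678 & 3.00 & 0.9987 \\
      0.36 & 0.6406 & 0.86 & 0.8051 & 1.36 & 0.9131 & 1.86 & 0.9686 & 3.05 & 0.9989 \\
      0.37 & 0.6443 & 0.87 & 0.8078 & 1.37 & 0.9147 & 1.87 & 0.9693 & 3.10 & 0.9990 \\
      0.38 & 0.6480 & 0.88 & 0.8106 & 1.38 & 0.9162 & 1.88 & 0.9699 & 3.15 & 0.9992 \\
      0.39 & 0.6517 & 0.89 & 0.8133 & 1.39 & 0.9177 & 1.89 & 0.9706 & 3.20 & 0.9993 \\
      0.40 & 0.6554 & 0.90 & 0.8159 & 1.40 & 0.9192 & 1.90 & 0.9713 & 3.25 & 0.9994 \\
      0.41 & 0.6591 & 0.91 & 0.8186 & 1.41 & 0.9207 & 1.91 & 0.9719 & 3.30 & 0.9995 \\
      0.42 & 0.6628 & 0.92 & 0.8212 & 1.42 & 0.9222 & 1.92 & 0.9726 & 3.35 & 0.9996 \\
      0.43 & 0.6664 & 0.93 & 0.8238 & 1.43 & 0.9236 & 1.93 & 0.9732 & 3.40 & 0.9997 \\
      0.44 & 0.6700 & 0.94 & 0.8264 & 1.44 & 0.9251 & 1.94 & 0.9738 & 3.50 & 0.9998 \\
      0.45 & 0.6736 & 0.95 & 0.8289 & 1.45 & 0.9265 & 1.95 & 0.9744 & 3.60 & 0.9998 \\
      0.46 & 0.6772 & 0.96 & 0.8315 & 1.46 & 0.9279 & 1.96 & 0.9750 & 3.70 & 0.9999 \\
      0.47 & 0.6808 & 0.97 & 0.8340 & 1.47 & 0.9292 & 1.97 & 0.9756 & 3.80 & 0.9999 \\
      0.48 & 0.6844 & 0.98 & 0.8365 & 1.48 & 0.9306 & 1.98 & 0.9761 & 3.90 & 1.0000 \\
      0.49 & 0.6879 & 0.99 & 0.8389 & 1.49 & 0.9319 & 1.99 & 0.9767 & 4.00 & 1.0000 \\
      0.50 & 0.6915 & 1.00 & 0.8413 & 1.50 & 0.9332 & 2.00 & 0.9772 & & \\
      \hline
      \end{tabular}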
      \section*{PERCENTAGE POINTS OF THE NORMAL DISTRIBUTION} The values \(z\) in the table are those which a random variable \(Z \sim N ( 0,1 )\) exceeds with probability \(p\); that is, \(\mathrm { P } ( \mathrm { Z } > \mathrm { z } ) = 1 - \Phi ( \mathrm { z } ) = p\).
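      \begin{tabular}{|c|c|c|c|}
      \hline
      \(p\) & \(z\) & \(p\) & \(z\) \\
      \hline
      0.5000 & 0.0000 & 0.0500 & 1.6449 \\
      0.4000 & 0.2533 & 0.0250 & 1.9600 \\
      0.3000 & 0.5244 & 0.0100 & 2.3263 \\
      0.2000 & 0.8416 & 0.0050 & 2.5758 \\
      0.1500 & 1.0364 & 0.0010 & 3.0902 \\
      0.1000 & 1.2816 & 0.0005 & 3.2905 \\
      \hline
      \end{tabular}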
    Edexcel S1 2003 June Q1
    1. In a particular week, a dentist treats 100 patients. The length of time, to the nearest minute, for each patient's treatment is summarised in the table below.
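    \begin{tabular}{|l|c|c|c|c|c|c|}
    \hline
    Time (minutes) & \(4 - 7\) & 8 & \(9 - 10\) & 11 & \(12 - 16\) & \(17 - 20\) \\
    \hline
    Number of patients & 12 & 20 & 18 & 22 & 15 & 13 \\
    \hline
    \end{tabular}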
    Draw a histogram to illustrate these data.
    Edexcel S1 2003 June Q2
    2. The lifetimes of batteries used for a computer game have a mean of 12 hours and a standard deviation of 3 hours. Battery lifetimes may be assumed to be normally distributed. Find the lifetime, \(t\) hours, of a battery such that 1 battery in 5 will have a lifetime longer than \(t\).
    Edexcel S1 2003 June Q3
    3. A company owns two petrol stations \(P\) and \(Q\) along a main road. Total daily sales in the same week for \(P ( \pounds p )\) and for \(Q ( \pounds q )\) are summarised in the table below.
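    \begin{tabular}{|l|c|c|}
    \hline
     & \(p\) & \(q\) \\
    \hline
    Monday & 4760 & 5380 \\
    Tuesday & 5395 & 4460 \\
    Wednesday & 5840 & 4640 \\
    Thursday & 4650 & 5450 \\
    Friday & 5365 & 4340 \\
    Saturday & 4990 & 5550 \\
    Sunday & 4365 & 5840 \\
    \hline
    \end{tabular}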
    When these data are coded using \(x = \frac { p - 4365 } { 100 }\) and \(y = \frac { q - 4340 } { 100 }\), $$\Sigma x = 48.1 , \Sigma y = 52.8 , \Sigma x ^ { 2 } = 486.44 , \Sigma y ^ { 2 } = 613.22 \text { and } \Sigma x y = 204.95 .$$
    1. Calculate \(S _ { x y } , S _ { x x }\) and \(S _ { y y }\).
    2. Calculate, to 3 significant figures, the value of the product moment correlation coefficient between \(x\) and \(y\).
      1. Write down the value of the product moment correlation coefficient between \(p\) and \(q\).
      2. Give an interpretation of this value.
    Edexcel S1 2003 June Q4
    4. The discrete random variable \(X\) has probability function $$\mathrm { P } ( X = x ) = \begin{cases} k \left( x ^ { 2 } - 9 \right) , & x = 4,5,6 \\
    0 , & \text { otherwise } \end{cases}$$ where \(k\) is a positive constant.
    1. Show that \(k = \frac { 1 } { 50 }\).
    2. Find \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\).
    3. Find \(\operatorname { Var } ( 2 X - 3 )\).
    Edexcel S1 2003 June Q5
    5. The random variable \(X\) represents the number on the uppermost face when a fair die is thrown.
    1. Write down the name of the probability distribution of \(X\).
    2. Calculate the mean and the variance of \(X\). Three fair dice are thrown and the numbers on the uppermost faces are recorded.
    3. Find the probability that all three numbers are 6 .
    4. Write down all the different ways of scoring a total of 16 when the three numbers are added together.
    5. Find the probability of scoring a total of 16 .
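As an illustrative check on the dice question, the sketch below enumerates all \(6^3\) equally likely outcomes for three fair dice. Listing the ways of scoring 16 as unordered combinations is an assumption of this sketch; the probability counts ordered outcomes.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)

# Discrete uniform distribution on 1..6: mean and variance
mean = Fraction(sum(faces), 6)
variance = Fraction(sum(f * f for f in faces), 6) - mean**2

# All ordered outcomes for three dice
rolls = list(product(faces, repeat=3))
p_all_six = Fraction(sum(r == (6, 6, 6) for r in rolls), len(rolls))

# Unordered combinations giving a total of 16, and the probability of that total
ways_16 = sorted({tuple(sorted(r)) for r in rolls if sum(r) == 16})
p_total_16 = Fraction(sum(sum(r) == 16 for r in rolls), len(rolls))

print(mean, variance, p_all_six, ways_16, p_total_16)
```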
    Edexcel S1 2003 June Q6
    6. The number of bags of potato crisps sold per day in a bar was recorded over a two-week period. The results are shown below. $$20,15,10,30,33,40,5,11,13,20,25,42,31,17$$
    1. Calculate the mean of these data.
    2. Draw a stem and leaf diagram to represent these data.
    3. Find the median and the quartiles of these data. An outlier is an observation that falls either \(1.5 \times\) (interquartile range) above the upper quartile or \(1.5 \times\) (interquartile range) below the lower quartile.
    4. Determine whether or not any items of data are outliers.
    5. On graph paper draw a box plot to represent these data. Show your scale clearly.
    6. Comment on the skewness of the distribution of bags of crisps sold per day. Justify your answer.
    Edexcel S1 2003 June Q7
    7. Eight students took tests in mathematics and physics. The marks for each student are given in the table below where \(m\) represents the mathematics mark and \(p\) the physics mark.
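    \begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
    \hline
    \multicolumn{2}{|c|}{Student} & \(A\) & \(B\) & \(C\) & \(D\) & \(E\) & \(F\) & \(G\) & \(H\) \\
    \hline
    \multirow{2}{*}{Mark} & \(m\) & 9 & 14 & 13 & 10 & 7 & 8 & 20 & 17 \\
    \cline { 2 - 10 }
     & \(p\) & 11 & 23 & 21 & 15 & 19 & 10 & 31 & 26 \\
    \hline
    \end{tabular}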
    A science teacher believes that students' marks in physics depend upon their mathematical ability. The teacher decides to investigate this relationship using the test marks.
    1. Write down which is the explanatory variable in this investigation.
    2. Draw a scatter diagram to illustrate these data.
    3. Showing your working, find the equation of the regression line of \(p\) on \(m\).
    4. Draw the regression line on your scatter diagram. A ninth student was absent for the physics test, but she sat the mathematics test and scored 15 .
    5. Using this model, estimate the mark she would have scored in the physics test.
    Edexcel S2 Q3
    3. In a sack containing a large number of beads \(\frac { 1 } { 4 }\) are coloured gold and the remainder are of different colours. A group of children use some of the beads in a craft lesson and do not replace them. Afterwards the teacher wishes to know whether or not the proportion of gold beads left in the sack has changed. He selects a random sample of 20 beads and finds that 2 of them are coloured gold. Stating your hypotheses clearly test, at the \(10 \%\) level of significance, whether or not there is evidence that the proportion of gold beads has changed.
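A minimal sketch of the two-tailed binomial test for the beads question, assuming \(H_0: p = 0.25\) against \(H_1: p \neq 0.25\) and comparing the lower-tail probability with half the significance level. The helper `binom_cdf` is written out here rather than taken from a library.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n, p0 = 20, 0.25          # sample size and proportion under H0
observed = 2              # gold beads found in the sample
alpha = 0.10              # two-tailed significance level

p_lower_tail = binom_cdf(observed, n, p0)   # P(X <= 2 | H0)
reject_h0 = p_lower_tail < alpha / 2        # 5% in each tail

print(round(p_lower_tail, 4), reject_h0)
```

Since the tail probability exceeds 0.05, this sketch does not reject \(H_0\): there is insufficient evidence that the proportion of gold beads has changed.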
    Find the probability of
    (c) no accidents in exactly 2 of the next 4 months.
    Edexcel S2 Q4
    4. A company always sends letters by second class post unless they are marked first class. Over a long period of time it has been established that \(20 \%\) of letters to be posted are marked first class.
    In a random selection of 10 letters to be posted, find the probability that the number marked first class is
    1. at least 3,
    2. fewer than 2. One Monday morning there are only 12 first class stamps. Given that there are 70 letters to be posted that day,
    3. use a suitable approximation to find the probability that there are enough first class stamps.
    4. State an assumption about these 70 letters that is required in order to make the calculation in part (c) valid.
    Edexcel S2 Q5
    5. The maintenance department of a college receives requests for replacement light bulbs at a rate of 2 per week. Find the probability that in a randomly chosen week the number of requests for replacement light bulbs is
    1. exactly 4,
    2. more than 5 . Three weeks before the end of term the maintenance department discovers that there are only 5 light bulbs left.
    3. Find the probability that the department can meet all requests for replacement light bulbs before the end of term. The following term the principal of the college announces a package of new measures to reduce the amount of damage to college property. In the first 4 weeks following this announcement, 3 requests for replacement light bulbs are received.
    4. Stating your hypotheses clearly test, at the \(5 \%\) level of significance, whether or not there is evidence that the rate of requests for replacement light bulbs has decreased.
    Edexcel S2 Q7
    7. In a computer game, a star moves across the screen, with constant speed, taking 1 s to travel from one side to the other. The player can stop the star by pressing a key. The object of the game is to stop the star in the middle of the screen by pressing the key exactly 0.5 s after the star first appears. Given that the player actually presses the key \(T \mathrm {~s}\) after the star first appears, a simple model of the game assumes that \(T\) is a continuous uniform random variable defined over the interval \([ 0,1 ]\).
    1. Write down \(\mathrm { P } ( T < 0.2 )\).
    2. Write down \(\mathrm { E } ( T )\).
    3. Use integration to find \(\operatorname { Var } ( T )\). A group of 20 children each play this game once.
    4. Find the probability that no more than 4 children stop the star in less than 0.2 s . The children are allowed to practise this game so that this continuous uniform model is no longer applicable.
    5. Explain how you would expect the mean and variance of \(T\) to change. It is found that a more appropriate model of the game when played by experienced players assumes that \(T\) has a probability density function \(\mathrm { g } ( t )\) given by $$\mathrm { g } ( t ) = \begin{cases} 4 t , & 0 \leq t \leq 0.5 \\
      4 - 4 t , & 0.5 \leq t \leq 1 \\
      0 , & \text { otherwise } \end{cases}$$
    6. Using this model show that \(\mathrm { P } ( T < 0.2 ) = 0.08\). A group of 75 experienced players each played this game once.
    7. Using a suitable approximation, find the probability that more than 7 of them stop the star in less than 0.2 s.
      \section*{END}
      \section*{Edexcel GCE
      Statistics S2
      (New Syllabus)
      Advanced/Advanced Subsidiary}
      Wednesday 23 January 2002 - Afternoon
      Time: 1 hour 30 minutes
      Items included with question papers: Nil
      Materials required for examination: Answer Book (AB16), Graph Paper (ASG2), Mathematical Formulae (Lilac)
      Candidates may use any calculator EXCEPT those with the facility for symbolic algebra, differentiation and/or integration. Thus candidates may NOT use calculators such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP 48G.
      In the boxes on the answer book, write the name of the examining body (Edexcel), your centre number, candidate number, the unit title (Statistics S2), the paper reference (6684), your surname, other name and signature.
      Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.
      Full marks may be obtained for answers to ALL questions.
      This paper has 7 questions. You must ensure that your answers to parts of questions are clearly labelled.
      You must show sufficient working to make your methods clear to the Examiner. Answers without working may gain no credit.
      1. Explain what you understand by
      2. a population,
      3. a statistic.
      A questionnaire concerning attitudes to classes in a college was completed by a random sample of 50 students. The students gave the college a mean approval rating of 75\%.
    8. Identify the population and the statistic in this situation.
    9. Explain what you understand by the sampling distribution of this statistic.
      2. The number of houses sold per week by a firm of estate agents follows a Poisson distribution with mean 2.5. The firm appoints a new salesman and wants to find out whether or not house sales increase as a result. After the appointment of the salesman, the number of house sales in a randomly chosen 4-week period is 14. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not the new salesman has increased house sales.
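As an illustrative check only (it is not part of the paper), the upper-tail probability behind this test can be computed directly. The sketch assumes that, under the null hypothesis, the weekly mean of 2.5 scales to 10 over the 4-week period; the helper name `poisson_cdf` is hypothetical.

```python
import math

def poisson_cdf(k, lam):
    """Return P(X <= k) for X ~ Po(lam) by summing the probability mass function."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

# Assumption: under H0 the mean over 4 weeks is 4 * 2.5 = 10.
lam_h0 = 4 * 2.5
observed = 14

# One-tailed test for an increase: P(X >= 14) = 1 - P(X <= 13).
p_value = 1 - poisson_cdf(observed - 1, lam_h0)
reject_h0 = p_value < 0.05

print(f"P(X >= {observed}) = {p_value:.4f}")
print("Reject H0 at the 5% level:", reject_h0)
```

The tail probability is compared with 0.05 because the alternative hypothesis is one-sided (an increase in sales).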
      3. An airline knows that overall \(3 \%\) of passengers do not turn up for flights. The airline decides to adopt a policy of selling more tickets than there are seats on a flight. For an aircraft with 196 seats, the airline sold 200 tickets for a particular flight.
    10. Write down a suitable model for the number of passengers who do not turn up for this flight after buying a ticket. By using a suitable approximation, find the probability that
    11. more than 196 passengers turn up for this flight,
    12. there is at least one empty seat on this flight.
      4. Jean catches a bus to work every morning. According to the timetable the bus is due at 8 a.m., but Jean knows that the bus can arrive at a random time between 5 minutes early and 9 minutes late. The random variable \(X\) represents the time, in minutes, after 7.55 a.m. when the bus arrives.
    13. Suggest a suitable model for the distribution of \(X\) and specify it fully.
    14. Calculate the mean time of arrival of the bus.
    15. Find the cumulative distribution function of \(X\). Jean will be late for work if the bus arrives after 8.05 a.m.
    16. Find the probability that Jean is late for work.
      5. An Internet service provider has a large number of users regularly connecting to its computers. On average only 3 users every hour fail to connect to the Internet at their first attempt.
    17. Give 2 reasons why a Poisson distribution might be a suitable model for the number of failed connections every hour.
    18. Find the probability that in a randomly chosen hour
      1. all Internet users connect at their first attempt,
      2. more than 4 users fail to connect at their first attempt.
    19. Write down the distribution of the number of users failing to connect at their first attempt in an 8-hour period.
    20. Using a suitable approximation, find the probability that 12 or more users fail to connect at their first attempt in a randomly chosen 8-hour period.
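The quantities in this question can be checked numerically. The following is a minimal sketch, assuming a Poisson model with mean 3 per hour (so mean 24 over 8 hours) and a normal approximation with a continuity correction; the helper names `poisson_cdf` and `normal_cdf` are hypothetical and the code is not part of the paper.

```python
import math

def poisson_cdf(k, lam):
    """Return P(X <= k) for X ~ Po(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

def normal_cdf(z):
    """Return the standard normal cumulative distribution function at z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Hourly model: X ~ Po(3).
p_none = poisson_cdf(0, 3)             # all users connect at the first attempt
p_more_than_4 = 1 - poisson_cdf(4, 3)  # more than 4 failures in an hour

# Eight-hour model: Y ~ Po(24), approximated by N(24, 24).
lam = 3 * 8
exact = 1 - poisson_cdf(11, lam)       # exact P(Y >= 12)
z = (11.5 - lam) / math.sqrt(lam)      # continuity correction: P(Y >= 12) ~ P(N > 11.5)
approx = 1 - normal_cdf(z)

print(f"P(X = 0) = {p_none:.4f}, P(X > 4) = {p_more_than_4:.4f}")
print(f"P(Y >= 12): exact {exact:.4f}, normal approximation {approx:.4f}")
```

Printing the exact Poisson value next to the approximation shows how close the normal model is for a mean of this size.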
      6. The owner of a small restaurant decides to change the menu. A trade magazine claims that \(40 \%\) of all diners choose organic foods when eating away from home. On a randomly chosen day there are 20 diners eating in the restaurant.
    21. Assuming the claim made by the trade magazine to be correct, suggest a suitable model to describe the number of diners \(X\) who choose organic foods.
    22. Find \(\mathrm { P } ( 5 < X < 15 )\).
    23. Find the mean and standard deviation of \(X\). The owner decides to survey her customers before finalising the new menu. She surveys 10 randomly chosen diners and finds 8 who prefer eating organic foods.
    24. Test, at the \(5 \%\) level of significance, whether or not there is reason to believe that the proportion of diners in her restaurant who prefer to eat organic foods is higher than the trade magazine's claim. State your hypotheses clearly.
      7. A continuous random variable \(X\) has cumulative distribution function \(\mathrm { F } ( x )\) given by $$\mathrm { F } ( x ) = \left\{ \begin{array} { l r } 0 , & x < 0 , \\
      k x ^ { 2 } + 2 k x , & 0 \leq x \leq 2 , \\
      8 k , & x > 2 . \end{array} \right.$$
    25. Show that \(k = \frac { 1 } { 8 }\).
    26. Find the median of \(X\).
    27. Find the probability density function \(\mathrm { f } ( x )\).
    28. Sketch \(\mathrm { f } ( x )\) for all values of \(x\).
    29. Write down the mode of \(X\).
    30. Find \(\mathrm { E } ( X )\).
    31. Comment on the skewness of this distribution.
      This paper has seven questions.
      1. The manager of a leisure club is considering a change to the club rules. The club has a large membership and the manager wants to take the views of the members into consideration before deciding whether or not to make the change.
      2. Explain briefly why the manager might prefer to use a sample survey rather than a census to obtain the views.
      3. Suggest a suitable sampling frame.
      4. Identify the sampling units.
      5. A random sample \(X _ { 1 } , X _ { 2 } , \ldots , X _ { n }\) is taken from a finite population. A statistic \(Y\) is based on this sample.
      6. Explain what you understand by the statistic \(Y\).
      7. Give an example of a statistic.
      8. Explain what you understand by the sampling distribution of \(Y\).
      9. The continuous random variable \(R\) is uniformly distributed on the interval \(\alpha \leq R \leq \beta\). Given that \(\mathrm { E } ( R ) = 3\) and \(\operatorname { Var } ( R ) = \frac { 25 } { 3 }\), find
      10. the value of \(\alpha\) and the value of \(\beta\),
      11. \(\mathrm { P } ( R < 6.6 )\).
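One way to check an answer to this question is to solve the two standard continuous uniform results, \(\mathrm{E}(R) = (\alpha + \beta)/2\) and \(\operatorname{Var}(R) = (\beta - \alpha)^2 / 12\), for the end points. The sketch below is illustrative only and its variable names are hypothetical.

```python
import math

mean = 3
variance = 25 / 3

# Var(R) = (beta - alpha)^2 / 12 gives the width of the interval.
width = math.sqrt(12 * variance)

# E(R) = (alpha + beta) / 2 places the interval symmetrically about the mean.
alpha = mean - width / 2
beta = mean + width / 2

# For a continuous uniform variable, P(R < r) = (r - alpha) / (beta - alpha).
p_less_than_6_6 = (6.6 - alpha) / (beta - alpha)

print(f"alpha = {alpha:.1f}, beta = {beta:.1f}, P(R < 6.6) = {p_less_than_6_6:.2f}")
```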
      12. Past records show that \(20 \%\) of customers who buy crisps from a large supermarket buy them in single packets. During a particular day a random sample of 25 customers who had bought crisps was taken and 2 of them had bought them in single packets.
      13. Use these data to test, at the \(5 \%\) level of significance, whether or not the percentage of customers who bought crisps in single packets that day was lower than usual. State your hypotheses clearly.
        (6)
      At the same supermarket, the manager thinks that the probability of a customer buying a bumper pack of crisps is 0.03. To test whether or not this hypothesis is true the manager decides to take a random sample of 300 customers.
    32. Stating your hypotheses clearly, find the critical region to enable the manager to test whether or not there is evidence that the probability is different from 0.03. The probability for each tail of the region should be as close as possible to \(2.5 \%\).
    33. Write down the significance level of this test.
      5. A garden centre sells canes of nominal length 150 cm. The canes are bought from a supplier who uses a machine to cut canes of length \(L\) where \(L \sim \mathrm {~N} \left( \mu , 0.3 ^ { 2 } \right)\).
    34. Find the value of \(\mu\), to the nearest 0.1 cm, such that there is only a \(5 \%\) chance that a cane supplied to the garden centre will have length less than 150 cm. A customer buys 10 of these canes from the garden centre.
    35. Find the probability that at most 2 of the canes have length less than 150 cm. Another customer buys 500 canes.
    36. Using a suitable approximation, find the probability that fewer than 35 of the canes will have length less than 150 cm.
      6. From past records, a manufacturer of twine knows that faults occur in the twine at random and at a rate of 1.5 per 25 m.
    37. Find the probability that in a randomly chosen 25 m length of twine there will be exactly 4 faults. The twine is usually sold in balls of length 100 m. A customer buys three balls of twine.
    38. Find the probability that only one of them will have fewer than 6 faults. As a special order a ball of twine containing 500 m is produced.
    39. Using a suitable approximation, find the probability that it will contain between 23 and 33 faults inclusive.
      7. The continuous random variable \(X\) has probability density function $$\mathrm { f } ( x ) = \begin{cases} \frac { x } { 15 } , & 0 \leq x \leq 2 \\
      \frac { 2 } { 15 } , & 2 < x < 7 \\
      \frac { 4 } { 9 } - \frac { 2 x } { 45 } , & 7 \leq x \leq 10 \\
      0 , & \text { otherwise } \end{cases}$$
    40. Sketch \(\mathrm { f } ( x )\) for all values of \(x\).
      1. Find expressions for the cumulative distribution function, \(\mathrm { F } ( x )\), for \(0 \leq x \leq 2\) and for \(7 \leq x \leq 10\).
      2. Show that for \(2 < x < 7 , \mathrm {~F} ( x ) = \frac { 2 x } { 15 } - \frac { 2 } { 15 }\).
      3. Specify \(\mathrm { F } ( x )\) for \(x < 0\) and for \(x > 10\).
    41. Find \(\mathrm { P } ( X \leq 8.2 )\).
    42. Find, to 3 significant figures, \(\mathrm { E } ( X )\).
      Paper Reference(s) 6684
      \section*{Edexcel GCE
      Statistics S2
      Advanced/Advanced Subsidiary Friday 24 January 2003 - Morning Time: 1 hour 30 minutes}
      This paper has six questions.
      1. An engineer measures, to the nearest cm, the lengths of metal rods.
      2. Suggest a suitable model to represent the difference between the true lengths and the measured lengths.
      3. Find the probability that for a randomly chosen rod the measured length will be within 0.2 cm of the true length.
      Two rods are chosen at random.
    43. Find the probability that for both rods the measured lengths will be within 0.2 cm of their true lengths.
      2. A single observation \(x\) is to be taken from a Poisson distribution with parameter \(\lambda\). This observation is to be used to test \(\mathrm { H } _ { 0 } : \lambda = 7\) against \(\mathrm { H } _ { 1 } : \lambda \neq 7\).
    44. Using a \(5 \%\) significance level, find the critical region for this test assuming that the probability of rejection in either tail is as close as possible to \(2.5 \%\).
    45. Write down the significance level of this test. The actual value of \(x\) obtained was 5.
    46. State a conclusion that can be drawn based on this value.
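The "as close as possible to 2.5%" rule for each tail can be expressed as a small search over candidate cut-off values. The following is a minimal sketch under the null hypothesis \(\lambda = 7\); the helper `poisson_cdf`, the search range of 0 to 39 and the variable names are assumptions made for illustration and are not part of the paper.

```python
import math

def poisson_cdf(k, lam):
    """Return P(X <= k) for X ~ Po(lam)."""
    return sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k + 1))

lam = 7
target = 0.025

# Lower tail: choose c_low so that P(X <= c_low) is as close as possible to 2.5%.
c_low = min(range(0, 40), key=lambda c: abs(poisson_cdf(c, lam) - target))

# Upper tail: choose c_high so that P(X >= c_high) = 1 - P(X <= c_high - 1)
# is as close as possible to 2.5%.
c_high = min(range(1, 40), key=lambda c: abs((1 - poisson_cdf(c - 1, lam)) - target))

lower_tail = poisson_cdf(c_low, lam)
upper_tail = 1 - poisson_cdf(c_high - 1, lam)
significance = lower_tail + upper_tail  # actual significance level of the test

observed = 5
in_critical_region = observed <= c_low or observed >= c_high

print(f"Critical region: X <= {c_low} or X >= {c_high}")
print(f"Actual significance level: {significance:.4f}")
print(f"x = {observed} in critical region: {in_critical_region}")
```

Because the distribution is discrete, the actual significance level is the sum of the two tail probabilities, not exactly 5%.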
      3. A botanist suggests that the number of a particular variety of weed growing in a meadow can be modelled by a Poisson distribution.
    47. Write down two conditions that must apply for this model to be applicable. Assuming this model and a mean of 0.7 weeds per \(\mathrm { m } ^ { 2 }\), find
    48. the probability that in a randomly chosen plot of size \(4 \mathrm {~m} ^ { 2 }\) there will be fewer than 3 of these weeds.
    49. Using a suitable approximation, find the probability that in a plot of \(100 \mathrm {~m} ^ { 2 }\) there will be more than 66 of these weeds.
      4. The continuous random variable \(X\) has cumulative distribution function $$\mathrm { F } ( x ) = \begin{cases} 0 , & x < 0 \\
      \frac { 1 } { 3 } x ^ { 2 } \left( 4 - x ^ { 2 } \right) , & 0 \leq x \leq 1 \\
      1 , & x > 1 \end{cases}$$
    50. Find \(\mathrm { P } ( X > 0.7 )\).
    51. Find the probability density function \(\mathrm { f } ( x )\) of \(X\).
    52. Calculate \(\mathrm { E } ( X )\) and show that, to 3 decimal places, \(\operatorname { Var } ( X ) = 0.057\). One measure of skewness is $$\frac { \text { Mean - Mode } } { \text { Standard deviation } }$$
    53. Evaluate the skewness of the distribution of \(X\).
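The moments in this question can be verified with exact rational arithmetic. The sketch below assumes the density obtained by differentiating the given \(\mathrm{F}(x)\), namely \(\mathrm{f}(x) = \frac{1}{3}(8x - 4x^3)\) on \(0 \leq x \leq 1\), whose derivative vanishes at \(x = \sqrt{2/3}\) (the mode). The `coeffs` dictionary and the `moment` helper are hypothetical names, and the code is a check, not a model answer.

```python
from fractions import Fraction
import math

# Assumed density f(x) = (8/3) x - (4/3) x^3 on [0, 1], stored as {power: coefficient}.
coeffs = {1: Fraction(8, 3), 3: Fraction(-4, 3)}

def moment(n):
    """Return E(X^n): each term c * x^p contributes c / (p + n + 1) when integrated over [0, 1]."""
    return sum(c / (p + n + 1) for p, c in coeffs.items())

mean = moment(1)
variance = moment(2) - mean**2

# Mode: f'(x) = (8 - 12 x^2) / 3 = 0 gives x = sqrt(2/3) inside [0, 1].
mode = math.sqrt(2 / 3)

# Skewness measure defined in the question: (mean - mode) / standard deviation.
skewness = (float(mean) - mode) / math.sqrt(variance)

print(f"E(X) = {mean}, Var(X) = {variance} (about {float(variance):.3f})")
print(f"mode = {mode:.4f}, skewness = {skewness:.3f}")
```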
      5. A farmer noticed that some of the eggs laid by his hens had double yolks. He estimated the probability of this happening to be 0.05. Eggs are packed in boxes of 12. Find the probability that in a box, the number of eggs with double yolks will be
    54. exactly one,
    55. more than three. A customer bought three boxes.
    56. Find the probability that only 2 of the boxes contained exactly 1 egg with a double yolk. The farmer delivered 10 boxes to a local shop.
    57. Using a suitable approximation, find the probability that the delivery contained at least 9 eggs with double yolks. The weight of an individual egg can be modelled by a normal distribution with mean 65 g and standard deviation 2.4 g.
    58. Find the probability that a randomly chosen egg weighs more than 68 g.
      (3)
      6. A magazine has a large number of subscribers who each pay a membership fee that is due on January 1st each year. Not all subscribers pay their fee by the due date. Based on correspondence from the subscribers, the editor of the magazine believes that \(40 \%\) of subscribers wish to change the name of the magazine. Before making this change the editor decides to carry out a sample survey to obtain the opinions of the subscribers. He uses only those members who have paid their fee on time.
    59. Define the population associated with the magazine.
    60. Suggest a suitable sampling frame for the survey.
    61. Identify the sampling units.
    62. Give one advantage and one disadvantage that would have resulted from the editor using a census rather than a sample survey. As a pilot study the editor took a random sample of 25 subscribers.
    63. Assuming that the editor's belief is correct, find the probability that exactly 10 of these subscribers agreed with changing the name. In fact only 6 subscribers agreed to the name being changed.
    64. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not the percentage agreeing to the change is less than the editor believes. The full survey is to be carried out using 200 randomly chosen subscribers.
    65. Again assuming the editor's belief to be correct and using a suitable approximation, find the probability that in this sample there will be at least 71 but fewer than 83 subscribers who agree to the name being changed.
      \section*{END}
      Paper Reference(s) 6684
      \section*{Edexcel GCE
      Statistics S2} Advanced/Advanced Subsidiary
      Tuesday 17 June 2003 - Afternoon
      Time: 1 hour 30 minutes
      1. Explain briefly what you understand by
      2. a statistic,
      3. a sampling distribution.
      4. (a) Write down the condition needed to approximate a Poisson distribution by a Normal distribution.
      The random variable \(Y \sim \operatorname { Po } ( 30 )\).
    66. Estimate \(\mathrm { P } ( Y > 28 )\).
      3. In a town, \(30 \%\) of residents listen to the local radio station. Four residents are chosen at random.
    67. State the distribution of the random variable \(X\), the number of these four residents that listen to local radio.
    68. On graph paper, draw the probability distribution of \(X\).
    69. Write down the most likely number of these four residents that listen to the local radio station.
    70. Find \(\mathrm { E } ( X )\) and \(\operatorname { Var } ( X )\).
      4. (a) Write down the conditions under which the binomial distribution may be a suitable model to use in statistical work. A six-sided die is biased. When the die is thrown the number 5 is twice as likely to appear as any other number. All the other faces are equally likely to appear. The die is thrown repeatedly. Find the probability that
      1. the first 5 will occur on the sixth throw,
      2. in the first eight throws there will be exactly three 5s.
        5. A drinks machine dispenses lemonade into cups. It is electronically controlled to cut off the flow of lemonade randomly between 180 ml and 200 ml. The random variable \(X\) is the volume of lemonade dispensed into a cup.
    71. Specify the probability density function of \(X\) and sketch its graph.
    72. Find the probability that the machine dispenses
      1. less than 183 ml,
      2. exactly 183 ml.
    73. Calculate the inter-quartile range of \(X\).
    74. Determine the value of \(x\) such that \(\mathrm { P } ( X \geq x ) = 2 \mathrm { P } ( X \leq x )\).
    75. Interpret in words your value of \(x\).
      6. A doctor expects to see, on average, 1 patient per week with a particular disease.
    76. Suggest a suitable model for the distribution of the number of times per week that the doctor sees a patient with the disease. Give a reason for your answer.
    77. Using your model, find the probability that the doctor sees more than 3 patients with the disease in a 4 week period. The doctor decides to send information to his patients to try to reduce the number of patients he sees with the disease. In the first 6 weeks after the information is sent out, the doctor sees 2 patients with the disease.
    78. Test, at the \(5 \%\) level of significance, whether or not there is reason to believe that sending the information has reduced the number of times the doctor sees patients with the disease. State your hypotheses clearly. Medical research into the nature of the disease discovers that it can be passed from one patient to another.
    79. Explain whether or not this research supports your choice of model. Give a reason for your answer.
      7. A continuous random variable \(X\) has probability density function \(\mathrm { f } ( x )\) where $$\mathrm { f } ( x ) = \begin{cases} k \left( x ^ { 2 } + 2 x + 1 \right) , & - 1 \leq x \leq 0 \\
      0 , & \text { otherwise } \end{cases}$$ where \(k\) is a positive integer.
    80. Show that \(k = 3\). Find
    81. \(\mathrm { E } ( X )\),
    82. the cumulative distribution function \(\mathrm { F } ( x )\),
    83. \(\mathrm { P } ( - 0.3 < X < 0.3 )\).
      \section*{END}
      Paper Reference(s) 6684
      Statistics S2
      Advanced/Advanced Subsidiary
      Friday 23 January 2004 - Morning
      Time: 1 hour 30 minutes
      1. A large dental practice wishes to investigate the level of satisfaction of its patients.
      2. Suggest a suitable sampling frame for the investigation.
      3. Identify the sampling units.
      4. State one advantage and one disadvantage of using a sample survey rather than a census.
      5. Suggest a problem that might arise with the sampling frame when selecting patients.
      6. The random variable \(R\) has the binomial distribution \(\mathrm { B } ( 12,0.35 )\).
      7. Find \(\mathrm { P } ( R \geq 4 )\).
      The random variable \(S\) has the Poisson distribution with mean 2.71.
    84. Find \(\mathrm { P } ( S \leq 1 )\). The random variable \(T\) has the normal distribution \(\mathrm { N } \left( 25,5 ^ { 2 } \right)\).
    85. Find \(\mathrm { P } ( T \leq 18 )\).
      3. The discrete random variable \(X\) is distributed \(\mathrm { B } ( n , p )\).
    86. Write down the value of \(p\) that will give the most accurate estimate when approximating the binomial distribution by a normal distribution.
    87. Give a reason to support your value.
    88. Given that \(n = 200\) and \(p = 0.48\), find \(\mathrm { P } ( 90 \leq X < 105 )\).
      4. (a) Write down two conditions needed to be able to approximate the binomial distribution by the Poisson distribution.
      (2) A researcher has suggested that 1 in 150 people is likely to catch a particular virus.
      Assuming that a person catching the virus is independent of any other person catching it,
    89. find the probability that in a random sample of 12 people, exactly 2 of them catch the virus.
    90. Estimate the probability that in a random sample of 1200 people fewer than 7 catch the virus.
      5. Vehicles pass a particular point on a road at a rate of 51 vehicles per hour.
    91. Give two reasons to support the use of the Poisson distribution as a suitable model for the number of vehicles passing this point. Find the probability that in any randomly selected 10 minute interval
    92. exactly 6 cars pass this point,
    93. at least 9 cars pass this point. After the introduction of a roundabout some distance away from this point it is suggested that the number of vehicles passing it has decreased. During a randomly selected 10 minute interval 4 vehicles pass the point.
    94. Test, at the \(5 \%\) level of significance, whether or not there is evidence to support the suggestion that the number of vehicles has decreased. State your hypotheses clearly.
      (6)
      6. From past records a manufacturer of ceramic plant pots knows that \(20 \%\) of them will have defects. To monitor the production process, a random sample of 25 pots is checked each day and the number of pots with defects is recorded.
    95. Find the critical regions for a two-tailed test of the hypothesis that the probability that a plant pot has defects is 0.20 . The probability of rejection in either tail should be as close as possible to \(2.5 \%\).
    96. Write down the significance level of the above test. A garden centre sells these plant pots at a rate of 10 per week. In an attempt to increase sales, the price was reduced over a six-week period. During this period a total of 74 pots was sold.
    97. Using a \(5 \%\) level of significance, test whether or not there is evidence that the rate of sales per week has increased during this six-week period.
      7. The continuous random variable \(X\) has probability density function $$\mathrm { f } ( x ) = \begin{cases} k x ( 5 - x ) , & 0 \leq x \leq 4 \\
      0 , & \text { otherwise } \end{cases}$$ where \(k\) is a constant.
    98. Show that \(k = \frac { 3 } { 56 }\).
    99. Find the cumulative distribution function \(\mathrm { F } ( x )\) for all values of \(x\).
    100. Evaluate \(\mathrm { E } ( X )\).
    101. Find the modal value of \(X\).
    102. Verify that the median value of \(X\) lies between 2.3 and 2.5.
    103. Comment on the skewness of \(X\). Justify your answer.
      \section*{END}
      Paper Reference(s) 6684
      Statistics S2
      Advanced/Advanced Subsidiary
      Wednesday 23 June 2004 - Morning
      Time: 1 hour 30 minutes
      1. Explain briefly what you understand by
      2. a sampling frame,
      3. a statistic.
      4. The continuous random variable \(X\) is uniformly distributed over the interval \([ - 1,4 ]\).
      Find
    104. \(\mathrm { P } ( X < 2.7 )\),
    105. \(\mathrm { E } ( X )\),
    106. \(\operatorname { Var } ( X )\).
      3. Brad planted 25 seeds in his greenhouse. He has read in a gardening book that the probability of one of these seeds germinating is 0.25 . Ten of Brad's seeds germinated. He claimed that the gardening book had underestimated this probability. Test, at the \(5 \%\) level of significance, Brad's claim. State your hypotheses clearly.
      (7)
      4. (a) State two conditions under which a random variable can be modelled by a binomial distribution.
      (2) In the production of a certain electronic component it is found that \(10 \%\) are defective.
      The component is produced in batches of 20.
    107. Write down a suitable model for the distribution of defective components in a batch. Find the probability that a batch contains
    108. no defective components,
    109. more than 6 defective components.
    110. Find the mean and the variance of the defective components in a batch. A supplier buys 100 components. The supplier will receive a refund if there are more than 15 defective components.
    111. Using a suitable approximation, find the probability that the supplier will receive a refund.
      (4)
      5. (a) Explain what you understand by a critical region of a test statistic. The number of breakdowns per day in a large fleet of hire cars has a Poisson distribution with mean \(\frac { 1 } { 7 }\).
    112. Find the probability that on a particular day there are fewer than 2 breakdowns.
    113. Find the probability that during a 14-day period there are at most 4 breakdowns. The cars are maintained at a garage. The garage introduced a weekly check to try to decrease the number of cars that break down. In a randomly selected 28-day period after the checks are introduced, only 1 hire car broke down.
    114. Test, at the \(5 \%\) level of significance, whether or not the mean number of breakdowns has decreased. State your hypotheses clearly.
      (7)
      6. Minor defects occur in a particular make of carpet at a mean rate of 0.05 per \(\mathrm { m } ^ { 2 }\).
    115. Suggest a suitable model for the distribution of the number of defects in this make of carpet. Give a reason for your answer. A carpet fitter has a contract to fit this carpet in a small hotel. The hotel foyer requires \(30 \mathrm {~m} ^ { 2 }\) of this carpet. Find the probability that the foyer carpet contains
    116. exactly 2 defects,
    117. more than 5 defects. The carpet fitter orders a total of \(355 \mathrm {~m} ^ { 2 }\) of the carpet for the whole hotel.
    118. Using a suitable approximation, find the probability that this total area of carpet contains 22 or more defects.
      7. A random variable \(X\) has probability density function given by $$\mathrm { f } ( x ) = \begin{cases} \frac { 1 } { 3 } , & 0 \leq x \leq 1 \\
      \frac { 8 x ^ { 3 } } { 45 } , & 1 \leq x \leq 2 \\
      0 , & \text { otherwise } \end{cases}$$
    119. Calculate the mean of \(X\).
    120. Specify fully the cumulative distribution function \(\mathrm { F } ( x )\).
    121. Find the median of \(X\).
    122. Comment on the skewness of the distribution of \(X\).
      \section*{END}
      Paper Reference(s) 6684
      \section*{Edexcel GCE
      Statistics S2
      Advanced/Advanced Subsidiary Tuesday 25 January 2005 - Morning Time: 1 hour 30 minutes}
      1. The random variables \(R , S\) and \(T\) are distributed as follows
      $$R \sim \mathrm {~B} ( 15,0.3 ) , \quad S \sim \mathrm { Po } ( 7.5 ) , \quad T \sim \mathrm {~N} \left( 8,2 ^ { 2 } \right) .$$ Find
    123. \(\mathrm { P } ( R = 5 )\),
    124. \(\mathrm { P } ( S = 5 )\),
    125. \(\mathrm { P } ( T = 5 )\).
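The three probabilities above can be checked with a short Python sketch. It applies the binomial and Poisson probability functions directly; for the normal variable it relies on the fact that a continuous distribution assigns probability zero to any single value. Variable names are illustrative only.

```python
from math import comb, exp, factorial

# R ~ B(15, 0.3): P(R = 5) from the binomial probability function.
p_r_equals_5 = comb(15, 5) * 0.3**5 * 0.7**10

# S ~ Po(7.5): P(S = 5) from the Poisson probability function.
p_s_equals_5 = exp(-7.5) * 7.5**5 / factorial(5)

# T ~ N(8, 2^2) is continuous, so any single point has probability zero.
p_t_equals_5 = 0.0

print(round(p_r_equals_5, 4), round(p_s_equals_5, 4), p_t_equals_5)
```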
      2. (a) Explain what you understand by (i) a population and (ii) a sampling frame. The population and the sampling frame may not be the same.
    126. Explain why this might be the case.
    127. Give an example, justifying your choices, to illustrate when you might use
      1. a census,
      2. a sample.
        3. A rod of length \(2 l\) was broken into 2 parts. The point at which the rod broke is equally likely to be anywhere along the rod. The length of the shorter piece of rod is represented by the random variable \(X\).
    128. Write down the name of the probability density function of \(X\), and specify it fully.
    129. Find \(\mathrm { P } \left( X < \frac { 1 } { 3 } l \right)\).
    130. Write down the value of \(\mathrm { E } ( X )\). Two identical rods of length \(2 l\) are broken.
    131. Find the probability that both of the shorter pieces are of length less than \(\frac { 1 } { 3 } l\).
      4. In an experiment, there are 250 trials and each trial results in a success or a failure.
    132. Write down two other conditions needed to make this into a binomial experiment. It is claimed that \(10 \%\) of students can tell the difference between two brands of baked beans. In a random sample of 250 students, 40 of them were able to distinguish the difference between the two brands.
    133. Using a normal approximation, test at the \(1 \%\) level of significance whether or not the claim is justified. Use a one-tailed test.
    134. Comment on the acceptability of the assumptions you needed to carry out the test.
      5. From company records, a manager knows that the probability that a defective article is produced by a particular production line is 0.032. A random sample of 10 articles is selected from the production line.
    135. Find the probability that exactly 2 of them are defective. On another occasion, a random sample of 100 articles is taken.
    136. Using a suitable approximation, find the probability that fewer than 4 of them are defective. At a later date, a random sample of 1000 is taken.
    137. Using a suitable approximation, find the probability that more than 42 are defective.
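The three sample sizes in this question suggest three different models. The sketch below assumes an exact binomial for n = 10, a Poisson approximation with mean 3.2 for n = 100, and a normal approximation with continuity correction for n = 1000. These choices are a common reading of "suitable approximation" rather than the only acceptable one.

```python
from math import comb, exp, factorial
from statistics import NormalDist

p = 0.032

# n = 10: exact binomial, P(X = 2).
p_two_defective = comb(10, 2) * p**2 * (1 - p) ** 8

# n = 100: Poisson approximation with mean np = 3.2, P(X < 4) = P(X <= 3).
mean_100 = 100 * p
p_fewer_than_4 = sum(exp(-mean_100) * mean_100**k / factorial(k) for k in range(4))

# n = 1000: normal approximation N(np, np(1 - p)), P(X > 42) taken as P(Y > 42.5).
mean_1000 = 1000 * p
sd_1000 = (1000 * p * (1 - p)) ** 0.5
p_more_than_42 = 1 - NormalDist(mean_1000, sd_1000).cdf(42.5)

print(round(p_two_defective, 4), round(p_fewer_than_4, 4), round(p_more_than_42, 4))
```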
      6. Over a long period of time, accidents happened on a stretch of road at random at a rate of 3 per month. Find the probability that
    138. in a randomly chosen month, more than 4 accidents occurred,
    139. in a three-month period, more than 4 accidents occurred. At a later date, a speed restriction was introduced on this stretch of road. During a randomly chosen month only one accident occurred.
    140. Test, at the \(5 \%\) level of significance, whether or not there is evidence to support the claim that this speed restriction reduced the mean number of road accidents occurring per month. The speed restriction was kept on this road. Over a two-year period, 55 accidents occurred.
    141. Test, at the \(5 \%\) level of significance, whether or not there is now evidence that this speed restriction reduced the mean number of road accidents occurring per month.
      7. The random variable \(X\) has probability density function $$\mathrm { f } ( x ) = \left\{ \begin{array} { l c } k \left( - x ^ { 2 } + 5 x - 4 \right) , & 1 \leq x \leq 4 \\
      0 , & \text { otherwise } \end{array} \right.$$
    142. Show that \(k = \frac { 2 } { 9 }\). Find
    143. \(\mathrm { E } ( X )\),
    144. the mode of \(X\).
    145. the cumulative distribution function \(\mathrm { F } ( x )\) for all \(x\).
    146. Evaluate \(\mathrm { P } ( X \leq 2.5 )\),
    147. Deduce the value of the median and comment on the shape of the distribution.
      Materials required for examination: Mathematical Formulae (Lilac or Green). Items included with question papers: Nil.
      Paper Reference(s) 6684/01 \section*{Advanced/Advanced Subsidiary} \section*{Wednesday 22 June 2005 - Afternoon}
      This paper has 7 questions.
      The total mark for this paper is 75.
      1. It is estimated that \(4 \%\) of people have green eyes. In a random sample of size \(n\), the expected number of people with green eyes is 5.
      2. Calculate the value of \(n\).
      The expected number of people with green eyes in a second random sample is 3 .
    148. Find the standard deviation of the number of people with green eyes in this second sample.
      2. The continuous random variable \(X\) is uniformly distributed over the interval \([ 2,6 ]\).
    149. Write down the probability density function \(\mathrm { f } ( x )\). Find
    150. \(\mathrm { E } ( X )\),
    151. \(\operatorname { Var } ( X )\),
    152. the cumulative distribution function of \(X\), for all \(x\),
    153. \(\mathrm { P } ( 2.3 < X < 3.4 )\).
      3. The random variable \(X\) is the number of misprints per page in the first draft of a novel.
    154. State two conditions under which a Poisson distribution is a suitable model for \(X\). The number of misprints per page has a Poisson distribution with mean 2.5. Find the probability that
    155. a randomly chosen page has no misprints,
    156. the total number of misprints on 2 randomly chosen pages is more than 7. The first chapter contains 20 pages.
    157. Using a suitable approximation find, to 2 decimal places, the probability that the chapter will contain less than 40 misprints.
      4. Explain what you understand by
    158. a sampling unit,
    159. a sampling frame,
    160. a sampling distribution.
      5. In a manufacturing process, \(2 \%\) of the articles produced are defective. A batch of 200 articles is selected.
    161. Giving a justification for your choice, use a suitable approximation to estimate the probability that there are exactly 5 defective articles.
    162. Estimate the probability that there are less than 5 defective articles.
      6. A continuous random variable \(X\) has probability density function \(\mathrm { f } ( x )\) where $$\mathrm { f } ( x ) = \begin{cases} k \left( 4 x - x ^ { 3 } \right) , & 0 \leq x \leq 2 \\
      0 , & \text { otherwise } \end{cases}$$ where \(k\) is a positive constant.
    163. Show that \(k = \frac { 1 } { 4 }\). Find
    164. \(\mathrm { E } ( X )\),
    165. the mode of \(X\),
    166. the median of \(X\).
    167. Comment on the skewness of the distribution.
    168. Sketch \(\mathrm { f } ( x )\).
      7. A drugs company claims that \(75 \%\) of patients suffering from depression recover when treated with a new drug. A random sample of 10 patients with depression is taken from a doctor's records.
    169. Write down a suitable distribution to model the number of patients in this sample who recover when treated with the new drug. Given that the claim is correct,
    170. find the probability that the treatment will be successful for exactly 6 patients. The doctor believes that the claim is incorrect and the percentage who will recover is lower. From her records she took a random sample of 20 patients who had been treated with the new drug. She found that 13 had recovered.
    171. Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, the doctor's belief.
    172. From a sample of size 20, find the greatest number of patients who need to recover for the test in part (c) to be significant at the \(1 \%\) level.
    Edexcel S2 Q8
    8. The continuous random variable \(X\) has probability density function \(\mathrm { f } ( x )\) given by $$f ( x ) = \left\{ \begin{array} { l c } 2 ( x - 2 ) & 2 \leq x \leq 3 \\
    0 & \text { otherwise } \end{array} \right.$$
    1. Sketch \(\mathrm { f } ( x )\) for all values of \(x\).
    2. Write down the mode of \(X\). Find
    3. \(\mathrm { E } ( X )\),
    4. the median of \(X\).
    5. Comment on the skewness of this distribution. Give a reason for your answer. \section*{Advanced Level} \section*{Friday 23 May 2008 - Morning}
      Materials required for examination: Mathematical Formulae (Lilac or Green). Items included with question papers: Nil.
      Candidates may use any calculator allowed by the regulations of the Joint Council for Qualifications. Calculators must not have the facility for symbolic algebra manipulation, differentiation and integration, or have retrievable mathematical formulas stored in them.
      This paper has 7 questions.
      The total mark for this paper is 75. \section*{H32581A}
      1. Jean regularly takes a break from work to go to the post office. The amount of time Jean waits in the queue to be served at the post office has a continuous uniform distribution between 0 and 10 minutes.
      2. Find the mean and variance of the time Jean spends in the post office queue.
      3. Find the probability that Jean does not have to wait more than 2 minutes.
      Jean visits the post office 5 times.
    6. Find the probability that she never has to wait more than 2 minutes. Jean is in the queue when she receives a message that she must return to work for an urgent meeting. She can only wait in the queue for a further 3 minutes. Given that Jean has already been queuing for 5 minutes,
    7. find the probability that she must leave the post office queue without being served.
      2. In a large college \(58 \%\) of students are female and \(42 \%\) are male. A random sample of 100 students is chosen from the college. Using a suitable approximation find the probability that more than half the sample are female.
      3. A test statistic has a Poisson distribution with parameter \(\lambda\). Given that $$\mathrm { H } _ { 0 } : \lambda = 9 , \mathrm { H } _ { 1 } : \lambda \neq 9$$
    8. find the critical region for the test statistic such that the probability in each tail is as close as possible to \(2.5 \%\).
    9. State the probability of incorrectly rejecting \(\mathrm { H } _ { 0 }\) using this critical region.
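A critical region of this kind can be found by searching the Poisson cumulative probabilities rather than reading them from tables. The sketch below assumes the test statistic follows Po(9) under the null hypothesis and picks, for each tail, the boundary whose tail probability is closest to 0.025. The function and variable names are illustrative.

```python
from math import exp, factorial


def poisson_cdf(k: int, mean: float) -> float:
    """P(X <= k) for X ~ Po(mean)."""
    return sum(exp(-mean) * mean**i / factorial(i) for i in range(k + 1))


mean = 9
target = 0.025

# Lower tail: region X <= lower, with P(X <= lower) closest to the target.
lower = min(range(0, 40), key=lambda k: abs(poisson_cdf(k, mean) - target))

# Upper tail: region X >= upper, with P(X >= upper) = 1 - P(X <= upper - 1).
upper = min(range(1, 40), key=lambda k: abs(1 - poisson_cdf(k - 1, mean) - target))

# Probability of incorrectly rejecting the null hypothesis with this region.
actual_significance = poisson_cdf(lower, mean) + (1 - poisson_cdf(upper - 1, mean))

print(lower, upper, round(actual_significance, 4))
```

The search range of 0 to 39 is an arbitrary cut-off chosen to be comfortably wider than the plausible values for a mean of 9.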
      4. Each cell of a certain animal contains 11000 genes. It is known that each gene has a probability 0.0005 of being damaged. A cell is chosen at random.
    10. Suggest a suitable model for the distribution of the number of damaged genes in the cell.
    11. Find the mean and variance of the number of damaged genes in the cell.
    12. Using a suitable approximation, find the probability that there are at most 2 damaged genes in the cell.
      (4)
      5. Sue throws a fair coin 15 times and records the number of times it shows a head.
    13. State the distribution to model the number of times the coin shows a head. Find the probability that Sue records
    14. exactly 8 heads,
    15. at least 4 heads. Sue has a different coin which she believes is biased in favour of heads. She throws the coin 15 times and obtains 13 heads.
    16. Test Sue's belief at the \(1 \%\) level of significance. State your hypotheses clearly.
      6. A call centre agent handles telephone calls at a rate of 18 per hour.
    17. Give two reasons to support the use of a Poisson distribution as a suitable model for the number of calls per hour handled by the agent.
    18. Find the probability that in any randomly selected 15 minute interval the agent handles
      1. exactly 5 calls,
      2. more than 8 calls. The agent received some training to increase the number of calls handled per hour. During a randomly selected 30 minute interval after the training the agent handles 14 calls.
    19. Test, at the \(5 \%\) level of significance, whether or not there is evidence to support the suggestion that the rate at which the agent handles calls has increased. State your hypotheses clearly.
      7. A random variable \(X\) has probability density function given by $$f ( x ) = \begin{cases} \frac { 1 } { 2 } x & 0 \leq x < 1 \\
      k x ^ { 3 } & 1 \leq x \leq 2 \\
      0 & \text { otherwise } \end{cases}$$ where \(k\) is a constant.
    20. Show that \(k = \frac { 1 } { 5 }\).
    21. Calculate the mean of \(X\).
    22. Specify fully the cumulative distribution function \(\mathrm { F } ( x )\).
    23. Find the median of \(X\).
    24. Comment on the skewness of the distribution of \(X\).
      Materials required for examination: Mathematical Formulae (Green). Items included with question papers: Nil. \section*{Advanced} \section*{Wednesday 21 January 2009 - Afternoon}
      The marks for individual questions and the parts of questions are shown in round brackets: e.g. (2). There are 7 questions on this paper. The total mark for this paper is 75.
      1. A botanist is studying the distribution of daisies in a field. The field is divided into a number of equal sized squares. The mean number of daisies per square is assumed to be 3. The daisies are distributed randomly throughout the field.
      Find the probability that, in a randomly chosen square there will be
    25. more than 2 daisies,
    26. either 5 or 6 daisies. The botanist decides to count the number of daisies, \(x\), in each of 80 randomly selected squares within the field. The results are summarised below $$\sum x = 295 \quad \sum x ^ { 2 } = 1386$$
    27. Calculate the mean and the variance of the number of daisies per square for the 80 squares. Give your answers to 2 decimal places.
    28. Explain how the answers from part (c) support the choice of a Poisson distribution as a model.
    29. Using your mean from part (c), estimate the probability that exactly 4 daisies will be found in a randomly selected square.
      2. The continuous random variable \(X\) is uniformly distributed over the interval [-2, 7].
    30. Write down fully the probability density function \(\mathrm { f } ( x )\) of \(X\).
    31. Sketch the probability density function \(\mathrm { f } ( x )\) of \(X\). Find
    32. \(\mathrm { E } \left( X ^ { 2 } \right)\),
    33. \(\mathrm { P } ( - 0.2 < X < 0.6 )\).
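For the uniform distribution on \([-2, 7]\), the two numerical parts can be checked exactly with rational arithmetic. The sketch below uses the standard results that the mean is the midpoint, the variance is the squared width divided by 12, and \(\mathrm{E}(X^2) = \operatorname{Var}(X) + \mathrm{E}(X)^2\). Variable names are illustrative.

```python
from fractions import Fraction

a, b = Fraction(-2), Fraction(7)

density = 1 / (b - a)            # constant height of the density on [a, b]
mean = (a + b) / 2               # midpoint of the interval
variance = (b - a) ** 2 / 12     # standard result for a continuous uniform

# E(X^2) follows from Var(X) = E(X^2) - E(X)^2.
e_x_squared = variance + mean**2

# Probability of an interval is its width times the density.
p_interval = (Fraction(6, 10) - Fraction(-2, 10)) * density

print(e_x_squared, p_interval)
```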
      3. A single observation \(x\) is to be taken from a Binomial distribution \(\mathrm { B } ( 20 , p )\). This observation is used to test \(\mathrm { H } _ { 0 } : p = 0.3\) against \(\mathrm { H } _ { 1 } : p \neq 0.3\).
    34. Using a \(5 \%\) level of significance, find the critical region for this test. The probability of rejection in either tail should be as close as possible to \(2.5 \%\).
    35. State the actual significance level of this test. The actual value of \(x\) obtained is 3 .
    36. State a conclusion that can be drawn based on this value, giving a reason for your answer.
      4. The length of a telephone call made to a company is denoted by the continuous random variable \(T\). It is modelled by the probability density function $$\mathrm { f } ( t ) = \begin{cases} k t , & 0 \leq t \leq 10 \\
      0 , & \text { otherwise } \end{cases}$$
    37. Show that the value of \(k\) is \(\frac { 1 } { 50 }\).
    38. Find \(\mathrm { P } ( T > 6 )\).
    39. Calculate an exact value for \(\mathrm { E } ( T )\) and for \(\operatorname { Var } ( T )\).
    40. Write down the mode of the distribution of \(T\). It is suggested that the probability density function, \(\mathrm { f } ( t )\), is not a good model for \(T\).
    41. Sketch the graph of a more suitable probability density function for \(T\).
      5. A factory produces components of which \(1 \%\) are defective. The components are packed in boxes of 10. A box is selected at random.
    42. Find the probability that the box contains exactly one defective component.
    43. Find the probability that there are at least 2 defective components in the box.
    44. Using a suitable approximation, find the probability that a batch of 250 components contains between 1 and 4 (inclusive) defective components.
      6. A web server is visited on weekdays, at a rate of 7 visits per minute. In a random one minute on a Saturday the web server is visited 10 times.
      1. Test, at the \(10 \%\) level of significance, whether or not there is evidence that the rate of visits is greater on a Saturday than on weekdays. State your hypotheses clearly.
      2. State the minimum number of visits required to obtain a significant result.
    45. State an assumption that has been made about the visits to the server. In a random two minute period on a Saturday the web server is visited 20 times.
    46. Using a suitable approximation, test at the \(10 \%\) level of significance, whether or not the rate of visits is greater on a Saturday.
      7. A random variable \(X\) has probability density function given by $$\mathrm { f } ( x ) = \begin{cases} - \frac { 2 } { 9 } x + \frac { 8 } { 9 } , & 1 \leq x \leq 4 \\
      0 , & \text { otherwise } \end{cases}$$
    47. Show that the cumulative distribution function \(\mathrm { F } ( x )\) can be written in the form \(a x ^ { 2 } + b x + c\), for \(1 \leq x \leq 4\) where \(a , b\) and \(c\) are constants.
    48. Define fully the cumulative distribution function \(\mathrm { F } ( x )\).
    49. Show that the upper quartile of \(X\) is 2.5 and find the lower quartile. Given that the median of \(X\) is 1.88 ,
    50. describe the skewness of the distribution. Give a reason for your answer. \section*{TOTAL FOR PAPER: 75 MARKS}
      Materials required for examination: Mathematical Formulae (Orange or Green). Items included with question papers: Nil. \section*{Advanced Level} \section*{Monday 1 June 2009 - Morning}
      This paper has 8 questions.
      The total mark for this paper is 75.
      You must show sufficient working to make your methods clear to the Examiner. Answers without working may not gain full credit.
      1. A bag contains a large number of counters of which \(15 \%\) are coloured red. A random sample of 30 counters is selected and the number of red counters is recorded.
      2. Find the probability of no more than 6 red counters in this sample.
      A second random sample of 30 counters is selected and the number of red counters is recorded.
    51. Using a Poisson approximation, estimate the probability that the total number of red counters in the combined sample of size 60 is less than 13.
      2. An effect of a certain disease is that a small number of the red blood cells are deformed. Emily has this disease and the deformed blood cells occur randomly at a rate of 2.5 per ml of her blood. Following a course of treatment, a random sample of 2 ml of Emily's blood is found to contain only 1 deformed red blood cell. Stating your hypotheses clearly and using a \(5 \%\) level of significance, test whether or not there has been a decrease in the number of deformed red blood cells in Emily's blood.
      3. A random sample \(X _ { 1 } , X _ { 2 } , \ldots X _ { n }\) is taken from a population with unknown mean \(\mu\) and unknown variance \(\sigma ^ { 2 }\). A statistic \(Y\) is based on this sample.
    52. Explain what you understand by the statistic \(Y\).
    53. Explain what you understand by the sampling distribution of \(Y\).
    54. State, giving a reason, which of the following is not a statistic based on this sample.
      1. \(\sum _ { i = 1 } ^ { n } \frac { \left( X _ { i } - \bar { X } \right) ^ { 2 } } { n }\)
      2. \(\sum _ { i = 1 } ^ { n } \left( \frac { X _ { i } - \mu } { \sigma } \right) ^ { 2 }\)
      3. \(\sum _ { i = 1 } ^ { n } X _ { i } { } ^ { 2 }\)
        4. Past records suggest that \(30 \%\) of customers who buy baked beans from a large supermarket buy them in single tins. A new manager questions whether or not there has been a change in the proportion of customers who buy baked beans in single tins. A random sample of 20 customers who had bought baked beans was taken.
    55. Using a \(10 \%\) level of significance, find the critical region for a two-tailed test to answer the manager's question. You should state the probability of rejection in each tail which should be less than 0.05.
    56. Write down the actual significance level of a test based on your critical region from part (a). The manager found that 11 customers from the sample of 20 had bought baked beans in single tins.
    57. Comment on this finding in the light of your critical region found in part (a).
      5. An administrator makes errors in her typing randomly at a rate of 3 errors every 1000 words.
    58. In a document of 2000 words find the probability that the administrator makes 4 or more errors. The administrator is given an 8000 word report to type and she is told that the report will only be accepted if there are 20 or fewer errors.
    59. Use a suitable approximation to calculate the probability that the report is accepted.
      (7)
      6. The three independent random variables \(A , B\) and \(C\) each has a continuous uniform distribution over the interval \([ 0,5 ]\).
    60. Find \(\mathrm { P } ( A > 3 )\).
    61. Find the probability that \(A , B\) and \(C\) are all greater than 3. The random variable \(Y\) represents the maximum value of \(A , B\) and \(C\).
      The cumulative distribution function of \(Y\) is $$\mathrm { F } ( y ) = \begin{cases} 0 , & y < 0 \\
      \frac { y ^ { 3 } } { 125 } , & 0 \leq y \leq 5 \\
      1 , & y > 5 \end{cases}$$
    62. Find the probability density function of \(Y\).
    63. Sketch the probability density function of \(Y\).
    64. Write down the mode of \(Y\).
      (1)
    65. Find \(\mathrm { E } ( Y )\).
    66. Find \(\mathrm { P } ( Y > 3 )\).
      7. \begin{figure}[h]
      \includegraphics[alt={},max width=\textwidth]{4d6f7790-b966-4104-97cc-796d28a8f169-33_327_559_310_1885} \captionsetup{labelformat=empty} \caption{Figure 1}
      \end{figure} Figure 1 shows a sketch of the probability density function \(\mathrm { f } ( x )\) of the random variable \(X\). The part of the sketch from \(x = 0\) to \(x = 4\) consists of an isosceles triangle with maximum at ( \(2,0.5\) ).
    67. Write down \(\mathrm { E } ( X )\). The probability density function \(\mathrm { f } ( x )\) can be written in the following form. $$f ( x ) = \begin{cases} a x & 0 \leq x < 2 \\
      b - a x & 2 \leq x \leq 4 \\
      0 & \text { otherwise } \end{cases}$$
    68. Find the values of the constants \(a\) and \(b\).
    69. Show that \(\sigma\), the standard deviation of \(X\), is 0.816 to 3 decimal places.
    70. Find the lower quartile of \(X\).
    71. State, giving a reason, whether \(\mathrm { P } ( 2 - \sigma < X < 2 + \sigma )\) is more or less than 0.5
      8. A cloth manufacturer knows that faults occur randomly in the production process at a rate of 2 every 15 metres.
    72. Find the probability of exactly 4 faults in a 15 metre length of cloth.
    73. Find the probability of more than 10 faults in 60 metres of cloth. A retailer buys a large amount of this cloth and sells it in pieces of length \(x\) metres. He chooses \(x\) so that the probability of no faults in a piece is 0.80 .
    74. Write down an equation for \(x\) and show that \(x = 1.7\) to 2 significant figures. The retailer sells 1200 of these pieces of cloth. He makes a profit of 60 p on each piece of cloth that does not contain a fault but a loss of \(\pounds 1.50\) on any pieces that do contain faults.
    75. Find the retailer's expected profit. \section*{TOTAL FOR PAPER: 75 MARKS} \section*{END}
      Materials required for examination: Mathematical Formulae (Pink or Green). Items included with question papers: Nil. \section*{Advanced Level} \section*{Tuesday 19 January 2010 - Morning}
      This paper has 7 questions.
      The total mark for this paper is 75. \section*{M35712A} This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
      ©2010 Edexcel Limited
      1. A manufacturer supplies DVD players to retailers in batches of 20. It has \(5 \%\) of the players returned because they are faulty.
      2. Write down a suitable model for the distribution of the number of faulty DVD players in a batch.
      Find the probability that a batch contains
    76. no faulty DVD players,
    77. more than 4 faulty DVD players.
    78. Find the mean and variance of the number of faulty DVD players in a batch.
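The DVD player question can be checked with a direct binomial calculation. The sketch below assumes the number of faulty players in a batch follows B(20, 0.05), which is one natural reading of the question, and uses the standard results np and np(1 - p) for the mean and variance. Names are illustrative.

```python
from math import comb

n, p = 20, 0.05


def binomial_pmf(k: int) -> float:
    """P(X = k) for X ~ B(n, p) with the n and p defined above."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_none_faulty = binomial_pmf(0)
# "More than 4" is the complement of "4 or fewer".
p_more_than_4 = 1 - sum(binomial_pmf(k) for k in range(5))

mean = n * p
variance = n * p * (1 - p)

print(round(p_none_faulty, 4), round(p_more_than_4, 4), mean, variance)
```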
      2. A continuous random variable \(X\) has cumulative distribution function $$\mathrm { F } ( x ) = \left\{ \begin{array} { l r } 0 , & x < - 2 \\
      \frac { x + 2 } { 6 } , & - 2 \leq x \leq 4 \\
      1 , & x > 4 \end{array} \right.$$
    79. Find \(\mathrm { P } ( X < 0 )\).
    80. Find the probability density function \(\mathrm { f } ( x )\) of \(X\).
    81. Write down the name of the distribution of \(X\).
    82. Find the mean and the variance of \(X\).
    83. Write down the value of \(\mathrm { P } ( X = 1 )\).
      3. A robot is programmed to build cars on a production line. The robot breaks down at random at a rate of once every 20 hours.
    84. Find the probability that it will work continuously for 5 hours without a breakdown. Find the probability that, in an 8 hour period,
    85. the robot will break down at least once,
    86. there are exactly 2 breakdowns. In a particular 8 hour period, the robot broke down twice.
    87. Write down the probability that the robot will break down in the following 8 hour period. Give a reason for your answer.
      4. The continuous random variable \(X\) has probability density function \(\mathrm { f } ( x )\) given by $$\mathrm { f } ( x ) = \begin{cases} k \left( x ^ { 2 } - 2 x + 2 \right) , & 0 < x \leq 3 \\
      3 k , & 3 < x \leq 4 \\
      0 , & \text { otherwise } \end{cases}$$ where \(k\) is a constant.
    88. Show that \(k = \frac { 1 } { 9 }\).
    89. Find the cumulative distribution function \(\mathrm { F } ( x )\).
    90. Find the mean of \(X\).
    91. Show that the median of \(X\) lies between \(x = 2.6\) and \(x = 2.7\).
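The value of \(k\) and the location of the median in this question can be verified with exact fractions. The sketch below integrates \(x^2 - 2x + 2\) by hand (antiderivative \(x^3/3 - x^2 + 2x\)), adds the rectangular piece of height \(3k\) on \((3, 4]\), and then evaluates the cumulative distribution function at 2.6 and 2.7. The function names are illustrative and this is a check, not a model answer.

```python
from fractions import Fraction


def antiderivative(x: Fraction) -> Fraction:
    """Integral of t^2 - 2t + 2 from 0 to x."""
    return x**3 / 3 - x**2 + 2 * x


# Total area is k * (integral over (0, 3]) + 3k * (width of (3, 4]) = 1.
area_without_k = antiderivative(Fraction(3)) + 3 * (4 - 3)
k = 1 / area_without_k


def cdf_first_piece(x: Fraction) -> Fraction:
    """F(x) for 0 < x <= 3."""
    return k * antiderivative(x)


f_at_2_6 = cdf_first_piece(Fraction(26, 10))
f_at_2_7 = cdf_first_piece(Fraction(27, 10))

# The median m satisfies F(m) = 1/2, so a sign change locates it.
print(k, float(f_at_2_6), float(f_at_2_7))
```

Because the cumulative distribution function is increasing, a value below 0.5 at 2.6 and above 0.5 at 2.7 places the median between the two.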
      5. A café serves breakfast every morning. Customers arrive for breakfast at random at a rate of 1 every 6 minutes. Find the probability that
    92. fewer than 9 customers arrive for breakfast on a Monday morning between 10 a.m. and 11 a.m. The café serves breakfast every day between 8 a.m. and 12 noon.
    93. Using a suitable approximation, estimate the probability that more than 50 customers arrive for breakfast next Tuesday.
      6. (a) Define the critical region of a test statistic. A discrete random variable \(X\) has a Binomial distribution \(\mathrm { B } ( 30 , p )\). A single observation is used to test \(\mathrm { H } _ { 0 } : p = 0.3\) against \(\mathrm { H } _ { 1 } : p \neq 0.3\)
    94. Using a \(1 \%\) level of significance find the critical region of this test. You should state the probability of rejection in each tail which should be as close as possible to 0.005.
    95. Write down the actual significance level of the test. The value of the observation was found to be 15 .
    96. Comment on this finding in light of your critical region.
      7. A bag contains a large number of coins. It contains only 1 p and 2 p coins in the ratio \(1 : 3\).
    97. Find the mean \(\mu\) and the variance \(\sigma ^ { 2 }\) of the values of this population of coins. A random sample of size 3 is taken from the bag.
    98. List all the possible samples.
    99. Find the sampling distribution of the mean value of the samples. Nil \section*{Advanced Level} \section*{Wednesday 9 June 2010 - Afternoon} \section*{Time: \(\mathbf { 1 }\) hour \(\mathbf { 3 0 }\) minutes} Candidates may use any calculator allowed by the regulations of the Joint Council for Qualifications. Calculators must not have the facility for symbolic algebra manipulation, differentiation and integration, or have retrievable mathematical formulas stored in them. In the boxes on the answer book, write the name of the examining body (Edexcel), your centre number, candidate number, the unit title (Statistics S2), the paper reference (6684), your surname, other name and signature.
      Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.
      Full marks may be obtained for answers to ALL questions.
      This paper has 7 questions.
      The total mark for this paper is 75 . You must ensure that your answers to parts of questions are clearly labelled. You must show sufficient working to make your methods clear to the Examiner. Answers without working may not gain full credit. \section*{H35396A} \footnotetext{This publication may only be reproduced in accordance with Edexcel Limited copyright policy.
      ©2010 Edexcel Limited }
      1. Explain what you understand by
      2. a population,
      3. a statistic.
      A researcher took a sample of 100 voters from a certain town and asked them who they would vote for in an election. The proportion who said they would vote for Dr Smith was \(35 \%\).
    100. State the population and the statistic in this case.
    101. Explain what you understand by the sampling distribution of this statistic.
      2. Bhim and Joe play each other at badminton and for each game, independently of all others, the probability that Bhim loses is 0.2 . Find the probability that, in 9 games, Bhim loses
    102. exactly 3 of the games,
    103. fewer than half of the games. Bhim attends coaching sessions for 2 months. After completing the coaching, the probability that he loses each game, independently of all others, is 0.05 . Bhim and Joe agree to play a further 60 games.
    104. Calculate the mean and variance for the number of these 60 games that Bhim loses.
    105. Using a suitable approximation calculate the probability that Bhim loses more than 4 games.
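A short Python sketch of the calculations in question 2 follows. It assumes a Poisson approximation for the last part (reasonable here because \(n\) is large and \(p\) is small, though the question leaves the choice to the candidate); the helper names `binom_pmf` and `poisson_cdf` are illustrative.

```python
from math import comb, exp, factorial

def binom_pmf(n, p, r):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)

def poisson_cdf(lam, r):
    """P(X <= r) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(r + 1))

# Before coaching: the number of losses in 9 games is B(9, 0.2).
p_exactly_3 = binom_pmf(9, 0.2, 3)
p_fewer_than_half = sum(binom_pmf(9, 0.2, r) for r in range(5))  # 0 to 4 losses

# After coaching: B(60, 0.05), approximated here by Po(np).
n, p = 60, 0.05
mean, variance = n * p, n * p * (1 - p)
p_more_than_4 = 1 - poisson_cdf(mean, 4)

print(round(p_exactly_3, 4), round(p_fewer_than_half, 4), mean, variance, round(p_more_than_4, 4))
```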
      3. A rectangle has a perimeter of 20 cm . The length, \(X \mathrm {~cm}\), of one side of this rectangle is uniformly distributed between 1 cm and 7 cm . Find the probability that the length of the longer side of the rectangle is more than 6 cm long.
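One way to reason about question 3: since the perimeter is 20 cm, the other side is \(10 - X\), so the longer side exceeds 6 cm when \(X > 6\) or \(X < 4\). A minimal sketch of that reasoning, with the illustrative helper `uniform_prob`:

```python
from fractions import Fraction

# X ~ U[1, 7]; the other side is 10 - X because the perimeter is 20 cm.
lo, hi = Fraction(1), Fraction(7)

def uniform_prob(a, b):
    """P(a < X < b) for X ~ U[lo, hi], clipping the interval to the support."""
    a, b = max(a, lo), min(b, hi)
    return max(b - a, 0) / (hi - lo)

# Longer side > 6 means X > 6, or 10 - X > 6 (that is, X < 4).
p_longer_side = uniform_prob(6, 7) + uniform_prob(1, 4)

print(p_longer_side)
```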
      4. The lifetime, \(X\), in tens of hours, of a battery has a cumulative distribution function \(\mathrm { F } ( x )\) given by $$\mathrm { F } ( x ) = \left\{ \begin{array} { l r } 0 & x < 1 \\
      \frac { 4 } { 9 } \left( x ^ { 2 } + 2 x - 3 \right) & 1 \leq x \leq 1.5 \\
      1 & x > 1.5 \end{array} \right.$$
    106. Find the median of \(X\), giving your answer to 3 significant figures.
    107. Find, in full, the probability density function of the random variable \(X\).
    108. Find \(\mathrm { P } ( X \geq 1.2 )\) A camping lantern runs on 4 batteries, all of which must be working. Four new batteries are put into the lantern.
    109. Find the probability that the lantern will still be working after 12 hours.
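The battery question can be sketched numerically as below. This assumes the four battery lifetimes are independent (the question does not say so explicitly), and the names `F`, `median` and `p_lantern` are illustrative.

```python
from math import sqrt

def F(x):
    """CDF of battery lifetime X (in tens of hours), as printed in the question."""
    if x < 1:
        return 0.0
    if x <= 1.5:
        return 4 / 9 * (x**2 + 2 * x - 3)
    return 1.0

# Median: solve (4/9)(m^2 + 2m - 3) = 1/2, i.e. m^2 + 2m - 33/8 = 0,
# taking the root that lies in [1, 1.5].
median = -1 + sqrt(1 + 33 / 8)

# 12 hours is x = 1.2 in tens of hours; the lantern needs all 4 batteries working.
p_one_battery = 1 - F(1.2)
p_lantern = p_one_battery**4  # assumes independent batteries

print(round(median, 3), round(p_one_battery, 4), round(p_lantern, 4))
```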
      5. A company has a large number of regular users logging onto its website. On average 4 users every hour fail to connect to the company's website at their first attempt.
    110. Explain why the Poisson distribution may be a suitable model in this case. Find the probability that, in a randomly chosen 2 hour period,
      1. all users connect at their first attempt,
      2. at least 4 users fail to connect at their first attempt. The company suffered from a virus infecting its computer system. During this infection it was found that the number of users failing to connect at their first attempt, over a 12 hour period, was 60 .
    111. Using a suitable approximation, test whether or not the mean number of users per hour who failed to connect at their first attempt had increased. Use a \(5 \%\) level of significance and state your hypotheses clearly.
      (9)
      6. A company claims that a quarter of the bolts sent to them are faulty. To test this claim the number of faulty bolts in a random sample of 50 is recorded.
    112. Give two reasons why a binomial distribution may be a suitable model for the number of faulty bolts in the sample.
    113. Using a \(5 \%\) significance level, find the critical region for a two-tailed test of the hypothesis that the probability of a bolt being faulty is \(\frac { 1 } { 4 }\). The probability of rejection in either tail should be as close as possible to 0.025 .
    114. Find the actual significance level of this test. In the sample of 50 the actual number of faulty bolts was 8 .
    115. Comment on the company's claim in the light of this value. Justify your answer. The machine making the bolts was reset and another sample of 50 bolts was taken. Only 5 were found to be faulty.
    116. Test at the \(1 \%\) level of significance whether or not the probability of a faulty bolt has decreased. State your hypotheses clearly.
      7. The random variable \(Y\) has probability density function \(\mathrm { f } ( y )\) given by $$\mathrm { f } ( y ) = \begin{cases} k y ( a - y ) & 0 \leq y \leq 3 \\
      0 & \text { otherwise } \end{cases}$$ where \(k\) and \(a\) are positive constants.
      1. Explain why \(a \geq 3\).
      2. Show that \(k = \frac { 2 } { 9 ( a - 2 ) }\). Given that \(\mathrm { E } ( Y ) = 1.75\),
    117. show that \(a = 4\) and write down the value of \(k\). For these values of \(a\) and \(k\),
    118. sketch the probability density function,
    119. write down the mode of \(Y\). \section*{Advanced Level} \section*{Friday 14 January 2011 - Morning}
      This paper has 7 questions.
      The total mark for this paper is 75. \section*{H35411A}
      1. A disease occurs in \(3 \%\) of a population.
      2. State any assumptions that are required to model the number of people with the disease in a random sample of size n as a binomial distribution.
      3. Using this model, find the probability of exactly 2 people having the disease in a random sample of 10 people.
      4. Find the mean and variance of the number of people with the disease in a random sample of 100 people.
      A doctor tests a random sample of 100 patients for the disease. He decides to offer all patients a vaccination to protect them from the disease if more than 5 of the sample have the disease.
    120. Using a suitable approximation, find the probability that the doctor will offer all patients a vaccination.
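A minimal sketch of the calculations in question 1. It assumes a Poisson approximation for the final part (the question only asks for "a suitable approximation"); the variable names are illustrative.

```python
from math import comb, exp, factorial

p = 0.03

# Exactly 2 cases in a sample of 10 under B(10, 0.03).
p_two_in_ten = comb(10, 2) * p**2 * (1 - p) ** 8

# Sample of 100: mean np and variance np(1 - p).
mean, variance = 100 * p, 100 * p * (1 - p)

# n is large and p is small, so B(100, 0.03) is approximated here by Po(3).
p_at_most_5 = sum(exp(-mean) * mean**r / factorial(r) for r in range(6))
p_offer_vaccination = 1 - p_at_most_5  # more than 5 cases in the sample

print(round(p_two_in_ten, 4), mean, variance, round(p_offer_vaccination, 4))
```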
      2. A student takes a multiple choice test. The test is made up of 10 questions each with 5 possible answers. The student gets 4 questions correct. Her teacher claims she was guessing the answers. Using a one tailed test, at the \(5 \%\) level of significance, test whether or not there is evidence to reject the teacher's claim. State your hypotheses clearly.
      3. The continuous random variable \(X\) is uniformly distributed over the interval \([ - 1,3 ]\). Find
    121. \(\mathrm { E } ( X )\)
    122. \(\operatorname { Var } ( X )\)
    123. \(\mathrm { E } \left( X ^ { 2 } \right)\)
    124. \(\mathrm { P } ( X < 1.4 )\) A total of 40 observations of \(X\) are made.
    125. Find the probability that at least 10 of these observations are negative.
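The uniform-distribution parts above follow from the standard formulas for \(U[a, b]\), and the last part reduces to a binomial count. A sketch using exact fractions, with illustrative names:

```python
from fractions import Fraction
from math import comb

a, b = Fraction(-1), Fraction(3)

mean = (a + b) / 2                # E(X) for U[a, b]
variance = (b - a) ** 2 / 12      # Var(X) for U[a, b]
mean_square = variance + mean**2  # E(X^2) = Var(X) + E(X)^2
p_below_1_4 = (Fraction("1.4") - a) / (b - a)

# Each observation is negative with probability P(X < 0) = 1/4,
# so the number of negatives in 40 observations is B(40, 1/4).
p_neg = (0 - a) / (b - a)
p_at_least_10 = 1 - sum(
    comb(40, r) * p_neg**r * (1 - p_neg) ** (40 - r) for r in range(10)
)

print(mean, variance, mean_square, p_below_1_4, round(float(p_at_least_10), 4))
```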
      4. Richard regularly travels to work on a ferry. Over a long period of time, Richard has found that the ferry is late on average 2 times every week. The company buys a new ferry to improve the service. In the 4 -week period after the new ferry is launched, Richard finds the ferry is late 3 times and claims the service has improved. Assuming that the number of times the ferry is late has a Poisson distribution, test Richard's claim at the \(5 \%\) level of significance. State your hypotheses clearly.
      5. A continuous random variable \(X\) has the probability density function \(\mathrm { f } ( x )\) shown in Figure 1. \begin{figure}[h]
      \includegraphics[alt={},max width=\textwidth]{4d6f7790-b966-4104-97cc-796d28a8f169-40_396_465_612_520} \captionsetup{labelformat=empty} \caption{Figure 1}
      \end{figure}
    126. Show that \(\mathrm { f } ( x ) = 4 - 8 x\) for \(0 \leq x \leq 0.5\) and specify \(\mathrm { f } ( x )\) for all real values of \(x\).
    127. Find the cumulative distribution function \(\mathrm { F } ( x )\).
    128. Find the median of \(X\).
    129. Write down the mode of \(X\).
    130. State, with a reason, the skewness of \(X\).
      6. Cars arrive at a motorway toll booth at an average rate of 150 per hour.
    131. Suggest a suitable distribution to model the number of cars arriving at the toll booth, \(X\), per minute.
    132. State clearly any assumptions you have made by suggesting this model. Using your model,
    133. find the probability that in any given minute
      1. no cars arrive,
      2. more than 3 cars arrive.
    134. In any given 4 minute period, find \(m\) such that \(\mathrm { P } ( \mathrm { X } > m ) = 0.0487\)
    135. Using a suitable approximation find the probability that fewer than 15 cars arrive in any given 10 minute period.
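The toll booth question can be sketched as follows, assuming a Poisson model with rate 2.5 per minute and a normal approximation with continuity correction for the 10 minute period. The helpers `poisson_cdf` and `normal_cdf` are illustrative, and the search threshold 0.0488 is an assumption that allows for rounding in the quoted value 0.0487.

```python
from math import erf, exp, factorial, sqrt

def poisson_cdf(lam, r):
    """P(X <= r) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(r + 1))

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

rate_per_minute = 150 / 60  # 2.5 cars per minute

p_no_cars = exp(-rate_per_minute)
p_more_than_3 = 1 - poisson_cdf(rate_per_minute, 3)

# 4 minute period: Po(10). First m whose upper tail P(X > m) is about 0.0487.
m = next(r for r in range(100) if 1 - poisson_cdf(10, r) <= 0.0488)

# 10 minute period: Po(25) approximated by N(25, 25) with continuity correction.
p_fewer_than_15 = normal_cdf((14.5 - 25) / sqrt(25))

print(round(p_no_cars, 4), round(p_more_than_3, 4), m, round(p_fewer_than_15, 4))
```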
      7. The queuing time in minutes, \(X\), of a customer at a post office is modelled by the probability density function $$\mathrm { f } ( x ) = \begin{cases} k x \left( 81 - x ^ { 2 } \right) & 0 \leq x \leq 9 \\
      0 & \text { otherwise } \end{cases}$$
    136. Show that \(k = \frac { 4 } { 6561 }\). Using integration, find
    137. the mean queuing time of a customer,
    138. the probability that a customer will queue for more than 5 minutes. Three independent customers shop at the post office.
    139. Find the probability that at least 2 of the customers queue for more than 5 minutes. \section*{Advanced Level} \section*{Thursday 26 May 2011 - Morning}
      This paper has 7 questions.
      The total mark for this paper is 75. \section*{P38165A}
      1. A factory produces components. Each component has a unique identity number and it is assumed that \(2 \%\) of the components are faulty. On a particular day, a quality control manager wishes to take a random sample of 50 components.
      2. Identify a sampling frame.
      The statistic \(F\) represents the number of faulty components in the random sample of size 50 .
    140. Specify the sampling distribution of \(F\).
      2. A traffic officer monitors the rate at which vehicles pass a fixed point on a motorway. When the rate exceeds 36 vehicles per minute he must switch on some speed restrictions to improve traffic flow.
    141. Suggest a suitable model to describe the number of vehicles passing the fixed point in a 15 s interval. The traffic officer records 12 vehicles passing the fixed point in a 15 s interval.
    142. Stating your hypotheses clearly, and using a \(5 \%\) level of significance, test whether or not the traffic officer has sufficient evidence to switch on the speed restrictions.
    143. Using a 5\% level of significance, determine the smallest number of vehicles the traffic officer must observe in a 10 s interval in order to have sufficient evidence to switch on the speed restrictions.
      (3)
      3. \begin{figure}[h]
      \includegraphics[alt={},max width=\textwidth]{4d6f7790-b966-4104-97cc-796d28a8f169-42_287_627_301_404} \captionsetup{labelformat=empty} \caption{Figure 1}
      \end{figure} Figure 1 shows a sketch of the probability density function \(\mathrm { f } ( x )\) of the random variable \(X\).
      For \(0 \leq x \leq 3 , \mathrm { f } ( x )\) is represented by a curve \(O B\) with equation \(\mathrm { f } ( x ) = k x ^ { 2 }\), where \(k\) is a constant.
      For \(3 \leq x \leq a\), where \(a\) is a constant, \(\mathrm { f } ( x )\) is represented by a straight line passing through \(B\) and the point \(( a , 0 )\). For all other values of \(x , \mathrm { f } ( x ) = 0\).
      Given that the mode of \(X =\) the median of \(X\), find
    144. the mode,
    145. the value of \(k\),
    146. the value of \(a\). Without calculating \(\mathrm { E } ( X )\) and with reference to the skewness of the distribution
    147. state, giving your reason, whether \(\mathrm { E } ( X ) < 3 , \mathrm { E } ( X ) = 3\) or \(\mathrm { E } ( X ) > 3\).
      (2)
      4. In a game, players select sticks at random from a box containing a large number of sticks of different lengths. The length, in cm, of a randomly chosen stick has a continuous uniform distribution over the interval [7,10]. A stick is selected at random from the box.
    148. Find the probability that the stick is shorter than 9.5 cm . To win a bag of sweets, a player must select 3 sticks and wins if the length of the longest stick is more than 9.5 cm .
    149. Find the probability of winning a bag of sweets. To win a soft toy, a player must select 6 sticks and wins the toy if more than four of the sticks are shorter than 7.6 cm .
    150. Find the probability of winning a soft toy.
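The stick game combines a uniform probability with a complement argument and a binomial tail. A sketch with exact fractions, assuming sticks are selected independently (the box holds a large number of sticks); `p_shorter_than` is an illustrative helper.

```python
from fractions import Fraction
from math import comb

lo, hi = Fraction(7), Fraction(10)

def p_shorter_than(x):
    """P(length < x) for a stick length uniform on [7, 10]."""
    return (Fraction(x) - lo) / (hi - lo)

p_short = p_shorter_than("9.5")

# Sweets: the longest of 3 sticks exceeds 9.5 cm unless all 3 are shorter.
p_sweets = 1 - p_short**3

# Soft toy: more than four of the 6 sticks (so 5 or 6) are shorter than 7.6 cm.
q = p_shorter_than("7.6")
p_toy = sum(comb(6, r) * q**r * (1 - q) ** (6 - r) for r in (5, 6))

print(p_short, p_sweets, p_toy)
```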
      5. Defects occur at random in planks of wood with a constant rate of 0.5 per 10 cm length. Jim buys a plank of length 100 cm .
    151. Find the probability that Jim's plank contains at most 3 defects. Shivani buys 6 planks each of length 100 cm .
    152. Find the probability that fewer than 2 of Shivani's planks contain at most 3 defects.
    153. Using a suitable approximation, estimate the probability that the total number of defects on Shivani's 6 planks is less than 18.
      (6)
      6. A shopkeeper knows, from past records, that \(15 \%\) of customers buy an item from the display next to the till. After a refurbishment of the shop, he takes a random sample of 30 customers and finds that only 1 customer has bought an item from the display next to the till.
    154. Stating your hypotheses clearly, and using a \(5 \%\) level of significance, test whether or not there has been a change in the proportion of customers buying an item from the display next to the till. During the refurbishment a new sandwich display was installed. Before the refurbishment \(20 \%\) of customers bought sandwiches. The shopkeeper claims that the proportion of customers buying sandwiches has now increased. He selects a random sample of 120 customers and finds that 31 of them have bought sandwiches.
    155. Using a suitable approximation and stating your hypotheses clearly, test the shopkeeper's claim. Use a \(10 \%\) level of significance.
      7. The continuous random variable \(X\) has probability density function given by $$\mathrm { f } ( x ) = \begin{cases} \frac { 3 } { 32 } ( x - 1 ) ( 5 - x ) & 1 \leq x \leq 5 \\
      0 & \text { otherwise } \end{cases}$$
    156. Sketch \(\mathrm { f } ( x )\) showing clearly the points where it meets the \(x\)-axis.
    157. Write down the value of the mean, \(\mu\), of \(X\).
    158. Show that \(\mathrm { E } \left( X ^ { 2 } \right) = 9.8\).
    159. Find the standard deviation, \(\sigma\), of \(X\). The cumulative distribution function of \(X\) is given by $$\mathrm { F } ( x ) = \begin{cases} 0 & x < 1 \\
      \frac { 1 } { 32 } \left( a - 15 x + 9 x ^ { 2 } - x ^ { 3 } \right) & 1 \leq x \leq 5 \\
      1 & x > 5 \end{cases}$$ where \(a\) is a constant.
    160. Find the value of \(a\).
    161. Show that the lower quartile of \(X , q _ { 1 }\), lies between 2.29 and 2.31.
    162. Hence find the upper quartile of \(X\), giving your answer to 1 decimal place.
    163. Find, to 2 decimal places, the value of \(k\) so that $$\mathrm { P } ( \mu - k \sigma < X < \mu + k \sigma ) = 0.5$$ END \section*{Advanced Level} \section*{Tuesday 17 January 2012 - Morning}
      This paper has 7 questions.
      The total mark for this paper is 75.
      1. The time in minutes that Elaine takes to checkout at her local supermarket follows a continuous uniform distribution defined over the interval [ 3,9 ].
      Find
    164. Elaine's expected checkout time,
    165. the variance of the time taken to checkout at the supermarket,
    166. the probability that Elaine will take more than 7 minutes to checkout. Given that Elaine has already spent 4 minutes at the checkout,
    167. find the probability that she will take a total of less than 6 minutes to checkout.
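For the checkout question, the conditional part uses the fact that a uniform variable restricted to \(X > 4\) is uniform on \([4, 9]\). A minimal sketch with illustrative names:

```python
from fractions import Fraction

a, b = Fraction(3), Fraction(9)

expected_time = (a + b) / 2
variance = (b - a) ** 2 / 12
p_more_than_7 = (b - 7) / (b - a)

# Given X > 4, the total time is uniform on [4, 9], so P(X < 6 | X > 4) = (6 - 4)/(9 - 4).
p_less_than_6_given_4 = (Fraction(6) - 4) / (b - 4)

print(expected_time, variance, p_more_than_7, p_less_than_6_given_4)
```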
      2. David claims that the weather forecasts produced by local radio are no better than those achieved by tossing a fair coin and predicting rain if a head is obtained or no rain if a tail is obtained. He records the weather for 30 randomly selected days. The local radio forecast is correct on 21 of these days. Test David's claim at the \(5 \%\) level of significance.
      State your hypotheses clearly.
      3. The probability of a telesales representative making a sale on a customer call is 0.15 . Find the probability that
    168. no sales are made in 10 calls,
    169. more than 3 sales are made in 20 calls. Representatives are required to achieve a mean of at least 5 sales each day.
    170. Find the least number of calls each day a representative should make to achieve this requirement.
    171. Calculate the least number of calls that need to be made by a representative for the probability of at least 1 sale to exceed 0.95 .
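The telesales question can be sketched as below, assuming calls are independent so that the number of sales is binomial. The two "least number" parts are found by direct search; all names are illustrative.

```python
from math import comb

p = 0.15

p_no_sales_in_10 = (1 - p) ** 10
p_more_than_3_in_20 = 1 - sum(
    comb(20, r) * p**r * (1 - p) ** (20 - r) for r in range(4)
)

# Least n with mean np >= 5.
n_for_mean = next(n for n in range(1, 1000) if n * p >= 5)

# Least n with P(at least 1 sale) = 1 - 0.85^n > 0.95.
n_for_one_sale = next(n for n in range(1, 1000) if 1 - (1 - p) ** n > 0.95)

print(round(p_no_sales_in_10, 4), round(p_more_than_3_in_20, 4), n_for_mean, n_for_one_sale)
```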
      4. A website receives hits at a rate of 300 per hour.
    172. State a distribution that is suitable to model the number of hits obtained during a 1 minute interval.
    173. State two reasons for your answer to part (a). Find the probability of
    174. 10 hits in a given minute,
    175. at least 15 hits in 2 minutes. The website will go down if there are more than 70 hits in 10 minutes.
    176. Using a suitable approximation, find the probability that the website will go down in a particular 10 minute interval.
      5. The probability of an electrical component being defective is 0.075 . The component is supplied in boxes of 120 .
    177. Using a suitable approximation, estimate the probability that there are more than 3 defective components in a box. A retailer buys 2 boxes of components.
    178. Estimate the probability that there are at least 4 defective components in each box.
      6. A random variable \(X\) has probability density function given by $$\mathrm { f } ( x ) = \begin{cases} \frac { 1 } { 2 } , & 0 \leq x < 1 \\
      x - \frac { 1 } { 2 } , & 1 \leq x \leq k \\
      0 & \text { otherwise } \end{cases}$$ where \(k\) is a positive constant.
    179. Sketch the graph of \(\mathrm { f } ( x )\).
    180. Show that \(k = \frac { 1 } { 2 } ( 1 + \sqrt { 5 } )\).
    181. Define fully the cumulative distribution function \(\mathrm { F } ( x )\).
    182. Find \(\mathrm { P } ( 0.5 < X < 1.5 )\).
    183. Write down the median of \(X\) and the mode of \(X\).
    184. Describe the skewness of the distribution of \(X\). Give a reason for your answer.
      7. (a) Explain briefly what you understand by
      1. a critical region of a test statistic,
      2. the level of significance of a hypothesis test.
    185. An estate agent has been selling houses at a rate of 8 per month. She believes that the rate of sales will decrease in the next month.
      1. Using a 5\% level of significance, find the critical region for a one tailed test of the hypothesis that the rate of sales will decrease from 8 per month.
      2. Write down the actual significance level of the test in part (b)(i). The estate agent is surprised to find that she actually sold 13 houses in the next month. She now claims that this is evidence of an increase in the rate of sales per month.
    186. Test the estate agent's claim at the \(5 \%\) level of significance. State your hypotheses clearly. \section*{Advanced Level} \section*{Thursday 24 May 2012 - Morning} \section*{Time: \(\mathbf { 1 }\) hour \(\mathbf { 3 0 }\) minutes}
      This paper has 8 questions.
      The total mark for this paper is 75. \section*{P40106A}
      1. A manufacturer produces sweets of length \(L \mathrm {~mm}\) where \(L\) has a continuous uniform distribution with range [15,30].
      2. Find the probability that a randomly selected sweet has length greater than 24 mm .
      These sweets are randomly packed in bags of 20 sweets.
    187. Find the probability that a randomly selected bag will contain at least 8 sweets with length greater than 24 mm .
    188. Find the probability that 2 randomly selected bags will both contain at least 8 sweets with length greater than 24 mm .
      2. A test statistic has a distribution \(\mathrm { B } ( 25 , p )\). Given that $$\mathrm { H } _ { 0 } : p = 0.5 , \quad \mathrm { H } _ { 1 } : p \neq 0.5$$
    189. find the critical region for the test statistic such that the probability in each tail is as close as possible to \(2.5 \%\).
    190. State the probability of incorrectly rejecting \(\mathrm { H } _ { 0 }\) using this critical region.
      3. (a) Write down the two conditions needed to approximate the binomial distribution by the Poisson distribution. A machine which manufactures bolts is known to produce \(3 \%\) defective bolts. The machine breaks down and a new machine is installed. A random sample of 200 bolts is taken from those produced by the new machine and 12 bolts are defective.
    191. Using a suitable approximation, test at the \(5 \%\) level of significance whether or not the proportion of defective bolts is higher with the new machine than with the old machine. State your hypotheses clearly.
      4. The number of houses sold by an estate agent follows a Poisson distribution, with a mean of 2 per week.
    192. Find the probability that in the next four weeks the estate agent sells
      1. exactly 3 houses,
      2. more than 5 houses. The estate agent monitors sales in periods of 4 weeks.
    193. Find the probability that in the next twelve of those 4 week periods there are exactly nine periods in which more than 5 houses are sold. The estate agent will receive a bonus if he sells more than 25 houses in the next 10 weeks.
    194. Use a suitable approximation to estimate the probability that the estate agent receives a bonus.
      5. The queuing time, \(X\) minutes, of a customer at a till of a supermarket has probability density function $$\mathrm { f } ( x ) = \begin{cases} \frac { 3 } { 32 } x ( k - x ) & 0 \leq x \leq k \\
      0 & \text { otherwise } \end{cases}$$
    195. Show that the value of \(k\) is 4 .
    196. Write down the value of \(\mathrm { E } ( X )\).
    197. Calculate \(\operatorname { Var } ( X )\).
    198. Find the probability that a randomly chosen customer's queuing time will differ from the mean by at least half a minute.
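A sketch that checks the supermarket queuing question with exact integration of the polynomial density. It assumes the density \(\frac{3}{32} x (k - x)\) as printed; the helper `integral_of_pdf` is illustrative.

```python
from fractions import Fraction

c = Fraction(3, 32)

def integral_of_pdf(lower, upper, k, power=0):
    """Integral of x^power * c * x * (k - x) between lower and upper."""
    def antiderivative(x):
        x = Fraction(x)
        return c * (k * x ** (power + 2) / (power + 2) - x ** (power + 3) / (power + 3))
    return antiderivative(upper) - antiderivative(lower)

k = 4
total_area = integral_of_pdf(0, k, k)     # equals 1 when k = 4
mean = integral_of_pdf(0, k, k, power=1)  # E(X)
variance = integral_of_pdf(0, k, k, power=2) - mean**2

# Differs from the mean by at least half a minute: outside (mean - 1/2, mean + 1/2).
half = Fraction(1, 2)
p_differs = 1 - integral_of_pdf(mean - half, mean + half, k)

print(total_area, mean, variance, p_differs)
```

The mean can also be written down directly, since the density is symmetric about \(x = k/2\).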
      6. A bag contains a large number of balls. 65\% are numbered 1
      35\% are numbered 2
      A random sample of 3 balls is taken from the bag.
      Find the sampling distribution for the range of the numbers on the 3 selected balls.
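The sampling distribution of the range can be found by enumerating every ordered sample of three balls. A minimal sketch, assuming draws are independent because the bag is large; `range_distribution` is an illustrative name.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Proportions of balls in the bag, as given in the question.
proportions = {1: Fraction(65, 100), 2: Fraction(35, 100)}

# Enumerate all ordered samples of 3 and accumulate probability by range.
range_distribution = defaultdict(Fraction)
for sample in product(proportions, repeat=3):
    prob = Fraction(1)
    for ball in sample:
        prob *= proportions[ball]
    range_distribution[max(sample) - min(sample)] += prob

for value, prob in sorted(range_distribution.items()):
    print(value, prob, float(prob))
```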
      7. The continuous random variable \(X\) has probability density function \(\mathrm { f } ( x )\) given by $$\mathrm { f } ( x ) = \begin{cases} \frac { x ^ { 2 } } { 45 } & 0 \leq x \leq 3 \\
      \frac { 1 } { 5 } & 3 < x < 4 \\
      \frac { 1 } { 3 } - \frac { x } { 30 } & 4 \leq x \leq 10 \\
      0 & \text { otherwise } \end{cases}$$
    199. Sketch \(\mathrm { f } ( x )\) for \(0 \leq x \leq 10\).
    200. Find the cumulative distribution function \(\mathrm { F } ( x )\) for all values of \(x\).
    201. Find \(\mathrm { P } ( X \leq 8 )\).
      8. In a large restaurant an average of 3 out of every 5 customers ask for water with their meal. A random sample of 10 customers is selected.
    202. Find the probability that
      1. exactly 6 ask for water with their meal,
      2. less than 9 ask for water with their meal. A second random sample of 50 customers is selected.
    203. Find the smallest value of \(n\) such that $$\mathrm { P } ( X < n ) \geq 0.9$$ where the random variable \(X\) represents the number of these customers who ask for water. END \section*{Advanced Level} \section*{Friday 18 January 2013 - Afternoon}
      This paper has 7 questions.
      The total mark for this paper is 75. \section*{P41482A}
      1. (a) Write down the conditions under which the Poisson distribution can be used as an approximation to the binomial distribution.
      The probability of any one letter being delivered to the wrong house is 0.01 .
      On a randomly selected day Peter delivers 1000 letters.
    204. Using a Poisson approximation, find the probability that Peter delivers at least 4 letters to the wrong house. Give your answer to 4 decimal places.
      2. In a village, power cuts occur randomly at a rate of 3 per year.
    205. Find the probability that in any given year there will be
      1. exactly 7 power cuts,
      2. at least 4 power cuts.
    206. Use a suitable approximation to find the probability that in the next 10 years the number of power cuts will be less than 20 .
      3. A random variable \(X\) has the distribution \(\mathrm { B } ( 12 , p )\).
    207. Given that \(p = 0.25\), find
      1. \(\mathrm { P } ( X < 5 )\),
      2. \(\mathrm { P } ( X \geq 7 )\).
    208. Given that \(\mathrm { P } ( X = 0 ) = 0.05\), find the value of \(p\) to 3 decimal places.
    209. Given that the variance of \(X\) is 1.92 , find the possible values of \(p\).
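The binomial question above mixes table look-ups with two short algebraic solves. A sketch in floating point, with the illustrative helper `binom_cdf`:

```python
from math import comb, sqrt

def binom_cdf(n, p, r):
    """P(X <= r) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(r + 1))

# With p = 0.25.
p_less_than_5 = binom_cdf(12, 0.25, 4)
p_at_least_7 = 1 - binom_cdf(12, 0.25, 6)

# P(X = 0) = (1 - p)^12 = 0.05 gives p = 1 - 0.05^(1/12).
p_from_zero = 1 - 0.05 ** (1 / 12)

# Var(X) = 12 p (1 - p) = 1.92 gives the quadratic p^2 - p + 0.16 = 0.
disc = sqrt(1 - 4 * 0.16)
p_roots = ((1 - disc) / 2, (1 + disc) / 2)

print(round(p_less_than_5, 4), round(p_at_least_7, 4), round(p_from_zero, 3), p_roots)
```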
      4. The continuous random variable \(X\) is uniformly distributed over the interval \([ - 4,6 ]\).
    210. Write down the mean of \(X\).
    211. Find \(\mathrm { P } ( X \leq 2.4 )\).
    212. Find \(\mathrm { P } ( - 3 < X - 5 < 3 )\). The continuous random variable \(Y\) is uniformly distributed over the interval \([ a , 4 a ]\).
    213. Use integration to show that \(\mathrm { E } \left( Y ^ { 2 } \right) = 7 a ^ { 2 }\).
    214. Find \(\operatorname { Var } ( Y )\).
    215. Given that \(\mathrm { P } \left( X < \frac { 8 } { 3 } \right) = \mathrm { P } \left( Y < \frac { 8 } { 3 } \right)\), find the value of \(a\).
      5. The continuous random variable \(T\) is used to model the number of days, \(t\), a mosquito survives after hatching. The probability that the mosquito survives for more than \(t\) days is $$\frac { 225 } { ( t + 15 ) ^ { 2 } } , \quad t \geq 0$$
    216. Show that the cumulative distribution function of \(T\) is given by $$\mathrm { F } ( t ) = \begin{cases} 1 - \frac { 225 } { ( t + 15 ) ^ { 2 } } , & t \geq 0 \\ 0 , & \text { otherwise } \end{cases}$$
    217. Find the probability that a randomly selected mosquito will die within 3 days of hatching.
    218. Given that a mosquito survives for 3 days, find the probability that it will survive for at least 5 more days. A large number of mosquitoes hatch on the same day.
    219. Find the number of days after which only \(10 \%\) of these mosquitoes are expected to survive.
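The numerical parts of Question 5 follow directly from the survival function \(\mathrm{P}(T > t) = 225/(t + 15)^2\). A minimal sketch, with illustrative names:

```python
from math import sqrt

def survival(t):
    """P(T > t) = 225 / (t + 15)^2 for t >= 0."""
    return 225 / (t + 15) ** 2

# Dies within 3 days: F(3) = 1 - P(T > 3)
p_die_within_3 = 1 - survival(3)

# Survives at least 5 more days given survival to day 3: P(T > 8) / P(T > 3)
p_5_more_given_3 = survival(8) / survival(3)

# 10% survive: 225 / (t + 15)^2 = 0.10, so (t + 15)^2 = 2250
t_10_percent = sqrt(225 / 0.10) - 15

print(round(p_die_within_3, 4), round(p_5_more_given_3, 4), round(t_10_percent, 2))
```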
      6. (a) Explain what you understand by a hypothesis.
    220. Explain what you understand by a critical region. Mrs George claims that \(45 \%\) of voters would vote for her.
      In an opinion poll of 20 randomly selected voters it was found that 5 would vote for her.
    221. Test at the \(5 \%\) level of significance whether or not the opinion poll provides evidence to support Mrs George's claim. In a second opinion poll of \(n\) randomly selected people it was found that no one would vote for Mrs George.
    222. Using a \(1 \%\) level of significance, find the smallest value of \(n\) for which the hypothesis \(\mathrm { H } _ { 0 } : p = 0.45\) will be rejected in favour of \(\mathrm { H } _ { 1 } : p < 0.45\).
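For Question 6, the sketch below computes the lower-tail probability \(\mathrm{P}(X \leq 5)\) for \(X \sim \mathrm{B}(20, 0.45)\), which is the quantity a one-tailed test would compare with the significance level, and then searches for the smallest \(n\) with \(0.55^{n} < 0.01\). Treating the first test as lower-tailed is an assumption; the question itself does not state the alternative hypothesis.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# First poll: 5 supporters out of 20 under H0: p = 0.45
p_lower_tail = binomial_cdf(5, 20, 0.45)

# Second poll: nobody supports her, so P(X = 0) = 0.55^n.
# H0 is rejected at the 1% level once this probability drops below 0.01.
n = 1
while (1 - 0.45) ** n >= 0.01:
    n += 1

print(round(p_lower_tail, 4), n)
```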
      7. The continuous random variable \(X\) has the following probability density function $$f ( x ) = \begin{cases} a + b x , & 0 \leq x \leq 5 \\ 0 , & \text { otherwise } \end{cases}$$ where \(a\) and \(b\) are constants.
    223. Show that \(10 a + 25 b = 2\). Given that \(\mathrm { E } ( X ) = \frac { 35 } { 12 }\),
    224. find a second equation in \(a\) and \(b\),
    225. hence find the value of \(a\) and the value of \(b\).
    226. Find, to 3 significant figures, the median of \(X\).
    227. Comment on the skewness. Give a reason for your answer.
      Statistics S2 (R)
      Advanced/Advanced Subsidiary
      Friday 24 May 2013 - Morning
      Mathematical Formulae (Pink) Nil
      Candidates may use any calculator allowed by the regulations of the Joint Council for Qualifications. Calculators must not have the facility for symbolic algebra manipulation or symbolic differentiation/integration, or have retrievable mathematical formulae stored in them. This paper is strictly for students outside the UK. In the boxes above, write your centre number, candidate number, your surname, initials and signature.
      Check that you have the correct question paper.
      Answer ALL the questions.
      You must write your answer for each question in the space following the question.
      Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.
      Full marks may be obtained for answers to ALL questions.
      The marks for the parts of questions are shown in round brackets, e.g. (2).
      There are 7 questions in this question paper. The total mark for this paper is 75 .
      There are 24 pages in this question paper. Any blank pages are indicated. You must ensure that your answers to parts of questions are clearly labelled.
      You must show sufficient working to make your methods clear to the Examiner.
      Answers without working may not gain full credit.
      1. A bag contains a large number of counters. A third of the counters have a number 5 on them and the remainder have a number 1 .
      A random sample of 3 counters is selected.
    228. List all possible samples.
    229. Find the sampling distribution for the range.
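The sampling distribution of the range in Question 1 can be obtained by enumerating every ordered sample. This sketch assumes the bag is large enough for the three draws to be treated as independent, with \(\mathrm{P}(5) = \tfrac{1}{3}\) and \(\mathrm{P}(1) = \tfrac{2}{3}\); the names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Probability of each counter value on a single draw
values = {1: Fraction(2, 3), 5: Fraction(1, 3)}

range_dist = defaultdict(Fraction)  # range -> probability
for sample in product(values, repeat=3):
    prob = Fraction(1)
    for v in sample:
        prob *= values[v]
    range_dist[max(sample) - min(sample)] += prob

for r in sorted(range_dist):
    print(r, range_dist[r])
```

Using `Fraction` keeps the probabilities exact, so the result can be compared directly with a hand calculation.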
      2. The continuous random variable \(Y\) has cumulative distribution function $$\mathrm { F } ( y ) = \left\{ \begin{array} { c c } 0 & y < 0 \\ \frac { 1 } { 4 } \left( y ^ { 3 } - 4 y ^ { 2 } + k y \right) & 0 \leq y \leq 2 \\ 1 & y > 2 \end{array} \right.$$ where \(k\) is a constant.
    230. Find the value of \(k\).
    231. Find the probability density function of \(Y\), specifying it for all values of \(y\).
    232. Find \(\mathrm { P } ( Y > 1 )\).
      3. The random variable \(X\) has a continuous uniform distribution on \([ a , b ]\) where \(a\) and \(b\) are positive numbers. Given that \(\mathrm { E } ( X ) = 23\) and \(\operatorname { Var } ( X ) = 75\),
    233. find the value of \(a\) and the value of \(b\). Given that \(\mathrm { P } ( X > c ) = 0.32\),
    234. find \(\mathrm { P } ( 23 < X < c )\).
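Question 3 uses the standard results \(\mathrm{E}(X) = \tfrac{a + b}{2}\) and \(\operatorname{Var}(X) = \tfrac{(b - a)^2}{12}\) for a continuous uniform distribution. A minimal sketch of the arithmetic, with illustrative names:

```python
from math import sqrt

mean, var = 23, 75

# Var(X) = (b - a)^2 / 12 gives the width of the interval
width = sqrt(12 * var)
a = mean - width / 2
b = mean + width / 2

# P(X > c) = 0.32 fixes c; the mean of a uniform distribution is also its median
c = b - 0.32 * width
p_between = (c - mean) / width   # P(23 < X < c) = 0.5 - 0.32

print(a, b, round(p_between, 2))
```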
      4. The random variable \(X\) has probability density function \(\mathrm { f } ( x )\) given by $$f ( x ) = \left\{ \begin{array} { c c } k \left( 3 + 2 x - x ^ { 2 } \right) & 0 \leq x \leq 3 \\ 0 & \text { otherwise } \end{array} \right.$$ where \(k\) is a constant.
    235. Show that \(k = \frac { 1 } { 9 }\).
    236. Find the mode of \(X\).
    237. Use algebraic integration to find \(\mathrm { E } ( X )\). By comparing your answers to parts (b) and (c),
    238. describe the skewness of \(X\), giving a reason for your answer.
      5. In a village shop the customers must join a queue to pay. The number of customers joining the queue in a 10 minute interval is modelled by a Poisson distribution with mean 3. Find the probability that
    239. exactly 4 customers join the queue in the next 10 minutes,
    240. more than 10 customers join the queue in the next 20 minutes. When a customer reaches the front of the queue the customer pays the assistant. The time each customer takes paying the assistant, \(T\) minutes, has a continuous uniform distribution over the interval \([ 0,5 ]\). The random variable \(T\) is independent of the number of people joining the queue.
    241. Find \(\mathrm { P } ( T > 3.5 )\). In a random sample of 5 customers, the random variable \(C\) represents the number of customers who took more than 3.5 minutes paying the assistant.
    242. Find \(\mathrm { P } ( C \geq 3 )\). Bethan has just reached the front of the queue and starts paying the assistant.
    243. Find the probability that in the next 4 minutes Bethan finishes paying the assistant and no other customers join the queue.
      6. Frugal bakery claims that their packs of 10 muffins contain on average 80 raisins per pack. A Poisson distribution is used to describe the number of raisins per muffin. A muffin is selected at random to test whether or not the mean number of raisins per muffin has changed.
    244. Find the critical region for a two-tailed test using a \(10 \%\) level of significance. The probability of rejection in each tail should be less than 0.05 .
    245. Find the actual significance level of this test. The bakery has a special promotion claiming that their muffins now contain even more raisins. A random sample of 10 muffins is selected and is found to contain a total of 95 raisins.
    246. Use a suitable approximation to test the bakery's claim. You should state your hypotheses clearly and use a \(5 \%\) level of significance.
      7. As part of a selection procedure for a company, applicants have to answer all 20 questions of a multiple choice test. If an applicant chooses answers at random the probability of choosing a correct answer is 0.2 and the number of correct answers is represented by the random variable \(X\).
    247. Suggest a suitable distribution for \(X\). Each applicant gains 4 points for each correct answer but loses 1 point for each incorrect answer. The random variable \(S\) represents the final score, in points, for an applicant who chooses answers to this test at random.
    248. Show that \(S = 5 X - 20\).
    249. Find \(\mathrm { E } ( S )\) and \(\operatorname { Var } ( S )\). An applicant who achieves a score of at least 20 points is invited to take part in the final stage of the selection process.
    250. Find \(\mathrm { P } ( S \geq 20 )\). Cameron is taking the final stage of the selection process which is a multiple choice test consisting of 100 questions. He has been preparing for this test and believes that his chance of answering each question correctly is 0.4 .
    251. Using a suitable approximation, estimate the probability that Cameron answers more than half of the questions correctly.
      6684/01 Edexcel GCE Statistics S2, Advanced Level, Friday 24 May 2013 - Morning
      This paper has 7 questions. The total mark for this paper is 75 .
      P42035A ©2013 Edexcel Limited
      1. A bag contains a large number of \(1 \mathrm { p } , 2 \mathrm { p }\) and 5 p coins.
        \(50 \%\) are 1 p coins
        \(20 \%\) are 2 p coins
        \(30 \%\) are 5 p coins
        A random sample of 3 coins is chosen from the bag.
      2. List all the possible samples of size 3 with median 5p.
      3. Find the probability that the median value of the sample is 5p.
      4. Find the sampling distribution of the median of samples of size 3 .
      5. The number of defects per metre in a roll of cloth has a Poisson distribution with mean 0.25 .
      Find the probability that
    252. a randomly chosen metre of cloth has 1 defect,
    253. the total number of defects in a randomly chosen 6 metre length of cloth is more than 2 . A tailor buys 300 metres of cloth.
    254. Using a suitable approximation find the probability that the tailor's cloth will contain less than 90 defects.
      (5)
      3. An online shop sells a computer game at an average rate of 1 per day.
    255. Find the probability that the shop sells more than 10 games in a 7 day period. Once every 7 days the shop has games delivered before it opens.
    256. Find the least number of games the shop should have in stock immediately after a delivery so that the probability of running out of the game before the next delivery is less than 0.05 . In an attempt to increase sales of the computer game, the price is reduced for six months. A random sample of 28 days is taken from these six months. In the sample of 28 days, 36 computer games are sold.
    257. Using a suitable approximation and a \(5 \%\) level of significance, test whether or not the average rate of sales per day has increased during these six months. State your hypotheses clearly.
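The stock-level search and the test in Question 3 can be sketched as below. Two assumptions are made: "running out" is read as weekly demand exceeding the stock, and the "suitable approximation" is taken to be \(\mathrm{N}(28, 28)\) for \(\mathrm{Po}(28)\) with a continuity correction. Neither reading is confirmed by the question.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

weekly_mean = 7  # 1 game per day over a 7 day period

p_more_than_10 = 1 - poisson_cdf(10, weekly_mean)

# Smallest stock level with P(demand > stock) < 0.05
stock = 0
while 1 - poisson_cdf(stock, weekly_mean) >= 0.05:
    stock += 1

# Test: 36 sales in 28 days, H0 rate 1 per day, so X ~ Po(28) under H0.
# P(X >= 36) with continuity correction uses 35.5.
p_value = 1 - NormalDist(28, sqrt(28)).cdf(35.5)

print(round(p_more_than_10, 4), stock, round(p_value, 4))
```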
      4. A continuous random variable \(X\) is uniformly distributed over the interval \([ b , 4 b ]\) where \(b\) is a constant.
    258. Write down \(\mathrm { E } ( X )\).
    259. Use integration to show that \(\operatorname { Var } ( X ) = \frac { 3 b ^ { 2 } } { 4 }\).
    260. Find \(\operatorname { Var } ( 3 - 2 X )\). Given that \(b = 1\), find
    261. the cumulative distribution function of \(X , \mathrm {~F} ( x )\), for all values of \(x\),
    262. the median of \(X\).
      5. The continuous random variable \(X\) has a cumulative distribution function $$\mathrm { F } ( x ) = \begin{cases} 0 , & x < 1 \\ \frac { x ^ { 3 } } { 10 } + \frac { 3 x ^ { 2 } } { 10 } + a x + b , & 1 \leq x \leq 2 \\ 1 , & x > 2 \end{cases}$$ where \(a\) and \(b\) are constants.
    263. Find the value of \(a\) and the value of \(b\).
    264. Show that \(\mathrm { f } ( x ) = \frac { 3 } { 10 } \left( x ^ { 2 } + 2 x - 2 \right) , \quad 1 \leq x \leq 2\).
    265. Use integration to find \(\mathrm { E } ( X )\).
    266. Show that the lower quartile of \(X\) lies between 1.425 and 1.435.
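A numerical check of Question 5 is sketched below. It uses the boundary conditions \(\mathrm{F}(1) = 0\) and \(\mathrm{F}(2) = 1\) to find \(a\) and \(b\), and a midpoint-rule sum as a stand-in for the algebraic integration the question asks for. The numerical integration is only a check, not the intended method.

```python
# F(1) = 0 and F(2) = 1 give two linear equations in a and b:
#   a + b  = 0 - (1/10 + 3/10)
#   2a + b = 1 - (8/10 + 12/10)
rhs1 = 0 - (1**3 / 10 + 3 * 1**2 / 10)
rhs2 = 1 - (2**3 / 10 + 3 * 2**2 / 10)
a = rhs2 - rhs1
b = rhs1 - a

def F(x):
    """Cumulative distribution function on 1 <= x <= 2."""
    return x**3 / 10 + 3 * x**2 / 10 + a * x + b

def f(x):
    """Density on 1 <= x <= 2, the derivative of F."""
    return 3 / 10 * (x**2 + 2 * x - 2)

# E(X) by the midpoint rule on [1, 2]
steps = 10_000
h = 1 / steps
expectation = sum((1 + (i + 0.5) * h) * f(1 + (i + 0.5) * h) for i in range(steps)) * h

# The lower quartile q satisfies F(q) = 0.25, so F should change side between the bounds
print(round(a, 3), round(b, 3), round(expectation, 4), F(1.425), F(1.435))
```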
      6. In a manufacturing process \(25 \%\) of articles are thought to be defective. Articles are produced in batches of 20.
    267. A batch is selected at random. Using a \(5 \%\) significance level, find the critical region for a two tailed test that the probability of an article chosen at random being defective is 0.25 . You should state the probability in each tail, which should be as close as possible to 0.025 . The manufacturer changes the production process to try to reduce the number of defective articles. She then chooses a batch at random and discovers there are 3 defective articles.
    268. Test at the \(5 \%\) level of significance whether or not there is evidence that the changes to the process have reduced the percentage of defective articles. State your hypotheses clearly.
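The critical region in Question 6 can be found by scanning the tails of \(\mathrm{B}(20, 0.25)\) for the cut-offs whose tail probabilities are closest to 0.025. The sketch below also evaluates \(\mathrm{P}(X \leq 3)\) for the follow-up test, assuming a lower-tailed alternative \(\mathrm{H}_1 : p < 0.25\); the names are illustrative.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

n, p = 20, 0.25

# Lower tail: value c with P(X <= c) closest to 0.025
lower = min(range(n + 1), key=lambda c: abs(binomial_cdf(c, n, p) - 0.025))

# Upper tail: value c with P(X >= c) closest to 0.025
upper = min(range(1, n + 1), key=lambda c: abs(1 - binomial_cdf(c - 1, n, p) - 0.025))

lower_prob = binomial_cdf(lower, n, p)
upper_prob = 1 - binomial_cdf(upper - 1, n, p)

# Follow-up test: 3 defectives observed in a batch of 20
p_value = binomial_cdf(3, n, p)

print(f"Critical region: X <= {lower} or X >= {upper}")
print(round(lower_prob, 4), round(upper_prob, 4), round(p_value, 4))
```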
      7. A telesales operator is selling a magazine. Each day he chooses a number of people to telephone. The probability that each person he telephones buys the magazine is 0.1 .
    269. Suggest a suitable distribution to model the number of people who buy the magazine from the telesales operator each day.
    270. On Monday, the telesales operator telephones 10 people. Find the probability that he sells at least 4 magazines.
    271. Calculate the least number of people he needs to telephone on Tuesday, so that the probability of selling at least 1 magazine, on that day, is greater than 0.95 . A call centre also sells the magazine. The probability that a telephone call made by the call centre sells a magazine is 0.05 . The call centre telephones 100 people every hour.
    272. Using a suitable approximation, find the probability that more than 10 people telephoned by the call centre buy a magazine in a randomly chosen hour.
    Edexcel S3 Q1
    1. A hotel has 160 rooms of which 20 are classified as De-luxe, 40 Premier and 100 as Standard. The manager wants to obtain information about room usage in the hotel by taking a \(10 \%\) sample of the rooms.
      1. Suggest a suitable sampling method.
      2. Explain in detail how the manager should obtain the sample.
      3. A random sample of 100 classical CDs produced by a record company had a mean playing time of 70.6 minutes and a standard deviation of 9.1 minutes. An independent random sample of 120 CDs produced by a different company had a mean playing time of 67.2 minutes with a standard deviation of 8.4 minutes.
      4. Using a \(1 \%\) level of significance, test whether or not there is a difference in the mean playing times of the CDs produced by these two companies. State your hypotheses clearly.
      5. State an assumption you made in carrying out the test in part (a).
      6. The weights of a group of males are normally distributed with mean 80 kg and standard deviation 2.6 kg . A random sample of 10 of these males is selected.
      7. Write down the distribution of \(\bar { M }\), the mean weight, in kg , of this sample.
      8. Find \(\mathrm { P } ( \bar { M } < 78.5 )\).
      The weights of a group of females are normally distributed with mean 59 kg and standard deviation 1.9 kg . A random sample of 6 of the males and 4 of the females enters a lift that can carry a maximum load of 730 kg .
    2. Find the probability that the maximum load will be exceeded when these 10 people enter the lift.
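The lift question combines independent normal variables, so means and variances add. The sketch below assumes the ten weights are independent; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Mean weight of a random sample of 10 males: N(80, 2.6^2 / 10)
sample_mean = NormalDist(80, 2.6 / sqrt(10))
p_mean_below = sample_mean.cdf(78.5)

# Total weight of 6 males and 4 females, assuming independence
total_mean = 6 * 80 + 4 * 59
total_var = 6 * 2.6**2 + 4 * 1.9**2
p_overload = 1 - NormalDist(total_mean, sqrt(total_var)).cdf(730)

print(round(p_mean_below, 4), total_mean, round(total_var, 2), round(p_overload, 4))
```

Note that the variance of the total is a sum of ten separate variances, not the variance of a single weight multiplied by \(6^2\) or \(4^2\).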
      4. At the end of a season an athletics coach graded a random sample of ten athletes according to their performances throughout the season and their dedication to training. The results, expressed as percentages, are shown in the table below.
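      $$\begin{array} { c c c }
      \text { Athlete } & \text { Performance } & \text { Dedication } \\
      A & 86 & 72 \\
      B & 60 & 69 \\
      C & 78 & 59 \\
      D & 56 & 68 \\
      E & 80 & 80 \\
      F & 66 & 84 \\
      G & 31 & 65 \\
      H & 59 & 55 \\
      I & 73 & 79 \\
      J & 49 & 53
      \end{array}$$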
    3. Calculate the Spearman rank correlation coefficient between performance and dedication.
    4. Stating clearly your hypotheses and using a \(10 \%\) level of significance, interpret your rank correlation coefficient.
    5. Give a reason to support the use of the rank correlation coefficient rather than the product moment correlation coefficient with these data.
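The Spearman coefficient for the ten athletes can be computed from the formula \(r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}\). The sketch below ranks the highest score as 1 and relies on there being no tied scores in either column; the function name is illustrative.

```python
performance = {"A": 86, "B": 60, "C": 78, "D": 56, "E": 80,
               "F": 66, "G": 31, "H": 59, "I": 73, "J": 49}
dedication = {"A": 72, "B": 69, "C": 59, "D": 68, "E": 80,
              "F": 84, "G": 65, "H": 55, "I": 79, "J": 53}

def rank_descending(scores):
    """Rank 1 for the highest score; assumes no tied scores."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

perf_rank = rank_descending(performance)
ded_rank = rank_descending(dedication)

n = len(performance)
sum_d2 = sum((perf_rank[k] - ded_rank[k]) ** 2 for k in performance)
r_s = 1 - 6 * sum_d2 / (n * (n**2 - 1))

print(sum_d2, round(r_s, 4))
```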
      5. The manager of a leisure centre collected data on the usage of the facilities in the centre by its members. A random sample from her records is summarised below.
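      $$\begin{array} { l c c }
      \text { Facility } & \text { Male } & \text { Female } \\
      \text { Pool } & 40 & 68 \\
      \text { Jacuzzi } & 26 & 33 \\
      \text { Gym } & 52 & 31
      \end{array}$$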
      Making your method clear, test whether or not there is any evidence of an association between gender and use of the club facilities. State your hypotheses clearly and use a \(5 \%\) level of significance.
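The test statistic for the leisure-centre data can be sketched as below. Expected frequencies are (row total × column total) / grand total, and the 5% critical value 5.991 for 2 degrees of freedom is taken from standard \(\chi^2\) tables rather than computed.

```python
# Rows: Pool, Jacuzzi, Gym; columns: Male, Female
observed = [[40, 68], [26, 33], [52, 31]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total  # expected frequency
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_5pct = 5.991          # chi-squared table value for df = 2 (assumed from tables)
reject_h0 = chi2 > critical_5pct

print(round(chi2, 3), df, reject_h0)
```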
      6. Data were collected on the number of female puppies born in 200 litters of size 8. It was decided to test whether or not a binomial model with parameters \(n = 8\) and \(p = 0.5\) is a suitable model for these data. The following table shows the observed frequencies and the expected frequencies, to 2 decimal places, obtained in order to carry out this test.
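      $$\begin{array} { c c c }
      \text { Number of females } & \text { Observed number of litters } & \text { Expected number of litters } \\
      0 & 1 & 0.78 \\
      1 & 9 & 6.25 \\
      2 & 27 & 21.88 \\
      3 & 46 & R \\
      4 & 49 & S \\
      5 & 35 & T \\
      6 & 26 & 21.88 \\
      7 & 5 & 6.25 \\
      8 & 2 & 0.78
      \end{array}$$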
    6. Find the values of \(R , S\) and \(T\).
    7. Carry out the test to determine whether or not this binomial model is a suitable one. State your hypotheses clearly and use a \(5 \%\) level of significance. An alternative test might have involved estimating \(p\) rather than assuming \(p = 0.5\).
    8. Explain how this would have affected the test.
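The expected frequencies in the puppy question are \(200 \times \binom{8}{k} \times 0.5^{8}\). The sketch below reproduces the table and then forms a \(\chi^2\) statistic after pooling the end classes (0 with 1, and 7 with 8) so that no expected frequency is below 5. The pooling choice follows the usual convention and is an assumption here.

```python
from math import comb

litters, n, p = 200, 8, 0.5
observed = [1, 9, 27, 46, 49, 35, 26, 5, 2]

expected = [litters * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
R, S, T = expected[3], expected[4], expected[5]

# Pool classes 0-1 and 7-8 because the end classes have expected frequency below 5
obs_pooled = [observed[0] + observed[1]] + observed[2:7] + [observed[7] + observed[8]]
exp_pooled = [expected[0] + expected[1]] + expected[2:7] + [expected[7] + expected[8]]

chi2 = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 1       # no parameter estimated, since p = 0.5 is assumed

print(R, round(S, 2), T, round(chi2, 3), df)
```

If \(p\) were estimated from the data instead, one further degree of freedom would be lost.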
      7. The weights of tubs of margarine are known to be normally distributed. A random sample of 10 tubs of margarine were weighed, to the nearest gram, and the results were as follows. $$\begin{array} { l l l l l l l l l l } 498 & 502 & 500 & 496 & 509 & 504 & 511 & 497 & 506 & 499 \end{array}$$
    9. Find unbiased estimates of the mean and the variance of the population from which this sample was taken. Given that the population standard deviation is 5.0 g ,
    10. estimate limits, to 2 decimal places, between which \(90 \%\) of the weights of the tubs lie,
    11. find a \(95 \%\) confidence interval for the mean weight of the tubs. A second random sample of 15 tubs was found to have a mean weight of 501.9 g .
    12. Stating your hypotheses clearly and using a \(1 \%\) level of significance, test whether or not the mean weight of these tubs is greater than 500 g .
      END
      Items included with question papers: Nil
      Answer Book (AB16)
      Graph Paper (ASG2)
      Mathematical Formulae (Lilac)
      Candidates may use any calculator EXCEPT those with the facility for symbolic algebra, differentiation and/or integration. Thus candidates may NOT use calculators such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP 48G.
      Paper Reference(s): 6685
      Edexcel GCE Statistics S3
      Advanced/Advanced Subsidiary
      Thursday 5 June 2003 - Morning
      Time: \(\mathbf { 1 }\) hour \(\mathbf { 3 0 }\) minutes In the boxes on the answer book, write the name of the examining body (Edexcel), your centre number, candidate number, the unit title (Statistics S3), the paper reference (6685), your surname, other name and signature.
      Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.
      Full marks may be obtained for answers to ALL questions.
      This paper has seven questions. You must ensure that your answers to parts of questions are clearly labelled.
      You must show sufficient working to make your methods clear to the Examiner. Answers without working may gain no credit.
      1. Explain how to obtain a sample from a population using
      2. stratified sampling,
      3. quota sampling.
      Give one advantage and one disadvantage of each sampling method.
    Edexcel S3 Q5
    5. The manager of a leisure centre collected data on the usage of the facilities in the centre by its members. A random sample from her records is summarised below.
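    $$\begin{array} { l c c }
    \text { Facility } & \text { Male } & \text { Female } \\
    \text { Pool } & 40 & 68 \\
    \text { Jacuzzi } & 26 & 33 \\
    \text { Gym } & 52 & 31
    \end{array}$$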
    Making your method clear, test whether or not there is any evidence of an association between gender and use of the club facilities. State your hypotheses clearly and use a \(5 \%\) level of significance.
    Edexcel S3 Q6
    6. Data were collected on the number of female puppies born in 200 litters of size 8. It was decided to test whether or not a binomial model with parameters \(n = 8\) and \(p = 0.5\) is a suitable model for these data. The following table shows the observed frequencies and the expected frequencies, to 2 decimal places, obtained in order to carry out this test.
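    $$\begin{array} { c c c }
    \text { Number of females } & \text { Observed number of litters } & \text { Expected number of litters } \\
    0 & 1 & 0.78 \\
    1 & 9 & 6.25 \\
    2 & 27 & 21.88 \\
    3 & 46 & R \\
    4 & 49 & S \\
    5 & 35 & T \\
    6 & 26 & 21.88 \\
    7 & 5 & 6.25 \\
    8 & 2 & 0.78
    \end{array}$$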
    1. Find the values of \(R , S\) and \(T\).
    2. Carry out the test to determine whether or not this binomial model is a suitable one. State your hypotheses clearly and use a \(5 \%\) level of significance. An alternative test might have involved estimating \(p\) rather than assuming \(p = 0.5\).
    3. Explain how this would have affected the test.
    Edexcel S3 Q7
    7. The weights of tubs of margarine are known to be normally distributed. A random sample of 10 tubs of margarine were weighed, to the nearest gram, and the results were as follows. $$\begin{array} { l l l l l l l l l l } 498 & 502 & 500 & 496 & 509 & 504 & 511 & 497 & 506 & 499 \end{array}$$
    1. Find unbiased estimates of the mean and the variance of the population from which this sample was taken. Given that the population standard deviation is 5.0 g ,
    2. estimate limits, to 2 decimal places, between which \(90 \%\) of the weights of the tubs lie,
    3. find a \(95 \%\) confidence interval for the mean weight of the tubs. A second random sample of 15 tubs was found to have a mean weight of 501.9 g .
    4. Stating your hypotheses clearly and using a \(1 \%\) level of significance, test whether or not the mean weight of these tubs is greater than 500 g .
      Paper Reference 6685, Edexcel GCE Statistics S3, Advanced/Advanced Subsidiary, Thursday 5 June 2003 - Morning, Time: 1 hour 30 minutes
      1. Explain how to obtain a sample from a population using
      2. stratified sampling,
      3. quota sampling.
      Give one advantage and one disadvantage of each sampling method.
      2. A random sample of 30 apples was taken from a batch. The mean weight of the sample was 124 g with standard deviation 20 g .
    5. Find a \(99 \%\) confidence interval for the mean weight \(\mu\) grams of the population of apples. Write down any assumptions you made in your calculations. Given that the actual value of \(\mu\) is 140 ,
    6. state, with a reason, what you can conclude about the sample of 30 apples.
      3. Given the random variables \(X \sim \mathrm {~N} ( 20,5 )\) and \(Y \sim \mathrm {~N} ( 10,4 )\) where \(X\) and \(Y\) are independent, find
    7. \(\mathrm { E } ( X - Y )\),
    8. \(\operatorname { Var } ( X - Y )\),
    9. \(\mathrm { P } ( 13 < X - Y < 16 )\).
      4. A new drug to treat the common cold was used with a randomly selected group of 100 volunteers. Each was given the drug and their health was monitored to see if they caught a cold. A randomly selected control group of 100 volunteers was treated with a dummy pill. The results are shown in the table below.
    10. Write down a suitable model for \(X\).
    11. Test, at the \(1 \%\) level of significance, the suitability of your model for these data.
    12. Explain how the test would have been modified if it had not been assumed that the dice were fair.
      7. The random variable \(D\) is defined as $$D = A - 3 B + 4 C$$ where \(A \sim \mathrm {~N} \left( 5,2 ^ { 2 } \right) , B \sim \mathrm {~N} \left( 7,3 ^ { 2 } \right)\) and \(C \sim \mathrm {~N} \left( 9,4 ^ { 2 } \right)\), and \(A , B\) and \(C\) are independent.
    13. Find \(\mathrm { P } ( \mathrm { D } < 44 )\). The random variables \(B _ { 1 } , B _ { 2 }\) and \(B _ { 3 }\) are independent and each has the same distribution as \(B\). The random variable \(X\) is defined as $$X = A - \sum _ { i = 1 } ^ { 3 } B _ { i } + 4 C$$
    14. Find \(\mathrm { P } ( X > 0 )\).
      6685/01 6691/01 Edexcel GCE Statistics S3, Thursday 9 June 2005 - Morning
      This paper has seven questions. The total mark for this paper is 75 .
      1. A researcher carried out a survey of three treatments for a fruit tree disease. The contingency table below shows the results of a survey of a random sample of 60 diseased trees.
      Using a \(5 \%\) significance level, test whether or not there is an association between gender and acceptance or rejection of an annual flu injection. State your hypotheses clearly.
      5. Upon entering a school, a random sample of eight girls and an independent random sample of eighty boys were given the same examination in mathematics. The girls and boys were then taught in separate classes. After one year, they were all given another common examination in mathematics. The means and standard deviations of the boys' and the girls' marks are shown in the table.
    15. Find, to 3 decimal places, the Spearman rank correlation coefficient between the distance of the shop from the tourist attraction and the price of an ice cream.
    16. Stating your hypotheses clearly and using a \(5 \%\) one-tailed test, interpret your rank correlation coefficient.
      5. The workers in a large office block use a lift that can carry a maximum load of 1090 kg . The weights of the male workers are normally distributed with mean 78.5 kg and standard deviation 12.6 kg . The weights of the female workers are normally distributed with mean 62.0 kg and standard deviation 9.8 kg . Random samples of 7 males and 8 females can enter the lift.
    17. Find the mean and variance of the total weight of the 15 people that enter the lift.
    18. Comment on any relationship you have assumed in part (a) between the two samples.
    19. Find the probability that the maximum load of the lift will be exceeded by the total weight of the 15 people.
      6. A research worker studying colour preference and the age of a random sample of 50 children obtained the results shown below.
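      $$\begin{array} { c c c c }
      \text { Age in years } & \text { Red } & \text { Blue } & \text { Totals } \\
      4 & 12 & 6 & 18 \\
      8 & 10 & 7 & 17 \\
      12 & 6 & 9 & 15 \\
      \text { Totals } & 28 & 22 & 50
      \end{array}$$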
      Using a \(5 \%\) significance level, carry out a test to decide whether or not there is an association between age and colour preference. State your hypotheses clearly.
      7. A machine produces metal containers. The weights of the containers are normally distributed. A random sample of 10 containers from the production line was weighed, to the nearest 0.1 kg , and gave the following results $$\begin{array} { l l l l l } 49.7 , & 50.3 , & 51.0 , & 49.5 , & 49.9 \\ 50.1 , & 50.2 , & 50.0 , & 49.6 , & 49.7 . \end{array}$$
    20. Find unbiased estimates of the mean and variance of the weights of the population of metal containers. The machine is set to produce metal containers whose weights have a population standard deviation of 0.5 kg .
    21. Estimate the limits between which \(95 \%\) of the weights of metal containers lie.
    22. Determine the \(99 \%\) confidence interval for the mean weight of metal containers.
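The estimates and intervals for the metal containers can be sketched as below. The 95% limits for individual weights are centred on the sample mean as an estimate of the population mean, which is an assumption; the population standard deviation 0.5 kg comes from the question.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

weights = [49.7, 50.3, 51.0, 49.5, 49.9, 50.1, 50.2, 50.0, 49.6, 49.7]

xbar = mean(weights)          # unbiased estimate of the population mean
s2 = variance(weights)        # unbiased estimate of the variance (divisor n - 1)

sigma = 0.5                   # population standard deviation given in the question
z95 = NormalDist().inv_cdf(0.975)
z99 = NormalDist().inv_cdf(0.995)

# Limits containing 95% of individual weights
limits_95 = (xbar - z95 * sigma, xbar + z95 * sigma)

# 99% confidence interval for the mean weight
half_width = z99 * sigma / sqrt(len(weights))
ci_99 = (xbar - half_width, xbar + half_width)

print(round(xbar, 2), round(s2, 4))
print([round(v, 2) for v in limits_95], [round(v, 2) for v in ci_99])
```

The first interval describes individual containers and uses \(\sigma\), while the second describes the mean and uses \(\sigma / \sqrt{n}\).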
    Edexcel S3 Q8
    8. Five coins were tossed 100 times and the number of heads recorded. The results are shown in the table below.
    1. Calculate Spearman's rank correlation coefficient for the marks awarded by the two judges. After the show, one competitor complained about the judges. She claimed that there was no positive correlation between their marks.
    2. Stating your hypotheses clearly, test whether or not this sample provides support for the competitor's claim. Use a \(5 \%\) level of significance.
      (4)
      2. The Director of Studies at a large college believed that students' grades in Mathematics were independent of their grades in English. She examined the results of a random group of candidates who had studied both subjects and she recorded the number of candidates in each of the 6 categories shown. Showing your working clearly, test, at the \(1 \%\) level of significance, whether or not there is an association between gender and the type of course taken. State your hypotheses clearly.
      3. The product moment correlation coefficient is denoted by \(r\) and Spearman's rank correlation coefficient is denoted by \(r _ { s }\).
    3. Sketch separate scatter diagrams, with five points on each diagram, to show
      1. \(r = 1\),
      2. \(r _ { s } = - 1\) but \(r > - 1\). Two judges rank seven collie dogs in a competition. The collie dogs are labelled \(A\) to \(G\) and the rankings are as follows.
    4. Calculate Spearman's rank correlation coefficient for these data.
    5. Stating your hypotheses clearly and using a one tailed test with a \(5 \%\) level of significance, interpret your rank correlation coefficient.
    6. Give a reason to support the use of the rank correlation coefficient rather than the product moment correlation coefficient with these data.
      (1)
      4. A sample of size 8 is to be taken from a population that is normally distributed with mean 55 and standard deviation 3. Find the probability that the sample mean will be greater than 57.
      (5)
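The sample-mean question above can be checked numerically. The following is a minimal sketch using only Python's standard library; the function name `prob_sample_mean_exceeds` is an illustrative choice, not part of the question. It relies on the fact that the mean of \(n\) independent observations from \(\mathrm{N}(\mu, \sigma^2)\) is distributed \(\mathrm{N}(\mu, \sigma^2 / n)\).

```python
from math import sqrt
from statistics import NormalDist


def prob_sample_mean_exceeds(mu, sigma, n, threshold):
    """Return P(sample mean > threshold) for a sample of size n from N(mu, sigma^2)."""
    # The sample mean is normal with the same mean and standard error sigma / sqrt(n).
    sample_mean_dist = NormalDist(mu, sigma / sqrt(n))
    return 1 - sample_mean_dist.cdf(threshold)


# Values taken from the question: mean 55, standard deviation 3, n = 8, threshold 57.
p = prob_sample_mean_exceeds(mu=55, sigma=3, n=8, threshold=57)
print(round(p, 4))
```

The standardised value here is \(z = 2 / (3 / \sqrt{8}) \approx 1.89\), so the probability is roughly 0.03.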
      5. The number of goals scored by a football team is recorded for 100 games. The results are summarised in Table 1 below. \begin{table}[h]
    7. Calculate Spearman's rank correlation coefficient between \(b\) and \(s\).
    8. Stating your hypotheses clearly, test whether or not the data provides support for the researcher's claim. Use a \(1 \%\) level of significance.
      (4)
      5. A random sample of 100 people were asked if their finances were worse, the same or better than this time last year. The sample was split according to their annual income and the results are shown in the table below.
    9. Calculate Spearman's rank correlation coefficient between \(h\) and \(c\). After collecting the data, the councillor thinks there is no correlation between hardship and the number of calls to the emergency services.
    10. Test, at the \(5 \%\) level of significance, the councillor's claim. State your hypotheses clearly.
      3. A factory manufactures batches of an electronic component. Each component is manufactured in one of three shifts. A component may have one of two types of defect, \(D _ { 1 }\) or \(D _ { 2 }\), at the end of the manufacturing process. A production manager believes that the type of defect is dependent upon the shift that manufactured the component. He examines 200 randomly selected defective components and classifies them by defect type and shift. The results are shown in the table below.
    11. Calculate Spearman's rank correlation coefficient for these data.
    12. Test, at the \(5 \%\) level of significance, whether there is agreement between the rankings awarded by each manager. State your hypotheses clearly. Manager \(Y\) later discovered he had miscopied his score for candidate \(D\) and it should be 54.
    13. Without carrying out any further calculations, explain how you would calculate Spearman rank correlation in this case.
      (2)
      2. A lake contains 3 species of fish. There are estimated to be 1400 trout, 600 bass and 450 pike in the lake. A survey of the health of the fish in the lake is carried out and a sample of 30 fish is chosen.
    14. Give a reason why stratified random sampling cannot be used.
    15. State an appropriate sampling method for the survey.
    16. Give one advantage and one disadvantage of this sampling method.
    17. Explain how this sampling method could be used to select the sample of 30 fish. You must show your working.
      (4)
      3. (a) Explain what you understand by the Central Limit Theorem. A garage services hire cars on behalf of a hire company. The garage knows that the lifetime of the brake pads has a standard deviation of 5000 miles. The garage records the lifetimes, \(x\) miles, of the brake pads it has replaced. The garage takes a random sample of 100 brake pads and finds that \(\sum x = 1740000\).
    18. Find a \(95 \%\) confidence interval for the mean lifetime of a brake pad.
    19. Explain the relevance of the Central Limit Theorem in part (b). Brake pads are made to be changed every 20000 miles on average. The hire car company complain that the garage is changing the brake pads too soon.
    20. Comment on the hire company's complaint. Give a reason for your answer.
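As an illustration of the brake pad confidence interval, here is a short sketch using Python's standard library. The helper name `z_interval` is hypothetical. It assumes the known population standard deviation of 5000 miles from the question and treats the sample mean as approximately normal, which is where the Central Limit Theorem comes in for a sample of 100.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(sample_sum, n, sigma, level):
    """Two-sided confidence interval for a mean when sigma is known."""
    mean = sample_sum / n
    # Critical value, for example about 1.96 when level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)
    return mean - half_width, mean + half_width


# Values from the question: sum of x = 1740000, n = 100, sigma = 5000.
lower, upper = z_interval(1_740_000, 100, 5000, 0.95)
print(round(lower), round(upper))

# Compare the design figure of 20000 miles with the interval.
print(lower <= 20000 <= upper)
```

With these figures the sample mean is 17400 miles and the interval is roughly 16420 to 18380 miles, so 20000 lies above it.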
      4. Two breeds of chicken are surveyed to measure their egg yield. The results are shown in the table below.
    21. Find, to 3 decimal places, Spearman's rank correlation coefficient between the population and the number of council employees.
    22. Use your value of Spearman's rank correlation coefficient to test for evidence of a positive correlation between the population and the number of council employees. Use a \(2.5 \%\) significance level. State your hypotheses clearly. It is suggested that a product moment correlation coefficient would be a more suitable calculation in this case. The product moment correlation coefficient for these data is 0.627 to 3 decimal places.
    23. Use the value of the product moment correlation coefficient to test for evidence of a positive correlation between the population and the number of council employees. Use a \(2.5 \%\) significance level.
    24. Interpret and comment on your results from part (b) and part (c).
      4. John thinks that a person's eye colour is related to their hair colour. He takes a random sample of 600 people and records their eye and hair colours. The results are shown in Table 1. \begin{table}[h] Using a \(5 \%\) level of significance, test whether or not there is an association between cholesterol level and intake of saturated fats. State your hypotheses and show your working clearly.
      2. The table below shows the number of students per member of staff and the student satisfaction scores for 7 universities.
    25. Calculate Spearman's rank correlation coefficient for these data. The journalist believes that car models with higher fuel efficiency will achieve higher sales.
    26. Stating your hypotheses clearly, test whether or not the data support the journalist's belief. Use a \(5 \%\) level of significance.
    27. State the assumption necessary for a product moment correlation coefficient to be valid in this case.
      (1)
    28. The mean and median fuel efficiencies of the car models in the random sample are \(14.5 \mathrm {~km} /\) litre and \(15.65 \mathrm {~km} /\) litre respectively. Considering these statistics, as well as the distribution of the fuel efficiency data, state whether or not the data suggest that the assumption in part (c) might be true in this case. Give a reason for your answer.
      (No further calculations are required.)
      2. A survey asked a random sample of 200 people their age and the main use of their mobile phone. The results are shown in Table 1 below. \begin{table}[h] Stating your hypotheses, test at the \(5 \%\) level of significance, whether or not there is evidence of an association between happiness and gender. Show your working clearly.
      4. The random variable \(A\) is defined as $$A = B + 4 C - 3 D$$ where \(B\), \(C\) and \(D\) are independent random variables with $$B \sim \mathrm {~N} \left( 6,2 ^ { 2 } \right) \quad C \sim \mathrm {~N} \left( 7,3 ^ { 2 } \right) \quad D \sim \mathrm {~N} \left( 4,1.5 ^ { 2 } \right)$$ Find \(\mathrm { P } ( A < 45 )\).
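For the linear combination \(A = B + 4C - 3D\) above, a quick numerical check is sketched below. The helper `linear_combination` is an illustrative name. It uses the standard results \(\mathrm{E}(aX) = a\,\mathrm{E}(X)\) and \(\operatorname{Var}(aX) = a^2 \operatorname{Var}(X)\) for independent normal variables.

```python
from math import sqrt
from statistics import NormalDist


def linear_combination(terms):
    """Distribution of sum(c * X) for independent normal X.

    terms is a list of (coefficient, mean, standard deviation) tuples.
    """
    mean = sum(c * m for c, m, _ in terms)
    # Variances add, each scaled by the square of its coefficient.
    variance = sum(c ** 2 * s ** 2 for c, _, s in terms)
    return NormalDist(mean, sqrt(variance))


# B ~ N(6, 2^2), C ~ N(7, 3^2), D ~ N(4, 1.5^2), as given in the question.
A = linear_combination([(1, 6, 2), (4, 7, 3), (-3, 4, 1.5)])
p = A.cdf(45)
print(A.mean, round(A.variance, 2), round(p, 4))
```

Here \(\mathrm{E}(A) = 6 + 28 - 12 = 22\) and \(\operatorname{Var}(A) = 4 + 144 + 20.25 = 168.25\), giving a probability of about 0.962.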
      5. A research station is doing some work on the germination of a new variety of genetically modified wheat. They planted 120 rows containing 7 seeds in each row.
      The number of seeds germinating in each row was recorded. The results are as follows Starting with the top left-hand corner (319) and working across, the committee selects 50 random numbers. The first 2 suitable numbers are 241 and 278. Numbers greater than 300 are ignored.
    29. Find the next two suitable numbers. When the club's committee looks at the members corresponding to their random numbers they find that only 1 female has been selected.
      The committee does not want to be accused of being biased towards males so considers using a systematic sample instead.
      1. Explain clearly how the committee could take a systematic sample.
      2. Explain why a systematic sample may not give a sample that represents the proportion of males and females in the club. The committee decides to use a stratified sample instead.
    30. Describe how to choose members for the stratified sample.
    31. Explain an advantage of using a stratified sample rather than a quota sample.
      2. The random variable \(X\) follows a continuous uniform distribution over the interval \([ \alpha - 3,2 \alpha + 3 ]\) where \(\alpha\) is a constant.
      The mean of a random sample of size \(n\) is denoted by \(\bar { X }\).
    32. Show that \(\bar { X }\) is a biased estimator of \(\alpha\), and state the bias. Given that \(Y = k \bar { X }\) is an unbiased estimator for \(\alpha\),
    33. find the value of \(k\). A random sample of 10 values of \(X\) is taken and the results are as follows $$\begin{array} { l l l l l l l l l l } 3 & 5 & 8 & 12 & 4 & 13 & 10 & 8 & 5 & 12 \end{array}$$
    34. Hence estimate the maximum value of \(X\).
      3. A grocer believes that the average weight of a grapefruit from farm \(A\) is greater than the average weight of a grapefruit from farm \(B\). The weights, in grams, of 80 grapefruit selected at random from farm \(A\) have a mean value of 532 g and a standard deviation, \(s _ { A }\), of 35 g. A random sample of 100 grapefruit from farm \(B\) have a mean weight of 520 g and a standard deviation, \(s _ { B }\), of 28 g. Stating your hypotheses clearly and using a \(1 \%\) level of significance, test whether or not the grocer's belief is supported by the data.
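The grapefruit question is a one-tailed test for a difference of two means with large samples. A minimal sketch of the test statistic follows, assuming \(H_0: \mu_A = \mu_B\) against \(H_1: \mu_A > \mu_B\) and using the sample standard deviations in place of the population values (reasonable here because both samples are large). The function name `two_sample_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Test statistic for the difference of two means (large samples)."""
    standard_error = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Farm A: n = 80, mean 532, sd 35. Farm B: n = 100, mean 520, sd 28.
z = two_sample_z(532, 35, 80, 520, 28, 100)

# One-tailed critical value at the 1% level.
critical = NormalDist().inv_cdf(0.99)
print(round(z, 3), round(critical, 4), z > critical)
```

The statistic is about 2.49 against a critical value of about 2.33, so under these assumptions the result is significant at the 1% level.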
      4. In a survey 10 randomly selected men had their systolic blood pressure, \(x\), and weight, \(w\), measured. Their results are as follows:
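      $$\begin{array} { | l | c | c | c | c | c | c | c | c | c | c | }
      \hline
      \text{Man} & \boldsymbol { A } & \boldsymbol { B } & \boldsymbol { C } & \boldsymbol { D } & \boldsymbol { E } & \boldsymbol { F } & \boldsymbol { G } & \boldsymbol { H } & \boldsymbol { I } & \boldsymbol { J } \\
      \hline
      x & 123 & 128 & 137 & 143 & 149 & 153 & 154 & 159 & 162 & 168 \\
      \hline
      w & 78 & 93 & 85 & 83 & 75 & 98 & 88 & 87 & 95 & 99 \\
      \hline
      \end{array}$$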
    35. Calculate the value of Spearman's rank correlation coefficient between \(x\) and \(w\).
    36. Stating your hypotheses clearly, test at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between systolic blood pressure and weight. The product moment correlation coefficient for these data is 0.5114.
    37. Use the value of the product moment correlation coefficient to test, at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between systolic blood pressure and weight.
    38. Using your conclusions to part (b) and part (c), describe the relationship between systolic blood pressure and weight.
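Spearman's coefficient for the blood pressure and weight data can be checked with the usual formula \(r_s = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)}\). The sketch below is one way to do this; the helper names `ranks` and `spearman` are illustrative, and the ranking function assumes there are no tied values, which holds for these data.

```python
def ranks(values):
    """Rank values from 1 (smallest) upwards. Assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's rank correlation coefficient via the sum of squared rank differences."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Data for men A to J, as given in the question.
x = [123, 128, 137, 143, 149, 153, 154, 159, 162, 168]
w = [78, 93, 85, 83, 75, 98, 88, 87, 95, 99]

r_s = spearman(x, w)
print(round(r_s, 4))
```

For these data \(\sum d^2 = 64\), so \(r_s = 1 - 384/990 \approx 0.612\). The value then needs comparing with the tabulated critical value for \(n = 10\).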
      5. A random sample of 200 people were asked which hot drink they preferred from tea, coffee and hot chocolate. The results are given below.
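      $$\begin{array} { | l | c | c | c | c | }
      \hline
      \text{Gender / Type of drink preferred} & \text{Tea} & \text{Coffee} & \text{Hot Chocolate} & \text{Total} \\
      \hline
      \text{Males} & 57 & 26 & 11 & 94 \\
      \hline
      \text{Females} & 42 & 47 & 17 & 106 \\
      \hline
      \text{Total} & 99 & 73 & 28 & 200 \\
      \hline
      \end{array}$$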
    39. Test, at the \(5 \%\) significance level, whether or not there is an association between type of drink preferred and gender. State your hypotheses and show your working clearly. You should state your expected frequencies to 2 decimal places.
    40. State what difference using a \(0.5 \%\) significance level would make to your conclusion. Give a reason for your answer.
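The contingency table test on drink preference can be sketched as follows. Expected frequencies are (row total × column total) / grand total, and the statistic is \(\sum (O - E)^2 / E\) with \((2 - 1)(3 - 1) = 2\) degrees of freedom. The function names are illustrative, and the critical values 5.991 and 10.597 are the standard tabulated \(\chi^2_2\) values at the 5% and 0.5% levels, entered by hand rather than computed.

```python
def expected_frequencies(table):
    """Expected counts under independence for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_squared(table):
    """Pearson chi-squared statistic for a two-way table."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Rows: males, females. Columns: tea, coffee, hot chocolate.
observed = [[57, 26, 11], [42, 47, 17]]

expected = expected_frequencies(observed)
statistic = chi_squared(observed)
print([[round(e, 2) for e in row] for row in expected])
print(round(statistic, 2))

# Tabulated chi-squared critical values for 2 degrees of freedom (assumed from tables).
print(statistic > 5.991, statistic > 10.597)
```

The statistic comes out at about 8.91, which lies between the two critical values, so the conclusion changes when the significance level is tightened from 5% to 0.5%.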
      6. Eight tasks were given to each of 125 randomly selected job applicants. The number of tasks failed by each applicant is recorded. The results are as follows:
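      $$\begin{array} { | l | c | c | c | c | c | c | c | }
      \hline
      \text{Number of tasks failed by an applicant} & 0 & 1 & 2 & 3 & 4 & 5 & \text{6 or more} \\
      \hline
      \text{Frequency} & 2 & 21 & 45 & 42 & 12 & 3 & 0 \\
      \hline
      \end{array}$$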
    41. Show that the probability of a randomly selected task, from this sample, being failed is 0.3. An employer believes that a binomial distribution might provide a good model for the number of tasks, out of 8, that an applicant fails. He uses a binomial distribution, with the estimated probability 0.3 of a task being failed. The calculated expected frequencies are as follows:
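      $$\begin{array} { | l | c | c | c | c | c | c | c | }
      \hline
      \text{Number of tasks failed by an applicant} & 0 & 1 & 2 & 3 & 4 & 5 & \text{6 or more} \\
      \hline
      \text{Frequency} & 7.21 & 24.71 & 37.06 & r & 17.02 & 5.83 & s \\
      \hline
      \end{array}$$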
    42. Find the value of \(r\) and the value of \(s\) giving your answers to 2 decimal places.
    43. Test, at the \(5 \%\) level of significance, whether or not a binomial distribution is a suitable model for these data. State your hypotheses and show your working clearly. The employer believes that all applicants have the same probability of failing each task.
    44. Use your result from part (c) to comment on this belief.
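The estimated probability and the missing expected frequencies in the binomial question can be reproduced with a short script. This is a sketch under the model \(\mathrm{B}(8, 0.3)\) stated in the question; the names `binomial_pmf`, `p_hat`, `r` and `s` are chosen here to mirror the question, and the observed "6 or more" class is empty, so treating it as 6 does not affect the mean.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Observed frequencies: number of tasks failed -> number of applicants.
observed = {0: 2, 1: 21, 2: 45, 3: 42, 4: 12, 5: 3, 6: 0}
applicants = sum(observed.values())
tasks_failed = sum(k * f for k, f in observed.items())

# Proportion of all tasks attempted (8 per applicant) that were failed.
p_hat = tasks_failed / (applicants * 8)

# Expected frequency for exactly 3 failures, and for 6 or more failures.
r = applicants * binomial_pmf(3, 8, p_hat)
s = applicants * sum(binomial_pmf(k, 8, p_hat) for k in range(6, 9))
print(p_hat, round(r, 2), round(s, 2))
```

Computed directly, \(r \approx 31.77\) and \(s \approx 1.41\). Finding \(s\) instead by subtracting the rounded frequencies from 125 gives 1.40, so the last digit depends on the method used.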
      7. The random variable \(X\) is defined as $$X = 4 Y - 3 W$$ where \(Y \sim \mathrm {~N} \left( 40,3 ^ { 2 } \right) , W \sim \mathrm {~N} \left( 50,2 ^ { 2 } \right)\) and \(Y\) and \(W\) are independent.
    45. Find \(\mathrm { P } ( X > 25 )\). The random variables \(Y _ { 1 } , Y _ { 2 }\) and \(Y _ { 3 }\) are independent and each has the same distribution as \(Y\). The random variable \(A\) is defined as $$A = \sum _ { i = 1 } ^ { 3 } Y _ { i }$$ The random variable \(C\) is such that \(C \sim \mathrm {~N} \left( 115 , \sigma ^ { 2 } \right)\).
      Given that \(\mathrm { P } ( A - C < 0 ) = 0.2\) and that \(A\) and \(C\) are independent,
    46. find the variance of \(C\).
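Both parts of the question on \(X = 4Y - 3W\) can be checked numerically. The sketch below assumes the standard results for independent normal variables: \(X \sim \mathrm{N}(10, 180)\), \(A \sim \mathrm{N}(120, 27)\) and \(A - C \sim \mathrm{N}(5, 27 + \sigma^2)\). Variable names such as `var_c` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Part (a): X = 4Y - 3W with Y ~ N(40, 3^2) and W ~ N(50, 2^2).
mean_x = 4 * 40 - 3 * 50
var_x = 4 ** 2 * 3 ** 2 + 3 ** 2 * 2 ** 2
p_x_gt_25 = 1 - NormalDist(mean_x, sqrt(var_x)).cdf(25)

# Part (b): A is the sum of three independent copies of Y, so A ~ N(120, 27).
mean_a = 3 * 40
var_a = 3 * 3 ** 2

# A - C has mean 120 - 115 = 5 and variance var_a + var_c.
mean_diff = mean_a - 115

# P(A - C < 0) = 0.2 means (0 - mean_diff) / sd_diff equals the 20th percentile of N(0, 1).
z_20 = NormalDist().inv_cdf(0.2)
sd_diff = (0 - mean_diff) / z_20
var_c = sd_diff ** 2 - var_a

print(round(p_x_gt_25, 4), round(var_c, 2))
```

This gives \(\mathrm{P}(X > 25) \approx 0.132\) and a variance for \(C\) of about 8.3. Answers obtained from printed tables may differ slightly in the final digit.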
    Edexcel D1 2014 January Q1
    1. 11 17 10 14 8 13 6 4 15 7
    1. Use the bubble sort algorithm to perform ONE complete pass towards sorting these numbers into ascending order. The original list is now to be sorted into descending order.
    2. Use a quick sort to obtain the sorted list, giving the state of the list after each complete pass. You must make your pivots clear. The numbers are to be packed into bins of size 26.
    3. Calculate a lower bound for the minimum number of bins required. You must show your working.
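The first bubble sort pass and the bin lower bound for the list above can be sketched in a few lines. This assumes the pass runs left to right, comparing adjacent pairs and swapping when they are out of ascending order. The lower bound is taken as the total of the numbers divided by the bin size, rounded up. The function name `bubble_pass` is illustrative.

```python
from math import ceil


def bubble_pass(values):
    """Perform one left-to-right bubble sort pass towards ascending order."""
    items = list(values)
    for i in range(len(items) - 1):
        # Swap adjacent items that are out of order.
        if items[i] > items[i + 1]:
            items[i], items[i + 1] = items[i + 1], items[i]
    return items


numbers = [11, 17, 10, 14, 8, 13, 6, 4, 15, 7]

after_one_pass = bubble_pass(numbers)
print(after_one_pass)

# Lower bound on the number of bins of size 26.
bin_size = 26
lower_bound = ceil(sum(numbers) / bin_size)
print(sum(numbers), lower_bound)
```

After one pass the largest value, 17, has moved to the end of the list. The numbers total 105, and \(105 / 26 \approx 4.04\), so at least 5 bins are needed.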