Edexcel S1 (Statistics 1) 2013 June

Question 1
  1. A meteorologist believes that there is a relationship between the height above sea level, \(h \mathrm {~m}\), and the air temperature, \(t ^ { \circ } \mathrm { C }\). Data is collected at the same time from 9 different places on the same mountain. The data is summarised in the table below.
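\begin{tabular}{|c|c|c|c|c|c|c|c|c|c|}
\hline
\(h\) & 1400 & 1100 & 260 & 840 & 900 & 550 & 1230 & 100 & 770 \\
\hline
\(t\) & 3 & 10 & 20 & 9 & 10 & 13 & 5 & 24 & 16 \\
\hline
\end{tabular}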
[You may assume that \(\sum h = 7150 , \sum t = 110 , \sum h ^ { 2 } = 7171500 , \sum t ^ { 2 } = 1716\), \(\sum t h = 64980\) and \(\mathrm { S } _ { t t } = 371.56\) ]
  1. Calculate \(\mathrm { S } _ { t h }\) and \(\mathrm { S } _ { h h }\). Give your answers to 3 significant figures.
  2. Calculate the product moment correlation coefficient for this data.
  3. State whether or not your value supports the use of a regression equation to predict the air temperature at different heights on this mountain. Give a reason for your answer.
  4. Find the equation of the regression line of \(t\) on \(h\) giving your answer in the form \(t = a + b h\).
  5. Interpret the value of \(b\).
  6. Estimate the difference in air temperature between a height of 500 m and a height of 1000 m.
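The calculations asked for in Question 1 can be checked with a minimal Python sketch. It is not a mark scheme: it assumes the standard S1 definitions \(\mathrm{S}_{xy} = \sum xy - \frac{\sum x \sum y}{n}\), \(r = \mathrm{S}_{th} / \sqrt{\mathrm{S}_{hh} \mathrm{S}_{tt}}\) and \(b = \mathrm{S}_{th} / \mathrm{S}_{hh}\), and the helper name `corrected_sum` is illustrative only.

```python
from math import sqrt

# Summary statistics quoted in the question (n = 9 places on the mountain).
n = 9
sum_h, sum_t = 7150, 110
sum_hh, sum_tt, sum_th = 7171500, 1716, 64980


def corrected_sum(sum_xy, sum_x, sum_y, n):
    """S_xy = sum(xy) - sum(x) * sum(y) / n (illustrative helper)."""
    return sum_xy - sum_x * sum_y / n


S_th = corrected_sum(sum_th, sum_t, sum_h, n)
S_hh = corrected_sum(sum_hh, sum_h, sum_h, n)
S_tt = corrected_sum(sum_tt, sum_t, sum_t, n)

# Product moment correlation coefficient.
r = S_th / sqrt(S_hh * S_tt)

# Regression line of t on h: t = a + b*h.
b = S_th / S_hh
a = sum_t / n - b * sum_h / n

# Predicted change in temperature going from 500 m up to 1000 m.
difference = b * (1000 - 500)

print(f"S_th = {S_th:.0f}, S_hh = {S_hh:.0f}, S_tt = {S_tt:.2f}")
print(f"r = {r:.3f}, t = {a:.2f} + ({b:.4f})h, difference = {difference:.1f}")
```

Under these assumptions \(r\) comes out close to \(-0.952\) and the line is roughly \(t = 24.2 - 0.0150h\), so the model predicts a drop of about 7.5 \(^{\circ}\mathrm{C}\) between 500 m and 1000 m.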
Question 2
2. The marks of a group of female students in a statistics test are summarised in Figure 1.
\begin{figure}[h]
\includegraphics[alt={},max width=\textwidth]{6faf2dd2-a114-40b7-88ae-4a75dbfb4706-04_629_1102_342_429} \captionsetup{labelformat=empty} \caption{Figure 1}
\end{figure}
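  1. Write down the mark which is exceeded by \(75 \%\) of the female students.
The marks of a group of male students in the same statistics test are summarised by the stem and leaf diagram below.
\begin{tabular}{r|l|r}
Mark & (\(2 \mid 6\) means 26) & Totals \\
\hline
1 & 4 & (1) \\
2 & 6 & (1) \\
3 & 4 4 7 & (3) \\
4 & 0 6 6 7 7 8 & (6) \\
5 & 0 0 1 1 1 3 6 7 7 & (9) \\
6 & 2 2 3 3 3 8 & (6) \\
7 & 0 0 8 & (3) \\
8 & 5 & (1) \\
9 & 0 & (1) \\
\end{tabular}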
  2. Find the median and interquartile range of the marks of the male students.
An outlier is a mark that is either more than \(1.5 \times\) interquartile range above the upper quartile or more than \(1.5 \times\) interquartile range below the lower quartile.
  3. In the space provided on Figure 1 draw a box plot to represent the marks of the male students, indicating clearly any outliers.
  4. Compare and contrast the marks of the male and the female students.
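The quartile and outlier work for the male students can be sketched in Python as below. This is a hedged illustration rather than an official solution. It assumes the usual S1 rule for discrete data (if \(n \times\) fraction is a whole number, average that value and the next one, otherwise round up to the next value), and the function name `quantile_s1` is hypothetical.

```python
from math import ceil

# Stem-and-leaf diagram for the male students (2|6 means 26).
stems = {
    1: "4", 2: "6", 3: "447", 4: "066778", 5: "001113677",
    6: "223338", 7: "008", 8: "5", 9: "0",
}
marks = sorted(10 * stem + int(leaf) for stem, leaves in stems.items() for leaf in leaves)


def quantile_s1(sorted_data, fraction):
    """Assumed S1 rule: round n*fraction up, or average two values if it is whole."""
    position = len(sorted_data) * fraction
    if position == int(position):
        k = int(position)
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    return sorted_data[ceil(position) - 1]


q1 = quantile_s1(marks, 0.25)
median = quantile_s1(marks, 0.50)
q3 = quantile_s1(marks, 0.75)
iqr = q3 - q1

# Outlier fences as defined in the question.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [m for m in marks if m < lower_fence or m > upper_fence]

print(q1, median, q3, iqr, outliers)
```

With this rule the 8th, 16th and 24th of the 31 marks are used, so any mark below 20.5 or above 88.5 is flagged as an outlier. A different quartile convention could shift these values slightly.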
Question 3
3. In a company the 200 employees are classified as full-time workers, part-time workers or contractors.
The table below shows the number of employees in each category and whether they walk to work or use some form of transport.
\begin{tabular}{|l|c|c|}
\cline{2-3}
\multicolumn{1}{c|}{} & Walk & Transport \\
\hline
Full-time worker & 2 & 8 \\
\hline
Part-time worker & 35 & 75 \\
\hline
Contractor & 30 & 50 \\
\hline
\end{tabular}
The events \(F , H\) and \(C\) are that an employee is a full-time worker, part-time worker or contractor respectively. Let \(W\) be the event that an employee walks to work. An employee is selected at random.
Find
  1. \(\mathrm { P } ( H )\)
  2. \(\mathrm { P } \left( [ F \cap W ] ^ { \prime } \right)\)
  3. \(\mathrm { P } ( W \mid C )\)
Let \(B\) be the event that an employee uses the bus.
Given that \(10 \%\) of full-time workers use the bus, \(30 \%\) of part-time workers use the bus and \(20 \%\) of contractors use the bus,
  4. draw a Venn diagram to represent the events \(F , H , C\) and \(B\),
  5. find the probability that a randomly selected employee uses the bus to travel to work.
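The probabilities in Question 3 follow directly from the counts in the table, and a short Python sketch with exact fractions can confirm them. The variable names are illustrative. The only assumptions are that the table splits the 200 employees as 2/8, 35/75 and 30/50 (walk/transport), and that the bus percentages apply to all employees in each category.

```python
from fractions import Fraction

# Counts read from the table: (walk, transport) for each category.
counts = {"F": (2, 8), "H": (35, 75), "C": (30, 50)}
total = sum(walk + transport for walk, transport in counts.values())

# P(H): proportion of part-time workers.
p_H = Fraction(sum(counts["H"]), total)

# P([F and W]'): complement of "full-time worker who walks".
p_not_F_and_W = 1 - Fraction(counts["F"][0], total)

# P(W | C): restrict the sample space to contractors.
p_W_given_C = Fraction(counts["C"][0], sum(counts["C"]))

# P(B) by the law of total probability over F, H and C.
bus_share = {"F": Fraction(10, 100), "H": Fraction(30, 100), "C": Fraction(20, 100)}
p_B = sum(bus_share[k] * Fraction(sum(counts[k]), total) for k in counts)

print(p_H, p_not_F_and_W, p_W_given_C, p_B)
```

Because \(F\), \(H\) and \(C\) are mutually exclusive and cover every employee, \(\mathrm{P}(B)\) is the sum of the three products \(\mathrm{P}(B \mid \cdot)\,\mathrm{P}(\cdot)\), which is also what the Venn diagram in part 4 should show.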
Question 4
4. The following table summarises the times, \(t\) minutes to the nearest minute, recorded for a group of students to complete an exam.
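\begin{tabular}{|l|c|c|c|c|c|c|}
\hline
Time (minutes) \(t\) & \(11 - 20\) & \(21 - 25\) & \(26 - 30\) & \(31 - 35\) & \(36 - 45\) & \(46 - 60\) \\
\hline
Number of students f & 62 & 88 & 16 & 13 & 11 & 10 \\
\hline
\end{tabular}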
$$\text { [You may use } \sum \mathrm { f } t ^ { 2 } = 134281.25 \text { ] }$$
  1. Estimate the mean and standard deviation of these data.
  2. Use linear interpolation to estimate the value of the median.
  3. Show that the estimated value of the lower quartile is 18.6 to 3 significant figures.
  4. Estimate the interquartile range of this distribution.
  5. Give a reason why the mean and standard deviation are not the most appropriate summary statistics to use with these data.
The person timing the exam made an error and each student actually took 5 minutes less than the times recorded above. The table below summarises the actual times.
\begin{tabular}{|l|c|c|c|c|c|c|}
\hline
Time (minutes) \(t\) & \(6 - 15\) & \(16 - 20\) & \(21 - 25\) & \(26 - 30\) & \(31 - 40\) & \(41 - 55\) \\
\hline
Number of students f & 62 & 88 & 16 & 13 & 11 & 10 \\
\hline
\end{tabular}
  6. Without further calculations, explain the effect this would have on each of the estimates found in parts (a), (b), (c) and (d).
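The grouped-data estimates in Question 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not an official solution: class midpoints stand in for the individual times, class boundaries are taken as 10.5, 20.5, 25.5 and so on because the times are recorded to the nearest minute, and the quartiles use the \(n/4\), \(n/2\), \(3n/4\) positions. The function name `interpolate` is hypothetical.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for each class in the original table.
classes = [(11, 20, 62), (21, 25, 88), (26, 30, 16), (31, 35, 13), (36, 45, 11), (46, 60, 10)]

n = sum(f for _, _, f in classes)
sum_ft = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
sum_ft2 = sum(f * ((lo + hi) / 2) ** 2 for lo, hi, f in classes)

mean = sum_ft / n
sd = sqrt(sum_ft2 / n - mean ** 2)


def interpolate(target):
    """Time below which `target` students lie, by linear interpolation in the class."""
    cumulative = 0
    for lo, hi, f in classes:
        if cumulative + f >= target:
            lower_boundary = lo - 0.5
            width = (hi + 0.5) - lower_boundary
            return lower_boundary + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("target exceeds the total frequency")


q1 = interpolate(n / 4)
median = interpolate(n / 2)
q3 = interpolate(3 * n / 4)
iqr = q3 - q1

print(f"mean = {mean:.2f}, sd = {sd:.2f}")
print(f"Q1 = {q1:.1f}, median = {median:.2f}, Q3 = {q3:.1f}, IQR = {iqr:.2f}")
```

The sketch reproduces the quoted \(\sum \mathrm{f} t^{2} = 134281.25\) and the lower quartile of 18.6. Subtracting 5 from every class limit would shift the mean, median and quartiles down by 5 while leaving the standard deviation and interquartile range unchanged.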
Question 5
5. A biased die with six faces is rolled. The discrete random variable \(X\) represents the score on the uppermost face. The probability distribution of \(X\) is shown in the table below.
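\begin{tabular}{|c|c|c|c|c|c|c|}
\hline
\(x\) & 1 & 2 & 3 & 4 & 5 & 6 \\
\hline
\(\mathrm { P } ( X = x )\) & \(a\) & \(a\) & \(a\) & \(b\) & \(b\) & 0.3 \\
\hline
\end{tabular}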
  1. Given that \(\mathrm { E } ( X ) = 4.2\) find the value of \(a\) and the value of \(b\).
  2. Show that \(\mathrm { E } \left( X ^ { 2 } \right) = 20.4\)
  3. Find \(\operatorname { Var } ( 5 - 3 X )\)
A biased die with five faces is rolled. The discrete random variable \(Y\) represents the score which is uppermost. The cumulative distribution function of \(Y\) is shown in the table below.
\begin{tabular}{|c|c|c|c|c|c|}
\hline
\(y\) & 1 & 2 & 3 & 4 & 5 \\
\hline
\(\mathrm { F } ( y )\) & \(\frac { 1 } { 10 }\) & \(\frac { 2 } { 10 }\) & \(3 k\) & \(4 k\) & \(5 k\) \\
\hline
\end{tabular}
  4. Find the value of \(k\).
  5. Find the probability distribution of \(Y\).
Each die is rolled once. The scores on the two dice are independent.
  6. Find the probability that the sum of the two scores equals 2.
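A minimal Python sketch with exact fractions can be used to check the working for Question 5. It assumes only the standard facts that probabilities sum to 1, that \(\operatorname{Var}(X) = \mathrm{E}(X^2) - [\mathrm{E}(X)]^2\), that \(\operatorname{Var}(c + dX) = d^2 \operatorname{Var}(X)\), and that \(\mathrm{F}(5) = 1\) for the five-faced die. The variable names are illustrative.

```python
from fractions import Fraction

p6 = Fraction(3, 10)       # P(X = 6) from the table
mean_x = Fraction(42, 10)  # E(X) = 4.2

# Two linear equations in a and b:
#   3a + 2b = 1 - p6                     (probabilities sum to 1)
#   (1 + 2 + 3)a + (4 + 5)b = E(X) - 6*p6
c1, c2 = 1 - p6, mean_x - 6 * p6
det = 3 * 9 - 2 * 6
a = (c1 * 9 - 2 * c2) / det  # Cramer's rule
b = (3 * c2 - 6 * c1) / det

pmf_x = {1: a, 2: a, 3: a, 4: b, 5: b, 6: p6}
e_x2 = sum(x * x * p for x, p in pmf_x.items())
var_x = e_x2 - mean_x ** 2
var_5_minus_3x = (-3) ** 2 * var_x

# Five-faced die: F(5) = 5k must equal 1.
k = Fraction(1, 5)
cdf_y = [Fraction(1, 10), Fraction(2, 10), 3 * k, 4 * k, 5 * k]
pmf_y = [cdf_y[0]] + [cdf_y[i] - cdf_y[i - 1] for i in range(1, 5)]

# A total of 2 needs a 1 on both dice; the dice are independent.
p_sum_2 = pmf_x[1] * pmf_y[0]

print(a, b, e_x2, var_5_minus_3x, k, pmf_y, p_sum_2)
```

The probability distribution of \(Y\) is recovered by differencing consecutive values of \(\mathrm{F}(y)\), which is the step part 5 is testing.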
Question 6
6. The weight, in grams, of beans in a tin is normally distributed with mean \(\mu\) and standard deviation 7.8.
Given that \(10 \%\) of tins contain less than 200 g, find
  1. the value of \(\mu\)
  2. the percentage of tins that contain more than 225 g of beans.
The machine settings are adjusted so that the weight, in grams, of beans in a tin is normally distributed with mean 205 and standard deviation \(\sigma\).
  3. Given that \(98 \%\) of tins contain between 200 g and 210 g find the value of \(\sigma\).
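The standardisation steps in Question 6 can be sketched with `statistics.NormalDist` from the Python standard library. This is a numerical check only. An exam answer would use the percentage points table (for example \(z = 1.2816\) and \(z = 2.3263\)), so values may differ in the last decimal place. The sketch assumes that the interval in part 3 is symmetric about the mean of 205, so 1\% of tins lie above 210 g.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Part 1: P(W < 200) = 0.10, so (200 - mu) / 7.8 equals the 10% point of Z.
mu = 200 - std_normal.inv_cdf(0.10) * 7.8

# Part 2: percentage of tins above 225 g with that mean.
pct_over_225 = 100 * (1 - NormalDist(mu, 7.8).cdf(225))

# Part 3: mean 205 and P(200 < W < 210) = 0.98, so P(W < 210) = 0.99
# and (210 - 205) / sigma equals the 99% point of Z.
sigma = (210 - 205) / std_normal.inv_cdf(0.99)

print(f"mu = {mu:.1f}, over 225 g = {pct_over_225:.1f}%, sigma = {sigma:.2f}")
```

Under these assumptions \(\mu\) is very close to 210, about 2.7\% of tins exceed 225 g, and \(\sigma\) is roughly 2.15.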