Edexcel S1 (Statistics 1) 2015 June

Question 1
1. The discrete random variable \(X\) can only take the values \(1,2,3\) and 4. For these values the cumulative distribution function is defined by
$$\mathrm { F } ( x ) = k x ^ { 2 } \text { for } x = 1,2,3,4$$ where \(k\) is a constant.
  1. Find the value of \(k\).
  2. Find the probability distribution of \(X\).
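One possible way to check an answer to this question is sketched below. It assumes only the facts stated in the question plus the standard properties that a cumulative distribution function equals 1 at the largest value and that \(\mathrm{P}(X = x) = \mathrm{F}(x) - \mathrm{F}(x-1)\). The names `cdf` and `pmf` are illustrative, not part of the question.

```python
from fractions import Fraction

values = [1, 2, 3, 4]

# A CDF must equal 1 at the largest value, so F(4) = 16k = 1.
k = Fraction(1, max(values) ** 2)

def cdf(x):
    """F(x) = k * x^2 for x in 1..4."""
    return k * x ** 2

# P(X = x) = F(x) - F(x - 1), taking F(0) = 0.
pmf = {}
previous = Fraction(0)
for x in values:
    pmf[x] = cdf(x) - previous
    previous = cdf(x)

print("k =", k)
for x, prob in pmf.items():
    print(f"P(X = {x}) = {prob}")
```

Exact fractions are used so that the probabilities can be compared directly with a hand calculation.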
Question 2
2. Paul believes there is a relationship between the value and the floor size of a house. He takes a random sample of 20 houses and records the value, \(\pounds v\), and the floor size, \(s \mathrm {~m} ^ { 2 }\). The data were coded using \(x = \frac { s - 50 } { 10 }\) and \(y = \frac { v } { 100000 }\) and the following statistics were obtained. $$\sum x = 441.5 , \quad \sum y = 59.8 , \quad \sum x ^ { 2 } = 11261.25 , \quad \sum y ^ { 2 } = 196.66 , \quad \sum x y = 1474.1$$
  1. Find the value of \(S _ { x y }\) and the value of \(S _ { x x }\).
  2. Find the equation of the least squares regression line of \(y\) on \(x\) in the form \(y = a + b x\).
    The least squares regression line of \(v\) on \(s\) is \(v = c + d s\).
  3. Show that \(d = 1020\) to 3 significant figures and find the value of \(c\).
  4. Estimate the value of a house of floor size \(130 \mathrm {~m} ^ { 2 }\).
  5. Interpret the value \(d\).
    Paul wants to increase the value of his house. He decides to add an extension to increase the floor size by \(31 \mathrm {~m} ^ { 2 }\).
  6. Estimate the increase in the value of Paul's house after adding the extension.
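The arithmetic in this question can be sketched as follows, assuming the usual definitions \(S_{xy} = \sum xy - \frac{\sum x \sum y}{n}\), \(S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}\), \(b = S_{xy}/S_{xx}\) and \(a = \bar{y} - b\bar{x}\). Substituting the coding into \(y = a + bx\) gives \(v = 100000(a - 5b) + 10000\,b\,s\). Variable names are illustrative, and the printed values are unrounded working rather than a mark-scheme answer.

```python
n = 20
sum_x, sum_y = 441.5, 59.8
sum_x2, sum_xy = 11261.25, 1474.1

# Summary statistics for the coded data.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n

# Regression line of y on x: y = a + b x.
b = s_xy / s_xx
a = sum_y / n - b * sum_x / n

# Undo the coding x = (s - 50)/10, y = v/100000 to get v = c + d s.
d = 100000 * b / 10
c = 100000 * (a - 5 * b)

value_130 = c + d * 130   # estimated value of a 130 m^2 house
increase_31 = d * 31      # estimated increase for an extra 31 m^2

print(f"S_xy = {s_xy:.3f}, S_xx = {s_xx:.4f}")
print(f"y = {a:.4f} + {b:.4f} x")
print(f"d = {d:.1f}, c = {c:.0f}")
print(f"value at 130 m^2: {value_130:.0f}, increase for 31 m^2: {increase_31:.0f}")
```

Here \(d\) is the estimated increase in value, in pounds, for each additional square metre of floor size, which is why the last line multiplies it by 31.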
Question 3
3. A company employs 90 administrators. The length of time that they have been employed by the company and their gender are summarised in the table below.

| Length of time employed, \(x\) years | Female | Male |
| --- | --- | --- |
| \(x < 4\) | 9 | 16 |
| \(4 \leqslant x < 10\) | 14 | 20 |
| \(10 \leqslant x\) | 7 | 24 |

One of the 90 administrators is selected at random.
  1. Find the probability that the administrator is female.
  2. Given that the administrator has been employed by the company for less than 4 years, find the probability that this administrator is male.
  3. Given that the administrator has been employed by the company for less than 10 years, find the probability that this administrator is male.
  4. State, with a reason, whether or not the event 'selecting a male' is independent of the event 'selecting an administrator who has been employed by the company for less than 4 years'.
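A minimal sketch of the conditional-probability calculations, reading the counts from the table and assuming the standard test that two events are independent exactly when \(\mathrm{P}(A \mid B) = \mathrm{P}(A)\). The dictionary keys and variable names are illustrative.

```python
from fractions import Fraction

# (female, male) counts for each length-of-service band.
counts = {
    "under_4": (9, 16),
    "4_to_10": (14, 20),
    "10_plus": (7, 24),
}

total = sum(f + m for f, m in counts.values())
females = sum(f for f, _ in counts.values())
males = total - females

p_female = Fraction(females, total)
p_male = Fraction(males, total)

# P(male | employed less than 4 years)
f4, m4 = counts["under_4"]
p_male_given_lt4 = Fraction(m4, f4 + m4)

# P(male | employed less than 10 years) pools the first two bands.
f10 = counts["under_4"][0] + counts["4_to_10"][0]
m10 = counts["under_4"][1] + counts["4_to_10"][1]
p_male_given_lt10 = Fraction(m10, f10 + m10)

# Independence holds only if conditioning does not change the probability.
independent = p_male_given_lt4 == p_male

print(total, p_female, p_male_given_lt4, p_male_given_lt10, independent)
```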
Question 4
4. A bag contains 19 red beads and 1 blue bead only.
Linda selects a bead at random from the bag. She notes its colour and replaces the bead in the bag. She then selects a second bead at random from the bag and notes its colour. Find the probability that
  1. both beads selected are blue,
  2. exactly one bead selected is red.
    In another bag there are 9 beads, 4 of which are green and the rest are yellow. Linda selects 3 beads from this bag at random without replacement.
  3. Find the probability that 2 of these beads are yellow and 1 is green.
    Linda replaces the 3 beads and then selects another 4 at random without replacement.
  4. Find the probability that at least 1 of the beads is green.
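The four probabilities can be checked with a short sketch. It assumes independent draws for the first bag (selection with replacement) and counts equally likely combinations for the second bag (selection without replacement). `math.comb` requires Python 3.8 or later; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# First bag: 19 red, 1 blue, two draws with replacement.
p_red, p_blue = Fraction(19, 20), Fraction(1, 20)
p_both_blue = p_blue ** 2
p_exactly_one_red = 2 * p_red * p_blue  # red-blue or blue-red

# Second bag: 4 green, 5 yellow, draws without replacement.
green, yellow = 4, 5
beads = green + yellow

# 3 beads drawn: 2 yellow and 1 green.
p_two_yellow_one_green = Fraction(comb(yellow, 2) * comb(green, 1), comb(beads, 3))

# 4 beads drawn: at least 1 green = 1 - P(all 4 yellow).
p_at_least_one_green = 1 - Fraction(comb(yellow, 4), comb(beads, 4))

print(p_both_blue, p_exactly_one_red, p_two_yellow_one_green, p_at_least_one_green)
```

Using the complement for the final part avoids summing the four separate cases of 1, 2, 3 or 4 green beads.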
Question 5
5. Police measure the speed of cars passing a particular point on a motorway. The random variable \(X\) is the speed of a car.
\(X\) is modelled by a normal distribution with mean 55 mph (miles per hour).
  1. Draw a sketch to illustrate the distribution of \(X\). Label the mean on your sketch.
    The speed limit on the motorway is 70 mph. Car drivers can choose to travel faster than the speed limit but risk being caught by the police. The distribution of \(X\) has a standard deviation of 20 mph.
  2. Find the percentage of cars that are travelling faster than the speed limit.
    The fastest \(1 \%\) of car drivers will be banned from driving.
  3. Show that the lowest speed, correct to 3 significant figures, for a car driver to be banned is 102 mph. Show your working clearly.
    Car drivers will just be given a caution if they are travelling at a speed \(m\) such that $$\mathrm { P } ( 70 < X < m ) = 0.1315$$
  4. Find the value of \(m\). Show your working clearly.
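The numerical parts can be checked with the sketch below, which uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of printed normal tables. Because tables round \(z\)-values, a table-based answer may differ slightly in the last digit. Variable names are illustrative.

```python
from statistics import NormalDist

speed = NormalDist(mu=55, sigma=20)  # X ~ N(55, 20^2)
limit = 70

# Percentage of cars above the speed limit: P(X > 70).
pct_over_limit = 100 * (1 - speed.cdf(limit))

# Lowest speed among the fastest 1%: the 99th percentile of X.
ban_speed = speed.inv_cdf(0.99)

# P(70 < X < m) = 0.1315 means P(X < m) = P(X < 70) + 0.1315.
m = speed.inv_cdf(speed.cdf(limit) + 0.1315)

print(f"over the limit: {pct_over_limit:.1f}%")
print(f"ban speed: {ban_speed:.1f} mph")
print(f"m: {m:.1f} mph")
```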
Question 6
6. The random variable \(X\) has a discrete uniform distribution and takes the values \(1,2,3,4\). Find
  1. \(\mathrm { F } ( 3 )\), where \(\mathrm { F } ( x )\) is the cumulative distribution function of \(X\),
  2. \(\mathrm { E } ( X )\).
  3. Show that \(\operatorname { Var } ( X ) = \frac { 5 } { 4 }\).
    The random variable \(Y\) has a discrete uniform distribution and takes the values $$3,3 + k , 3 + 2 k , 3 + 3 k$$ where \(k\) is a constant.
  4. Write down \(\mathrm { P } ( Y = y )\) for \(y = 3,3 + k , 3 + 2 k , 3 + 3 k\).
    The relationship between \(X\) and \(Y\) may be written in the form \(Y = k X + c\) where \(c\) is a constant.
  5. Find \(\operatorname { Var } ( Y )\) in terms of \(k\).
  6. Express \(c\) in terms of \(k\).
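A sketch for checking the results numerically, assuming the standard identities \(\operatorname{Var}(X) = \mathrm{E}(X^2) - \mathrm{E}(X)^2\) and \(\operatorname{Var}(kX + c) = k^2 \operatorname{Var}(X)\). The sample values of `k` are hypothetical, chosen only to test the general expressions, and the helper `mean_and_variance` is illustrative.

```python
from fractions import Fraction

def mean_and_variance(values):
    """Mean and variance of a discrete uniform distribution on `values`."""
    p = Fraction(1, len(values))
    mean = sum(p * v for v in values)
    variance = sum(p * v * v for v in values) - mean ** 2
    return mean, variance

x_values = [1, 2, 3, 4]

# F(3) = P(X <= 3)
f_3 = Fraction(sum(1 for v in x_values if v <= 3), len(x_values))
e_x, var_x = mean_and_variance(x_values)
print(f_3, e_x, var_x)

# Check Var(Y) = k^2 Var(X) and c = 3 - k for a few hypothetical k.
for k in (Fraction(1, 2), 2, 5):
    y_values = [3 + i * k for i in range(4)]
    e_y, var_y = mean_and_variance(y_values)
    c = e_y - k * e_x  # from E(Y) = k E(X) + c
    assert var_y == k ** 2 * var_x
    assert c == 3 - k
    print(f"k = {k}: Var(Y) = {var_y}, c = {c}")
```

The loop is a spot check rather than a proof; the algebraic argument is that \(Y = 3 + k(X - 1) = kX + (3 - k)\).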
Question 7
7. A doctor is investigating the correlation between blood protein, \(p\), and body mass index, \(b\). He takes a random sample of 8 patients and the data are shown in the table below.
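
| Patient | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \(b\) | 32 | 36 | 40 | 44 | 42 | 21 | 27 | 37 |
| \(p\) | 18 | 21 | 31 | 39 | 21 | 12 | 19 | 70 |
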
  1. Draw a scatter diagram of these data on the axes provided.
    \includegraphics[max width=\textwidth, alt={}, center]{36cf6341-1957-45b9-9f7d-0914506f5919-13_938_673_785_614}
    The doctor decides to leave out patient \(H\) from his calculations.
  2. Give a reason for the doctor's decision.
    For the 7 patients \(A , B , C , D , E , F\) and \(G\), $$S _ { b p } = 369 , \quad S _ { p p } = 490 \text { and } S _ { b b } = 423 \frac { 5 } { 7 }$$
  3. Find the product moment correlation coefficient, \(r\), for these 7 patients.
  4. Without any further calculations, state how \(r\) would differ from your answer in part 3 if it was calculated for all 8 patients.
    \begin{figure}[h]
    \includegraphics[alt={},max width=\textwidth]{36cf6341-1957-45b9-9f7d-0914506f5919-15_1322_1593_207_173} \captionsetup{labelformat=empty} \caption{Figure 1}
    \end{figure}
    The histogram in Figure 1 summarises the times, in minutes, that 200 people spent shopping in a supermarket.
  5. Give a reason to justify the use of a histogram to represent these data.
    Given that 40 people spent between 11 and 21 minutes shopping in the supermarket, estimate
  6. the number of people that spent between 18 and 25 minutes shopping in the supermarket,
  7. the median time spent shopping in the supermarket by these 200 people.
    The mid-point of each bar is represented by \(x\) and the corresponding frequency by \(\mathrm { f }\).
  8. Show that \(\sum \mathrm { f } x = 6390\).
    Given that \(\sum \mathrm { f } x ^ { 2 } = 238430\),
  9. for the data shown in the histogram, calculate estimates of
    1. the mean,
    2. the standard deviation.
    A coefficient of skewness is given by \(\frac { 3 ( \text { mean } - \text { median } ) } { \text { standard deviation } }\)
  10. Calculate this coefficient of skewness for these data.
    The manager of the supermarket decides to model these data with a normal distribution.
  11. Comment on the manager's decision. Give a justification for your answer.
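The parts that depend only on numbers printed in the text can be checked with the sketch below. It recomputes \(S_{bp}\), \(S_{pp}\) and \(S_{bb}\) from the table for patients \(A\) to \(G\), applies \(r = \frac{S_{bp}}{\sqrt{S_{bb} S_{pp}}}\), and then estimates the mean and standard deviation of the shopping times from \(\sum \mathrm{f}x\) and \(\sum \mathrm{f}x^2\). The median and the skewness coefficient need bar heights read from Figure 1, which is not reproduced here, so they are left out. Variable names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# Patients A to G (patient H is left out, as in the question).
b = [32, 36, 40, 44, 42, 21, 27]
p = [18, 21, 31, 39, 21, 12, 19]
n = len(b)

s_bb = sum(v * v for v in b) - Fraction(sum(b) ** 2, n)
s_pp = sum(v * v for v in p) - Fraction(sum(p) ** 2, n)
s_bp = sum(x * y for x, y in zip(b, p)) - Fraction(sum(b) * sum(p), n)

# Product moment correlation coefficient.
r = float(s_bp) / sqrt(float(s_bb * s_pp))
print(f"S_bp = {s_bp}, S_pp = {s_pp}, S_bb = {s_bb}, r = {r:.3f}")

# Shopping times: 200 people, sum(f x) = 6390, sum(f x^2) = 238430.
people = 200
sum_fx, sum_fx2 = 6390, 238430

mean_time = sum_fx / people
sd_time = sqrt(sum_fx2 / people - mean_time ** 2)
print(f"mean = {mean_time:.2f} min, standard deviation = {sd_time:.2f} min")
```

The recomputed summary statistics agree with the values quoted in the question, which also confirms the reading of the data table.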