OCR FS1 AS (Further Statistics 1 AS) 2017 December

Question 1
1 Bill and Gill send letters to potential sponsors of a show. On past experience, they know that \(5 \%\) of letters receive a favourable reply.
  1. Bill sends a letter to each of 40 potential sponsors. Assuming that the number \(N\) of favourable responses can be modelled by a binomial distribution, find the mean and variance of \(N\).
  2. Gill sends one letter at a time to potential sponsors. \(L\) is the number of letters she sends, up to and including the first letter that receives a favourable response.
    (a) State two assumptions needed for \(L\) to be well modelled by a geometric distribution.
    (b) Using the assumptions in part 2(a), find the smallest number of letters that Gill has to send in order to have at least a \(90 \%\) chance of receiving at least one favourable reply.
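A quick numerical check of parts 1 and 2(b) can be sketched as follows. It is an illustration rather than a model answer: it takes \(N \sim \mathrm{B}(40, 0.05)\) and a geometric model for \(L\) with success probability 0.05, and the variable names are illustrative only.

```python
p = 0.05      # probability that a letter receives a favourable reply
n_bill = 40   # number of letters Bill sends

# Part 1: mean and variance of N ~ B(40, 0.05)
mean_n = n_bill * p
var_n = n_bill * p * (1 - p)

# Part 2(b): smallest n with P(L <= n) = 1 - (1 - p)^n >= 0.9
n_gill = 1
while 1 - (1 - p) ** n_gill < 0.9:
    n_gill += 1

print(mean_n, var_n, n_gill)
```

Under these assumptions the mean is 2, the variance is 1.9, and the loop stops at 45 letters.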
Question 2
2 Each letter of the words NEW COURSE is written on a card (including one blank card, representing the space between the words), so that there are 10 cards altogether.
  1. All 10 cards are arranged in a random order in a straight line. Find the probability that the two cards containing an E are next to each other.
  2. Four cards are chosen at random. Find the probability that at least three consonants (N, W, C, R, S) are on the cards chosen.
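Both counts can be checked with exact fractions. The sketch below assumes the 10 cards are treated as distinguishable, so the two E cards count as separate objects, and that the five non-consonant cards are E, O, U, E and the blank.

```python
from fractions import Fraction
from math import comb, factorial

# Part 1: treat the two E cards as one block (2 internal orders),
# then arrange that block together with the other 8 cards.
p_adjacent = Fraction(2 * factorial(9), factorial(10))

# Part 2: choose 4 cards from 5 consonant cards and 5 other cards,
# counting selections with exactly 3 or exactly 4 consonants.
favourable = comb(5, 3) * comb(5, 1) + comb(5, 4) * comb(5, 0)
p_consonants = Fraction(favourable, comb(10, 4))

print(p_adjacent, p_consonants)  # 1/5 11/42
```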
Question 3
3 Over a long period Jenny counts the number of trolleys used at her local supermarket between 10 am and 10.20 am each day. She finds that the mean number of trolleys used between these times on a weekday is 40.00. You should assume that the use of trolleys occurs randomly, independently of one another, and at a constant average rate.
  1. Calculate the probability that, on a randomly chosen weekday, the number of trolleys used between these times is between 32 and 50 inclusive.
  2. Write down an expression for the probability that, on a randomly chosen weekday, exactly 5 trolleys are used during a time period of \(t\) minutes between 10 am and 10.20 am.
Jenny carries out this process for seven consecutive days. She finds that the mean number of trolleys used between 10 am and 10.20 am is 35.14 and the variance is 91.55.
  3. Explain why this suggests that the distribution of the number of trolleys used between these times on these seven consecutive days is not well modelled by a Poisson distribution.
  4. Give a reason why it might not be appropriate to apply the Poisson model to the total number of trolleys used between these times on seven consecutive days.
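The Poisson calculations in parts 1 to 3 can be sketched numerically. This assumes a \(\mathrm{Po}(40)\) model for the 20-minute period, hence a rate of 2 trolleys per minute; `poisson_pmf` and `p_exactly_5` are illustrative helper names, not part of the question.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

# Part 1: P(32 <= X <= 50) for X ~ Po(40)
lam = 40.0
p_32_to_50 = sum(poisson_pmf(k, lam) for k in range(32, 51))

# Part 2: in t minutes the mean is 2t, so P(exactly 5) = e^(-2t) (2t)^5 / 5!
def p_exactly_5(t):
    return poisson_pmf(5, 2.0 * t)

# Part 3: a Poisson model needs mean and variance to be roughly equal
sample_mean, sample_var = 35.14, 91.55
ratio = sample_var / sample_mean

print(round(p_32_to_50, 3), round(ratio, 2))
```

The variance here is more than twice the mean, which is the comparison part 3 turns on.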
Question 4
4 The discrete random variable \(X\) has the distribution \(\mathrm { U } ( n )\).
  1. Use the results \(\sum _ { r = 1 } ^ { n } r ^ { 2 } = \frac { 1 } { 6 } n ( n + 1 ) ( 2 n + 1 )\) and \(\mathrm { E } ( X ) = \frac { n + 1 } { 2 }\) to show that \(\operatorname { Var } ( X ) = \frac { 1 } { 12 } \left( n ^ { 2 } - 1 \right)\).
It is given that \(\mathrm { E } ( X ) = 13\).
  2. Find the value of \(n\).
  3. Find \(\mathrm { P } ( X < 7.5 )\).
It is given that \(\mathrm { E } ( a X + b ) = 10\) and \(\operatorname { Var } ( a X + b ) = 117\), where \(a\) and \(b\) are positive.
  4. Calculate the value of \(a\) and the value of \(b\).
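One possible route through part 1 is sketched below, assuming \(\mathrm{U}(n)\) takes each of the values \(1, 2, \ldots, n\) with probability \(\frac{1}{n}\), which is consistent with the stated mean.

$$\mathrm{E}(X^2) = \frac{1}{n} \sum_{r=1}^{n} r^2 = \frac{1}{6}(n+1)(2n+1)$$

$$\operatorname{Var}(X) = \mathrm{E}(X^2) - [\mathrm{E}(X)]^2 = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4} = \frac{(n+1)\left[2(2n+1) - 3(n+1)\right]}{12} = \frac{(n+1)(n-1)}{12} = \frac{1}{12}\left(n^2 - 1\right)$$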
Question 5
5 A shop manager recorded the maximum daytime temperature \(T ^ { \circ } \mathrm { C }\) and the number \(C\) of ice creams sold on 9 summer days. The results are given in the table and illustrated in the scatter diagram.
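| \(T\) | 17 | 21 | 25 | 26 | 27 | 27 | 29 | 30 | 30 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \(C\) | 21 | 16 | 20 | 38 | 32 | 37 | 35 | 39 | 42 |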
[Scatter diagram of the data in the table]
$$n = 9 , \Sigma t = 232 , \Sigma c = 280 , \Sigma t ^ { 2 } = 6130 , \Sigma c ^ { 2 } = 9444 , \Sigma t c = 7489$$
  1. State, with a reason, whether one of the variables \(C\) or \(T\) is likely to be dependent upon the other.
  2. Calculate Pearson's product-moment correlation coefficient \(r\) for the data.
  3. State with a reason what the value of \(r\) would have been if the temperature had been measured in \({ } ^ { \circ } \mathrm { F }\) rather than \({ } ^ { \circ } \mathrm { C }\).
  4. Calculate the equation of the least squares regression line of \(c\) on \(t\).
  5. The regression line is drawn on the copy of the scatter diagram in the Printed Answer Booklet. Use this diagram to explain what is meant by "least squares".
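Parts 2 and 4 can be checked directly from the summary statistics given above. The sketch uses the standard formulae \(S_{tt} = \Sigma t^2 - (\Sigma t)^2 / n\), and similarly for \(S_{cc}\) and \(S_{tc}\); the variable names are illustrative.

```python
from math import sqrt

# Summary statistics as given in the question
n = 9
sum_t, sum_c = 232, 280
sum_t2, sum_c2, sum_tc = 6130, 9444, 7489

s_tt = sum_t2 - sum_t ** 2 / n
s_cc = sum_c2 - sum_c ** 2 / n
s_tc = sum_tc - sum_t * sum_c / n

# Part 2: product-moment correlation coefficient
r = s_tc / sqrt(s_tt * s_cc)

# Part 4: regression line of c on t, c = intercept + slope * t
slope = s_tc / s_tt
intercept = sum_c / n - slope * sum_t / n

print(round(r, 3), round(slope, 2), round(intercept, 1))
```

This gives \(r \approx 0.819\) and a regression line of roughly \(c = -15.6 + 1.81 t\).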
Question 6
6 Arlosh, Sarah and Desi are investigating the ratings given to six different films by two critics.
  1. Arlosh calculates Spearman's rank correlation coefficient \(r _ { s }\) for the critics' ratings. He calculates that \(\Sigma d ^ { 2 } = 72\). Show that this value must be incorrect.
  2. Arlosh checks his working with Sarah, whose answer \(r _ { s } = \frac { 29 } { 35 }\) is correct. Find the correct value of \(\Sigma d ^ { 2 }\).
  3. Carry out an appropriate two-tailed significance test of the value of \(r _ { s }\) at the \(5 \%\) significance level, stating your hypotheses clearly.
Each critic gives a score out of 100 to each film. Desi uses these scores to calculate Pearson's product-moment correlation coefficient. She carries out a two-tailed significance test of this value at the \(5 \%\) significance level.
  4. Explain with a reason whether you would expect the conclusion of Desi's test to be the same as the result of the test in part 3.
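Parts 1 and 2 both rest on the formula \(r_s = 1 - \frac{6 \Sigma d^2}{n(n^2 - 1)}\) with \(n = 6\). A minimal sketch with exact fractions, assuming there are no tied ranks (the helper name `spearman` is illustrative):

```python
from fractions import Fraction

n = 6  # number of films

def spearman(sum_d2, n):
    """Spearman's coefficient from the sum of squared rank differences."""
    return 1 - Fraction(6 * sum_d2, n * (n * n - 1))

# Part 1: Arlosh's total gives a coefficient below -1, which is impossible
r_arlosh = spearman(72, n)

# Part 2: rearrange the formula to recover sum d^2 from r_s = 29/35
target = Fraction(29, 35)
sum_d2 = (1 - target) * n * (n * n - 1) / 6

print(r_arlosh, sum_d2)  # -37/35 6
```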
Question 7
7 Josh is investigating whether sticking pins into a map at random, while blindfolded, provides a random sample of regions of the map. Josh divides the map into 49 squares of equal size and asks each of 98 friends to stick a pin into the map at random, while blindfolded. He then notes the number of pins in each square. To analyse the results he groups the squares as shown in the diagram.
DDDDDDD
DCCCCCD
DCBBBCD
DCBABCD
DCBBBCD
DCCCCCD
DDDDDDD
The results are summarised in the table.
| Region | A | B | C | D |
| --- | --- | --- | --- | --- |
| Number of squares | 1 | 8 | 16 | 24 |
| Number of pins | 6 | 21 | 33 | 38 |
  1. Test at the \(10 \%\) significance level whether the use of pins in this way provides a random sample of regions of the map.
  2. What can be deduced from considering the different contributions to the test statistic?
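The goodness-of-fit calculation can be sketched as below. Two assumptions are made that the question does not state: expected frequencies are proportional to the number of squares in each region, and region A (expected frequency 2) is merged with region B under the usual convention that expected frequencies should be at least 5. This leaves 3 cells and 2 degrees of freedom.

```python
from math import log

squares = {"A": 1, "B": 8, "C": 16, "D": 24}
pins = {"A": 6, "B": 21, "C": 33, "D": 38}
total_pins = sum(pins.values())        # 98
total_squares = sum(squares.values())  # 49

# Expected pins per region if every square is equally likely to be hit
expected = {k: total_pins * v / total_squares for k, v in squares.items()}

# Merge A with B because the expected frequency for A is below 5
observed_cells = [pins["A"] + pins["B"], pins["C"], pins["D"]]
expected_cells = [expected["A"] + expected["B"], expected["C"], expected["D"]]

contributions = [(o - e) ** 2 / e for o, e in zip(observed_cells, expected_cells)]
statistic = sum(contributions)

# With 2 degrees of freedom, P(chi-squared > x) = exp(-x / 2),
# so the 10% critical value is -2 ln(0.10), about 4.605.
critical = -2 * log(0.10)

print([round(c, 3) for c in contributions], round(statistic, 3), round(critical, 3))
```

The merged A and B cell contributes 4.5 of a total of about 6.61, which exceeds the critical value; this is the kind of comparison part 2 asks about.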