Questions Further Statistics (108 questions)

OCR Further Statistics 2024 June Q1
8 marks Standard +0.3
1 A discrete random variable \(X\) has the following distribution, where \(a , b\) and \(c\) are constants.
| \(x\) | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| \(\mathrm { P } ( X = x )\) | \(a\) | \(b\) | \(c\) | 0.1 |
It is given that \(\mathrm { E } ( X ) = 1.25\) and \(\operatorname { Var } ( X ) = 0.8875\).
  1. Determine the values of \(a\), \(b\) and \(c\).
  2. The random variable \(Y\) is defined by \(Y = 7 - 2 X\). Write down the value of \(\operatorname { Var } ( Y )\).
  3. Twenty independent observations of \(X\) are obtained. The number of those observations for which \(X = 3\) is denoted by \(T\). Find the value of \(\operatorname { Var } ( T )\).
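As a numerical check on this question, the three conditions (probabilities sum to 1, the given mean, and \(\mathrm{E}(X^2) = \operatorname{Var}(X) + \mathrm{E}(X)^2\)) can be solved directly. This is a minimal sketch; the variable names are illustrative and not part of the question.

```python
from fractions import Fraction as F

# Given values from the question.
mean, var, p3 = F("1.25"), F("0.8875"), F("0.1")

# Constraints on a = P(X=0), b = P(X=1), c = P(X=2):
#   a + b + c      = 1 - p3
#   b + 2c + 3*p3  = E(X)
#   b + 4c + 9*p3  = E(X^2) = Var(X) + E(X)^2
ex2 = var + mean**2
rhs1 = mean - 3 * p3          # value of b + 2c
rhs2 = ex2 - 9 * p3           # value of b + 4c
c = (rhs2 - rhs1) / 2
b = rhs1 - 2 * c
a = 1 - p3 - b - c

var_y = (-2) ** 2 * var       # Var(7 - 2X) = 4 Var(X)
var_t = 20 * p3 * (1 - p3)    # T ~ B(20, 0.1), so Var(T) = np(1 - p)

print(float(a), float(b), float(c), float(var_y), float(var_t))
```

Exact fractions are used so that the decimal values in the question are not affected by floating-point rounding.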
OCR Further Statistics 2024 June Q2
9 marks Standard +0.3
2 A newspaper article claimed that "taller dog owners have taller dogs as pets". Alex investigated this claim and obtained data from a random sample of 16 fellow students who owned exactly one dog. The results are summarised as follows, where the height of the student, in cm, is denoted by \(h\) and the height, in cm, of their dog is denoted by \(d\). \(n = 16 \quad \sum h = 2880 \quad \sum d = 660 \quad \sum h ^ { 2 } = 519276 \quad \sum d ^ { 2 } = 30000 \quad \sum h d = 119425\)
  1. Calculate the value of Pearson's product moment correlation coefficient for the data.
  2. State what your answer tells you about a scatter diagram illustrating the data.
  3. Use the data to test, at the \(5 \%\) significance level, the claim of the newspaper article.
  4. Explain whether the answer to part (a) would be likely to be different if the dogs' weights had been used instead of their heights.
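The calculation in part 1 can be sketched from the summary statistics alone, assuming the usual formula \(r = S_{hd} / \sqrt{S_{hh} S_{dd}}\). The names `s_hh`, `s_dd` and `s_hd` are illustrative.

```python
from math import sqrt

n = 16
sum_h, sum_d = 2880, 660
sum_hh, sum_dd, sum_hd = 519276, 30000, 119425

# Corrected sums of squares and products.
s_hh = sum_hh - sum_h**2 / n
s_dd = sum_dd - sum_d**2 / n
s_hd = sum_hd - sum_h * sum_d / n

# Pearson's product moment correlation coefficient.
r = s_hd / sqrt(s_hh * s_dd)
print(round(r, 4))
```

For the test in part 3, the value of \(r\) would then be compared with the tabulated one-tailed critical value for \(n = 16\).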
OCR Further Statistics 2024 June Q3
11 marks Standard +0.3
3 Research suggests that the mean reading age of a child about to start secondary school is 10.75. The reading ages, \(X\) years, of a random sample of 80 children who were about to start secondary school in a particular district were measured, and the results are summarised as follows. $$n = 80 \quad \sum x = 893 \quad \sum x ^ { 2 } = 10267$$
  1. Test at the \(5 \%\) significance level whether the mean reading age of children about to start secondary school in this district is not 10.75.
  2. A student wrote: "Although we do not know that the distribution of \(X\) is normal, the central limit theorem allows us to assume that it is, as the sample size is large." This statement is incorrect. Give a corrected version of the student's statement.
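A sketch of the test statistic for part 1 follows. It assumes a two-tailed z-test in which the sample mean is treated as approximately normal (by the central limit theorem) and the unbiased variance estimate stands in for the population variance; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sum_x, sum_xx = 80, 893, 10267
mu0 = 10.75

xbar = sum_x / n
s2 = (sum_xx - sum_x**2 / n) / (n - 1)   # unbiased variance estimate
z = (xbar - mu0) / sqrt(s2 / n)

# Two-tailed p-value under the normal approximation for the sample mean.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
print(round(xbar, 4), round(s2, 4), round(z, 3), round(p_value, 4))
```

The decision then rests on comparing `p_value` with 0.05, or equivalently `abs(z)` with the two-tailed critical value.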
OCR Further Statistics 2024 June Q4
6 marks Challenging +1.8
4
  1. Write down the number of ways of choosing 5 objects from 12 distinct objects.
  2. Each possible set of 5 different integers selected from the integers \(1,2 , \ldots , 12\) is obtained, and for each set, the sum of the 5 integers is found. The sum \(S\) can take values between 15 and 50 inclusive. Part of the frequency distribution of \(S\) is shown in the following table, together with the cumulative frequencies.
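    | \(S\) | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
    |---|---|---|---|---|---|---|---|---|---|
    | Frequency | 1 | 1 | 2 | 3 | 5 | 7 | 10 | 13 | 17 |
    | Cumulative frequency | 1 | 2 | 4 | 7 | 12 | 19 | 29 | 42 | 59 |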
    Use these numbers to determine the critical region for a 1-tail Wilcoxon rank-sum test at the \(2 \%\) significance level when \(m = 5\) and \(n = 7\).
  3. A student says that, for a Wilcoxon rank-sum test on samples of size \(m\) and \(n\), where \(m\) and \(n\) are large, the mean and variance of the test statistic \(R _ { m }\) are 200 and \(616 \frac { 2 } { 3 }\) respectively. Show that at least one of these values must be incorrect.
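The frequency table in part 2 can be reproduced by brute-force enumeration, which also locates the lower-tail critical region. This is a sketch assuming the rank sum of the smaller sample (\(m = 5\)) is the test statistic and that all \({}^{12}C_5\) selections are equally likely under the null hypothesis.

```python
from itertools import combinations
from collections import Counter

m, n = 5, 7
# Frequency of each possible sum of m ranks chosen from 1..m+n.
freq = Counter(sum(c) for c in combinations(range(1, m + n + 1), m))
total = sum(freq.values())            # number of equally likely selections

# Largest value k with P(S <= k) <= 2% gives the lower-tail critical region.
cumulative, critical = 0, None
for s in sorted(freq):
    if (cumulative + freq[s]) / total > 0.02:
        break
    cumulative += freq[s]
    critical = s

print(total, critical, cumulative / total)
```

With 792 selections in total, the cumulative frequency of 12 at \(S = 19\) is the last one below 2% of 792, so the sketch stops at 19.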
OCR Further Statistics 2024 June Q5
12 marks Standard +0.3
5 Some bird-watchers study the song of chaffinches in a particular wood. They investigate whether the number, \(N\), of separate bursts of song in a 5 minute period can be modelled by a Poisson distribution. They assume that a burst of song can be considered as a single event, and that bursts of song occur randomly.
(a) State two further assumptions needed for \(N\) to be well modelled by a Poisson distribution.
The bird-watchers record the value of \(N\) in each of 60 periods of 5 minutes. The mean and variance of the results are 3.55 and 5.6475 respectively.
(b) Explain what this suggests about the validity of a Poisson distribution as a model in this context.
The complete results are shown in the table.
| \(n\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | \(\geqslant 9\) |
|---|---|---|---|---|---|---|---|---|---|---|
| Frequency | 10 | 3 | 7 | 8 | 13 | 6 | 6 | 2 | 5 | 0 |
The bird-watchers carry out a \(\chi ^ { 2 }\) goodness of fit test at the \(5 \%\) significance level.
(c) State suitable hypotheses for the test.
(d) Determine the contribution to the test statistic for \(n = 3\).
(e) The total value of the test statistic, obtained by combining the cells for \(n \leqslant 1\) and also for \(n \geqslant 6\), is 9.202, correct to 4 significant figures. Complete the goodness of fit test.
(f) It is known that chaffinches are more likely to sing in the presence of other chaffinches. Explain whether this fact affects the validity of a Poisson model for \(N\).
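The contribution asked for in part (d) can be sketched as below. It assumes the Poisson mean is estimated by the sample mean, and it reads the observed frequencies for \(n = 0, 1, \ldots, 8\) and \(n \geqslant 9\) as 10, 3, 7, 8, 13, 6, 6, 2, 5, 0, which reproduces the stated mean of 3.55.

```python
from math import exp, factorial

observed = [10, 3, 7, 8, 13, 6, 6, 2, 5, 0]   # n = 0..8 and n >= 9
total = sum(observed)                          # 60 periods
lam = sum(k * f for k, f in enumerate(observed)) / total   # sample mean

# Expected frequency and chi-squared contribution for the cell n = 3.
expected_3 = total * exp(-lam) * lam**3 / factorial(3)
contribution_3 = (observed[3] - expected_3) ** 2 / expected_3
print(round(lam, 2), round(expected_3, 3), round(contribution_3, 3))
```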
OCR Further Statistics 2024 June Q6
11 marks Standard +0.3
6 A bag contains 6 identical blue counters and 5 identical yellow counters.
  1. Three counters are selected at random, without replacement. Find the probability that at least two of the counters are blue.
All 11 counters are now arranged in a row in a random order.
  2. Find the probability that all the yellow counters are next to each other.
  3. Find the probability that no yellow counter is next to another yellow counter.
  4. Find the probability that the counters are arranged in such a way that both of the following conditions hold.
OCR Further Statistics 2024 June Q7
8 marks Standard +0.3
7 The coordinates of a set of 10 points are denoted by \(( x _ { i } , y _ { i } )\) for \(i = 1,2 , \ldots , 10\). For a particular set of values of \(( x _ { i } , y _ { i } )\) and any constants \(a\) and \(b\) it can be shown that \(\sum \left( y _ { i } - a - b x _ { i } \right) ^ { 2 } = 10 ( 11 - a - 6 b ) ^ { 2 } + 126 \left( b - \frac { 83 } { 42 } \right) ^ { 2 } + \frac { 139 } { 14 }\).
    1. Explain why \(\sum \left( y _ { i } - a - b x _ { i } \right) ^ { 2 }\) is minimised by taking \(b = \frac { 83 } { 42 }\) and \(a = 11 - 6 b\).
    2. Hence explain why the equation of the regression line of \(y\) on \(x\) for these points is given by the corresponding values of \(a\) and \(b\) (so that the equation is \(y = \frac { 83 } { 42 } x - \frac { 6 } { 7 }\)).
  2. State which of the following terms cannot apply to the variable \(X\) if the regression line of \(y\) on \(x\) can be used for estimating values of \(Y\): Dependent, Independent, Controlled, Response.
  3. Use the regression line to estimate the value of \(y\) corresponding to \(x = 8\).
  4. State what must be true of the value \(x = 8\) if the estimate in part (c) is to be reliable.
  5. Variables \(u\) and \(v\) are related to \(x\) and \(y\) by the following relationships. \(u = 2 + 4 x \quad v = 8 - 2 y\) Show that the gradient of the regression line of \(v\) on \(u\) is very close to \(-1\).
OCR Further Statistics 2024 June Q8
10 marks Standard +0.3
8 A random sample of 100 students were given a task and the time taken by each student to complete the task was recorded. The maximum time allowed to complete the task was one minute and all students completed the task within the maximum time. The times, \(T\) minutes, for the random sample of students are summarised as follows. \(n = 100 \quad \sum t = 61.88\) A researcher proposes that \(T\) can be modelled by the continuous random variable with probability density function \(f ( t ) = \begin{cases} \alpha t ^ { \alpha - 1 } & 0 \leqslant t \leqslant 1 , \\ 0 & \text { otherwise, } \end{cases}\) where \(\alpha\) is a positive constant.
(a) In this question you must show detailed reasoning. By finding \(\mathrm { E } ( T )\) according to the researcher's model, determine an approximation for the value of \(\alpha\). Give your answer correct to 3 significant figures.
Further information about the times taken for the sample of 100 students to complete the task is given in the table.
| Time \(t\) | \(0 \leqslant t < \frac { 1 } { 3 }\) | \(\frac { 1 } { 3 } \leqslant t < \frac { 2 } { 3 }\) | \(\frac { 2 } { 3 } \leqslant t \leqslant 1\) |
|---|---|---|---|
| Frequency | 18 | 37 | 45 |
(b) Using the value of \(\alpha\) found in part (a), determine the extent to which the proposed model is a good model. (Do not carry out a goodness of fit test.)
OCR Further Statistics 2021 November Q1
6 marks Standard +0.3
1 At a seaside resort the number \(X\) of ice-creams sold and the temperature \(Y ^ { \circ } \mathrm { F }\) were recorded on 20 randomly chosen summer days. The data can be summarised as follows. \(\sum x = 1506 \quad \sum x ^ { 2 } = 127542 \quad \sum y = 1431 \quad \sum y ^ { 2 } = 104451 \quad \sum x y = 111297\)
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\), giving your answer in the form \(y = a + b x\).
  2. Explain the significance for the regression line of the quantity \(\sum \left[ y _ { i } - \left( a x _ { i } + b \right) \right] ^ { 2 }\).
  3. It is decided to measure the temperature in degrees Centigrade instead of degrees Fahrenheit. If the same temperature is measured both as \(f ^ { \circ }\) Fahrenheit and \(c ^ { \circ }\) Centigrade, the relationship between \(f\) and \(c\) is \(\mathrm { c } = \frac { 5 } { 9 } ( \mathrm { f } - 32 )\). Find the equation of the new regression line.
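The regression calculation in part 1, and the change of units in part 3, can be sketched from the summary statistics. This assumes the standard least squares formulae \(b = S_{xy} / S_{xx}\) and \(a = \bar{y} - b \bar{x}\); the names `a_c` and `b_c` for the Centigrade coefficients are illustrative.

```python
n = 20
sum_x, sum_xx = 1506, 127542
sum_y, sum_xy = 1431, 111297

s_xx = sum_xx - sum_x**2 / n
s_xy = sum_xy - sum_x * sum_y / n

b = s_xy / s_xx                      # gradient of y on x
a = sum_y / n - b * sum_x / n        # intercept

# Replacing y (Fahrenheit) by c = 5/9 (y - 32) applies the same linear map
# to the fitted line: intercept 5/9 (a - 32), gradient 5/9 b.
a_c = 5 / 9 * (a - 32)
b_c = 5 / 9 * b
print(round(a, 2), round(b, 4), round(a_c, 2), round(b_c, 4))
```

Only the response variable is rescaled here, so the \(x\) values and \(S_{xx}\) are unchanged.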
OCR Further Statistics 2021 November Q2
7 marks Moderate -0.3
2 A discrete random variable \(D\) has the following probability distribution, where \(a\) is a constant.
| \(d\) | 0 | 2 | 4 | 6 |
|---|---|---|---|---|
| \(\mathrm { P } ( D = d )\) | \(a\) | 0.1 | 0.3 | 0.2 |
Determine the value of \(\operatorname { Var } ( 3 D + 4 )\).
OCR Further Statistics 2021 November Q3
9 marks Standard +0.3
3 In a large collection of coloured marbles of identical size, the proportion of green marbles is \(p\). One marble is chosen randomly, its colour is noted, and it is then replaced. This process is repeated until a green marble is chosen. The first green marble chosen is the \(X\) th marble chosen.
  1. You are given that \(p = 0.3\).
    1. Find \(\mathrm { P } ( 5 \leqslant X \leqslant 10 )\).
    2. Determine the smallest value of \(n\) for which \(\mathrm { P } ( X = n ) < 0.1\).
  2. You are given instead that \(\operatorname { Var } ( X ) = 42\). Determine the value of \(\mathrm { E } ( X )\).
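The three parts can be checked numerically with a short sketch, assuming \(X\) follows a geometric distribution on \(1, 2, 3, \ldots\) with \(\mathrm{P}(X = n) = (1 - p)^{n - 1} p\), \(\mathrm{E}(X) = 1/p\) and \(\operatorname{Var}(X) = (1 - p)/p^2\). The names `p2` and `mean` are illustrative.

```python
from math import sqrt

p = 0.3
q = 1 - p

# P(5 <= X <= 10) = P(X >= 5) - P(X >= 11) = q^4 - q^10.
prob = q**4 - q**10

# Smallest n with P(X = n) = q^(n-1) p < 0.1 (the probabilities decrease with n).
n = 1
while q ** (n - 1) * p >= 0.1:
    n += 1

# Var(X) = (1 - p) / p^2 = 42  gives  42 p^2 + p - 1 = 0; take the positive root.
p2 = (-1 + sqrt(1 + 4 * 42)) / (2 * 42)
mean = 1 / p2
print(round(prob, 4), n, mean)
```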
OCR Further Statistics 2021 November Q4
9 marks Standard +0.3
4 A random sample of 160 observations of a random variable \(X\) is selected. The sample can be summarised as follows. \(n = 160 \quad \sum x = 2688 \quad \sum x ^ { 2 } = 48398\)
  1. Calculate unbiased estimates of the following.
    1. \(\mathrm { E } ( X )\)
    2. \(\operatorname { Var } ( X )\)
  2. Find a 99\% confidence interval for \(\mathrm { E } ( X )\), giving the end-points of the interval correct to 4 significant figures.
  3. Explain whether it was necessary to use the Central Limit Theorem in answering
    1. part (a),
    2. part (b).
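The estimates and the interval can be sketched as follows, assuming the sample mean is approximately normal for \(n = 160\) and that the unbiased variance estimate is used in place of the population variance. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sum_x, sum_xx = 160, 2688, 48398

mean = sum_x / n
s2 = (sum_xx - sum_x**2 / n) / (n - 1)     # unbiased estimate of Var(X)

z = NormalDist().inv_cdf(0.995)            # two-sided 99% critical value
half_width = z * sqrt(s2 / n)
lower, upper = mean - half_width, mean + half_width
print(round(mean, 2), round(s2, 2), round(lower, 2), round(upper, 2))
```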
OCR Further Statistics 2021 November Q5
10 marks Standard +0.3
5 The numbers of each of 9 items sold in two different supermarkets in a week are given in the following table.
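| Item | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Supermarket \(A\) | 17 | 28 | 41 | 43 | 62 | 69 | 75 | 93 | 115 |
| Supermarket \(B\) | 24 | 7 | 18 | 12 | 47 | 29 | 58 | 42 | 37 |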
A researcher wants to test whether there is association between the numbers of these items sold in the two supermarkets. However, it is known that the collection of data in Supermarket \(B\) was done inaccurately and each of the numbers in the corresponding row of the table could have been in error by as much as 2 items greater or 2 items fewer.
  1. Explain why Spearman's rank correlation coefficient might be preferred to the use of Pearson's product-moment correlation coefficient in this context.
  2. Carry out the test at the \(5 \%\) significance level using Spearman's rank correlation coefficient.
OCR Further Statistics 2021 November Q6
11 marks Standard +0.3
6 A practice examination paper is taken by 500 candidates, and the organiser wishes to know what continuous distribution could be used to model the actual time, \(X\) minutes, taken by candidates to complete the paper. The organiser starts by carrying out a goodness-of-fit test for the distribution \(\mathrm { N } \left( 100,15 ^ { 2 } \right)\) at the \(5 \%\) significance level. The grouped data and the results of some of the calculations are shown in the following table.
| Time | \(0 \leqslant X < 80\) | \(80 \leqslant X < 90\) | \(90 \leqslant X < 100\) | \(100 \leqslant X < 110\) | \(X \geqslant 110\) |
|---|---|---|---|---|---|
| Observed frequency \(O\) | 36 | 95 | 137 | 129 | 103 |
| Expected frequency \(E\) | 45.606 | 80.641 | 123.754 | 123.754 | 126.246 |
| \(\frac { ( O - E ) ^ { 2 } } { E }\) | 2.023 | 2.557 | 1.418 | 0.222 | 4.280 |
  1. State suitable hypotheses for the test.
  2. Show how the figures 123.754 and 0.222 in the column for \(100 \leqslant X < 110\) were obtained.
  3. Carry out the test.
  4. The organiser now wants to suggest an improved model for the data.
    1. Suggest an aspect of the data that the organiser should take into account in considering an improved model.
    2. The graph of the probability density function for the distribution \(\mathrm { N } \left( 100,15 ^ { 2 } \right)\) is shown in the diagram in the Printed Answer Booklet. On the same diagram sketch the probability density function of an improved model that takes into account the aspect of the data in part (d)(i).
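The two figures quoted for the class \(100 \leqslant X < 110\) can be reproduced with a short sketch. It assumes the observed frequency for that class is 129, as read from the table, and uses the standard library's `NormalDist` for the \(\mathrm{N}(100, 15^2)\) model.

```python
from statistics import NormalDist

model = NormalDist(mu=100, sigma=15)
total = 500

# Expected frequency for 100 <= X < 110 under N(100, 15^2).
expected = total * (model.cdf(110) - model.cdf(100))

# Contribution (O - E)^2 / E of this class to the test statistic.
observed = 129
contribution = (observed - expected) ** 2 / expected
print(round(expected, 3), round(contribution, 3))
```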
OCR Further Statistics 2021 November Q7
12 marks Standard +0.3
7 In a school opinion poll a random sample of 8 pupils were asked to rate school lunches on a scale of 0 to 20. The results were as follows. \(0, 1, 2, 3, 4, 10, 11, 13\) After a new menu was introduced, the test was repeated with a different random sample of 8 pupils. The results were as follows. \(7, 8, 9, 14, 15, 17, 19, 20\)
  1. Carry out an appropriate Wilcoxon test at the \(5 \%\) significance level to test whether pupils' opinions of school lunches have changed.
A statistics student tells the organisers of the opinion poll that it would have been better to have asked the same 8 pupils both times.
  2. Explain why the statistics student's suggestion would produce a better test.
  3. State which test should be used if the student's suggestion is followed.
  4. You are given that there are 12870 ways in which 8 different integers can be chosen from the integers 1 to 16 inclusive. Estimate the number of ways of selecting 8 different digits between 1 and 16 inclusive that have a sum less than or equal to the critical value used in the test in part (a).
OCR Further Statistics 2021 November Q8
11 marks Challenging +1.8
8 The continuous random variable \(Y\) has a uniform distribution on \([0, 2]\).
  1. It is given that \(\mathrm { E } [ a \cos ( a Y ) ] = 0.3\), where \(a\) is a constant between 0 and 1, and \(a Y\) is measured in radians. Determine the value of the constant \(a\).
  2. Determine the \(60 ^ { \text {th } }\) percentile of \(Y ^ { 2 }\).
OCR Further Statistics Specimen Q1
6 marks Easy -1.2
1 The table below shows the typical stopping distances \(d\) metres for a particular car travelling at \(v\) miles per hour.
| \(v\) | 20 | 30 | 40 | 50 | 60 | 70 |
|---|---|---|---|---|---|---|
| \(d\) | 13 | 24 | 36 | 52 | 72 | 94 |
  1. State each of the following words that describe the variable \(v\): Independent, Dependent, Controlled, Response.
  2. Calculate the equation of the regression line of \(d\) on \(v\).
  3. Use the equation found in part (ii) to estimate the typical stopping distance when this car is travelling at 45 miles per hour.
It is given that the product moment correlation coefficient for the data is 0.990 correct to three significant figures.
  4. Explain whether your estimate found in part (iii) is reliable.
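Parts (ii) and (iii) can be sketched directly from the six data pairs, taking the stopping distances for \(v = 20, 30, \ldots, 70\) as 13, 24, 36, 52, 72, 94 and using the standard least squares formulae. The variable names are illustrative.

```python
v = [20, 30, 40, 50, 60, 70]
d = [13, 24, 36, 52, 72, 94]
n = len(v)

v_bar, d_bar = sum(v) / n, sum(d) / n
s_vv = sum((x - v_bar) ** 2 for x in v)
s_vd = sum((x - v_bar) * (y - d_bar) for x, y in zip(v, d))

b = s_vd / s_vv              # gradient of d on v
a = d_bar - b * v_bar        # intercept
estimate_45 = a + b * 45     # predicted stopping distance at 45 mph
print(round(a, 2), round(b, 3), round(estimate_45, 1))
```

Since 45 is the mean of the \(v\) values, the estimate coincides with the mean of the \(d\) values, because a least squares line always passes through \((\bar{v}, \bar{d})\).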
OCR Further Statistics Specimen Q2
6 marks Standard +0.8
2 The mass \(J \mathrm {~kg}\) of a bag of randomly chosen Jersey potatoes is a normally distributed random variable with mean 1.00 and standard deviation 0.06. The mass \(K \mathrm {~kg}\) of a bag of randomly chosen King Edward potatoes is an independent normally distributed random variable with mean 0.80 and standard deviation 0.04.
  1. Find the probability that the total mass of 6 bags of Jersey potatoes and 8 bags of King Edward potatoes is greater than 12.70 kg.
  2. Find the probability that the mass of one bag of King Edward potatoes is more than \(75 \%\) of the mass of one bag of Jersey potatoes.
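Both parts rely on linear combinations of independent normal variables, which can be sketched as below. The sketch assumes every bag's mass is independent of the others, so that variances add across the 14 bags in part 1, and rewrites part 2 as \(\mathrm{P}(K - 0.75 J > 0)\).

```python
from math import sqrt
from statistics import NormalDist

mu_j, sd_j = 1.00, 0.06
mu_k, sd_k = 0.80, 0.04

# Part 1: total of 6 independent Jersey bags and 8 independent King Edward bags.
total = NormalDist(6 * mu_j + 8 * mu_k, sqrt(6 * sd_j**2 + 8 * sd_k**2))
p_total = 1 - total.cdf(12.70)

# Part 2: K > 0.75 J is equivalent to K - 0.75 J > 0.
diff = NormalDist(mu_k - 0.75 * mu_j, sqrt(sd_k**2 + 0.75**2 * sd_j**2))
p_diff = 1 - diff.cdf(0)
print(round(p_total, 4), round(p_diff, 4))
```

Note that the variance in part 1 is \(6 \times 0.06^2 + 8 \times 0.04^2\), not \(6^2 \times 0.06^2 + 8^2 \times 0.04^2\), because the bags are separate observations rather than multiples of one bag.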
OCR Further Statistics Specimen Q3
8 marks Standard +0.3
3 A game is played as follows. A fair six-sided dice is thrown once. If the score obtained is even, the amount of money, in \(\pounds\), that the contestant wins is half the score on the dice, otherwise it is twice the score on the dice.
  1. Find the probability distribution of the amount of money won by the contestant.
  2. The contestant pays \(\pounds 5\) for every time the dice is thrown. Find the standard deviation of the loss made by the contestant in 120 throws of the dice.
OCR Further Statistics Specimen Q4
7 marks Standard +0.3
4 A psychologist investigated the scores of pairs of twins on an aptitude test. Seven pairs of twins were chosen randomly, and the scores are given in the following table.
| Elder twin | 65 | 37 | 60 | 79 | 39 | 40 | 88 |
|---|---|---|---|---|---|---|---|
| Younger twin | 58 | 39 | 61 | 62 | 50 | 26 | 84 |
  1. Carry out an appropriate Wilcoxon test at the \(10 \%\) significance level to investigate whether there is evidence of a difference in test scores between the elder and the younger of a pair of twins.
  2. Explain the advantage in this case of a Wilcoxon test over a sign test.
OCR Further Statistics Specimen Q5
8 marks Moderate -0.8
5 The number of goals scored by the home team in a randomly chosen hockey match is denoted by \(X\).
  1. In order for \(X\) to be modelled by a Poisson distribution it is assumed that goals scored are random events. State two other conditions needed for \(X\) to be modelled by a Poisson distribution in this context.
Assume now that \(X\) can be modelled by the distribution \(\operatorname { Po } ( 1.9 )\).
  2. (a) Write down an expression for \(\mathrm { P } ( X = r )\).
    (b) Hence find \(\mathrm { P } ( X = 3 )\).
  3. Assume also that the number of goals scored by the away team in a randomly chosen hockey match has an independent Poisson distribution with mean \(\lambda\) between 1.31 and 1.32. Find an estimate for the probability that more than 3 goals are scored altogether in a randomly chosen match.
OCR Further Statistics Specimen Q6
7 marks Standard +0.8
6 A bag contains 3 green counters, 3 blue counters and \(w\) white counters. Counters are selected at random, one at a time, with replacement, until a white counter is drawn.
The total number of counters selected, including the white counter, is denoted by \(X\).
  1. In the case when \(w = 2\),
    1. write down the distribution of \(X\),
    2. find \(P ( 3 < X \leq 7 )\).
  2. In the case when \(\mathrm { E } ( X ) = 2\), determine the value of \(w\).
  3. In the case when \(w = 2\) and \(X = 6\), find the probability that the first five counters drawn alternate in colour.
OCR Further Statistics Specimen Q7
9 marks Moderate -0.3
7 Sweet pea plants grown using a standard plant food have a mean height of 1.6 m . A new plant food is used for a random sample of 49 randomly chosen plants and the heights, \(x\) metres, of this sample can be summarised by the following. $$\begin{aligned} n & = 49 \\ \Sigma x & = 74.48 \\ \Sigma x ^ { 2 } & = 120.8896 \end{aligned}$$ Test, at the \(5 \%\) significance level, whether, when the new plant food is used, the mean height of sweet pea plants is less than 1.6 m .
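The test statistic for this question can be sketched as follows, assuming a one-tailed z-test in which the sample mean is treated as approximately normal for \(n = 49\) and the unbiased variance estimate replaces the unknown population variance. The names `critical` and `reject` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sum_x, sum_xx = 49, 74.48, 120.8896
mu0 = 1.6

xbar = sum_x / n
s2 = (sum_xx - sum_x**2 / n) / (n - 1)   # unbiased variance estimate
z = (xbar - mu0) / sqrt(s2 / n)

critical = NormalDist().inv_cdf(0.05)    # lower-tail 5% critical value
reject = z < critical                    # True would mean rejecting H0: mu = 1.6
print(round(xbar, 3), round(s2, 3), round(z, 3), round(critical, 3), reject)
```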
OCR Further Statistics Specimen Q8
15 marks Standard +0.3
8 A continuous random variable \(X\) has probability density function given by $$\mathrm { f } ( x ) = \left\{ \begin{array} { c c } 0.8 \mathrm { e } ^ { - 0.8 x } & x \geq 0 \\ 0 & x < 0 \end{array} \right.$$
  1. Find the mean and variance of \(X\).
The lifetime of a certain organism is thought to have the same distribution as \(X\). The lifetimes in days of a random sample of 60 specimens of the organism were found. The observed frequencies, together with the expected frequencies correct to 3 decimal places, are given in the table.
    | Range | \(0 \leq x < 1\) | \(1 \leq x < 2\) | \(2 \leq x < 3\) | \(3 \leq x < 4\) | \(x \geq 4\) |
    |---|---|---|---|---|---|
    | Observed | 24 | 22 | 10 | 3 | 1 |
    | Expected | 33.040 | 14.846 | 6.671 | 2.997 | 2.446 |
  2. Show how the expected frequency for \(1 \leq x < 2\) is obtained.
  3. Carry out a goodness of fit test at the \(5 \%\) significance level.
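The expected frequencies in the table can be reproduced from the cumulative distribution function \(\mathrm{F}(x) = 1 - \mathrm{e}^{-0.8x}\) of the given density. This is a minimal sketch; the helper `cdf` and the list `edges` are illustrative names.

```python
from math import exp

rate, total = 0.8, 60

def cdf(x):
    """Cumulative distribution function of the exponential model."""
    return 1 - exp(-rate * x)

edges = [0, 1, 2, 3, 4]
# Expected frequency for each class [a, b), then the open-ended class x >= 4.
expected = [total * (cdf(b) - cdf(a)) for a, b in zip(edges, edges[1:])]
expected.append(total * (1 - cdf(4)))
print([round(e, 3) for e in expected])
```

The second entry corresponds to part 2, namely \(60 ( \mathrm{e}^{-0.8} - \mathrm{e}^{-1.6} )\).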
OCR Further Statistics Specimen Q9
9 marks Challenging +1.2
9 The continuous random variable \(X\) has cumulative distribution function given by $$\mathrm { F } ( x ) = \left\{ \begin{array} { c c } 0 & x < 0 \\ \frac { 1 } { 16 } x ^ { 2 } & 0 \leq x \leq 4 \\ 1 & x > 4 \end{array} \right.$$
  1. The random variable \(Y\) is defined by \(Y = \frac { 1 } { X ^ { 2 } }\). Find the cumulative distribution function of \(Y\).
  2. Show that \(\mathrm { E } ( Y )\) is not defined.