Questions — OCR Further Statistics AS (58 questions)

OCR Further Statistics AS 2023 June Q4
4 A discrete random variable \(W\) has the probability distribution shown in the following table, in which \(a\) and \(b\) are constants.
| \(w\) | 58 | 59 | 60 | 61 | 62 | 63 |
| --- | --- | --- | --- | --- | --- | --- |
| \(\mathrm { P } ( W = w )\) | \(a\) | \(b\) | 0.2 | 0.2 | 0.1 | 0.1 |
It is given that \(\mathrm { E } ( W - 60 ) = 0.15\). Determine the value of \(\operatorname { Var } ( 4 W - 60 )\).
OCR Further Statistics AS 2023 June Q5
5 A psychologist investigates the relationship between 'openness' and 'creativity' in adults. Each member of a random sample of 15 adults is given two tests, one on openness and one on creativity. Each test has a maximum score of 75. The results are given in the table.
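| Adult | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Openness, \(x\) | 39 | 34 | 29 | 20 | 40 | 35 | 20 | 36 | 55 | 31 | 41 | 43 | 33 | 30 | 33 |
| Creativity, \(y\) | 59 | 34 | 17 | 29 | 49 | 46 | 45 | 54 | 60 | 38 | 46 | 35 | 43 | 56 | 34 |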
\(n = 15 \quad \sum x = 519 \quad \sum y = 645 \quad \sum x ^ { 2 } = 19033 \quad \sum y ^ { 2 } = 29751 \quad \sum x y = 23034\)
  1. Use Pearson's product-moment correlation coefficient to test, at the \(5 \%\) significance level, whether there is positive association between openness and creativity.
  2. State what the value of Pearson's product-moment correlation coefficient shows about a scatter diagram illustrating the data.
  3. A student suggests that there is a way to obtain a more accurate measure of the correlation. Before carrying out the test it would be better to standardise the test scores so that they have the same mean and variance. Explain whether you agree with this suggestion.
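The following is a minimal Python sketch, not part of the question paper, of how Pearson's product-moment correlation coefficient can be obtained from summary sums like those quoted above. The function name `pmcc_from_sums` and its parameter names are illustrative assumptions; the hypothesis test itself still requires a critical value from the tables.

```python
import math

def pmcc_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Pearson's r from raw summary sums (hypothetical helper)."""
    s_xx = sum_x2 - sum_x ** 2 / n      # corrected sum of squares for x
    s_yy = sum_y2 - sum_y ** 2 / n      # corrected sum of squares for y
    s_xy = sum_xy - sum_x * sum_y / n   # corrected sum of products
    return s_xy / math.sqrt(s_xx * s_yy)

# Summary values quoted in the question.
r = pmcc_from_sums(15, 519, 645, 19033, 29751, 23034)
print(round(r, 4))
```

Working from the corrected sums \(S_{xx}\), \(S_{yy}\) and \(S_{xy}\) mirrors the usual hand calculation, so each intermediate value can be checked against written working.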
OCR Further Statistics AS 2023 June Q6
6 A machine is used to toss a coin repeatedly. Rosa believes that the outcome of each toss made by the machine is not independent of the previous toss. Rosa gets the machine to toss a coin 6 times and record the number of heads, \(X\), obtained. After recording the number of heads obtained, Rosa resets the machine and gets it to toss the coin 6 more times. Rosa again records the number of heads obtained and she repeats this procedure until she has recorded 88 independent values of \(X\).
  1. The sample mean and sample variance of \(X\) are 3.35 and 3.392 respectively. Explain what these results suggest about the validity of a binomial model \(\mathrm { B } ( 6 , p )\) for the data.
    Rosa uses a computer spreadsheet to work out the probabilities for a more sophisticated model in which the outcome of each toss is dependent on the outcome of the previous toss. Her model suggests that the probabilities \(\mathrm { P } ( X = x )\), for \(x = 0,1,2,3,4,5,6\), are approximately in the ratio \(5 : 6 : 7 : 8 : 7 : 6 : 5\). She carries out a \(\chi ^ { 2 }\) test to investigate whether this model is a good fit for the data. The following table shows the full results of the experiments, together with some of the calculations needed for the test.
    | \(x\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | Total |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Observed frequency | 7 | 10 | 16 | 15 | 15 | 11 | 14 | 88 |
    | Expected frequency | | | | | | | | |
    | Contribution to \(\chi ^ { 2 }\) statistic | 0.9 | 0.3333 | 0.2857 | 0.0625 | 0.0714 | | | |
  2. In the Printed Answer Booklet, complete the table.
  3. Carry out the test, using a 10\% significance level.
  4. Rosa says that the results definitely show that one of the two proposed models is correct. Comment on this statement.
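As an illustration only, the expected frequencies and the contributions to the \(\chi ^ { 2 }\) statistic for a model specified by a ratio can be sketched in Python as below. The variable names are assumptions made for this sketch; the observed frequencies and the ratio are those given in the question.

```python
# Observed frequencies for x = 0..6 and the model ratio from the question.
observed = [7, 10, 16, 15, 15, 11, 14]
ratio = [5, 6, 7, 8, 7, 6, 5]

total = sum(observed)  # 88 recorded values of X

# Scale the ratio so that the expected frequencies also sum to the total.
expected = [total * w / sum(ratio) for w in ratio]

# Each cell contributes (O - E)^2 / E to the test statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
statistic = sum(contributions)

for x, (e, c) in enumerate(zip(expected, contributions)):
    print(x, e, round(c, 4))
print("test statistic:", round(statistic, 4))
```

The first five contributions printed agree with the values already shown in the table, which is a useful check that the expected frequencies have been scaled correctly.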
OCR Further Statistics AS 2023 June Q7
7 A town council is planning to introduce a new set of parking regulations. An interviewer contacts randomly chosen people in the town and asks them whether they are in favour of the proposal. The first person who is not in favour of the regulation is the \(R\)th person interviewed. It can be assumed that the probability that any randomly chosen person is not in favour of the proposal is a constant \(p\), and that \(p\) does not equal 0 or 1. Assume first that \(\mathrm { E } ( R ) = 10\).
  1. Determine \(\mathrm { P } ( R \geqslant 14 )\).
    Now, without the assumption that \(\mathrm { E } ( R ) = 10\), consider a general value of \(p\).
    It is given that \(\mathrm { P } ( R = 3 ) - 0.4 \times \mathrm { P } ( R = 2 ) - a \times \mathrm { P } ( R = 1 ) = 0\), where \(a\) is a positive constant.
  2. Determine the range of possible values of \(a\).
OCR Further Statistics AS 2024 June Q1
1 The random variable \(W\) can take values 1, 2 or 3 and has a discrete uniform distribution.
  1. Write down the value of \(\mathrm { E } ( 2 W )\).
  2. Find the value of \(\operatorname { Var } ( 2 W )\).
  3. Determine the value of the constant \(k\) for which \(\mathrm { E } ( 2 W + k ) = \operatorname { Var } ( 2 W + k )\).
    The random variable \(S\) has the probability distribution shown in the following table.
    | \(s\) | 2 | 3 | 4 | 5 | 6 |
    | --- | --- | --- | --- | --- | --- |
    | \(\mathrm { P } ( S = s )\) | \(\frac { 2 } { 9 }\) | \(\frac { 1 } { 9 }\) | \(\frac { 1 } { 3 }\) | \(\frac { 1 } { 9 }\) | \(\frac { 2 } { 9 }\) |
  4. Calculate \(\operatorname { Var } ( S )\).
OCR Further Statistics AS 2024 June Q2
2 For a random sample of 160 employees of a large company, the principal method of transport for getting to work, arranged according to grade of employee, is shown in the table.
| Grade | Walk or cycle | Private motorised transport | Public transport |
| --- | --- | --- | --- |
| A | 9 | 13 | 6 |
| B | 16 | 43 | 41 |
| C | 11 | 8 | 13 |
A test is carried out at the \(5 \%\) significance level of whether there is association between grade of employee and method of transport.
  1. State appropriate hypotheses for the test.
    The contributions to the test statistic are shown in the following table, correct to 3 decimal places.
    | Grade | Walk or cycle | Private motorised transport | Public transport |
    | --- | --- | --- | --- |
    | A | 1.157 | 0.289 | 1.929 |
    | B | 1.878 | 0.225 | 0.327 |
    | C | 2.006 | 1.800 | 0.083 |
  2. Show how the value 0.225 is obtained.
  3. Complete the test, stating the conclusion.
  4. Which combination of grade of employee and method of transport most strongly suggests association? Justify your answer.
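A short Python sketch, offered as an illustration rather than a model answer, of how expected frequencies and contributions arise in a test for association. It assumes the observed table reads as three rows (grades A, B, C) by three columns (the transport methods), which is consistent with the stated sample size of 160; the variable names are hypothetical.

```python
# Observed frequencies: rows are grades A, B, C; columns are
# walk or cycle, private motorised transport, public transport.
observed = [
    [9, 13, 6],
    [16, 43, 41],
    [11, 8, 13],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Under independence, E = (row total x column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Contribution of each cell to the chi-squared statistic.
contributions = [
    [(o - e) ** 2 / e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

statistic = sum(sum(row) for row in contributions)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(contributions[1][1], 3))  # grade B, private motorised transport
print(round(statistic, 3), degrees_of_freedom)
```

Printing the whole `contributions` grid, rounded to 3 decimal places, gives a quick comparison with the table supplied in the question.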
OCR Further Statistics AS 2024 June Q3
3 The ages, \(x\) years, and the reaction time, \(t\) seconds, in an experiment carried out on a sample of 15 volunteers are summarised as follows.
\(n = 15 \quad \sum x = 762 \quad \sum t = 8.7 \quad \sum x ^ { 2 } = 44204 \quad \sum t ^ { 2 } = 5.65 \quad \sum x t = 490.1\)
  1. Calculate the value of the product moment correlation coefficient between \(x\) and \(t\).
  2. Calculate the equation of the line of regression of \(t\) on \(x\). Give your answer in the form \(t = a + b x\) where \(a\) and \(b\) are constants to be determined.
  3. Explain the relevance of the quantity \(\sum ( t - a - b x ) ^ { 2 }\) to your answer to part (b).
  4. Estimate the reaction time, in seconds, for a volunteer aged 42.
    It is subsequently decided to measure the reaction time in tenths of a second rather than in seconds (so, for example, a time of 0.6 seconds would now be recorded as 6).
    1. State what effect, if any, this change would have on your answer to part (a).
    2. State what effect, if any, this change would have on your answer to part (b).
    It is known that the sample of 15 volunteers consisted almost entirely of students and retired people.
  5. Using this information, and the value of the product moment correlation coefficient, comment on the reliability of your estimate in part (d).
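The least-squares line of \(t\) on \(x\) can be computed from the summary sums; the Python sketch below is one possible way to do this and is not taken from the mark scheme. The helper name `regression_from_sums` is an assumption made for illustration.

```python
def regression_from_sums(n, sum_x, sum_t, sum_x2, sum_xt):
    """Return (a, b) for the least-squares line t = a + b x (hypothetical helper)."""
    s_xx = sum_x2 - sum_x ** 2 / n
    s_xt = sum_xt - sum_x * sum_t / n
    b = s_xt / s_xx                   # gradient
    a = sum_t / n - b * sum_x / n     # the line passes through the mean point
    return a, b

# Summary values quoted in the question.
a, b = regression_from_sums(15, 762, 8.7, 44204, 490.1)
estimate_at_42 = a + b * 42

print(round(a, 4), round(b, 5), round(estimate_at_42, 3))
```

The values of \(a\) and \(b\) found this way are the ones that minimise \(\sum ( t - a - b x ) ^ { 2 }\), the quantity referred to in the question.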
OCR Further Statistics AS 2024 June Q5
5 In a fashion competition, two judges gave marks to a large number of contestants. The value of Spearman's rank correlation coefficient, \(r _ { s }\), between the marks given to 7 randomly chosen contestants is \(\frac { 27 } { 28 }\).
  1. An excerpt from the table of critical values of \(r _ { s }\) is shown below.
    Critical values of Spearman's rank correlation coefficient
    | 1-tail test | 5\% | 2.5\% | 1\% | 0.5\% |
    | --- | --- | --- | --- | --- |
    | 2-tail test | 10\% | 5\% | 2\% | 1\% |
    | \(n = 6\) | 0.8286 | 0.8857 | 0.9429 | 1.0000 |
    | \(n = 7\) | 0.7143 | 0.7857 | 0.8929 | 0.9286 |
    | \(n = 8\) | 0.6429 | 0.7381 | 0.8333 | 0.8810 |
    Test whether there is evidence, at the 1\% significance level, that the judges agree with each other.
    The marks given by the two judges to the 7 randomly chosen contestants were as follows, where \(x\) is an integer.
    | Contestant | A | B | C | D | E | F | G |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | Judge 1 | 64 | 65 | 67 | 78 | 79 | 80 | 86 |
    | Judge 2 | 61 | 63 | 78 | 80 | 81 | 90 | \(x\) |
  2. Use the value \(r _ { s } = \frac { 27 } { 28 }\) to determine the range of possible values of \(x\).
  3. Give a reason why it might be preferable to use the product moment correlation coefficient rather than Spearman's rank correlation coefficient in this context.
OCR Further Statistics AS 2024 June Q6
6 Anika walks along a street that contains parked cars. The number of cars that Anika passes, up to and including the first car that is white, is denoted by \(X\).
  1. State two assumptions needed for \(X\) to be well modelled by a geometric distribution.
    Assume now that \(X\) can be well modelled by the distribution \(\operatorname { Geo } ( p )\), where \(0 < p < 1\).
  2. For \(p = 0.1\), find \(\mathrm { P } ( X > 6 )\).
    The number of cars that Anika passes, up to but not including the first car that is white, is denoted by \(Y\).
  3. For a general value of \(p\), determine a simplified expression for \(\mathrm { E } ( Y ) \div \operatorname { Var } ( Y )\), in terms of \(p\).
    Ben walks along a different street that also contains parked cars. The number of cars that Ben passes, up to and including the first white car on which the last digit of the number plate is even, is denoted by \(Z\). It may be assumed that \(Z\) can be well modelled by the distribution \(\operatorname { Geo } \left( \frac { 1 } { 2 } p \right)\), where \(p\) is the parameter of the distribution of \(X\). It is given that \(\mathrm { P } ( Z = 3 ) = k \, \mathrm { P } ( X = 3 )\), where \(k\) is a positive constant.
  4. Determine the range of possible values of \(k\).
OCR Further Statistics AS 2020 November Q1
1 Five observations of bivariate data \(( x , y )\) are given in the table.
| \(x\) | 7 | 8 | 12 | 6 | 4 |
| --- | --- | --- | --- | --- | --- |
| \(y\) | 20 | 16 | 7 | 17 | 23 |
  1. Find the value of Pearson's product-moment correlation coefficient.
  2. State what your answer to part (a) tells you about a scatter diagram representing the data.
  3. A new variable \(a\) is defined by \(a = 3 x + 4\). Dee says "The value of Pearson's product-moment correlation coefficient between \(a\) and \(y\) will not be the same as the answer to part (a)." State with a reason whether you agree with Dee.
OCR Further Statistics AS 2020 November Q2
2 Every time a spinner is spun, the probability that it shows the number 4 is 0.2, independently of all other spins.
  1. A pupil spins the spinner repeatedly until it shows the number 4. Find the mean of the number of spins required.
  2. Calculate the probability that the number of spins required is between 3 and 10 inclusive.
  3. Each pupil in a class of 30 spins the spinner until it shows the number 4. Out of the 30 pupils, the number of pupils who require at least 10 spins is denoted by \(X\). Determine the variance of \(X\).
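The quantities in this question follow from standard geometric and binomial results; the Python sketch below is an illustration under the assumption that the number of spins is modelled by \(\operatorname { Geo } ( 0.2 )\) and that the 30 pupils spin independently. The variable names are chosen for this sketch only.

```python
p = 0.2      # probability that a spin shows the number 4
q = 1 - p    # probability that it does not

# Mean of a geometric distribution is 1 / p.
mean_spins = 1 / p

# P(X >= k) = q^(k - 1), so P(3 <= X <= 10) = P(X >= 3) - P(X >= 11).
prob_3_to_10 = q ** 2 - q ** 10

# Each pupil needs at least 10 spins with probability q^9, so the number
# of such pupils out of 30 is binomial with variance n * s * (1 - s).
prob_at_least_10 = q ** 9
variance_x = 30 * prob_at_least_10 * (1 - prob_at_least_10)

print(mean_spins, round(prob_3_to_10, 4), round(variance_x, 3))
```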
OCR Further Statistics AS 2020 November Q3
3 An investor obtains data about the profits of 8 randomly chosen investment accounts over two one-year periods. The profit in the first year for each account is \(p \%\) and the profit in the second year for each account is \(q \%\). The results are shown in the table and in the scatter diagram.
| Account | A | B | C | D | E | F | G | H |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \(p\) | 1.6 | 2.1 | 2.4 | 2.7 | 2.8 | 3.3 | 5.2 | 8.4 |
| \(q\) | 1.6 | 2.3 | 2.2 | 2.2 | 3.1 | 2.9 | 7.6 | 4.8 |
\(n = 8 \quad \sum p = 28.5 \quad \sum q = 26.7 \quad \sum p ^ { 2 } = 136.35 \quad \sum q ^ { 2 } = 116.35 \quad \sum p q = 116.70\)
[Figure: scatter diagram of the data for the eight accounts]
  1. State which, if either, of the variables \(p\) and \(q\) is independent.
  2. Calculate the equation of the regression line of \(q\) on \(p\).
    1. Use the regression line to estimate the value of \(q\) for an investment account for which \(p = 2.5\).
    2. Give two reasons why this estimate could be considered reliable.
  3. Comment on the reliability of using the regression line to predict the value of \(q\) when \(p = 7.0\).
OCR Further Statistics AS 2020 November Q4
4 After a holiday organised for a group, the company organising the holiday obtained scores out of 10 for six different aspects of the holiday. The company obtained responses from 100 couples and 100 single travellers. The total scores for each of the aspects are given in the following table.
| Aspect | Couples | Single travellers |
| --- | --- | --- |
| Organisation | 884 | 867 |
| Travel | 710 | 633 |
| Food | 692 | 675 |
| Leader | 898 | 898 |
| Included visits | 561 | 736 |
| Optional visits | 683 | 712 |
Fred wishes to test whether there is significant positive correlation between the scores given by the two categories.
  1. Explain why it is probably not appropriate to use Pearson's product-moment correlation coefficient.
  2. Carry out an appropriate test at the \(1 \%\) level.
  3. Explain what is meant by the statement that the test carried out in part (b) is a non-parametric test.
OCR Further Statistics AS 2020 November Q5
5 At a cinema there are three film sessions each Saturday, "early", "middle" and "late". The numbers of the audience, in different age groups, at the three showings on a randomly chosen Saturday are given in Table 1.
Table 1: Observed frequencies, by age group (rows) and session (columns)
| Age group | Early | Middle | Late |
| --- | --- | --- | --- |
| < 25 | 24 | 20 | 40 |
| 25 to 60 | 4 | 2 | 10 |
| > 60 | 28 | 22 | 10 |
The cinema manager carries out a test of whether there is any association between age group and session attended.
  1. Show that it is necessary to combine cells in order to carry out the test.
    It is decided to combine the second and third rows of the table. Some of the expected frequencies for the table with rows combined, and the corresponding contributions to the \(\chi ^ { 2 }\) test statistic, are shown in the following incomplete tables.
    Table 2: Expected frequencies, by age group (rows) and session (columns)
    | Age group | Early | Middle | Late |
    | --- | --- | --- | --- |
    | < 25 | 29.4 | 23.1 | |
    | \(\geqslant 25\) | 26.6 | 20.9 | |
    Table 3: Contribution to \(\chi ^ { 2 }\), by age group (rows) and session (columns)
    | Age group | Early | Middle | Late |
    | --- | --- | --- | --- |
    | < 25 | 0.9918 | 0.4160 | |
    | \(\geqslant 25\) | 1.0962 | 0.4598 | |
  2. In the Printed Answer Booklet, complete both tables.
  3. Carry out the test at the \(5 \%\) significance level.
  4. Use the figures in your completed Table 3 to comment on the numbers of the audience in different age groups.
OCR Further Statistics AS 2020 November Q6
6 A statistician investigates the number, \(F\), of signal failures per week on a railway network.
  1. The statistician assumes that signal failures occur randomly. Explain what this statement means.
  2. State two further assumptions needed for \(F\) to be well modelled by a Poisson distribution.
    In a random sample of 50 weeks, the statistician finds that the mean number of failures per week is 1.61, with standard deviation 1.28.
  3. Explain whether this suggests that \(F\) is likely to be well modelled by a Poisson distribution.
    Assume first that \(F \sim \operatorname { Po } ( 1.61 )\).
  4. Write down an exact expression for \(\mathrm { P } ( F = 0 )\).
  5. Complete the table in the Printed Answer Booklet to show the probabilities of different values of \(F\), correct to three significant figures.
    | Value of \(F\) | 0 | 1 | \(\geqslant 2\) |
    | --- | --- | --- | --- |
    | Probability | 0.200 | | |
    After further investigation, the statistician decides to use a different model for the distribution of \(F\). In this model it is now assumed that \(\mathrm { P } ( F = 0 )\) is still 0.200, but that if one failure occurs, there is an increased probability that further failures occur.
  6. Explain the effect of this assumption on the value of \(\mathrm { P } ( F = 1 )\).
OCR Further Statistics AS 2020 November Q7
7 A bag contains \(2 m\) yellow and \(m\) green counters. Three counters are chosen at random, without replacement. The probability that exactly two of the three counters are yellow is \(\frac { 28 } { 55 }\). Determine the value of \(m\).
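One way to check a candidate value of \(m\) is a direct search using exact fractions. The Python sketch below assumes the counting model \(\binom { 2 m } { 2 } \binom { m } { 1 } / \binom { 3 m } { 3 }\) for the probability of exactly two yellow counters; the search range of 1 to 50 is an arbitrary choice for illustration, and an algebraic solution is still needed to show that no other value works.

```python
from fractions import Fraction
from math import comb

def prob_two_yellow(m):
    """P(exactly two yellow) when drawing 3 from 2m yellow and m green counters."""
    return Fraction(comb(2 * m, 2) * comb(m, 1), comb(3 * m, 3))

target = Fraction(28, 55)

# Exact comparison is safe here because Fraction avoids rounding error.
solutions = [m for m in range(1, 51) if prob_two_yellow(m) == target]
print(solutions)
```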
OCR Further Statistics AS 2021 November Q1
1 The discrete random variable \(A\) has the following probability distribution.
| \(a\) | 1 | 2 | 5 | 10 | 20 |
| --- | --- | --- | --- | --- | --- |
| \(\mathrm { P } ( A = a )\) | 0.3 | 0.1 | 0.1 | 0.2 | 0.3 |
  1. Find the value of \(\mathrm { E } ( A )\).
  2. Determine the value of \(\operatorname { Var } ( A )\).
  3. The variable \(A\) represents the value in pence of a coin chosen at random from a pile. Mia picks one coin at random from the pile. She then adds, from a different source, another coin of the same value as the one that she has chosen, and one 50p coin.
    1. Find the mean of the value of the three coins.
    2. Find the variance of the value of the three coins.
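A minimal Python sketch of the expectation and variance calculations, given as an illustration. It assumes the total value of the three coins can be written as \(2 A + 50\) (the chosen coin, a second coin of the same value, and one 50p coin), so that the linear-coding results \(\mathrm { E } ( c A + d ) = c \mathrm { E } ( A ) + d\) and \(\operatorname { Var } ( c A + d ) = c ^ { 2 } \operatorname { Var } ( A )\) apply.

```python
values = [1, 2, 5, 10, 20]
probs = [0.3, 0.1, 0.1, 0.2, 0.3]

# E(A) = sum of a * P(A = a); Var(A) = E(A^2) - E(A)^2.
mean_a = sum(a * p for a, p in zip(values, probs))
var_a = sum(a * a * p for a, p in zip(values, probs)) - mean_a ** 2

# Total value of the three coins, assumed to be 2A + 50 (in pence).
mean_total = 2 * mean_a + 50
var_total = 2 ** 2 * var_a

print(round(mean_a, 4), round(var_a, 4), round(mean_total, 4), round(var_total, 4))
```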
OCR Further Statistics AS 2021 November Q2
2 A shopper estimates the cost, \(\pounds X\) per item, of each of 12 items in a supermarket. The shopper's estimates are compared with the actual cost, \(\pounds Y\) per item, of each item. The results are summarised as follows.
\(n = 12 \quad \sum x = 399 \quad \sum y = 623.88 \quad \sum x ^ { 2 } = 28127 \quad \sum y ^ { 2 } = 116509.0212 \quad \sum x y = 45006.01\)
Test at the 1\% significance level whether the shopper's estimates are positively correlated with the actual cost of the items.
OCR Further Statistics AS 2021 November Q3
3
  1. Using the scatter diagram in the Printed Answer Booklet, explain what is meant by least squares in the context of a regression line of \(y\) on \(x\).
  2. A set of bivariate data \(( t , u )\) is summarised as follows.
    \(n = 5 \quad \sum t = 35 \quad \sum u = 54\)
    \(\sum t ^ { 2 } = 285 \quad \sum u ^ { 2 } = 758 \quad \sum t u = 460\)
    1. Calculate the equation of the regression line of \(u\) on \(t\).
    2. The variables \(t\) and \(u\) are now scaled using the following scaling.
      \(v = 2 t , \quad w = u + 4\)
      Find the equation of the regression line of \(w\) on \(v\), giving your equation in the form \(w = f ( v )\).
OCR Further Statistics AS 2021 November Q4
4 Two random variables \(X\) and \(Y\) have the distributions \(\mathrm { B } ( m , p )\) and \(\mathrm { B } ( n , p )\) respectively, where \(p > 0\). It is known that
  • \(\mathrm { E } ( Y ) = 2 \mathrm { E } ( X )\)
  • \(\operatorname { Var } ( Y ) = 1.2 \mathrm { E } ( X )\).
Determine the value of \(p\).
OCR Further Statistics AS 2021 November Q5
5 The discrete random variable \(X\) has a geometric distribution. It is given that \(\operatorname { Var } ( X ) = 20\).
Determine \(\mathrm { P } ( X \geqslant 7 )\).
OCR Further Statistics AS 2021 November Q6
6 A student believes that if you ask people to choose an integer between 1 and 10, not all integers are equally likely to be chosen. The student asks a random sample of 100 people to choose an integer between 1 and 10 inclusive. The observed frequencies \(O\), together with the values of \(\frac { ( O - E ) ^ { 2 } } { E }\) where \(E\) is the corresponding expected frequency, are shown in the table.
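| Integer | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| \(O\) | 7 | 8 | 20 | 8 | 7 | 6 | 19 | 7 | 8 | 10 |
| \(\frac { ( O - E ) ^ { 2 } } { E }\) | 0.9 | 0.4 | 10.0 | 0.4 | 0.9 | 1.6 | 8.1 | 0.9 | 0.4 | 0 |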
  1. Show how the value of 8.1 for integer 7 is obtained.
  2. Show that there is evidence at the \(1 \%\) significance level that the student's belief is correct.
    The student wishes to suggest an alternative model for the probabilities associated with each integer. In this model, two of the integers have the same probability \(p _ { 1 }\) of being chosen and the other eight integers each have probability \(p _ { 2 }\) of being chosen.
  3. Suggest which two integers should have probability \(p _ { 1 }\) and suggest a possible value of \(p _ { 1 }\).
OCR Further Statistics AS 2021 November Q7
7 The 20 members of a club consist of 10 Town members and 10 Country members.
  1. All 20 members are arranged randomly in a straight line. Determine the probability that the Town members and the Country members alternate.
  2. Ten members of the club are chosen at random. Show that the probability that the number of Town members chosen is no more than \(r\), where \(0 \leqslant r \leqslant 10\), is given by
    \(\frac { 1 } { N } \sum _ { i = 0 } ^ { r } \left( { } ^ { 10 } \mathrm { C } _ { i } \right) ^ { 2 }\)
    where \(N\) is an integer to be determined.
OCR Further Statistics AS 2021 November Q8
8
  1. A substance emits particles randomly at a constant average rate of 3.2 per minute. A second substance emits particles randomly, and independently of the first source, at a constant average rate of 2.7 per minute. Find the probability that the total number of particles emitted by the two sources in a ten-minute period is less than 70.
  2. The random variable \(X\) represents the number of particles emitted by a substance in a fixed time interval \(t\) minutes. It may be assumed that particles are emitted randomly and independently of each other. In general, the rate at which particles are emitted is proportional to the mass of the substance, but each particle emitted reduces the mass of the substance. Explain why a Poisson distribution may not be a valid model for \(X\) if the value of \(t\) is very large.
  3. The random variable \(Y\) has the distribution \(\operatorname { Po } ( \lambda )\). It is given that
    \(\mathrm { P } ( Y = r ) = \mathrm { P } ( Y = r + 1 )\)
    \(\mathrm { P } ( Y = r ) = 1.5 \times \mathrm { P } ( Y = r - 1 )\).
    Determine the following, in either order.
    • The value of \(r\)
    • The value of \(\lambda\)
OCR Further Statistics AS Specimen Q1
1 Two music critics, \(P\) and \(Q\), give scores to seven concerts as follows.
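| Concert | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Score by critic \(P\) | 12 | 11 | 6 | 13 | 17 | 16 | 14 |
| Score by critic \(Q\) | 9 | 13 | 8 | 14 | 18 | 16 | 20 |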
  1. Calculate Spearman's rank correlation coefficient, \(r _ { s }\), for these scores.
  2. Without carrying out a hypothesis test, state what your answer tells you about the views of the two critics.
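For reference, Spearman's coefficient for untied data can be computed as \(r _ { s } = 1 - \frac { 6 \sum d ^ { 2 } } { n ( n ^ { 2 } - 1 ) }\). The Python sketch below illustrates this for the two critics' scores; the helper `ranks` is a hypothetical function written for this example and assumes there are no tied scores, which holds for the data in the table.

```python
def ranks(scores):
    """Rank scores from 1 (lowest) upwards; assumes no tied values."""
    ordered = sorted(scores)
    return [ordered.index(s) + 1 for s in scores]

critic_p = [12, 11, 6, 13, 17, 16, 14]
critic_q = [9, 13, 8, 14, 18, 16, 20]

rank_p = ranks(critic_p)
rank_q = ranks(critic_q)

n = len(critic_p)
sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_p, rank_q))

# Spearman's formula for data without ties.
r_s = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

print(rank_p, rank_q, sum_d_squared, round(r_s, 4))
```

Ranking from lowest to highest or from highest to lowest gives the same coefficient, provided both critics are ranked in the same direction.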