Questions — OCR (4628 questions)

OCR Further Statistics 2022 June Q9
10 marks Challenging +1.2
9 The head teacher of a school believes that, on average, pupil absences on the days Monday, Tuesday, Wednesday, Thursday and Friday are in the ratio \(3 : 2 : 2 : 2 : 3\). The head teacher takes a random sample of 120 pupil absences. The results are as follows.
| Day of week | Monday | Tuesday | Wednesday | Thursday | Friday |
|---|---|---|---|---|---|
| Number of absences | 28 | 16 | 24 | 16 | 36 |
  1. Test at the \(5 \%\) significance level whether these results are consistent with the head teacher's belief.
A significance test at the \(5 \%\) level is also carried out on a second, independent, random sample of \(n\) pupil absences. All the numbers of absences are integers. The ratio of the numbers of absences for each day in this sample is identical to the ratio of the numbers of absences for each day in the original sample of size 120.
  2. Determine the smallest value of \(n\) for which the conclusion of this significance test is that the data are not consistent with the head teacher's belief.
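The goodness-of-fit arithmetic in this question can be checked numerically. The following is a minimal sketch, assuming Python with only the standard library; the function names are hypothetical, and the critical value 9.488 is the tabulated 5% chi-squared value for 4 degrees of freedom rather than something computed here.

```python
from functools import reduce
from math import gcd

observed = [28, 16, 24, 16, 36]   # Monday..Friday, from the table
ratio = [3, 2, 2, 2, 3]           # head teacher's believed ratio
CRITICAL_5PCT_DF4 = 9.488         # tabulated chi-squared value (assumed input)

def chi_squared(obs, ratio):
    """Sum of (O - E)^2 / E with expected counts proportional to `ratio`."""
    n = sum(obs)
    expected = [n * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(obs, expected))

def smallest_rejecting_n(obs, ratio, critical):
    """Smallest integer sample with the same day-to-day ratio that is rejected."""
    g = reduce(gcd, obs)
    base = [o // g for o in obs]  # smallest all-integer sample in the same ratio
    m = 1
    while chi_squared([m * b for b in base], ratio) <= critical:
        m += 1
    return m * sum(base)

stat = chi_squared(observed, ratio)
print(round(stat, 3), stat > CRITICAL_5PCT_DF4)
print(smallest_rejecting_n(observed, ratio, CRITICAL_5PCT_DF4))
```

The search steps through multiples of the reduced sample because every count has to stay an integer while the ratio is preserved.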
OCR Further Statistics 2023 June Q1
8 marks Standard +0.3
1 A certain section of a library contains several thousand books. A lecturer is looking for a book that refers to a particular topic. The lecturer believes that one-twentieth of the books in that section of the library contain a reference to that topic. However, the lecturer does not know which books they might be, so the lecturer looks in each book in turn for a reference to the topic. The first book the lecturer finds that refers to the topic is the \(X\) th book in which the lecturer looks.
  1. A student says, "There is a maximum value of \(X\) as there is only a finite number of books. So a geometric distribution cannot be a good model for \(X\)." Explain whether you agree with the student.
    1. State one modelling assumption (not involving the total number of books) needed for \(X\) to be modelled by a geometric distribution in this context.
    2. Suggest a reason why this assumption may not be valid in this context.
Assume now that \(X\) can be well modelled by the distribution \(\operatorname{Geo}(0.05)\).
  2. The probability that the lecturer needs to look in no more than \(n\) books is greater than 0.9. Find the smallest possible value of \(n\).
  3. The lecturer needs to find four different books that refer to the topic. Find the probability that the lecturer needs to look in exactly 40 books.
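The geometric-distribution calculations can be explored with a short script. This is a sketch only, assuming Python; the function names are hypothetical, and the last function rests on one possible reading of the final part, namely that the fourth relevant book is the 40th one examined.

```python
from math import comb

P = 0.05  # success probability of Geo(0.05), as assumed in the question

def geo_cdf(n, p):
    """P(X <= n) for X ~ Geo(p): first success occurs within n trials."""
    return 1 - (1 - p) ** n

def smallest_n(threshold, p):
    """Smallest n with P(X <= n) strictly greater than `threshold`."""
    n = 1
    while geo_cdf(n, p) <= threshold:
        n += 1
    return n

def kth_success_on_trial(k, trial, p):
    """P(k-th success occurs exactly on `trial`), a negative binomial probability."""
    return comb(trial - 1, k - 1) * p ** k * (1 - p) ** (trial - k)

print(smallest_n(0.9, P))
print(kth_success_on_trial(4, 40, P))
```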
OCR Further Statistics 2023 June Q2
8 marks Standard +0.3
2 The director of a concert hall wishes to investigate if the price of the most expensive concert tickets affects attendance. The director collects data about the price, \(\pounds P\), of the most expensive tickets and the number of people in the audience, \(H\) hundred (rounded to the nearest hundred), for 20 concerts. For each price there are several different concerts. The results are shown in the table.
| \(P\) (£) | 75 | 65 | 55 | 45 | 35 |
|---|---|---|---|---|---|
| \(H\) (hundred) | 27 | 27 | 27 | 26 | 15 |
| | 27 | 27 | 20 | 21 | 12 |
| | | 22 | 18 | 16 | 9 |
| | | 19 | 18 | 13 | |
| | | 12 | 16 | 9 | |
\(n = 20 \quad \sum p = 1080 \quad \sum h = 381 \quad \sum p^{2} = 61300 \quad \sum h^{2} = 8011 \quad \sum ph = 21535\)
  1. Calculate the equation of the regression line of \(h\) on \(p\).
  2. State what change, if any, there would be to your answer to part (a) if \(H\) had been measured in thousands (to 1 decimal place) rather than in hundreds.
For a special charity concert, the most expensive tickets cost \(\pounds 50\).
  3. Use your answer to part (b) to estimate the expected size of the audience for this concert. Give your answer correct to 1 decimal place.
  4. Comment on the reliability of your answer to part (c). You should refer to
    • the value of the product-moment correlation coefficient for the data, which is 0.642
    • the value of \(\pounds 50\)
    • any one other relevant factor that should be taken into account.
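The summary statistics above are enough to recover both the regression line and the quoted correlation coefficient. The following is a minimal sketch, assuming Python; `regression_from_sums` is a hypothetical helper, not part of any library.

```python
from math import sqrt

def regression_from_sums(n, sx, sy, sxx, syy, sxy):
    """Return (a, b, r) for the line y = a + b*x and the PMCC r, from raw sums."""
    Sxx = sxx - sx ** 2 / n        # corrected sum of squares of x
    Syy = syy - sy ** 2 / n        # corrected sum of squares of y
    Sxy = sxy - sx * sy / n        # corrected sum of products
    b = Sxy / Sxx
    a = sy / n - b * sx / n        # the line passes through the mean point
    r = Sxy / sqrt(Sxx * Syy)
    return a, b, r

# x is the price p, y is the audience h (in hundreds), using the sums in the question
a, b, r = regression_from_sums(20, 1080, 381, 61300, 8011, 21535)
print(f"h = {a:.3f} + {b:.4f} p, r = {r:.3f}")
```

The value of `r` produced this way agrees with the 0.642 quoted in the question, which is a useful check that the sums have been read correctly.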
OCR Further Statistics 2023 June Q3
6 marks Standard +0.3
3 The discrete random variable \(W\) has the distribution \(\mathrm { U } ( 11 )\). The independent discrete random variable \(V\) has the distribution \(\mathrm { U } ( 5 )\).
  1. It is given that, for constants \(m\) and \(n\), with \(m > 0\), \(\mathrm{E}(mW + nV) = 0\) and \(\operatorname{Var}(mW + nV) = 1\). Determine the exact values of \(m\) and \(n\).
The random variable \(T\) is the mean of three independent observations of \(W\).
  2. Explain whether the Central Limit Theorem can be used to say that the distribution of \(T\) is approximately normal.
OCR Further Statistics 2023 June Q4
10 marks Standard +0.3
4 Two magazines give numerical ratings to hi-fi systems. Li wishes to test whether there is agreement between the opinions of the magazines. Li chooses a random sample of 5 hi-fi systems and looks up the ratings given by the two magazines. The results are shown in the table.
| System | A | B | C | D | E |
|---|---|---|---|---|---|
| Magazine I | 68 | 75 | 77 | 83 | 92 |
| Magazine II | 30 | 25 | 40 | 35 | 45 |
  1. Give a reason why Li might choose to use a test based on Spearman's rank correlation coefficient rather than on Pearson’s product-moment correlation coefficient.
  2. Calculate the value of Spearman's rank correlation coefficient for the data.
  3. Use your answer to part (b) to carry out a hypothesis test at the \(5 \%\) significance level.
  4. The value of Spearman's rank correlation coefficient between the ratings given by magazine I and by a third magazine, magazine III, has the same numerical value as the answer to part (b) but with the sign changed. In the Printed Answer Booklet, complete the table showing the rankings given by magazine III.
OCR Further Statistics 2023 June Q5
10 marks Challenging +1.2
5 An historian has reason to believe that the average age at which men got married in the seventeenth century was higher in urban areas compared to rural areas. The historian collected data from a random sample of 8 men in an urban area and a random sample of 6 men in a rural area, all of whom were married in the seventeenth century. The results were as follows, given in the form years/months.
| Urban | 18/3 | 18/5 | 19/9 | 20/7 | 25/6 | 34/6 | 41/8 | 46/3 |
|---|---|---|---|---|---|---|---|---|
| Rural | 18/0 | 18/1 | 18/4 | 19/11 | 22/2 | 28/11 | | |
  1. Use an appropriate non-parametric method to test at the \(5 \%\) significance level whether the average age at marriage of men is higher in urban areas than in rural areas.
  2. When checking the data, the historian found that the age of one of the men, Mr X, which had been recorded as 28/11, had been wrongly recorded. When corrected, the result of the test in part (a) was unchanged. Determine the youngest age that Mr X could have been, given that it was not the same, in years and months, as that of any of the other men in the sample.
OCR Further Statistics 2023 June Q6
7 marks Challenging +1.8
6 The continuous random variable \(X\) has a uniform distribution on the interval \([ - \pi , \pi ]\).
The random variable \(Y\) is defined by \(Y = \sin X\).
Determine the cumulative distribution function of \(Y\).
OCR Further Statistics 2023 June Q7
10 marks Challenging +1.2
7 A club secretary collects data about the time, \(T\) minutes, needed to process the details of a new member. The mean of \(T\) is denoted by \(\mu\). The variance of \(T\) is denoted by \(\sigma^{2}\). The results of a random sample of 40 observations of \(T\) are summarised as follows.
\(n = 40 \quad \sum t = 396.0 \quad \sum t^{2} = 4271.40\)
  1. Determine a 99% confidence interval for \(\mu\).
  2. The secretary discovers that over a long period the value of \(\sigma^{2}\) is in fact 10.0. The secretary collects an independent random sample of 50 observations of \(T\) and constructs a new 99% confidence interval for \(\mu\) based on this sample of size 50, but using \(\sigma^{2} = 10.0\). Find the probability that this new confidence interval contains the value \(\mu + 1.6\).
OCR Further Statistics 2023 June Q8
16 marks Challenging +1.2
8 A team of researchers have reason to believe that the number of calls received in randomly chosen 10-minute intervals to a call centre can be well modelled by a Poisson distribution. To test this belief the researchers record the number of telephone calls received in 60 randomly chosen 10-minute intervals. The results, together with relevant calculations, are shown in the following table.
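| Number of calls, \(r\) | 0 | 1 | 2 | 3 | 4 | \(\geqslant 5\) | Total |
|---|---|---|---|---|---|---|---|
| Observed frequency, \(f\) | 18 | 13 | 12 | 9 | 8 | 0 | 60 |
| \(rf\) | 0 | 13 | 24 | 27 | 32 | 0 | 96 |
| \(r^{2} f\) | 0 | 13 | 48 | 81 | 128 | 0 | 270 |
| Expected frequency | 12.114 | 19.382 | 15.506 | 8.270 | 3.308 | 1.421 | 60 |
| Contribution to test statistic | 2.860 | 2.101 | 0.793 | 1.232 | | | 6.99 |
(The entry 1.232 is a single contribution covering the last three cells, \(r = 3\), \(r = 4\) and \(r \geqslant 5\).)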
  1. Calculate the mean of the observed number of calls received.
  2. Calculate the variance of the observed number of calls received.
  3. Comment on what your answers to parts (a) and (b) suggest about the proposed model.
  4. Explain why it is necessary to combine some cells in the table.
  5. Show how the values 15.506 and 0.793 in the table were obtained.
  6. Carry out the test, at the \(5 \%\) significance level.
In the light of the result of the test, the team consider that a different model is appropriate. They propose the following improved model:
$$P(R = r) = \begin{cases} \frac{1}{60}(a + (2 - r) b) & r = 0, 1, 2, 3, 4 \\ 0 & \text{otherwise} \end{cases}$$
where \(a\) and \(b\) are integers.
  7. Use at least three of the observed frequencies to suggest appropriate values for \(a\) and \(b\). You should consider more than one possible pair of values, and explain which pair of values you consider best. (Do not carry out a goodness-of-fit test.)
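The table entries referred to in this question can be reproduced from the observed frequencies. This is a sketch under stated assumptions: Python, a variance calculated with divisor \(n\) (the question does not say which divisor is intended), and a Poisson mean set equal to the sample mean.

```python
from math import exp, factorial

calls = [0, 1, 2, 3, 4]          # r values with non-zero observed frequency
freq = [18, 13, 12, 9, 8]        # observed frequencies f from the table

n = sum(freq)
mean = sum(r * f for r, f in zip(calls, freq)) / n
# Variance with divisor n (an assumption; a divisor of n - 1 gives a larger value)
variance = sum(r * r * f for r, f in zip(calls, freq)) / n - mean ** 2

def poisson_pmf(r, lam):
    """P(R = r) for R ~ Po(lam)."""
    return exp(-lam) * lam ** r / factorial(r)

# Expected frequency and test-statistic contribution for the r = 2 cell
expected_2 = n * poisson_pmf(2, mean)
contribution_2 = (freq[2] - expected_2) ** 2 / expected_2

print(mean, round(variance, 2))
print(round(expected_2, 3), round(contribution_2, 3))
```

The two values on the last line agree with the 15.506 and 0.793 shown in the table.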
OCR Further Statistics 2024 June Q1
8 marks Standard +0.3
1 A discrete random variable \(X\) has the following distribution, where \(a , b\) and \(c\) are constants.
| \(x\) | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| \(\mathrm{P}(X = x)\) | \(a\) | \(b\) | \(c\) | 0.1 |
It is given that \(\mathrm { E } ( X ) = 1.25\) and \(\operatorname { Var } ( X ) = 0.8875\).
  1. Determine the values of \(a\), \(b\) and \(c\).
  2. The random variable \(Y\) is defined by \(Y = 7 - 2 X\). Write down the value of \(\operatorname { Var } ( Y )\).
  3. Twenty independent observations of \(X\) are obtained. The number of those observations for which \(X = 3\) is denoted by \(T\). Find the value of \(\operatorname { Var } ( T )\).
OCR Further Statistics 2024 June Q2
9 marks Standard +0.3
2 A newspaper article claimed that "taller dog owners have taller dogs as pets". Alex investigated this claim and obtained data from a random sample of 16 fellow students who owned exactly one dog. The results are summarised as follows, where the height of the student, in cm, is denoted by \(h\) and the height, in cm, of their dog is denoted by \(d\).
\(n = 16 \quad \sum h = 2880 \quad \sum d = 660 \quad \sum h^{2} = 519276 \quad \sum d^{2} = 30000 \quad \sum hd = 119425\)
  1. Calculate the value of Pearson's product moment correlation coefficient for the data.
  2. State what your answer tells you about a scatter diagram illustrating the data.
  3. Use the data to test, at the \(5 \%\) significance level, the claim of the newspaper article.
  4. Explain whether the answer to part (a) would be likely to be different if the dogs' weights had been used instead of their heights.
OCR Further Statistics 2024 June Q3
11 marks Standard +0.3
3 Research suggests that the mean reading age of a child about to start secondary school is 10.75. The reading ages, \(X\) years, of a random sample of 80 children who were about to start secondary school in a particular district were measured, and the results are summarised as follows.
$$n = 80 \quad \sum x = 893 \quad \sum x^{2} = 10267$$
  1. Test at the \(5 \%\) significance level whether the mean reading age of children about to start secondary school in this district is not 10.75 .
  2. A student wrote: "Although we do not know that the distribution of \(X\) is normal, the central limit theorem allows us to assume that it is, as the sample size is large." This statement is incorrect. Give a corrected version of the student's statement.
OCR Further Statistics 2024 June Q4
6 marks
4
  1. Write down the number of ways of choosing 5 objects from 12 distinct objects.
  2. Each possible set of 5 different integers selected from the integers \(1,2 , \ldots , 12\) is obtained, and for each set, the sum of the 5 integers is found. The sum \(S\) can take values between 15 and 50 inclusive. Part of the frequency distribution of \(S\) is shown in the following table, together with the cumulative frequencies.
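    | \(S\) | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 |
    |---|---|---|---|---|---|---|---|---|---|
    | Frequency | 1 | 1 | 2 | 3 | 5 | 7 | 10 | 13 | 17 |
    | Cumulative frequency | 1 | 2 | 4 | 7 | 12 | 19 | 29 | 42 | 59 |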
    Use these numbers to determine the critical region for a 1-tail Wilcoxon rank-sum test at the \(2 \%\) significance level when \(m = 5\) and \(n = 7\).
  3. A student says that, for a Wilcoxon rank-sum test on samples of size \(m\) and \(n\), where \(m\) and \(n\) are large, the mean and variance of the test statistic \(R _ { m }\) are 200 and \(616 \frac { 2 } { 3 }\) respectively. Show that at least one of these values must be incorrect.
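The frequency table in part (b) can be regenerated by brute-force enumeration, which is one way to see where the lower-tail probabilities of the rank-sum statistic come from. The following is a minimal sketch, assuming Python and its standard `itertools` and `collections` modules.

```python
from collections import Counter
from itertools import combinations

# Tally the sum of every 5-element subset of {1, ..., 12}
sums = Counter(sum(c) for c in combinations(range(1, 13), 5))
total = sum(sums.values())        # number of equally likely rank sets

cumulative = 0
for s in range(15, 24):
    cumulative += sums[s]
    # columns: S, frequency, cumulative frequency, lower-tail probability
    print(s, sums[s], cumulative, round(cumulative / total, 4))
```

Comparing the last column with the chosen significance level shows how far into the lower tail the critical region can extend.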
OCR Further Statistics 2024 June Q5
12 marks Standard +0.3
5 Some bird-watchers study the song of chaffinches in a particular wood. They investigate whether the number, \(N\), of separate bursts of song in a 5 minute period can be modelled by a Poisson distribution. They assume that a burst of song can be considered as a single event, and that bursts of song occur randomly.
(a) State two further assumptions needed for \(N\) to be well modelled by a Poisson distribution.
The bird-watchers record the value of \(N\) in each of 60 periods of 5 minutes. The mean and variance of the results are 3.55 and 5.6475 respectively.
(b) Explain what this suggests about the validity of a Poisson distribution as a model in this context.
The complete results are shown in the table.
| \(n\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | \(\geqslant 9\) |
|---|---|---|---|---|---|---|---|---|---|---|
| Frequency | 10 | 3 | 7 | 8 | 13 | 6 | 6 | 2 | 5 | 0 |
The bird-watchers carry out a \(\chi ^ { 2 }\) goodness of fit test at the \(5 \%\) significance level.
(c) State suitable hypotheses for the test.
(d) Determine the contribution to the test statistic for \(n = 3\).
(e) The total value of the test statistic, obtained by combining the cells for \(n \leqslant 1\) and also for \(n \geqslant 6\), is 9.202, correct to 4 significant figures. Complete the goodness of fit test.
(f) It is known that chaffinches are more likely to sing in the presence of other chaffinches. Explain whether this fact affects the validity of a Poisson model for \(N\).
OCR Further Statistics 2024 June Q6
11 marks Standard +0.3
6 A bag contains 6 identical blue counters and 5 identical yellow counters.
  1. Three counters are selected at random, without replacement. Find the probability that at least two of the counters are blue.
All 11 counters are now arranged in a row in a random order.
  2. Find the probability that all the yellow counters are next to each other.
  3. Find the probability that no yellow counter is next to another yellow counter.
  4. Find the probability that the counters are arranged in such a way that both of the following conditions hold.
    • Exactly three of the yellow counters are next to one another.
    • Neither of the other two yellow counters is next to a yellow counter.
  5. Explain whether the answer to part (d) would be different if the yellow counters were numbered \(1, 2, 3, 4\) and 5, so that they are not identical.
OCR Further Statistics 2024 June Q7
8 marks Standard +0.3
7 The coordinates of a set of 10 points are denoted by \((x_i, y_i)\) for \(i = 1, 2, \ldots, 10\). For a particular set of values of \((x_i, y_i)\) and any constants \(a\) and \(b\) it can be shown that \(\sum \left( y_i - a - b x_i \right)^{2} = 10 (11 - a - 6b)^{2} + 126 \left( b - \frac{83}{42} \right)^{2} + \frac{139}{14}\).
  1.
    1. Explain why \(\sum \left( y_i - a - b x_i \right)^{2}\) is minimised by taking \(b = \frac{83}{42}\) and \(a = 11 - 6b\).
    2. Hence explain why the equation of the regression line of \(y\) on \(x\) for these points is given by the corresponding values of \(a\) and \(b\) (so that the equation is \(y = \frac{83}{42} x - \frac{6}{7}\)).
  2. State which of the following terms cannot apply to the variable \(X\) if the regression line of \(y\) on \(x\) can be used for estimating values of \(Y\).
    • Dependent
    • Independent
    • Controlled
    • Response
  3. Use the regression line to estimate the value of \(y\) corresponding to \(x = 8\).
  4. State what must be true of the value \(x = 8\) if the estimate in part (c) is to be reliable.
  5. Variables \(u\) and \(v\) are related to \(x\) and \(y\) by the following relationships. \(u = 2 + 4x \quad v = 8 - 2y\) Show that the gradient of the regression line of \(v\) on \(u\) is very close to \(-1\).
OCR Further Statistics 2024 June Q8
10 marks Standard +0.3
8 A random sample of 100 students were given a task and the time taken by each student to complete the task was recorded. The maximum time allowed to complete the task was one minute and all students completed the task within the maximum time. The times, \(T\) minutes, for the random sample of students are summarised as follows.
\(n = 100 \quad \sum t = 61.88\)
A researcher proposes that \(T\) can be modelled by the continuous random variable with probability density function
$$f(t) = \begin{cases} \alpha t^{\alpha - 1} & 0 \leqslant t \leqslant 1, \\ 0 & \text{otherwise,} \end{cases}$$
where \(\alpha\) is a positive constant.
(a) In this question you must show detailed reasoning.
By finding \(\mathrm{E}(T)\) according to the researcher's model, determine an approximation for the value of \(\alpha\). Give your answer correct to 3 significant figures.
Further information about the times taken for the sample of 100 students to complete the task is given in the table.
| Time \(t\) | \(0 \leqslant t < \frac{1}{3}\) | \(\frac{1}{3} \leqslant t < \frac{2}{3}\) | \(\frac{2}{3} \leqslant t \leqslant 1\) |
|---|---|---|---|
| Frequency | 18 | 37 | 45 |
(b) Using the value of \(\alpha\) found in part (a), determine the extent to which the proposed model is a good model. (Do not carry out a goodness of fit test.)
OCR Further Statistics 2020 November Q1
4 marks Moderate -0.8
1 The continuous random variable \(X\) has the distribution \(\mathrm { N } ( \mu , 30 )\). The mean of a random sample of 8 observations of \(X\) is 53.1. Determine a \(95 \%\) confidence interval for \(\mu\). You should give the end points of the interval correct to 4 significant figures.
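A known-variance confidence interval of this kind can be checked with the standard library. This is a minimal sketch, assuming Python 3.8 or later (for `statistics.NormalDist`); `z_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(sample_mean, variance, n, level):
    """Two-sided confidence interval for the mean when the variance is known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)   # e.g. about 1.96 for level 0.95
    half_width = z * sqrt(variance / n)
    return sample_mean - half_width, sample_mean + half_width

# Values from the question: X ~ N(mu, 30), 8 observations, sample mean 53.1
low, high = z_interval(53.1, 30, 8, 0.95)
print(f"({low:.2f}, {high:.2f})")
```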
OCR Further Statistics 2020 November Q2
8 marks Moderate -0.3
2 A book collector compared the prices of some books, \(\pounds x\), when new in 1972 and the prices of copies of the same books, \(\pounds y\), on a second-hand website in 2018.
The results are shown in Table 1 and are summarised below the table.
| Book | A | B | C | D | E | F | G | H | I | J | K | L |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| \(x\) | 0.95 | 0.65 | 0.70 | 0.90 | 0.55 | 1.40 | 1.50 | 0.50 | 1.15 | 0.35 | 0.20 | 0.35 |
| \(y\) | 6.06 | 7.00 | 2.00 | 5.87 | 4.00 | 5.36 | 7.19 | 2.50 | 3.00 | 8.29 | 1.37 | 2.00 |
Table 1
$$n = 12, \quad \sum x = 9.20, \quad \sum y = 54.64, \quad \sum x^{2} = 8.9950, \quad \sum y^{2} = 310.4572, \quad \sum xy = 46.0545$$
  1. It is given that the value of Pearson’s product-moment correlation coefficient for the data is 0.381, correct to 3 significant figures.
    1. State what this information tells you about a scatter diagram illustrating the data.
    2. Test at the \(5 \%\) significance level whether there is evidence of positive correlation between prices in 1972 and prices in 2018.
  2. The collector noticed that the second-hand copy of book J was unusually expensive and he decided to ignore the data for book J. Calculate the value of Pearson's product-moment correlation coefficient for the other 11 books.
OCR Further Statistics 2020 November Q3
9 marks Standard +0.3
3 Jo can use either of two different routes, A or B, for her journey to school. She believes that route A has shorter journey times. She measures how long her journey takes for 17 journeys by route A and 12 journeys by route B. She ranks the 29 journeys in increasing order of time taken, and she finds that the sum of the ranks of the journeys by route B is 219.
  1. Test at the \(10 \%\) significance level whether route A has shorter journey times than route B.
  2. State an assumption about the 29 journeys which is necessary for the conclusion of the test to be valid.
OCR Further Statistics 2020 November Q4
7 marks Challenging +1.2
4 The random variable \(X\) is equally likely to take any of the \(n\) integer values from \(m + 1\) to \(m + n\) inclusive. It is given that \(\mathrm{E}(3X) = 30\) and \(\operatorname{Var}(3X) = 36\). Determine the value of \(m\) and the value of \(n\).
OCR Further Statistics 2020 November Q5
5 26 cards are each labelled with a different letter of the alphabet, A to Z. The letters A, E, I, O and U are vowels.
  1. Five cards are selected at random without replacement. Determine the probability that the letters on at least three of the cards are vowels.
  2. All 26 cards are arranged in a line, in random order.
    1. Show that the probability that all the vowels are next to one another is \(\frac { 1 } { 2990 }\).
    2. Determine the probability that three of the vowels are next to each other, and the other two vowels are next to each other, but the five vowels are not all next to each other.
OCR Further Statistics 2020 November Q6
11 marks Standard +0.3
6 The numbers of CD players sold in a shop on three consecutive weekends were 7, 6 and 2. It may be assumed that sales of CD players occur randomly and that nobody buys more than one CD player at a time. The number of CD players sold on a randomly chosen weekend is denoted by \(X\).
  1. How appropriate is the Poisson distribution as a model for \(X\)?
Now assume that a Poisson distribution with mean 5 is an appropriate model for \(X\).
  2. Find
    1. \(\mathrm { P } ( X = 6 )\),
    2. \(\mathrm{P}(X \geqslant 8)\).
The number of integrated sound systems sold in a weekend at the same shop can be assumed to have the distribution \(\operatorname{Po}(7.2)\).
  3. Find the probability that on a randomly chosen weekend the total number of CD players and integrated sound systems sold is between 10 and 15 inclusive.
  4. State an assumption needed for your answer to part (c) to be valid.
  5. Give a reason why the assumption in part (d) may not be valid in practice.
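The Poisson probabilities in parts (b) and (c) can be evaluated directly from the probability mass function. This is a sketch only, assuming Python; the helper names are hypothetical, and the combined mean \(5 + 7.2\) relies on the two sales counts being independent, which is exactly the kind of assumption part (d) asks about.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

p_six = poisson_pmf(6, 5)                  # P(X = 6)
p_at_least_eight = 1 - poisson_cdf(7, 5)   # P(X >= 8)

# Assuming independence, the total sold follows Po(5 + 7.2)
lam_total = 5 + 7.2
p_total = poisson_cdf(15, lam_total) - poisson_cdf(9, lam_total)  # 10..15 inclusive

print(round(p_six, 4), round(p_at_least_eight, 4), round(p_total, 4))
```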
OCR Further Statistics 2020 November Q7
10 marks Standard +0.3
7 A biased spinner has five sides, numbered 1 to 5. Elmer spins the spinner repeatedly and counts the number of spins, \(X\), up to and including the first time that the number 2 appears. He carries out this experiment 100 times and records the frequency \(f\) with which each value of \(X\) is obtained. His results are shown in Table 1, together with the values of \(x f\).
| \(x\) | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\) | Total |
|---|---|---|---|---|---|---|---|---|
| Frequency \(f\) | 20 | 15 | 9 | 13 | 10 | 10 | 23 | 100 |
| \(x f\) | 20 | 30 | 27 | 52 | 50 | 60 | 161 | 400 |
Table 1
  1. State an appropriate distribution with which to model \(X\), determining the value(s) of any parameter(s).
Elmer carries out a goodness-of-fit test, at the \(5 \%\) level, for the distribution in part (a). Table 2 gives some of his calculations, in which numbers that are not exact have been rounded to 3 decimal places.
| \(x\) | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\) |
|---|---|---|---|---|---|---|---|
| Observed frequency \(O\) | 20 | 15 | 9 | 13 | 10 | 10 | 23 |
| Expected frequency \(E\) | 25 | 18.75 | 14.063 | 10.547 | 7.910 | 5.933 | 17.798 |
| \((O - E)^{2} / E\) | 1 | 0.75 | 1.823 | 0.571 | 0.552 | 2.789 | 1.520 |
Table 2
  2. Show how the expected frequency corresponding to \(x \geqslant 7\) was obtained.
  3. Carry out the test.
OCR Further Statistics 2020 November Q8
15 marks Standard +0.8
8 The continuous random variable \(X\) has probability density function
$$f(x) = \begin{cases} \frac{k}{x^{n}} & x \geqslant 1 \\ 0 & \text{otherwise} \end{cases}$$
where \(n\) and \(k\) are constants and \(n\) is an integer greater than 1.
  1. Find \(k\) in terms of \(n\).
    1. When \(n = 4\), find the cumulative distribution function of \(X\).
    2. Hence determine \(\mathrm { P } ( X > 7 \mid X > 5 )\) when \(n = 4\).
  2. Determine the values of \(n\) for which \(\operatorname { Var } ( X )\) is not defined.
OCR Further Statistics 2021 November Q1
6 marks Standard +0.3
1 At a seaside resort the number \(X\) of ice-creams sold and the temperature \(Y^{\circ}\mathrm{F}\) were recorded on 20 randomly chosen summer days. The data can be summarised as follows.
\(\sum x = 1506 \quad \sum x^{2} = 127542 \quad \sum y = 1431 \quad \sum y^{2} = 104451 \quad \sum xy = 111297\)
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\), giving your answer in the form \(y = a + b x\).
  2. Explain the significance for the regression line of the quantity \(\sum \left[ y_i - \left( a + b x_i \right) \right]^{2}\).
  3. It is decided to measure the temperature in degrees Centigrade instead of degrees Fahrenheit. If the same temperature is measured both as \(f^{\circ}\) Fahrenheit and \(c^{\circ}\) Centigrade, the relationship between \(f\) and \(c\) is \(c = \frac{5}{9}(f - 32)\). Find the equation of the new regression line.