In a study carried out during the summer, soil samples of dimensions \(25 \mathrm {~cm} \times 25 \mathrm {~cm} \times 40 \mathrm {~cm}\) were collected for analysis. The study found that there were, on average, 11 earthworms per sample.
a) Explain briefly the conditions under which a Poisson distribution could be used to model the number of earthworms per sample.
b) In July, pupils at a primary school are asked to dig a smaller hole, \(25 \mathrm {~cm} \times 25 \mathrm {~cm} \times 10 \mathrm {~cm}\), and to count the number of earthworms they find. Calculate the probability that the pupils find exactly 5 earthworms.
c) In the autumn, the average number of earthworms per sample is greater than in the summer. The probability that, in the autumn, there are fewer than 13 earthworms in a soil sample of dimensions \(25 \mathrm {~cm} \times 25 \mathrm {~cm} \times 40 \mathrm {~cm}\) is close to \(36 \%\). Find the mean number of earthworms, to the nearest whole number, per \(25 \mathrm {~cm} \times 25 \mathrm {~cm} \times 40 \mathrm {~cm}\) soil sample in the autumn.
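One way to approach part b), on the assumption that earthworms are spread uniformly through the soil so that the Poisson mean scales with the volume sampled: the smaller hole has one quarter of the depth, so the mean becomes \(11 \div 4 = 2.75\). Writing \(X\) for the number of earthworms found (a symbol introduced here for illustration),
$$\mathrm{P}(X = 5) = \frac{e^{-2.75} \times 2.75^{5}}{5!} \approx 0.0838$$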
Jessica is studying the relationship between hip girth, \(h \mathrm {~cm}\), and thigh girth, \(t \mathrm {~cm}\), for American adults who are physically active. She takes a random sample of 11 people from a very large dataset which she has downloaded into a spreadsheet software package. The results are shown below.
| \(h ( \mathrm {~cm} )\) | \(98 \cdot 6\) | \(112 \cdot 1\) | \(97 \cdot 9\) | \(110 \cdot 2\) | \(89 \cdot 2\) | \(111 \cdot 7\) | \(87 \cdot 0\) | \(94 \cdot 7\) | \(100 \cdot 4\) | \(104 \cdot 0\) | \(88 \cdot 4\) |
| \(t ( \mathrm {~cm} )\) | \(48 \cdot 3\) | \(87 \cdot 2\) | \(55 \cdot 2\) | \(68 \cdot 0\) | \(48 \cdot 5\) | \(63 \cdot 2\) | \(49 \cdot 5\) | \(55 \cdot 7\) | \(59 \cdot 1\) | \(64 \cdot 0\) | \(52 \cdot 4\) |
a) Jessica notes that, for the thigh girth data, the lower quartile is \(49 \cdot 5\) and the upper quartile is \(64 \cdot 0\).
i) Show that \(87 \cdot 2\) should be classified as an outlier for \(t\).
ii) Give a reason why Jessica might exclude the outlier.
iii) Give a reason why Jessica might include the outlier.
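A sketch of the calculation for part a) i), assuming the common convention that a value more than \(1.5 \times \text{IQR}\) above the upper quartile counts as an outlier (the question does not state which rule is intended):
$$\text{IQR} = 64.0 - 49.5 = 14.5$$
$$64.0 + 1.5 \times 14.5 = 85.75 < 87.2$$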
Jessica decides to exclude the outlier and produces the following scatter diagram.
\section*{Thigh girth versus Hip girth}
\includegraphics[max width=\textwidth, alt={}, center]{77c62e6d-58e4-42d3-9982-5a8325e8e826-04_647_1250_1439_404}
b) Interpret, in context, the correlation in the data shown in the diagram.
The equation of the regression line of \(t\) on \(h\) for this sample is
$$t = 0.69 h - 11.26$$
c) Interpret the gradient of the regression line in this context.
d) Use your knowledge of large data sets and spreadsheet software packages to suggest a way in which Jessica could improve her investigation.
A company, Run4Lyfe, sponsors an athletic event. The organisers of the event claim that \(70 \%\) of the participants know the name of the sponsoring company. Run4Lyfe is concerned that the proportion, \(p\), of participants knowing the name of the sponsoring company is less than \(70 \%\). They decide to survey 60 randomly selected participants to carry out a significance test.
a) State suitable hypotheses for carrying out the test.
b) i) Explain what is meant by the critical region for this test.
ii) Determine the critical region if the test is to be carried out at a significance level as close as possible to, but not exceeding, \(5 \%\).
iii) Given that 40 participants out of the 60 in the sample know the name of the company, complete the significance test.
c) State, with a reason, how you would advise Run4Lyfe with regard to sponsoring the event next year.
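A possible set-up for this test, with \(X\) (a symbol introduced here for illustration) denoting the number of participants in the sample who know the name of the company:
$$\mathrm{H}_0 : p = 0.7 \qquad \mathrm{H}_1 : p < 0.7 \qquad X \sim \mathrm{B}(60, 0.7) \text{ under } \mathrm{H}_0$$
The critical region then takes the form \(X \leqslant c\), where \(c\) is the largest integer for which \(\mathrm{P}(X \leqslant c) \leqslant 0.05\). The value of \(c\) would be read from binomial tables or a calculator.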
The fertility rate for a country is the average number of children that are born to a woman over her lifetime. The graphs and table below show some data on the fertility rates for 197 countries in the years 1914 and 2014.
\begin{figure}[h]
\captionsetup{labelformat=empty}
\caption{Fertility rates in 1914}
\includegraphics[alt={},max width=\textwidth]{77c62e6d-58e4-42d3-9982-5a8325e8e826-06_671_1483_593_283}
\end{figure}
\begin{figure}[h]
\captionsetup{labelformat=empty}
\caption{Fertility rates in 2014}
\includegraphics[alt={},max width=\textwidth]{77c62e6d-58e4-42d3-9982-5a8325e8e826-06_616_1219_1434_287}
\end{figure}
\begin{figure}[h]
\captionsetup{labelformat=empty}
\caption{Decreases in fertility rates from 1914 to 2014}
\includegraphics[alt={},max width=\textwidth]{77c62e6d-58e4-42d3-9982-5a8325e8e826-06_476_613_2270_388}
\end{figure}
| Minimum value | -0.71 |
| Lower quartile | 2.08 |
| Median | 3.19 |
| Upper quartile | 3.94 |
| Maximum value | 6.49 |
a) Comment on the shapes of the distributions of fertility rates for 1914 and 2014.
b) Interpret the minimum value, \(- 0 \cdot 71\), in the boxplot.
You are also given the following information:
| Country | | |
| France | | 1.98 |
| Ethiopia | | 4.4 |
c) i) Find the best possible estimate for the decrease in the fertility rate from 1914 to 2014 for France.
ii) Find the best possible estimate for the decrease in the fertility rate from 1914 to 2014 for Ethiopia.
iii) Give one possible reason why the answers to i) and ii) are so different.
iv) Explain why these estimates may not be very accurate.
\section*{Section B: Mechanics}
The diagram below shows a vehicle of mass 1300 kg towing a trailer of mass 500 kg by means of a light horizontal tow bar. The vehicle is moving forward along a straight horizontal road such that a constant resistance of magnitude 650 N acts on the vehicle and a constant resistance of magnitude 320 N acts on the trailer. The vehicle's engine produces a constant driving force of \(F \mathrm {~N}\).
\includegraphics[max width=\textwidth, alt={}]{77c62e6d-58e4-42d3-9982-5a8325e8e826-08_158_851_781_609}
Given that the acceleration of the vehicle and trailer is \(0.85 \mathrm {~ms} ^ { - 2 }\), show that \(F = 2500\) and determine the tension in the tow bar.
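A sketch of the working, applying Newton's second law first to the whole system and then to the trailer alone, with \(T\) (a symbol introduced here for illustration) denoting the tension in newtons:
$$F - 650 - 320 = (1300 + 500) \times 0.85 = 1530 \quad \Rightarrow \quad F = 2500$$
$$T - 320 = 500 \times 0.85 = 425 \quad \Rightarrow \quad T = 745$$
As a check, the vehicle alone gives \(2500 - 650 - 745 = 1105 = 1300 \times 0.85\).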
CAIE
FP2
2017
June
Q10
12 marks
Standard +0.3
Roberto owns a small hotel and offers accommodation to guests. Over a period of \(100\) nights, the numbers of rooms, \(x\), that are occupied each night at Roberto's hotel and the corresponding frequencies are shown in the following table.
| Number of rooms occupied \((x)\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\) |
| Number of nights | 4 | 9 | 18 | 26 | 20 | 16 | 7 | 0 |
- Show that the mean number of rooms that are occupied each night is \(3.25\). [1]
The following table shows most of the corresponding expected frequencies, correct to \(2\) decimal places, using a Poisson distribution with mean \(3.25\).
| Number of rooms occupied \((x)\) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\) |
| Observed frequency | 4 | 9 | 18 | 26 | 20 | 16 | 7 | 0 |
| Expected frequency | 3.88 | 12.60 | 20.48 | 22.18 | 18.02 | 11.72 | | |
- Show how the expected value of \(22.18\), for \(x = 3\), is obtained and find the expected values for \(x = 6\) and for \(x \geqslant 7\). [4]
- Use a goodness-of-fit test at the \(5\%\) significance level to determine whether the Poisson distribution is a suitable model for the number of rooms occupied each night at Roberto's hotel. [7]
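The first two parts can be sketched as follows, assuming each expected frequency is \(100 \times \mathrm{P}(X = x)\) for \(X \sim \text{Po}(3.25)\) and that the final class takes whatever remains of the total of 100:
$$\bar{x} = \frac{0(4) + 1(9) + 2(18) + 3(26) + 4(20) + 5(16) + 6(7)}{100} = \frac{325}{100} = 3.25$$
$$100 \times \frac{e^{-3.25} \times 3.25^{3}}{3!} \approx 22.18$$
$$100 \times \frac{e^{-3.25} \times 3.25^{6}}{6!} \approx 6.35, \qquad 100 - (3.88 + 12.60 + 20.48 + 22.18 + 18.02 + 11.72 + 6.35) = 4.77$$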
CAIE
FP2
2018
November
Q10
12 marks
Standard +0.8
The number of accidents, \(x\), that occur each day on a motorway is recorded over a period of 40 days. The results are shown in the following table.
| Number of accidents | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\) |
| Observed frequency | 3 | 5 | 8 | 10 | 5 | 7 | 2 | 0 |
\begin{enumerate}[label=(\roman*)]
\item Show that the mean number of accidents each day is 2.95 and calculate the variance for this sample. Explain why these values suggest that a Poisson distribution might fit the data.
[3]
\item A Poisson distribution with mean 2.95, as found from the data, is used to calculate the expected frequencies, correct to 2 decimal places. The results are shown in the following table.
| Number of accidents | 0 | 1 | 2 | 3 | 4 | 5 | 6 | \(\geqslant 7\) |
| Observed frequency | 3 | 5 | 8 | 10 | 5 | 7 | 2 | 0 |
| Expected frequency | 2.09 | 6.18 | 9.11 | 8.96 | 6.61 | 3.90 | 1.92 | 1.23 |
Show how the expected frequency of 6.61 for \(x = 4\) is obtained.
[2]
\item Test at the 5\% significance level the goodness of fit of this Poisson distribution to the data.
[7]
\end{enumerate}
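A sketch of the calculations behind parts (i) and (ii), assuming the sample variance is taken with divisor \(n = 40\) (the unbiased estimate with divisor 39 would be about 2.72) and that each expected frequency is \(40 \times \mathrm{P}(X = x)\) for \(X \sim \text{Po}(2.95)\):
$$\bar{x} = \frac{0(3) + 1(5) + 2(8) + 3(10) + 4(5) + 5(7) + 6(2)}{40} = \frac{118}{40} = 2.95$$
$$\frac{\sum x^{2} f}{40} - \bar{x}^{2} = \frac{454}{40} - 2.95^{2} = 11.35 - 8.7025 = 2.6475$$
$$40 \times \frac{e^{-2.95} \times 2.95^{4}}{4!} \approx 6.61$$
The mean and variance are fairly close, which is the property of a Poisson distribution that part (i) points to.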
CAIE
S2
2021
June
Q1
4 marks
Standard +0.3
Accidents at two factories occur randomly and independently. On average, the numbers of accidents per month are 3.1 at factory \(A\) and 1.7 at factory \(B\).
Find the probability that the total number of accidents in the two factories during a 2-month period is more than 3. [4]
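One possible line of working, using the fact that a sum of independent Poisson variables is itself Poisson. Over 2 months the total, written here as \(T\) (a symbol introduced for illustration), has mean \(2 \times (3.1 + 1.7) = 9.6\):
$$\mathrm{P}(T > 3) = 1 - e^{-9.6}\left(1 + 9.6 + \frac{9.6^{2}}{2!} + \frac{9.6^{3}}{3!}\right) \approx 1 - 0.0138 = 0.986$$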
CAIE
S2
2021
June
Q5
7 marks
Standard +0.3
On average, 1 in 75000 adults has a certain genetic disorder.
- Use a suitable approximating distribution to find the probability that, in a random sample of 10000 people, at least 1 has the genetic disorder. [3]
- In a random sample of \(n\) people, where \(n\) is large, the probability that no-one has the genetic disorder is more than 0.9.
Find the largest possible value of \(n\). [4]
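A sketch of both parts, assuming the intended approximating distribution is Poisson with mean \(n / 75000\):
$$\lambda = \frac{10000}{75000} = \frac{2}{15}, \qquad 1 - e^{-2/15} \approx 0.125$$
$$e^{-n/75000} > 0.9 \quad \Rightarrow \quad n < -75000 \ln 0.9 \approx 7902.04 \quad \Rightarrow \quad n = 7902$$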
CAIE
S2
2022
November
Q3
6 marks
Moderate -0.8
1.6\% of adults in a certain town ride a bicycle. A random sample of 200 adults from this town is selected.
- Use a suitable approximating distribution to find the probability that more than 3 of these adults ride a bicycle. [4]
- Justify your approximating distribution. [2]
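A possible approach, assuming the intended approximation is Poisson with mean \(np = 200 \times 0.016 = 3.2\), where \(X\) (a symbol introduced here for illustration) counts the cyclists in the sample:
$$\mathrm{P}(X > 3) = 1 - e^{-3.2}\left(1 + 3.2 + \frac{3.2^{2}}{2!} + \frac{3.2^{3}}{3!}\right) \approx 1 - 0.603 = 0.397$$
The usual justification is that \(n = 200\) is large (\(n > 50\)) and \(np = 3.2\) is small (\(np < 5\)).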
CAIE
S2
2023
November
Q3
10 marks
Standard +0.3
A website owner finds that, on average, his website receives 0.3 hits per minute. He believes that the number of hits per minute follows a Poisson distribution.
- Assume that the owner is correct.
- Find the probability that there will be at least 4 hits during a 10-minute period. [3]
- Use a suitable approximating distribution to find the probability that there will be fewer than 40 hits during a 3-hour period. [4]
A friend agrees that the website receives, on average, 0.3 hits per minute. However, she notices that the number of hits during the day-time (9.00am to 9.00pm) is usually about twice the number of hits during the night-time (9.00pm to 9.00am).
- Explain why this fact contradicts the owner's belief that the number of hits per minute follows a Poisson distribution. [1]
- Specify separate Poisson distributions that might be suitable models for the number of hits during the day-time and during the night-time. [2]
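A sketch of the numerical parts, assuming a normal approximation with continuity correction for the 3-hour period. The symbols \(X\) and \(Y\) are introduced here for the numbers of hits in 10 minutes and in 3 hours respectively:
$$X \sim \text{Po}(3): \qquad \mathrm{P}(X \geqslant 4) = 1 - e^{-3}\left(1 + 3 + \frac{3^{2}}{2!} + \frac{3^{3}}{3!}\right) \approx 0.353$$
$$Y \sim \text{Po}(54) \approx \mathrm{N}(54, 54): \qquad \mathrm{P}(Y < 40) \approx \Phi\left(\frac{39.5 - 54}{\sqrt{54}}\right) = \Phi(-1.973) \approx 0.024$$
For the final part, per-minute rates in the ratio \(2 : 1\) that average to 0.3 would be \(\text{Po}(0.4)\) for the day-time and \(\text{Po}(0.2)\) for the night-time.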
CAIE
S2
2024
November
Q6
9 marks
Standard +0.3
The numbers of customers arriving at service desks \(A\) and \(B\) during a \(10\)-minute period have the independent distributions \(\text{Po}(1.8)\) and \(\text{Po}(2.1)\) respectively.
- Find the probability that during a randomly chosen \(15\)-minute period more than \(2\) customers will arrive at desk \(A\). [2]
- Find the probability that during a randomly chosen \(5\)-minute period the total number of customers arriving at both desks is less than \(4\). [3]
- An inspector waits at desk \(B\). She wants to wait long enough to be \(90\%\) certain of seeing at least one customer arrive at the desk.
Find the minimum time for which she should wait, giving your answer correct to the nearest minute. [4]
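One way to set out the three parts, scaling each Poisson mean in proportion to the length of the period. The symbols \(X\), \(Y\) and \(t\) are introduced here for illustration, with \(t\) measured in minutes:
$$X \sim \text{Po}(2.7): \qquad \mathrm{P}(X > 2) = 1 - e^{-2.7}\left(1 + 2.7 + \frac{2.7^{2}}{2!}\right) \approx 0.506$$
$$Y \sim \text{Po}(1.95): \qquad \mathrm{P}(Y < 4) = e^{-1.95}\left(1 + 1.95 + \frac{1.95^{2}}{2!} + \frac{1.95^{3}}{3!}\right) \approx 0.866$$
$$1 - e^{-0.21 t} \geqslant 0.9 \quad \Rightarrow \quad t \geqslant \frac{\ln 10}{0.21} \approx 10.96 \quad \Rightarrow \quad t = 11$$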
CAIE
S2
2011
June
Q3
7 marks
Moderate -0.3
The number of goals scored per match by Everly Rovers is represented by the random variable \(X\) which has mean 1.8.
- State two conditions for \(X\) to be modelled by a Poisson distribution. [2]
Assume now that \(X \sim \text{Po}(1.8)\).
- Find \(\text{P}(2 < X < 6)\). [2]
- The manager promises the team a bonus if they score at least 1 goal in each of the next 10 matches. Find the probability that they win the bonus. [3]
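A sketch of the two calculations, assuming for the bonus that the numbers of goals in different matches are independent:
$$\mathrm{P}(2 < X < 6) = e^{-1.8}\left(\frac{1.8^{3}}{3!} + \frac{1.8^{4}}{4!} + \frac{1.8^{5}}{5!}\right) \approx 0.259$$
$$\left(1 - e^{-1.8}\right)^{10} \approx 0.8347^{10} \approx 0.164$$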
CAIE
S2
2011
June
Q5
9 marks
Standard +0.3
The number of adult customers arriving in a shop during a 5-minute period is modelled by a random variable with distribution \(\text{Po}(6)\). The number of child customers arriving in the same shop during a 10-minute period is modelled by an independent random variable with distribution \(\text{Po}(4.5)\).
- Find the probability that during a randomly chosen 2-minute period, the total number of adult and child customers who arrive in the shop is less than 3. [3]
- During a sale, the manager claims that more adult customers are arriving than usual. In a randomly selected 30-minute period during the sale, 49 adult customers arrive. Test the manager's claim at the 2.5\% significance level. [6]
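A possible outline of both parts, assuming a normal approximation with continuity correction for the test. The symbols \(T\) and \(A\) are introduced here for the total number of customers in 2 minutes and the number of adults in 30 minutes respectively:
$$T \sim \text{Po}(2.4 + 0.9) = \text{Po}(3.3): \qquad \mathrm{P}(T < 3) = e^{-3.3}\left(1 + 3.3 + \frac{3.3^{2}}{2!}\right) \approx 0.359$$
$$\mathrm{H}_0 : \lambda = 36 \qquad \mathrm{H}_1 : \lambda > 36 \qquad A \approx \mathrm{N}(36, 36) \text{ under } \mathrm{H}_0$$
$$\mathrm{P}(A \geqslant 49) \approx 1 - \Phi\left(\frac{48.5 - 36}{6}\right) = 1 - \Phi(2.083) \approx 0.019 < 0.025$$
On this working the result is significant, so there would be evidence to support the manager's claim.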
CAIE
S2
2002
November
Q4
7 marks
Standard +0.3
The number of accidents per month at a certain road junction has a Poisson distribution with mean 4.8. A new road sign is introduced warning drivers of the danger ahead, and in a subsequent month 2 accidents occurred.
- A hypothesis test at the 10\% level is used to determine whether there were fewer accidents after the new road sign was introduced. Find the critical region for this test and carry out the test. [5]
- Find the probability of a Type I error. [2]
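A sketch of the working, with \(X\) (a symbol introduced here for illustration) denoting the number of accidents in the month and \(X \sim \text{Po}(4.8)\) under the null hypothesis:
$$\mathrm{P}(X \leqslant 1) = e^{-4.8}(1 + 4.8) \approx 0.0477, \qquad \mathrm{P}(X \leqslant 2) = e^{-4.8}\left(1 + 4.8 + \frac{4.8^{2}}{2!}\right) \approx 0.1425$$
Since \(0.0477 < 0.10 < 0.1425\), the critical region is \(X \leqslant 1\). The observed value 2 lies outside it, so there is insufficient evidence of fewer accidents. The probability of a Type I error is the probability of the critical region, about 0.0477.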
Edexcel
S2
2016
January
Q3
11 marks
Moderate -0.3
Left-handed people make up 10\% of a population. A random sample of 60 people is taken from this population. The discrete random variable \(Y\) represents the number of left-handed people in the sample.
- Write down an expression for the exact value of \(\mathrm{P}(Y \leq 1)\)
- Evaluate your expression, giving your answer to 3 significant figures. [3]
- Using a Poisson approximation, estimate \(\mathrm{P}(Y \leq 1)\) [2]
- Using a normal approximation, estimate \(\mathrm{P}(Y \leq 1)\) [5]
- Give a reason why the Poisson approximation is a more suitable estimate of \(\mathrm{P}(Y \leq 1)\) [1]
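The three estimates can be compared as follows, assuming \(Y \sim \mathrm{B}(60, 0.1)\), a Poisson approximation with mean 6, and a normal approximation \(\mathrm{N}(6, 5.4)\) with continuity correction:
$$\mathrm{P}(Y \leqslant 1) = 0.9^{60} + 60 \times 0.1 \times 0.9^{59} \approx 0.0138$$
$$\text{Po}(6): \qquad e^{-6}(1 + 6) \approx 0.0174$$
$$\mathrm{N}(6, 5.4): \qquad \Phi\left(\frac{1.5 - 6}{\sqrt{5.4}}\right) = \Phi(-1.936) \approx 0.026$$
On these figures the Poisson estimate lies closer to the exact value than the normal estimate.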
Edexcel
S2
2016
January
Q5
10 marks
Standard +0.3
The number of eruptions of a volcano in a 10 year period is modelled by a Poisson distribution with mean 1
- Find the probability that this volcano erupts at least once in each of 2 randomly selected 10 year periods. [2]
- Find the probability that this volcano does not erupt in a randomly selected 20 year period. [2]
The probability that this volcano erupts exactly 4 times in a randomly selected \(w\) year period is 0.0443 to 3 significant figures.
- Use the tables to find the value of \(w\) [3]
A scientist claims that the mean number of eruptions of this volcano in a 10 year period is more than 1
She selects a 100 year period at random in order to test her claim.
- State the null hypothesis for this test. [1]
- Determine the critical region for the test at the 5\% level of significance. [2]
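A sketch of the first two parts only, assuming eruptions in separate periods are independent and that the mean scales with the length of the period:
$$\left(1 - e^{-1}\right)^{2} \approx 0.632^{2} \approx 0.400$$
$$e^{-2} \approx 0.135$$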
Edexcel
S2
2016
January
Q7
12 marks
Standard +0.3
A fisherman is known to catch fish at a mean rate of 4 per hour. The number of fish caught by the fisherman in an hour follows a Poisson distribution.
The fisherman takes 5 fishing trips each lasting 1 hour.
- Find the probability that this fisherman catches at least 6 fish on exactly 3 of these trips. [6]
The fisherman buys some new equipment and wants to test whether or not there is a change in the mean number of fish caught per hour.
Given that the fisherman caught 14 fish in a 2 hour period using the new equipment,
- carry out the test at the 5\% level of significance. State your hypotheses clearly. [6]
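A possible outline of both parts, assuming the five trips are independent and that the test is two-tailed, so each tail is compared with 2.5\%. The symbols \(X\) and \(Y\) are introduced here for the numbers of fish caught in 1 hour and in 2 hours respectively:
$$X \sim \text{Po}(4): \qquad p = \mathrm{P}(X \geqslant 6) = 1 - \mathrm{P}(X \leqslant 5) \approx 1 - 0.7851 = 0.2149$$
$$\binom{5}{3} p^{3} (1 - p)^{2} \approx 10 \times 0.2149^{3} \times 0.7851^{2} \approx 0.0612$$
$$\mathrm{H}_0 : \lambda = 4 \qquad \mathrm{H}_1 : \lambda \neq 4 \qquad Y \sim \text{Po}(8) \text{ under } \mathrm{H}_0$$
$$\mathrm{P}(Y \geqslant 14) = 1 - \mathrm{P}(Y \leqslant 13) \approx 1 - 0.9658 = 0.0342 > 0.025$$
On this working the result is not significant, so there would be insufficient evidence of a change in the mean number of fish caught per hour.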