Edexcel S2 — Question 7

Exam Board: Edexcel
Module: S2 (Statistics 2)
Topic: Hypothesis test of binomial distributions


7. A drugs company claims that $75 \%$ of patients suffering from depression recover when treated with a new drug.

A random sample of 10 patients with depression is taken from a doctor's records.\\
(a) Write down a suitable distribution to model the number of patients in this sample who recover when treated with the new drug.

Given that the claim is correct,\\
(b) find the probability that the treatment will be successful for exactly 6 patients.

The doctor believes that the claim is incorrect and the percentage who will recover is lower. From her records she took a random sample of 20 patients who had been treated with the new drug. She found that 13 had recovered.\\
(c) Stating your hypotheses clearly, test, at the $5 \%$ level of significance, the doctor's belief.\\
(d) From a sample of size 20, find the greatest number of patients who need to recover for the test in part (c) to be significant at the $1 \%$ level.
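
The calculations in parts (b) to (d) can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `binom_pmf` and `binom_cdf` are illustrative and are not part of the paper. It assumes the hypotheses $H_0: p = 0.75$ and $H_1: p < 0.75$ for parts (c) and (d).

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

# (b) X ~ B(10, 0.75): probability of exactly 6 recoveries
p_exactly_6 = binom_pmf(6, 10, 0.75)

# (c) Under H0, X ~ B(20, 0.75); one-tailed p-value for 13 recoveries
p_value = binom_cdf(13, 20, 0.75)
reject_at_5_percent = p_value < 0.05

# (d) Largest c with P(X <= c) < 0.01 under H0
critical_value = max(c for c in range(21) if binom_cdf(c, 20, 0.75) < 0.01)

print(p_exactly_6, p_value, reject_at_5_percent, critical_value)
```

Comparing the lower-tail probability with the significance level is one common way to set out the test; finding the critical region first would lead to the same decision.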


\begin{enumerate}
  \item Before introducing a new rule the secretary of a golf club decided to find out how members might react to this rule.\\
(a) Explain why the secretary decided to take a random sample of club members rather than ask all the members.\\
(b) Suggest a suitable sampling frame.\\
(c) Identify the sampling units.\\
  \item The continuous random variable $L$ represents the error, in mm, made when a machine cuts rods to a target length. The distribution of $L$ is continuous uniform over the interval $[-4.0, 4.0]$.
\end{enumerate}

Find\\
(a) $\mathrm { P } ( L < - 2.6 )$,\\
(b) $\mathrm { P } ( L < - 3.0$ or $L > 3.0 )$.

A random sample of 20 rods cut by the machine was checked.\\
(c) Find the probability that more than half of them were within 3.0 mm of the target length.\\
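
As a hedged numerical check of the rod question, the sketch below treats $L$ as uniform on $[-4.0, 4.0]$ and, for part (c), assumes the 20 rods are independent, so that the number within 3.0 mm is binomial. The variable names are illustrative only.

```python
from math import comb

low, high = -4.0, 4.0
width = high - low

# (a) P(L < -2.6): length of [-4.0, -2.6] divided by the total width
p_a = (-2.6 - low) / width

# (b) P(L < -3.0 or L > 3.0): two tails, each of length 1.0
p_b = ((-3.0 - low) + (high - 3.0)) / width

# (c) Y = number of rods within 3.0 mm, Y ~ B(20, 1 - p_b); need P(Y >= 11)
p_within = 1 - p_b
p_c = sum(comb(20, k) * p_within**k * (1 - p_within)**(20 - k)
          for k in range(11, 21))

print(p_a, p_b, p_c)
```
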
3. An estate agent sells properties at a mean rate of 7 per week.\\
(a) Suggest a suitable model to represent the number of properties sold in a randomly chosen week. Give two reasons to support your model.\\
(b) Find the probability that in any randomly chosen week the estate agent sells exactly 5 properties.\\
(c) Using a suitable approximation find the probability that during a 24 week period the estate agent sells more than 181 properties.
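
Parts (b) and (c) can be sketched as follows, assuming a Poisson model with mean 7 per week and, for the 24 week period, a normal approximation $\mathrm{N}(168, 168)$ with a continuity correction. The function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)

# (b) X ~ Po(7): exactly 5 properties in a week
p_b = poisson_pmf(5, 7)

# (c) 24 weeks: Po(168), approximated by N(168, 168).
# "More than 181" becomes X > 181.5 after the continuity correction.
lam_24 = 7 * 24
approx = NormalDist(mu=lam_24, sigma=sqrt(lam_24))
p_c = 1 - approx.cdf(181.5)

print(p_b, p_c)
```
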

\begin{enumerate}
  \setcounter{enumi}{3}
  \item Breakdowns occur on a particular machine at random at a mean rate of 1.25 per week.\\
(a) Find the probability that fewer than 3 breakdowns occurred in a randomly chosen week.
\end{enumerate}

Over a 4 week period the machine was monitored. During this time there were 11 breakdowns.\\
(b) Test, at the $5 \%$ level of significance, whether or not there is evidence that the rate of breakdowns has changed over this period. State your hypotheses clearly.\\
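
A minimal sketch of the breakdown calculations, assuming a Poisson model and a two-tailed test with 2.5\% in each tail; `poisson_cdf` is an illustrative helper, not something defined in the paper.

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

# (a) X ~ Po(1.25): fewer than 3 breakdowns means X <= 2
p_a = poisson_cdf(2, 1.25)

# (b) H0: rate = 1.25 per week, H1: rate != 1.25 per week.
# Over 4 weeks the count is Po(5) under H0; 11 lies in the upper tail.
lam = 4 * 1.25
upper_tail = 1 - poisson_cdf(10, lam)   # P(X >= 11)
reject_at_5_percent = upper_tail < 0.025

print(p_a, upper_tail, reject_at_5_percent)
```
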

\begin{enumerate}
  \setcounter{enumi}{4}
  \item A manufacturer produces large quantities of coloured mugs. It is known from previous records that $6 \%$ of the production will be green.
\end{enumerate}

A random sample of 10 mugs was taken from the production line.\\
(a) Define a suitable distribution to model the number of green mugs in this sample.\\
(b) Find the probability that there were exactly 3 green mugs in the sample.

A random sample of 125 mugs was taken.\\
(c) Find the probability that there were between 10 and 13 (inclusive) green mugs in this sample, using\\
(i) a Poisson approximation,\\
(ii) a Normal approximation.\\
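
The two approximations in part (c) can be compared numerically. This sketch assumes $\mathrm{Po}(7.5)$ for (i) and $\mathrm{N}(7.5, 7.05)$ with a continuity correction for (ii); variable names are illustrative.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist

# (b) X ~ B(10, 0.06): exactly 3 green mugs
p_b = comb(10, 3) * 0.06**3 * 0.94**7

# (c) Y ~ B(125, 0.06), with mean 125 * 0.06 = 7.5
n, p = 125, 0.06
mean = n * p

# (i) Poisson approximation Po(7.5): P(10 <= Y <= 13)
p_poisson = sum(exp(-mean) * mean**k / factorial(k) for k in range(10, 14))

# (ii) Normal approximation N(np, np(1 - p)) with continuity correction
normal = NormalDist(mu=mean, sigma=sqrt(n * p * (1 - p)))
p_normal = normal.cdf(13.5) - normal.cdf(9.5)

print(p_b, p_poisson, p_normal)
```
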

6. The continuous random variable $X$ has probability density function

$$f ( x ) = \left\{ \begin{array} { c c } 
\frac { 1 + x } { k } , & 1 \leqslant x \leqslant 4 \\
0 , & \text { otherwise }
\end{array} \right.$$

(a) Show that $k = \frac { 21 } { 2 }$.\\
(b) Specify fully the cumulative distribution function of $X$.\\
(c) Calculate $\mathrm { E } ( X )$.\\
(d) Find the value of the median.\\
(e) Write down the mode.\\
(f) Explain why the distribution is negatively skewed.\\
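
The results for this density can be verified with exact fractions. The sketch below uses the antiderivatives $x + x^2/2$ and $x^2/2 + x^3/3$, worked out by hand; the function names `G`, `H` and `F` are illustrative.

```python
from fractions import Fraction
from math import sqrt

def G(x):
    """Antiderivative of (1 + x)."""
    return x + x * x / 2

def H(x):
    """Antiderivative of x * (1 + x)."""
    return x * x / 2 + x * x * x / 3

one, four = Fraction(1), Fraction(4)

# (a) The density integrates to 1, so k = G(4) - G(1)
k = G(four) - G(one)

# (b) F(x) = (G(x) - G(1)) / k on [1, 4], 0 below and 1 above
def F(x):
    if x < 1:
        return 0.0
    if x > 4:
        return 1.0
    return float(G(x) - G(1)) / float(k)

# (c) E(X) = (H(4) - H(1)) / k
mean = (H(four) - H(one)) / k

# (d) F(m) = 1/2 reduces to m^2 + 2m - 13.5 = 0, so m = -1 + sqrt(14.5)
median = -1 + sqrt(14.5)

# (e) f is increasing on [1, 4], so the mode is the right endpoint
mode = 4

print(k, mean, median, mode)
```

Under these calculations the mean is below the median, which is below the mode; that ordering is one way to justify the negative skew in part (f).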

\begin{enumerate}
  \setcounter{enumi}{6}
  \item It is known from past records that 1 in 5 bowls produced in a pottery have minor defects. To monitor production a random sample of 25 bowls was taken and the number of such bowls with defects was recorded.\\
(a) Using a 5\% level of significance, find critical regions for a two-tailed test of the hypothesis that 1 in 5 bowls have defects. The probability of rejecting, in either tail, should be as close to $2.5 \%$ as possible.\\
(b) State the actual significance level of the above test.
\end{enumerate}

At a later date, a random sample of 20 bowls was taken and 2 of them were found to have defects.\\
(c) Test, at the $10 \%$ level of significance, whether or not there is evidence that the proportion of bowls with defects has decreased. State your hypotheses clearly.
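
A hedged sketch of the critical-region search: for each tail it picks the boundary whose tail probability is closest to 0.025 under $\mathrm{B}(25, 0.2)$. The helper `binom_cdf` and the variable names are illustrative.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p = 25, 0.2

# (a) Lower region X <= lower, upper region X >= upper
lower = min(range(n + 1), key=lambda c: abs(binom_cdf(c, n, p) - 0.025))
upper = min(range(1, n + 1),
            key=lambda c: abs((1 - binom_cdf(c - 1, n, p)) - 0.025))

# (b) Actual significance level is the sum of the two tail probabilities
actual_level = binom_cdf(lower, n, p) + (1 - binom_cdf(upper - 1, n, p))

# (c) H0: p = 0.2, H1: p < 0.2, with X ~ B(20, 0.2); observed 2 defects
p_value = binom_cdf(2, 20, 0.2)
reject_at_10_percent = p_value < 0.10

print(lower, upper, actual_level, p_value, reject_at_10_percent)
```
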


\begin{enumerate}
  \item (a) Define a statistic.
\end{enumerate}

A random sample $X_1, X_2, \ldots, X_n$ is taken from a population with unknown mean $\mu$.\\
(b) For each of the following state whether or not it is a statistic.\\
(i) $\frac { X _ { 1 } + X _ { 4 } } { 2 }$,\\
(ii) $\frac { \sum X ^ { 2 } } { n } - \mu ^ { 2 }$.\\

2. The random variable $J$ has a Poisson distribution with mean 4.\\
(a) Find $\mathrm { P } ( J \geqslant 10 )$.

The random variable $K$ has a binomial distribution with parameters $n = 25 , p = 0.27$.\\
(b) Find $\mathrm { P } ( K \leqslant 1 )$.\\

3. For a particular type of plant $45 \%$ have white flowers and the remainder have coloured flowers. Gardenmania sells plants in batches of 12. A batch is selected at random.

Calculate the probability that this batch contains\\
(a) exactly 5 plants with white flowers,\\
(b) more plants with white flowers than coloured ones.

Gardenmania takes a random sample of 10 batches of plants.\\
(c) Find the probability that exactly 3 of these batches contain more plants with white flowers than coloured ones.

Due to an increasing demand for these plants by large companies, Gardenmania decides to sell them in batches of 50.\\
(d) Use a suitable approximation to calculate the probability that a batch of 50 plants contains more than 25 plants with white flowers.
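
The four parts can be sketched as below, assuming independent plants within a batch and, for part (d), a normal approximation $\mathrm{N}(22.5, 12.375)$ with a continuity correction. `binom_pmf` is an illustrative helper.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# (a) X ~ B(12, 0.45): exactly 5 white
p_a = binom_pmf(5, 12, 0.45)

# (b) More white than coloured in a batch of 12 means X >= 7
p_b = sum(binom_pmf(k, 12, 0.45) for k in range(7, 13))

# (c) Number of such batches out of 10 is B(10, p_b): exactly 3
p_c = binom_pmf(3, 10, p_b)

# (d) Batch of 50: B(50, 0.45) approximated by N(22.5, 12.375);
# "more than 25" becomes X > 25.5
normal = NormalDist(mu=50 * 0.45, sigma=sqrt(50 * 0.45 * 0.55))
p_d = 1 - normal.cdf(25.5)

print(p_a, p_b, p_c, p_d)
```
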

4. (a) State the condition under which the normal distribution may be used as an approximation to the Poisson distribution.\\
(b) Explain why a continuity correction must be incorporated when using the normal distribution as an approximation to the Poisson distribution.

A company has yachts that can only be hired for a week at a time. All hiring starts on a Saturday.\\
During the winter the mean number of yachts hired per week is 5.\\
(c) Calculate the probability that fewer than 3 yachts are hired on a particular Saturday in winter.

During the summer the mean number of yachts hired per week increases to 25. The company has only 30 yachts for hire.\\
(d) Using a suitable approximation find the probability that the demand for yachts cannot be met on a particular Saturday in the summer.

In the summer there are 16 Saturdays on which a yacht can be hired.\\
(e) Estimate the number of Saturdays in the summer that the company will not be able to meet the demand for yachts.
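
Parts (c) to (e) can be checked with a short sketch, assuming weekly demand is Poisson and that the summer demand $\mathrm{Po}(25)$ is approximated by $\mathrm{N}(25, 25)$ with a continuity correction. The variable names are illustrative.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

# (c) Winter: X ~ Po(5), and "fewer than 3" means X <= 2
p_c = sum(exp(-5) * 5**k / factorial(k) for k in range(3))

# (d) Summer: demand cannot be met when X > 30, i.e. X > 30.5 after correction
p_d = 1 - NormalDist(mu=25, sigma=sqrt(25)).cdf(30.5)

# (e) Expected number of the 16 summer Saturdays with unmet demand
expected_saturdays = 16 * p_d

print(p_c, p_d, expected_saturdays)
```
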

5. The continuous random variable $X$ is uniformly distributed over the interval $\alpha < x < \beta$.\\
(a) Write down the probability density function of $X$, for all $x$.\\
(b) Given that $\mathrm { E } ( X ) = 2$ and $\mathrm { P } ( X < 3 ) = \frac { 5 } { 8 }$ find the value of $\alpha$ and the value of $\beta$.

A gardener has wire cutters and a piece of wire 150 cm long which has a ring attached at one end. The gardener cuts the wire, at a randomly chosen point, into 2 pieces. The length, in cm, of the piece of wire with the ring on it is represented by the random variable $X$. Find\\
(c) $\mathrm { E } ( X )$,\\
(d) the standard deviation of $X$,\\
(e) the probability that the shorter piece of wire is at most 30 cm long.\\

6. Past records from a large supermarket show that $20 \%$ of people who buy chocolate bars buy the family size bar. On one particular day a random sample of 30 people was taken from those that had bought chocolate bars and 2 of them were found to have bought a family size bar.\\
(a) Test, at the $5 \%$ significance level, whether or not the proportion $p$ of people who bought a family size bar of chocolate that day had decreased. State your hypotheses clearly.

The manager of the supermarket thinks that the probability of a person buying a gigantic chocolate bar is only 0.02. To test whether this hypothesis is true the manager decides to take a random sample of 200 people who bought chocolate bars.\\
(b) Find the critical region that would enable the manager to test whether or not there is evidence that the probability is different from 0.02. The probability of each tail should be as close to $2.5 \%$ as possible.\\
(c) Write down the significance level of this test.\\
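
A hedged sketch of both tests follows. For part (b) it assumes the Poisson approximation $\mathrm{Po}(4)$ to $\mathrm{B}(200, 0.02)$, which is one possible choice; the exact binomial could be used instead. Helper names are illustrative.

```python
from math import comb, exp, factorial

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Po(lam)."""
    return sum(exp(-lam) * lam**i / factorial(i) for i in range(k + 1))

# (a) H0: p = 0.2, H1: p < 0.2, with X ~ B(30, 0.2); observed 2
p_value = binom_cdf(2, 30, 0.2)
reject_at_5_percent = p_value < 0.05

# (b) Assumed approximation: B(200, 0.02) ~ Po(4); each tail closest to 0.025
lam = 200 * 0.02
lower = min(range(20), key=lambda c: abs(poisson_cdf(c, lam) - 0.025))
upper = min(range(1, 20),
            key=lambda c: abs((1 - poisson_cdf(c - 1, lam)) - 0.025))

# (c) Significance level: sum of the two tail probabilities
level = poisson_cdf(lower, lam) + (1 - poisson_cdf(upper - 1, lam))

print(p_value, reject_at_5_percent, lower, upper, level)
```
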

7. The continuous random variable $X$ has cumulative distribution function

$$\mathrm { F } ( x ) = \begin{cases} 0 , & x < 0 \\ 2 x ^ { 2 } - x ^ { 3 } , & 0 \leqslant x \leqslant 1 \\ 1 , & x > 1 \end{cases}$$

(a) Find $\mathrm { P } ( X > 0.3 )$.\\
(b) Verify that the median value of $X$ lies between $x = 0.59$ and $x = 0.60$.\\
(c) Find the probability density function $\mathrm { f } ( x )$.\\
(d) Evaluate $\mathrm { E } ( X )$.\\
(e) Find the mode of $X$.\\
(f) Comment on the skewness of $X$. Justify your answer.\\
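
The parts of this question can be checked with the sketch below. The density $4x - 3x^2$ and the integral for $\mathrm{E}(X)$ are worked out by hand and stated in the comments; the function names are illustrative.

```python
from fractions import Fraction

def F(x):
    """Cumulative distribution function from the question."""
    if x < 0:
        return 0
    if x > 1:
        return 1
    return 2 * x**2 - x**3

# (a) P(X > 0.3) = 1 - F(0.3)
p_a = 1 - F(0.3)

# (b) The median m satisfies F(m) = 0.5, so check for a change of side
median_bracketed = F(0.59) < 0.5 < F(0.60)

# (c) f(x) = F'(x) = 4x - 3x^2 on [0, 1], and 0 otherwise
def f(x):
    return 4 * x - 3 * x**2 if 0 <= x <= 1 else 0

# (d) E(X) = integral of 4x^2 - 3x^3 over [0, 1] = 4/3 - 3/4
mean = Fraction(4, 3) - Fraction(3, 4)

# (e) f'(x) = 4 - 6x = 0 gives the mode
mode = Fraction(4, 6)

print(p_a, median_bracketed, mean, mode)
```

With these values the mean is less than the median, which is less than the mode; that ordering supports a comment of negative skew in part (f).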



\begin{enumerate}
  \item A string $A B$ of length 5 cm is cut, in a random place $C$, into two pieces. The random variable $X$ is the length of $A C$.\\
(a) Write down the name of the probability distribution of $X$ and sketch the graph of its probability density function.\\
(b) Find the values of $\mathrm { E } ( X )$ and $\operatorname { Var } ( X )$.\\
(c) Find $\mathrm { P } ( X > 3 )$.\\
(d) Write down the probability that $A C$ is 3 cm long.
  \item Bacteria are randomly distributed in a river at a rate of 5 per litre of water. A new factory opens and a scientist claims it is polluting the river with bacteria. He takes a sample of 0.5 litres of water from the river near the factory and finds that it contains 7 bacteria. Stating your hypotheses clearly test, at the 5\% level of significance, the claim of the scientist.
  \item An engineering company manufactures an electronic component. At the end of the manufacturing process, each component is checked to see if it is faulty. Faulty components are detected at a rate of 1.5 per hour.\\
(a) Suggest a suitable model for the number of faulty components detected per hour.\\
(b) Describe, in the context of this question, two assumptions you have made in part (a) for this model to be suitable.\\
(c) Find the probability of 2 faulty components being detected in a 1 hour period.\\
(d) Find the probability of at least one faulty component being detected in a 3 hour period.\\

  \item A bag contains a large number of coins:
\end{enumerate}

75\% are 10p coins,\\
$25 \%$ are 5p coins.

A random sample of 3 coins is drawn from the bag.\\
Find the sampling distribution for the median of the values of the 3 selected coins.\\
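
One way to obtain the sampling distribution of the median is to enumerate all ordered samples of three coins. The sketch below assumes the draws are independent because the bag is large; the dictionary names are illustrative.

```python
from fractions import Fraction
from itertools import product
from statistics import median

# Probability of each coin value on a single draw
probs = {10: Fraction(3, 4), 5: Fraction(1, 4)}

# Accumulate P(median = m) over all 2^3 ordered samples
dist = {}
for sample in product(probs, repeat=3):
    m = median(sample)
    weight = probs[sample[0]] * probs[sample[1]] * probs[sample[2]]
    dist[m] = dist.get(m, 0) + weight

print(dist)
```

The median is 5p exactly when at least two of the three coins are 5p, which is why only two values appear in the distribution.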
5. (a) Write down the conditions under which the Poisson distribution may be used as an approximation to the Binomial distribution.

A call centre routes incoming telephone calls to agents who have specialist knowledge to deal with the call. The probability of the caller being connected to the wrong agent is 0.01.\\
(b) Find the probability that 2 consecutive calls will be connected to the wrong agent.\\
(c) Find the probability that more than 1 call in 5 consecutive calls is connected to the wrong agent.

The call centre receives 1000 calls each day.\\
(d) Find the mean and variance of the number of wrongly connected calls.\\
(e) Use a Poisson approximation to find, to 3 decimal places, the probability that more than 6 calls each day are connected to the wrong agent.\\
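
Parts (b) to (e) can be sketched as follows, assuming calls are connected independently of one another and, for part (e), the approximation $\mathrm{Po}(10)$. Variable names are illustrative.

```python
from math import comb, exp, factorial

p = 0.01

# (b) Two consecutive wrong connections
p_b = p**2

# (c) X ~ B(5, 0.01): P(X > 1) = 1 - P(X = 0) - P(X = 1)
p_c = 1 - sum(comb(5, k) * p**k * (1 - p)**(5 - k) for k in range(2))

# (d) Y ~ B(1000, 0.01): mean np and variance np(1 - p)
n = 1000
mean = n * p
variance = n * p * (1 - p)

# (e) Poisson approximation Po(10): P(Y > 6) = 1 - P(Y <= 6)
p_e = 1 - sum(exp(-mean) * mean**k / factorial(k) for k in range(7))

print(p_b, p_c, mean, variance, round(p_e, 3))
```
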

\begin{enumerate}
  \setcounter{enumi}{5}
  \item Linda regularly takes a taxi to work five times a week. Over a long period of time she finds the taxi is late once a week. The taxi firm changes her driver and Linda thinks the taxi is late more often. In the first week, with the new driver, the taxi is late 3 times.
\end{enumerate}

You may assume that the number of times a taxi is late in a week has a Binomial distribution.

Test, at the $5 \%$ level of significance, whether or not there is evidence of an increase in the proportion of times the taxi is late. State your hypotheses clearly.\\
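
A minimal sketch of this test, assuming $H_0: p = 0.2$ and $H_1: p > 0.2$, where $p = 0.2$ comes from the taxi being late once in five journeys. Variable names are illustrative.

```python
from math import comb

n, p0 = 5, 0.2

# Under H0, X ~ B(5, 0.2); one-tailed p-value for 3 or more late taxis
p_value = sum(comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(3, n + 1))
reject_at_5_percent = p_value < 0.05

print(p_value, reject_at_5_percent)
```
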
7. (a) (i) Write down two conditions for $X \sim \operatorname{Bin}(n, p)$ to be approximated by a normal distribution $Y \sim \mathrm{N}\left(\mu, \sigma^{2}\right)$.\\
(ii) Write down the mean and variance of this normal approximation in terms of $n$ and $p$.

A factory manufactures 2000 DVDs every day. It is known that $3 \%$ of DVDs are faulty.\\
(b) Using a normal approximation, estimate the probability that at least 40 faulty DVDs are produced in one day.

The quality control system in the factory identifies and destroys every faulty DVD at the end of the manufacturing process. It costs $\pounds 0.70$ to manufacture a DVD and the factory sells non-faulty DVDs for $\pounds 11$.\\
(c) Find the expected profit made by the factory per day.
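
Parts (b) and (c) can be sketched as below. The sketch assumes a normal approximation $\mathrm{N}(60, 58.2)$ with a continuity correction and, for the profit, that the manufacturing cost applies to all 2000 DVDs while only non-faulty DVDs are sold. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p = 2000, 0.03
mean = n * p                  # 60
variance = n * p * (1 - p)    # 58.2

# (b) "At least 40" becomes X > 39.5 after the continuity correction
p_b = 1 - NormalDist(mu=mean, sigma=sqrt(variance)).cdf(39.5)

# (c) Expected revenue from non-faulty DVDs minus the cost of making all DVDs
expected_good = n * (1 - p)
expected_profit = 11 * expected_good - 0.70 * n

print(p_b, expected_profit)
```
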

\hfill \mbox{\textit{Edexcel S2  Q7}}