SPS FM Statistics 2024 January

Question 1
1. The continuous random variable \(X\) has the distribution \(\mathrm { N } ( \mu , 30 )\). The mean of a random sample of 8 observations of \(X\) is 53.1. Determine a \(95 \%\) confidence interval for \(\mu\). You should give the end points of the interval correct to 4 significant figures.
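A minimal sketch for checking an answer to Question 1 numerically. It assumes that 30 in \(\mathrm { N } ( \mu , 30 )\) is the known population variance, so the sample mean is normal with variance \(30 / 8\); the helper name `normal_mean_ci` is made up for illustration.

```python
from statistics import NormalDist

def normal_mean_ci(sample_mean, variance, n, level=0.95):
    """Two-sided confidence interval for a normal mean with known variance."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # upper critical value, about 1.96 for 95%
    half_width = z * (variance / n) ** 0.5     # z times the standard error of the mean
    return sample_mean - half_width, sample_mean + half_width

lower, upper = normal_mean_ci(53.1, 30, 8)
# Two decimal places correspond to 4 significant figures for these values.
print(f"({lower:.2f}, {upper:.2f})")
```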
Question 2
2. At a seaside resort the number \(X\) of ice-creams sold and the temperature \(Y ^ { \circ } \mathrm { F }\) were recorded on 20 randomly chosen summer days. The data can be summarised as follows. $$\sum x = 1506 \quad \sum x ^ { 2 } = 127542 \quad \sum y = 1431 \quad \sum y ^ { 2 } = 104451 \quad \sum x y = 111297$$
  1. Calculate the equation of the least squares regression line of \(y\) on \(x\), giving your answer in the form \(y = a + b x\).
  2. Explain the significance for the regression line of the quantity \(\sum \left[ y _ { i } - \left( a + b x _ { i } \right) \right] ^ { 2 }\).
  3. It is decided to measure the temperature in degrees Centigrade instead of degrees Fahrenheit. If the same temperature is measured both as \(f ^ { \circ }\) Fahrenheit and \(c ^ { \circ }\) Centigrade, the relationship between \(f\) and \(c\) is \(c = \frac { 5 } { 9 } ( f - 32 )\). Find the equation of the new regression line.
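The arithmetic in Question 2 can be sketched from the summary statistics alone. This is an illustrative check, not a model answer: the function name `least_squares_line` is hypothetical, and the last step assumes that only \(y\) is rescaled, so the intercept and gradient are both transformed by \(c = \frac { 5 } { 9 } ( f - 32 )\).

```python
def least_squares_line(n, sum_x, sum_x2, sum_y, sum_xy):
    """Return (a, b) for the regression line y = a + b x of y on x."""
    s_xx = sum_x2 - sum_x ** 2 / n       # corrected sum of squares of x
    s_xy = sum_xy - sum_x * sum_y / n    # corrected sum of products
    b = s_xy / s_xx                      # gradient
    a = sum_y / n - b * sum_x / n        # line passes through (mean x, mean y)
    return a, b

a, b = least_squares_line(20, 1506, 127542, 1431, 111297)

# Temperature in Centigrade: apply c = (5/9)(f - 32) to the fitted values.
a_c = 5 / 9 * (a - 32)
b_c = 5 / 9 * b

print(f"Fahrenheit: y = {a:.3g} + {b:.3g} x")
print(f"Centigrade: y = {a_c:.3g} + {b_c:.3g} x")
```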
Question 3
3. Eight runners took part in two races. The positions in which the runners finished in the two races are shown in the table.
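| Runner | A | B | C | D | E | F | G | H |
|---|---|---|---|---|---|---|---|---|
| First race | 3 | 1 | 5 | 6 | 2 | 8 | 7 | 4 |
| Second race | 4 | 3 | 8 | 7 | 2 | 5 | 6 | 1 |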
Test at the \(5 \%\) significance level whether those runners who do better in one race tend to do better in the other.
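A short sketch of the test statistic for Question 3, assuming Spearman's rank correlation coefficient is the intended measure (the data are already ranks with no ties). The function name `spearman_from_ranks` is illustrative; the result would still need to be compared with the tabulated one-tailed 5% critical value for \(n = 8\).

```python
first = [3, 1, 5, 6, 2, 8, 7, 4]    # finishing positions of runners A to H, first race
second = [4, 3, 8, 7, 2, 5, 6, 1]   # finishing positions of runners A to H, second race

def spearman_from_ranks(r1, r2):
    """Spearman's coefficient for two untied rankings of the same items."""
    n = len(r1)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

r_s = spearman_from_ranks(first, second)
print(f"r_s = {r_s:.4f}")
```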
Question 4
4. The manager of a car breakdown service uses the distribution \(\operatorname { Po } ( 2.7 )\) to model the number of punctures, \(R\), in a 24-hour period in a given rural area. The manager knows that, for this model to be valid, punctures must occur randomly and independently of one another.
  1. State a further assumption needed for the Poisson model to be valid.
  2. State the value of the standard deviation of \(R\).
  3. Use the model to calculate the probability that, in a randomly chosen period of 168 hours, at least 22 punctures occur.
The manager uses the distribution \(\mathrm { Po } ( 0.8 )\) to model the number of flat batteries in a 24-hour period in the same rural area, and he assumes that instances of flat batteries are independent of punctures. A day begins and ends at midnight, and a "bad" day is a day on which there are more than 6 instances, in total, of punctures and flat batteries.
  4. Assume first that both the manager's models are correct. Calculate the probability that a randomly chosen day is a "bad" day.
  5. It is found that 12 of the next 100 days are "bad" days. Comment on whether this casts doubt on the validity of the manager's models.
Question 5
5. A company uses two drivers for deliveries.
Driver \(A\) charges a fixed rate of \(\pounds 80\) per day plus \(\pounds 2\) per mile travelled on that day.
Driver \(B\) charges a fixed rate of \(\pounds 120\) per day plus \(\pounds 1.50\) per mile travelled on that day.
On each working day the total distance, in miles, travelled by each driver is a random variable with the distribution \(\mathrm { N } ( 83 , 360 )\). Find the probability that the total charge to the company of three randomly chosen days' deliveries by driver \(A\) is at least \(\pounds 300\) more than the total charge of two randomly chosen days' deliveries by driver \(B\).
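One way to set up Question 5 is as a linear combination of independent normal variables. The sketch below assumes that the five daily distances are mutually independent (the question does not say so explicitly) and that 360 is the variance; the variable names are illustrative.

```python
from statistics import NormalDist

MEAN_MILES, VAR_MILES = 83, 360

# Difference D = (charge for 3 days of A) - (charge for 2 days of B).
mean_diff = (3 * 80 + 2 * 3 * MEAN_MILES) - (2 * 120 + 1.5 * 2 * MEAN_MILES)

# Each day's distance contributes (rate per mile)^2 * variance; fixed charges add nothing.
var_diff = 3 * 2 ** 2 * VAR_MILES + 2 * 1.5 ** 2 * VAR_MILES

p = 1 - NormalDist(mean_diff, var_diff ** 0.5).cdf(300)
print(f"D ~ N({mean_diff}, {var_diff}); P(D >= 300) = {p:.4f}")
```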
Question 6
6. A firm claims that no more than \(2 \%\) of their packets of sugar are underweight. A market researcher believes that the actual proportion is greater than \(2 \%\). In order to test the firm's claim, the researcher weighs a random sample of 600 packets and carries out a hypothesis test, at the \(5 \%\) significance level, using the null hypothesis \(p = 0.02\).
  1. Given that the researcher's null hypothesis is correct, determine the probability that the researcher will conclude that the firm's claim is incorrect.
  2. The researcher finds that 18 out of the 600 packets are underweight. A colleague says, "18 out of 600 is \(3 \%\), so there is evidence that the actual proportion of underweight bags is greater than \(2 \%\)." Criticise this statement.
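A hedged sketch of the calculation behind Question 6, using the exact binomial distribution \(\mathrm { B } ( 600 , 0.02 )\) under the null hypothesis. A mark scheme might instead expect an approximation, so treat this as one possible check; `binom_tail` is a helper written for illustration.

```python
from math import comb

def binom_tail(n, p, k):
    """P(X >= k) for X ~ B(n, p)."""
    return 1 - sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k))

n, p0, alpha = 600, 0.02, 0.05

# Smallest c with P(X >= c) <= alpha: the upper-tail critical region is X >= c.
c = next(k for k in range(n + 1) if binom_tail(n, p0, k) <= alpha)

# Part 1: probability of rejecting a true null hypothesis (Type I error).
p_type1 = binom_tail(n, p0, c)

# Part 2: tail probability of a result at least as extreme as 18 underweight packets.
p_observed = binom_tail(n, p0, 18)

print(f"critical region: X >= {c}, P(Type I error) = {p_type1:.4f}")
print(f"P(X >= 18) = {p_observed:.4f}")
```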
Question 7
7. The random variable \(X\) was assumed to have a normal distribution with mean \(\mu\). Using a random sample of size 128, a significance test was carried out using the following hypotheses.
\(\mathrm { H } _ { 0 } : \mu = 30\)
\(\mathrm { H } _ { 1 } : \mu > 30\)
It was found that \(\sum x = 3929.6\) and \(\sum x ^ { 2 } = 123483.52\). The conclusion of the test was to reject the null hypothesis.
  1. Determine the range of possible values of the significance level of the test.
  2. It was subsequently found that \(X\) was not normally distributed. Explain whether this invalidates the conclusion of the test.
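The test statistic in Question 7 can be reconstructed from the summary data. This sketch assumes the population variance is estimated by the unbiased sample estimate and that, with \(n = 128\), a normal (z) test is used; the null hypothesis is rejected at any significance level at or above the one-tailed p-value.

```python
from statistics import NormalDist

n, sum_x, sum_x2, mu0 = 128, 3929.6, 123483.52, 30

mean = sum_x / n
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)   # unbiased estimate of the variance
z = (mean - mu0) / (s2 / n) ** 0.5         # standardised sample mean under H0
p_value = 1 - NormalDist().cdf(z)          # one-tailed, because H1 is mu > 30

print(f"mean = {mean:.2f}, s^2 = {s2:.2f}, z = {z:.3f}, p = {p_value:.4f}")
```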
Question 8
8. A teacher has 10 different mathematics books. Of these books, 5 are on Algebra, 3 are on Calculus and 2 are on Trigonometry. The teacher arranges all 10 books in random order on a shelf.
a) Find the probability that the Calculus books are next to each other and the Trigonometry books are next to each other.
In this question you must show detailed reasoning.
b) Find the probability that 2 of the Calculus books are next to each other but the third Calculus book is separated from the other 2 by at least 1 other book.
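Answers to Question 8 can be cross-checked by brute-force counting. The sketch below relies on the assumption that, in a random arrangement, every set of shelf positions for the Calculus (and Trigonometry) books is equally likely, so it counts position sets rather than all \(10 !\) orderings. It does not replace the detailed reasoning the question asks for; `adjacent_pairs` is an illustrative helper.

```python
from fractions import Fraction
from itertools import combinations

POSITIONS = range(10)

def adjacent_pairs(slots):
    """Number of neighbouring pairs among a set of shelf positions."""
    s = sorted(slots)
    return sum(1 for a, b in zip(s, s[1:]) if b - a == 1)

# (a) Calculus books in 3 consecutive slots and Trigonometry books in 2 consecutive slots.
favourable = total = 0
for calc in combinations(POSITIONS, 3):
    rest = [p for p in POSITIONS if p not in calc]
    for trig in combinations(rest, 2):
        total += 1
        if adjacent_pairs(calc) == 2 and adjacent_pairs(trig) == 1:
            favourable += 1
p_a = Fraction(favourable, total)

# (b) Exactly one neighbouring pair among the 3 Calculus slots:
# two books together, the third separated by at least one other book.
calc_sets = list(combinations(POSITIONS, 3))
p_b = Fraction(sum(adjacent_pairs(c) == 1 for c in calc_sets), len(calc_sets))

print(p_a, p_b)
```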
Question 9
9. The continuous random variable \(X\) has a uniform distribution on the interval \([ - \pi , \pi ]\).
The random variable \(Y\) is defined by \(Y = \sin X\). Determine the cumulative distribution function of \(Y\).
END OF TEST