SPS ASFM Statistics, January 2025

Question 1
  1. \(\mathrm { E } ( a X + b Y + c ) = a \mathrm { E } ( X ) + b \mathrm { E } ( Y ) + c\),
  2. if \(X\) and \(Y\) are independent then \(\operatorname { Var } ( a X + b Y + c ) = a ^ { 2 } \operatorname { Var } ( X ) + b ^ { 2 } \operatorname { Var } ( Y )\).
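As a quick check of the two rules above, the following sketch builds the exact distribution of \(aX + bY + c\) for two small independent discrete variables and compares both sides of each identity. The distributions and the constants \(a\), \(b\), \(c\) are hypothetical values chosen only for illustration; they do not come from the paper.

```python
from fractions import Fraction
from itertools import product


def mean(dist):
    """E(X) for a dict mapping value -> probability."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """Var(X) computed as the sum of (x - mu)^2 * p."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def linear_combination(dist_x, dist_y, a, b, c):
    """Exact distribution of aX + bY + c, assuming X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        value = a * x + b * y + c
        out[value] = out.get(value, 0) + px * py
    return out


# Hypothetical distributions and constants (illustration only).
X = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
Y = {1: Fraction(1, 3), 4: Fraction(2, 3)}
a, b, c = 3, -2, 5

Z = linear_combination(X, Y, a, b, c)
lhs_mean, rhs_mean = mean(Z), a * mean(X) + b * mean(Y) + c
lhs_var, rhs_var = variance(Z), a ** 2 * variance(X) + b ** 2 * variance(Y)
```

Exact fractions are used so that the two sides can be compared with `==` rather than with a floating-point tolerance. The constant \(c\) shifts the mean but drops out of the variance.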
\section*{Non-parametric tests}
Goodness-of-fit test and contingency tables: \(\sum \frac { \left( O _ { i } - E _ { i } \right) ^ { 2 } } { E _ { i } } \sim \chi _ { \nu } ^ { 2 }\)
Approximate distributions for large samples
Wilcoxon Signed Rank test: \(T \sim \mathrm {~N} \left( \frac { 1 } { 4 } n ( n + 1 ) , \frac { 1 } { 24 } n ( n + 1 ) ( 2 n + 1 ) \right)\)
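The large-sample approximation for the Signed Rank statistic, and the Rank Sum approximation stated immediately after it, can be evaluated with a few lines of code. This is a minimal sketch: the function names are hypothetical, and the continuity correction in `signed_rank_lower_tail` is an assumption that the formula sheet itself does not state.

```python
from math import erf, sqrt


def signed_rank_normal_params(n):
    """Mean and variance of T under H0: n(n+1)/4 and n(n+1)(2n+1)/24."""
    return n * (n + 1) / 4, n * (n + 1) * (2 * n + 1) / 24


def rank_sum_normal_params(m, n):
    """Mean and variance of W under H0 for sample sizes m <= n."""
    return m * (m + n + 1) / 2, m * n * (m + n + 1) / 12


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def signed_rank_lower_tail(t, n):
    """Approximate P(T <= t); the +0.5 continuity correction is an assumption."""
    mu, var = signed_rank_normal_params(n)
    return normal_cdf((t + 0.5 - mu) / sqrt(var))
```

For example, `signed_rank_normal_params(20)` gives a mean of 105 and a variance of 717.5, which follow directly from the two expressions in the formula.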
Wilcoxon Rank Sum test (samples of sizes \(m\) and \(n\), with \(m \leq n\)):
$$W \sim \mathrm {~N} \left( \frac { 1 } { 2 } m ( m + n + 1 ) , \frac { 1 } { 12 } m n ( m + n + 1 ) \right)$$

\section*{Discrete distributions}
\(X\) is a random variable taking values \(x _ { i }\) in a discrete distribution with \(\mathrm { P } \left( X = x _ { i } \right) = p _ { i }\)
Expectation: \(\mu = \mathrm { E } ( X ) = \sum x _ { i } p _ { i }\)
Variance: \(\sigma ^ { 2 } = \operatorname { Var } ( X ) = \sum \left( x _ { i } - \mu \right) ^ { 2 } p _ { i } = \sum x _ { i } ^ { 2 } p _ { i } - \mu ^ { 2 }\)

\(n = 8 \quad \sum p = 28.5 \quad \sum q = 26.7 \quad \sum p ^ { 2 } = 136.35 \quad \sum q ^ { 2 } = 116.35 \quad \sum p q = 116.70\)
\includegraphics[max width=\textwidth, alt={}, center]{76f751ed-394d-41cb-b98f-bc8efcf3365e-08_705_1164_1139_267}
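The summary statistics quoted above are sufficient to obtain a least-squares line of \(q\) on \(p\). The sketch below shows one way to do the arithmetic, using the standard relations \(S_{pp} = \sum p^2 - (\sum p)^2 / n\) and \(S_{pq} = \sum pq - \sum p \sum q / n\). The function name is hypothetical, and this is offered as an illustration of the method rather than as the official mark scheme.

```python
def regression_from_summaries(n, sum_x, sum_y, sum_xx, sum_xy):
    """Least-squares line y = intercept + slope * x from summary statistics."""
    s_xx = sum_xx - sum_x ** 2 / n          # corrected sum of squares of x
    s_xy = sum_xy - sum_x * sum_y / n       # corrected sum of products
    slope = s_xy / s_xx
    intercept = sum_y / n - slope * sum_x / n  # line passes through the means
    return intercept, slope


# Values taken from the summary statistics in the question (x is p, y is q).
intercept, slope = regression_from_summaries(
    n=8, sum_x=28.5, sum_y=26.7, sum_xx=136.35, sum_xy=116.70
)

# Estimate of q at p = 2.5 from the fitted line.
q_at_2_5 = intercept + slope * 2.5
```

With these figures the line comes out at roughly \(q = 1.13 + 0.620p\), giving an estimate of about 2.68 at \(p = 2.5\). Note that \(\sum q^2\) is not needed for the line itself.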
  1. State which, if either, of the variables \(p\) and \(q\) is independent.
  2. Calculate the equation of the regression line of \(q\) on \(p\).
    1. Use the regression line to estimate the value of \(q\) for an investment account for which \(p = 2.5\).
    2. Give two reasons why this estimate could be considered reliable.
  3. Comment on the reliability of using the regression line to predict the value of \(q\) when \(p = 7.0\).

Total: \(\_\_\_\_\) / 9 marks

\section*{Question 4}
After a holiday organised for a group, the company organising the holiday obtained scores out of 10 for six different aspects of the holiday. The company obtained responses from 100 couples and 100 single travellers. The total scores for each of the aspects are given in the following table.

After further investigation, the statistician decides to use a different model for the distribution of \(F\). In this model it is now assumed that \(\mathrm { P } ( F = 0 )\) is still 0.200, but that if one failure occurs, there is an increased probability that further failures occur.
  4. Explain the effect of this assumption on the value of \(\mathrm { P } ( F = 1 )\).

Total: \(\_\_\_\_\) / 10 marks

\section*{Question 6}
In a fashion competition, two judges gave marks to a large number of contestants.
    The value of Spearman's rank correlation coefficient, \(r _ { s }\), between the marks given to 7 randomly chosen contestants is \(\frac { 27 } { 28 }\).
  5. An excerpt from the table of critical values of \(r _ { s }\) is shown below.
    \section*{Critical values of Spearman's rank correlation coefficient}
    \begin{tabular}{|c|c|c|c|c|c|}
    \hline
    \multicolumn{2}{|c|}{1-tail test} & 5\% & 2.5\% & 1\% & 0.5\% \\
    \hline
    \multicolumn{2}{|c|}{2-tail test} & 10\% & 5\% & 2\% & 1\% \\
    \hline
    \multirow{3}{*}{\(n\)} & 6 & 0.8286 & 0.8857 & 0.9429 & 1.0000 \\
    \cline{2-6}
     & 7 & 0.7143 & 0.7857 & 0.8929 & 0.9286 \\
    \cline{2-6}
     & 8 & 0.6429 & 0.7381 & 0.8333 & 0.8810 \\
    \hline
    \end{tabular}
    Test whether there is evidence, at the \(1 \%\) significance level, that the judges agree with each other.

    The marks given by the two judges to the 7 randomly chosen contestants were as follows, where \(x\) is an integer.
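    \begin{tabular}{|l|c|c|c|c|c|c|c|}
    \hline
    Contestant & \(A\) & \(B\) & \(C\) & \(D\) & \(E\) & \(F\) & \(G\) \\
    \hline
    Judge 1 & 64 & 65 & 67 & 78 & 79 & 80 & 86 \\
    \hline
    Judge 2 & 61 & 63 & 78 & 80 & 81 & 90 & \(x\) \\
    \hline
    \end{tabular}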
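The relationship between the marks and \(r _ { s }\) can be explored with a short script that applies \(r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}\) to the ranked marks. This is a sketch under stated assumptions: marks are taken to be integers from 0 to 100 (the paper does not give a range), values of \(x\) that tie with another Judge 2 mark are skipped because the formula assumes distinct ranks, and the one-tail reading of "agree" (positive association) is an interpretation rather than something the paper states.

```python
from fractions import Fraction


def ranks(values):
    """Rank from 1 (smallest) upward; assumes there are no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman(xs, ys):
    """Spearman's coefficient via 1 - 6*sum(d^2) / (n(n^2 - 1)), no ties."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - Fraction(6 * d2, n * (n * n - 1))


judge1 = [64, 65, 67, 78, 79, 80, 86]
judge2_known = [61, 63, 78, 80, 81, 90]   # contestant G's mark is x

target = Fraction(27, 28)

# Assumed mark range 0..100 (not given in the paper); tied values are skipped.
candidates = [
    x for x in range(0, 101)
    if x not in judge2_known
    and spearman(judge1, judge2_known + [x]) == target
]

# Critical value read from the table: n = 7, 1-tail test, 1% level.
critical_value = 0.8929
significant = float(target) > critical_value
```

Under these assumptions the only values of \(x\) that reproduce \(r_s = \frac{27}{28}\) are the integers strictly between 81 and 90, which corresponds to a single swap of the top two ranks (\(\sum d^2 = 2\)).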
  6. Use the value \(r _ { s } = \frac { 27 } { 28 }\) to determine the range of possible values of \(x\).
  7. Give a reason why it might be preferable to use the product moment correlation coefficient rather than Spearman's rank correlation coefficient in this context.

Total: \(\_\_\_\_\) / 9 marks

\section*{Question 7}
A bag contains \(2 m\) yellow and \(m\) green counters. Three counters are chosen at random, without replacement. The probability that exactly two of the three counters are yellow is \(\frac { 28 } { 55 }\). Determine the value of \(m\).

Total: \(\_\_\_\_\)

End of Paper