Edexcel S3 (Statistics 3) 2014 June

Question 1
1. A tennis club's committee wishes to select a sample of 50 members to fill in a questionnaire about the club's facilities. The 300 members, of whom 180 are males, are listed in alphabetical order and numbered 1 to 300 in the club's membership book.
The club's committee decides to use a random number table to obtain its sample.
The first three lines of the random number table used are given below.
319 952 241 343 278 811 394 165 008 413 063 179 749
722 962 334 461 267 114 806 992 414 837 837 657 339
470 684 554 127 067 459 142 920 144 575 311 605 412
Starting with the top left-hand corner (319) and working across, the committee selects 50 random numbers. The first 2 suitable numbers are 241 and 278. Numbers greater than 300 are ignored.
  (a) Find the next two suitable numbers.
When the club's committee looks at the members corresponding to their random numbers they find that only 1 female has been selected.
The committee does not want to be accused of being biased towards males so considers using a systematic sample instead.
  (b) (i) Explain clearly how the committee could take a systematic sample.
      (ii) Explain why a systematic sample may not give a sample that represents the proportion of males and females in the club.
The committee decides to use a stratified sample instead.
  (c) Describe how to choose members for the stratified sample.
  (d) Explain an advantage of using a stratified sample rather than a quota sample.
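As a rough illustration of the table-reading procedure in Question 1, the sketch below splits each row into three-digit numbers and keeps those between 1 and 300. The function name `suitable_numbers` is hypothetical, and skipping 000 and repeated numbers is an assumption (sampling without replacement) that the question does not state. The last lines work out the sampling interval and the stratum sizes implied by 300 members, 180 males and a sample of 50.

```python
ROWS = [
    "319952241343278811394165008413063179749",
    "722962334461267114806992414837837657339",
    "470684554127067459142920144575311605412",
]

def suitable_numbers(rows, population=300, width=3):
    """Read fixed-width numbers across each row, keeping those in 1..population."""
    chosen = []
    for row in rows:
        for i in range(0, len(row) - width + 1, width):
            value = int(row[i:i + width])
            # Assumption: 000 and repeated numbers are skipped.
            if 1 <= value <= population and value not in chosen:
                chosen.append(value)
    return chosen

selected = suitable_numbers(ROWS)
print(selected[:4])  # the two numbers given in the question, then the next two

POPULATION, MALES, SAMPLE = 300, 180, 50
interval = POPULATION // SAMPLE                   # systematic sample: every 6th member
males_in_sample = SAMPLE * MALES // POPULATION    # proportional allocation for males
females_in_sample = SAMPLE - males_in_sample      # the rest of the sample is female
print(interval, males_in_sample, females_in_sample)
```

The three rows shown supply only a handful of suitable numbers, so the committee would have to keep reading further rows of the table to reach 50.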
Question 2
2. The random variable \(X\) follows a continuous uniform distribution over the interval \([ \alpha - 3,2 \alpha + 3 ]\) where \(\alpha\) is a constant.
The mean of a random sample of size \(n\) is denoted by \(\bar { X }\).
  (a) Show that \(\bar { X }\) is a biased estimator of \(\alpha\), and state the bias.
Given that \(Y = k \bar { X }\) is an unbiased estimator for \(\alpha\),
  (b) find the value of \(k\).
A random sample of 10 values of \(X\) is taken and the results are as follows
$$\begin{array} { l l l l l l l l l l } 3 & 5 & 8 & 12 & 4 & 13 & 10 & 8 & 5 & 12 \end{array}$$
  (c) Hence estimate the maximum value of \(X\).
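One possible way to check the arithmetic in Question 2 is sketched below. It relies on the standard result that a continuous uniform distribution has its mean at the midpoint of its interval, here \(\frac{3\alpha}{2}\); the variable names are illustrative and this is not an official mark scheme.

```python
from fractions import Fraction

sample = [3, 5, 8, 12, 4, 13, 10, 8, 5, 12]
x_bar = Fraction(sum(sample), len(sample))  # sample mean

# E(X) for U[alpha - 3, 2*alpha + 3] is the midpoint, 3*alpha/2,
# so the bias of the sample mean is alpha/2 and k = 2/3 removes it.
k = Fraction(2, 3)
alpha_hat = k * x_bar          # unbiased estimate of alpha
max_hat = 2 * alpha_hat + 3    # estimated upper end of the interval

print(x_bar, alpha_hat, max_hat, float(max_hat))
```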
Question 3
3. A grocer believes that the average weight of a grapefruit from farm \(A\) is greater than the average weight of a grapefruit from farm \(B\). The weights, in grams, of 80 grapefruit selected at random from farm \(A\) have a mean value of 532 g and a standard deviation, \(s _ { A }\), of 35 g. A random sample of 100 grapefruit from farm \(B\) have a mean weight of 520 g and a standard deviation, \(s _ { B }\), of 28 g. Stating your hypotheses clearly and using a \(1 \%\) level of significance, test whether or not the grocer's belief is supported by the data.
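A minimal sketch of the test statistic for Question 3, assuming the usual large-sample approach: the difference of sample means is treated as approximately normal and the sample standard deviations stand in for the population values. The hypotheses would be \(H_0: \mu_A = \mu_B\) against \(H_1: \mu_A > \mu_B\); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mean_a, s_a, n_a = 532, 35, 80
mean_b, s_b, n_b = 520, 28, 100

# Standard error of the difference of two independent sample means.
se = sqrt(s_a**2 / n_a + s_b**2 / n_b)
z = (mean_a - mean_b) / se

# One-tailed test at the 1% level.
critical = NormalDist().inv_cdf(0.99)
p_value = 1 - NormalDist().cdf(z)

print(round(z, 3), round(critical, 4), round(p_value, 4))
print("reject H0" if z > critical else "do not reject H0")
```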
Question 4
4. In a survey 10 randomly selected men had their systolic blood pressure, \(x\), and weight, \(w\), measured. Their results are as follows
$$\begin{array} { | c | c | c | c | c | c | c | c | c | c | c | }
\hline
\text{Man} & A & B & C & D & E & F & G & H & I & J \\ \hline
x & 123 & 128 & 137 & 143 & 149 & 153 & 154 & 159 & 162 & 168 \\ \hline
w & 78 & 93 & 85 & 83 & 75 & 98 & 88 & 87 & 95 & 99 \\ \hline
\end{array}$$
  (a) Calculate the value of Spearman's rank correlation coefficient between \(x\) and \(w\).
  (b) Stating your hypotheses clearly, test, at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between systolic blood pressure and weight.
The product moment correlation coefficient for these data is 0.5114
  (c) Use the value of the product moment correlation coefficient to test, at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between systolic blood pressure and weight.
  (d) Using your conclusions to part (b) and part (c), describe the relationship between systolic blood pressure and weight.
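The calculation in part (a) of Question 4 could be sketched as follows, using \(r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}\). The helper `ranks` is hypothetical and assumes there are no tied values, which holds for these data. The critical values needed for parts (b) and (c) come from statistical tables and are not reproduced here.

```python
x = [123, 128, 137, 143, 149, 153, 154, 159, 162, 168]  # systolic blood pressure
w = [78, 93, 85, 83, 75, 98, 88, 87, 95, 99]            # weight

def ranks(values):
    """Rank from 1 (smallest) upwards; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

rank_x, rank_w = ranks(x), ranks(w)
sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_w))

n = len(x)
r_s = 1 - 6 * sum_d2 / (n * (n**2 - 1))
print(sum_d2, round(r_s, 4))
```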
Question 5
5. A random sample of 200 people were asked which hot drink they preferred from tea, coffee and hot chocolate. The results are given below.
$$\begin{array} { | l | c | c | c | c | }
\hline
\text{Gender} & \text{Tea} & \text{Coffee} & \text{Hot Chocolate} & \text{Total} \\ \hline
\text{Males} & 57 & 26 & 11 & 94 \\ \hline
\text{Females} & 42 & 47 & 17 & 106 \\ \hline
\text{Total} & 99 & 73 & 28 & 200 \\ \hline
\end{array}$$
  (a) Test, at the \(5 \%\) significance level, whether or not there is an association between type of drink preferred and gender. State your hypotheses and show your working clearly. You should state your expected frequencies to 2 decimal places.
  (b) State what difference using a \(0.5 \%\) significance level would make to your conclusion. Give a reason for your answer.
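A hedged sketch of the working for Question 5: expected frequencies are taken as (row total × column total) / grand total, and the statistic is \(\sum \frac{(O - E)^2}{E}\). The critical values use the fact that, for 2 degrees of freedom only, \(\mathrm{P}(\chi^2 > x) = e^{-x/2}\), so no table lookup is needed here. Variable names are illustrative.

```python
from math import exp, log

observed = [[57, 26, 11],   # males: tea, coffee, hot chocolate
            [42, 47, 17]]   # females

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (2 - 1) * (3 - 1)

# Valid for 2 degrees of freedom only: the upper tail is exp(-x / 2).
critical_5_percent = -2 * log(0.05)
critical_half_percent = -2 * log(0.005)
p_value = exp(-chi2 / 2)

for row in expected:
    print([round(e, 2) for e in row])
print(round(chi2, 2), round(critical_5_percent, 3), round(critical_half_percent, 3))
```

Comparing the statistic with the two critical values shows directly whether the conclusion changes between the \(5 \%\) and \(0.5 \%\) levels.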
Question 6
6. Eight tasks were given to each of 125 randomly selected job applicants. The number of tasks failed by each applicant is recorded. The results are as follows
$$\begin{array} { | l | c | c | c | c | c | c | c | }
\hline
\text{Number of tasks failed by an applicant} & 0 & 1 & 2 & 3 & 4 & 5 & 6 \text{ or more} \\ \hline
\text{Frequency} & 2 & 21 & 45 & 42 & 12 & 3 & 0 \\ \hline
\end{array}$$
  (a) Show that the probability of a randomly selected task, from this sample, being failed is 0.3
An employer believes that a binomial distribution might provide a good model for the number of tasks, out of 8, that an applicant fails. He uses a binomial distribution, with the estimated probability 0.3 of a task being failed. The calculated expected frequencies are as follows
$$\begin{array} { | l | c | c | c | c | c | c | c | }
\hline
\text{Number of tasks failed by an applicant} & 0 & 1 & 2 & 3 & 4 & 5 & 6 \text{ or more} \\ \hline
\text{Expected frequency} & 7.21 & 24.71 & 37.06 & r & 17.02 & 5.83 & s \\ \hline
\end{array}$$
  (b) Find the value of \(r\) and the value of \(s\), giving your answers to 2 decimal places.
  (c) Test, at the \(5 \%\) level of significance, whether or not a binomial distribution is a suitable model for these data. State your hypotheses and show your working clearly.
The employer believes that all applicants have the same probability of failing each task.
  (d) Use your result from part (c) to comment on this belief.
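The arithmetic behind parts (a) and (b) of Question 6 might be checked with a sketch like the one below. It assumes the model \(\mathrm{B}(8, 0.3)\) named in the question and takes the "6 or more" expected frequency as whatever remains of the 125 applicants; the helper `binom_pmf` is hypothetical.

```python
from math import comb

# Observed frequencies; key 6 stands for "6 or more" (its frequency is 0).
observed = {0: 2, 1: 21, 2: 45, 3: 42, 4: 12, 5: 3, 6: 0}
n_applicants = sum(observed.values())
n_tasks = 8

# Part (a): total tasks failed divided by total tasks attempted.
p = sum(k * f for k, f in observed.items()) / (n_tasks * n_applicants)

def binom_pmf(k, n, prob):
    """P(X = k) for X ~ B(n, prob)."""
    return comb(n, k) * prob**k * (1 - prob) ** (n - k)

# Part (b): expected frequencies for 0..5, then the remainder for "6 or more".
expected = [n_applicants * binom_pmf(k, n_tasks, p) for k in range(6)]
expected.append(n_applicants - sum(expected))

r, s = expected[3], expected[6]
print(p, round(r, 2), round(s, 2))
```

For part (c), classes with small expected frequencies are usually pooled before the statistic is calculated, and one degree of freedom is lost because the probability was estimated from the data.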
Question 7
7. The random variable \(X\) is defined as $$X = 4 Y - 3 W$$ where \(Y \sim \mathrm { N } \left( 40,3 ^ { 2 } \right)\), \(W \sim \mathrm { N } \left( 50,2 ^ { 2 } \right)\) and \(Y\) and \(W\) are independent.
  (a) Find \(\mathrm { P } ( X > 25 )\).
The random variables \(Y _ { 1 } , Y _ { 2 }\) and \(Y _ { 3 }\) are independent and each has the same distribution as \(Y\). The random variable \(A\) is defined as $$A = \sum _ { i = 1 } ^ { 3 } Y _ { i }$$
The random variable \(C\) is such that \(C \sim \mathrm { N } \left( 115 , \sigma ^ { 2 } \right)\).
Given that \(\mathrm { P } ( A - C < 0 ) = 0.2\) and that \(A\) and \(C\) are independent,
  (b) find the variance of \(C\).
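A sketch of one way to work through Question 7, using the standard rules \(\mathrm{E}(aY + bW) = a\mathrm{E}(Y) + b\mathrm{E}(W)\) and, for independent variables, \(\mathrm{Var}(aY + bW) = a^2\mathrm{Var}(Y) + b^2\mathrm{Var}(W)\). Variable names are illustrative, and answers read from printed tables may differ slightly in the last digit.

```python
from math import sqrt
from statistics import NormalDist

# Part (a): X = 4Y - 3W with Y ~ N(40, 3^2), W ~ N(50, 2^2), independent.
mean_x = 4 * 40 - 3 * 50              # 10
var_x = 4**2 * 3**2 + 3**2 * 2**2     # 180
p_x_gt_25 = 1 - NormalDist(mean_x, sqrt(var_x)).cdf(25)

# Part (b): A = Y1 + Y2 + Y3, so A ~ N(120, 27), and A - C has mean 120 - 115.
mean_diff = 3 * 40 - 115
var_a = 3 * 3**2

# P(A - C < 0) = 0.2 means (0 - mean_diff) / sd_diff equals the 20% point of Z.
z = NormalDist().inv_cdf(0.2)         # a negative value
sd_diff = -mean_diff / z
var_c = sd_diff**2 - var_a            # Var(A - C) = Var(A) + Var(C)

print(round(p_x_gt_25, 4), round(var_c, 2))
```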