Edexcel S3 (Statistics 3)

Question 1
1. A personnel manager has details on all company employees and wishes to consult a sample of them on a possible change to the company's hours of business. She decides to take a stratified sample based on different age groups.
  1. Give one advantage of using stratified sampling in this situation.
    The manager needs to select a sample of size 10, without replacement, from a list of 65 employees aged 16 to 25. She numbers these employees from 01 to 65 in alphabetical order and uses the table of random numbers given in the formula book. She starts with the top of the sixth two-digit column and works down. The first two numbers she writes down are 30 and 47.
  2. Find the other eight numbers in the sample.
  3. Suggest another factor that might be useful to consider in deciding on the strata.
    (1 mark)
Question 2
2. A Geography teacher is interested in the link between mathematical ability and the ability to visualise three-dimensional situations. He gives a group of 15 students a test and records each student's score, \(m\), on the mathematics questions and each student's score, \(v\), on the visio-spatial questions. He calculates the following summary statistics: $$S_{mm} = 3747.73, \quad S_{vv} = 2791.33, \quad S_{mv} = 2564.33$$
  1. Calculate the product moment correlation coefficient for these data.
  2. Stating your hypotheses clearly and using a \(5 \%\) level of significance, test the theory that students who are good at Mathematics tend to have better visio-spatial awareness.
    (4 marks)
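The arithmetic for the correlation coefficient can be checked with a short sketch like the one below. It assumes the usual definition \(r = S_{mv} / \sqrt{S_{mm} S_{vv}}\); the variable names are illustrative, and the comparison with a critical value from tables is left to the reader.

```python
import math

# Summary statistics quoted in the question.
S_mm = 3747.73
S_vv = 2791.33
S_mv = 2564.33

# Product moment correlation coefficient: r = S_mv / sqrt(S_mm * S_vv).
r = S_mv / math.sqrt(S_mm * S_vv)

print(f"r = {r:.4f}")
```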
Question 3
3. A random variable \(X\) is distributed normally with a standard deviation of 6.8. Sixty observations of \(X\) are made and found to have a mean of 31.4.
  1. Find a \(90 \%\) confidence interval for the mean of \(X\).
  2. How many observations of \(X\) would be needed in order to obtain a \(90 \%\) confidence interval for the mean of \(X\) with a width of less than 1.5?
    (5 marks)
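As a rough check on both parts, the sketch below assumes the standard known-variance interval \(\bar{x} \pm z \, \sigma / \sqrt{n}\), with \(z\) the 95th percentile of the standard normal distribution. The names are illustrative, and the percentile comes from Python's `statistics.NormalDist` rather than printed tables, so the last decimal place may differ slightly from a table-based answer.

```python
import math
from statistics import NormalDist

sigma = 6.8         # known standard deviation of X
n = 60              # number of observations
sample_mean = 31.4  # observed sample mean

# A two-sided 90% interval leaves 5% in each tail.
z = NormalDist().inv_cdf(0.95)

half_width = z * sigma / math.sqrt(n)
interval = (sample_mean - half_width, sample_mean + half_width)

# Width of the interval is 2 * z * sigma / sqrt(n); require it to be < 1.5.
target_width = 1.5
n_required = math.floor((2 * z * sigma / target_width) ** 2) + 1

print(f"90% CI: ({interval[0]:.2f}, {interval[1]:.2f})")
print(f"observations needed: {n_required}")
```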
Question 4
4. A paranormal investigator invites couples who believe they have a telepathic connection to participate in a trial. For each couple, one person looks at a card with one of five shapes on it and the other person says which of the shapes they think it is. This is repeated six times and the number of correct answers is recorded. The results from 120 couples are given below.
| Number Correct | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Number of Couples | 26 | 56 | 28 | 8 | 2 | 0 | 0 |
The investigator wishes to see if this data fits a binomial distribution with parameters \(n = 6\) and \(p = \frac { 1 } { 5 }\) and calculates to 2 decimal places the expected frequencies given below.
| Number Correct | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Expected Frequency | | | | 9.83 | 1.84 | 0.18 | 0.01 |
  1. Find the other expected frequencies.
  2. Stating your hypotheses clearly, test at the \(5 \%\) level of significance whether or not the distribution is an appropriate model.
  3. Comment on your findings.
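The expected frequencies can be reproduced with a sketch along these lines, assuming each is 120 times the binomial probability \(\binom{6}{k} p^{k} (1-p)^{6-k}\) with \(p = \frac{1}{5}\). The names are illustrative. Classes with small expected frequencies are usually combined before the test statistic is calculated; that step is not shown here.

```python
from math import comb

n_trials = 6   # attempts per couple
p = 1 / 5      # chance of a correct guess under the model
couples = 120  # total number of couples

# Expected frequency for k correct answers: 120 * P(X = k), X ~ B(6, 1/5).
expected = [
    couples * comb(n_trials, k) * p**k * (1 - p) ** (n_trials - k)
    for k in range(n_trials + 1)
]

for k, e in enumerate(expected):
    print(k, f"{e:.2f}")
```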
Question 5
5. A Policy Unit wished to find out whether attitudes to the European Union varied with age. It conducted a survey asking 200 individuals to which of three age groups they belonged and whether they regarded themselves as generally pro-Europe or Eurosceptic. The results are shown in the table below.
| | Pro-Europe | Eurosceptic |
|---|---|---|
| \(18 - 34\) years | 43 | 21 |
| \(35 - 54\) years | 30 | 36 |
| 55 years or over | 27 | 43 |
  1. Stating your hypotheses clearly, test at the \(5 \%\) level of significance whether attitudes to Europe are associated with age.
    (11 marks)
    The survey also asked people if they voted at the last election. When the above test was repeated using only the results from those who had voted, a value of 4.872 was calculated for \(\sum \frac { ( O - E ) ^ { 2 } } { E }\). No classes were combined.
  2. Find if this value leads to a different result.
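A minimal sketch of the contingency-table calculation follows, assuming the usual expected frequency (row total × column total ÷ grand total) and \((r-1)(c-1)\) degrees of freedom. The names are illustrative. The critical value is computed from the fact that, for 2 degrees of freedom, the upper-tail probability of \(\chi^2\) is \(e^{-x/2}\); in an exam it would be read from tables.

```python
import math

# Observed frequencies: rows are age groups, columns are (Pro-Europe, Eurosceptic).
observed = [[43, 21], [30, 36], [27, 43]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi_sq = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
dof = (len(row_totals) - 1) * (len(col_totals) - 1)

# For 2 degrees of freedom only: P(chi^2 > x) = exp(-x / 2), so the 5% point is -2 ln 0.05.
critical = -2 * math.log(0.05)

print(f"chi-squared = {chi_sq:.3f}, df = {dof}, 5% critical value = {critical:.3f}")
print("voters-only statistic 4.872 exceeds critical value:", 4.872 > critical)
```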
Question 6
6. Four swimmers, \(A , B , C\) and \(D\), are to be used in a \(4 \times 100\) metres freestyle relay. The time for each swimmer to complete a leg follows a normal distribution. The mean and standard deviation, in seconds, of the time for each swimmer to complete a leg and the order in which they are to swim are shown in the table below.
| | mean | standard deviation |
|---|---|---|
| \(1^{\text{st}}\) leg \(- A\) | 63.1 | 1.2 |
| \(2^{\text{nd}}\) leg \(- B\) | 65.7 | 1.5 |
| \(3^{\text{rd}}\) leg \(- C\) | 65.4 | 1.8 |
| \(4^{\text{th}}\) leg \(- D\) | 62.5 | 0.9 |
  1. Find the probability that the total time for the first two legs is less than the total time for the last two.
    (6 marks)
    The total time for another team to complete this relay is normally distributed with a mean of 259.0 seconds and a standard deviation of 3.4 seconds. The two teams are to compete over four races.
  2. Find the probability that the first team wins all four races, assuming that the teams' performances are not affected by previous results.
    (8 marks)
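Both probabilities can be checked with the sketch below. It assumes the four leg times are independent of each other and of the other team's time, so that means add and variances add when normal variables are combined. Python's `statistics.NormalDist` supports `+` and `-` between independent normal distributions; the variable names are illustrative.

```python
from statistics import NormalDist

# Leg times in seconds: NormalDist(mean, standard deviation).
A = NormalDist(63.1, 1.2)
B = NormalDist(65.7, 1.5)
C = NormalDist(65.4, 1.8)
D = NormalDist(62.5, 0.9)

# Part 1: P(A + B < C + D) = P((A + B) - (C + D) < 0).
first_minus_last = (A + B) - (C + D)
p_first_two_faster = first_minus_last.cdf(0)

# Part 2: the first team wins a race when its total time is below the other team's.
team_one = A + B + C + D
team_two = NormalDist(259.0, 3.4)
p_win_one_race = (team_one - team_two).cdf(0)

# Four independent races, each won with the same probability.
p_win_all_four = p_win_one_race**4

print(f"P(first two legs faster) = {p_first_two_faster:.4f}")
print(f"P(win one race) = {p_win_one_race:.4f}")
print(f"P(win all four) = {p_win_all_four:.4f}")
```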
Question 7
7. A telephone company believes that, for young people, the average length of a telephone call on a land line is longer than on a mobile, due to the difference in price. The company collected data on the time, \(t\) minutes, of 500 calls made by young people on mobiles and the data is summarised by $$\Sigma t = 7335 , \quad \Sigma t ^ { 2 } = 172040 .$$
  1. Calculate unbiased estimates of the mean and variance of \(t\).
    For 200 calls made on land lines by the same young people, unbiased estimates of the mean and variance of the call length were 15.9 minutes and 108.5 minutes\(^{2}\) respectively.
  2. Stating your hypotheses clearly, test at the \(5 \%\) level whether or not there is evidence that longer calls are made on land lines than on mobiles.
    (9 marks)
  3. Explain the importance of the central limit theorem in carrying out the test in part 2.
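The estimates and the test statistic can be checked with a sketch like the following. It assumes the usual unbiased variance estimate \(s^2 = \frac{\Sigma t^2 - n \bar{t}^2}{n - 1}\) and a one-tailed two-sample \(z\)-test with statistic \(z = (\bar{x}_L - \bar{x}_M) / \sqrt{s_L^2 / n_L + s_M^2 / n_M}\). The names are illustrative, and the critical value is computed rather than read from tables.

```python
import math
from statistics import NormalDist

# Mobile calls: summary data from the question.
n_mobile = 500
sum_t = 7335
sum_t_sq = 172040

mean_mobile = sum_t / n_mobile
var_mobile = (sum_t_sq - n_mobile * mean_mobile**2) / (n_mobile - 1)

# Land-line calls: estimates quoted in the question.
n_land = 200
mean_land = 15.9
var_land = 108.5

# Test statistic for H0: equal means against H1: land-line mean is larger.
z = (mean_land - mean_mobile) / math.sqrt(var_land / n_land + var_mobile / n_mobile)

# One-tailed 5% critical value of the standard normal distribution.
critical = NormalDist().inv_cdf(0.95)

print(f"mean = {mean_mobile:.2f}, variance = {var_mobile:.2f}")
print(f"z = {z:.3f}, critical value = {critical:.4f}, reject H0: {z > critical}")
```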