Sampling method explanation

A question belongs to this type if and only if it asks the candidate to describe, justify, or critique a sampling method (systematic, stratified, quota, simple random, etc.).

8 questions

CAIE S2 2020 June Q7
7 A market researcher is investigating the length of time that customers spend at an information desk. He plans to choose a sample of 50 customers on a particular day.
  1. He considers choosing the first 50 customers who visit the information desk. Explain why this method is unsuitable.
    The actual lengths of time, in minutes, that customers spend at the information desk may be assumed to have mean \(\mu\) and variance 4.8. The researcher knows that in the past the value of \(\mu\) was 6.0. He wishes to test, at the \(2 \%\) significance level, whether this is still true. He chooses a random sample of 50 customers and notes how long they each spend at the information desk.
  2. State the probability of making a Type I error and explain what is meant by a Type I error in this context.
  3. Given that the mean time spent at the information desk by the 50 customers is 6.8 minutes, carry out the test.
  4. Give a reason why it was necessary to use the Central Limit theorem in your answer to part (c).
CAIE S2 2015 June Q5
5 The mean breaking strength of cables made at a certain factory is supposed to be 5 tonnes. The quality control department wishes to test whether the mean breaking strength of cables made by a particular machine is actually less than it should be. They take a random sample of 60 cables. For each cable they find the breaking strength by gradually increasing the tension in the cable and noting the tension when the cable breaks.
  1. Give a reason why it is necessary to take a sample rather than testing all the cables produced by the machine.
  2. The mean breaking strength of the 60 cables in the sample is found to be 4.95 tonnes. Given that the population standard deviation of breaking strengths is 0.15 tonnes, test at the \(1 \%\) significance level whether the population mean breaking strength is less than it should be.
  3. Explain whether it was necessary to use the Central Limit theorem in the solution to part (ii).
CAIE S2 2010 November Q7
7
  1. Give a reason why sampling would be required in order to reach a conclusion about
    1. the mean height of adult males in England,
    2. the mean weight that can be supported by a single cable of a certain type without the cable breaking.
  2. The weights, in kg, of sacks of potatoes are represented by the random variable \(X\) with mean \(\mu\) and standard deviation \(\sigma\). The weights of a random sample of 500 sacks of potatoes are found and the results are summarised below. $$n = 500 , \quad \Sigma x = 9850 , \quad \Sigma x ^ { 2 } = 194125 .$$
    1. Calculate unbiased estimates of \(\mu\) and \(\sigma ^ { 2 }\).
    2. A further random sample of 60 sacks of potatoes is taken. Using your values from part (b) (i), find the probability that the mean weight of this sample exceeds 19.73 kg.
    3. Explain whether it was necessary to use the Central Limit Theorem in your calculation in part (b) (ii).
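As a hedged worked sketch of part (b) (i), assuming the usual unbiased estimators (the sample mean for \(\mu\) and the divisor \(n - 1\) form for \(\sigma ^ { 2 }\)), the summary values given in the question lead to:

$$\bar { x } = \frac { \Sigma x } { n } = \frac { 9850 } { 500 } = 19.7 , \qquad s ^ { 2 } = \frac { 1 } { n - 1 } \left( \Sigma x ^ { 2 } - \frac { ( \Sigma x ) ^ { 2 } } { n } \right) = \frac { 1 } { 499 } \left( 194125 - \frac { 9850 ^ { 2 } } { 500 } \right) = \frac { 80 } { 499 } \approx 0.160 .$$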
Edexcel S3 2006 June Q1
  1. Describe one advantage and one disadvantage of
    1. quota sampling,
    2. simple random sampling.
  2. A report on the health and nutrition of a population stated that the mean height of three-year old children is 90 cm and the standard deviation is 5 cm. A sample of 100 three-year old children was chosen from the population.
Edexcel S3 Q6
6. As part of her statistics project, Deepa decided to estimate the amount of time A-level students at her school spend on private study each week. She took a random sample of students from those studying Arts subjects, Science subjects and a mixture of Arts and Science subjects. Each student kept a record of the time they spent on private study during the third week of term.
  1. Write down the name of the sampling method used by Deepa.
  2. Give a reason for using this method and give one advantage this method has over simple random sampling. The results Deepa obtained are summarised in the table below.
    \begin{tabular}{lcc}
    Type of student & Sample size & Mean number of hours \\
    Arts & 12 & 12.6 \\
    Science & 12 & 14.1 \\
    Mixture & 8 & 10.2
    \end{tabular}
  3. Show that an estimate of the mean time spent on private study by A level students at Deepa’s school, based on these 32 students is 12.56, to 2 decimal places.
    (3 marks) The standard deviation of the time spent on private study by students at the school was 2.48 hours.
  4. Assuming that the number of hours spent on private study is normally distributed, find a 95\% confidence interval for the mean time spent on private study by A level students at Deepa’s school. A member of staff at the school suggested that A level students should spend on average 12 hours each week on private study.
  5. Comment on this suggestion in the light of your interval.
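As a hedged sketch of the arithmetic behind part 3, assuming the estimate is the mean of all 32 observations (that is, the three group means weighted by their sample sizes from the table), the calculation is:

$$\bar { x } = \frac { 12 ( 12.6 ) + 12 ( 14.1 ) + 8 ( 10.2 ) } { 12 + 12 + 8 } = \frac { 151.2 + 169.2 + 81.6 } { 32 } = \frac { 402 } { 32 } = 12.5625 \approx 12.56 .$$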
Edexcel S3 Q1
  1. A researcher wishes to take a sample of size 9, without replacement, from a list of 72 people involved in the trial of a new computer keyboard. She numbers the people from 01 to 72 and uses the table of random numbers given in the formula book. She starts with the left-hand side of the sixth row of the table and works across the row. The first two numbers she writes down are 56 and 32.
    1. Find the other six numbers in the sample.
    2. Give one advantage and one disadvantage of using random numbers when taking a sample.
      (2 marks)
    3. The length of time that registered customers spend on each visit to a supermarket's website is normally distributed with a mean of 28.5 minutes and a standard deviation of 7.2 minutes.
    Eight visitors to the site are selected at random and the length of time, \(T\) minutes, that each stays is recorded.
  2. Write down the distribution of \(\bar { T }\), the mean time spent at the site by these eight visitors.
    (2 marks)
  3. Find \(\mathrm { P } ( 25 < \bar { T } < 30 )\).
    (4 marks)
Edexcel S3 Q1
  1. A charity has 240 volunteers and wishes to consult a sample of them of size 20.
    1. Explain briefly how a systematic sample can be taken using random numbers.
    2. Give one advantage and one disadvantage of using systematic sampling compared with simple random sampling.
      (2 marks)
    3. A teacher gives each student in his class a list of 30 numbers. All the numbers have been generated at random by a computer from a normal distribution with a fixed mean and variance. The teacher tells the class that the variance of the distribution is 25 and asks each of them to calculate a \(95 \%\) confidence interval based on their list of numbers.
    The sum of the numbers given to one student is 1419 .
  2. Find the confidence interval that should be obtained by this student. Assuming that all the students calculate their confidence intervals correctly,
  3. state the proportion of the students you would expect to have a confidence interval that includes the true mean of the distribution,
    (1 mark)
  4. explain why the probability of any one student's confidence interval including the true mean is not 0.95
    (1 mark)
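The systematic sample asked for in the charity question above (20 volunteers from 240) can be illustrated with a minimal Python sketch. It assumes the volunteers are numbered 1 to 240 in a list and that the population size is an exact multiple of the sample size; the function name `systematic_sample` is hypothetical and used only for illustration.

```python
import random


def systematic_sample(population_size, sample_size, rng=None):
    """Return the 1-based list positions chosen by systematic sampling.

    Assumes population_size is an exact multiple of sample_size.
    """
    rng = rng or random.Random()
    if population_size % sample_size != 0:
        raise ValueError("population size must be a multiple of sample size")
    step = population_size // sample_size   # sampling interval, 240 / 20 = 12
    start = rng.randint(1, step)            # random start among the first 12
    # Take the starting volunteer and every 12th volunteer after it.
    return [start + i * step for i in range(sample_size)]


# A seeded generator stands in for the random number table.
sample = systematic_sample(240, 20, random.Random(1))
print(sample)
```

Only the starting position is random; every later choice is fixed by the interval of 12, which is why the method is quick but can be biased if the list has a pattern with the same period.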
Edexcel S3 Q1
  1. A hotel has 160 rooms of which 20 are classified as De-luxe, 40 Premier and 100 as Standard. The manager wants to obtain information about room usage in the hotel by taking a \(10 \%\) sample of the rooms.
    1. Suggest a suitable sampling method.
    2. Explain in detail how the manager should obtain the sample.
    3. A random sample of 100 classical CDs produced by a record company had a mean playing time of 70.6 minutes and a standard deviation of 9.1 minutes. An independent random sample of 120 CDs produced by a different company had a mean playing time of 67.2 minutes with a standard deviation of 8.4 minutes.
    4. Using a \(1 \%\) level of significance, test whether or not there is a difference in the mean playing times of the CDs produced by these two companies. State your hypotheses clearly.
    5. State an assumption you made in carrying out the test in part (a).
    6. The weights of a group of males are normally distributed with mean 80 kg and standard deviation 2.6 kg. A random sample of 10 of these males is selected.
    7. Write down the distribution of \(\bar { M }\), the mean weight, in kg, of this sample.
    8. Find \(\mathrm { P } ( \bar { M } < 78.5 )\).
    The weights of a group of females are normally distributed with mean 59 kg and standard deviation 1.9 kg . A random sample of 6 of the males and 4 of the females enters a lift that can carry a maximum load of 730 kg .
  2. Find the probability that the maximum load will be exceeded when these 10 people enter the lift.
    4. At the end of a season an athletics coach graded a random sample of ten athletes according to their performances throughout the season and their dedication to training. The results, expressed as percentages, are shown in the table below.
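    \begin{tabular}{ccc}
    Athlete & Performance & Dedication \\
    \(A\) & 86 & 72 \\
    \(B\) & 60 & 69 \\
    \(C\) & 78 & 59 \\
    \(D\) & 56 & 68 \\
    \(E\) & 80 & 80 \\
    \(F\) & 66 & 84 \\
    \(G\) & 31 & 65 \\
    \(H\) & 59 & 55 \\
    \(I\) & 73 & 79 \\
    \(J\) & 49 & 53
    \end{tabular}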
  3. Calculate the Spearman rank correlation coefficient between performance and dedication.
  4. Stating clearly your hypotheses and using a \(10 \%\) level of significance, interpret your rank correlation coefficient.
  5. Give a reason to support the use of the rank correlation coefficient rather than the product moment correlation coefficient with these data.
    5. The manager of a leisure centre collected data on the usage of the facilities in the centre by its members. A random sample from her records is summarised below.
    \begin{tabular}{lcc}
    Facility & Male & Female \\
    Pool & 40 & 68 \\
    Jacuzzi & 26 & 33 \\
    Gym & 52 & 31
    \end{tabular}
    Making your method clear, test whether or not there is any evidence of an association between gender and use of the club facilities. State your hypotheses clearly and use a \(5 \%\) level of significance.
    6. Data were collected on the number of female puppies born in 200 litters of size 8. It was decided to test whether or not a binomial model with parameters \(n = 8\) and \(p = 0.5\) is a suitable model for these data. The following table shows the observed frequencies and the expected frequencies, to 2 decimal places, obtained in order to carry out this test.
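    \begin{tabular}{ccc}
    Number of females & Observed number of litters & Expected number of litters \\
    0 & 1 & 0.78 \\
    1 & 9 & 6.25 \\
    2 & 27 & 21.88 \\
    3 & 46 & \(R\) \\
    4 & 49 & \(S\) \\
    5 & 35 & \(T\) \\
    6 & 26 & 21.88 \\
    7 & 5 & 6.25 \\
    8 & 2 & 0.78
    \end{tabular}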
  6. Find the values of \(R , S\) and \(T\).
  7. Carry out the test to determine whether or not this binomial model is a suitable one. State your hypotheses clearly and use a \(5 \%\) level of significance. An alternative test might have involved estimating \(p\) rather than assuming \(p = 0.5\).
  8. Explain how this would have affected the test.
    7. The weights of tubs of margarine are known to be normally distributed. A random sample of 10 tubs of margarine were weighed, to the nearest gram, and the results were as follows. $$\begin{array} { l l l l l l l l l l } 498 & 502 & 500 & 496 & 509 & 504 & 511 & 497 & 506 & 499 \end{array}$$
  9. Find unbiased estimates of the mean and the variance of the population from which this sample was taken. Given that the population standard deviation is 5.0 g,
  10. estimate limits, to 2 decimal places, between which \(90 \%\) of the weights of the tubs lie,
  11. find a \(95 \%\) confidence interval for the mean weight of the tubs. A second random sample of 15 tubs was found to have a mean weight of 501.9 g .
  12. Stating your hypotheses clearly and using a \(1 \%\) level of significance, test whether or not the mean weight of these tubs is greater than 500 g.
    Edexcel GCE Statistics S3, Advanced/Advanced Subsidiary, paper reference 6685, Thursday 5 June 2003 (Morning), 1 hour 30 minutes.
    1. Explain how to obtain a sample from a population using
    2. stratified sampling,
    3. quota sampling.
    Give one advantage and one disadvantage of each sampling method.
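As an illustration of the stratified sampling procedure asked about above, the following minimal Python sketch uses the hotel figures from the earlier question (20 De-luxe, 40 Premier and 100 Standard rooms, with a 10% sample of 16 rooms). It assumes proportional allocation and that rooms are numbered from 1 within each stratum; the function names `proportional_allocation` and `stratified_sample` are hypothetical.

```python
import random


def proportional_allocation(strata_sizes, sample_size):
    """Share the sample between strata in proportion to stratum size."""
    total = sum(strata_sizes.values())
    return {name: round(size * sample_size / total)
            for name, size in strata_sizes.items()}


def stratified_sample(strata_sizes, sample_size, rng=None):
    """Take a simple random sample, without replacement, inside each stratum."""
    rng = rng or random.Random()
    allocation = proportional_allocation(strata_sizes, sample_size)
    # Rooms in each stratum are assumed to be numbered 1..size.
    return {name: sorted(rng.sample(range(1, strata_sizes[name] + 1), k))
            for name, k in allocation.items()}


rooms = {"De-luxe": 20, "Premier": 40, "Standard": 100}
allocation = proportional_allocation(rooms, 16)
chosen = stratified_sample(rooms, 16, random.Random(1))
print(allocation)
print(chosen)
```

With these figures the allocation is 2 De-luxe, 4 Premier and 10 Standard rooms. Quota sampling would use the same allocation but leave the choice of rooms within each group to the interviewer rather than to random numbers.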