Edexcel S3 (Statistics 3)

Question 1
  1. A hotel has 160 rooms of which 20 are classified as De-luxe, 40 Premier and 100 as Standard. The manager wants to obtain information about room usage in the hotel by taking a \(10 \%\) sample of the rooms.
    (a) Suggest a suitable sampling method.
    (b) Explain in detail how the manager should obtain the sample.
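One way to picture a proportional (stratified) allocation for this hotel is the short sketch below. It assumes, purely for illustration, that the rooms in each class are numbered from 1 upwards and that a simple random sample is drawn within each class; the function names `allocate` and `draw_sample` are hypothetical and the question does not prescribe this method.

```python
import random

# Stratum sizes come from the question; the room numbering is an assumption.
strata = {"De-luxe": 20, "Premier": 40, "Standard": 100}
sampling_fraction = 0.10


def allocate(strata, fraction):
    """Number of rooms to take from each stratum for a proportional sample."""
    return {name: round(size * fraction) for name, size in strata.items()}


def draw_sample(strata, fraction, seed=None):
    """Simple random sample, without replacement, inside each stratum."""
    rng = random.Random(seed)
    allocation = allocate(strata, fraction)
    return {
        name: sorted(rng.sample(range(1, strata[name] + 1), k))
        for name, k in allocation.items()
    }


allocation = allocate(strata, sampling_fraction)
sample = draw_sample(strata, sampling_fraction, seed=1)
print(allocation)  # rooms per class
print(sample)      # room numbers chosen in each class
```

The allocation keeps each class in the same proportion as in the hotel, so the 16 sampled rooms mirror the 20 : 40 : 100 split.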
  2. A random sample of 100 classical CDs produced by a record company had a mean playing time of 70.6 minutes and a standard deviation of 9.1 minutes. An independent random sample of 120 CDs produced by a different company had a mean playing time of 67.2 minutes with a standard deviation of 8.4 minutes.
    (a) Using a \(1 \%\) level of significance, test whether or not there is a difference in the mean playing times of the CDs produced by these two companies. State your hypotheses clearly.
    (b) State an assumption you made in carrying out the test in part (a).
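The arithmetic behind a two-sample test of this kind can be sketched as follows. The sketch assumes a two-tailed large-sample z-test in which each sample standard deviation stands in for the unknown population value; the helper name `two_sample_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for H0: mu1 = mu2, using sample variances as estimates."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


z = two_sample_z(70.6, 9.1, 100, 67.2, 8.4, 120)

# Two-tailed test at the 1% level: 0.5% in each tail.
critical = NormalDist().inv_cdf(1 - 0.01 / 2)
reject_h0 = abs(z) > critical

print(round(z, 3), round(critical, 4), reject_h0)
```

Under these assumptions the statistic is compared with the upper 0.5% point of the standard normal distribution, and `reject_h0` records whether it lies in the critical region.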
  3. The weights of a group of males are normally distributed with mean 80 kg and standard deviation 2.6 kg. A random sample of 10 of these males is selected.
    (a) Write down the distribution of \(\bar { M }\), the mean weight, in kg, of this sample.
    (b) Find \(\mathrm { P } ( \bar { M } < 78.5 )\).
    The weights of a group of females are normally distributed with mean 59 kg and standard deviation 1.9 kg. A random sample of 6 of the males and 4 of the females enters a lift that can carry a maximum load of 730 kg.
    (c) Find the probability that the maximum load will be exceeded when these 10 people enter the lift.
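A numerical check of the two distributions involved might look like the sketch below. It assumes that all ten weights are independent, so that variances add; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Sample mean of 10 males: same mean, variance divided by the sample size.
sample_mean = NormalDist(mu=80, sigma=2.6 / sqrt(10))
p_mean_below = sample_mean.cdf(78.5)

# Total weight of 6 males and 4 females, assuming independent weights.
mean_total = 6 * 80 + 4 * 59
var_total = 6 * 2.6**2 + 4 * 1.9**2
total_load = NormalDist(mu=mean_total, sigma=sqrt(var_total))
p_overload = 1 - total_load.cdf(730)

print(round(p_mean_below, 4), mean_total, round(var_total, 2), round(p_overload, 4))
```

Note that the total is a sum of ten separate weights, so its variance is \(6 \times 2.6^2 + 4 \times 1.9^2\) rather than \(6^2 \times 2.6^2 + 4^2 \times 1.9^2\).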
  4. At the end of a season an athletics coach graded a random sample of ten athletes according to their performances throughout the season and their dedication to training. The results, expressed as percentages, are shown in the table below.

    | Athlete | Performance | Dedication |
    |---|---|---|
    | \(A\) | 86 | 72 |
    | \(B\) | 60 | 69 |
    | \(C\) | 78 | 59 |
    | \(D\) | 56 | 68 |
    | \(E\) | 80 | 80 |
    | \(F\) | 66 | 84 |
    | \(G\) | 31 | 65 |
    | \(H\) | 59 | 55 |
    | \(I\) | 73 | 79 |
    | \(J\) | 49 | 53 |

    (a) Calculate the Spearman rank correlation coefficient between performance and dedication.
    (b) Stating clearly your hypotheses and using a \(10 \%\) level of significance, interpret your rank correlation coefficient.
    (c) Give a reason to support the use of the rank correlation coefficient rather than the product moment correlation coefficient with these data.
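The ranking step and the formula \(r_s = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)}\) can be sketched as below. The sketch assumes there are no tied scores (true for these data) and ranks the highest score as 1; the helper names `ranks` and `spearman` are hypothetical.

```python
performance = {"A": 86, "B": 60, "C": 78, "D": 56, "E": 80,
               "F": 66, "G": 31, "H": 59, "I": 73, "J": 49}
dedication = {"A": 72, "B": 69, "C": 59, "D": 68, "E": 80,
              "F": 84, "G": 65, "H": 55, "I": 79, "J": 53}


def ranks(scores):
    """Rank 1 for the highest score; assumes no ties."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


def spearman(x, y):
    """Spearman's rank correlation via the sum of squared rank differences."""
    rank_x, rank_y = ranks(x), ranks(y)
    n = len(x)
    sum_d2 = sum((rank_x[name] - rank_y[name]) ** 2 for name in x)
    return 1 - 6 * sum_d2 / (n * (n * n - 1)), sum_d2


r_s, sum_d2 = spearman(performance, dedication)
print(sum_d2, round(r_s, 4))
```

The coefficient is then compared with the tabulated critical value for \(n = 10\) at the chosen significance level.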
  5. The manager of a leisure centre collected data on the usage of the facilities in the centre by its members. A random sample from her records is summarised below.

    | Facility | Male | Female |
    |---|---|---|
    | Pool | 40 | 68 |
    | Jacuzzi | 26 | 33 |
    | Gym | 52 | 31 |

    Making your method clear, test whether or not there is any evidence of an association between gender and use of the club facilities. State your hypotheses clearly and use a \(5 \%\) level of significance.
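The expected frequencies and the test statistic for a contingency table of this shape can be sketched as follows. This is an illustrative calculation, assuming the usual \(\sum (O - E)^2 / E\) statistic with expected frequency (row total × column total) / grand total; the function name `chi_squared` is hypothetical.

```python
# Rows: Pool, Jacuzzi, Gym. Columns: Male, Female.
observed = [[40, 68], [26, 33], [52, 31]]


def chi_squared(observed):
    """Return the chi-squared statistic, degrees of freedom and expected table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


statistic, dof, expected = chi_squared(observed)
print(round(statistic, 3), dof)
for row in expected:
    print([round(e, 2) for e in row])
```

With 2 degrees of freedom the statistic is compared with the tabulated 5% critical value \(\chi^2_2(0.05) = 5.991\).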
  6. Data were collected on the number of female puppies born in 200 litters of size 8. It was decided to test whether or not a binomial model with parameters \(n = 8\) and \(p = 0.5\) is a suitable model for these data. The following table shows the observed frequencies and the expected frequencies, to 2 decimal places, obtained in order to carry out this test.

    | Number of females | Observed number of litters | Expected number of litters |
    |---|---|---|
    | 0 | 1 | 0.78 |
    | 1 | 9 | 6.25 |
    | 2 | 27 | 21.88 |
    | 3 | 46 | \(R\) |
    | 4 | 49 | \(S\) |
    | 5 | 35 | \(T\) |
    | 6 | 26 | 21.88 |
    | 7 | 5 | 6.25 |
    | 8 | 2 | 0.78 |

    (a) Find the values of \(R , S\) and \(T\).
    (b) Carry out the test to determine whether or not this binomial model is a suitable one. State your hypotheses clearly and use a \(5 \%\) level of significance.
    An alternative test might have involved estimating \(p\) rather than assuming \(p = 0.5\).
    (c) Explain how this would have affected the test.
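The expected frequencies in the table follow from \(200 \times \binom{8}{k} \, 0.5^{8}\). A minimal sketch of that calculation is given below; the function name `expected_binomial` is hypothetical.

```python
from math import comb


def expected_binomial(n, p, trials):
    """Expected frequencies for B(n, p) over the given number of litters."""
    return [trials * comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


expected = expected_binomial(8, 0.5, 200)
R, S, T = expected[3], expected[4], expected[5]

print([round(e, 2) for e in expected])
print(R, round(S, 2), T)
```

The end cells have expected frequencies below 5, so when the test is carried out they would normally be combined with their neighbours before the statistic is calculated.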
  7. The weights of tubs of margarine are known to be normally distributed. A random sample of 10 tubs of margarine were weighed, to the nearest gram, and the results were as follows. $$\begin{array} { l l l l l l l l l l } 498 & 502 & 500 & 496 & 509 & 504 & 511 & 497 & 506 & 499 \end{array}$$
    (a) Find unbiased estimates of the mean and the variance of the population from which this sample was taken.
    Given that the population standard deviation is 5.0 g,
    (b) estimate limits, to 2 decimal places, between which \(90 \%\) of the weights of the tubs lie,
    (c) find a \(95 \%\) confidence interval for the mean weight of the tubs.
    A second random sample of 15 tubs was found to have a mean weight of 501.9 g.
    (d) Stating your hypotheses clearly and using a \(1 \%\) level of significance, test whether or not the mean weight of these tubs is greater than 500 g.

\section*{END}

\section*{Edexcel GCE Statistics S3}
Advanced/Advanced Subsidiary
Paper Reference(s): 6685
Thursday 5 June 2003 - Morning
Time: 1 hour 30 minutes
Items included with question papers: Nil
Answer Book (AB16)
Graph Paper (ASG2)
Mathematical Formulae (Lilac)
Candidates may use any calculator EXCEPT those with the facility for symbolic algebra, differentiation and/or integration. Thus candidates may NOT use calculators such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP 48G.
In the boxes on the answer book, write the name of the examining body (Edexcel), your centre number, candidate number, the unit title (Statistics S3), the paper reference (6685), your surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.
Full marks may be obtained for answers to ALL questions.
This paper has seven questions. You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers without working may gain no credit.
  1. Explain how to obtain a sample from a population using
    (a) stratified sampling,
    (b) quota sampling.
    Give one advantage and one disadvantage of each sampling method.
  2. A random sample of 30 apples was taken from a batch. The mean weight of the sample was 124 g with standard deviation 20 g.
    (a) Find a \(99 \%\) confidence interval for the mean weight \(\mu\) grams of the population of apples. Write down any assumptions you made in your calculations.
    Given that the actual value of \(\mu\) is 140,
    (b) state, with a reason, what you can conclude about the sample of 30 apples.
  3. Given the random variables \(X \sim \mathrm {~N} ( 20,5 )\) and \(Y \sim \mathrm {~N} ( 10,4 )\) where \(X\) and \(Y\) are independent, find
    (a) \(\mathrm { E } ( X - Y )\),
    (b) \(\operatorname { Var } ( X - Y )\),
    (c) \(\mathrm { P } ( 13 < X - Y < 16 )\).
    4. A new drug to treat the common cold was used with a randomly selected group of 100 volunteers. Each was given the drug and their health was monitored to see if they caught a cold. A randomly selected control group of 100 volunteers was treated with a dummy pill. The results are shown in the table below.
  10. Write down a suitable model for \(X\).
  11. Test, at the \(1 \%\) level of significance, the suitability of your model for these data.
  12. Explain how the test would have been modified if it had not been assumed that the dice were fair.
  7. The random variable \(D\) is defined as $$D = A - 3 B + 4 C$$ where \(A \sim \mathrm {~N} \left( 5,2 ^ { 2 } \right) , B \sim \mathrm {~N} \left( 7,3 ^ { 2 } \right)\) and \(C \sim \mathrm {~N} \left( 9,4 ^ { 2 } \right)\), and \(A , B\) and \(C\) are independent.
    (a) Find \(\mathrm { P } ( D < 44 )\).
    The random variables \(B _ { 1 } , B _ { 2 }\) and \(B _ { 3 }\) are independent and each has the same distribution as \(B\). The random variable \(X\) is defined as $$X = A - \sum _ { i = 1 } ^ { 3 } B _ { i } + 4 C$$
    (b) Find \(\mathrm { P } ( X > 0 )\).

\section*{END}

\section*{6685/01 6691/01 Edexcel GCE}
\section*{Thursday 9 June 2005 - Morning}
Materials required for examination: Mathematical Formulae (Lilac), Graph Paper (ASG2)
Items included with question papers: Nil
Candidates may use any calculator EXCEPT those with the facility for symbolic algebra, differentiation and/or integration. Thus candidates may NOT use calculators such as the Texas Instruments TI 89, TI 92, Casio CFX 9970G, Hewlett Packard HP 48G.
In the boxes on the answer book, write the name of the examining body (Edexcel), your centre number, candidate number, the unit title (Statistics S3), the paper reference (6685), your surname, other name and signature.
Values from the statistical tables should be quoted in full. When a calculator is used, the answer should be given to an appropriate degree of accuracy. A booklet 'Mathematical Formulae and Statistical Tables' is provided.
Full marks may be obtained for answers to ALL questions.
This paper has seven questions.
The total mark for this paper is 75.
You must ensure that your answers to parts of questions are clearly labelled.
You must show sufficient working to make your methods clear to the Examiner. Answers without working may gain no credit.
    1. A researcher carried out a survey of three treatments for a fruit tree disease. The contingency table below shows the results of a survey of a random sample of 60 diseased trees.
    Using a \(5 \%\) significance level, test whether or not there is an association between gender and acceptance or rejection of an annual flu injection. State your hypotheses clearly.
    5. Upon entering a school, a random sample of eight girls and an independent random sample of eighty boys were given the same examination in mathematics. The girls and boys were then taught in separate classes. After one year, they were all given another common examination in mathematics. The means and standard deviations of the boys' and the girls' marks are shown in the table.
  15. Find, to 3 decimal places, the Spearman rank correlation coefficient between the distance of the shop from the tourist attraction and the price of an ice cream.
  16. Stating your hypotheses clearly and using a \(5 \%\) one-tailed test, interpret your rank correlation coefficient.
  5. The workers in a large office block use a lift that can carry a maximum load of 1090 kg. The weights of the male workers are normally distributed with mean 78.5 kg and standard deviation 12.6 kg. The weights of the female workers are normally distributed with mean 62.0 kg and standard deviation 9.8 kg. Random samples of 7 males and 8 females can enter the lift.
    (a) Find the mean and variance of the total weight of the 15 people that enter the lift.
    (b) Comment on any relationship you have assumed in part (a) between the two samples.
    (c) Find the probability that the maximum load of the lift will be exceeded by the total weight of the 15 people.
  6. A research worker studying colour preference and the age of a random sample of 50 children obtained the results shown below.

    | Age in years | Red | Blue | Totals |
    |---|---|---|---|
    | 4 | 12 | 6 | 18 |
    | 8 | 10 | 7 | 17 |
    | 12 | 6 | 9 | 15 |
    | Totals | 28 | 22 | 50 |

    Using a \(5 \%\) significance level, carry out a test to decide whether or not there is an association between age and colour preference. State your hypotheses clearly.
  7. A machine produces metal containers. The weights of the containers are normally distributed. A random sample of 10 containers from the production line was weighed, to the nearest 0.1 kg, and gave the following results $$\begin{array} { l l l l l } 49.7 , & 50.3 , & 51.0 , & 49.5 , & 49.9 , \\ 50.1 , & 50.2 , & 50.0 , & 49.6 , & 49.7 . \end{array}$$
    (a) Find unbiased estimates of the mean and variance of the weights of the population of metal containers.
    The machine is set to produce metal containers whose weights have a population standard deviation of 0.5 kg.
    (b) Estimate the limits between which \(95 \%\) of the weights of metal containers lie.
    (c) Determine the \(99 \%\) confidence interval for the mean weight of metal containers.
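The three calculations asked for in the metal containers question can be sketched as below. The sketch assumes the stated population standard deviation of 0.5 kg is used for both the 95% limits and the 99% confidence interval, and that the sample mean is used as the centre of the 95% limits; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

weights = [49.7, 50.3, 51.0, 49.5, 49.9, 50.1, 50.2, 50.0, 49.6, 49.7]

# Unbiased estimates: sample mean and variance with divisor n - 1.
x_bar = mean(weights)
s_squared = variance(weights)

sigma = 0.5  # population standard deviation given in the question

# Limits containing 95% of individual weights (no division by sqrt(n)).
z95 = NormalDist().inv_cdf(0.975)
limits_95 = (x_bar - z95 * sigma, x_bar + z95 * sigma)

# 99% confidence interval for the mean (standard error sigma / sqrt(n)).
z99 = NormalDist().inv_cdf(0.995)
half_width = z99 * sigma / sqrt(len(weights))
ci_99 = (x_bar - half_width, x_bar + half_width)

print(round(x_bar, 2), round(s_squared, 4))
print([round(v, 2) for v in limits_95], [round(v, 3) for v in ci_99])
```

The contrast between the last two results is the point of the question: limits for individual weights use \(\sigma\), whereas the interval for the mean uses \(\sigma / \sqrt{n}\).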
Question 8
8. Five coins were tossed 100 times and the number of heads recorded. The results are shown in the table below.
  1. Calculate Spearman's rank correlation coefficient for the marks awarded by the two judges. After the show, one competitor complained about the judges. She claimed that there was no positive correlation between their marks.
  2. Stating your hypotheses clearly, test whether or not this sample provides support for the competitor's claim. Use a \(5 \%\) level of significance.
    (4)
    2. The Director of Studies at a large college believed that students' grades in Mathematics were independent of their grades in English. She examined the results of a random group of candidates who had studied both subjects and she recorded the number of candidates in each of the 6 categories shown. Showing your working clearly, test, at the \(1 \%\) level of significance, whether or not there is an association between gender and the type of course taken. State your hypotheses clearly.
  3. The product moment correlation coefficient is denoted by \(r\) and Spearman's rank correlation coefficient is denoted by \(r _ { s }\).
    (a) Sketch separate scatter diagrams, with five points on each diagram, to show
      (i) \(r = 1\),
      (ii) \(r _ { s } = - 1\) but \(r > - 1\).
    Two judges rank seven collie dogs in a competition. The collie dogs are labelled \(A\) to \(G\) and the rankings are as follows.
    (b) Calculate Spearman's rank correlation coefficient for these data.
    (c) Stating your hypotheses clearly and using a one tailed test with a \(5 \%\) level of significance, interpret your rank correlation coefficient.
    (d) Give a reason to support the use of the rank correlation coefficient rather than the product moment correlation coefficient with these data.
    (1)
    4. A sample of size 8 is to be taken from a population that is normally distributed with mean 55 and standard deviation 3 . Find the probability that the sample mean will be greater than 57 .
    (5)
    5. The number of goals scored by a football team is recorded for 100 games. The results are summarised in Table 1 below.
  7. Calculate Spearman's rank correlation coefficient between \(b\) and \(s\).
  8. Stating your hypotheses clearly, test whether or not the data provides support for the researcher's claim. Use a \(1 \%\) level of significance.
    (4)
    5. A random sample of 100 people were asked if their finances were worse, the same or better than this time last year. The sample was split according to their annual income and the results are shown in the table below.
  9. Calculate the Spearman's rank correlation coefficient between \(h\) and \(c\). After collecting the data, the councillor thinks there is no correlation between hardship and the number of calls to the emergency services.
  10. Test, at the \(5 \%\) level of significance, the councillor's claim. State your hypotheses clearly.
    3. A factory manufactures batches of an electronic component. Each component is manufactured in one of three shifts. A component may have one of two types of defect, \(D _ { 1 }\) or \(D _ { 2 }\), at the end of the manufacturing process. A production manager believes that the type of defect is dependent upon the shift that manufactured the component. He examines 200 randomly selected defective components and classifies them by defect type and shift. The results are shown in the table below.
  11. Calculate Spearman's rank correlation coefficient for these data.
  12. Test, at the \(5 \%\) level of significance, whether there is agreement between the rankings awarded by each manager. State your hypotheses clearly. Manager \(Y\) later discovered he had miscopied his score for candidate \(D\) and it should be 54 .
  13. Without carrying out any further calculations, explain how you would calculate Spearman rank correlation in this case.
    (2)
    2. A lake contains 3 species of fish. There are estimated to be 1400 trout, 600 bass and 450 pike in the lake. A survey of the health of the fish in the lake is carried out and a sample of 30 fish is chosen.
  14. Give a reason why stratified random sampling cannot be used.
  15. State an appropriate sampling method for the survey.
  16. Give one advantage and one disadvantage of this sampling method.
  17. Explain how this sampling method could be used to select the sample of 30 fish. You must show your working.
    (4)
  3. (a) Explain what you understand by the Central Limit Theorem.
    A garage services hire cars on behalf of a hire company. The garage knows that the lifetime of the brake pads has a standard deviation of 5000 miles. The garage records the lifetimes, \(x\) miles, of the brake pads it has replaced. The garage takes a random sample of 100 brake pads and finds that \(\sum x = 1740000\).
    (b) Find a \(95 \%\) confidence interval for the mean lifetime of a brake pad.
    (c) Explain the relevance of the Central Limit Theorem in part (b).
    Brake pads are made to be changed every 20000 miles on average. The hire car company complains that the garage is changing the brake pads too soon.
    (d) Comment on the hire company's complaint. Give a reason for your answer.
    4. Two breeds of chicken are surveyed to measure their egg yield. The results are shown in the table below.
    (a) Find, to 3 decimal places, Spearman's rank correlation coefficient between the population and the number of council employees.
    (b) Use your value of Spearman's rank correlation coefficient to test for evidence of a positive correlation between the population and the number of council employees. Use a \(2.5 \%\) significance level. State your hypotheses clearly.
    It is suggested that a product moment correlation coefficient would be a more suitable calculation in this case. The product moment correlation coefficient for these data is 0.627 to 3 decimal places.
    (c) Use the value of the product moment correlation coefficient to test for evidence of a positive correlation between the population and the number of council employees. Use a \(2.5 \%\) significance level.
    (d) Interpret and comment on your results from part (b) and part (c).
    4. John thinks that a person's eye colour is related to their hair colour. He takes a random sample of 600 people and records their eye and hair colours. The results are shown in Table 1.
    Using a \(5 \%\) level of significance, test whether or not there is an association between cholesterol level and intake of saturated fats. State your hypotheses and show your working clearly.
    2. The table below shows the number of students per member of staff and the student satisfaction scores for 7 universities.
    (a) Calculate Spearman's rank correlation coefficient for these data.
    The journalist believes that car models with higher fuel efficiency will achieve higher sales.
    (b) Stating your hypotheses clearly, test whether or not the data support the journalist's belief. Use a \(5 \%\) level of significance.
    (c) State the assumption necessary for a product moment correlation coefficient to be valid in this case.
    (1)
    (d) The mean and median fuel efficiencies of the car models in the random sample are \(14.5 \mathrm {~km} /\) litre and \(15.65 \mathrm {~km} /\) litre respectively. Considering these statistics, as well as the distribution of the fuel efficiency data, state whether or not the data suggest that the assumption in part (c) might be true in this case. Give a reason for your answer.
    (No further calculations are required.)
    2. A survey asked a random sample of 200 people their age and the main use of their mobile phone. The results are shown in Table 1 below.
    Stating your hypotheses, test at the \(5 \%\) level of significance, whether or not there is evidence of an association between happiness and gender. Show your working clearly.
    4. The random variable \(A\) is defined as $$A = B + 4 C - 3 D$$ where \(B\), \(C\) and \(D\) are independent random variables with $$B \sim \mathrm {~N} \left( 6,2 ^ { 2 } \right) \quad C \sim \mathrm {~N} \left( 7,3 ^ { 2 } \right) \quad D \sim \mathrm {~N} \left( 4,1.5 ^ { 2 } \right)$$ Find \(\mathrm { P } ( A < 45 )\).
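For a linear combination of independent normal variables, the mean is \(\sum c_i \mu_i\) and the variance is \(\sum c_i^2 \sigma_i^2\). A minimal sketch applying this to \(A = B + 4C - 3D\) follows; the helper name `linear_combination` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def linear_combination(terms):
    """Distribution of sum(c * X) for independent normals.

    terms is a list of (coefficient, mean, standard deviation) triples.
    """
    mu = sum(c * m for c, m, _ in terms)
    var = sum(c**2 * s**2 for c, _, s in terms)
    return NormalDist(mu=mu, sigma=sqrt(var))


# A = B + 4C - 3D with B ~ N(6, 2^2), C ~ N(7, 3^2), D ~ N(4, 1.5^2)
A = linear_combination([(1, 6, 2), (4, 7, 3), (-3, 4, 1.5)])
p_below_45 = A.cdf(45)

print(A.mean, round(A.variance, 2), round(p_below_45, 4))
```

The coefficients are squared in the variance, so the \(-3D\) term adds \(9 \times 1.5^2\) rather than subtracting anything.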
    5. A research station is doing some work on the germination of a new variety of genetically modified wheat. They planted 120 rows containing 7 seeds in each row.
    The number of seeds germinating in each row was recorded. The results are as follows Starting with the top left-hand corner (319) and working across, the committee selects 50 random numbers. The first 2 suitable numbers are 241 and 278 . Numbers greater than 300 are ignored.
  29. Find the next two suitable numbers. When the club's committee looks at the members corresponding to their random numbers they find that only 1 female has been selected.
    The committee does not want to be accused of being biased towards males so considers using a systematic sample instead.
    1. Explain clearly how the committee could take a systematic sample.
    2. Explain why a systematic sample may not give a sample that represents the proportion of males and females in the club. The committee decides to use a stratified sample instead.
  30. Describe how to choose members for the stratified sample.
  31. Explain an advantage of using a stratified sample rather than a quota sample.
  2. The random variable \(X\) follows a continuous uniform distribution over the interval \([ \alpha - 3,2 \alpha + 3 ]\) where \(\alpha\) is a constant.
    The mean of a random sample of size \(n\) is denoted by \(\bar { X }\).
    (a) Show that \(\bar { X }\) is a biased estimator of \(\alpha\), and state the bias.
    Given that \(Y = k \bar { X }\) is an unbiased estimator for \(\alpha\),
    (b) find the value of \(k\).
    A random sample of 10 values of \(X\) is taken and the results are as follows $$\begin{array} { l l l l l l l l l l } 3 & 5 & 8 & 12 & 4 & 13 & 10 & 8 & 5 & 12 \end{array}$$
    (c) Hence estimate the maximum value of \(X\).
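The chain of reasoning in this question can be checked numerically. Since \(\mathrm{E}(X) = \tfrac{1}{2}\big((\alpha - 3) + (2\alpha + 3)\big) = \tfrac{3\alpha}{2}\), the sample mean overestimates \(\alpha\) by \(\tfrac{\alpha}{2}\), and scaling by \(k = \tfrac{2}{3}\) removes the bias. The sketch below uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

sample = [3, 5, 8, 12, 4, 13, 10, 8, 5, 12]

# E(X-bar) = 3*alpha/2, so Y = (2/3) * X-bar is unbiased for alpha.
k = Fraction(2, 3)

x_bar = Fraction(sum(sample), len(sample))
alpha_hat = k * x_bar

# The upper end of the interval is 2*alpha + 3.
max_estimate = 2 * alpha_hat + 3

print(x_bar, alpha_hat, max_estimate, float(max_estimate))
```

The estimate of the maximum is obtained by substituting the unbiased estimate of \(\alpha\) into \(2\alpha + 3\).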
    3. A grocer believes that the average weight of a grapefruit from farm \(A\) is greater than the average weight of a grapefruit from farm \(B\). The weights, in grams, of 80 grapefruit selected at random from farm \(A\) have a mean value of 532 g and a standard deviation, \(s _ { A }\), of 35 g. A random sample of 100 grapefruit from farm \(B\) have a mean weight of 520 g and a standard deviation, \(s _ { B }\), of 28 g. Stating your hypotheses clearly and using a \(1 \%\) level of significance, test whether or not the grocer's belief is supported by the data.
  4. In a survey 10 randomly selected men had their systolic blood pressure, \(x\), and weight, \(w\), measured. Their results are as follows:

    | Man | \(A\) | \(B\) | \(C\) | \(D\) | \(E\) | \(F\) | \(G\) | \(H\) | \(I\) | \(J\) |
    |---|---|---|---|---|---|---|---|---|---|---|
    | \(x\) | 123 | 128 | 137 | 143 | 149 | 153 | 154 | 159 | 162 | 168 |
    | \(w\) | 78 | 93 | 85 | 83 | 75 | 98 | 88 | 87 | 95 | 99 |

    (a) Calculate the value of Spearman's rank correlation coefficient between \(x\) and \(w\).
    (b) Stating your hypotheses clearly, test at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between systolic blood pressure and weight.
    The product moment correlation coefficient for these data is 0.5114.
    (c) Use the value of the product moment correlation coefficient to test, at the \(5 \%\) level of significance, whether or not there is evidence of a positive correlation between systolic blood pressure and weight.
    (d) Using your conclusions to part (b) and part (c), describe the relationship between systolic blood pressure and weight.
  5. A random sample of 200 people were asked which hot drink they preferred from tea, coffee and hot chocolate. The results are given below.

    | Gender | Tea | Coffee | Hot Chocolate | Total |
    |---|---|---|---|---|
    | Males | 57 | 26 | 11 | 94 |
    | Females | 42 | 47 | 17 | 106 |
    | Total | 99 | 73 | 28 | 200 |

    (a) Test, at the \(5 \%\) significance level, whether or not there is an association between type of drink preferred and gender. State your hypotheses and show your working clearly. You should state your expected frequencies to 2 decimal places.
    (b) State what difference using a \(0.5 \%\) significance level would make to your conclusion. Give a reason for your answer.
  6. Eight tasks were given to each of 125 randomly selected job applicants. The number of tasks failed by each applicant is recorded. The results are as follows:

    | Number of tasks failed by an applicant | 0 | 1 | 2 | 3 | 4 | 5 | 6 or more |
    |---|---|---|---|---|---|---|---|
    | Frequency | 2 | 21 | 45 | 42 | 12 | 3 | 0 |

    (a) Show that the probability of a randomly selected task, from this sample, being failed is 0.3.
    An employer believes that a binomial distribution might provide a good model for the number of tasks, out of 8, that an applicant fails. He uses a binomial distribution, with the estimated probability 0.3 of a task being failed. The calculated expected frequencies are as follows

    | Number of tasks failed by an applicant | 0 | 1 | 2 | 3 | 4 | 5 | 6 or more |
    |---|---|---|---|---|---|---|---|
    | Frequency | 7.21 | 24.71 | 37.06 | \(r\) | 17.02 | 5.83 | \(s\) |

    (b) Find the value of \(r\) and the value of \(s\) giving your answers to 2 decimal places.
    (c) Test, at the \(5 \%\) level of significance, whether or not a binomial distribution is a suitable model for these data. State your hypotheses and show your working clearly.
    The employer believes that all applicants have the same probability of failing each task.
    (d) Use your result from part (c) to comment on this belief.
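The estimate of the failure probability and the two missing expected frequencies can be sketched as follows. The sketch assumes the final cell ("6 or more") is obtained by subtracting the other expected frequencies from 125, which is one common convention; working from rounded table values instead can shift the last decimal place.

```python
from math import comb

# Observed frequencies for 0, 1, 2, 3, 4, 5 and "6 or more" tasks failed.
observed = [2, 21, 45, 42, 12, 3, 0]
n_tasks = 8

applicants = sum(observed)
# The "6 or more" cell has frequency 0, so it contributes no failed tasks.
total_failed = sum(k * f for k, f in enumerate(observed))
p_hat = total_failed / (applicants * n_tasks)


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


expected = [applicants * binomial_pmf(k, n_tasks, p_hat) for k in range(6)]
expected.append(applicants - sum(expected))  # "6 or more" by subtraction

r, s = expected[3], expected[6]
print(p_hat, [round(e, 2) for e in expected])
```

Because the probability was estimated from the data, the goodness-of-fit test loses one further degree of freedom compared with a test in which the probability is specified in advance.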
  7. The random variable \(X\) is defined as $$X = 4 Y - 3 W$$ where \(Y \sim \mathrm {~N} \left( 40,3 ^ { 2 } \right) , W \sim \mathrm {~N} \left( 50,2 ^ { 2 } \right)\) and \(Y\) and \(W\) are independent.
    (a) Find \(\mathrm { P } ( X > 25 )\).
    The random variables \(Y _ { 1 } , Y _ { 2 }\) and \(Y _ { 3 }\) are independent and each has the same distribution as \(Y\). The random variable \(A\) is defined as $$A = \sum _ { i = 1 } ^ { 3 } Y _ { i }$$ The random variable \(C\) is such that \(C \sim \mathrm {~N} \left( 115 , \sigma ^ { 2 } \right)\).
    Given that \(\mathrm { P } ( A - C < 0 ) = 0.2\) and that \(A\) and \(C\) are independent,
    (b) find the variance of \(C\).