MTB> random 9 c1-c20;
SUBC> normal 68.71 3.
MTB> tinterval 0.90 c1-c20

a. How many of your intervals contain μ? 17 (any number greater than 14 is acceptable).
b. Would you expect all 20 of the intervals to contain μ? [0.5] No. Why? The expected number is (20)(0.90) = 18. [1.5]
c. Do all the intervals have the same width? No. Why (what is the theoretical width)? The width is 2(t0.05)s/√n, where s changes from sample to sample.
d. Suppose you took 95% intervals instead of 90%. Would they be narrower or wider? Wider.
e. How many of your intervals contain the value 71? 6, but any number between 5 and 16 is acceptable.
f. Suppose you took samples of size n = 64 instead of n = 9. Would you expect more or fewer intervals to contain 71? Fewer. What about 68.71? Same. What about the width of the intervals for n = 64: would they be narrower or wider than for n = 9? Narrower.

3. Hypothesis testing for μ when σ is known

Imagine choosing n = 16 women at random from a large population and measuring their heights. Assume that the heights of the women in this population are normal with μ = 63.8 inches and σ = 3 inches. Suppose you then test the null hypothesis H0: μ = 63.8 versus the alternative Ha: μ ≠ 63.8, using α = 0.10. Assume σ is known. Simulate the results of doing this test 30 times as follows:

MTB> random 16 c1-c30;
SUBC> normal 63.8 3.
MTB> ztest 63.8 3 c1-c30

a. In how many tests did you reject H0? That is, how many times did you make an “incorrect decision”? I had 3 p-values less than 0.10, but any number ≤ 8 is acceptable.
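The two Minitab simulations above can also be reproduced outside Minitab. The following is a minimal sketch in Python, not the course's method: the seed is an arbitrary choice, the function names are made up for illustration, and the critical value 1.860 is t0.05 with 8 degrees of freedom taken from a t-table. Because the samples are random, the counts it prints will generally differ from the 17 intervals and 3 rejections reported in the answers.

```python
import random
import statistics


def t_interval(sample, t_crit):
    """Return (lower, upper) for the interval xbar +/- t_crit * s / sqrt(n)."""
    n = len(sample)
    xbar = statistics.mean(sample)
    half_width = t_crit * statistics.stdev(sample) / n ** 0.5
    return xbar - half_width, xbar + half_width


def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: mu = mu0 when sigma is known."""
    n = len(sample)
    z = (statistics.mean(sample) - mu0) / (sigma / n ** 0.5)
    return 2 * (1 - statistics.NormalDist().cdf(abs(z)))


rng = random.Random(2010)  # arbitrary seed, only for reproducibility

# Confidence intervals: 20 samples of size 9 from N(68.71, 3), 90% t-intervals.
T_CRIT = 1.860  # t_{0.05} with 8 degrees of freedom (table value)
intervals = [
    t_interval([rng.gauss(68.71, 3) for _ in range(9)], T_CRIT)
    for _ in range(20)
]
covered = sum(lo <= 68.71 <= hi for lo, hi in intervals)
widths = [hi - lo for lo, hi in intervals]

# Hypothesis tests: 30 samples of size 16 from N(63.8, 3), z-test at alpha = 0.10.
p_values = [
    z_test_p_value([rng.gauss(63.8, 3) for _ in range(16)], 63.8, 3)
    for _ in range(30)
]
rejections = sum(p < 0.10 for p in p_values)

print(f"{covered} of 20 intervals contain mu = 68.71")
print(f"interval widths range from {min(widths):.2f} to {max(widths):.2f}")
print(f"{rejections} of 30 tests reject H0 at alpha = 0.10")
```

The widths differ between samples because each interval uses its own s, which is the point of answer c. Since H0 is true in the second simulation, every rejection is a Type I error, and about (30)(0.10) = 3 of them are expected.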
STAT2507 Final Examination December 2010

Part I: Multiple-Choice Questions. For each question, circle only one of the proposed choices. Each multiple-choice question is worth 3 marks.

1. In order to test H0: µ = 50 vs Ha: µ ≠ 50, a random sample of 9 observations (from a normally distributed population) is obtained, yielding x̄ = 61 and s = 21. What is the P-value of the test?
(a) greater than 0.1
(b) between 0.05 and 0.10
(c) between 0.01 and 0.05
(d) less than 0.01

2. Which of the following is a consequence of the Central Limit Theorem?
(a) A large population will be normally distributed with mean µ and variance σ²/n
(b) A large sample will be normally distributed with mean µ and variance σ²/n
(c) The sample mean X̄ is approximately normally distributed with mean µ and variance σ²/n
(d) The sample mean X̄ is approximately binomially distributed with mean µ and variance σ²/n

3. Suppose we repeat an experiment identically and independently 100 times. Each time we construct a 99% confidence interval for µ via the t-distribution. Let X = the number of times the confidence interval fails to contain the true value of µ. The distribution of X is
(a) Normal with µ = 99 and σ² = 0.99
(b) Normal with µ = 0 and σ² = 1
(c) Binomial with n = 100 and p = 0.99
(d) Binomial with n = 100 and p = 0.01

4. Under which of the following conditions is the t-distribution used in statistical inference?
(a) The population standard deviation is not known and the sample is small.
(b) The population standard deviation is known.
(c) The sample is not random or the population distribution is strongly skewed.
(d) The sample size is at least 30.

5. Which of the following randomly selected measurements, X, might be considered a potential outlier if it was selected from the given population?
(a) X = 0 from a population with µ = 0 and σ = 2
(b) X = −5 from a population with µ = 1 and σ = 4
(c) X = 7 from a population with µ = 4 and σ = 2
(d) X = 4 from a population with µ = 0 and σ = 1

6. After taking 90 observations, you construct a 90% confidence interval for µ. You are told that your interval is 3 times too wide (i.e., your interval is 3 times wider than what was required). Your sample size should have been
(a) 30
(b) 270
(c) 810
(d) 10

7. In a histogram, the proportion of the total area which must be to the right of the mean is
(a) Less than 0.50 if the histogram is skewed to the left
(b) Exactly 0.50 (always)
(c) Exactly 0.50 if the histogram is symmetric and unimodal
(d) More than 0.50 if the histogram is skewed to the right

8. Which of the following is an example of a binomial experiment?
(a) A shopping mall is interested in the income levels of its customers and is taking a survey to gather information.
(b) A business firm introducing a new product wants to know how many purchases its clients will make each year.
(c) A sociologist is researching an area in an effort to determine the proportion of households with a male head of household.
(d) A study is concerned with the average number of hours worked by high school students.

9. Let X be a Poisson random variable with mean 2.5. Find P(X = 0).
(a) 2.5
(b) 0.0821
(c) 1.5811
(d) 0.40

10. The time it takes Jessica to bicycle to school is normally distributed with mean 15 minutes and variance 4. Jessica has to be at school at 8:00 am. What time should she leave her house so she will be late only 4% of the time?
(a) 8:00
(b) 11.5 minutes before 8:00
(c) 22 minutes before 8:00
(d) 18.5 minutes before 8:00

11. A student took a chemistry exam where the exam scores were mound-shaped with a mean score of 90 and a standard deviation of 64. She also took a statistics exam where the scores were mound-shaped, the mean score was 70 and standard deviation 16.
If the student’s grades were 102 on the chemistry exam and 77 on the statistics exam, then
(a) the student did relatively better on the chemistry exam than on the statistics exam, compared to the other students in each class.
(b) the student did relatively better on the statistics exam than on the chemistry exam, compared to the other students in the two classes.
(c) the student’s scores on both exams are comparable, when accounting for the scores of the other students in the two classes.
(d) it is impossible to say which of the student’s exam scores indicates the better performance.

12. From a sample of size n = 100, the following descriptive measures were calculated: median = 23, mean = 20, standard deviation = 5, range = 35; 75 sample values are between 10 and 30; and 99 sample values are between 5 and 35. If you knew the sample mean, median, and standard deviation were correct, which of the following conclusions might you draw?
(a) The distribution is skewed to the right because the median exceeds the mean.
(b) The range must have been calculated incorrectly because it should not be seven times the standard deviation’s value.
(c) The number of sample values between 10 and 30 was miscounted.
(d) The number of sample values between 5 and 35 must have been miscounted because all 100 values must be in this interval.

13. Which of the following is always true for two events A and B with P(A) = 0.5, P(B) = 0.7, and P(A ∩ B) = 0.3?
(a) A and B are independent
(b) P(A ∪ B) = 0.8
(c) A and B are mutually exclusive events
(d) P(A|B) = 0.6

14. If P(A) = 0.3, P(B) = 0.4, and P(A|B) = 0.6, then P(A ∪ B) is
(a) 0.7
(b) 0.1
(c) 0.8
(d) 0.46

15. We have three identical boxes I, II and III. Each box has two drawers. Each drawer of Box I contains a gold coin. One drawer of Box II has a gold coin and the other one contains a silver coin, and each drawer of Box III contains a silver coin.
We choose one of the boxes at random and open one of its drawers. If that drawer contains a gold coin, what is the probability that the second drawer contains a gold coin too?
(a) 1/2
(b) 2/3
(c) 1/3
(d) 1/6

Part II: Long-answer questions. For these questions clearly show all your work; otherwise only partial or NO marks may be awarded. Marks for each question are given in [ ].

1. Suppose men’s heights are normally distributed with a mean of 174 cm and a standard deviation of 6 cm.
(a) [4 marks] What proportion of men are between 170 cm and 179 cm tall?
(b) [4 marks] Find the minimum ceiling of an airplane such that at most 5% of the men walking down the aisle will have to duck their heads.
(c) [4 marks] Find the probability that the average height of a random sample of 49 men is greater than 176.

2. We randomly selected 50 billing statements from the computer databases of two hotel chains, Marriott and Radisson, and recorded the nightly room rates. A summary of this study is given below.

                             Marriott   Radisson
Sample average               $170       $145
Sample standard deviation    $15        $10

(a) [4 marks] Do these data indicate a difference between the average room rates for these two hotels? Test at α = .01.
(b) [4 marks] Find the P-value of this test. Does this value confirm your findings in (a)? Why?

3. [4 marks] Five soft drink bottling companies have agreed to implement a time management program in hopes of increasing productivity (measured in cases of soft drinks bottled per hour). The numbers of cases of soft drinks bottled per hour before and after implementation of the program are listed below.

Company    1     2     3     4     5
Before     500   475   525   490   530
After      510   480   525   495   533

Test at α = .05 whether the time management program is effective in increasing productivity.

4. A food processor wants to compare two preservatives for their effects on retarding spoilage.
Suppose 10 cuts of fresh meat are treated with preservative A and 10 cuts are treated with preservative B, and the time (in hours) until spoilage begins is recorded for each of the 20 cuts. The results are summarized in the following table.

                  Sample mean    Sample standard deviation
Preservative A    108.7 hours    9.5 hours
Preservative B    98.7 hours     11.5 hours

(a) [4 marks] State clearly the null and alternative hypotheses to determine if the average time until spoilage begins differs for preservatives A and B.
(b) [4 marks] Compute the pooled variance and the value of the test statistic.
(c) [4 marks] Determine the rejection region at α = .05, and write down a proper conclusion.

5. [4 marks] An Internet server conducted a study on 225 of its customers and found that the average amount of time spent online was 12.5 hours per week with a standard deviation of 5.4 hours. Construct a 95% confidence interval for the average online time for all users of this particular Internet server.

6. [4 marks] A salesperson has found that the probability of making a sale on a particular product manufactured by his or her company is .05. If the salesperson contacts 140 potential customers, what is the probability he or she will sell at least 2 of these products? Use and justify the Poisson approximation to the binomial.

7. [4 marks] A firm with many traveling salespersons decides to check the salespersons’ travel expenses to see if they are correctly reported. An auditor for the firm selects 200 expense reports at random to audit. What is the probability that more than 40 of these 200 sampled reports will be incorrect when, in fact, only 10% of the firm’s reports are improperly documented? Use and justify the normal approximation to the binomial.

8. The number of household members, x, and the amount spent on groceries per week, y, rounded to the nearest dollar, are measured for 8 households in a suburb of Ottawa.
The data are shown below.

x    5     2     2     1     4     3     5     3
y    140   50    55    35    95    70    130   65

(a) [4 marks] Compute ȳ, Sx, Sy, and Sxy.
(b) [4 marks] Compute the correlation coefficient r between x and y. What would you estimate a household of 6 to spend on groceries per week?

Formulae

$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2 = \frac{\sum_{i=1}^{n}x_i^2 - \frac{\left(\sum_{i=1}^{n}x_i\right)^2}{n}}{n-1}$

$E(X) = \mu = \sum_x x\,p(x), \qquad \sigma^2 = \sum_x (x-\mu)^2 p(x) = \sum_x x^2 p(x) - \mu^2$

$S_{xy} = \frac{1}{n-1}\left[\sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n}\right], \qquad r = \frac{S_{xy}}{S_x S_y}$, where $S_x$ and $S_y$ are the standard deviations of X and Y.

$b = r\,\frac{S_y}{S_x}, \qquad a = \bar{y} - b\bar{x}, \qquad \hat{y} = bx + a$

• Binomial distribution with parameters n, p:
$P(X = k) = C_k^n\, p^k (1-p)^{n-k}, \qquad \mu = np, \qquad \sigma^2 = np(1-p), \qquad C_k^n = \frac{n!}{k!\,(n-k)!}$

• Hypergeometric distribution with parameters N, M, n:
$P(X = k) = \frac{C_k^M\, C_{n-k}^{N-M}}{C_n^N}, \qquad \mu = n\,\frac{M}{N}, \qquad \sigma^2 = n\,\frac{M}{N}\left(1-\frac{M}{N}\right)\frac{N-n}{N-1}$

• Poisson distribution with parameter µ:
$P(X = k) = \frac{e^{-\mu}\mu^k}{k!}, \qquad E(X) = \mu = \mathrm{Var}(X)$

Approximations

• If n/N ≤ .05, then the hypergeometric distribution may be approximated by the binomial.
• If np < 7 and n ≥ 50, then the binomial may be approximated by the Poisson distribution.
• If np ≥ 5 and n(1 − p) ≥ 5, then the binomial may be approximated by the normal distribution with µ = np and σ² = np(1 − p).

Important Test Statistics | Confidence Intervals

$\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}}$ | $\bar{X} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$

$\frac{\hat{p}-p_0}{\sqrt{p_0 q_0/n}}$ | $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$

$\frac{\hat{p}_1-\hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}$ | $(\hat{p}_1-\hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1}+\frac{\hat{p}_2\hat{q}_2}{n_2}}$

$\frac{\bar{X}-\mu_0}{s/\sqrt{n}}$ | $\bar{X} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$

$\frac{\bar{X}_1-\bar{X}_2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}$ | $(\bar{X}_1-\bar{X}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}$

$\frac{\bar{X}_1-\bar{X}_2}{\sqrt{s^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}}$ | $(\bar{X}_1-\bar{X}_2) \pm t_{\alpha/2}\sqrt{s^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}$

$\frac{\bar{d}}{s_d/\sqrt{n}}$ | $\bar{d} \pm t_{\alpha/2}\,\frac{s_d}{\sqrt{n}}$

Pooled variance: $s^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$; pooled proportion: $\hat{p} = \frac{x_1+x_2}{n_1+n_2}$.

For a one-sided test and a one-sided confidence interval (lower and upper confidence bounds), use α instead of α/2.
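The approximation rules on the formula sheet can be checked numerically. The following is a minimal sketch in Python: the function names are made up for illustration, and the values n = 100 and p = 0.03 are illustrative assumptions chosen to satisfy the Poisson rule, not data from any exam question. It compares the exact binomial probabilities with their Poisson approximation for small k.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) = C(n, k) p^k (1 - p)^(n - k)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(k, mu):
    """P(X = k) = e^(-mu) mu^k / k!."""
    return math.exp(-mu) * mu ** k / math.factorial(k)


def poisson_rule_holds(n, p):
    """Formula-sheet rule for the Poisson approximation: np < 7 and n >= 50."""
    return n * p < 7 and n >= 50


def normal_rule_holds(n, p):
    """Formula-sheet rule for the normal approximation: np >= 5 and n(1 - p) >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5


n, p = 100, 0.03  # illustrative values only, not taken from the exam
mu = n * p

print(f"Poisson rule holds: {poisson_rule_holds(n, p)}")
print(f"Normal rule holds:  {normal_rule_holds(n, p)}")
for k in range(4):
    exact = binomial_pmf(k, n, p)
    approx = poisson_pmf(k, mu)
    print(f"k={k}: binomial={exact:.4f}  poisson={approx:.4f}")
```

The two rule functions only encode the thresholds printed on the formula sheet. Other textbooks state different cut-offs, so they are best read as this course's convention and not as universal conditions.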