# The 2003 Statistical Abstract of the United States reported the percentage of people 18 years of age and older who smoke. Suppose that a study designed to collect new data on smokers and nonsmokers uses a preliminary estimate of the proportion who smoke to be .30. a) How large a sample should be taken to estimate the proportion of smokers in the population with a margin of error of 2%? Use 95% confidence. b) Assume that the study uses your sample size recommendation above and finds 520 smokers. What is the point estimate of the proportion of smokers in the population? c) What is the 95% confidence interval for the proportion of smokers in the population?

Question
Study design
The 2003 Statistical Abstract of the United States reported the percentage of people 18 years of age and older who smoke. Suppose that a study designed to collect new data on smokers and nonsmokers uses a preliminary estimate of the proportion who smoke to be .30.
a) How large a sample should be taken to estimate the proportion of smokers in the population with a margin of error of 2%? Use 95% confidence.
b) Assume that the study uses your sample size recommendation above and finds 520 smokers. What is the point estimate of the proportion of smokers in the population?
c) What is the 95% confidence interval for the proportion of smokers in the population?

2021-01-16
Step 1
Given data
Margin of error (E) = 0.02
Confidence level = 0.95
Significance level $$\displaystyle=\alpha={1}-{0.95}={0.05}$$
$$\displaystyle{Z}_{{\frac{{0.05}}{{2}}}}={Z}_{{{0.025}}}=\pm{1.96}$$
(From Excel: =NORM.S.INV(0.025) returns -1.96; the critical value is its magnitude, 1.96)
Preliminary estimate of the proportion (p) = 0.30
a)
The margin of error for a proportion is given by
$$\displaystyle{E}={Z}_{{\frac{\alpha}{{2}}}}\times\sqrt{{\frac{{{p}{\left({1}-{p}\right)}}}{{n}}}}$$
Solving this formula for the sample size n and rounding up to the next whole number gives
$$\displaystyle{n}={p}{\left({1}-{p}\right)}\times{\left(\frac{{Z}_{{\frac{\alpha}{{2}}}}}{{E}}\right)}^{{2}}={0.3}\times{\left({1}-{0.3}\right)}\times{\left(\frac{{1.96}}{{0.02}}\right)}^{{2}}={2016.84}\approx{2017}$$
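The sample-size calculation above can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name `sample_size_for_proportion` is a hypothetical helper introduced here for illustration, not part of the original solution. It uses the unrounded critical value, so the intermediate result differs slightly from the hand calculation with 1.96, but the rounded-up sample size is the same.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, margin, confidence=0.95):
    """Return (n rounded up, unrounded n) for estimating a proportion.

    p          -- planning (preliminary) estimate of the proportion
    margin     -- desired margin of error E
    confidence -- confidence level, e.g. 0.95
    """
    alpha = 1 - confidence
    # Upper-tail critical value Z_{alpha/2}; about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n_exact = p * (1 - p) * (z / margin) ** 2
    # Always round up so the margin of error is not exceeded.
    return math.ceil(n_exact), n_exact


n, n_exact = sample_size_for_proportion(0.30, 0.02)
print(n)  # prints 2017
```

Rounding up rather than to the nearest integer is the usual convention here, since a smaller sample would give a margin of error slightly larger than the target.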
Step 2
b)
n=2017
Number of smokers = 520
Point estimate is given by
$$\displaystyle\hat{{{p}}}=\frac{{520}}{{2017}}={0.258}$$
Step 3
c)
95% confidence interval for the proportion of smokers in the population is given by
Confidence level = 0.95
Significance level $$\displaystyle=\alpha={1}-{0.95}={0.05}$$
$$\displaystyle{Z}_{{\frac{{0.05}}{{2}}}}={Z}_{{0.025}}=\pm{1.96}$$
(From Excel: =NORM.S.INV(0.025) returns -1.96; the critical value is its magnitude, 1.96)
Confidence interval is given by
$$\displaystyle\hat{{{p}}}\pm{Z}_{{\frac{\alpha}{{2}}}}\sqrt{{\frac{{\hat{{{p}}}{\left({1}-\hat{{{p}}}\right)}}}{{n}}}}={0.258}\pm{1.96}\sqrt{{\frac{{{0.258}{\left({1}-{0.258}\right)}}}{{2017}}}}={\left({0.239},{0.277}\right)}$$
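The point estimate and the interval can be reproduced numerically. The following is a rough sketch with the Python standard library; `proportion_confidence_interval` is a hypothetical helper name chosen for this example, and it implements the normal-approximation interval used in this answer, which is only one of several possible interval methods for a proportion.

```python
import math
from statistics import NormalDist


def proportion_confidence_interval(successes, n, confidence=0.95):
    """Return (p_hat, lower, upper) for the normal-approximation interval."""
    p_hat = successes / n
    # Upper-tail critical value Z_{alpha/2}; about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


p_hat, lower, upper = proportion_confidence_interval(520, 2017)
print(round(p_hat, 3), round(lower, 3), round(upper, 3))  # prints 0.258 0.239 0.277
```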
Step 4
Result:
a) 2017
b) 0.258
c) (0.239,0.277)

### Relevant Questions

The Centers for Disease Control reported the percentage of people 18 years of age and older who smoke (CDC website, December 14, 2014). Suppose that a study designed to collect new data on smokers and nonsmokers uses a preliminary estimate of the proportion who smoke of .30.
a. How large a sample should be taken to estimate the proportion of smokers in the population with a margin of error of .02 (to the nearest whole number)? Use 95% confidence.
b. Assume that the study uses your sample size recommendation in part (a) and finds 520 smokers. What is the point estimate of the proportion of smokers in the population (to 4 decimals)?
c. What is the 95% confidence interval for the proportion of smokers in the population (to 4 decimals)?
In 2014, the Centers for Disease Control reported the percentage of people 18 years of age and older who smoke. Suppose that a study designed to collect new data on smokers and nonsmokers uses a preliminary estimate of the proportion who smoke of 0.30.
a) How large a sample should be taken to estimate the proportion of smokers in the population with a margin of error of 0.02? Use 95% confidence. (Round your answer up to the nearest integer.)
b) Assume that the study uses your sample size recommendation in part (a) and finds 470 smokers. What is the point estimate of the proportion of smokers in the population? (Round your answer to four decimal places.)
c) What is the 95% confidence interval for the proportion of smokers in the population? (Round your answer to four decimal places.)
The Centers for Disease Control reported the percentage of people 18 years of age and older who smoke (CDC website, December 14, 2014). Suppose that a study designed to collect new data on smokers and nonsmokers uses a preliminary estimate of the proportion who smoke of .26.
a) How large a sample should be taken to estimate the proportion of smokers in the population with a margin of error of .02 (to the nearest whole number)? Use 95% confidence.
b) Assume that the study uses your sample size recommendation in part (a) and finds 520 smokers. What is the point estimate of the proportion of smokers in the population (to 4 decimals)?
c) What is the 95% confidence interval for the proportion of smokers in the population (to 4 decimals)?
The Centers for Disease Control reported the percentage of people 18 years of age and older who smoke (CDC website, December 14, 2014). Suppose that a study designed to collect new data on smokers and nonsmokers uses a preliminary estimate of the proportion who smoke of .30.
a. How large a sample should be taken to estimate the proportion of smokers in the population with a margin of error of .02? Use 95% confidence.

A random sample of $$n_1 = 14$$ winter days in Denver gave a sample mean pollution index $$x_1 = 43$$.
Previous studies show that $$\sigma_1 = 19$$.
For Englewood (a suburb of Denver), a random sample of $$n_2 = 12$$ winter days gave a sample mean pollution index of $$x_2 = 37$$.
Previous studies show that $$\sigma_2 = 13$$.
Assume the pollution index is normally distributed in both Englewood and Denver.
(a) State the null and alternate hypotheses.
$$H_0:\mu_1=\mu_2;\ H_1:\mu_1>\mu_2$$
$$H_0:\mu_1<\mu_2;\ H_1:\mu_1=\mu_2$$
$$H_0:\mu_1=\mu_2;\ H_1:\mu_1<\mu_2$$
$$H_0:\mu_1=\mu_2;\ H_1:\mu_1\neq\mu_2$$
(b) What sampling distribution will you use? What assumptions are you making?
The Student's t. We assume that both population distributions are approximately normal with known standard deviations.
The standard normal. We assume that both population distributions are approximately normal with unknown standard deviations.
The standard normal. We assume that both population distributions are approximately normal with known standard deviations.
The Student's t. We assume that both population distributions are approximately normal with unknown standard deviations.
(c) What is the value of the sample test statistic? Compute the corresponding z or t value as appropriate.
(Test the difference $$\mu_1 - \mu_2$$. Round your answer to two decimal places.)
(d) Find (or estimate) the P-value. (Round your answer to four decimal places.)
(e) Based on your answers in parts (i)−(iii), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level $$\alpha$$?
At the $$\alpha = 0.01$$ level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.01$$ level, we reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.01$$ level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.01$$ level, we reject the null hypothesis and conclude the data are not statistically significant.
(f) Interpret your conclusion in the context of the application.
Reject the null hypothesis, there is insufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
Reject the null hypothesis, there is sufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
Fail to reject the null hypothesis, there is insufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
Fail to reject the null hypothesis, there is sufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
(g) Find a 99% confidence interval for $$\mu_1 - \mu_2$$.
lower limit
upper limit
(h) Explain the meaning of the confidence interval in the context of the problem.
Because the interval contains only positive numbers, this indicates that at the 99% confidence level, the mean population pollution index for Englewood is greater than that of Denver.
Because the interval contains both positive and negative numbers, this indicates that at the 99% confidence level, we can not say that the mean population pollution index for Englewood is different than that of Denver.
Because the interval contains both positive and negative numbers, this indicates that at the 99% confidence level, the mean population pollution index for Englewood is greater than that of Denver.
Because the interval contains only negative numbers, this indicates that at the 99% confidence level, the mean population pollution index for Englewood is less than that of Denver.
The presidential election is coming. Five survey companies (A, B, C, D, and E) are conducting surveys to forecast whether or not the Republican candidate will win the election. Each company randomly selects a sample of between 1000 and 1500 people. All five companies interview people over the phone during Tuesday and Wednesday. The interviewee will be asked if he or she is 18 years old or above and a U.S. citizen who is registered to vote. If yes, the interviewee will be further asked: will you vote for the Republican candidate? On Thursday morning, these five companies announce their survey samples and results at the same time in the newspapers. The results show that a% (from A), b% (from B), c% (from C), d% (from D), and e% (from E) will support the Republican candidate. The margin of error is plus/minus 3% for all results. Suppose that $$c > a > d > e > b$$. When you see these results in the newspapers, can you exactly identify which result(s) is (are) not reliable and not accurate? That is, can you identify which estimation interval(s) does (do) not include the true population proportion? If you can, explain why you can; if not, explain why you cannot and what information you would need to identify them. Discuss and explain your reasons. You must provide your statistical analysis and reasons.
You may need to use the appropriate appendix table or technology to answer this question.
Money reports that the average annual cost of the first year of owning and caring for a large dog in 2017 is $1,448. The Irish Red and White Setter Association of America has requested a study to estimate the annual first-year cost for owners of this breed. A sample of 50 will be used. Based on past studies, the population standard deviation is assumed known with $$\sigma = \$230$$.
$$\begin{matrix} 1,902 & 2,042 & 1,936 & 1,817 & 1,504 & 1,572 & 1,532 & 1,907 & 1,882 & 2,153 \\ 1,945 & 1,335 & 2,006 & 1,516 & 1,839 & 1,739 & 1,456 & 1,958 & 1,934 & 2,094 \\ 1,739 & 1,434 & 1,667 & 1,679 & 1,736 & 1,670 & 1,770 & 2,052 & 1,379 & 1,939\\ 1,854 & 1,913 & 2,163 & 1,737 & 1,888 & 1,737 & 2,230 & 2,131 & 1,813 & 2,118\\ 1,978 & 2,166 & 1,482 & 1,700 & 1,679 & 2,060 & 1,683 & 1,850 & 2,232 & 2,294 \end{matrix}$$
(a) What is the margin of error for a $$95\%$$ confidence interval of the mean cost in dollars of the first year of owning and caring for this breed? (Round your answer to nearest cent.)
(b) The DATAfile Setters contains data collected from fifty owners of Irish Setters on the cost of the first year of owning and caring for their dogs. Use this data set to compute the sample mean. Using this sample, what is the $$95\%$$ confidence interval for the mean cost in dollars of the first year of owning and caring for an Irish Red and White Setter? (Round your answers to nearest cent.) $_______ to $_______
From the Statistical Abstract of the United States, we obtained data on percentage of gross domestic product (GDP) spent on health care and life expectancy, in years, for selected countries. a) Obtain a scatterplot for the data. b) Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f). c) Determine and interpret the regression equation for the data. d) Identify potential outliers and influential observations. e) In case a potential outlier is present, remove it and discuss the effect. f) In case a potential influential observation is present, remove it and discuss the effect.
The dominant form of drag experienced by vehicles (bikes, cars, planes, etc.) at operating speeds is called form drag. It increases quadratically with velocity (essentially because the amount of air you run into increases with v and so does the amount of force you must exert on each small volume of air). Thus
$$\displaystyle{F}_{\text{drag}}={C}_{{d}}{A}{v}^{{2}}$$
where A is the cross-sectional area of the vehicle and $$\displaystyle{C}_{{d}}$$ is called the coefficient of drag.
Part A:
Consider a vehicle moving with constant velocity $$\displaystyle\vec{{{v}}}$$. Find the power dissipated by form drag.
Express your answer in terms of $$\displaystyle{C}_{{d}},{A},$$ and speed v.
Part B:
A certain car has an engine that provides a maximum power $$\displaystyle{P}_{{0}}$$. Suppose that the maximum speed of the car, $$\displaystyle{v}_{{0}}$$, is limited by a drag force proportional to the square of the speed (as in the previous part). The car engine is now modified, so that the new power $$\displaystyle{P}_{{1}}$$ is 10 percent greater than the original power ($$\displaystyle{P}_{{1}}={110}\%{P}_{{0}}$$).
Assume the following:
The top speed is limited by air drag.
The magnitude of the force of air drag at these speeds is proportional to the square of the speed.
By what percentage, $$\displaystyle{\frac{{{v}_{{1}}-{v}_{{0}}}}{{{v}_{{0}}}}}$$, is the top speed of the car increased?
Express the percent increase in top speed numerically to two significant figures.
1. A researcher is interested in finding a 98% confidence interval for the mean number of times per day that college students text. The study included 144 students who averaged 44.7 texts per day. The standard deviation was 16.5 texts.
a. To compute the confidence interval use a (z / t) distribution.
b. With 98% confidence the population mean number of texts per day is between ____ and ____ texts.
c. If many groups of 144 randomly selected members are studied, then a different confidence interval would be produced from each group. About ____ percent of these confidence intervals will contain the true population mean number of texts per day and about ____ percent will not contain the true population mean number of texts per day.
2. You want to obtain a sample to estimate how much parents spend on their kids' birthday parties. Based on a previous study, you believe the population standard deviation is approximately $$\displaystyle\sigma={40.4}$$ dollars. You would like to be 90% confident that your estimate is within 1.5 dollar(s) of average spending on the birthday parties. How many parents do you have to sample? n = ____
3. You want to obtain a sample to estimate a population mean. Based on previous evidence, you believe the population standard deviation is approximately $$\displaystyle\sigma={57.5}$$. You would like to be 95% confident that your estimate is within 0.1 of the true population mean. How large of a sample size is required?