The pathogen Phytophthora capsici causes bell pepper plants to wilt and die. A research project was designed to study the effect of soil water content and the spread of the disease in fields of bell peppers. It is thought that too much water helps spread the disease. The fields were divided into rows and quadrants. The soil water content (percent of water by volume of soil) was determined for each plot. An important first step in such a research project is to give a statistical description of the data.

Question
Study design
asked 2021-03-08

Answers (1)

2021-03-09
a) Order the values from smallest to largest:
\(\displaystyle{6},\ {7},{8},{8},\ {9},\ {9},\ {9},\ {9},\ {9},\ {9},\ {9},\ {10},\ {10},\ {10},\ {10},\ {10},\ {10},\ {10},\ {10},\ {11},\ {11},\ {11},\ {11},\ {11},\ {11},\ {11},\ {11},\ {11},\ {12},\ {12},\ {12},\ {12},\ {12},\ {13},\ {13},\ {13},\ {13},\ {13},\ {14},\ {14},\ {14},\ {14},\ {14},\ {15},\ {15},\ {15},\ {15},\ {16},\ {16},\ {16}.\)
Since the number of values is even, the median is the average of the two middle values (the 25th and 26th):
\(\displaystyle{M}={Q}_{{{2}}}={\frac{{{11}\ +\ {11}}}{{{2}}}}={11}\)
The first quartile is the median of the data values below the median (or at \(\displaystyle{25}\%\) of the data):
\(\displaystyle{Q}_{{{1}}}={10}\)
The third quartile is the median of the data values above the median (or at \(\displaystyle{75}\%\) of the data):
\(\displaystyle{Q}_{{{3}}}={13}\)
The interquartile range IQR is the difference of the third and first quartile:
\(\displaystyle{I}{Q}{R}={13}\ -\ {10}={3}\)
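The quartile steps above can be checked with a short Python sketch. It assumes the median-of-halves convention used in this answer (the lower and upper halves each hold 25 values) and rebuilds the data from the value counts in the sorted list; the helper name `quartiles` is only illustrative.

```python
from statistics import median

# Value -> frequency, read off the sorted list in part (a).
counts = {6: 1, 7: 1, 8: 2, 9: 7, 10: 8, 11: 9, 12: 5, 13: 5, 14: 5, 15: 4, 16: 3}
data = sorted(v for v, f in counts.items() for _ in range(f))

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention.

    When n is odd, the middle value is left out of both halves.
    """
    xs = sorted(values)
    half = len(xs) // 2
    lower, upper = xs[:half], xs[len(xs) - half:]
    return median(lower), median(xs), median(upper)

q1, q2, q3 = quartiles(data)
iqr = q3 - q1
print("min:", min(data), "Q1:", q1, "median:", q2, "Q3:", q3, "max:", max(data))
print("IQR:", iqr)
```

Other software may use a different quartile convention (for example, interpolation), so small differences from a calculator or spreadsheet are possible.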
The whiskers of the boxplot extend to the minimum and maximum values. The box starts at the lower quartile, ends at the upper quartile, and has a vertical line at the median.
The lower quartile is at \(\displaystyle{25}\%\) of the sorted data list, the median at \(\displaystyle{50}\%\), and the upper quartile at \(\displaystyle{75}\%\).
[Image: boxplot of the soil water content data]
b) The class width is the difference between the largest and smallest values, divided by the number of classes (here 4), rounded up to the next whole number:
\(\displaystyle\text{Class width}={\frac{{{16}\ -\ {6}}}{{{4}}}}={2.5},\ \text{which rounds up to}\ {3}\)
Determine the midpoints, the frequencies, the products of the midpoints and the frequencies, and the products of the squared midpoints and the frequencies.
\(\displaystyle\begin{array}{|c|c|c|c|c|}\hline
\text{Interval}&\text{Midpoint }x&f&xf&x^{2}f\\\hline
6-8&7&4&28&196\\\hline
9-11&10&24&240&2400\\\hline
12-14&13&15&195&2535\\\hline
15-17&16&7&112&1792\\\hline
&&&&\\\hline
&\text{SUM}&50&575&6923\\\hline
\end{array}\)
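As a rough check, the frequency table can be rebuilt from the raw values with a few lines of Python. This is a sketch under the same assumptions as the answer: four classes, integer class limits starting at the minimum value, and a width obtained by rounding up. The variable names are illustrative.

```python
import math

# Value -> frequency, read off the sorted list in part (a).
counts = {6: 1, 7: 1, 8: 2, 9: 7, 10: 8, 11: 9, 12: 5, 13: 5, 14: 5, 15: 4, 16: 3}
data = [v for v, f in counts.items() for _ in range(f)]

num_classes = 4
# (16 - 6) / 4 = 2.5, rounded up to 3.
width = math.ceil((max(data) - min(data)) / num_classes)

rows = []
for i in range(num_classes):
    lo = min(data) + i * width          # lower class limit
    hi = lo + width - 1                 # upper class limit (integer data)
    mid = (lo + hi) / 2                 # class midpoint x
    f = sum(lo <= x <= hi for x in data)
    rows.append((lo, hi, mid, f, mid * f, mid ** 2 * f))

print("Interval  x     f    xf    x^2 f")
for lo, hi, mid, f, xf, x2f in rows:
    print(f"{lo:>2}-{hi:<2}   {mid:>5} {f:>4} {xf:>6} {x2f:>7}")

n = sum(r[3] for r in rows)
sum_xf = sum(r[4] for r in rows)
sum_x2f = sum(r[5] for r in rows)
print("SUM", n, sum_xf, sum_x2f)
```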
The sample mean is then:
\(\displaystyle\overline{{{x}}}={\frac{{\sum\ {x}{f}}}{{{n}}}}={\frac{{{575}}}{{{50}}}}={11.5}\)
The sample standard deviation is then:
\(\displaystyle{s}=\sqrt{{{\frac{{\sum\ {x}^{{{2}}}{f}\ -\ \frac{{\left(\sum\ {x}{f}\right)}^{{{2}}}}{{n}}}}{{{n}\ -\ {1}}}}}}=\sqrt{{{\frac{{{6923}\ -\ \frac{{\left({575}\right)}^{{{2}}}}{{50}}}}{{{50}\ -\ {1}}}}}}\ \approx\ {2.5173}\)
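The grouped-data mean and standard deviation can be reproduced from the midpoints and frequencies alone. The sketch below uses the same computational formula as the answer; the midpoints and frequencies are copied from the table, and the variable names are illustrative.

```python
from math import sqrt

midpoints = [7, 10, 13, 16]   # class midpoints x
freqs = [4, 24, 15, 7]        # class frequencies f

n = sum(freqs)
sum_xf = sum(x * f for x, f in zip(midpoints, freqs))
sum_x2f = sum(x ** 2 * f for x, f in zip(midpoints, freqs))

# Grouped estimates: each value is treated as if it sat at its class midpoint.
mean_grouped = sum_xf / n
s_grouped = sqrt((sum_x2f - sum_xf ** 2 / n) / (n - 1))

print(f"n = {n}, sum xf = {sum_xf}, sum x^2 f = {sum_x2f}")
print(f"grouped mean = {mean_grouped:.4f}, grouped s = {s_grouped:.4f}")
```

Because every value in a class is replaced by its midpoint, these figures are approximations of the statistics computed from the raw data.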
Using Chebyshev's Rule with \(\displaystyle{k}={2}\), we know that at least
\(\displaystyle{100}{\left({1}\ -\ {\frac{{{1}}}{{{k}^{{{2}}}}}}\right)}\%={100}{\left({1}\ -\ {\frac{{{1}}}{{{4}}}}\right)}\%={75}\%\)
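of the data lies within 2 standard deviations of the mean, that is, between \(\displaystyle\overline{{{x}}}\ -\ {2}{s}\) and \(\displaystyle\overline{{{x}}}\ +\ {2}{s}\):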
\(\displaystyle\overline{{{x}}}\ -\ {2}{s}={11.5}\ -\ {2}{\left({2.5173}\right)}={6.4654}\)
\(\displaystyle\overline{{{x}}}\ +\ {2}{s}={11.5}\ +\ {2}{\left({2.5173}\right)}={16.5346}\)
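Chebyshev's bound is only a guaranteed minimum, so it can be instructive to compare it with the share of the data that actually falls in the interval. The sketch below assumes the grouped estimates from part (b) and the data values from the sorted list in part (a).

```python
# Grouped-data estimates from part (b).
mean, s = 11.5, 2.5173
k = 2

# Chebyshev: at least 1 - 1/k^2 of the data lies within k standard deviations.
guaranteed = 1 - 1 / k ** 2
low, high = mean - k * s, mean + k * s

# Value -> frequency, read off the sorted list in part (a).
counts = {6: 1, 7: 1, 8: 2, 9: 7, 10: 8, 11: 9, 12: 5, 13: 5, 14: 5, 15: 4, 16: 3}
data = [v for v, f in counts.items() for _ in range(f)]

inside = sum(low <= x <= high for x in data)
observed = inside / len(data)

print(f"interval: {low:.4f} to {high:.4f}")
print(f"guaranteed share: {guaranteed:.0%}, observed share: {observed:.0%}")
```

For this data set only the single value 6 falls outside the interval, so the observed share is well above the guaranteed minimum.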
c) n is the number of values in the data set.
\(\displaystyle{n}={50}\)
The mean is the sum of all values divided by the number of values:
\(\displaystyle\overline{{{x}}}={\frac{{{15}\ +\ {14}\ +\ {14}\ +\ \cdots\ +\ {10}\ +\ {11}\ +\ {9}}}{{{50}}}}={11.48}\)
The variance is the sum of squared deviations from the mean divided by \(\displaystyle{n}\ -\ {1}.\) The standard deviation is the square root of the variance:
\(\displaystyle{s}=\sqrt{{{\frac{{{\left({15}\ -\ {11.48}\right)}^{{{2}}}\ +\ \cdots\ +\ {\left({9}\ -\ {11.48}\right)}^{{{2}}}}}{{{50}\ -\ {1}}}}}}\ \approx\ {2.4431}\)
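The exact values in part (c) can be confirmed with Python's standard `statistics` module, whose `stdev` function uses the same \(n-1\) divisor. The data are rebuilt here from the value counts in the sorted list of part (a); this is a check, not the method the original answer used.

```python
from statistics import mean, stdev

# Value -> frequency, read off the sorted list in part (a).
counts = {6: 1, 7: 1, 8: 2, 9: 7, 10: 8, 11: 9, 12: 5, 13: 5, 14: 5, 15: 4, 16: 3}
data = [v for v, f in counts.items() for _ in range(f)]

n = len(data)
x_bar = mean(data)    # sum of values divided by n
s = stdev(data)       # sample standard deviation (divides by n - 1)

print(f"n = {n}, mean = {x_bar:.2f}, s = {s:.4f}")
```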

Relevant Questions

asked 2020-11-08
The pathogen Phytophthora capsici causes bell pepper plants to wilt and die. A research project was designed to study the effect of soil water content and the spread of the disease in fields of bell peppers. It is thought that too much water helps spread the disease. The fields were divided into rows and quadrants. The soil water content (percent of water by volume of soil) was determined for each plot. An important first step in such a research project is to give a statistical description of the data. Soil Water Content for Bell Pepper Study: \(\displaystyle\begin{matrix} 15 & 14 & 14 & 14 & 13 & 12 & 11 & 11 & 11 & 11 & 10 & 11 & 13 & 16 \\ 9 & 15 & 12 & 9 & 10 & 7 & 14 & 13 & 14 & 8 & 9 & 8 & 11 & 13 \\ 15 & 12 & 9 & 10 & 9 & 9 & 16 & 16 & 12 & 10 & 11 & 11 & 12 & 15 \\ 10 & 10 & 10 & 11 & 9 \end{matrix}\) If you have a statistical calculator or computer, use it to find the actual sample mean and sample standard deviation.
asked 2020-11-26
Use either the critical-value approach or the P-value approach to perform the required hypothesis test. For several years, evidence had been mounting that folic acid reduces major birth defects. A. Czeizel and I. Dudas of the National Institute of Hygiene in Budapest directed a study that provided the strongest evidence to date. Their results were published in the paper “Prevention of the First Occurrence of Neural-Tube Defects by Periconceptional Vitamin Supplementation” (New England Journal of Medicine, Vol. 327(26), p. 1832). For the study, the doctors enrolled women prior to conception and divided them randomly into two groups. One group, consisting of 2701 women, took daily multivitamins containing 0.8 mg of folic acid, the other group, consisting of 2052 women, received only trace elements. Major birth defects occurred in 35 cases when the women took folic acid and in 47 cases when the women did not. a. At the 1% significance level, do the data provide sufficient evidence to conclude that women who take folic acid are at lesser risk of having children with major birth defects? b. Is this study a designed experiment or an observational study? Explain your answer. c. In view of your answers to parts (a) and (b), could you reasonably conclude that taking folic acid causes a reduction in major birth defects? Explain your answer.
asked 2021-03-06
Use either the critical-value approach or the P-value approach to perform the required hypothesis test. Approximately 450,000 vasectomies are performed each year in the United States. In this surgical procedure for contraception, the tube carrying sperm from the testicles is cut and tied. Several studies have been conducted to analyze the relationship between vasectomies and prostate cancer. The results of one such study by E. Giovannucci et al. appeared in the paper “A Retrospective Cohort Study of Vasectomy and Prostate Cancer in U.S. Men” (Journal of the American Medical Association, Vol. 269(7), pp. 878-882). Of 21,300 men who had not had a vasectomy, 69 were found to have prostate cancer, of 22,000 men who had had a vasectomy, 113 were found to have prostate cancer. a. At the 1% significance level, do the data provide sufficient evidence to conclude that men who have had a vasectomy are at greater risk of having prostate cancer? b. Is this study a designed experiment or an observational study? Explain your answer. c. In view of your answers to parts (a) and (b), could you reasonably conclude that having a vasectomy causes an increased risk of prostate cancer? Explain your answer.
asked 2021-02-25
Give a full and correct answer. Why is it important that a sample be random and representative when conducting hypothesis testing? Representative Sample vs. Random Sample: An Overview. Economists and researchers seek to reduce sampling bias to near negligible levels when employing statistical analysis. Three basic characteristics in a sample reduce the chances of sampling bias and allow economists to make more confident inferences about a general population from the results obtained from the sample analysis or study: * Such samples must be representative of the chosen population studied. * They must be randomly chosen, meaning that each member of the larger population has an equal chance of being chosen. * They must be large enough so as not to skew the results. The optimal size of the sample group depends on the precise degree of confidence required for making an inference. Representative sampling and random sampling are two techniques used to help ensure data is free of bias. These sampling techniques are not mutually exclusive and, in fact, they are often used in tandem to reduce the degree of sampling error in an analysis and allow for greater confidence in making statistical inferences from the sample in regard to the larger group. Representative Sample: A representative sample is a group or set chosen from a larger statistical population or group of factors or instances that adequately replicates the larger group according to whatever characteristic or quality is under study. A representative sample parallels key variables and characteristics of the larger society under examination. Some examples include sex, age, education level, socioeconomic status (SES), or marital status. A larger sample size reduces sampling error and increases the likelihood that the sample accurately reflects the target population.
Random Sample: A random sample is a group or set chosen from a larger population or group of factors or instances in a random manner that allows for each member of the larger group to have an equal chance of being chosen. A random sample is meant to be an unbiased representation of the larger population. It is considered a fair way to select a sample from a larger population since every member of the population has an equal chance of getting selected. Special Considerations: People collecting samples need to ensure that bias is minimized. Representative sampling is one of the key methods of achieving this because such samples replicate as closely as possible elements of the larger population under study. This alone, however, is not enough to make the sampling bias negligible. Combining the random sampling technique with the representative sampling method reduces bias further because no specific member of the representative population has a greater chance of selection into the sample than any other. Summarize this article in 250 words.
asked 2020-10-19
In an experiment designed to study the effects of illumination level on task performance (“Performance of Complex Tasks Under Different Levels of Illumination,” J. Illuminating Eng., 1976: 235–242), subjects were required to insert a fine-tipped probe into the eyeholes of ten needles in rapid succession both for a low light level with a black background and a higher level with a white background. Each data value is the time (sec) required to complete the task.
\(\displaystyle\begin{array}{l|ccccccccc}\hline
\text{Subject}&1&2&3&4&5&6&7&8&9\\\hline
\text{Black}&25.85&28.84&32.05&25.74&20.89&41.05&25.01&24.96&27.47\\
\text{White}&18.23&20.84&22.96&19.68&19.509&24.98&16.61&16.07&24.59\\\hline
\end{array}\)
Does the data indicate that the higher level of illumination yields a decrease of more than 5 sec in true average task completion time? Test the appropriate hypotheses using the P-value approach.
asked 2020-11-29
State whether the investigation in question is an observational study or a designed experiment. Justify your answer in each case.
The Salk Vaccine. In the 1940s and early 1950s, the public was greatly concerned about polio. In an attempt to prevent this disease, Jonas Salk of the University of Pittsburgh developed a polio vaccine. In a test of the vaccine’s efficacy, involving nearly 2 million grade-school children, half of the children received the Salk vaccine, the other half received a placebo, in this case an injection of salt dissolved in water. Neither the children nor the doctors performing the diagnoses knew which children belonged to which group, but an evaluation center did. The center found that the incidence of polio was far less among the children inoculated with the Salk vaccine. From that information, the researchers concluded that the vaccine would be effective in preventing polio for all U.S. school children, consequently, it was made available for general use.
asked 2020-10-27
To find: The conditions for the population and the study design that are required by the procedure and are used to construct the confidence interval.
To identify: The important conditions for the validity of the procedure in the given case.
In an experiment on the effect of calcium on blood pressure, 54 healthy white males are divided into two groups: calcium and placebo. The summary statistics for the systolic blood pressure of the 27 members of the placebo group are \(\displaystyle\overline{{{x}}}={114.9}\) and \(\displaystyle{s}={9.3}\).
asked 2021-02-06
At what age do babies learn to crawl? Does it take longer to learn in the winter when babies are often bundled in clothes that restrict their movement? Data were collected from parents who brought their babies into the University of Denver Infant Study Center to participate in one of a number of experiments between 1988 and 1991. Parents reported the birth month and the age at which their child was first able to creep or crawl a distance of 4 feet within 1 minute. The resulting data were grouped by month of birth: January, May, and September:
\(\displaystyle\begin{array}{lccc}
&\text{Crawling age}\\\hline
\text{Birth month}&\text{Mean}&\text{St. dev.}&n\\\hline
\text{January}&29.84&7.08&32\\
\text{May}&28.58&8.07&27\\
\text{September}&33.83&6.93&38
\end{array}\)
Crawling age is given in weeks. Assume the data represent three independent simple random samples, one from each of the three populations consisting of babies born in that particular month, and that the populations of crawling ages have Normal distributions. A partial ANOVA table is given below.
\(\displaystyle\begin{array}{lcccc}
\text{Source}&\text{Sum of squares}&\text{DF}&\text{Mean square}&F\\\hline
\text{Groups}&505.26\\
\text{Error}&&&53.45\\
\text{Total}
\end{array}\)
What are the degrees of freedom for the groups term?
asked 2021-02-02
Potential buyers for a new car were randomly divided into two groups. One group was shown the "A" version of an ad for the car, while the other group was shown the "B" version of the ad. All were then tested on their recall of key points made in the ad. The researcher should run a hypothesis test based upon a comparison of means for ?
In another study, a healthcare insurance company took measures of subscribers’ cardiac (heart) health. The people were then provided an app for their phones which provided "nudges" and reminders about heart-healthy behaviors, such as eating more vegetables and less fried or fatty food, taking walks and breaks from sitting too long, and getting enough sleep. After 4 months of having the app, the cardiac health measures were taken again, with the objective of seeing if nudges from the app would result in decreased cardiac risk. The researcher should run a hypothesis test based on a comparison of means for?
asked 2021-03-07
The American Journal of Political Science (Apr. 2014) published a study on a woman's impact in mixed-gender deliberating groups. The researchers randomly assigned subjects to one of several 5-member decision-making groups. The groups' gender composition varied as follows: 0 females, 1 female, 2 females, 3 females, 4 females, or 5 females. Each group was then randomly assigned to utilize one of two types of decision rules: unanimous or majority rule. Ten groups were created for each of the \(\displaystyle{6}\ \times\ {2}={12}\) combinations of gender composition and decision rule. One variable of interest, measured for each group, was the number of words spoken by women on a certain topic per 1,000 total words spoken during the deliberations. a) Why is this experiment considered a designed study? b) Identify the experimental unit and dependent variable in this study. c) Identify the factors and treatments for this study.
...