# The article “Modeling Sediment and Water Column Interactions for Hydrophobic Pollutants” (Water Research, 1984: 1169-1174) suggests the uniform distribution on the interval (7.5, 20) as a model for depth (cm) of the bioturbation layer in sediment in a certain region. a. What are the mean and variance of depth? b. What is the cdf of depth? c. What is the probability that observed depth is at most 10? Between 10 and 15? d. What is the probability that the observed depth is within 1 standard deviation of the mean value? Within 2 standard deviations?


2020-11-21
Step 1 a) Let f(x) be the pdf of the uniform distribution of depth on the interval (7.5, 20). Its value on this interval is $$f(x)=\frac{1}{20-7.5}=\frac{1}{12.5}=0.08$$ and zero otherwise, so f(x) can be written as: $$f(x)=\begin{cases}0.08 & 7.5<x<20\\ 0 & \text{otherwise}\end{cases}$$ The mean value of the distribution is: $$E(X)=\int_{-\infty}^{\infty}x\cdot f(x)\,dx$$
$$=\int_{7.5}^{20}x\cdot(0.08)\,dx$$
$$=0.08\int_{7.5}^{20}x\,dx$$
$$=0.08\left[\frac{x^{2}}{2}\right]_{7.5}^{20}$$
$$=0.08\left[\frac{20^{2}}{2}-\frac{7.5^{2}}{2}\right]$$
$$E(X)=13.75$$ Definition: The expected or mean value of a continuous rv X with pdf f(x) is $$\mu=E(X)=\int_{-\infty}^{\infty}x\cdot f(x)\,dx$$ Step 2 To calculate the variance for the pdf f(x), we first calculate $$E(X^{2})$$: $$E(X^{2})=\int_{-\infty}^{\infty}x^{2}\cdot f(x)\,dx$$
$$=\int_{7.5}^{20}x^{2}\cdot(0.08)\,dx$$
$$\displaystyle={\left({0.08}\right)}{\int_{{{7.5}}}^{{{20}}}}{x}^{{{2}}}\cdot{\left.{d}{x}\right.}$$
$$\displaystyle={\left({0.08}\right)}{{\left[{\frac{{{x}^{{{3}}}}}{{{3}}}}\right]}_{{{7.5}}}^{{{20}}}}$$
$$\displaystyle={\frac{{{0.08}}}{{{3}}}}{\left[{20}^{{{3}}}-{7.5}^{{{3}}}\right]}={\frac{{{0.08}}}{{{3}}}}{\left[{8000}-{421.875}\right]}$$
$$E(X^{2})\approx 202.08$$ We have already calculated $$E(X)=13.75$$ in the previous step, so we use the following proposition. Proposition: $$V(X)=E(X^{2})-[E(X)]^{2}$$ Using this, we can write: $$V(X)=202.08-(13.75)^{2}$$
$$V(X)\approx 13.02$$ Definition: If X is a continuous rv with pdf f(x) and h(X) is any function of X, then $$E(h(X))=\int_{-\infty}^{\infty}h(x)\cdot f(x)\,dx$$ Step 3 b) We recall the definition of the cumulative distribution function. Definition: The cumulative distribution function F(x) for a continuous rv X is defined for every number x by $$F(x)=P(X\leq x)=\int_{-\infty}^{x}f(y)\,dy$$ The pdf f(x) is given as: $$f(x)=\begin{cases}0.08 & 7.5<x<20\\ 0 & \text{otherwise}\end{cases}$$ For any number x between 7.5 and 20, $$F(x)=\int_{7.5}^{x}(0.08)\,dy$$
$$\displaystyle={\left({0.08}\right)}{\int_{{{7.5}}}^{{{x}}}}{\left.{d}{y}\right.}$$
$$\displaystyle={0.08}{{\left[{y}\right]}_{{{7.5}}}^{{{x}}}}$$
$$\displaystyle={0.08}{\left[{x}-{7.5}\right]}$$
$$F(x)=0.08x-0.6$$ Thus F(x) can be given as: $$F(x)=\begin{cases}0 & x<7.5\\ 0.08x-0.6 & 7.5\leq x\leq 20\\ 1 & x>20\end{cases}$$ Step 4 c) The probability that the observed depth is at most 10 is denoted by $$P(X\leq 10)$$. Using the cdf from part (b), we can write: $$P(X\leq 10)=F(10)$$
$$\displaystyle={0.08}{\left({10}\right)}-{0.6}$$
$$P(X\leq 10)=0.2$$ The probability that the observed depth is between 10 and 15 is denoted by $$P(10<X<15)$$. Using the cdf again: $$P(10<X<15)=F(15)-F(10)$$
$$\displaystyle={\left[{0.08}{\left({15}\right)}-{0.6}\right]}-{\left[{0.08}{\left({10}\right)}-{0.6}\right]}$$
$$\displaystyle={\left({0.6}\right)}-{\left({0.2}\right)}$$
$$P(10<X<15)=0.4$$ Proposition: Let X be a continuous rv with pdf f(x) and cdf F(x). Then for any number a, $$P(X\leq a)=F(a)$$ Proposition: Let X be a continuous rv with pdf f(x) and cdf F(x). Then for any numbers a and b with $$a<b$$,
$$P(a\leq X\leq b)=F(b)-F(a)$$ Step 5 d) From part (a), the mean and variance of the distribution are: $$E(X)=\mu=13.75$$
$$V(X)\approx 13.02$$ The standard deviation $$\sigma_{X}$$ can be written as: $$\sigma_{X}=\sqrt{V(X)}$$
$$=\sqrt{13.02}$$
$$\sigma_{X}\approx 3.608$$ The probability that the observed depth is within 1 standard deviation of the mean value is denoted by $$P(\mu-\sigma_{X}<X<\mu+\sigma_{X})$$
$$P(\mu-\sigma_{X}<X<\mu+\sigma_{X})=P(10.142<X<17.358)$$
$$=F(17.358)-F(10.142)$$
$$=\left[0.08(17.358)-0.6\right]-\left[0.08(10.142)-0.6\right]$$
$$=(0.7886)-(0.2114)$$
$$P(\mu-\sigma_{X}<X<\mu+\sigma_{X})\approx 0.577$$ The probability that the observed depth is within 2 standard deviations of the mean value is denoted by $$P(\mu-2\sigma_{X}<X<\mu+2\sigma_{X})$$
$$P(\mu-2\sigma_{X}<X<\mu+2\sigma_{X})=P(6.534<X<20.966)$$
$$=F(20.966)-F(6.534)$$
$$=1-0$$
$$P(\mu-2\sigma_{X}<X<\mu+2\sigma_{X})=1$$ The result is 1 because the interval (6.534, 20.966) contains the entire support (7.5, 20) of the distribution.
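
The four parts of the solution can be cross-checked numerically. The following is a minimal Python sketch, not part of the original solution. It assumes the standard closed forms for a uniform distribution on (a, b), namely mean (a + b)/2 and variance (b − a)²/12, which the solution above derives by integration instead. The names `uniform_cdf`, `A`, and `B` are introduced here for illustration only.

```python
from math import sqrt

# Endpoints of the uniform interval for depth (cm), taken from the problem.
A, B = 7.5, 20.0


def uniform_cdf(x, a=A, b=B):
    """cdf of the uniform distribution on (a, b)."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)


# Part (a): closed-form mean and variance (assumed standard formulas).
mean = (A + B) / 2
variance = (B - A) ** 2 / 12
sd = sqrt(variance)

# Part (c): probabilities read off the cdf.
p_at_most_10 = uniform_cdf(10)
p_10_to_15 = uniform_cdf(15) - uniform_cdf(10)

# Part (d): probability of falling within k standard deviations of the mean.
p_within_1sd = uniform_cdf(mean + sd) - uniform_cdf(mean - sd)
p_within_2sd = uniform_cdf(mean + 2 * sd) - uniform_cdf(mean - 2 * sd)

print(f"mean = {mean:.4f}, variance = {variance:.4f}, sd = {sd:.4f}")
print(f"P(X <= 10) = {p_at_most_10:.4f}")
print(f"P(10 < X < 15) = {p_10_to_15:.4f}")
print(f"P(within 1 sd) = {p_within_1sd:.4f}")
print(f"P(within 2 sd) = {p_within_2sd:.4f}")
```

Because the cdf is linear on the support, the probability of lying within one standard deviation is simply 2σ/(b − a), which does not depend on the endpoints of the interval.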

### Relevant Questions

The article "Modeling Sediment and Water Column Interactions for Hydrophobic Pollutants" (Water Research, 1984: 1169-1174) suggests the uniform distribution on the interval (7.5, 20) as a model for depth (cm) of the bioturbation layer in sediment in a certain region.
a. What are the mean and variance of depth?
b. What is the cdf of depth?
c. What is the probability that observed depth is at most 10? Between 10 and 15?
d. What is the probability that the observed depth is within 1 standard deviation of the mean value? Within 2 standard deviations?
The article “Stochastic Modeling for Pavement Warranty Cost Estimation” (J. of Constr. Engr. and Mgmnt., 2009: 352–359) proposes the following model for the distribution of Y = time to pavement failure. Let $$X_{1}$$ be the time to failure due to rutting, and $$X_{2}$$ be the time to failure due to transverse cracking; these two rvs are assumed independent. Then $$Y=\min(X_{1},X_{2})$$. The probability of failure due to either one of these distress modes is assumed to be an increasing function of time t. After making certain distributional assumptions, the following form of the cdf for each mode is obtained: $$\Phi\left[\frac{a+bt}{\left(c+dt+et^{2}\right)^{1/2}}\right]$$ where $$\Phi$$ is the standard normal cdf. Values of the five parameters a, b, c, d, and e are -25.49, 1.15, 4.45, -1.78, and .171 for cracking and -21.27, .0325, .972, -.00028, and .00022 for rutting. Determine the probability of pavement failure within $$t=5$$ years and also $$t=10$$ years.
M. F. Driscoll and N. A. Weiss discussed the modeling and solution of problems concerning motel reservation networks in “An Application of Queuing Theory to Reservation Networks” (TIMS, Vol. 22, No. 5, pp. 540–546). They defined a Type 1 call to be a call from a motel’s computer terminal to the national reservation center. For a certain motel, the number, X, of Type 1 calls per hour has a Poisson distribution with parameter $$\displaystyle\lambda={1.7}$$.
Determine the probability that the number of Type 1 calls made from this motel during a period of 1 hour will be:
a) exactly one.
b) at most two.
c) at least two.
(Hint: Use the complementation rule.)
d. Find and interpret the mean of the random variable X.
e. Determine the standard deviation of X.
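
For the Poisson question above, the requested quantities follow directly from the Poisson pmf, P(X = k) = e^(−λ) λ^k / k!. The sketch below is an illustration added here, not part of the original page. The helper name `poisson_pmf` is hypothetical, and it relies on the standard facts that a Poisson random variable has mean λ and standard deviation √λ.

```python
from math import exp, factorial, sqrt

# Rate of Type 1 calls per hour, as stated in the question.
LAMBDA = 1.7


def poisson_pmf(k, lam=LAMBDA):
    """P(X = k) for a Poisson random variable with parameter lam."""
    return exp(-lam) * lam ** k / factorial(k)


# a) exactly one call
p_exactly_one = poisson_pmf(1)

# b) at most two calls: P(X = 0) + P(X = 1) + P(X = 2)
p_at_most_two = sum(poisson_pmf(k) for k in range(3))

# c) at least two calls, via the complementation rule: 1 - P(X <= 1)
p_at_least_two = 1 - (poisson_pmf(0) + poisson_pmf(1))

# d), e) mean and standard deviation of a Poisson rv (standard results)
poisson_mean = LAMBDA
poisson_sd = sqrt(LAMBDA)

print(f"P(X = 1)  = {p_exactly_one:.4f}")
print(f"P(X <= 2) = {p_at_most_two:.4f}")
print(f"P(X >= 2) = {p_at_least_two:.4f}")
print(f"mean = {poisson_mean}, sd = {poisson_sd:.4f}")
```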
True or False
1. The goal of descriptive statistics is to simplify, summarize, and organize data.
2. A summary value, usually numerical, that describes a sample is called a parameter.
3. A researcher records the average age for a group of 25 preschool children selected to participate in a research study. The average age is an example of a statistic.
4. The median is the most commonly used measure of central tendency.
5. The mode is the best way to measure central tendency for data from a nominal scale of measurement.
6. A distribution of scores has a mean of 55 and a standard deviation of 4. The variance for this distribution is 16.
7. In a distribution with a mean of M = 36 and a standard deviation of SD = 8, a score of 40 would be considered an extreme value.
8. In a distribution with a mean of M = 76 and a standard deviation of SD = 7, a score of 91 would be considered an extreme value.
9. A negative correlation means that as the X values decrease, the Y values also tend to decrease.
10. The goal of a hypothesis test is to demonstrate that the patterns observed in the sample data represent real patterns in the population and are not simply due to chance or sampling error.
$$\begin{array}{|c|c|c|c|} \hline & & \text{Housework Hours} & \\ \hline \text{Gender} & \text{Sample Size} & \text{Mean} & \text{Standard Deviation} \\ \hline \text{Women} & 473 & 33.1 & 14.2 \\ \hline \text{Men} & 488 & 18.6 & 15.7 \\ \hline \end{array}$$
a. Based on this study, calculate how many more hours per week, on the average, women spend on housework than men.
b. Find the standard error for comparing the means. What factor causes the standard error to be small compared to the sample standard deviations for the two groups? The ____ cause the standard error to be small compared to the sample standard deviations for the two groups.
c. Calculate the 95% confidence interval comparing the population means for women and men. Interpret the result, including the relevance of 0 being within the interval or not. The 95% confidence interval for $$(\mu_{W}-\mu_{M})$$ is: (Round to two decimal places as needed.) The values in the 95% confidence interval (are less than 0 / are greater than 0 / include 0), which implies that the population mean for women (could be the same as / is less than / is greater than) the population mean for men.
d. State the assumptions upon which the interval in part c is based. Select all that apply.
A. The standard deviations of the two populations are approximately equal.
B. The population distribution for each group is approximately normal.
C. The samples from the two groups are independent.
D. The samples from the two groups are random.
1. Find each of the requested values for a population with a mean of $$\mu = 40$$ and a standard deviation of $$\sigma = 8$$.
A. What is the z-score corresponding to $$X = 52$$?
B. What is the X value corresponding to $$z = -0.50$$?
C. If all of the scores in the population are transformed into z-scores, what will be the values for the mean and standard deviation for the complete set of z-scores?
D. What is the z-score corresponding to a sample mean of $$M = 42$$ for a sample of $$n = 4$$ scores?
E. What is the z-score corresponding to a sample mean of $$M = 42$$ for a sample of $$n = 6$$ scores?
2. True or false:
a. All normal distributions are symmetrical.
b. All normal distributions have a mean of 1.0.
c. All normal distributions have a standard deviation of 1.0.
d. The total area under the curve of all normal distributions is equal to 1.
3. Interpret the location, direction, and distance (near or far) of the following z-scores: a. $$-2.00$$ b. $$1.25$$ c. $$3.50$$ d. $$-0.34$$
4. You are part of a trivia team and have tracked your team’s performance since you started playing, so you know that your scores are normally distributed with $$\mu = 78$$ and $$\sigma = 12$$. Recently, a new person joined the team, and you think the scores have gotten better. Use hypothesis testing to see if the average score has improved based on the following 8 weeks’ worth of score data: $$82, 74, 62, 68, 79, 94, 90, 81, 80$$.
5. You get hired as a server at a local restaurant, and the manager tells you that servers’ tips are $42 on average but vary about $12 ($$\mu = 42, \sigma = 12$$). You decide to track your tips to see if you make a different amount, but because this is your first job as a server, you don’t know if you will make more or less in tips. After working 16 shifts, you find that your average nightly amount is $44.50 from tips. Test for a difference between this value and the population mean at the $$\alpha = 0.05$$ level of significance.
The article “Anodic Fenton Treatment of Treflan MTF” describes a two-factor experiment designed to study the sorption of the herbicide trifluralin. The factors are the initial trifluralin concentration and the $$\mathrm{Fe}^{2+} : \mathrm{H}_{2}\mathrm{O}_{2}$$ delivery ratio. There were three replications for each treatment. The results presented in the following table are consistent with the means and standard deviations reported in the article.
$$\begin{array}{|c|c|ccc|} \hline \text{Initial Concentration (M)} & \text{Delivery Ratio} & & \text{Sorption (\%)} & \\ \hline 15 & 1:0 & 10.90 & 8.47 & 12.43 \\ \hline 15 & 1:1 & 3.33 & 2.40 & 2.67 \\ \hline 15 & 1:5 & 0.79 & 0.76 & 0.84 \\ \hline 15 & 1:10 & 0.54 & 0.69 & 0.57 \\ \hline 40 & 1:0 & 6.84 & 7.68 & 6.79 \\ \hline 40 & 1:1 & 1.72 & 1.55 & 1.82 \\ \hline 40 & 1:5 & 0.68 & 0.83 & 0.89 \\ \hline 40 & 1:10 & 0.58 & 1.13 & 1.28 \\ \hline 100 & 1:0 & 6.61 & 6.66 & 7.43 \\ \hline 100 & 1:1 & 1.25 & 1.46 & 1.49 \\ \hline 100 & 1:5 & 1.17 & 1.27 & 1.16 \\ \hline 100 & 1:10 & 0.93 & 0.67 & 0.80 \\ \hline \end{array}$$
a) Estimate all main effects and interactions.
b) Construct an ANOVA table. You may give ranges for the P-values.
c) Is the additive model plausible? Provide the value of the test statistic, its null distribution, and the P-value.
An automobile tire manufacturer collected the data in the table relating tire pressure x​ (in pounds per square​ inch) and mileage​ (in thousands of​ miles). A mathematical model for the data is given by
$$f(x)=-0.554x^{2}+35.5x-514.$$
$$\begin{array}{|c|c|} \hline x & Mileage \\ \hline 28 & 45 \\ \hline 30 & 51\\ \hline 32 & 56\\ \hline 34 & 50\\ \hline 36 & 46\\ \hline \end{array}$$
​(A) Complete the table below.
$$\begin{array}{|c|c|c|} \hline x & Mileage & f(x) \\ \hline 28 & 45 & \\ \hline 30 & 51 & \\ \hline 32 & 56 & \\ \hline 34 & 50 & \\ \hline 36 & 46 & \\ \hline \end{array}$$
​(Round to one decimal place as​ needed.)
A. A coordinate system has a horizontal x-axis labeled from 20 to 60 in increments of 2 and a vertical y-axis labeled from 20 to 60 in increments of 2. Data points are plotted at (28,45), (30,51), (32,56), (34,50), and (36,46). A parabola opens downward and passes through the points (28,45.7), (30,52.4), (32,54.7), (34,52.6), and (36,46.0). All points are approximate.
B. A coordinate system has a horizontal x-axis labeled from 20 to 60 in increments of 2 and a vertical y-axis labeled from 20 to 60 in increments of 2. Data points are plotted at (43,30), (45,36), (47,41), (49,35), and (51,31). A parabola opens downward and passes through the points (43,30.7), (45,37.4), (47,39.7), (49,37.6), and (51,31). All points are approximate.
C. A coordinate system has a horizontal x-axis labeled from 20 to 60 in increments of 2 and a vertical y-axis labeled from 20 to 60 in increments of 2. Data points are plotted at (43,45), (45,51), (47,56), (49,50), and (51,46). A parabola opens downward and passes through the points (43,45.7), (45,52.4), (47,54.7), (49,52.6), and (51,46.0). All points are approximate.
D. A coordinate system has a horizontal x-axis labeled from 20 to 60 in increments of 2 and a vertical y-axis labeled from 20 to 60 in increments of 2. Data points are plotted at (28,30), (30,36), (32,41), (34,35), and (36,31). A parabola opens downward and passes through the points (28,30.7), (30,37.4), (32,39.7), (34,37.6), and (36,31). All points are approximate.
​(C) Use the modeling function​ f(x) to estimate the mileage for a tire pressure of 29
lbs/sq in. and for 35 lbs/sq in.
The mileage for the tire pressure 29 lbs/sq in. is
The mileage for the tire pressure 35 lbs/sq in. is
(Round to two decimal places as​ needed.)
(D) Write a brief description of the relationship between tire pressure and mileage.
A. As tire pressure​ increases, mileage decreases to a minimum at a certain tire​ pressure, then begins to increase.
B. As tire pressure​ increases, mileage decreases.
C. As tire pressure​ increases, mileage increases to a maximum at a certain tire​ pressure, then begins to decrease.
D. As tire pressure​ increases, mileage increases.
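
The quadratic model in the tire question can be evaluated directly at each tabulated pressure and at the two extra pressures asked about in part (C). The short Python sketch below is an illustration added here; the variable names are hypothetical, while the coefficients and the observed data come from the question itself.

```python
def f(x):
    """Mileage model (thousands of miles) for tire pressure x (psi)."""
    return -0.554 * x ** 2 + 35.5 * x - 514


# Observed data from the question's table.
pressures = [28, 30, 32, 34, 36]
observed = [45, 51, 56, 50, 46]

# Part (A): model value next to each observation, rounded to one decimal.
table = [(x, m, round(f(x), 1)) for x, m in zip(pressures, observed)]
for x, m, fx in table:
    print(f"x = {x}, mileage = {m}, f(x) = {fx}")

# Part (C): model estimates at 29 and 35 psi, rounded to two decimals.
estimate_29 = round(f(29), 2)
estimate_35 = round(f(35), 2)
print(f"f(29) = {estimate_29}, f(35) = {estimate_35}")
```

The model values rise up to about x = 32 and fall afterwards, which is the downward-opening parabola described in the graph options.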
a) Find a $$\displaystyle{95}\%$$ confidence interval for the improvement in traffic flow due to the new system.
b) Find a $$\displaystyle{98}\%$$ confidence interval for the improvement in traffic flow due to the new system.
d) Approximately what sample size is needed so that a $$95\%$$ confidence interval will specify the mean to within $$\pm 50$$ vehicles per hour?
e) Approximately what sample size is needed so that a $$98\%$$ confidence interval will specify the mean to within $$\pm 50$$ vehicles per hour?