# According to the article “Modeling and Predicting the Effects of Submerged Arc Weldment Process Parameters on Weldment Characteristics and Shape Profiles” (J. of Engr. Manuf., 2012: 1230–1240), the submerged arc welding (SAW) process is commonly used for joining thick plates and pipes. The heat affected zone (HAZ), a band created within the base metal during welding, was of particular interest to the investigators. Here are observations on depth (mm) of the HAZ both when the current setting was high and when it was lower. begin{matrix} Non-high & 1.04 & 1.15 & 1.23 & 1.69 & 1.92 & 1.98 & 2.36 & 2.49 & 2.72 & 1.37 & 1.43 & 1.57 & 1.71 & 1.94 & 2.06 & 2.55 & 2.64 & 2.82 High & 1.55 & 2.02 & 2.02 & 2.05 & 2.35 & 2.57 & 2.93 & 2.94 & 2.97 end{matrix} c. Does it appear that true average

Question
Modeling data distributions
According to the article “Modeling and Predicting the Effects of Submerged Arc Weldment Process Parameters on Weldment Characteristics and Shape Profiles” (J. of Engr. Manuf., 2012: 1230–1240), the submerged arc welding (SAW) process is commonly used for joining thick plates and pipes. The heat affected zone (HAZ), a band created within the base metal during welding, was of particular interest to the investigators. Here are observations on depth (mm) of the HAZ both when the current setting was high and when it was lower. $$\begin{matrix} \text{Non-high} & 1.04 & 1.15 & 1.23 & 1.69 & 1.92 & 1.98 & 2.36 & 2.49 & 2.72 \\ & 1.37 & 1.43 & 1.57 & 1.71 & 1.94 & 2.06 & 2.55 & 2.64 & 2.82 \\ \text{High} & 1.55 & 2.02 & 2.02 & 2.05 & 2.35 & 2.57 & 2.93 & 2.94 & 2.97 \\ \end{matrix}$$ c. Does it appear that true average HAZ depth is larger for the higher current condition than for the lower condition? Carry out a test of appropriate hypotheses using a significance level of .01.

2020-11-25
Step 1 Given: $$n_{1}=18$$ (non-high current), $$n_{2}=9$$ (high current), $$\alpha=0.01$$
The mean is the sum of all values divided by the number of values: $$\overline{x}_{1}=\frac{1.04+1.15+1.23+\ldots+2.55+2.64+2.82}{18}\approx 1.9261$$
$$\overline{x}_{2}=\frac{1.55+2.02+2.02+\ldots+2.93+2.94+2.97}{9}\approx 2.3778$$
The variance is the sum of squared deviations from the mean divided by $$n-1$$. The standard deviation is the square root of the variance: $$s_{1}=\sqrt{\frac{(1.04-1.9261)^{2}+\ldots+(2.82-1.9261)^{2}}{18-1}}\approx 0.5694$$
$$s_{2}=\sqrt{\frac{(1.55-2.3778)^{2}+\ldots+(2.97-2.3778)^{2}}{9-1}}\approx 0.5072$$
Claim: the true average HAZ depth is larger for the higher current condition, that is, $$\mu_{1}<\mu_{2}$$. The claim is either the null hypothesis or the alternative hypothesis, and the two hypotheses state the opposite of each other. The null hypothesis is the one that contains the equality, so the claim is the alternative hypothesis: $$H_{0}:\mu_{1}=\mu_{2}$$
$$H_{a}:\mu_{1}<\mu_{2}$$
Step 2 Determine the test statistic (two-sample t with unpooled variances): $$t=\frac{\overline{x}_{1}-\overline{x}_{2}}{\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}}}=\frac{1.9261-2.3778}{\sqrt{\frac{0.5694^{2}}{18}+\frac{0.5072^{2}}{9}}}\approx -2.093$$
Determine the degrees of freedom (rounded down to the nearest integer): $$\nu=\frac{\left(\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}\right)^{2}}{\frac{\left(s_{1}^{2}/n_{1}\right)^{2}}{n_{1}-1}+\frac{\left(s_{2}^{2}/n_{2}\right)^{2}}{n_{2}-1}}=\frac{\left(\frac{0.5694^{2}}{18}+\frac{0.5072^{2}}{9}\right)^{2}}{\frac{\left(0.5694^{2}/18\right)^{2}}{18-1}+\frac{\left(0.5072^{2}/9\right)^{2}}{9-1}}\approx 17.9$$ which rounds down to $$df=17$$.
The P-value is the probability of obtaining the observed value of the test statistic, or a value more extreme, when the null hypothesis is true. From the Student's t table in the appendix, read the row $$df=17$$: the value $$|t|=2.093$$ lies between the critical values 1.740 (one-tailed area 0.05) and 2.110 (one-tailed area 0.025), so $$0.025<P<0.05$$
If the P-value is less than or equal to the significance level, then the null hypothesis is rejected: $$P>0.01\Rightarrow$$ Fail to reject $$H_{0}$$
There is not sufficient evidence to support the claim that the true average HAZ depth is larger for the higher current condition than for the lower condition.
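The hand calculation above can be cross-checked with a short script. This is a minimal sketch using only the Python standard library, not part of the original answer: the helper names `welch_t` and `t_lower_tail` are made up for illustration, and the lower-tail probability is approximated by Simpson's rule on the t density rather than read from a table.

```python
import math
from statistics import mean, stdev

# HAZ depth (mm), copied from the question
non_high = [1.04, 1.15, 1.23, 1.69, 1.92, 1.98, 2.36, 2.49, 2.72,
            1.37, 1.43, 1.57, 1.71, 1.94, 2.06, 2.55, 2.64, 2.82]
high = [1.55, 2.02, 2.02, 2.05, 2.35, 2.57, 2.93, 2.94, 2.97]


def welch_t(sample1, sample2):
    """Return (t, df) for the unpooled two-sample t test (hypothetical helper)."""
    n1, n2 = len(sample1), len(sample2)
    v1 = stdev(sample1) ** 2 / n1  # s1^2 / n1
    v2 = stdev(sample2) ** 2 / n2  # s2^2 / n2
    t = (mean(sample1) - mean(sample2)) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


def t_lower_tail(t, df, steps=2000):
    """Approximate P(T <= t) for Student's t with Simpson's rule (steps must be even)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    a = abs(t)
    h = a / steps
    area = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    central = area * h / 3  # P(0 <= T <= |t|)
    return 0.5 - central if t < 0 else 0.5 + central


t_stat, df_welch = welch_t(non_high, high)
df_table = math.floor(df_welch)           # rounded down, as in the solution
p_value = t_lower_tail(t_stat, df_table)  # lower tail because H_a: mu1 < mu2
alpha = 0.01
reject_h0 = p_value <= alpha

print(f"t = {t_stat:.3f}, df = {df_table}, P = {p_value:.4f}, reject H0: {reject_h0}")
```

The script uses the floored degrees of freedom so that its P-value is comparable with the table lookup; using the unrounded Welch value instead changes the P-value only slightly and does not change the decision at the .01 level.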

### Relevant Questions

In an experiment designed to study the effects of illumination level on task performance (“Performance of Complex Tasks Under Different Levels of Illumination,” J. Illuminating Eng., 1976: 235–242), subjects were required to insert a fine-tipped probe into the eyeholes of ten needles in rapid succession both for a low light level with a black background and a higher level with a white background. Each data value is the time (sec) required to complete the task.
$$\begin{array}{|c|c|c|c|c|c|c|c|c|c|} \hline \text{Subject} & (1) & (2) & (3) & (4) & (5) & (6) & (7) & (8) & (9) \\ \hline \text{Black} & 25.85 & 28.84 & 32.05 & 25.74 & 20.89 & 41.05 & 25.01 & 24.96 & 27.47 \\ \hline \text{White} & 18.28 & 20.84 & 22.96 & 19.68 & 19.509 & 24.98 & 16.61 & 16.07 & 24.59 \\ \hline \end{array}$$
Does the data indicate that the higher level of illumination yields a decrease of more than 5 sec in true average task completion time? Test the appropriate hypotheses using the P-value approach.
In an experiment designed to study the effects of illumination level on task performance (“Performance of Complex Tasks Under Different Levels of Illumination,” J. Illuminating Eng., 1976: 235–242), subjects were required to insert a fine-tipped probe into the eyeholes of ten needles in rapid succession both for a low light level with a black background and a higher level with a white background. Each data value is the time (sec) required to complete the task. $$\begin{array}{l|ccccccccc} \hline \text{Subject} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ \hline \text{Black} & 25.85 & 28.84 & 32.05 & 25.74 & 20.89 & 41.05 & 25.01 & 24.96 & 27.47 \\ \hline \text{White} & 18.23 & 20.84 & 22.96 & 19.68 & 19.509 & 24.98 & 16.61 & 16.07 & 24.59 \\ \hline \end{array}$$ Does the data indicate that the higher level of illumination yields a decrease of more than 5 sec in true average task completion time? Test the appropriate hypotheses using the P-value approach.
The article “Anodic Fenton Treatment of Treflan MTF” describes a two-factor experiment designed to study the sorption of the herbicide trifluralin. The factors are the initial trifluralin concentration and the $$\mathrm{Fe}^{2+} : \mathrm{H}_{2}\mathrm{O}_{2}$$ delivery ratio. There were three replications for each treatment. The results presented in the following table are consistent with the means and standard deviations reported in the article. $$\begin{matrix} \text{Initial Concentration (M)} & \text{Delivery Ratio} & \text{Sorption (\%)} \\ 15 & 1:0 & 10.90 \quad 8.47 \quad 12.43 \\ 15 & 1:1 & 3.33 \quad 2.40 \quad 2.67 \\ 15 & 1:5 & 0.79 \quad 0.76 \quad 0.84 \\ 15 & 1:10 & 0.54 \quad 0.69 \quad 0.57 \\ 40 & 1:0 & 6.84 \quad 7.68 \quad 6.79 \\ 40 & 1:1 & 1.72 \quad 1.55 \quad 1.82 \\ 40 & 1:5 & 0.68 \quad 0.83 \quad 0.89 \\ 40 & 1:10 & 0.58 \quad 1.13 \quad 1.28 \\ 100 & 1:0 & 6.61 \quad 6.66 \quad 7.43 \\ 100 & 1:1 & 1.25 \quad 1.46 \quad 1.49 \\ 100 & 1:5 & 1.17 \quad 1.27 \quad 1.16 \\ 100 & 1:10 & 0.93 \quad 0.67 \quad 0.80 \\ \end{matrix}$$ a) Estimate all main effects and interactions. b) Construct an ANOVA table. You may give ranges for the P-values. c) Is the additive model plausible? Provide the value of the test statistic, its null distribution, and the P-value.
A random sample of $$\displaystyle{n}_{{1}}={16}$$ communities in western Kansas gave the following information for people under 25 years of age.
$$\displaystyle{X}_{{1}}:$$ Rate of hay fever per 1000 population for people under 25
$$\begin{array}{|c|c|c|c|c|c|c|c|} \hline 97 & 91 & 121 & 129 & 94 & 123 & 112 & 93 \\ \hline 125 & 95 & 125 & 117 & 97 & 122 & 127 & 88 \\ \hline \end{array}$$
A random sample of $$\displaystyle{n}_{{2}}={14}$$ regions in western Kansas gave the following information for people over 50 years old.
$$\displaystyle{X}_{{2}}:$$ Rate of hay fever per 1000 population for people over 50
$$\begin{array}{|c|c|c|c|c|c|c|} \hline 94 & 109 & 99 & 95 & 113 & 88 & 110 \\ \hline 79 & 115 & 100 & 89 & 114 & 85 & 96 \\ \hline \end{array}$$
(i) Use a calculator to calculate $$\displaystyle\overline{{x}}_{{1}},{s}_{{1}},\overline{{x}}_{{2}},{\quad\text{and}\quad}{s}_{{2}}.$$ (Round your answers to two decimal places.)
(ii) Assume that the hay fever rate in each age group has an approximately normal distribution. Do the data indicate that the age group over 50 has a lower rate of hay fever? Use $$\displaystyle\alpha={0.05}.$$
(a) What is the level of significance?
State the null and alternate hypotheses.
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}<\mu_{{2}}$$
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}>\mu_{{2}}$$
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}\ne\mu_{{2}}$$
$$\displaystyle{H}_{{0}}:\mu_{{1}}>\mu_{{2}},{H}_{{1}}:\mu_{{1}}=\mu_{{2}}$$
(b) What sampling distribution will you use? What assumptions are you making?
The standard normal. We assume that both population distributions are approximately normal with known standard deviations.
The Student's t. We assume that both population distributions are approximately normal with unknown standard deviations.
The standard normal. We assume that both population distributions are approximately normal with unknown standard deviations.
The Student's t. We assume that both population distributions are approximately normal with known standard deviations.
What is the value of the sample test statistic? (Test the difference $$\displaystyle\mu_{{1}}-\mu_{{2}}$$. Round your answer to three decimal places.)
(c) Find (or estimate) the P-value.
P-value $$\displaystyle>{0.250}$$
$$0.125 < \text{P-value} < 0.250$$
$$0.050 < \text{P-value} < 0.125$$
$$0.025 < \text{P-value} < 0.050$$
$$0.005 < \text{P-value} < 0.025$$
P-value $$\displaystyle<{0.005}$$
Sketch the sampling distribution and show the area corresponding to the P-value.
The accompanying data on y = normalized energy $$(\mathrm{J}/\mathrm{m}^{2})$$ and x = intraocular pressure (mmHg) appeared in a scatterplot in the article “Evaluating the Risk of Eye Injuries: Intraocular Pressure During High Speed Projectile Impacts” (Current Eye Research, 2012: 43–49); an estimated regression function was superimposed on the plot.
x 2761 19764 25713 3980 12782 19008 y 1553 14999 32813 1667 8741 16526 x 19028 14397 9606 3905 25731 y 26770 16526 9868 6640 1220 30730
Here is Minitab output from fitting the simple linear regression model. Does the model appear to specify a useful relationship between the two variables?
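$$\begin{array}{lrrrr} \text{Predictor} & \text{Coef} & \text{SE Coef} & \text{T} & \text{P} \\ \text{Constant} & -5090 & 2257 & -2.26 & 0.048 \\ \text{Pressure} & 1.2912 & 0.1347 & 9.59 & 0.000 \\ \end{array}$$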
The article “Stochastic Modeling for Pavement Warranty Cost Estimation” (J. of Constr. Engr. and Mgmnt., 2009: 352–359) proposes the following model for the distribution of Y = time to pavement failure. Let $$X_{1}$$ be the time to failure due to rutting, and $$X_{2}$$ be the time to failure due to transverse cracking; these two rvs are assumed independent. Then $$Y=\min\left(X_{1},X_{2}\right)$$. The probability of failure due to either one of these distress modes is assumed to be an increasing function of time t. After making certain distributional assumptions, the following form of the cdf for each mode is obtained: $$\Phi\left[\frac{a+bt}{\left(c+dt+et^{2}\right)^{1/2}}\right]$$ where $$\Phi$$ is the standard normal cdf. Values of the five parameters a, b, c, d, and e are -25.49, 1.15, 4.45, -1.78, and .171 for cracking and -21.27, .0325, .972, -.00028, and .00022 for rutting. Determine the probability of pavement failure within $$t=5$$ years and also $$t=10$$ years.
1. Find each of the requested values for a population with a mean of $$\mu = 40$$ and a standard deviation of $$\sigma = 8$$.
A. What is the z-score corresponding to $$X = 52$$?
B. What is the X value corresponding to $$z = -0.50$$?
C. If all of the scores in the population are transformed into z-scores, what will be the values for the mean and standard deviation for the complete set of z-scores?
D. What is the z-score corresponding to a sample mean of $$M = 42$$ for a sample of $$n = 4$$ scores?
E. What is the z-score corresponding to a sample mean of $$M = 42$$ for a sample of $$n = 6$$ scores?
2. True or false:
a. All normal distributions are symmetrical.
b. All normal distributions have a mean of 1.0.
c. All normal distributions have a standard deviation of 1.0.
d. The total area under the curve of all normal distributions is equal to 1.
3. Interpret the location, direction, and distance (near or far) of the following z-scores: a. -2.00 b. 1.25 c. 3.50 d. -0.34
4. You are part of a trivia team and have tracked your team’s performance since you started playing, so you know that your scores are normally distributed with $$\mu = 78$$ and $$\sigma = 12$$. Recently, a new person joined the team, and you think the scores have gotten better. Use hypothesis testing to see if the average score has improved based on the following 8 weeks’ worth of score data: 82, 74, 62, 68, 79, 94, 90, 81, 80.
5. You get hired as a server at a local restaurant, and the manager tells you that servers’ tips are $42 on average but vary about $12 ($$\mu = 42, \sigma = 12$$). You decide to track your tips to see if you make a different amount, but because this is your first job as a server, you don’t know if you will make more or less in tips. After working 16 shifts, you find that your average nightly amount is $44.50 from tips. Test for a difference between this value and the population mean at the $$\alpha = 0.05$$ level of significance.
Testing for a Linear Correlation. In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of $$\alpha = 0.05$$. Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.) Lemons and Car Crashes Listed below are annual data for various years. The data are weights (metric tons) of lemons imported from Mexico and U.S. car crash fatality rates per 100,000 population [based on data from “The Trouble with QSAR (or How I Learned to Stop Worrying and Embrace Fallacy),” by Stephen Johnson, Journal of Chemical Information and Modeling, Vol. 48, No. 1]. Is there sufficient evidence to conclude that there is a linear correlation between weights of lemon imports from Mexico and U.S. car fatality rates? Do the results suggest that imported lemons cause car fatalities? $$\begin{matrix} \text{Lemon Imports} & 230 & 265 & 358 & 480 & 530\\ \text{Crash Fatality Rate} & 15.9 & 15.7 & 15.4 & 15.3 & 14.9\\ \end{matrix}$$
American automobiles produced in 2012 and classified as “large” had a mean fuel economy of 19.6 miles per gallon with a standard deviation of 3.36 miles per gallon. A particular model on this list was rated at 23 miles per gallon, giving it a z-score of about 1.01. Which statement is true based on this information?
A) Because the standard deviation is small compared to the mean, a Normal model is appropriate and we can say that about 84.4% of “large” automobiles have a fuel economy of 23 miles per gallon or less.
B) Because a z-score was calculated, it is appropriate to use a Normal model to say that about 84.4% of “large” automobiles have a fuel economy of 23 miles per gallon or less.
C) Because 23 miles per gallon is greater than the mean of 19.6 miles per gallon, the distribution is skewed to the right. This means the z-score cannot be used to calculate a proportion.
D) Because no information was given about the shape of the distribution, it is not appropriate to use the z-score to calculate the proportion of automobiles with a fuel economy of 23 miles per gallon or less.
E) Because no information was given about the shape of the distribution, it is not appropriate to calculate a z-score, so the z-score has no meaning in this situation.
A new thermostat has been engineered for the frozen food cases in large supermarkets. Both the old and new thermostats hold temperatures at an average of $$25^{\circ}F$$. However, it is hoped that the new thermostat might be more dependable in the sense that it will hold temperatures closer to $$25^{\circ}F$$. One frozen food case was equipped with the new thermostat, and a random sample of 21 temperature readings gave a sample variance of 5.1. Another similar frozen food case was equipped with the old thermostat, and a random sample of 19 temperature readings gave a sample variance of 12.8. Test the claim that the population variance of the old thermostat temperature readings is larger than that for the new thermostat. Use a $$5\%$$ level of significance. How could your test conclusion relate to the question regarding the dependability of the temperature readings? (Let population 1 refer to data from the old thermostat.)
(a) What is the level of significance?
State the null and alternate hypotheses.
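$$H_{0}:\sigma_{1}^{2}=\sigma_{2}^{2},\ H_{1}:\sigma_{1}^{2}>\sigma_{2}^{2}$$
$$H_{0}:\sigma_{1}^{2}=\sigma_{2}^{2},\ H_{1}:\sigma_{1}^{2}\neq\sigma_{2}^{2}$$
$$H_{0}:\sigma_{1}^{2}=\sigma_{2}^{2},\ H_{1}:\sigma_{1}^{2}<\sigma_{2}^{2}$$
$$H_{0}:\sigma_{1}^{2}>\sigma_{2}^{2},\ H_{1}:\sigma_{1}^{2}=\sigma_{2}^{2}$$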
(b) Find the value of the sample F statistic. (Round your answer to two decimal places.)
What are the degrees of freedom?
$$df_{N} = ?$$
$$df_{D} = ?$$
What assumptions are you making about the original distribution?
The populations follow independent normal distributions. We have random samples from each population.
The populations follow dependent normal distributions. We have random samples from each population.
The populations follow independent normal distributions.
The populations follow independent chi-square distributions. We have random samples from each population.
(c) Find or estimate the P-value of the sample test statistic. (Round your answer to four decimal places.)
(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis?
At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are statistically significant.
(e) Interpret your conclusion in the context of the application.
Reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
Reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
...