# An experiment designed to study the relationship between hypertension and cigarette smoking yielded the following data.

Question

An experiment designed to study the relationship between hypertension and cigarette smoking yielded the following data.
$$\begin{array}{|c|c|c|c|} \hline \text{Tension level} & \text{Non-smoker} & \text{Moderate smoker} & \text{Heavy smoker} \\ \hline \text{Hypertension} & 20 & 38 & 28 \\ \hline \text{No hypertension} & 50 & 27 & 18 \\ \hline \end{array}$$
Test the hypothesis that whether or not an individual has hypertension is independent of how much that person smokes.

2021-03-03

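Under the null hypothesis that hypertension and smoking status are independent, each expected frequency is (row total × column total) / grand total. Here the row totals are 86 and 95, the column totals are 70, 65 and 46, and the grand total is 181. Rounded to the nearest integer, the expected table is
$$\begin{array}{|c|c|c|c|} \hline \text{Tension level} & \text{Non-smoker} & \text{Moderate smoker} & \text{Heavy smoker} \\ \hline \text{Hypertension} & 33 & 31 & 22 \\ \hline \text{No hypertension} & 37 & 34 & 24 \\ \hline \end{array}$$
The test statistic $$T = \sum (O - E)^2 / E$$, computed from the unrounded expected frequencies, is 16.486. Under the null hypothesis, T asymptotically follows a chi-squared distribution with $$(2-1)(3-1) = 2$$ degrees of freedom. This yields a p-value of 0.00026, so we reject the null hypothesis and conclude that hypertension is not independent of smoking level.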
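
The calculation can be checked with a short script. The following is a minimal sketch in plain Python, not the answerer's own working: the function name `chi_square_independence` is an illustrative choice, and the p-value line relies on the fact that a chi-squared variable with exactly 2 degrees of freedom has upper-tail probability exp(-x/2), so it would not carry over to tables of other sizes.

```python
import math

# Observed counts from the question.
# Rows: hypertension, no hypertension.
# Columns: non-smoker, moderate smoker, heavy smoker.
observed = [
    [20, 38, 28],
    [50, 27, 18],
]


def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


statistic, dof, expected = chi_square_independence(observed)

# Assumption: dof == 2, where the chi-squared upper tail is exactly exp(-x / 2).
p_value = math.exp(-statistic / 2)

print(round(statistic, 3), dof, round(p_value, 5))
```

Using the unrounded expected counts, this should reproduce the statistic of about 16.486 and the p-value of about 0.00026 quoted above. Plugging the rounded expected table into the same formula gives a slightly smaller statistic, which is why the rounding is for display only.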

### Relevant Questions

In an experiment designed to study the effects of illumination level on task performance (“Performance of Complex Tasks Under Different Levels of Illumination,” J. Illuminating Eng., 1976: 235–242), subjects were required to insert a fine-tipped probe into the eyeholes of ten needles in rapid succession both for a low light level with a black background and a higher level with a white background. Each data value is the time (sec) required to complete the task.
$$\begin{array}{|c|c|c|c|c|c|c|c|c|c|} \hline \text{Subject} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ \hline \text{Black} & 25.85 & 28.84 & 32.05 & 25.74 & 20.89 & 41.05 & 25.01 & 24.96 & 27.47 \\ \hline \text{White} & 18.28 & 20.84 & 22.96 & 19.68 & 19.509 & 24.98 & 16.61 & 16.07 & 24.59 \\ \hline \end{array}$$
Does the data indicate that the higher level of illumination yields a decrease of more than 5 sec in true average task completion time? Test the appropriate hypotheses using the P-value approach.

1) A review of voter registration records in a small town yielded the following data on the number of males and females registered as Democrat, Republican, or some other affiliation:

$$\begin{array}{c|cc} & \text{Gender} & \\ \hline \text{Affiliation} & \text{Male} & \text{Female} \\ \hline \text{Democrat} & 300 & 600 \\ \text{Republican} & 500 & 300 \\ \text{Other} & 200 & 100 \\ \hline \end{array}$$

What proportion of all voters is male and registered as a Democrat?

2) A survey was conducted involving 303 subjects concerning their preferences with respect to the size of car they would consider purchasing. The following table shows the counts of the responses by gender of the respondents:

$$\begin{array}{c|cccc} & & \text{Size of car} & & \\ \hline \text{Gender} & \text{Small} & \text{Medium} & \text{Large} & \text{Total} \\ \hline \text{Female} & 58 & 63 & 17 & 138 \\ \text{Male} & 79 & 61 & 25 & 165 \\ \text{Total} & 137 & 124 & 42 & 303 \\ \hline \end{array}$$

The data are to be summarized by constructing marginal distributions. In the marginal distribution for car size, what is the entry for medium cars?

An analysis of laboratory data collected with the goal of modeling the weight (in grams) of a bacterial culture after several hours of growth produced the least squares regression line $$\log(\text{weight}) = 0.25 + 0.61 \times \text{hours}$$. Estimate the weight of the culture after 3 hours.

A) 0.32 g

B) 2.08 g

C) 8.0 g

D) 67.9 g

E) 120.2 g

The tables show the battery lives (in hours) of two brands of laptops. a) Make a double box-and-whisker plot that represents the data. b) Identify the shape of each distribution. c) Which brand's battery lives are more spread out? Explain. d) Compare the distributions using their shapes and appropriate measures of center and variation.

Gastroenterology
We present data relating protein concentration to pancreatic function as measured by trypsin secretion among patients with cystic fibrosis.
If we do not want to assume normality for these distributions, then what statistical procedure can be used to compare the three groups?
Perform the test mentioned in Problem 12.42 and report a p-value. How do your results compare with a parametric analysis of the data?
Relationship between protein concentration $$(mg/mL)$$ of duodenal secretions and pancreatic function as measured by trypsin secretion $$[U/(kg/hr)]$$:
Trypsin secretion $$[U/(kg/hr)]$$
$$\leq\ 50$$
$$\begin{array}{|c|c|}\hline \text{Subject number} & \text{Protein concentration} \\ \hline 1 & 1.7 \\ \hline 2 & 2.0 \\ \hline 3 & 2.0 \\ \hline 4 & 2.2 \\ \hline 5 & 4.0 \\ \hline 6 & 4.0 \\ \hline 7 & 5.0 \\ \hline 8 & 6.7 \\ \hline 9 & 7.8 \\ \hline \end{array}$$
$$51\ -\ 1000$$
$$\begin{array}{|c|c|}\hline \text{Subject number} & \text{Protein concentration} \\ \hline 1 & 1.4 \\ \hline 2 & 2.4 \\ \hline 3 & 2.4 \\ \hline 4 & 3.3 \\ \hline 5 & 4.4 \\ \hline 6 & 4.7 \\ \hline 7 & 6.7 \\ \hline 8 & 7.9 \\ \hline 9 & 9.5 \\ \hline 10 & 11.7 \\ \hline \end{array}$$
$$>\ 1000$$
$$\begin{array}{|c|c|}\hline \text{Subject number} & \text{Protein concentration} \\ \hline 1 & 2.9 \\ \hline 2 & 3.8 \\ \hline 3 & 4.4 \\ \hline 4 & 4.7 \\ \hline 5 & 5.5 \\ \hline 6 & 5.6 \\ \hline 7 & 7.4 \\ \hline 8 & 9.4 \\ \hline 9 & 10.3 \\ \hline \end{array}$$

Use the table from the Theoretical Distribution section to calculate the following answers. Round your answers to four decimal places.
$$P(x = 3) = ?$$
$$P(1 < x < 4) = ?$$
$$P(x \geq 8) = ?$$

Use the data from the Organize the Data section to calculate the following answers. Round your answers to four decimal places.
$$RF(x = 3) = ?$$
$$RF(1 < x < 4) = ?$$
$$RF(x \geq 8) = ?$$

Discussion Questions
1. Knowing that data vary, describe three similarities between the graphs and distributions of the theoretical, empirical, and simulation distributions. Use complete sentences.

Determine which of the following functions $$f(x) = cx,\ g(x) = cx^{2},\ h(x) = c\sqrt{|x|},\ \text{and}\ r(x) = \frac{c}{x}$$ can be used to model the data and determine the value of the constant c that will make the function fit the data in the table. $$\begin{array}{|c|c|c|c|c|c|} \hline x & -4 & -1 & 0 & 1 & 4 \\ \hline y & -32 & -2 & 0 & -2 & -32 \\ \hline \end{array}$$

The following table shows the average yearly tuition and required fees, in thousands of dollars, charged by a certain private university in the school year beginning in the given year.
$$\begin{array}{|c|c|}\hline \text{Year} & \text{Average tuition} \\ \hline 2005 & 17.6 \\ \hline 2007 & 18.1 \\ \hline 2009 & 19.5 \\ \hline 2011 & 20.7 \\ \hline 2013 & 21.8 \\ \hline \end{array}$$
What prediction does the formula modeling this data give for average yearly tuition and required fees for the university for the academic year beginning in 2019?

The measure of the supplement of an angle is $$40^{\circ}$$ more than three times the measure of the original angle. Find the measures of the angles. Instructions: Use the statement "Let the original angle be x" to begin modeling the working of this question.

a) Write algebraic expressions in terms of x for the following:

I) $$40^{\circ}$$ more than three times the measure of the original angle

II) The measure of the Supplement angle in terms of the original angle, x

b) Write an algebraic equation in x equating I) and II) in a)

c) Hence solve the algebraic equation in b) and find the measures of the angles.

According to the article “Modeling and Predicting the Effects of Submerged Arc Weldment Process Parameters on Weldment Characteristics and Shape Profiles” (J. of Engr. Manuf., 2012: 1230–1240), the submerged arc welding (SAW) process is commonly used for joining thick plates and pipes. The heat affected zone (HAZ), a band created within the base metal during welding, was of particular interest to the investigators. Here are observations on depth (mm) of the HAZ both when the current setting was high and when it was lower.
$$\begin{array}{l|ccccccccc} \text{Non-high} & 1.04 & 1.15 & 1.23 & 1.69 & 1.92 & 1.98 & 2.36 & 2.49 & 2.72 \\ & 1.37 & 1.43 & 1.57 & 1.71 & 1.94 & 2.06 & 2.55 & 2.64 & 2.82 \\ \hline \text{High} & 1.55 & 2.02 & 2.02 & 2.05 & 2.35 & 2.57 & 2.93 & 2.94 & 2.97 \\ \end{array}$$
c. Does it appear that true average HAZ depth is larger for the higher current condition than for the lower condition? Carry out a test of appropriate hypotheses using a significance level of .01.