In 1985, neither Florida nor Georgia had laws banning open alcohol containers in vehicle passenger compartments. By 1990, Florida had passed such a law, but Georgia had not. (i) Suppose you can collect random samples of the driving-age population in both states, for 1985 and 1990. Let arrest be a binary variable equal to unity if a person was arrested for drunk driving during the year. Without controlling for any other factors, write down a linear probability model that allows you to test whether the open container law reduced the probability of being arrested for drunk driving. Which coefficient in your model measures the effect of the law? (ii) Why might you want to control for other factors in the model? What might some of these factors be? (iii) Now, suppose that you can only collec

Question
Modeling data distributions
asked 2021-01-10
In 1985, neither Florida nor Georgia had laws banning open alcohol containers in vehicle passenger compartments. By 1990, Florida had passed such a law, but Georgia had not.
(i) Suppose you can collect random samples of the driving-age population in both states, for 1985 and 1990. Let arrest be a binary variable equal to unity if a person was arrested for drunk driving during the year. Without controlling for any other factors, write down a linear probability model that allows you to test whether the open container law reduced the probability of being arrested for drunk driving. Which coefficient in your model measures the effect of the law?
(ii) Why might you want to control for other factors in the model? What might some of these factors be?
(iii) Now, suppose that you can only collect data for 1985 and 1990 at the county level for the two states. The dependent variable would be the fraction of licensed drivers arrested for drunk driving during the year. How does this data structure differ from the individual-level data described in part (i)? What econometric method would you use?

Answers (1)

2021-01-11

(i) Let FL be a dummy (binary) variable equal to one if a person lives in Florida and zero otherwise.
Let y90 be a dummy variable equal to one for the year 1990 and zero for 1985.
Then the linear probability model is
\(\text{arrest} = \beta_0 + \beta_1\, y90 + \beta_2\, FL + \beta_3\, y90 \cdot FL + u\)
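The effect of the law is measured by \(\beta_3\), the coefficient on the interaction term. It is the change in the probability of a drunk driving arrest in Florida between 1985 and 1990, net of the change over the same period in Georgia. Including y90 allows for aggregate trends that affect both states, and including FL allows for systematic differences between the two states. If the law reduced the probability of arrest, then \(\beta_3 < 0\).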
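As a rough illustration of part (i), the sketch below simulates individual-level data and recovers the four coefficients from state-by-year arrest rates. Because the model with y90, FL, and their interaction is saturated, the OLS estimates are simple functions of the four cell means, so no regression library is needed. The arrest probabilities, the sample size per cell, and the function name `lpm_coefficients` are made-up assumptions for illustration, not values from the problem.

```python
import random

def lpm_coefficients(rows):
    """rows: iterable of (arrest, y90, fl) tuples with 0/1 entries.

    Returns (b0, b1, b2, b3) for the saturated linear probability model
    arrest = b0 + b1*y90 + b2*FL + b3*y90*FL + u, computed from cell means.
    """
    totals = {(y, f): [0, 0] for y in (0, 1) for f in (0, 1)}
    for arrest, y90, fl in rows:
        totals[(y90, fl)][0] += arrest   # number arrested in the cell
        totals[(y90, fl)][1] += 1        # number of people in the cell
    mean = {cell: s / n for cell, (s, n) in totals.items()}
    b0 = mean[(0, 0)]                    # Georgia, 1985
    b1 = mean[(1, 0)] - mean[(0, 0)]     # time trend in Georgia
    b2 = mean[(0, 1)] - mean[(0, 0)]     # Florida vs. Georgia in 1985
    # Difference-in-differences: Florida's change minus Georgia's change.
    b3 = (mean[(1, 1)] - mean[(0, 1)]) - (mean[(1, 0)] - mean[(0, 0)])
    return b0, b1, b2, b3

# Hypothetical arrest probabilities by (y90, FL); the implied effect is -0.015.
random.seed(0)
true_p = {(0, 0): 0.050, (1, 0): 0.060, (0, 1): 0.070, (1, 1): 0.065}
rows = []
for (y90, fl), p in true_p.items():
    for _ in range(5000):
        rows.append((1 if random.random() < p else 0, y90, fl))

b0, b1, b2, b3 = lpm_coefficients(rows)
print(f"estimated effect of the law (b3): {b3:+.4f}")
```

With real survey data one would normally estimate the model by OLS with heteroskedasticity-robust standard errors, since the error variance in a linear probability model depends on the regressors.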
(ii) Any factor that leads to different trends in the two states could be relevant, because its effect would otherwise be attributed to the law. The populations of drivers may have changed in different ways between 1985 and 1990: for example, the age, race, gender, or education distributions, or the share of drivers with previous arrests, may have shifted.
Because these factors might affect whether someone is arrested for drunk driving, it can be important to control for them. At a minimum, controlling for them may give a more precise estimator of \(\beta_3\) by reducing the error variance. Essentially, any explanatory variable that affects arrest can be used for this purpose.
(iii) With this setup, the actual county arrest rates are observed instead of outcomes for a sample of individuals, which reduces sampling error. The coefficients are interpreted differently, because they describe county-level arrest rates rather than individual arrest probabilities. Individual-level data, by contrast, allow controlling for individual-level variation, which can help reduce standard errors. Most importantly, the same counties are observed in both 1985 and 1990, so the data form a two-period panel rather than independently pooled cross sections. First differencing (which is equivalent to fixed effects when there are two periods) can therefore be used to remove time-constant county characteristics.
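The first-difference idea in part (iii) can be sketched as follows. Suppose the county arrest rate follows \(\text{rate}_{it} = \beta_0 + \delta_0\, y90_t + \beta_1\, law_{it} + a_i + u_{it}\), where \(a_i\) is a time-constant county effect and \(law_{it} = y90_t \cdot FL_i\). Differencing across the two years removes \(a_i\) and leaves \(\Delta \text{rate}_i = \delta_0 + \beta_1 FL_i + \Delta u_i\). This notation, the function name, and the simulated numbers below are assumptions for illustration only.

```python
import random

def first_difference_effect(rate85, rate90, fl):
    """Regress the change in county arrest rates on a constant and FL.

    With a single 0/1 regressor, the OLS slope equals the difference in
    mean changes (Florida minus Georgia) and the intercept equals the
    mean change in Georgia, so both are computed directly.
    """
    change = [r90 - r85 for r85, r90 in zip(rate85, rate90)]
    change_fl = [d for d, f in zip(change, fl) if f == 1]
    change_ga = [d for d, f in zip(change, fl) if f == 0]
    delta0 = sum(change_ga) / len(change_ga)
    beta1 = sum(change_fl) / len(change_fl) - delta0
    return delta0, beta1

# Hypothetical panel: 60 Florida and 140 Georgia counties.
random.seed(1)
fl = [1] * 60 + [0] * 140
county_effect = [random.gauss(0.02, 0.005) for _ in fl]   # a_i, differenced away
rate85 = [a + random.gauss(0, 0.001) for a in county_effect]
rate90 = [a + 0.003 - 0.004 * f + random.gauss(0, 0.001)
          for a, f in zip(county_effect, fl)]

delta0, beta1 = first_difference_effect(rate85, rate90, fl)
print(f"common trend: {delta0:+.4f}, effect of the law: {beta1:+.4f}")
```

The county effects vary far more than the simulated law effect, yet they do not contaminate the estimate because they cancel in the difference.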

Relevant Questions

asked 2020-10-23
The table below shows the number of people for three different race groups who were shot by police that were either armed or unarmed. These values are very close to the exact numbers. They have been changed slightly for each student to get a unique problem.
Suspect was Armed:
Black - 543
White - 1176
Hispanic - 378
Total - 2097
Suspect was unarmed:
Black - 60
White - 67
Hispanic - 38
Total - 165
Total:
Black - 603
White - 1243
Hispanic - 416
Total - 2262
Give your answer as a decimal to at least three decimal places.
a) What percent are Black?
b) What percent are Unarmed?
c) In order for two variables to be Independent of each other, \(P(A \text{ and } B) = P(A) \cdot P(B)\).
This just means that the percentage of times that both things happen equals the individual percentages multiplied together (Only if they are Independent of each other).
Therefore, if a person's race is independent of whether they were killed being unarmed then the percentage of black people that are killed while being unarmed should equal the percentage of blacks times the percentage of Unarmed. Let's check this. Multiply your answer to part a (percentage of blacks) by your answer to part b (percentage of unarmed).
Remember, the previous answer is only correct if the variables are Independent.
d) Now let's get the real percent that are Black and Unarmed by using the table.
If answer c is "significantly different" than answer d, then that means that there could be a different percentage of unarmed people being shot based on race. We will check this out later in the course.
Let's compare the percentage of unarmed shot for each race.
e) What percent are White and Unarmed?
f) What percent are Hispanic and Unarmed?
If you compare answers d, e and f it shows the highest percentage of unarmed people being shot is most likely white.
Why is that?
This is because there are more white people in the United States than any other race and therefore there are likely to be more white people in the table. Since there are more white people in the table, there most likely would be more white and unarmed people shot by police than any other race. This pulls the percentage of white and unarmed up. In addition, there most likely would be more white and armed shot by police. All the percentages for white people would be higher, because there are more white people. For example, the table contains very few Hispanic people, and the percentage of people in the table that were Hispanic and unarmed is the lowest percentage.
Think of it this way. If you went to a college that was 90% female and 10% male, then females would most likely have the highest percentage of A grades. They would also most likely have the highest percentage of B, C, D and F grades.
The correct way to compare is "conditional probability". Conditional probability is getting the probability of something happening, given we are dealing with just the people in a particular group.
g) What percent of blacks shot and killed by police were unarmed?
h) What percent of whites shot and killed by police were unarmed?
i) What percent of Hispanics shot and killed by police were unarmed?
You can see by the answers to part g and h, that the percentage of blacks that were unarmed and killed by police is approximately twice that of whites that were unarmed and killed by police.
j) Why do you believe this is happening?
Do a search on the internet for reasons why blacks are more likely to be killed by police. Read a few articles on the topic. Write your response using the articles as references. Give the websites used in your response. Your answer should be several sentences long with at least one website listed. This part of this problem will be graded after the due date.
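The joint, marginal, and conditional percentages asked for in parts a) through i) can all be read off the table. The following is a minimal sketch of that arithmetic using the counts given above; the variable names are illustrative choices.

```python
# Counts from the table above.
counts = {
    "Black":    {"armed": 543,  "unarmed": 60},
    "White":    {"armed": 1176, "unarmed": 67},
    "Hispanic": {"armed": 378,  "unarmed": 38},
}

total = sum(sum(c.values()) for c in counts.values())
total_unarmed = sum(c["unarmed"] for c in counts.values())

# Marginal proportions (parts a and b).
p_race = {r: sum(c.values()) / total for r, c in counts.items()}
p_unarmed = total_unarmed / total

# Joint proportions P(race and unarmed) (parts d, e, f).
joint_unarmed = {r: c["unarmed"] / total for r, c in counts.items()}

# Value the joint proportion would take under independence (part c).
black_if_independent = p_race["Black"] * p_unarmed

# Conditional proportions P(unarmed | race) (parts g, h, i).
cond_unarmed = {r: c["unarmed"] / sum(c.values()) for r, c in counts.items()}

for race in counts:
    print(f"{race}: joint {joint_unarmed[race]:.3f}, "
          f"conditional {cond_unarmed[race]:.3f}")
print(f"Black and unarmed if independent: {black_if_independent:.3f}")
```

The joint proportions divide by the grand total, so they depend on how large each group is in the table, while the conditional proportions divide by each group's own total.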
asked 2021-05-09
The dominant form of drag experienced by vehicles (bikes, cars, planes, etc.) at operating speeds is called form drag. It increases quadratically with velocity (essentially because the amount of air you run into increases with v and so does the amount of force you must exert on each small volume of air). Thus
\(\displaystyle{F}_{\text{drag}}={C}_{{d}}{A}{v}^{{2}}\)
where A is the cross-sectional area of the vehicle and \(\displaystyle{C}_{{d}}\) is called the coefficient of drag.
Part A:
Consider a vehicle moving with constant velocity \(\displaystyle\vec{{{v}}}\). Find the power dissipated by form drag.
Express your answer in terms of \(\displaystyle{C}_{{d}},{A},\) and speed v.
Part B:
A certain car has an engine that provides a maximum power \(\displaystyle{P}_{{0}}\). Suppose that the maximum speed of the car, \(\displaystyle{v}_{{0}}\), is limited by a drag force proportional to the square of the speed (as in the previous part). The car engine is now modified, so that the new power \(\displaystyle{P}_{{1}}\) is 10 percent greater than the original power (\(\displaystyle{P}_{{1}}={110}\%{P}_{{0}}\)).
Assume the following:
The top speed is limited by air drag.
The magnitude of the force of air drag at these speeds is proportional to the square of the speed.
By what percentage, \(\displaystyle{\frac{{{v}_{{1}}-{v}_{{0}}}}{{{v}_{{0}}}}}\), is the top speed of the car increased?
Express the percent increase in top speed numerically to two significant figures.
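One way to reason about both parts: power is force times speed, so with the drag law stated above the dissipated power is \(C_d A v^3\). At top speed the engine power equals this drag power, which means top speed scales with the cube root of power. A minimal sketch of that arithmetic, with illustrative function names:

```python
def drag_power(c_d, area, v):
    """Power dissipated by form drag, P = F * v with F = c_d * area * v**2."""
    return c_d * area * v ** 3

def top_speed_ratio(power_ratio):
    """Ratio v1/v0 of top speeds when engine power is scaled by power_ratio.

    At top speed, engine power equals drag power, so v scales as P**(1/3).
    """
    return power_ratio ** (1.0 / 3.0)

increase = top_speed_ratio(1.10) - 1.0
print(f"top speed increases by about {100 * increase:.1f}%")  # about 3.2%
```

A 10 percent gain in power therefore buys only about a 3 percent gain in top speed under these assumptions.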
asked 2021-02-25
Give a full and correct answer. Why is it important that a sample be random and representative when conducting hypothesis testing?
Representative Sample vs. Random Sample: An Overview
Economists and researchers seek to reduce sampling bias to near negligible levels when employing statistical analysis. Three basic characteristics in a sample reduce the chances of sampling bias and allow economists to make more confident inferences about a general population from the results obtained from the sample analysis or study:
* Such samples must be representative of the chosen population studied.
* They must be randomly chosen, meaning that each member of the larger population has an equal chance of being chosen.
* They must be large enough so as not to skew the results. The optimal size of the sample group depends on the precise degree of confidence required for making an inference.
Representative sampling and random sampling are two techniques used to help ensure data is free of bias. These sampling techniques are not mutually exclusive and, in fact, they are often used in tandem to reduce the degree of sampling error in an analysis and allow for greater confidence in making statistical inferences from the sample in regard to the larger group.
Representative Sample
A representative sample is a group or set chosen from a larger statistical population or group of factors or instances that adequately replicates the larger group according to whatever characteristic or quality is under study. A representative sample parallels key variables and characteristics of the large society under examination. Some examples include sex, age, education level, socioeconomic status (SES), or marital status. A larger sample size reduces sampling error and increases the likelihood that the sample accurately reflects the target population.
Random Sample
A random sample is a group or set chosen from a larger population or group of factors or instances in a random manner that allows for each member of the larger group to have an equal chance of being chosen. A random sample is meant to be an unbiased representation of the larger population. It is considered a fair way to select a sample from a larger population since every member of the population has an equal chance of getting selected.
Special Considerations
People collecting samples need to ensure that bias is minimized. Representative sampling is one of the key methods of achieving this because such samples replicate as closely as possible elements of the larger population under study. This alone, however, is not enough to make the sampling bias negligible. Combining the random sampling technique with the representative sampling method reduces bias further because no specific member of the representative population has a greater chance of selection into the sample than any other.
Summarize this article in 250 words.
asked 2021-04-13
As depicted in the applet, Albertine finds herself in a very odd contraption. She sits in a reclining chair, in front of a large, compressed spring. The spring is compressed 5.00 m from its equilibrium position, and a glass sits 19.8 m from her outstretched foot.
a) Assuming that Albertine's mass is 60.0 kg, what is \(\displaystyle\mu_{{k}}\), the coefficient of kinetic friction between the chair and the waxed floor? Use \(\displaystyle{g}={9.80}\ \text{m/s}^{2}\) for the magnitude of the acceleration due to gravity. Assume that the value of k found in Part A has three significant figures. Note that if you did not assume that k has three significant figures, it would be impossible to get three significant figures for \(\displaystyle\mu_{{k}}\), since the length scale along the bottom of the applet does not allow you to measure distances to that accuracy with different values of k.
asked 2020-12-25
Case: Dr. Jung’s Diamonds Selection
With Christmas coming, Dr. Jung became interested in buying diamonds for his wife. After perusing the Web, he learned about the “4Cs” of diamonds: cut, color, clarity, and carat. He knew his wife wanted round-cut earrings mounted in white gold settings, so he immediately narrowed his focus to evaluating color, clarity, and carat for that style earring.
After a bit of searching, Dr. Jung located a number of earring sets that he would consider purchasing. But he knew the pricing of diamonds varied considerably. To assist in his decision making, Dr. Jung decided to use regression analysis to develop a model to predict the retail price of different sets of round-cut earrings based on their color, clarity, and carat scores. He assembled the data in the file Diamonds.xls for this purpose. Use this data to answer the following questions for Dr. Jung.
1) Prepare scatter plots showing the relationship between the earring prices (Y) and each of the potential independent variables. What sort of relationship does each plot suggest?
2) Let X1, X2, and X3 represent diamond color, clarity, and carats, respectively. If Dr. Jung wanted to build a linear regression model to estimate earring prices using these variables, which variables would you recommend that he use? Why?
3) Suppose Dr. Jung decides to use clarity (X2) and carats (X3) as independent variables in a regression model to predict earring prices. What is the estimated regression equation? What is the value of the R2 and adjusted-R2 statistics?
4) Use the regression equation identified in the previous question to create estimated prices for each of the earring sets in Dr. Jung’s sample. Which sets of earrings appear to be overpriced and which appear to be bargains? Based on this analysis, which set of earrings would you suggest that Dr. Jung purchase?
5) Dr. Jung now remembers that it sometimes helps to perform a square root transformation on the dependent variable in a regression problem. Modify your spreadsheet to include a new dependent variable that is the square root of the earring prices (use Excel’s SQRT( ) function). If Dr. Jung wanted to build a linear regression model to estimate the square root of earring prices using the same independent variables as before, which variables would you recommend that he use? Why?
6) Suppose Dr. Jung decides to use clarity (X2) and carats (X3) as independent variables in a regression model to predict the square root of the earring prices. What is the estimated regression equation? What is the value of the R2 and adjusted-R2 statistics?
7) Use the regression equation identified in the previous question to create estimated prices for each of the earring sets in Dr. Jung’s sample. (Remember, your model estimates the square root of the earring prices. So you must actually square the model’s estimates to convert them to price estimates.) Which sets of earrings appear to be overpriced and which appear to be bargains? Based on this analysis, which set of earrings would you suggest that Dr. Jung purchase?
8) Dr. Jung now also remembers that it sometimes helps to include interaction terms in a regression model—where you create a new independent variable as the product of two of the original variables. Modify your spreadsheet to include three new independent variables, X4, X5, and X6, representing interaction terms where: X4 = X1 × X2, X5 = X1 × X3, and X6 = X2 × X3. There are now six potential independent variables. If Dr. Jung wanted to build a linear regression model to estimate the square root of earring prices using the same independent variables as before, which variables would you recommend that he use? Why?
9) Suppose Dr. Jung decides to use color (X1), carats (X3) and the interaction terms X4 (color * clarity) and X5 (color * carats) as independent variables in a regression model to predict the square root of the earring prices. What is the estimated regression equation? What is the value of the R2 and adjusted-R2 statistics?
10) Use the regression equation identified in the previous question to create estimated prices for each of the earring sets in Dr. Jung’s sample. (Remember, your model estimates the square root of the earring prices. So you must square the model’s estimates to convert them to actual price estimates.) Which sets of earrings appear to be overpriced and which appear to be bargains? Based on this analysis, which set of earrings would you suggest that Dr. Jung purchase?
asked 2021-05-04

When a gas is taken from a to c along the curved path in the figure (Figure 1) , the work done by the gas is W = -40 J and the heat added to the gas is Q = -140 J . Along path abc, the work done by the gas is W = -50 J . (That is, 50 J of work is done on the gas.)
I keep missing Part D. The answer for Part D is not -150, 150, -155, 108, or 105 (was close, but it said not quite, check calculations).
Part A
What is Q for path abc?
Express your answer to two significant figures and include the appropriate units.
Part B
If \(P_c = \frac{1}{2} P_b\), what is W for path cda?
Express your answer to two significant figures and include the appropriate units.
Part C
What is Q for path cda?
Express your answer to two significant figures and include the appropriate units.
Part D
What is \(U_a - U_c\)?
Express your answer to two significant figures and include the appropriate units.
Part E
If \(U_d - U_c = 42\) J, what is Q for path da?
Express your answer to two significant figures and include the appropriate units.
asked 2021-01-17
A new thermostat has been engineered for the frozen food cases in large supermarkets. Both the old and new thermostats hold temperatures at an average of \(25^{\circ}F\). However, it is hoped that the new thermostat might be more dependable in the sense that it will hold temperatures closer to \(25^{\circ}F\). One frozen food case was equipped with the new thermostat, and a random sample of 21 temperature readings gave a sample variance of 5.1. Another similar frozen food case was equipped with the old thermostat, and a random sample of 19 temperature readings gave a sample variance of 12.8. Test the claim that the population variance of the old thermostat temperature readings is larger than that for the new thermostat. Use a \(5\%\) level of significance. How could your test conclusion relate to the question regarding the dependability of the temperature readings? (Let population 1 refer to data from the old thermostat.)
(a) What is the level of significance?
State the null and alternate hypotheses.
\(H_0: \sigma_1^2 = \sigma_2^2,\ H_1: \sigma_1^2 > \sigma_2^2\)
\(H_0: \sigma_1^2 = \sigma_2^2,\ H_1: \sigma_1^2 \neq \sigma_2^2\)
\(H_0: \sigma_1^2 = \sigma_2^2,\ H_1: \sigma_1^2 < \sigma_2^2\)
\(H_0: \sigma_1^2 > \sigma_2^2,\ H_1: \sigma_1^2 = \sigma_2^2\)
(b) Find the value of the sample F statistic. (Round your answer to two decimal places.)
What are the degrees of freedom?
\(df_{N} = ?\)
\(df_{D} = ?\)
What assumptions are you making about the original distribution?
The populations follow independent normal distributions. We have random samples from each population.
The populations follow dependent normal distributions. We have random samples from each population.
The populations follow independent normal distributions.
The populations follow independent chi-square distributions. We have random samples from each population.
(c) Find or estimate the P-value of the sample test statistic. (Round your answer to four decimal places.)
(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis?
At the \(\alpha = 0.05\) level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the \(\alpha = 0.05\) level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the \(\alpha = 0.05\) level, we reject the null hypothesis and conclude the data are not statistically significant.
At the \(\alpha = 0.05\) level, we reject the null hypothesis and conclude the data are statistically significant.
(e) Interpret your conclusion in the context of the application.
Reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
Reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
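For part (b), the sample F statistic is the ratio of the two sample variances, with population 1 (the old thermostat) in the numerator as the problem specifies. A minimal sketch of that arithmetic follows; the variable names are illustrative. The P-value is left out because it requires the F distribution's upper-tail probability, which the Python standard library does not provide.

```python
# Sample sizes and sample variances from the problem statement.
n_old, var_old = 19, 12.8   # population 1: old thermostat
n_new, var_new = 21, 5.1    # population 2: new thermostat

f_stat = var_old / var_new  # larger hypothesized variance in the numerator
df_num = n_old - 1          # numerator degrees of freedom
df_den = n_new - 1          # denominator degrees of freedom

print(f"F = {f_stat:.2f}, df_N = {df_num}, df_D = {df_den}")
```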
asked 2021-05-05
The bulk density of soil is defined as the mass of dry solids per unit bulk volume. A high bulk density implies a compact soil with few pores. Bulk density is an important factor in influencing root development, seedling emergence, and aeration. Let X denote the bulk density of Pima clay loam. Studies show that X is normally distributed with \(\mu = 1.5\) and \(\sigma = 0.2\ \text{g/cm}^3\).
(a) What is the density for X? Sketch a graph of the density function. Indicate on this graph the probability that X lies between 1.1 and 1.9. Find this probability.
(b) Find the probability that a randomly selected sample of Pima clay loam will have bulk density less than \(0.9\ \text{g/cm}^3\).
(c) Would you be surprised if a randomly selected sample of this type of soil has a bulk density in excess of \(2.0\ \text{g/cm}^3\)? Explain, based on the probability of this occurring.
(d) What point has the property that only 10% of the soil samples have bulk density this high or higher?
(d) What point has the property that only 10% of the soil samples have bulk density this high orhigher?
(e) What is the moment generating function for X?
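The probabilities in parts (a) through (d) follow from the normal CDF and its inverse. A minimal sketch using Python's standard-library `statistics.NormalDist`, with the mean and standard deviation from the problem and illustrative variable names:

```python
from statistics import NormalDist

X = NormalDist(mu=1.5, sigma=0.2)    # bulk density in g/cm^3

p_between = X.cdf(1.9) - X.cdf(1.1)  # (a) P(1.1 < X < 1.9), i.e. within 2 sigma
p_below = X.cdf(0.9)                 # (b) P(X < 0.9), i.e. 3 sigma below the mean
p_above = 1.0 - X.cdf(2.0)           # (c) P(X > 2.0), i.e. 2.5 sigma above the mean
q90 = X.inv_cdf(0.90)                # (d) point exceeded by only 10% of samples

print(f"(a) {p_between:.4f}  (b) {p_below:.5f}  (c) {p_above:.5f}  (d) {q90:.3f}")
```

For part (e), the moment generating function of a normal random variable is \(M_X(t) = \exp(\mu t + \sigma^2 t^2 / 2)\), which here gives \(\exp(1.5t + 0.02t^2)\).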
asked 2021-05-01
A boy is to sell lemonade to make some money to afford some holiday shopping. The capacity of the lemonade bucket is 1. At the start of each day, the amount of lemonade in the bucket is a random variable X, from which a random variable Y is sold during the day. The two random variables X and Y are jointly uniform.
Write down the joint pdf of X and Y and clearly indicate the region of interest.
asked 2021-05-17
A boy is to sell lemonade to make some money to afford some holiday shopping. The capacity of the lemonade bucket is 1. At the start of each day, the amount of lemonade in the bucket is a random variable X, from which a random variable Y is sold during the day. The two random variables X and Y are jointly uniform.
What is the probability that the amount of lemonade sold is less than \(\frac{1}{3}\)?
...