Pranav Ward 2022-09-13 Answered
I am studying a Tutorial on Maximum Likelihood Estimation in Linear Regression and I have a question.
When we have more than one regressor (a.k.a. multiple linear regression), the model comes in its matrix form
$$y = X\beta + \epsilon, \tag{1}$$
where $y$ is the response vector, $X$ is the design matrix, each row of which specifies under what design or conditions the corresponding response is observed (hence the name), $\beta$ is the vector of regression coefficients, and $\epsilon$ is the residual vector, distributed as a zero-mean multivariate Gaussian with a diagonal covariance matrix, $\epsilon \sim \mathcal{N}(0, \sigma^2 I_N)$, where $I_N$ is the $N \times N$ identity matrix. Therefore
$$y \sim \mathcal{N}(X\beta, \sigma^2 I_N), \tag{2}$$
meaning that the linear combination $X\beta$ explains (or predicts) the response $y$ with uncertainty characterized by a variance of $\sigma^2$.
Assume $y$, $\beta$, and $\epsilon \in \mathbb{R}^n$. Under the model assumptions, we aim to estimate the unknown parameters ($\beta$ and $\sigma^2$) from the data available ($X$ and $y$).
Maximum likelihood (ML) estimation is the most common estimator. We maximize the log-likelihood w.r.t. $\beta$ and $\sigma^2$:
$$L(\beta, \sigma^2 \mid y, X) = -\frac{N}{2}\log 2\pi - \frac{N}{2}\log \sigma^2 - \frac{1}{2\sigma^2}(y - X\beta)^T (y - X\beta)$$
I am trying to understand how the log-likelihood, $L(\beta, \sigma^2 \mid y, X)$, is formed. Normally, I have seen these problems when each $x_i$ is a vector of size $d$ ($d$ is the number of parameters for each data point). Specifically, when $x_i$ is a vector, I wrote it as
$$\ln \prod_{i=1}^{N} \frac{1}{\sqrt{(2\pi)^d \sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x_i - \mu)^T (x_i - \mu)\right) = \sum_i \ln \frac{1}{\sqrt{(2\pi)^d \sigma^2}} \exp\left(-\frac{1}{2\sigma^2}(x_i - \mu)^T (x_i - \mu)\right).$$
But in the case shown in this tutorial, there is no index $i$ over which to apply the summation.

Answers (1)

Nelson Santana
Answered 2022-09-14 Author has 13 answers
I think it's relatively easy to get mixed up here due to notation. In the case you present from the textbook, they're considering a product of one-dimensional Gaussians which are independent of each other, and then writing the result as a single multi-dimensional Gaussian (since the covariance matrix of this multi-dimensional Gaussian is then exactly $\sigma^2 I$). E.g. note that each sample has the form $y_i - (X\beta)_i \sim \mathcal{N}(0, \sigma^2)$. Taking the product of these distributions yields the multi-dimensional Gaussian above.
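Spelled out, the product of the $N$ one-dimensional densities gives back the log-likelihood from the question. This is a sketch in the question's notation, where $(X\beta)_i$ denotes the $i$-th entry of $X\beta$:

```latex
\begin{align*}
L(\beta, \sigma^2 \mid y, X)
&= \log \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi\sigma^2}}
   \exp\!\left(-\frac{\bigl(y_i - (X\beta)_i\bigr)^2}{2\sigma^2}\right) \\
&= \sum_{i=1}^{N} \left[-\frac{1}{2}\log 2\pi - \frac{1}{2}\log \sigma^2
   - \frac{\bigl(y_i - (X\beta)_i\bigr)^2}{2\sigma^2}\right] \\
&= -\frac{N}{2}\log 2\pi - \frac{N}{2}\log \sigma^2
   - \frac{1}{2\sigma^2}(y - X\beta)^T (y - X\beta).
\end{align*}
```

The sum over $i$ is still there; it is absorbed into the quadratic form, because $(y - X\beta)^T (y - X\beta) = \sum_{i=1}^{N} \bigl(y_i - (X\beta)_i\bigr)^2$.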
In your exposition, on the other hand, you're writing i.i.d. draws from a multi-dimensional Gaussian; this is different from what the textbook is referring to, since the mean should change between distributions (i.e. the samples are independent, but not identically distributed, since each observation has its own mean $(X\beta)_i$ plus some additional noise, $\varepsilon \sim \mathcal{N}(0, \sigma^2)$).
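A quick numerical check of this point, as a minimal sketch rather than anything taken from the tutorial: the function names, the simulated data, and the chosen values of `N`, `p`, and the noise scale are all assumptions made for illustration. It evaluates the matrix-form log-likelihood and the sum of per-observation one-dimensional Gaussian log-densities, which should agree, and then computes the usual closed-form maximizers (least-squares coefficients and the mean squared residual).

```python
import numpy as np


def matrix_log_likelihood(beta, sigma2, y, X):
    """-N/2 log(2 pi) - N/2 log(sigma2) - (y - X beta)^T (y - X beta) / (2 sigma2)."""
    n = y.shape[0]
    resid = y - X @ beta
    return (-0.5 * n * np.log(2 * np.pi)
            - 0.5 * n * np.log(sigma2)
            - resid @ resid / (2 * sigma2))


def summed_log_likelihood(beta, sigma2, y, X):
    """Sum over i of the log-density of N((X beta)_i, sigma2) evaluated at y_i."""
    means = X @ beta  # one mean per observation: independent, not identical
    per_obs = -0.5 * np.log(2 * np.pi * sigma2) - (y - means) ** 2 / (2 * sigma2)
    return per_obs.sum()


# Simulated data (hypothetical values, for illustration only).
rng = np.random.default_rng(0)
N, p = 50, 3
X = rng.normal(size=(N, p))
beta_true = np.array([1.0, -2.0, 0.5])
sigma_true = 0.7
y = X @ beta_true + rng.normal(scale=sigma_true, size=N)

ll_matrix = matrix_log_likelihood(beta_true, sigma_true**2, y, X)
ll_summed = summed_log_likelihood(beta_true, sigma_true**2, y, X)
print("forms agree:", np.isclose(ll_matrix, ll_summed))

# Closed-form maximizers of the Gaussian log-likelihood.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
sigma2_hat = np.mean((y - X @ beta_hat) ** 2)
ll_hat = matrix_log_likelihood(beta_hat, sigma2_hat, y, X)
print("log-likelihood at the estimates:", ll_hat)
```

The log-likelihood at `beta_hat` and `sigma2_hat` should be at least as large as at the true parameters, since the estimates maximize it for this particular sample.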


You might be interested in

asked 2022-09-04
Sorry if my title sounds vague and inaccurate. I can't think of a better way to put it lol. Anyway, I stumbled upon this problem today while preparing for Oxford tsa:
A survey of households in a town showed that (allowing for sampling errors) between 75% and 85% owned a dishwasher, between 35% and 40% owned a tumble dryer and less than 5% owned neither.
How many people own both a tumble dryer and a dishwasher?
In which the answer is: "between 10% and 30%"
I spent lots of time trying to figure out how to solve it but to no avail. The inclusion of the ranges in a typical Venn diagram question totally throws me off. Can anyone help me with this?
asked 2022-09-10
Is there a way to test the "accuracy of a binomial survey"?
I've been away from mathematics for a while and forgotten almost everything. It doesn't come from a text book; I was given an assignment in my training for a job and wondered if I can use my mathematical knowledge. All I have to do is actually interpret the data and say "more than 50 percent of the people surveyed thinks "yes" to the question" but I'm taking a step further and trying to say how "accurate" is the result? Basically, here's what I am given
There's a supermarket that is experiencing a fall in revenues. A survey was conducted and it asked whether "the customer thinks the workers are unfriendly/unhelpful." out of a 100 randomly chosen customers on the same day (100 different customers) 51% answered "yes."
However, the total number of customers that visited the supermarket is expected to be around 485. The total that visited the supermarket that month is 19700. How confident are we to say that more than 50% are not happy with the workers among all of those who visited the store a. that day b. that month?
I vaguely recalled Chi-squared and z-test but I wasn't so sure; I tried the z-test with
$$z = \frac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}}$$
where p=0.51,π=0.5,n=100. Thing is, I get z=0.2 and the z table seems to tell me this is a very inaccurate result. In any case, my data and the question I ultimately want to answer is as above. Along the process of doing so, if no one would want to actually show me how to do this, can you please answer
What's the most apt test to answer a question like this? And why?
I think, as people start writing some answers, my senses will come back, some words and terms ringing a bell, reminding me of certain formulas, rules etc.
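The z value quoted above can be reproduced in a few lines. This is only a sketch of the arithmetic: the one-sided p-value is an added assumption (it treats "more than 50%" as the alternative), and the variable names are illustrative.

```python
import math
from statistics import NormalDist

p_hat = 0.51  # sample proportion answering "yes"
pi0 = 0.5     # hypothesized proportion
n = 100       # sample size

# z statistic for a one-sample proportion test
z = (p_hat - pi0) / math.sqrt(pi0 * (1 - pi0) / n)

# One-sided p-value under the standard normal (assumed alternative: proportion > 0.5)
p_value = 1 - NormalDist().cdf(z)

print(round(z, 3), round(p_value, 3))
```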
asked 2022-09-13
Sample size requirements in survey
If I am doing some market research and want to answer the question "What percentage of the users of a service searched for the given service online?". Let's say I go out and get people to take a survey.
How do I calculate the required sample size that would create the correct distribution for a given country or region?
asked 2022-09-06
Survey of the tax increase
In a survey of 1,000 people, 420 are opposed to the tax increase.
a) Construct a 95% confidence interval for the proportion of those people opposed to the tax increase.
b) Interpret the CI in terms of the question.
c) Is the estimate in part (a) valid? Explain.
My work:
I was able to solve a) and get the answer of [0.4205,0.4195].
For b) I am confused as to how to interpret the CI in terms of the question... could it be that its the percentage range of the possible number of people opposed to the tax increase?
For c) I think that it is valid because the number of people opposed to the tax increase is 420, which lies in the range of the confidence interval.
EDIT:
To solve for a) I did $p \pm 1.96 \times \frac{0.42 \times 0.58}{1000}$, which equalled 0.42 ± 0.0005...
I now see that I forgot to apply the sqrt before multiplying by 1.96. The correct answer should then be: $0.42 \pm 1.96 \times 0.0156 = 0.42 \pm 0.0306$, which results in the range of [0.3894, 0.4506]
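The corrected arithmetic can be checked with a short script. This is a sketch assuming the normal-approximation interval with critical value 1.96, as in the post; the variable names are illustrative.

```python
import math

n = 1000       # people surveyed
opposed = 420  # people opposed to the tax increase
z_crit = 1.96  # 95% normal critical value used in the post

p_hat = opposed / n
std_err = math.sqrt(p_hat * (1 - p_hat) / n)  # the square root is the step that was missed
margin = z_crit * std_err
lower, upper = p_hat - margin, p_hat + margin

print(round(std_err, 4), round(margin, 4), round(lower, 4), round(upper, 4))
```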
asked 2022-09-08
Suppose that in a city of 100 people, a survey concluded that 30 of them do not agree (say 'no') with the building of a new luxury apartment. If you randomly choose 12 people in the city, what is the probability that 2 to 6 of them are among those who disagree with the building of the new luxury apartment?
This seemed to be a binomial probability question.... though I fail to recall how to do so.
bonus point if you know how to solve this in minitab!
asked 2022-09-18
Statistical analysis of study with categorical and numerical variables
I am researching the effect of a certain innovation type on firm performance. The innovation type is measured through a 6-item survey with nominal answers (yes/no; 1/0) and is retrospective (e.g. Did you introduce XY in the last 5 years). For firm performance I have financial data for the 5 year period I'm interested in. Now, there are two possible approaches I could take:
1) I compute an "innovation" variable from the survey answers, to distinguish between adopters and non-adopters and I examine whether the adopter-group shows to have better firm performance than the non-adopter group. Which type of analysis would this be? And how would I control for firm size and time effects?
2) I investigate whether firms that answered more questions with "yes" perform better than firms that answered with fewer "yes". Which type of analysis would this then be? Regression?
asked 2022-09-17
Just want to check if my answer and reasoning is correct for the following problem (Not a homework problem - it is a sample question for a test I'm preparing for)
In a survey, viewers were given a list of 20 TV Shows and are asked to label 3 favourites not in any order. Then they must tick the ones that they have heard of before, if any. How many ways can the form be filled, assuming everyone has 3 favourites?
My reasoning:
1) Choose 3 shows out of 20: c(20,3)
2) Choosing 0-17 shows from 17 choices: c(17,0) + c(17,1) + c(17,2) + ... + c(17,16) + c(17,17)
and add 1) and 2) together for the final answer.
Would this be correct? Is there a better way of doing the second part that doesn't involve so many calculations?
