What is Var[b] in multiple regression?

trkalo84 2022-09-27 Answered
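Assume a linear regression model $y = X\beta + \epsilon$ with $\epsilon \sim N(0, \sigma^2 I)$ and $\hat{y} = Xb$, where $b = (X^\top X)^{-1} X^\top y$. Besides, $H = X (X^\top X)^{-1} X^\top$ is the linear projection from the response space to the span of $X$, i.e., $\hat{y} = Hy$.
Now I want to calculate $\operatorname{Var}[b]$, but what I get is a $k \times k$ matrix, not an $n \times n$ one. Here's my calculation:
$$\operatorname{Var}[b] = \operatorname{Var}\left[(X^\top X)^{-1} X^\top y\right] = \underbrace{(X^\top X)^{-1} X^\top \underbrace{\operatorname{Var}[y]}_{=\sigma^2 I} X (X^\top X)^{-1}}_{\text{Here you can see already this thing will be } k \times k} = \sigma^2 (X^\top X)^{-1} X^\top X I (X^\top X)^{-1} = \sigma^2 (X^\top X)^{-1} \in \mathbb{R}^{k \times k}$$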
What am I doing wrong?
Besides, are $E[b] = \beta$, $E[\hat{y}] = HX\beta$, $\operatorname{Var}[\hat{y}] = \sigma^2 H$, $E[y - \hat{y}] = (I - H)X\beta$, $\operatorname{Var}[y - \hat{y}] = (I - H)\sigma^2$ correct (this is just a side note; my main question is the one above)?

Answers (1)

Jade Mejia
Answered 2022-09-28 Author has 8 answers
The covariance matrix for $b$ (the estimator for $\beta$) should be $k \times k$. If the $X$ matrix is $n \times k$, then $\beta$ has to be $k \times 1$; otherwise the product $X\beta$ wouldn't be $n \times 1$.
So if $\beta$ is a constant vector of $k$ parameters, then its estimator $b$ is a random vector with $k$ elements. Therefore the covariance matrix for $b$ consists of the covariances for all possible pairs of members of the random vector, hence it must be a $k \times k$ matrix.
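The dimension argument can also be checked numerically. The following is a minimal simulation sketch, not part of the original answer: the sizes `n`, `k`, the noise level `sigma`, and the coefficient vector `beta` are arbitrary assumptions chosen for illustration. It redraws the noise many times, re-estimates $b$ each time, and compares the empirical covariance of $b$ with $\sigma^2 (X^\top X)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, sigma = 50, 3, 2.0             # hypothetical sample size, number of regressors, noise sd
X = rng.normal(size=(n, k))          # fixed n x k design matrix
beta = np.array([1.0, -2.0, 0.5])    # hypothetical true coefficients

XtX_inv = np.linalg.inv(X.T @ X)
theoretical_cov = sigma**2 * XtX_inv  # sigma^2 (X'X)^{-1}

# Redraw the noise many times; each row of Y is one simulated response vector.
n_reps = 20_000
eps = rng.normal(scale=sigma, size=(n_reps, n))
Y = X @ beta + eps

# Row-wise version of b = (X'X)^{-1} X'y (valid because (X'X)^{-1} is symmetric).
B = Y @ X @ XtX_inv
empirical_cov = np.cov(B, rowvar=False)

print(theoretical_cov.shape)   # (3, 3), i.e. k x k
print(np.max(np.abs(empirical_cov - theoretical_cov)))  # small sampling error
```

Both matrices have one row and one column per coefficient, which is why the result is $k \times k$ rather than $n \times n$.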
To answer your side notes, all your calculations are correct, but some can be simplified further. Check that $HX = X$, so that $E[\hat{y}] = X\beta$ and $E[y - \hat{y}] = 0$.
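The identity $HX = X$ is easy to confirm on a small example. This is an illustrative sketch only; the matrix `X` below is random, and its size is an assumption made for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 3                              # hypothetical sizes
X = rng.normal(size=(n, k))
H = X @ np.linalg.inv(X.T @ X) @ X.T     # n x n hat (projection) matrix

hx_equals_x = np.allclose(H @ X, X)                       # HX = X
residual_maker_kills_x = np.allclose((np.eye(n) - H) @ X, 0.0)  # (I - H)X = 0
h_is_idempotent = np.allclose(H @ H, H)                   # HH = H

print(H.shape, hx_equals_x, residual_maker_kills_x, h_is_idempotent)
```

Since $(I - H)X = 0$, the expression $(I - H)X\beta$ from the question vanishes for every $\beta$, which is the simplification $E[y - \hat{y}] = 0$.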

You might be interested in

asked 2022-08-17
Is there such a thing as a weighted multiple regression?
I'm new to linear algebra, but I know how multiple linear regressions work. What I want to do is something slightly different.
As an example, let's say that I have a list of nutrients I want to get every day. Say I have a list of foods, and I want to know how much of each food to eat to get the best fit for my nutrition plan. Assume I'm fine with using a linear model.
However, some nutrients are more important than others. The errors on protein and calcium might be equal in a typical linear regression, but that's no use. Protein has higher priority than calcium (in this model), so I'd want a model that is better fitting to the higher priority points than to the lower ones.
I tried putting weights on the error function, and I end up with a matrix of matrices. At that point, I'm not sure if I'm minimising for the weights or for the coefficients on the nutrients. I think both, but I wasn't sure how to minimise for both at the same time.
Is it possible to solve this with linear algebra, or does this require some numerical approximation solution?
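One standard linear-algebra approach to this kind of problem is weighted least squares with fixed, user-chosen priority weights, so only the coefficients are minimised over. The sketch below assumes rows are nutrients, columns are foods, and `w` holds the priorities; the function name and all numbers are hypothetical, and nothing here prevents negative food amounts.

```python
import numpy as np

def weighted_least_squares(A, y, w):
    """Minimise sum_i w_i * (y_i - (A @ x)_i)**2 over x, with fixed weights w."""
    sw = np.sqrt(np.asarray(w, dtype=float))
    # Scaling each row by sqrt(w_i) turns the weighted problem into an ordinary one.
    x, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return x

# Hypothetical data: 3 nutrients (rows) x 2 foods (columns).
A = np.array([[20.0, 5.0],
              [100.0, 300.0],
              [2.0, 8.0]])
y = np.array([60.0, 1000.0, 18.0])   # daily targets per nutrient
w = np.array([10.0, 1.0, 1.0])       # first nutrient has the highest priority

x_weighted = weighted_least_squares(A, y, w)
x_plain = weighted_least_squares(A, y, np.ones(3))
print(x_weighted, x_plain)
```

With the weights held fixed the problem stays linear in the coefficients, so no iterative numerical scheme is needed; raising one row's weight cannot increase that row's residual at the optimum.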
asked 2022-09-27
Multiple Regression Forecast
"Part C: asks what salary would you forecast for a man with 12 years of education, 10 months of experience, and 15 months with the company."
This is straightforward enough, just reading off the coefficients table: $y = 3526.4 + (722.5)(1) + (90.02)(12) + (1.269)(10) + (23.406)(15) = 5692.92$
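The Part C arithmetic can be reproduced in a few lines. The coefficient labels below are assumptions inferred from the order of the terms in the sum; the original coefficients table is not shown.

```python
# Labels are guesses based on the order of the terms; only the numbers come from the post.
coefficients = {
    "intercept": 3526.4,
    "male": 722.5,
    "education_years": 90.02,
    "experience_months": 1.269,
    "company_months": 23.406,
}
values = {
    "intercept": 1,
    "male": 1,
    "education_years": 12,
    "experience_months": 10,
    "company_months": 15,
}

forecast = sum(coefficients[name] * values[name] for name in coefficients)
print(round(forecast, 2))  # 5692.92
```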
"Part D: asks what salary would you forecast for men with 12 years of education, 10 months of experience, and 15 months with the company."
I know that the answer to this must be different from C, but I have no idea why; I would have just done exactly the same as in part C.
What is wrong with my train of thought or intuition and how might I go about calculating the salary for men, rather than a man?
asked 2022-10-29
Can you use bivariate analysis in a multiple regression problem
I have one response variable and around 10 potential explanatory variables. I am looking for simple ways to visualize and explore my data before modelling. As well as carrying out PCA and looking for correlations between my explanatory variables, I wanted to look at bi-variate scatter plots of the response with each explanatory variable, but I have been told this won't work.
This is the way I am picturing it:
Imagine you have n independent, uncorrelated explanatory variables.
You propose to use these variables in a multiple regression. You want to simplify the analysis by not including more variables than is necessary.
So, you want to know if the explanatory variables have any correlation with the response variable to decide whether to include the variable in the regression.
My feeling is 1) that you can look at bi-variate plots for the response variable with each explanatory variable to see if there is a correlation.
2) If a single explanatory variable has no correlation with the response in a bi-variate correlation, it cannot have any effect on the response if included in the multiple regression.
(The only way this variable could affect the response variable would be through an interaction with another explanatory variable, but it has already been determined that all the explanatory variables are uncorrelated.)
I have been told that the above is incorrect and a variable showing no correlation in a bi-variate plot can affect the response variable in a multiple regression.
Could someone help me see where I am going wrong? Many Thanks.
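One case worth testing is an interaction between explanatory variables that are themselves uncorrelated. The simulation below is a hypothetical sketch with made-up data, not an analysis of the asker's dataset: `x1` and `x2` are drawn independently and the response is their product.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x1 = rng.normal(size=n)   # independent explanatory variables
x2 = rng.normal(size=n)
y = x1 * x2               # response driven purely by an interaction

def r_squared(columns, y):
    """R^2 of an OLS fit of y on an intercept plus the given columns."""
    A = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

corr_x1_x2 = np.corrcoef(x1, x2)[0, 1]
corr_x1_y = np.corrcoef(x1, y)[0, 1]
corr_x2_y = np.corrcoef(x2, y)[0, 1]
r2_additive = r_squared([x1, x2], y)
r2_interaction = r_squared([x1, x2, x1 * x2], y)

print(corr_x1_x2, corr_x1_y, corr_x2_y)   # all close to 0
print(r2_additive, r2_interaction)        # close to 0 and close to 1
```

In this constructed example every bi-variate correlation is near zero, yet a model that includes the product term explains the response almost perfectly, so uncorrelated explanatory variables do not rule out interactions.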
asked 2022-10-13
2-dimensional representation of a multiple regression function
Supposing I have a multiple regression population function of the form:
$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$
with $X_{3i}$ a dummy variable (only takes values 0 and 1).
I am given a sample of points. Although the latter takes place in 3-dimensional space, the question states "its results can be represented in $Y$ vs $X_2$ space". I don't understand how graphing $Y$ vs $X_2$ will give us a 2-dimensional representation of our population regression function. Isn't $X_{3i}$ being completely omitted?
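One possible reading, sketched here with hypothetical coefficient values: because $X_{3i}$ only takes the values 0 and 1, the population regression function can be drawn in the $Y$ vs $X_2$ plane as two lines with the same slope $\beta_2$, one per value of the dummy.

```python
import numpy as np

beta1, beta2, beta3 = 1.0, 0.5, 2.0   # hypothetical population coefficients
x2 = np.linspace(0.0, 10.0, 5)

line_dummy_0 = beta1 + beta2 * x2             # E[Y | X2, X3 = 0]
line_dummy_1 = (beta1 + beta3) + beta2 * x2   # E[Y | X2, X3 = 1]

vertical_gap = line_dummy_1 - line_dummy_0
print(vertical_gap)   # constant, equal to beta3 at every x2
```

Under this reading $X_3$ is not omitted; it shows up as the constant vertical shift $\beta_3$ between the two lines.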
asked 2022-10-24
Questions about multiple linear regression
I have a couple true/false questions basically, one of them is this
In the multiple linear regression model the coefficient of multiple determination gives the proportion of total variability due to the effect of a single predictor
I know the coefficient of multiple determination indicates the amount of total variability explained by the model, but I'm not sure about the single predictor part, I don't think this is true because it uses x1, x2... as predictors no?
The other question is this;
In the multiple linear regression model
$y_i = \beta_0 + \beta_1 x_{i,1} + \beta_2 x_{i,2} + \beta_3 x_{i,3} + \varepsilon_i$
the parameter $\beta_1$ represents the variation in the response corresponding to a unit increase in the variable $x_1$
I don't think this question is true but can't really explain why
All help would be greatly appreciated
asked 2022-10-05
How to write an equation where both independent variables and dependent variables are log transformed in a multiple regression?
How to write the multiple regression model when both the dependent variable and independent variables are log-transformed?
I know that without any log transformation the linear regression model would be written as
$y = \beta_0 + \beta_1 (x_1) + \beta_2 (x_2) + \dots$
But now I have transformed both my dependent variable and my independent variables with log. So is it correct to write it as
$\log(y) = \beta_0 + \beta_1 \log(x_1) + \beta_2 \log(x_2) + \dots$
Or, since I am transforming both sides of the equation, can I write it as
$\ln(y) = \beta_0 + \beta_1 (x_1) + \beta_2 (x_2) + \dots$
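To see what the log-log form means in practice, here is a small sketch on synthetic, noise-free data. The coefficient values are hypothetical and `np.log` is the natural logarithm. When both sides are log-transformed, the regressors enter the design matrix as $\log(x_1)$ and $\log(x_2)$, not as the raw $x_1$ and $x_2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.uniform(1.0, 10.0, size=n)
x2 = rng.uniform(1.0, 10.0, size=n)

b0, b1, b2 = 0.7, 1.5, -0.4              # hypothetical coefficients
y = np.exp(b0) * x1**b1 * x2**b2         # so that log(y) = b0 + b1*log(x1) + b2*log(x2)

# Design matrix uses the logged regressors; response is the logged y.
A = np.column_stack([np.ones(n), np.log(x1), np.log(x2)])
coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
print(coef)   # recovers b0, b1, b2 up to floating-point error
```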
asked 2022-08-11
Find the constrained least-squares estimator for a multiple regression model
Consider the multiple regression model
$Y = X\beta + \epsilon$
with the restriction that $\sum_{i=1}^{n} b_i = 1$
I want to find the least squares estimator of $\beta$, so I need to solve the following optimization problem
$\min\ (Y - X\beta)^t (Y - X\beta)$
s.t. $\sum_{i=1}^{n} b_i = 1$
Let's set
$L = (Y - X\beta)^t (Y - X\beta) - \lambda (U^t \beta - 1) = Y^t Y + \beta^t X^t X \beta - 2 \beta^t X^t Y - \lambda (U^t \beta - 1)$
where $U$ is a dummy vector of ones (and therefore $U^t \beta = \sum_{i=1}^{n} b_i$).
Take derivatives
$\frac{dL}{d\beta} = 2 X^t X \beta - 2 X^t Y - \lambda U = 0$
$\frac{dL}{d\lambda} = -(U^t \beta - 1) = 0$
So from the first equation we can get an expression for β, but what should I do with the λ? The second equation doesn't seem to be useful to get rid of it.
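The two stationarity conditions are jointly linear in $(\beta, \lambda)$, so one option is to stack them into a single linear system and solve for both unknowns at once. The sketch below assumes $X^t X$ is invertible and that the constraint sums all entries of $\beta$; the function name is hypothetical.

```python
import numpy as np

def constrained_ols(X, Y):
    """Least squares for Y ~ X @ beta subject to sum(beta) == 1.

    Solves the stacked system
        2 X'X beta - lambda U = 2 X'Y
        U' beta               = 1
    and returns (beta, lambda).
    """
    k = X.shape[1]
    U = np.ones(k)
    kkt = np.zeros((k + 1, k + 1))
    kkt[:k, :k] = 2.0 * X.T @ X   # coefficient block for beta
    kkt[:k, k] = -U               # coefficient column for lambda
    kkt[k, :k] = U                # constraint row U' beta = 1
    rhs = np.append(2.0 * X.T @ Y, 1.0)
    solution = np.linalg.solve(kkt, rhs)
    return solution[:k], solution[k]
```

Equivalently, the first equation gives $\beta = (X^t X)^{-1}(X^t Y + \tfrac{\lambda}{2} U)$, and substituting this into $U^t \beta = 1$ determines $\lambda$, which is where the second equation becomes useful.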

New questions

I'm looking for ideas for a 15-hour mathematical enrichment course in a Chinese high school. What (fairly) simple topic would you recommend as a subject for such a course?
Background/considerations:
My students are generally pretty good at math, but many of them have not been exposed to rigorous or abstract mathematical reasoning. A good topic would be one that would not be impossibly hard for students who have never written or read proofs in English.
I have taught this class three times before. (Part of the reason that I'm posting this is that I have used up all my ideas!) The first semester I taught an introductory number theory class (which meandered its way toward a proof of quadratic reciprocity, though I think this was ultimately too advanced/abstract for some of the students). The second semester I taught basic graph theory and applications (with a focus on planarity and coloring). The third semester I taught a class on the Rubik's cube.
The students' math backgrounds are pretty diverse: some of them participate in math competitions, and so are familiar with IMO-style techniques, but many aren't. Some of them may also know some calculus, but I cannot assume it. All of them are very good at what in the United States is sometimes termed "pre-calculus": trigonometry, conic sections, systems of linear equations (though, shockingly, no matrices), and the like. They know what a binomial coefficient is.
So, any ideas? Ideally, I'd like to find something a bit "sexy" (like the Rubik's cube): attempts to motivate number theory through cryptography seemed to fall on deaf ears, but being able to "see" group theory on the cube was pretty popular.
(Responses especially welcome from people who grew up in the PRC: are there any mathematical topics you wish had been covered in the high school curriculum?)