Recent questions in Multiple Regression

Multiple Regression
Answered

zabuheljz
2022-08-14

If I consider universal kriging (or multiple spatial regression) in matrix form as:

$\mathbf{V}=\mathbf{X}\mathbf{A}+\mathbf{R}$

where $\mathbf{R}$ is the residual and $\mathbf{A}$ are the trend coefficients, then the estimate of $\hat{\mathbf{A}}$ is:

$\hat{\mathbf{A}}=({\mathbf{X}}^{T}{\mathbf{C}}^{-1}\mathbf{X})^{-1}{\mathbf{X}}^{T}{\mathbf{C}}^{-1}\mathbf{V}$

(as I understand it), where $\mathbf{C}$ is the covariance matrix, if it is known. Then, the variance of the coefficients is:

$\text{VAR}(\hat{\mathbf{A}})=({\mathbf{X}}^{T}{\mathbf{C}}^{-1}\mathbf{X})^{-1}$ ???

How does one get from the estimate of $\hat{\mathbf{A}}$, to its variance? i.e. how can I derive that variance?
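One possible route, sketched under the assumptions that $\mathbf{X}$ and $\mathbf{C}$ are fixed (non-random), that $\mathbf{C}$ is symmetric and invertible, and that $\text{VAR}(\mathbf{V})=\text{VAR}(\mathbf{R})=\mathbf{C}$: write the estimator as a fixed matrix $\mathbf{M}$ (a symbol introduced here only for illustration) times $\mathbf{V}$, then apply the rule $\text{VAR}(\mathbf{M}\mathbf{V})=\mathbf{M}\,\text{VAR}(\mathbf{V})\,\mathbf{M}^{T}$.

```latex
\begin{align*}
\hat{\mathbf{A}} &= \mathbf{M}\mathbf{V},
  \qquad \mathbf{M} = (\mathbf{X}^{T}\mathbf{C}^{-1}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{C}^{-1} \\
\text{VAR}(\hat{\mathbf{A}}) &= \mathbf{M}\,\text{VAR}(\mathbf{V})\,\mathbf{M}^{T}
  = \mathbf{M}\mathbf{C}\mathbf{M}^{T} \\
&= (\mathbf{X}^{T}\mathbf{C}^{-1}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{C}^{-1}
   \,\mathbf{C}\,
   \mathbf{C}^{-1}\mathbf{X}(\mathbf{X}^{T}\mathbf{C}^{-1}\mathbf{X})^{-1} \\
&= (\mathbf{X}^{T}\mathbf{C}^{-1}\mathbf{X})^{-1}
   (\mathbf{X}^{T}\mathbf{C}^{-1}\mathbf{X})
   (\mathbf{X}^{T}\mathbf{C}^{-1}\mathbf{X})^{-1} \\
&= (\mathbf{X}^{T}\mathbf{C}^{-1}\mathbf{X})^{-1}
\end{align*}
```

The third line uses the symmetry of $\mathbf{C}$, so that $(\mathbf{C}^{-1})^{T}=\mathbf{C}^{-1}$ and $\mathbf{X}^{T}\mathbf{C}^{-1}\mathbf{X}$ is symmetric as well.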

Multiple Regression
Answered

Jazmin Clark
2022-08-13

I'm finding conflicting information in college textbooks on calculating the degrees of freedom for a global $F$-test on a multiple regression. To be absolutely clear, assume there are 50 observations and 3 independent variables. Can you please tell me the df for the numerator and denominator? I have found 2 sets of numbers in college texts: one indicating that the numerator is equal to $p$, in this case 3, and alternatively $p-1$. For the denominator I am finding $n-p$, which in this case would be 47, and alternatively $n-p-1$. Perhaps I am misunderstanding the material and there are circumstances when one formula applies rather than the other. I've not done any regression analysis in more than 25 years and now find I'm stuck on a Christmas vacation project I wanted to do with my son. So any help that would explain, in a gentle way (I can't get through the quadratic explanation, or something that will bury me in calculus), how to determine the df would be appreciated. Concrete examples would be very beneficial. Also, if there is a good practical walkthrough of multiple regression/ANOVA that shows some examples and explains concepts (but please do not recommend Regression for Dummies), I'd appreciate a referral to that as well. Thanks for your help.
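A likely source of the two sets of numbers is what $p$ counts. The sketch below assumes a model with an intercept and uses two hypothetical helper functions, one for texts where the symbol counts only the independent variables and one for texts where it counts all estimated coefficients including the intercept. Both conventions give the same degrees of freedom for 50 observations and 3 independent variables.

```python
def df_counting_predictors(n_obs, k_predictors):
    """Convention A: k counts only the independent variables."""
    return k_predictors, n_obs - k_predictors - 1


def df_counting_parameters(n_obs, p_parameters):
    """Convention B: p counts every estimated coefficient, intercept included."""
    return p_parameters - 1, n_obs - p_parameters


n_obs = 50
k = 3        # independent variables
p = k + 1    # slopes plus the intercept

print(df_counting_predictors(n_obs, k))  # (3, 46)
print(df_counting_parameters(n_obs, p))  # (3, 46)
```

Under this reading, a denominator of 47 would come from plugging the number of independent variables into the formula written for convention B.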

Multiple Regression
Answered

Crancichhb
2022-08-13

I am trying to calculate the coefficients ${b}_{1},{b}_{2},...$ of a multiple linear regression, with the condition that ${b}_{0}=0$. In Excel this can be done using the RGP function (LINEST in English-language versions of Excel) and setting the constant to FALSE.

How can this be done with a simple formula?

Thank you in advance!
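A minimal sketch of one closed-form option, assuming the predictors are stored as the columns of a matrix $X$ without a column of ones: the least-squares coefficients with ${b}_{0}=0$ solve the normal equations ${X}^{\prime}Xb={X}^{\prime}y$. The function name `fit_through_origin` and the toy data are made up for illustration.

```python
import numpy as np


def fit_through_origin(X, y):
    """Least-squares slopes with the intercept fixed at 0: solves X'X b = X'y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # No column of ones is appended, so no constant term is estimated.
    return np.linalg.solve(X.T @ X, X.T @ y)


# Toy data generated without noise from b1 = 2, b2 = -1 and b0 = 0.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y = X @ np.array([2.0, -1.0])

b = fit_through_origin(X, y)
print(b)
```

In a spreadsheet the same formula could be typed with the matrix functions MMULT, MINVERSE and TRANSPOSE, provided ${X}^{\prime}X$ is invertible.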

Multiple Regression
Answered

cofak48
2022-08-11

The basic setup in multiple linear regression model is

$\begin{array}{rl}Y& =\left[\begin{array}{c}{y}_{1}\\ {y}_{2}\\ \vdots \\ {y}_{n}\end{array}\right]\end{array}$

$\begin{array}{rl}X& =\left[\begin{array}{cccc}1& {x}_{11}& \dots & {x}_{1k}\\ 1& {x}_{21}& \dots & {x}_{2k}\\ \vdots & \vdots & \ddots & \vdots \\ 1& {x}_{n1}& \dots & {x}_{nk}\end{array}\right]\end{array}$

$\begin{array}{rl}\beta & =\left[\begin{array}{c}{\beta}_{0}\\ {\beta}_{1}\\ \vdots \\ {\beta}_{k}\end{array}\right]\end{array}$

$\begin{array}{rl}\epsilon & =\left[\begin{array}{c}{\epsilon}_{1}\\ {\epsilon}_{2}\\ \vdots \\ {\epsilon}_{n}\end{array}\right]\end{array}$

The regression model is $Y=X\beta +\epsilon $

To find the least squares estimator of the $\beta $ vector, we need to minimize $S(\beta )=\sum_{i=1}^{n}{\epsilon}_{i}^{2}={\epsilon}^{\prime}\epsilon =(Y-X\beta )^{\prime}(Y-X\beta )={Y}^{\prime}Y-2{\beta}^{\prime}{X}^{\prime}Y+{\beta}^{\prime}{X}^{\prime}X\beta $

by setting

$\frac{\partial S(\beta )}{\partial \beta}=0$

My question: how do we get $-2{X}^{\prime}Y+2{X}^{\prime}X\beta $?
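One way to see it, as a sketch: differentiate $S(\beta )$ term by term, using two standard vector-derivative rules. For a fixed vector $a$, $\partial ({\beta}^{\prime}a)/\partial \beta =a$, and for a fixed square matrix $A$, $\partial ({\beta}^{\prime}A\beta )/\partial \beta =(A+{A}^{\prime})\beta $. Here $a={X}^{\prime}Y$ and $A={X}^{\prime}X$, with $X$ and $Y$ the design matrix and response from the setup above ($a$ and $A$ are placeholder symbols introduced only for this derivation).

```latex
\begin{align*}
\frac{\partial}{\partial \beta}\left(Y'Y\right) &= 0 \\
\frac{\partial}{\partial \beta}\left(-2\beta'X'Y\right) &= -2X'Y \\
\frac{\partial}{\partial \beta}\left(\beta'X'X\beta\right)
  &= \left(X'X + (X'X)'\right)\beta = 2X'X\beta \\
\frac{\partial S(\beta)}{\partial \beta} &= -2X'Y + 2X'X\beta
\end{align*}
```

The factor 2 in the third line comes from the symmetry of ${X}^{\prime}X$. Setting the last line to zero gives the normal equations ${X}^{\prime}X\beta ={X}^{\prime}Y$.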

Multiple Regression
Answered

gladilkamwy
2022-08-11

Consider the multiple regression model

$Y=X\beta +\epsilon $

with the restriction that $\sum _{i=1}^{n}{\beta}_{i}=1$

I want to find the least squares estimator of $\beta $, so I need to solve the following optimization problem

$\min (Y-X\beta )^{t}(Y-X\beta )$

$\text{s.t. }\sum _{i=1}^{n}{\beta}_{i}=1$

Let's set

$L=(Y-X\beta )^{t}(Y-X\beta )-\lambda ({U}^{t}\beta -1)={Y}^{t}Y+{\beta}^{t}{X}^{t}X\beta -2{\beta}^{t}{X}^{t}Y-\lambda ({U}^{t}\beta -1)$

where $U$ is a dummy vector of ones (and therefore ${U}^{t}\beta =\sum _{i=1}^{n}{\beta}_{i}$).

Take derivatives

$\frac{\partial L}{\partial \beta}=2{X}^{t}X\beta -2{X}^{t}Y-\lambda U=0$

$\frac{\partial L}{\partial \lambda}=-({U}^{t}\beta -1)=0$

So from the first equation we can get an expression for $\beta $, but what should I do with the $\lambda $? The second equation doesn't seem to be useful to get rid of it.
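A sketch of how the two equations might be combined, assuming ${X}^{t}X$ is invertible and $U$ has one entry per coefficient. The first equation gives $\beta ={b}_{OLS}+\frac{\lambda}{2}({X}^{t}X)^{-1}U$, where ${b}_{OLS}=({X}^{t}X)^{-1}{X}^{t}Y$ is the unrestricted estimator. Substituting this into ${U}^{t}\beta =1$ yields $\frac{\lambda}{2}=\frac{1-{U}^{t}{b}_{OLS}}{{U}^{t}({X}^{t}X)^{-1}U}$. The function name and the simulated data below are illustrative only.

```python
import numpy as np


def sum_to_one_least_squares(X, y):
    """Least squares subject to sum(beta) = 1, via the Lagrange conditions."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    u = np.ones(X.shape[1])                # one entry per coefficient
    xtx = X.T @ X
    b_ols = np.linalg.solve(xtx, X.T @ y)  # unrestricted estimator
    w = np.linalg.solve(xtx, u)            # (X'X)^{-1} U
    half_lambda = (1.0 - u @ b_ols) / (u @ w)
    beta = b_ols + half_lambda * w
    return beta, 2.0 * half_lambda


# Simulated data, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = X @ np.array([0.5, 0.3, 0.4]) + 0.1 * rng.normal(size=30)

beta, lam = sum_to_one_least_squares(X, y)
print(beta, beta.sum(), lam)
```

So the second equation is what pins down $\lambda $: it is used after the first equation has expressed $\beta $ in terms of $\lambda $.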

Multiple Regression
Answered

Bernard Boyer
2022-07-19

What is the best measure of the contribution of a variable in multiple linear regression? I was thinking of using the coefficient ratio as a marker of a variable's contribution.

For example:

If the equation is

${Y}_{predicted}={a}_{1}{X}_{1}+{a}_{2}{X}_{2}+{a}_{3}$

Then ${X}_{1}$'s contribution can be written down as:

$\frac{{a}_{1}}{{a}_{1}+{a}_{2}+{a}_{3}}$

Is there another way to express the contribution? With this ratio, if any coefficient is negative, a variable's contribution can exceed $100\%$.
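One commonly used alternative, offered as a sketch and not as the single best measure, is to compare standardized coefficients: each slope is rescaled by the standard deviation of its variable and of the response, so the units of the predictors no longer matter. Taking absolute values before forming shares keeps every share between 0 and 100%. The function name, the share definition and the simulated data are assumptions made for illustration.

```python
import numpy as np


def standardized_coefficients(X, y):
    """Slopes of an OLS fit with intercept, rescaled to standard-deviation units."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([X, np.ones(len(y))])  # last column is the constant a3
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    slopes = coef[:-1]                              # drop the constant
    return slopes * X.std(axis=0, ddof=1) / y.std(ddof=1)


# Simulated data: X2 is measured on a scale ten times larger than X1.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2)) * np.array([1.0, 10.0])
y = 3.0 * X[:, 0] - 0.3 * X[:, 1] + 5.0 + rng.normal(scale=0.5, size=200)

beta_std = standardized_coefficients(X, y)
shares = np.abs(beta_std) / np.abs(beta_std).sum()
print(beta_std, shares)
```

In this toy example the raw coefficients differ by a factor of ten, yet the standardized ones have similar magnitude, which is why a ratio of raw coefficients can be misleading.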

Multiple Regression
Answered

Marisol Rivers
2022-07-18

Q1.

Model 1: $Y={X}_{1}{\beta}_{1}+\epsilon $

Model 2: $Y={X}_{1}{\beta}_{1}+{X}_{2}{\beta}_{2}+\epsilon $

(a) Suppose that Model 1 is true. If we compute the OLS estimator ${b}_{1}$ of ${\beta}_{1}$ in Model 2, what will happen to the size and power properties of the test?

(b) Suppose that Model 2 is true. If we compute the OLS estimator ${b}_{1}$ of ${\beta}_{1}$ in Model 1, what will happen to the size and power properties of the test?

-> Here is my guess.

(a) ${b}_{1}$ is an unbiased but inefficient estimator. (I calculated it using the formula for "inclusion of an irrelevant variable", ${b}_{1}=({X}_{1}^{\prime}{M}_{2}{X}_{1})^{-1}{X}_{1}^{\prime}{M}_{2}Y$, where ${M}_{2}$ is a symmetric and idempotent matrix.) Inefficient means that it has a larger variance, thus the size increases and the power increases too.

(b) ${b}_{1}$ is a biased but efficient estimator. (I used the formula for "exclusion of a relevant variable", ${b}_{1}=({X}_{1}^{\prime}{X}_{1})^{-1}{X}_{1}^{\prime}Y$.) I am stuck here. What should I conclude from that information?

Q2.

Let $Q$ and $P$ be the quantity and price. The relation between them differs across the regions east, west, south and north, and also across the 4 seasons. Construct a model.

-> Actually, I don't know much about dummy variables, so could anyone please help me solve this problem?
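For Q2, one possible setup (a sketch, not the only valid specification) uses 3 region dummies and 3 season dummies, with one region and one season as the baseline, and interacts each dummy with $P$ so that both the intercept and the slope can differ by region and by season. The season names, the choice of baseline and the function names below are assumptions for illustration.

```python
import numpy as np

REGIONS = ["east", "west", "south", "north"]        # "east" is the baseline
SEASONS = ["spring", "summer", "autumn", "winter"]  # "spring" is the baseline


def dummies(labels, levels):
    """0/1 columns for every level except the first, which is the baseline."""
    labels = np.asarray(labels)
    return np.column_stack([(labels == lev).astype(float) for lev in levels[1:]])


def design_matrix(price, region, season):
    """Columns: 1, P, region dummies, season dummies, and each dummy times P."""
    price = np.asarray(price, dtype=float)
    d_reg = dummies(region, REGIONS)
    d_sea = dummies(season, SEASONS)
    return np.column_stack([
        np.ones_like(price),     # common intercept
        price,                   # common slope on P
        d_reg,                   # intercept shifts by region
        d_sea,                   # intercept shifts by season
        d_reg * price[:, None],  # slope shifts by region
        d_sea * price[:, None],  # slope shifts by season
    ])


X = design_matrix(
    [1.0, 2.0, 3.0],
    ["east", "west", "north"],
    ["spring", "winter", "summer"],
)
print(X.shape)  # (3, 14)
```

Regressing $Q$ on these 14 columns would then estimate a baseline line for east in spring plus the shifts for every other region and season. Using all 4 dummies of a group together with the intercept would make the columns linearly dependent, which is why one level per group is dropped.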

Multiple Regression
Answered

anudoneddbv
2022-07-16

I would like to know something easy but very important.

Imagine I have a dataset with no missing values (0 NA), a perfect dataset that has already been cleaned. I have to do a PCA on this dataset, which has a lot of individuals and variables (95 individuals and 10 variables).

I have to do both a multiple regression and a PCA.

Should I start with the multiple regression, possibly delete some individuals whose Cook's distance is above the limit, and then do the PCA on the "new" dataset?

Or should I start with the PCA on the complete dataset, and then do the multiple regression?

In conclusion, should I do:

- PCA

- multiple regression

or

- multiple regression

- PCA

Thank you for helping me!
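For the step that screens individuals by Cook's distance, a minimal sketch of the computation is given here. It assumes an OLS fit with an intercept and uses the common rule-of-thumb cutoff $4/n$, which is only one of several conventions and may differ from "the limit" meant above. The data are simulated, with one influential individual planted at index 0.

```python
import numpy as np


def cooks_distance(X, y):
    """Cook's distance of every observation in an OLS fit with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    n, p = design.shape
    # Hat matrix H = Z (Z'Z)^{-1} Z'; its diagonal holds the leverages.
    hat = design @ np.linalg.solve(design.T @ design, design.T)
    leverage = np.diag(hat)
    resid = y - hat @ y
    s2 = resid @ resid / (n - p)  # residual variance estimate
    return resid**2 / (p * s2) * leverage / (1.0 - leverage) ** 2


# Simulated data with 95 individuals; individual 0 is made influential.
rng = np.random.default_rng(0)
X = rng.normal(size=(95, 3))
y = 1.0 + X @ np.array([1.0, 2.0, 3.0]) + rng.normal(scale=0.5, size=95)
X[0] = [8.0, 8.0, 8.0]
y[0] = -50.0

d = cooks_distance(X, y)
threshold = 4.0 / len(y)  # rule-of-thumb cutoff, an assumption
keep = d <= threshold     # mask of individuals retained for later steps
print(d.argmax(), int(keep.sum()))
```

The mask `keep` could then be applied to the rows before a PCA, if that ordering is the one chosen.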

Multiple Regression
Answered

scherezade29pc
2022-07-14

Euclidean distance is not linear in high dimensions. However, in multiple regression the idea is to minimize squared distances from data points to a hyperplane.

Other data analysis techniques have been considered problematic for their reliance on Euclidean distances (nearest neighbors), and dimensionality reduction techniques have been proposed.

Why is this not a problem in multiple regression?