Deromediqm

Answered

2022-07-22

Help in understanding properties of big O notation with norms of vectors

In Fast Exact Multiplication by the Hessian equation 1,

$O({\Vert \mathrm{\Delta}w\Vert}^{2})$ gets taken from RHS to LHS and $\mathrm{\Delta}w$ is substituted as rv where r is small scalar and v is a vector. I understand that $O({\Vert rv\Vert}^{2})=O({r}^{2}{\Vert v\Vert}^{2})$. But what I don't get is how did the sign of O not become negative when it was taken to LHS. And why did ${\Vert v\Vert}^{2}$ disappear form O. Is it because as r is tiny it will govern the O term and ${\Vert v\Vert}^{2}$ won't matter?

In Fast Exact Multiplication by the Hessian equation 1,

$O({\Vert \mathrm{\Delta}w\Vert}^{2})$ gets taken from RHS to LHS and $\mathrm{\Delta}w$ is substituted as rv where r is small scalar and v is a vector. I understand that $O({\Vert rv\Vert}^{2})=O({r}^{2}{\Vert v\Vert}^{2})$. But what I don't get is how did the sign of O not become negative when it was taken to LHS. And why did ${\Vert v\Vert}^{2}$ disappear form O. Is it because as r is tiny it will govern the O term and ${\Vert v\Vert}^{2}$ won't matter?

Answer & Explanation

Hassan Watkins

Expert

2022-07-23Added 18 answers

$O(f(x))$ refers to a quantity that's bounded (in the limit) by some multiple of f(x); it properly (IMHO) represents a set. That is, to say that $g(x)\in O(f(x))$ means that there's some constant C such that for all sufficiently large (or sufficiently small, depending on the direction of the limit) x, $|g(x)|<C|f(x)|$. But this means in particular that O() is 'sign-agnostic'; if $g(x)\in O(f(x))$, then $cg(x)\in O(f(x))$ for all constants c, positive or negative. This same fact is also what 'removes' the dependency on ${\Vert v\Vert}^{2}$; because the limit that's implicit in O() is being taken with respect to r and v doesn't depend on r, the quantity ${\Vert v\Vert}^{2}$ is 'constant' and can be absorbed into the constant in O().

Hayley Bernard

Expert

2022-07-24Added 5 answers

When we write x=O(z), where x is some expression, and z is some other quantity, then we mean that x is a quantity such that $|x|\le Cz$ for some number C. If we have $x=y+O(z)$, then we really mean x−y=O(z) in the sense I just described. So x−y is equal to some quantity z′ such that $|{z}^{\prime}|\le Cz$. So we can say

$y=x+O(z)$

too, since $y-x=-{z}^{\prime}$, and $|\phantom{\rule{negativethinmathspace}{0ex}}-\phantom{\rule{negativethinmathspace}{0ex}}{z}^{\prime}|=|{z}^{\prime}|\le Cz$. So the minus sign does not need to be kept track of, and it's best to think of O(z) as meaning a quantity, whose absolute value is bounded by a constant times z.

$y=x+O(z)$

too, since $y-x=-{z}^{\prime}$, and $|\phantom{\rule{negativethinmathspace}{0ex}}-\phantom{\rule{negativethinmathspace}{0ex}}{z}^{\prime}|=|{z}^{\prime}|\le Cz$. So the minus sign does not need to be kept track of, and it's best to think of O(z) as meaning a quantity, whose absolute value is bounded by a constant times z.

Most Popular Questions