Suppose that $\mathbf{A}\in\mathbb{R}^{n\times n}$ is a positive definite matrix and $\mathbf{b}\in\mathbb{R}^{n}$ is a vector. Then the minimization of the quadratic function with a linear term has the closed-form solution

$\underset{\mathbf{x}\in\mathbb{R}^{n}}{\arg\min}\left(\frac{1}{2}\mathbf{x}^{\mathsf{T}}\mathbf{A}\mathbf{x}-\mathbf{b}^{\mathsf{T}}\mathbf{x}\right)=\mathbf{A}^{-1}\mathbf{b}$

I came across this in a machine learning book, but the book didn't provide a proof, and I'd like to understand why this identity holds. I hope someone can help me with it. I find that many machine learning books skip all of the proofs, which makes me uncomfortable.

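In case it helps, here is a quick numerical sanity check I ran (a minimal sketch assuming NumPy; building the positive definite matrix as $MM^{\mathsf{T}} + nI$ is just one convenient choice). It confirms that $\mathbf{A}^{-1}\mathbf{b}$ does give a smaller objective value than random perturbations around it, but of course that is not a proof:

```python
import numpy as np

rng = np.random.default_rng(0)

# Construct a symmetric positive definite A: M M^T is positive
# semidefinite, and adding n*I makes it strictly positive definite.
n = 5
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
b = rng.standard_normal(n)

def f(x):
    """The quadratic objective (1/2) x^T A x - b^T x."""
    return 0.5 * x @ A @ x - b @ x

# The claimed minimizer A^{-1} b (computed via a solve, not an inverse).
x_star = np.linalg.solve(A, b)

# f at the claimed minimizer should never exceed f at nearby points.
for _ in range(1000):
    x = x_star + rng.standard_normal(n)
    assert f(x_star) <= f(x)

print("f(A^{-1} b) =", f(x_star))
```

Every random point I sample has a larger objective value, which is consistent with the claim, but I'm still looking for the actual argument.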