Sometimes when solving very sparse equation systems A x = b with conjugate gradie

Lucille Cummings

Lucille Cummings

Answered question

2022-06-16

Sometimes when solving very sparse equation systems
A x = b
with conjugate gradient using computers, if A is a very sparse matrix, it can be difficult to utilize the hardware computational power maximally
Does there exist some way to rewrite
A x = b  or  A T A x = A T b
to be able to utilize hardware better?
One idea I had is that often A 2 , A 3 et.c. are increasingly non-sparse. Maybe it would be possible to take multiple steps at once..?
One motivation why this should be possible is that the Krylov subspaces which the Conjugate Gradient investigates are precisely the powers of the matrix. A second motivation why this is possible is of course Caley Hamilton theorem
P ( A ) = 0 A 1 = P 2 ( A )
For some polynomial P 2 other than P.

Answer & Explanation

Patricia Curry

Patricia Curry

Beginner2022-06-17Added 15 answers

From a mathematical point of view the main problem with the idea of use A T A is due to the classical error estimate of CG:
e n e 0 = | | x x n | | A | | x x 0 | | A < 2 ( K A 1 K A + 1 ) n , K A = λ m a x λ m i n
For the matrix A T A its condition number is:
K A T A = ( K A ) 2
so you get a worsening that can be influence also the convergence of CG.
To be precise I tell can be because this estimate is an upper bound for the general case with any distribution of eigenvalues with extremes λmin,λmax. Anyway I think this is the only estimate that can be use from a practical point of view, and there are examples in which this estimate is reached.
From computational aspect there is the intention to write a code faster than blas with the idea to optimize the CPU usage, without using the current blas.
I would like to precise that if the idea is to speed up code the more suitable way is to use the blas in the best way inside the code, not to rewrite the blas. This in general.
The usual blas are for dense algebra and over to implement the correct math algorithm they also optimize the movement of the data in the memory. Nowadays the CPU are so fast that for a lot of time they wait the data and do not work. So blas are written with an optic of low level, also with assembly code, and to obtain the max result are written for specific architecture.

Do you have a similar question?

Recalculate according to your conditions!

New Questions in College algebra

Ask your question.
Get an expert answer.

Let our experts help you. Answer in as fast as 15 minutes.

Didn't find what you were looking for?