Parallel H-Matrix Arithmetic for Shared Memory Systems
Ronald Kriemann (MPI Leipzig)
H-matrices allow the usual matrix arithmetic to be performed in an
efficient, almost optimal way. A further performance improvement can be
obtained by parallelising the underlying algorithms.
Of special interest are shared memory systems with a moderate number of
processors, i.e. with 2 to 32 processors. These systems provide a simple and
widely supported programming interface in the form of
POSIX-Threads. Furthermore, they are widely available, e.g. as
workstations or compute servers.
Online scheduling algorithms are presented for matrix building, matrix
multiplication and inversion. These algorithms achieve a good or even optimal
parallel speedup without any knowledge of the actual costs. This property is
of particular interest for H-matrices with a fixed accuracy, since the rank,
and therefore the cost, is not known a priori. For the matrix-vector
multiplication, an offline scheduling algorithm with optimal parallel speedup
is introduced.
The theoretical results for all parallel algorithms are confirmed by numerical
examples from different BEM applications.