Scalable Parallel Algorithms for Incomplete Factorization Preconditioning
Collaborators:
David Hysom
Objective:
The parallel computation of robust preconditioners
is a priority for solving large systems of equations
iteratively in several scientific computing problems.
We are developing parallel algorithms and software that
can compute incomplete factorization
preconditioners for high level fill.
Approach:
We assume that the adjacency graph of the coefficient matrix
is well-partitionable: i.e., that it
can be partitioned into subgraphs
of roughly equal sizes such that each subgraph has few
boundary nodes relative to the number of interior nodes.
We map the subgraphs to processors, form
a subdomain interconnection graph, and order the subdomains
by a graph coloring to reduce global dependences.
On each subdomain, we locally reorder the interior vertices before
the boundary vertices. This reordering limits the fill
that joins a subgraph on one processor to a subgraph on another,
and enhances the concurrency in the computation.
The preconditioner computation takes places in two phases:
in the first phase, each processor computes the rows of the preconditioner
corresponding to the interior vertices of their subdomains.
In the second phase, the rows corresponding to the boundary nodes
are computed.
This approach can make use of level-based and numerical threshold based
algorithms for computing preconditioners in parallel.
Accomplishment:
We have reported results on up to 216 processors
of the SGI Origin, the Cray T3E,
the Sun HPC 10000, and Coral, the ICASE
Beowulf cluster. These results show that the algorithm is scalable,
in agreement with our analysis of the algorithm.
Our parallel code, EUCLID, has been released with the HYPRE
preconditioner package at Lawrence Livermore National Lab,
and is available with interaces to PETSc.
back to
highlights page