Scalable Parallel Algorithms for Incomplete Factorization Preconditioning

Collaborators:

David Hysom

Objective:

The parallel computation of robust preconditioners is a priority for solving large systems of equations iteratively in several scientific computing problems. We are developing parallel algorithms and software that can compute incomplete factorization preconditioners for high level fill.

Approach:

We assume that the adjacency graph of the coefficient matrix is well-partitionable: i.e., that it can be partitioned into subgraphs of roughly equal sizes such that each subgraph has few boundary nodes relative to the number of interior nodes. We map the subgraphs to processors, form a subdomain interconnection graph, and order the subdomains by a graph coloring to reduce global dependences. On each subdomain, we locally reorder the interior vertices before the boundary vertices. This reordering limits the fill that joins a subgraph on one processor to a subgraph on another, and enhances the concurrency in the computation. The preconditioner computation takes places in two phases: in the first phase, each processor computes the rows of the preconditioner corresponding to the interior vertices of their subdomains. In the second phase, the rows corresponding to the boundary nodes are computed. This approach can make use of level-based and numerical threshold based algorithms for computing preconditioners in parallel.

Accomplishment:

We have reported results on up to 216 processors of the SGI Origin, the Cray T3E, the Sun HPC 10000, and Coral, the ICASE Beowulf cluster. These results show that the algorithm is scalable, in agreement with our analysis of the algorithm. Our parallel code, EUCLID, has been released with the HYPRE preconditioner package at Lawrence Livermore National Lab, and is available with interaces to PETSc.

back to highlights page