dlap/dlap.html

<html>

  <head>
    <title>
      DLAP - Sparse Linear Algebra Package
    </title>
  </head>

  <body bgcolor="#EEEEEE" link="#CC0000" alink="#FF3300" vlink="#000055">

    <h1 align = "center">
      DLAP <br> Sparse Linear Algebra Package
    </h1>

    <hr>

    <p>
      <b>DLAP</b>
      is a FORTRAN90 library which
      implements a number of routines
      for solving sparse linear systems,
      by Anne Greenbaum and Mark Seager.
    </p>

    <p>
      <b>DLAP</b> contains "core" routines for the iterative solution of
      symmetric and non-symmetric positive definite and positive
      semi-definite linear systems.
    </p>

    <p>
      Included in this package are core routines to do
      <ul>
        <li>
          Iterative Refinement iteration,
        </li>
        <li>
          Preconditioned Conjugate Gradient iteration,
        </li>
        <li>
          Preconditioned Conjugate Gradient iteration on the
          Normal Equations,
        </li>
        <li>
          Preconditioned BiConjugate Gradient iteration,
        </li>
        <li>
          Preconditioned BiConjugate Gradient Squared iteration,
        </li>
        <li>
          Orthomin iteration,
        </li>
        <li>
          Generalized Minimum Residual iteration (GMRES).
        </li>
      </ul>
    </p>

    <p>
      The DLAP core routines generally do not interact directly with
      the sparse matrix.  Instead, they require the user to supply
      two routines:
      <ul>
        <li>
          <b>MATVEC</b> (Matrix Vector Multiply)
        </li>
        <li>
          <b>MSOLVE</b> (Preconditioning).
        </li>
      </ul>
      This allows the core routines to be written in a way that makes them
      independent of the matrix data structure.  For each core routine
      there are several drivers and support routines that allow the
      user to utilize Diagonal Scaling and Incomplete Cholesky factorization
      or Incomplete LU factorization as preconditioners with no
      coding.  The price for this convenience is that one must use the
      a specific matrix data structure: DLAP Column or DLAP Triad
      format.
    </p>

    <p>
      This document contains the specifications for the DLAP Version
      2.0 package, a Fortran package for the solution of large
      sparse linear systems, Ax = b, via preconditioned iterative
      methods.
    </p>

    <p>
      Included in this package are core routines to do
      <ul>
        <li>
          Iterative Refinement (Jacobi's method),
        </li>
        <li>
          Conjugate Gradient,
        </li>
        <li>
          Conjugate Gradient on the normal equations, AA'y = b,
          (where x = A'y and A' denotes the  transpose of A),
        </li>
        <li>
          BiConjugate Gradient,
        </li>
        <li>
          BiConjugate Gradient Squared,
        </li>
        <li>
          Orthomin
        </li>
        <li>
          Generalized Minimum Residual (GMRES) Iteration.
        </li>
      </ul>
    </p>

    <p>
      These core routines do not require a fixed data structure for
      storing the matrix A and the preconditioning matrix M.
      The  user  is free  to  choose any
      structure that facilitates  efficient solution  of the problem at
      hand.  The drawback  to this approach  is that the user must also
      supply at least two routines  (MATVEC and MSOLVE,  say).
    </p>

    <p>
      MATVEC must calculate y = A*x, given x and the user's data structure for
      A.  MSOLVE must solve,  r = M*z, for z (*NOT*  r) given r  and the
      user's data  structure for  M (or its  inverse).  The user should
      choose   M  so that M*A   is  approximately the   identity and the
      solution step r = M*z is "easy" to  solve.
    </p>

    <p>
      For some of the "core"
      routines (Orthomin,  BiConjugate Gradient and  Conjugate Gradient
      on the  normal equations)   the user must  also  supply  a matrix
      transpose times   vector  routine  (MTTVEC,  say)  and (possibly,
      depending    on the "core"  method)   a  routine  that solves the
      transpose  of   the   preconditioning    step     (MTSOLV,  say).
      Specifically, MTTVEC is a routine which calculates y = A'x, given
      x and the user's data structure for A (A' is the transpose of A).
      MTSOLV is a routine which solves the system r = M'z for z given r
      and the user's data structure for M.
    </p>

    <p>
      This process of writing the matrix vector operations  can be time
      consuming and error prone.  To alleviate these problems we have
      written drivers  for the  "core" methods that  assume the user
      supplies one of two specific data structures (DLAP Triad and DLAP
      Column format), see below.  Utilizing these data structures  we
      have augmented  each "core" method with two preconditioners:
      Diagonal Scaling and Incomplete Factorization.  Diagonal scaling
      is easy to implement, vectorizes very  well and for problems that
      are not too ill-conditioned  reduces the number  of iterations
      enough to warrant its use.  On the other hand, an Incomplete
      factorization (Incomplete  Cholesky for symmetric systems   and
      Incomplete LU for nonsymmetric systems) may take much longer to
      calculate, but it reduces the iteration count (for most problems)
      significantly.  Our implementations of IC and ILU vectorize for
      machines with hardware gather scatter, but the vector lengths can
      be quite short if the number of nonzeros in a column is not
      large.
    </p>

    <h2>
      DLAP Triad format
    </h2>

    <p>
      In the DLAP Triad format only the non-zeros are stored.  They may
      appear in any order.  The user supplies three  arrays of length
      NELT, where NELT  is the  number of non-zeros  in the  matrix:
      (IA(NELT), JA(NELT), A(NELT)).  If the matrix is symmetric then
      one need only store the lower triangle (including the diagonal)
      and NELT would be the corresponding number of non-zeros stored.
      For each non-zero the user puts the row and column index of that
      matrix element in the IA and JA  arrays.  The value of the
      non-zero matrix element is placed in  the corresponding location
      of the A array.   This is an extremely easy data structure to
      generate.  On the other hand, it is not very efficient on vector
      computers for the iterative  solution of linear systems.  Hence,
      DLAP changes this input data structure to the DLAP Column format
      for the iteration (but does not change it back).
    </p>

    <p>
      Here  is an example   of  the  DLAP  Triad storage  format  for a
      nonsymmetric 5x5 Matrix.  NELT=11.   Recall that the  entries may
      appear in any order.
      <pre>
      |11 12  0  0 15|
      |21 22  0  0  0|
      | 0  0 33  0 35|
      | 0  0  0 44  0|
      |51  0 53  0 55|
      </pre>
      Here is the DLAP Triad format for the same 5x5 matrix:
      <pre>
           1  2  3  4  5  6  7  8  9 10 11
       A: 51 12 11 33 15 53 55 22 35 44 21
      IA:  5  1  1  3  1  5  5  2  3  4  2
      JA:  1  2  1  3  5  3  5  2  5  4  1
      </pre>
    </p>

    <h2>
      DLAP Column format
    </h2>

    <p>
      In the DLAP Column format  the non-zeros are stored counting down
      columns (except for the  diagonal entry,  which must appear first
      in each "column") and are stored  in the real array A.   In other
      words, for each column in the matrix first put the diagonal entry
      in A.   Then put in  the other non-zero  elements going  down the
      column (except the  diagonal) in order.   The IA  array holds the
      row index for each non-zero.  The JA array holds the offsets into
      the  IA,  A arrays for the  beginning  of each column.   That is,
      IA(JA(ICOL)), A(JA(ICOL)) are the  first elements of the  ICOL-th
      column in IA   and A.  IA(JA(ICOL+1)-1),  A(JA(ICOL+1)-1) are the
      last elements  of  the ICOL-th column.   Note that we always have
      JA(N+1) = NELT+1, where N is the number of  columns in the matrix
      and NELT is the number of non-zeros in the matrix.  If the matrix
      is  symmetric one need  only store the  lower triangle (including
      the diagonal) and  NELT  would be the   corresponding   number of
      non-zeros stored.
    </p>

    <p>
      Here is  an  example of the  DLAP   Column storage format  for  a
      nonsymmetric 5x5 Matrix (in the  A and  IA arrays '|' denotes the
      end of a column):
      <pre>
      |11 12  0  0 15|
      |21 22  0  0  0|
      | 0  0 33  0 35|
      | 0  0  0 44  0|
      |51  0 53  0 55|
      </pre>
        Here is the DLAP Column format for the 5x5 matrix:
      <pre>
           1  2  3    4  5    6  7    8    9 10 11
       A: 11 21 51 | 22 12 | 33 53 | 44 | 55 15 35
      IA:  1  2  5 |  2  1 |  3  5 |  4 |  5  1  3
      JA:  1  4  6    8  9   12
      </pre>
    </p>

    <h2>
      Naming Conventions
    </h2>

    <p>
      DLAP  iterative  methods,    matrix vector    and  preconditioner
      calculation  routines   follow a naming   convention  which, when
      understood, allows one to determine the iterative method and data
      structure(s) used.  The  subroutine  naming convention  takes the
      following form:
      <blockquote>
        P[S][M]DESC
      </blockquote>
      where P stands  for the precision (or  data type)  of the routine
      and is required in   all names, S denotes   whether  or  not  the
      routine requires the DLAP Triad or Column format (it  does if the
      second letter  of the name   is S and   does not otherwise),  the
      optional  M   stands for the   type of preconditioner used  (only
      appears in drivers for  "core" routines) and  DESC is some number
      of letters describing the method  or purpose of the  routine.  In
      this  incarnation of DLAP  only single  precision  data types are
      supported (no double precision or complex data type routines have
      been written).   Hence,  all routines start with an   S,  boring.
      The  brackets around  S  and M  designate  that these  fields are
      optional.
    </p>

    <p>
      The  following is a  list  of  the  "DESC"  fields for  iterative
      methods  and   their meaning: BCG:   BiConjugate   Gradient;  CG:
      Conjugate  Gradient; CGN:   Conjugate  Gradient   on   the Normal
      equations; CGS, CS: biConjugate Gradient Squared; GMRES, GMR, GM:
      Generalized Minimum  RESidual; IR, R:  Iterative Refinement; JAC:
      JACobi's method; GS: Gauss-Seidel; OMN, OM: Orthomin.
    </p>

    <p>
      Here are some examples of the routines:
      <ol>
        <li>
          SBCG: Single precision BiConjugate Gradient "core" routine.
          One can deduce that this is a "core" routine, because the S and M
          fields are missing and BiConjugate Gradient is an iterative
          method.
        </li>
        <li>
          SSDBCG: Single precision, DLAP data structure BCG with
          Diagonal scaling.
        </li>
        <li>
          SSLUBC: Single precision, BCG with
          incomplete LU factorization as the preconditioning.
        </li>
        <li>
          SCG: Single precision Conjugate Gradient "core" routine.
        </li>
        <li>
          SSDCG: Single precision, DLAP data structure Conjugate Gradient with
          Diagonal scaling.
        </li>
        <li>
          SSICCG: Single precision, DLAP data structure Conjugate Gradient
          with Incomplete Cholesky factorization preconditioning.
        </li>
      </ol>
    </p>

    <h2>
      Which Method To Use
    </h2>

    <p>
      In solving a large sparse linear system Ax = b using an iterative
      method, it is not necessary to actually store the matrix A.
      Rather, what is needed is a procedure  for multiplying the matrix
      A times a given vector y to obtain the matrix-vector product, Ay.
      DLAP has been written to take advantage of this fact.  The higher
      level routines in the package require storage only of the nonzero
      elements of A (and their positions), and even this can be
      avoided, if the user writes his own subroutine for multiplying
      the matrix times a vector and calls the lower-level iterative
      routines in the package.
    </p>

    <p>
      If the matrix A is ill-conditioned, then most iterative methods
      will be slow to converge (if they converge at all!).  To improve
      the convergence rate, one may use a "matrix splitting," or,
      "preconditioning matrix," say, M.  It is then necessary to solve,
      at each iteration, a linear system with coefficient matrix M.  A
      good preconditioner M should have two properties: (1) M should
      "approximate" A, in the sense that the matrix inv(M)*A  (or some
      variant thereof) is better conditioned than the original matrix
      A; and (2) linear systems with coefficient matrix M should be
      much easier to solve than the original system with coefficient
      matrix A.   Preconditioning routines in the DLAP package are
      separate from the iterative routines, so that any of the
      preconditioners provided in the package, or one that the user
      codes, can be used with any of the iterative routines.
    </p>

    <p>
      If you are  willing to live with either the DLAP Triad or Column
      matrix data structure  you can then  choose one of two types of
      preconditioners to use: diagonal scaling or incomplete
      factorization.  To choose between these two methods requires
      knowing something about the computer you're going to run these
      codes on and how well incomplete factorization  approximates the
      inverse of your matrix.
    </p>

    <p>
      Let's suppose you have a scalar machine.   Then, unless the
      incomplete factorization is very,  very poor this is *GENERALLY*
      the method to choose.  It will reduce the number of iterations
      significantly and is not all that expensive to compute.  So if
      you have just one  linear system to solve and "just want to get
      the job  done" then try  incomplete factorization first.   If you
      are thinking of integrating some DLAP  iterative method into your
      favorite "production code" then  try incomplete  factorization
      first, but also check to see that  diagonal  scaling is indeed
      slower for a large sample of test problems.
    </p>

    <p>
      Let's now  suppose you have a vector computer with hardware
      gather/scatter support (Cray X-MP, Y-MP, SCS-40 or Cyber 205, ETA
      10, ETA Piper  or Convex C-1, etc.).   Then it's much harder to
      choose between the  two  methods.  The versions of incomplete
      factorization in DLAP do in fact vectorize, but have short vector
      lengths and the factorization step is relatively more expensive.
      Hence, for most problems (i.e., unless your problem  is ill
      conditioned, sic!) diagonal scaling is faster, with its very
      fast set up time and vectorized  (with long vectors)
      preconditioning step (even though it may take more iterations).
      If you have several systems (or right hand sides) to  solve that
      can utilize the same preconditioner then the cost of the
      incomplete factorization can be  amortized over these several
      solutions.  This situation gives more advantage to the incomplete
      factorization methods.  If you have a vector machine without
      hardware gather/scatter (Cray  1, Cray 2  & Cray 3) then the
      advantages for incomplete factorization are even less.
    </p>

    <p>
      If you're trying to shoehorn DLAP into your favorite "production
      code" and can not easily generate either the DLAP Triad or Column
      format, then you are left to your own devices in terms of
      preconditioning.  Also, you may find that the preconditioners
      supplied with DLAP are not sufficient for your problem.  In this
      situation, we would recommend that you talk with a numerical
      analyst versed in iterative methods about writing other
      preconditioning subroutines (e.g.,  polynomial preconditioning,
      shifted incomplete factorization, SOR or SSOR iteration).  You
      can always "roll your own" by using the "core" iterative methods
      and supplying your own MSOLVE and MATVEC (and possibly MTSOLV and
      MTTVEC) routines.  If you do develop a new preconditioner for the
      DLAP data structure send the code to us  (if you can do that with
      no strings attached!, i.e. copyright restrictions) and we'll add
      it to the package!
    </p>

    <p>
      If your matrix is symmetric then you would want to use one of the
      symmetric system  solvers.    If your system  is  also positive
      definite,  (Ax,x) (Ax dot  product with x) is  positive for all
      non-zero vectors x, then use Conjugate Gradient (SCG,  SSDCG,
      SSICSG).  If you're not sure it's SPD  (symmetric and Positive
      Definite) then try SCG anyway and if it works, fine.  If you're
      sure your matrix is not positive definite then you may want to
      try the iterative refinement methods (SIR) or the GMRES code
      (SGMRES) if SIR converges too slowly.
    </p>

    <p>
      Nonsymmetric systems are an area of active research in numerical
      analysis and there are new strategies being developed.
      Consequently take the following advice with a grain of salt.  If
      your matrix is positive definite, (Ax,x)  (Ax  dot product with x
      is positive for all non-zero vectors x), then you can use any of
      the methods for nonsymmetric systems (Orthomin, GMRES,
      BiConjugate Gradient, BiConjugate Gradient  Squared and Conjugate
      Gradient applied to the normal equations).  If your system is not
      too ill conditioned then try  BiConjugate Gradient Squared (BCGS)
      or GMRES (SGMRES).  Both of these methods converge very quickly
      and do not require A' or M' (' denotes transpose) information.
      SGMRES does require some additional storage, though.  If the
      system is very ill conditioned or nearly positive  indefinite
      ((Ax,x) is positive, but may be very small), then GMRES should
      be the first choice, but try the other methods if you have to
      fine tune the solution process for a "production code".  If you
      have a great preconditioner for the normal equations (i.e., M is
      an approximation to the inverse of AA' rather than just A) then
      this is not a bad route to travel.  Old wisdom would say that the
      normal equations are a disaster (since it squares the condition
      number of the system and SCG convergence is linked to this number
      of infamy), but some preconditioners (like incomplete
      factorization) can reduce the condition number back below that of
      the original system.
    </p>

    <p>
      <b>DLAP</b> can store a sparse matrix in a file, using
      <a href = "../../data/dlap/dlap.html">
      SLAP/DLAP Sparse Matrix File Format</a>.  Such a file can be written
      by the routine <b>DTOUT</b>, or read back in by <b>DTIN</b>.
    </p>

    <p>
      There is also a routine <b>DBHIN</b> which can read in a file in
      <a href = "../../data/hb/hb.html">
      Harwell Boeing Sparse Matrix File Format</a>.
    </p>

    <h3 align = "center">
      Related Data and Programs:
    </h3>

    <p>
      <a href = "../../f_src/blas1_d/blas1_d.html">
      BLAS1_D</a>,
      a FORTRAN90 library which
      carries out vector-vector operations.
    </p>


    <p>
      <a href = "../../c_src/csparse/csparse.html">
      CSPARSE</a>,
      a C library which
      implements direct methods of solving
      sparse linear systems,
      by Timothy Davis.
    </p>

    <p>
      <a href = "../../f_src/dlap_io/dlap_io.html">
      DLAP_IO</a>,
      a FORTRAN90 library which
      reads and writes files
      describing sparse matrices used by DLAP.
    </p>

    <p>
      <a href = "../../data/dsp/dsp.html">
      DSP</a>,
      a data directory which
      contains a description and
      examples of the DSP format for storing sparse matrices,
      which is equivalent to the SLAP/DLAP Triad format.
    </p>

    <p>
      <a href = "../../f_src/hb_to_st/hb_to_st.html">
      HB_TO_ST</a>,
      a FORTRAN90 program which
      converts the sparse matrix information stored in a Harwell-Boeing
      file into a sparse triplet file.
    </p>

    <p>
      <a href = "../../f_src/linplus/linplus.html">
      LINPLUS</a>,
      a FORTRAN90 library which
      manipulates
      matrices in a variety of formats.
    </p>

    <p>
      <a href = "../../f_src/machine/machine.html">
      MACHINE</a>,
      a FORTRAN90 library which
      supplies certain
      machine arithmetic constants.  A copy of this library
      is used by <b>DLAP</b>.
    </p>

    <p>
      <a href = "../../f_src/mgmres/mgmres.html">
      MGMRES</a>,
      a FORTRAN90 library which
      applies the restarted GMRES algorithm
      to solve a sparse linear system,
      by Lili Ju.
    </p>

    <p>
      <a href = "../../f_src/slatec/slatec.html">
      SLATEC</a>,
      a FORTRAN90 library which
      includes <b>DLAP</b>.
    </p>

    <p>
      <a href = "../../data/sparse_cc/sparse_cc.html">
      SPARSE_CC</a>,
      a data directory which
      contains a description and examples of the CC format,
      ("compressed column") for storing a sparse matrix,
      including a way to write the matrix as a set of three files.
    </p>

    <p>
      <a href = "../../data/sparse_cr/sparse_cr.html">
      SPARSE_CR</a>,
      a data directory which
      contains a description and examples of the CR format,
      ("compressed row") for storing a sparse matrix,
      including a way to write the matrix as a set of three files.
    </p>

    <p>
      <a href = "../../f_src/sparsekit/sparsekit.html">
      SPARSEKIT</a>,
      a FORTRAN90 library which
      carries out sparse matrix operations, 
      by Yousef Saad.
    </p>

    <p>
      <a href = "../../f_src/sparsepak/sparsepak.html">
      SPARSEPAK</a>,
      a FORTRAN90 library which
      forms an obsolete version of
      the Waterloo Sparse Matrix Package.
    </p>

    <p>
      <a href = "../../f_src/xerror/xerror.html">
      XERROR</a>,
      is a FORTRAN90 library which
      prints error message routines used by DLAP.
    </p>

    <h3 align = "center">
      Author:
    </h3>

    <p>
      Anne Greenbaum, <br>
      Courant Institute;<br>
      <br>
      Mark Seager, <br>
      Lawrence Livermore National Laboratory
    </p>

    <h3 align = "center">
      Reference:
    </h3>

    <p>
      <ol>
        <li>
          Peter Brown, Alan Hindmarsh,<br>
          Reduced Storage Matrix Methods In Stiff ODE Systems,<br>
          Technical Report UCRL-95088, Revision 1,<br>
          Lawrence Livermore National Laboratory, June 1987.
        </li>
        <li>
          Paul Concus, Gene Golub, Dianne OLeary,<br>
          A Generalized Conjugate Gradient Method for the Numerical
          Solution of Elliptic Partial Differential Equations,<br>
          in Symposium on Sparse Matrix Computations,<br>
          edited by James Bunch, Donald Rose,<br>
          Academic Press, 1979,<br>
          ISBN: 0121410501,<br>
          LC: QA188.S9.
        </li>
        <li>
          Gene Golub, Charles VanLoan,<br>
          Matrix Computations,<br>
          Third Edition,<br>
          Johns Hopkins, 1996,<br>
          ISBN: 0-8018-4513-X,<br>
          LC: QA188.G65.
        </li>
        <li>
          Louis Hageman, David Young,<br>
          Applied Iterative Methods,<br>
          Academic Press, 1981,<br>
          ISBN: 0-12-313340-8,<br>
          LC: QA297.8.H34.
        </li>
        <li>
          Ron Jones, David Kahaner,<br>
          XERROR, The SLATEC Error Handling Package,<br>
          Technical Report SAND82-0800,<br>
          Sandia National Laboratories, 1982.
        </li>
        <li>
          Ron Jones, David Kahaner,<br>
          XERROR, The SLATEC Error Handling Package,<br>
          Software: Practice and Experience,<br>
          Volume 13, Number 3, 1983, pages 251-257.
        </li>
        <li>
          Erik Kaasschieter,<br>
          The solution of non-symmetric linear systems by bi-conjugate
          gradients or conjugate gradients squared,<br>
          Technical Report 86-21,<br>
          Delft University of Technology Report, 1986.
        </li>
        <li>
          Charles Lawson, Richard Hanson, David Kincaid, Fred Krogh,<br>
          Algorithm 539:
          Basic Linear Algebra Subprograms for Fortran Usage,<br>
          ACM Transactions on Mathematical Software,<br>
          Volume 5, Number 3, September 1979, pages 308-323.
        </li>
        <li>
          Mark Seager,<br>
          A SLAP for the Masses,<br>
          Technical Report: UCRL-100267,<br>
          Lawrence Livermore National Laboratory, December 1988.
        </li>
        <li>
          Richard Singleton,<br>
          Algorithm 347:
          An Efficient Algorithm for Sorting with Minimal Storage,<br>
          Communications of the ACM,<br>
          Volume 12, Number 3, March 1969, pages 185-187.
        </li>
        <li>
          Peter Sonneveld,<br>
          CGS, a fast Lanczos-type solver for nonsymmetric linear systems,<br>
          Technical Report 84-16,<br>
          Department of Mathematics and Informatics,<br>
          Delft University of Technology, 1984.
        </li>
        <li>
          <a href = "http://www.netlib.org/slap/index.html">
                     http://www.netlib.org/slap/index.html</a>,
          NETLIB web site.
        </li>
      </ol>
    </p>

    <h3 align = "center">
      Source Code:
    </h3>

    <p>
      <ul>
        <li>
          <a href = "dlap.f90">dlap.f90</a>, the source code.
        </li>
        <li>
          <a href = "dlap.sh">dlap.sh</a>,
          commands to compile the source code.
        </li>
      </ul>
    </p>

    <h3 align = "center">
      Examples and Tests:
    </h3>

    <p>
      <b>DLAP_PRB</b> is a formal test program that checks all the DLAP
      routines.  However, it's not very useful as a template for
      writing your own program!
      <ul>
        <li>
          <a href = "dlap_prb.f90">dlap_prb.f90</a>, a sample problem.
        </li>
        <li>
          <a href = "dlap_prb.sh">dlap_prb.sh</a>,
          commands to compile, link and run the sample problem.
        </li>
        <li>
          <a href = "dlap_prb_output.txt">dlap_prb_output.txt</a>,
          the output file.
        </li>
      </ul>
    </p>

    <p>
      <b>DLAP_DGMRES_PRB</b> is a "simple" test program applies the GMRES
      algorithm to the [-1,2,-1] matrix.  No preconditioning is used.  The
      entries of the matrix are stored naively, in DLAP Triad format.  You
      might find this example easier to adapt to your own purposes.
      <ul>
        <li>
          <a href = "dlap_dgmres_prb.f90">dlap_dgmres_prb.f90</a>,
          a sample problem.
        </li>
        <li>
          <a href = "dlap_dgmres_prb.sh">dlap_dgmres_prb.sh</a>,
          commands to compile, link and run the sample problem.
        </li>
        <li>
          <a href = "dlap_dgmres_prb_output.txt">dlap_dgmres_prb_output.txt</a>,
          the output file.
        </li>
      </ul>
    </p>

    <h3 align = "center">
      List of Routines:
    </h3>

    <p>
      <ul>
        <li>
          <b>I1MACH</b> returns integer machine constants.
        </li>
        <li>
          <b>ISAMAX</b> finds the index of the vector element of maximum absolute value.
        </li>
        <li>
          <b>ISDBCG:</b> Non-Symmetric Linear system, Sparse, Iterative Precondition, Stop Test.
        </li>
        <li>
          <b>ISDCG</b> calculates the stop test for the Conjugate Gradient iteration scheme.
        </li>
        <li>
          <b>ISDCGN:</b> stop test for the Conjugate Gradient applied to the normal equations.
        </li>
        <li>
          <b>ISDCGS:</b> stop test for the BiConjugate Gradient iteration scheme.
        </li>
        <li>
          <b>ISDGMR:</b> stop test for the GMRES iteration.
        </li>
        <li>
          <b>ISDIR:</b> stop test for the iterative refinement iteration.
        </li>
        <li>
          <b>ISDOMN</b> : stop test for the Orthomin iteration.
        </li>
        <li>
          <b>J4SAVE</b> saves and recalls variables needed for error handling.
        </li>
        <li>
          <b>QS2I1R</b> sorts an integer array, and updates companion integer and real arrays.
        </li>
        <li>
          <b>D1MACH</b> returns double precision machine constants.
        </li>
        <li>
          <b>RAND</b> generates a uniformly distributed random number.
        </li>
        <li>
          <b>DASUM</b> sums the absolute values of the entries of a vector.
        </li>
        <li>
          <b>DAMAX</b> returns the maximum absolute value of the entries in a vector.
        </li>
        <li>
          <b>DAXPY</b> adds a constant times one vector to another.
        </li>
        <li>
          <b>DAXPYX</b> computes a constant times a vector plus a vector.
        </li>
        <li>
          <b>DBCG:</b> solve a Non-Symmetric system using Preconditioned BiConjugate Gradient.
        </li>
        <li>
          <b>DBHIN:</b> read a sparse linear system in Boeing Harwell format.
        </li>
        <li>
          <b>DCG:</b> solve PDS system using Preconditioned Conjugate Gradient.
        </li>
        <li>
          <b>DCGN:</b> solve system using Preconditioned Conjugate Gradient/Normal Equations.
        </li>
        <li>
          <b>DCGS</b> solve Non-Symmetric system using Preconditioned BiConjugate Gradient.
        </li>
        <li>
          <b>DCHKW</b> checks work array lengths.
        </li>
        <li>
          <b>DCOPY</b> copies one real vector into another.
        </li>
        <li>
          <b>DCPPLT</b> prints a DLAP column format matrix.
        </li>
        <li>
          <b>DDOT</b> forms the dot product of two vectors.
        </li>
        <li>
          <b>DDSDOT</b> computes the dot product of real vectors using double precision.
        </li>
        <li>
          <b>DGMRES</b> solves a nonsymmetric system using preconditioned GMRES.
        </li>
        <li>
          <b>DHELS</b> minimizes the norm of B-A*X using factors from SHEQR.
        </li>
        <li>
          <b>DHEQR</b> QR decomposes an upper Hessenberg matrix using Givens rotations.
        </li>
        <li>
          <b>DIR:</b> Linear system, Sparse, Iterative Precondition
        </li>
        <li>
          <b>DLLTI2</b> solves (L*D*L')*X=B, L is unit lower triangular, and D is diagonal.
        </li>
        <li>
          <b>DNRM2</b> computes the Euclidean norm of a vector.
        </li>
        <li>
          <b>DOMN:</b> Non-Symmetric Linear system, Sparse, Iterative Precondition, Orthomin
        </li>
        <li>
          <b>DORTH:</b> Non-Symmetric Linear system, Sparse, Iterative Precondition, Generalized Minimum Residual
        </li>
        <li>
          <b>DPIGMR:</b> Non-Symmetric Linear system, Sparse, Iterative Precondition, Generalized Minimum Residual
        </li>
        <li>
          <b>DRLCAL</b> calculates the scaled residual RL from the V(I)'s.
        </li>
        <li>
          <b>DROT</b> applies a plane rotation.
        </li>
        <li>
          <b>DROTG</b> constructs a Givens plane rotation.
        </li>
        <li>
          <b>DROTM</b> applies a modified Givens plane rotation.
        </li>
        <li>
          <b>DROTMG</b> constructs a modified Givens plane rotation.
        </li>
        <li>
          <b>DS2LT:</b> Linear system, DLAP Sparse, Lower Triangle.
        </li>
        <li>
          <b>DS2Y:</b> convert from DLAP Triad to DLAP Column format.
        </li>
        <li>
          <b>DSCAL</b> scales a vector by a constant.
        </li>
        <li>
          <b>DSD2S:</b> compute the inverse of the diagonal of A*A'.
        </li>
        <li>
          <b>DSDBCG:</b> Non-Symmetric Linear system, Sparse, Iterative Precondition
        </li>
        <li>
          <b>DSDCG:</b> Symmetric Linear system, Sparse, Iterative Precondition
        </li>
        <li>
          <b>DSDCGN:</b> Non-Symmetric Linear system solve, Sparse, Iterative Precondition
        </li>
        <li>
          <b>DSDCGS:</b> Non-Symmetric Linear system, Sparse, Iterative Precondition
        </li>
        <li>
          <b>DSDGMR:</b>  Non-Symmetric Linear system, Sparse, Iterative Precondition, Generalized Minimum Residual
        </li>
        <li>
          <b>DSDI</b> calculates the product X = DIAG*B where DIAG is a diagonal matrix.
        </li>
        <li>
          <b>DSDOMN:</b> Non-Symmetric Linear system solve, Sparse, Iterative Precondition
        </li>
        <li>
          <b>DSDS:</b> DLAP Sparse, Diagonal
        </li>
        <li>
          <b>DSDSCL:</b> DLAP Sparse, Diagonal
        </li>
        <li>
          <b>DSGS:</b> Linear system, Sparse, Iterative Precondition
        </li>
        <li>
          <b>DSICCG:</b> Symmetric Linear system, Sparse, Iterative Precondition, Incomplete Cholesky
        </li>
        <li>
          <b>DSICS:</b> Linear system, DLAP Sparse, Iterative Precondition, Incomplete Cholesky Factorization.
        </li>
        <li>
          <b>DSILUR:</b> Linear system, Sparse, Iterative Precondition
        </li>
        <li>
          <b>DSILUS:</b> Non-Symmetric Linear system, Sparse, Iterative Precondition, Incomplete LU Factorization
        </li>
        <li>
          <b>DSJAC</b> solves a linear system A*x=b using Jacobi iteration.
        </li>
        <li>
          <b>DSLI:</b> Linear system solve, Sparse, Iterative Precondition
        </li>
        <li>
          <b>DSLI2:</b> Linear system solve, Sparse, Iterative Precondition
        </li>
        <li>
          <b>DSLLTI:</b> Linear system solve, Sparse, Iterative Precondition
        </li>
        <li>
          <b>DSLUBC:</b> Non-Symmetric Linear system, Sparse, Iterative incomplete LU Precondition
        </li>
        <li>
          <b>DSLUCN:</b> Non-Symmetric Linear system, Sparse, Iterative Incomplete LU Precondition
        </li>
        <li>
          <b>DSLUCS:</b> Non-Symmetric Linear system, Sparse, Iterative incomplete LU Precondition
        </li>
        <li>
          <b>DSLUGM:</b> Non-Symmetric Linear system, Sparse, Iterative Precondition, Generalized Minimum Residual
        </li>
        <li>
          <b>DSLUI</b> applies the incomplete LU preconditioner.
        </li>
        <li>
          <b>DSLUI2</b> carries out the incomplete LU preconditioning.
        </li>
        <li>
          <b>DSLUI4</b> solves (L*D*U)'*X = B.
        </li>
        <li>
          <b>DSLUOM</b> is an incomplete LU Orthomin solver for A*x=b.
        </li>
        <li>
          <b>DSLUTI</b> is the DLAP MTSOLV for LDU Factorization.
        </li>
        <li>
          <b>DSMMI2</b> back solves for LDU factorization of the normal equations.
        </li>
        <li>
          <b>DSMMTI</b> is the DLAP MSOLVE for LDU Factorization of Normal Equations.
        </li>
        <li>
          <b>DSMTV:</b> Matrix transpose Vector Multiply, Sparse
        </li>
        <li>
          <b>DSMV,</b> Matrix Vector Multiply, Sparse
        </li>
        <li>
          <b>DSWAP</b> interchanges two vectors.
        </li>
        <li>
          <b>DTIN</b> Linear system, DLAP Sparse, Diagnostics
        </li>
        <li>
          <b>DTOUT</b> writes out DLAP Triad Format Linear System.
        </li>
        <li>
          <b>DXLCAL</b> computes the solution XL, the current DGMRES iterate.
        </li>
        <li>
          <b>XERABT</b> aborts program execution and print error message.
        </li>
        <li>
          <b>XERCLR</b> resets current error number to zero.
        </li>
        <li>
          <b>XERCTL</b> allow user control over handling of errors.
        </li>
        <li>
          <b>XERDMP</b> prints the error tables and then clears them.
        </li>
        <li>
          <b>XERMAX</b> sets maximum number of times any error message is to be printed.
        </li>
        <li>
          <b>XERPRT</b> prints error messages.
        </li>
        <li>
          <b>XERROR</b> processes an error message.
        </li>
        <li>
          <b>XERRWV</b> processes an error message.
        </li>
        <li>
          <b>XERSAV</b> records that an error has occurred.
        </li>
        <li>
          <b>XGETF</b> returns the current value of the error control flag.
        </li>
        <li>
          <b>XGETUA</b> returns unit number(s) to which error messages are being sent.
        </li>
        <li>
          <b>XGETUN</b> returns the (first) output file to which error messages are being sent.
        </li>
        <li>
          <b>XSETF</b> sets the error control flag.
        </li>
        <li>
          <b>XSETUA</b> sets logical unit numbers to which error messages are to be sent.
        </li>
        <li>
          <b>XSETUN</b> sets the output file to which error messages are to be sent.
        </li>
        <li>
          <b>TIMESTAMP</b> prints the current YMDHMS date as a time stamp.
        </li>
      </ul>
    </p>

    <p>
      You can go up one level to <a href = "../f_src.html">
      the FORTRAN90 source codes</a>.
    </p>

    <hr>

    <i>
      Last revised on 01 January 2010.
    </i>

    <!-- John Burkardt -->

  </body>

</html>