My list of technical reports and preprints.

This is a list of my technical reports and preprints that you may look at:

Additive Schwarz Methods for Elliptic Mortar Finite Element Problems
A Coarse Space Formulation with good Parallel Properties for an Additive Schwarz Domain Decomposition Algorithm
A note on high precision solutions of two fourth order eigenvalue problems
Computational Science, en tredje vei for forskning
Multilevel Parallel Solution of Large, Sparse Finite Element Equations from Structural Analysis
Mathematics, parallel computing and reservoir simulation.
Parallel implementation of a Schwarz Domain Decomposition Algorithm.
Efficient algorithms for solving a fourth order equation with the spectral-Galerkin method.
Two Different Data-Parallel Implementations of the BLAS.
Data-Parallel BLAS as a basis for LAPACK on Massively Parallel Computers.
Domain Decomposition, Parallel Computing and Petroleum Engineering.
Unstructured Grids on SIMD Torus Machines.
Parallel Substructuring Algorithms in Structural Analysis, Direct and Iterative Methods.
Parallel Domain Decomposition and Iterative Refinement Algorithms.
A New Algorithm for the SLALOM Benchmark.
Domain Decomposition Algorithms of Schwarz type, designed for Massively Parallel Computers.
To Overlap or not to Overlap: A Note on A Domain Decomposition Method for Elliptic Problems.
Efficient Matrix Multiplication on SIMD Computers.
Implementation of a SAR processing algorithm on MasPar MP-1208.
On the Spectra of Sums of Orthogonal Projections with Applications to Parallel Computing.
Large Scale Structural Analysis on Massively Parallel Computers.
Multiplicative and Additive Schwarz' Methods: Convergence in the 2-domain case.
Parallel Domain Decomposition Applied to Coupled Transport Equations.
Parallel Domain Decomposition and Iterative Refinement Algorithms.
Experience with industrial applications on MIMD machines.
Additive Schwarz Methods without Subdomain Overlap and with New Coarse Spaces.
Industrial Computing on MIMD machines.
Domain Decomposition Techniques in Parallelization of the 3-dimensional FRONTSIM code.

A Coarse Space Formulation with good Parallel Properties for an Additive Schwarz Domain Decomposition Algorithm

Petter E. Bj{\o}rstad and Maksymilian Dryja
April 1998.

This paper studies a variant of the Additive Average Schwarz algorithm \cite{Bjorstad:1997:ASM} where the coarse space consists of two subspaces. This approach results in a simpler and more parallel algorithm while we retain the essential convergence properties of the original method. Our theory is confirmed by numerical experiments showing that the new algorithm is often superior to the original variant.

BIBTEX reference:

@ARTICLE{Bjorstad:1998:CSF,
    author="Petter E. Bj{\o}rstad and Maksymilian Dryja ",
    title="A Coarse Space Formulation with good parallel
           properties for an Additive Schwarz Domain Decomposition Algorithm",
    journal="Submitted to Numerische Mathematik",
    year="1999",
    URL="http://www.ii.uib.no/~petter/reports/pbmax_olof98.ps.gz"
    }

A note on high precision solutions of two fourth order eigenvalue problems

Petter E. Bj{\o}rstad and Bj{\o}rn Peter Tj{\o}stheim
May 1998.

We solve the biharmonic eigenvalue problem $\Delta^2u = \lambda u$ and the buckling plate problem ${\Delta}^2u = - {\lambda}\Delta u$ on the unit square using a highly accurate spectral Legendre-Galerkin method. We study the nodal lines for the first eigenfunction near a corner for the two problems. Five sign changes are computed and the results show that the eigenfunction exhibits a self similar pattern as one approaches the corner. The amplitudes of the extremal values and the coordinates of their location as measured from the corner are reduced by constant factors. These results are compared with the known asymptotic expansion of the solution near a corner. This comparison shows that the asymptotic expansion is highly accurate already from the first sign change as we have complete agreement between the numerical and the analytical results. Thus, we have an accurate description of the eigenfunction in the entire domain.

BIBTEX reference:

@ARTICLE{Bjorstad:1998:ANH,
         AUTHOR="Petter E. Bj{\o}rstad and Bj{\o}rn Peter Tj{\o}stheim",
         Title="A note on high precision solutions of two fourth order eigenvalue problems",
         Year="1999",
         Journal="Submitted to Computing",
         URL="http://www.ii.uib.no/~petter/reports/hump.ps.gz"
         }

Computational Science, en tredje vei for forskning

Petter E. Bj{\o}rstad
November 1997.

This article, written for the science student paper at the University of Bergen in Norwegian (sorry!), gives a brief overview of computer evolution and the use of high performance computers to advance scientific research. The entire text is written in nontechnical style for a general audience.

BIBTEX reference:

@ARTICLE{Bjorstad:1997:CST,
         AUTHOR="Petter E. Bj{\o}rstad",
         Title="Computational Science, en tredje vei for forskning",
         Month="November",
         Year="1997",
         Note="In UiB student paper QED "
         Journal="QED",
         URL="http://www.ii.uib.no/~petter/reports/qed97.pdf.gz"
         }

Multilevel Parallel Solution of Large, Sparse Finite Element Equations from Structural Analysis

Jeremy Cook, Petter E. Bj{\o}rstad, Jon Br{\ae}khus
April 1996.

We discuss the use of high performance computing equipment in large, commercial structural analysis programs. In particular, we consider the strategy for modifying a standard industrial code from a pure F77 version to a form suitable for a range of parallel computers. The code is parallelized using the PVM message passing library for communication between processes and the ScaLAPACK and BLACS libraries for parallel linear algebra operations. The parallel code is suitable for a range of parallel computers, however for the purposes of verification and benchmarking, two specific hardware architectures were targeted in this work. These are an 8-node DEC Alpha cluster with 233MHz EV45 processors and FDDI/GIGASwitch interconnect, and a 32-node Parsytec GC/PowerPlus with 64 PowerPC-601 processors and a Transputer interconnect.

BIBTEX reference:

@INPROCEEDINGS{Cook:1996:MPS,
         AUTHOR="Jeremy Cook and Petter Bj{\o}rstad and Jon Br{\ae}khus",
         Title="Multilevel Parallel Solution of Large, Sparse Finite Element
                Equations from Structural Analysis",
         BOOKTITLE="High-Performance Computing and Networking",
         Month="April",
         Year="1996",
         Note="Lecture Notes in Computer Science"
         Volume="1067",
         Pages="404---412",
         Editor = "H. Liddel and A. Colbrook and B Hertzberger and P. Sloot",
         Publisher="Springer",
         URL="http://www.ii.uib.no/~petter/reports/hpcn96.ps.gz"
         }

Mathematics, parallel computing and reservoir simulation

Petter E. Bj{\o}rstad
November 1996

This paper will highlight, by way of examples, a few seemingly very different mathematical problems and show how they have direct relevance to the construction of efficient computational procedures for the simulation of oil reservoirs on parallel computers.

BIBTEX reference:

@INPROCEEDINGS{Bjorstad:1997:MPC,
       AUTHOR="Petter E. Bj{\o}rstad", 
       TITLE=" Mathematics, parallel computing and reservoir simulation",
       BOOKTITLE="Proceedings  of the Second European Congress of Mathematic",
       NOTE="Budapest July 1996",
       YEAR="1997",
       EDITOR="Antal Balog",
       PUBLISHER="Birkhauser Verlag",
       URL="http://www.ii.uib.no/~petter/reports/budapest96.ps.gz"
       }

Parallel implementation of a Schwarz Domain Decomposition Algorithm

Petter E. Bjørstad, Maksimillian Dryja and Eero Vainikko
Oktober 1996

We describe and compare some recent domain decomposition algorithms of Schwarz type with respect to parallel performance. A new, robust domain decomposition algorithm -- Additive Average Schwarz is compared with a classical overlapping Schwarz code. Complexity estimates are given in both two and three dimensions and actual implementations are compared on a Paragon machine as well as on a cluster of modern workstations.

BIBTEX reference:

@INPROCEEDINGS{Bjorstad:1996:PIS,
   AUTHOR="Petter E. Bj{\o}rstad and Maksimilian Dryja and Eero Vainikko",
   TITLE="Parallel implementation of a Schwarz Domain Decomposition Algorithm",
   BOOKTITLE="Applied Parallel Computing in Industrial Problems and Optimization",
   YEAR="1996",
   EDITOR="Jerzy Wasniewski and Jack Dongarra and  Kaj Madsen and Dorte Olesen",
   PAGES="",
   ORGANIZATION="",
   PUBLISHER="Springer",
   ADDRESS="",
   MONTH="December",
   NOTE="Lecture Notes in Computer Science volume 1184",
   URL="http://www.ii.uib.no/~petter/reports/para96.ps.gz"
   }

Efficient algorithms for solving a fourth order equation with the spectral-Galerkin method

Petter E. Bjørstad and Bjørn Peter Tjøstheim:
January 1996.

We show that one can derive an $O(N^3)$ spectral-Galerkin method for fourth order (biharmonic type) elliptic equations based on the use of Chebyshev polynomials. The use of Chebyshev polynomials provides a fast transform between physical and spectral space which is advantageous when a sequence of problems must be solved e.g., as part of a nonlinear iteration. This improves the result of Shen which reported an $O(N^4)$ algorithm inferior to the $O(N^3)$ method developed earlier based on Legendre polynomials, but less practical in the case of multiple problems. We further compare our method with an improved implementation of the Legendre-Galerkin method based on the same approach.

BIBTEX reference:

@ARTICLE{Bjorstad:1996:EAS,
     AUTHOR="Petter E. Bj{\o}rstad, Bj{\o}rn Peter Tj{\o}stheim",
     TITLE="Efficient algorithms for solving a fourth order equation
            with the Spectral-Galerkin method",
     JOURNAL="SIAM J. Sci.  Comp.",
     VOLUME="18",
     NUMBER="2",
     MONTH="March",
     YEAR = "1997",
     URL="http://www.ii.uib.no/~petter/reports/Leg_Cheb.ps.gz"
          }

Two Different Data-Parallel Implementations of the BLAS.

P. Bjørstad and T. Sørevik:
Nov. 92

This paper discusses the implementation and performance of the computational BLAS kernels in a data-parallel setting. Two different programming languages are compared and several compiler issues are discussed.

BIBTEX reference:

@INPROCEEDINGS{Bjorstad:1993:DDP,
     AUTHOR="P. E. Bj{\o}rstad and T. S{\o}revik",
     TITLE="Two Different Data-Parallel Implementations of the {BLAS}",
     BOOKTITLE=" Software for Parallel Computation",
     EDITOR="J. S.  Kowalik",
     YEAR=" 1993"
     URL="http://www.ii.uib.no/~petter/reports/mpp_blas.ps.gz"
          }

Data-Parallel BLAS as a basis for LAPACK on Massively Parallel Computers.

P. Bjørstad and T. Sørevik: 
Nov. 92

We consider a data-parallel implementation of LU-factorization based on the LAPACK routine DGETRF. We analyze the performance of the required BLAS routines and show that high performance is inhibited by current compiler limitations.

BIBTEX reference:

@INPROCEEDINGS{Bjorstad:1993:DPB,
     AUTHOR="P. E. Bj{\o}rstad and T. S{\o}revik",
     TITLE="Data-parallel {BLAS} as a basis for {L}apack on massively
     parallel computers",
     BOOKTITLE="Linear Algebra for Large Scale and Real-Time Applications",
     YEAR="1993",
     EDITOR="M. Moonen and G. Golub and B. De Moor",
     PAGES="13-20",
     PUBLISHER="Kluwer Academic Publishers"
     URL="http://www.ii.uib.no/~petter/reports/mpp_LU.ps.gz"
          }

Domain Decomposition, Parallel Computing and Petroleum Engineering.

Petter E. Bjørstad and Terje Kårstad:
July 94

A prototype black oil simulator is described. The simulator has a domain-based data structure whereby the reservoir is represented by a possibly large number of smaller reservoirs each having a complete local data structure. This design is essential for effective use of preconditioning techniques based on domain decomposition. The chapter describes a splitting technique for the solution of the nonlinear system and an effective implementation of the algorithm on massively parallel computer systems. Most communication is localized and long range communication is kept at a minimum. Results from an implementation of the method are reported for a 16384 processor MasPar MP-2.

Unstructured Grids on SIMD Torus Machines.

Petter E. Bjorstad and Robert Schreiber:
May 94

Unstructured grids lead to unstructured communication on distributed memory parallel computers, a problem that has been considered difficult. Here, we consider adaptive, offline communication routing for a SIMD processor grid. Our approach is empirical. We use large data sets drawn from supercomputing applications instead of an analytic model of communication load. The chief contribution of this paper is an experimental demonstration of the effectiveness of certain routing heuristics. Our routing algorithm is adaptive, nonminimal, and is generally designed to exploit locality. We have a parallel implementation of the router, and we report on its performance.

Parallel Substructuring Algorithms in Structural Analysis, Direct and Iterative Methods.

Petter E. Bjørstad, Jon Brækhus and Anders Hvidsten:
1991

This paper reviews the development that has occurred in order to adopt a large scale structural analysis package, SESAM (SESAM is marketed by Veritas Sesam Systems Inc.), to the rapid changes in computer architecture as well as to the algorithmic advances that has been made in the past few years. We describe a parallel implementation of the sophisticated direct solution algorithm, based on processing substructures in parallel, but also allowing a high degree of parallelism in the computation within each substructure. We further discuss the use of iterative solution strategies. The paper concludes that iterative techniques should be further investigated and that the current knowledge about such methods makes them attractive for certain special classes of problems already today. We provide a few numerical examples in support of our conclusions.

BIBTEX reference:

@INPROCEEDINGS{Bjorstad:1991:PSA,
     AUTHOR="Petter Bj{\o}rstad and Jon Br{\ae}khus and Anders Hvidsten",
     TITLE="Parallel substructuring algorithms in structural analysis,
            direct and iterative methods",
     BOOKTITLE="Fourth International Symposium on Domain Decomposition
                Metods for Partial Differential Equations",
     EDITOR="R. Glowinski and Y. A. Kuznetsov and
             G\'{e}rard Meurant and J. P\'{e}riaux and O. B. Widlund",
     PAGES="321-340",
     PUBLISHER="SIAM",
     ADDRESS="Philadelphia",
     YEAR="1991"
     URL="http://www.ii.uib.no/~petter/reports/moscow_siam.ps.gz"
          }

Parallel Domain Decomposition and Iterative Refinement Algorithms.

     Petter E. Bjørstad, Randi Moe and Morten Skogen:
     Jan. 90

Algorithms for the solution of partial differential equations based on a subdivision of the spatial domain, has received much interest in recent years. To a large extent this has been motivated by the new generation of parallel computers. This algorithmic approach can introduce independent parallel tasks of variable granularity, depending on the subdivision and can therefore be adapted to a wide range of parallel computers. We review some of the progress that has been made and report on numerical experiments that illustrate the convergence behavior. We also describe parallel implementations on both shared memory computers (Alliant FX/8) and on local memory systems (Intel iPSC/2 and network connected workstations).

BIBTEX reference:

@INPROCEEDINGS{Bjorstad:1990:PDD,
     AUTHOR=" Petter E. Bj{\o}rstad and Randi Moe and Morten Skogen",
     TITLE=" Parallel Domain Decomposition and Iterative Refinement
             Algorithms ",
     EDITOR="Wolfgang Hackbusch",
     BOOKTITLE="Parallel Algorithms for PDEs, Proceedings of the
                6th GAMM-Seminar held in Kiel, Germany, January 19--21, 1990"
     PUBLISHER="Vieweg-Verlag",
     ADDRESS="Braunschweig, Wiesbaden",
     YEAR="1990"
     URL="http://www.ii.uib.no/~petter/reports/pbrmms.ps.gz"
          }

A New Algorithm for the SLALOM Benchmark.

       Petter E Bjørstad and Erik Boman:
       Nov. 90

We apply a preconditioned conjugate gradient algorithm to the SLALOM benchmark, proving that the solution phase of the benchmark is reduced to $O(N^2)$. The algorithm therefore performs well over the full range of input parameters specified by the benchmark. This has a dramatic impact on the benchmark for all computers. We illustrate this with new implementations on an 8192 processor MP-1 system and a Cray X/MP.

BIBTEX reference:

@TECHREPORT{Bjorstad:1991:NAS,
     AUTHOR="Petter Bj{\o}rstad and Erik Boman",
     TITLE="A New Algorithm for the SLALOM Benchmark",
     INSTITUTION="Institutt for Informatikk, University of Bergen",
     NUMBER = "55",
     MONTH="May",
     YEAR = "1991",
     NOTE = "Also published in Supercomputing Review"
     URL="http://www.ii.uib.no/~petter/reports/slalom.ps.gz"
          }

Domain Decomposition Algorithms of Schwarz type, designed for Massively Parallel Computers.

        Petter E. Bjørstad and Morten D. Skogen:
        1991

We discuss implementation of additive Schwarz type algorithms on SIMD computers. A recursive, additive algorithm is compared with a two-level scheme. These methods are based on a subdivision of the domain into thousands of micro-patches that can reflect local properties, coupled with a coarser, global discretization where the `macro' behavior is reflected. The two-level method shows very promising flexibility, convergence and performance properties when implemented on a massively parallel SIMD computer.

BIBTEX reference:

@INPROCEEDINGS{Bjorstad:1992:DAS,
     AUTHOR=" Petter E. Bj{\o}rstad and Morten Skogen",
     TITLE="Domain Decomposition Algorithms of {S}chwarz
            Type, Designed for Massively Parallel Computers",
     EDITOR="David E. Keyes and Tony F. Chan and G{\'e}rard A. Meurant
             and Jeffrey S. Scroggs and Robert G. Voigt ",
     BOOKTITLE="Fifth International Symposium on Domain Decomposition
                Methods for Partial Differential Equations",
     PUBLISHER="SIAM",
     YEAR="1992",
     ADDRESS="Philadelphia, PA",
     PAGES="362--375"
     URL="http://www.ii.uib.no/~petter/reports/mgas_siam.ps.gz"
          }

To Overlap or not to Overlap: A Note on A Domain Decomposition Method for Elliptic Problems.

        Petter E Bjørstad and Olof B Widlund:
        Sep. 88

More than one hundred years ago, H.A. Schwarz introduced a domain decomposition method in which the original elliptic equation is solved on overlapping subregions, one after another, in an iterative process. A few years ago, Chan and Resasco, introduced a method that they classified as a domain decomposition method using nonoverlapping subdomains. In this note, it is shown that their method is an accelerated version of the classical method. It is also shown that the error propagation operator of the method can be expressed in terms of Schur complements of certain stiffness matrices and that techniques previously developed for the study of iterative substructuring algorithms can be used to derive estimates on the rate of convergence.

BIBTEX reference:

@ARTICLE{Bjorstad:1989:OON,
     AUTHOR=" Petter E. Bj{\o}rstad and Olof B. Widlund ",
     TITLE=" To Overlap or Not to Overlap: {A} Note on a Domain
             Decomposition Method for Elliptic Problems ",
     JOURNAL=" SIAM J. Sci. Stat. Comput.",
     YEAR="1989",
     VOLUME="10",
     NUMBER="5",
     PAGES= "1053--1061"
     URL="http://www.ii.uib.no/~petter/reports/overlap.ps.gz"
          }

Efficient Matrix Multiplication on SIMD Computers.

       Petter E Bjørstad, Frederik Manne, Tor Sørevik and M. Vajtersic:
       Aug. 91.

We describe efficient algorithms for matrix multiplication on SIMD computers. We consider SIMD implementations of Winograd's algorithm in the case where additions are faster than multiplications, as well as classical kernels and the use of Strassen's algorithm. Actual performance figures using the MasPar family of SIMD computers are presented and discussed.

BIBTEX references:

@ARTICLE{Bjorstad:1992:EMM,
    AUTHOR="P. Bj{\o}rstad and F. Manne and T. S{\o}revik and M. Vajter\v{s}ic",
    TITLE="Efficient Matrix Multiplication on {SIMD} Computers",
    JOURNAL="SIAM J. Matrix Anal. Appl.",
    VOLUME="13",
    NUMBER="1",
    PAGES="386-401",
    YEAR="1992",
    MONTH="January",
    KEY="Matrix multiplication, Winograd's algorithm, Strassen's algorithm,
         SIMD computer, parallel computing"
    URL="http://www.ii.uib.no/~petter/reports/matrix_sigene.ps.gz"
        }

Implementation of a SAR processing algorithm on MasPar MP-1208.

      Petter E. Bjorstad, Jeremy Cook, Hans Munthe-Kaas and Tor Sorevik:
      Oct. 1991

We describe the results of a three-week project to evaluate and implement a SAR algorithm from FFI on a MasPar MP-1208 at para//ab in Bergen. The report shows that a quite complex program written in Fortran, can be rewritten in C and made to run efficiently on a massively parallel SIMD computer with relatively little effort. The performance shows that general purpose SIMD computers are very competitive with specially designed computers for SAR processing.

BIBTEX references:

@TECHREPORT{Bjorstad:1991:ISP,
     AUTHOR=" Petter E. Bj{\o}rstad and Jeremy Cook and Hans Munthe-Kaas
              and Tor S{\o}revik",
     TITLE="Implementation of a {SAR} processing algorithm on {M}as{P}ar
            {MP}-1208",
     INSTITUTION="Institute of Informatics, University of Bergen ",
     ADDRESS="H{\o}yteknologisenteret,N-5020 Bergen, Norway",
     YEAR="1991",
     NUMBER="57",
     MONTH="October"
     URL="http://www.ii.uib.no/~petter/reports/sar.ps.gz"
          }

On the Spectra of Sums of Orthogonal Projections with Applications to Parallel Computing.

      Petter E. Bjørstad and Jan Mandel:
      Dec. 1989

Many parallel iterative algorithms for solving symmetric, positive definite problems proceed by solving in each iteration, a number of independent systems on subspaces. The convergence of such methods is determined by the spectrum of the sums of orthogonal projections on those subspaces, while the convergence of a related sequential method is determined by the spectrum of the product of complementary projections. We study spectral properties of sums of orthogonal projections and in the case of two projections, characterize the spectrum of the sum completely in terms of the spectrum of the product.

BIBTEX references:

@ARTICLE{Bjorstad:1991:SSO,
     AUTHOR=" Petter E. Bj{\o}rstad and Jan Mandel",
     TITLE=" On the Spectra of Sums of Orthogonal Projections
             with Applications to Parallel Computing ",
     JOURNAL="BIT",
     PAGES="76--88",
     VOLUME="31",
     YEAR="1991"
     URL="http://www.ii.uib.no/~petter/reports/projections.ps.gz"
          }

Large Scale Structural Analysis on Massively Parallel Computers.

      Petter E Bjørstad and Jeremy Cook:
      Nov. 92

We discuss the use of a massively parallel computer in large scale structural analysis computations. In particular, we consider the necessary modifications to a standard industrial code and a strategy where the code may be allowed to evolve gradually from a pure F77 version to a form more suitable for a range of parallel computers. We present preliminary computational results from a DECmpp computer, showing that we can achieve good price performance when compared to traditional vector supercomputers.

BIBTEX references:

@INPROCEEDINGS{Bjorstad:1993:LSS,
     AUTHOR="Petter Bj{\o}rstad and Jeremy Cook",
     TITLE="Large Scale Structural Analysis on Massively Parallel Computers",
     BOOKTITLE="Linear Algebra for Large Scale and Real-Time Applications",
     YEAR="1993",
     EDITOR="M. Moonen and G. Golub and B. De Moor",
     PAGES="3-11",
     PUBLISHER="Kluwer Academic Publishers",
     NOTE="NATO ASI Series"
     URL="http://www.ii.uib.no/~petter/reports/nato_sestra.ps.gz"
         }

Multiplicative and Additive Schwarz' Methods: Convergence in the 2-domain case.

      Petter E. Bjørstad:
      Jan. 88

We consider the classical Schwarz alternating algorithm and an additive version more suitable for parallel processing. The two methods are compared and analyzed in the case of two domains. We show that the rate of convergence for both methods, can be directly related to a generalized eigenvalue problem, derived from subdomain contributions to the global stiffness matrix. Analytical expressions are given for a model case.

BIBTEX references:

@INPROCEEDINGS{Bjorstad:1989:MAS,
     AUTHOR=" Petter E. Bj{\o}rstad",
     TITLE="{M}ultiplicative and {A}dditive {S}chwarz {M}ethods:
            {C}onvergence in the 2 domain case",
     BOOKTITLE="Domain Decomposition Methods ",
     YEAR="1989",
     EDITOR="Tony Chan and Roland Glowinski
             and Jacques P{\'e}riaux and Olof Widlund",
     PUBLISHER="SIAM",
     ADDRESS="Philadelphia, PA",
     PAGES="147--159"
     URL="http://www.ii.uib.no/~petter/reports/two_domain.ps.gz"
           }

Parallel Domain Decomposition Applied to Coupled Transport Equations.

      Petter E. Bjørstad, W. M. Coughran, Jr. and Eric Grosse:
      May 94

Modeling semiconductor devices is an important technological problem. The traditional approach solves coupled advection-diffusion carrier-transport equations in two and three spatial dimensions on either high-end scientific workstations or traditional vector supercomputers. The equations need specialized discretizations as well as nonlinear and linear iterative methods. We will describe some of these techniques and our preliminary experience with coarse-grained domain decomposition techniques applied on a collection of high-performance workstations connected at a 100Mb/s shared network.

BIBTEX references:

@INPROCEEDINGS{Bjorstad:1994:PDA,
     AUTHOR="Petter E. Bj{\o}rstad and Coughran, Jr., W. M. and Eric Grosse",
     TITLE="Parallel Domain Decomposition Applied to
            Coupled Transport Equations",
     BOOKTITLE="Seventh International Conference of Domain Decomposition
             Methods in Scientific and Engineering Computing",
        EDITOR="David E. Keyes and Jinchao Xu",
        YEAR="1994",
        PUBLISHER="AMS",
        SERIES="Contemporary Mathematics",
        VOLUME="180",
        PAGES="43--47",
        NOTE="Held at Penn State University,
              October 27-30, 1993.",
     URL="http://www.ii.uib.no/~petter/reports/semi_conductor.ps.gz"
          }

Parallel Domain Decomposition and Iterative Refinement Algorithms.

      Petter E. Bjørstad:
      Apr. 89

Algorithms for the solution of partial differential equations based on a subdivision of the spatial domain, has received much interest in recent years. To a large extent this has been motivated by the new generation of parallel computers. This algorithmic approach can introduce independent parallel tasks of variable granularity, depending on the subdivision and can therefore be adapted to a wide range of parallel computers. We review some of the progress that has been made and supply a few new numerical examples that illustrate the convergence behavior.

BIBTEX references:

@INPROCEEDINGS{Bjorstad:1989:PDI,
      AUTHOR= "Petter E. Bj{\o}rstad",
      TITLE=  "Parallel Domain Decomposition and
               Iterative Refinement Algorithms",
      YEAR="1989",
      URL="http://www.ii.uib.no/~petter/reports/tatra.ps.gz"
           }

Experience with industrial applications on MIMD machines.

      Petter E. Bjørstad
      1995

The laboratory for parallel computing - Parallab actively pursues the porting of significant industrial codes to parallel computing platforms. This work started already in 1985 with the acquisition of Europe's first 64 processor Intel hypercube. As of today, the laboratory has two MIMD machines, an Intel Paragon and a Parsytec GC/Power-Plus. As part of the Europort effort, four industrial codes are currently being ported by Parallab scientists. The paper will describe this effort and the issues, trends and experiences learned from working with industrial codes using MIMD type parallel computers.

BIBTEX references:

@INPROCEEDINGS{Bjorstad:1995:EIA,
     AUTHOR="Petter E. Bj{\o}rstad",
     TITLE="Experience with industrial applications
            on MIMD machines",
     PUBLISHER="K.G. Saur Verlag",
     NOTE="Proceedings from Supercomputer '95, June 22-24, Mannheim",
     URL="http://www.ii.uib.no/~petter/reports/mannheim.ps.gz"
          }

Additive Schwarz Methods without Subdomain Overlap and with New Coarse Spaces

       Petter E. Bjørstad, Maksymilian Dryja and Eero Vainikko
       Oct. 1995

We develop two additive Schwarz methods with new coarse spaces. The methods are designed for elliptic problems in 2 and 3 dimensions with discontinuous coefficients. The methods use no explicit overlap of the subdomains, subdomain interaction is via the coarse space. The first method has a rate of convergence proportional to (H/h)^(1/2) when combined with a suitable Krylov space iteration. This rate is independent of discontinuities in the coefficients of the equation. The method has good parallelization properties and do not require a coarse grid triangulation, that is, one is free to use arbitrary, irregular subdomains. The second method uses a diagonal scaling in addition to the standard coarse space. This method is not as robust and flexible as the first method.

BIBTEX references:

@INPROCEEDINGS{Bjorstad:1996:ASM,
     AUTHOR="Petter Bj{\o}rstad and Maksymilian Dryja and Eero Vainikko",
     BOOKTITLE="Domain Decomposition Methods in Sciences and Engineering",
     TITLE="Additive Schwarz Methods without Subdomain
            Overlap and with New Coarse Spaces",
     EDITOR="R. Glowinski and J. P\'{e}riaux and Z. Shi and O. B. Widlund",
     PAGES="xx-yy",
     PUBLISHER="John Wiley \& Sons",
     NOTE="Proceedings from the
           Eight International Conference on Domain Decomposition
           Metods, May 1995, Beijing",
     YEAR="1996",
     URL="http://www.ii.uib.no/~petter/reports/dd8_jump_pbmdev.ps.gz"
          }

Industrial Computing on MIMD machines

       Petter E. Bjørstad
       Oct. 1995

The laboratory for parallel computing - Parallab actively pursues the porting of significant industrial codes to parallel computing platforms. This work started already in 1985 with the acquisition of Europe's first 64 processor Intel hypercube. As of today, the laboratory has three MIMD machines, an Intel Paragon, a Parsytec GC/Power-Plus and a DEC $\alpha$-cluster. As part of the Europort effort, four industrial codes are currently being ported by Parallab scientists. The paper will describe this effort and the issues, trends and experiences learned from working with industrial codes using MIMD type parallel computers.

BIBTEX references:

@INPROCEEDINGS{Bjorstad:1995:ICM,
     AUTHOR="Petter E. Bj{\o}rstad",
     TITLE="Industrial Computing on {M}{I}{M}{D} machines",
     BOOKTITLE="NIK",
     EDITOR="Magne Haveraaen",
     PAGES="xx--yy",
     PUBLISHER="NIK",
     NOTE="Proceedings from NIK'95, Nov 20--22, Gran, Norway",
     URL="http://www.ii.uib.no/~petter/reports/NIK.ps.gz",
     YEAR="1995"
          }

Domain Decomposition Techniques in Parallelization of the 3-dimensional FRONTSIM code

       Petter E. Bjørstad, Randi Moe, Rudi Olufsen, Eero Vainikko 
       May 1995

We have studied the parallelization methods for the pressure equation in the reservoir simulator FRONTSIM\footnotemark[3]\ for 3-dimensional reservoirs. In order to obtain a faster simulator and finer resolution, we have studied the use of domain decomposition methods combined with the use of parallel processing. Both the additive Schwarz method and the multiplicative Schwarz method are implemented together with coarse-grid solver techniques. The purpose of the investigation is to find the best strategies for different problem sizes and different approximate solvers. The code is parallelized using the PVM message passing library for communication and has been tested on a cluster of workstations. This effort is part of the EC-ESPRIT III / EUROPORT2 project. Some implementation results and directions for further developments are given.

BIBTEX references:

@INPROCEEDINGS{Bjorstad:1995:PPA,
     AUTHOR="Petter E. Bj{\o}rstad and Randi Moe and Rudi Olufsen
             and Eero Vainikko",
     BOOKTITLE="Parallel Programming and Applications: Proceedings
                of the Workshop on Parallel Programming and
                Computation (ZEUS '95) and the 4th Nordic
                Transputer Conference (NTUG '95)",
     TITLE="Domain Decomposition Techniques in Parallelization of
            the 3-dimensional FRONTSIM code",
            Overlap and New Coarse Spaces",
     EDITOR="Peter Fritzon and Leif Finmo",
     PAGES="97-110",
     PUBLISHER="IOS Press",
     ORGANIZATION="Link\"{o}ping University, Department of Computer
                   and Information Science",
     YEAR="1995",
     URL="http://www.ii.uib.no/~petter/reports/frontsim.ps.gz"
          }

Additive Schwarz Methods for Elliptic Mortar Finite Element Problems

Petter E. Bj{\o}rstad, Maksymilian Dryja and Talal Rahman
May 2002.

Two variants of the additive Schwarz method for solving linear systems arising from the mortar finite element discretization on nonmatching meshes of second order ellip\-tic problems with discontinuous coefficients are designed and analyzed. The methods are defined on subdomains without overlap, and they use special coarse spaces, resulting in algorithms that are well suited for parallel computation. The condition number estimate for the preconditioned system in each method is proportional to the ratio $H/h$, where $H$ and $h$ are the mesh sizes, and it is independent of discontinuous jumps of the coefficients. For one of the methods presented the choice of the mortar (nonmortar) side is independent of the coefficients.

BIBTEX reference:

@ARTICLE{Bjorstad:2002:ASM,
         AUTHOR="Petter E. Bj{\o}rstad, Maksymilian Dryja and Talal Rahman",
         Title="Additive Schwarz Methods for Elliptic Mortar Finite Element Problems",
         Year="2002",
         Journal="Submitted to Numerische Mathematik",
         URL="http://www.ii.uib.no/~petter/reports/pbmdtr2002AddSchMort.ps.gz"
         }

My list of technical reports and preprints.

Table of Contents: