# SAGA - Scientific Computing with Algebraic and Generative Abstractions

The SAGA project is an activity at Bergen Language Design Laboratory (BLDL). This project demonstrates the benefits of using algebraic software methodologies as a breakthrough technology in the area of computational modelling. Available positions related to this project are announced on the BLDL web pages.

## General Background

Computational modelling on high performance computers (HPC) is becoming a valuable aid for industry. For the potential benefit consider such areas as oil exploration where the cost of drilling a well may be 25 million ECU. Here seismic modelling provides background evidence needed to decide where to drill. More modest examples, but still important to European industry, are the processes of injection moulding, e.g. of plastics or foods, and coating, e.g., of wires and surfaces. The former problems are characterised by the need for expensive and time consuming testing where many injection schemes must be tried before an effective one may be chosen. The latter problems are characterised by the need for fine tuning of the production process itself. In both these cases computational modelling can be used, either to find the optimal mould design before they are built, or to find the optimal setting of the production parameters for coating, thereby reducing waste and improving product quality.

At the heart of computational modelling lies the development of mathematical equations, typically partial differential equations (PDEs) for the actual models that describe the problem and its solution. These must be turned into efficient computer programs that will solve the specific problem at hand. Many of these equations, such as the Navier-Stokes equation defining the flow of fluids for Computational Fluid Dynamics (CFD) problems, are so complex and general that they must be simplified and adapted to the specific problem to be solved. The development of a computational model in the form of a program typically proceeds through the following stages:

1. mathematical modelling: an iterative process that requires the development of prototype implementations of the model to test the assumptions made,
2. programming of the final model as an efficient HPC program,
3. mesh generation: an iterative process that, via a sequence of trial executions, finds an optimal representation of the data for the computation, and
4. execution of program on data sets to solve the actual industrial problem.
Going from a mathematical model to an algorithm is an open-ended process with many options and possibilities. Currently only a few of these options are investigated, and even then only with great effort, requiring time and highly skilled researchers. Working with such code adds to the development time and cost, especially in the prototype development phase and when rewriting the prototype program for an HPC environment. It is virtually impossible to test an algorithm on one computer architecture and then easily move to an HPC platform where actual production runs are to be performed. Worse yet, as new architectures emerge, higher execution speeds can be achieved at a fraction of the previous computer costs, but moving from one HPC architecture to another is very costly. The effect is that computational modelling currently is worthwhile in high profile cases, but not for run of the mill problems. This also precludes small and medium sized enterprises (SMEs) from benefitting from computer modelling using HPC.

What is needed are breakthrough technologies that shorten the mathematical modelling cycle times, and dramatically reduce the cost of porting the prototype implementation onto the appropriate HPC architectures. The aim of the SAGA project is to demonstrate and quantify the benefits of using algebraic programming techniques as a breakthrough technology in the area of computational modelling.

## Algebraic Software Methodologies for PDEs

Algebraic software methodologies are a result of the last 20-30 years investigation into abstract data types and algebraic development techniques. They include concepts like algebra, many-sorted universal algebra and category theory - concepts which originally were developed from the abstract characterisation of rings, fields and free vector spaces, the fundamental concepts in the theory of PDEs. The algebraic concepts also abstract modern program structuring mechanisms like classes and generic (or template) modules of object-oriented programming languages such as C++, Generic Java and Fortran-2000.

Using the abstract theory of PDEs to structure PDE solving software seems a natural way of utilising this correspondence (see Hans Munthe-Kaas and Magne Haveraaen: "Closing the gap between pure and applied mathematics?", ZAMM vol. 76, supplement 1, 1996, pp. 487-488). Experience from the SAGA-oriented projects indicate that

• using algebraic techniques for software structuring purposes gives more mature software structures which are less prone to redesigns than conventionally developed systems, providing a general framework for solvers of most kinds of PDEs, with
• a high degree of reuse even within the library itself, so that advanced concepts are defined by combinations of basic modules, which provides the ability that
• a numerical solution strategy may be replaced with another just by substituting a few modules and without changing the system structure,
• prototype code development times are reduced since central numerical modules are reused rather than readapted, and
• only a few minor modules need to be replaced when porting to and between HPC architectures.
This shows that the effort needed to move from mathematical models to prototype software, and from prototype software to efficient HPC code can be significantly reduced. These are critical parts of the development cycles, and parts where conventional techniques have obvious shortfalls.

## Industrial Case Studies

Two industrial areas have been chosen for the practical tests of the SAGA methodologies: seismic modelling and computational fluid dynamics. These represent a varied challenge for numerical software.

Seismic exploration is based on recording and analysing seismic waves that have passed through a geological medium, e.g., a hydrocarbon reservoir. Seismic modelling is the simulation of such waves. The scales of such models range from several kilometer long profiles to a few decimeters around a borehole. The governing equation for this is the elastic wave equation, which is relevant for most major problems related to seismics. This is a hyperbolic PDE, normally solved using a finite difference scheme. The seismic case study has been developed into a full industrial product, SeisMod.

Computational fluid dynamics (CFD) is the study of fluid flow using computers and computational models. This encompasses both high-speed flows of gases (Reynolds numbers larger than 100), such as those around aeroplanes, and slow flows of highly viscous liquids (Reynolds numbers less than 1), such as that of syrup and glass. CFD problems occur in everything from the mixing of dough for breads and biscuits (where the quality of the mixing of the initial ingredients can have a significant effect on the quality of the final product), to co-injection of multiple components of foods or polymers and coating of wires and optical fibres. CFD problems are governed by the Navier-Stokes PDE, which has both elliptic and hyperbolic properties. Typically they may be solved using finite element methods. As a case study the coating of wires with polymers was chosen. This is an example of viscous incompressible non-isothermal and non-Newtonian flows of specific relevance to a wide range of the processing industry. A parallel version of a reference code developed using traditional means was adapted, and the detailed design of a code using the SAGA methodologies was developed. This allowed a direct comparison of the different approaches to computational modelling, see Experience using SAGA methodologies.

## SAGA Technologies

The study of algebraic software methodologies for PDEs is a vast area. Through the SAGA initiative we have so far developed prototypes of a selected set of tools.
• Sapphire: For the quick prototyping of mathematical models we have developed an algebraic programming language and a compiler that translates recursive functions into non-recursive, imperative code. This allows us to code the recursive equations of the mathematical formulation of a solver directly as recursive functions and compile this for both sequential and parallel HPC computers. The compiled code has the same execution time complexity as conventional imperative code:
• a recursively written solver for the heat equation using finite difference has linear complexity in the number of grid nodes and the number of time steps,
• a recursively written fast Fourier transform (FFT) has n logn complexity for input data size n, and
• a recursive formulation of a dynamic programming problem, such as the stagecoach problem, has linear complexity in the number of connecting arcs of the corresponding graph.
This is achieved by letting the data dependencies of the problem be an explicit structure in the code, thus giving the programmer full control over how recursive calls are reused within the computation.
• Sophus: This is a software library written in C++ and carefully designed to mimic the abstract structure of the PDE mathematics. By requiring strict adherence to specified interfaces, we have been able to achieve that different implementations of the same mathematical concepts are basically interchangeable. Sophus library components can be presented as different layers of abstractions.
• Application layer: solvers for PDEs such as the seismic simulator and the coating problem.
• Tensor layer: handles coordinate systems, matrices and vectors and general differentiation operators.
• Scalar field layer: numerical discretisations such as finite differences and finite elements with partial derivatives.
• Mesh layer: implements grids for sequential and parallel HPC machines.
The interchangeability of modules within a layer allows software developers to experiment with different solution strategies and HPC architectures without the need for extensive reprogramming. For example, the change from a sequential to a parallel version of a seismic simulator does not involve any reprogramming of the rest of the solver application.
• CodeBoost: This is a software transformation system being developed to address the gap between well formed code (from a software engineering point of view) and efficient code (from a run-time point of view).

For example, code written using expressions and functions seems to be easier to understand and maintain than code written as long sequence of statements. However, the latter often seems to generate better code with current compilers. CodeBoost provides a transformation, mutification, that transforms code from the expression form to sequence form. Thus it becomes possible to write well formed code without sacrificing efficiency.

A first version of CodeBoost was written in ASF+SDF at CWI. This version only supported simple transformations like mutification, but showed the viability of the concept. The current version of CodeBoost is developed in Stratego at the University of Bergen. Stratego is a strategic rewrite language being developed at Utrecht University. The second version of CodeBoost is more and more becoming a generative programming support tool, sponsoring things such as user-defined rules. User-defined rules allows a programmer to embed domain-specific transformation rules in the program code they apply to.

More information on CodeBoost, and a full distribution of the system, is available at http://www.codeboost.org/.

## Current project: SAGA-GEO

The current activity is focused on solving geophysical problems and comparing the SAGA solvers with traditional software for the same problems. This means we need to
• reformulate the chosen problems into coordinate-free form, including newer methods for handling difficult boundary conditions,
• adapt and build solvers for these problems in the Sophus library,
• provide means for checking that the traditional and the SAGA solvers deliver similar answers, and
• improve the efficincy of the SAGA solvers to meet those of the traditional solvers, a software transformation task utilising domain specific optimisation possibilities and parallelisation.
The software will be developed in cooperation with geophysical experts.

### Some publications

```The SAGA project
Magne Haveraaen
Department of Informatics
University of Bergen
Høyteknologisenteret
N-5020 BERGEN
Norway
```
E-mail saga-inquire@ii.uib.no.

Contact person: Magne Haveraaen, phone +47 55 58 41 54
Phone front office: +47 55 58 42 00
Fax: +47 55 58 41 99

There is also a SAGA interest group e-mail distribution list.

### Partners on the SAGA project

The SAGA project and related subprojects have received support from

These pages last updated 2011-02-20 © 1999-2010 Magne Haveraaen, University of Bergen, Norway.