The SAGA project has explored the use of high-level abstractions and algebraic software methodologies for scientific computing.
As part of SAGA, the Sophus C++ library has been built with abstractions suitable for PDE mathematics. The high level of modularity in the Sophus code makes it easy to factor out code suitable for parallelization. Parallel versions of Sophus applications have been built simply by replacing one or two modules dealing with the underlying mesh data structures.
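The module-replacement idea can be illustrated with a minimal sketch. It is not taken from the Sophus code base: the names `SequentialMesh`, `ParallelMesh`, `update` and `scale_field` are hypothetical, and OpenMP is assumed here only as a convenient stand-in for whatever parallel back end a real mesh module would use. The application code is written once against the mesh interface, and only the mesh type changes between the sequential and the parallel version.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sequential mesh module: stores the data points of a field
// and applies a pointwise update one element at a time.
struct SequentialMesh {
    std::vector<double> data;

    template <typename F>
    void update(F f) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = f(data[i]);
    }
};

// Hypothetical parallel mesh module with the same interface. The loop is
// annotated for OpenMP; when compiled without OpenMP support the pragma is
// ignored and the loop simply runs sequentially.
struct ParallelMesh {
    std::vector<double> data;

    template <typename F>
    void update(F f) {
        const std::ptrdiff_t n = static_cast<std::ptrdiff_t>(data.size());
        #pragma omp parallel for
        for (std::ptrdiff_t i = 0; i < n; ++i) data[i] = f(data[i]);
    }
};

// Application-level code, written once against the mesh interface.
// It does not know whether the underlying mesh is sequential or parallel.
template <typename Mesh>
void scale_field(Mesh& mesh, double factor) {
    mesh.update([factor](double x) { return factor * x; });
}
```

Calling `scale_field` on a `SequentialMesh` or on a `ParallelMesh` gives the same result; switching between the two is a change of one type name in the application.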
While previous work on the Sophus infrastructure has focused on large supercomputers, we are now moving to simpler, off-the-shelf hardware, such as the AMD64 architecture, the Cell processor in the PlayStation 3, and programmable graphics processors, such as NVIDIA GPUs with the CUDA infrastructure.
A preliminary attempt at porting Sophus to the Cell processor has been successful, with the SeisMod application being ported in around 20 hours (including time to learn Cell programming), and with a very simple parallel version running 2-3 times faster than the sequential version.
Also under the wings of SAGA, the Grid-DDA project investigates the possibility of arbitrary depth, nested parallel programming concepts based on multi-level Data Dependency Algebras (DDAs), from microprocessors to e-grids.
Previous research on DDAs, in the framework of the Sapphire project, provided a theory on how to program parallel machines, where the explicit utilization of the parallel computer's internal network topology is fully programmable as an independent aspect of the computation.
As such, the run-time parallel distribution and global communication pattern of a hardware layout, whether a parallel computer, a highly parallel graphics processing unit (GPU), a many-core CPU or a chip, can be defined by a separate data type, a space-time DDA.
The data dependency structure of a computation is also defined by a DDA, and the algorithm for solving a problem is given by a recursive function on this DDA.
The embedding of a computation into the underlying hardware then becomes a task of finding an efficient mapping of the DDA of the computation into the space-time DDA of the hardware layout. This allows full control of the computation and explicit handling of the underlying hardware resource at a very high abstraction level.
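As a rough illustration of these ideas, the sketch below models a very small computation DDA, a binary reduction tree, in C++. It is a simplification made for this page, not the formal DDA definition or code produced by our tools: the types `Point` and `SpaceTime` and the functions `requests`, `value`, `embed` and `embedding_is_causal` are all hypothetical names. The dependency structure is given by `requests`, the algorithm is a recursive function over it, and `embed` is one possible mapping of computation points to (time, processor) pairs of an assumed hardware layout.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical point in the computation DDA: a node of a binary reduction tree.
struct Point {
    unsigned level;     // 0 = leaves holding the input data
    std::size_t index;  // position within the level
};

// Hypothetical point in the space-time DDA of the hardware layout.
struct SpaceTime {
    unsigned time;          // time step at which the point is computed
    std::size_t processor;  // processor on which it is computed
};

// Data dependencies: the points from which a given point requests its inputs.
std::vector<Point> requests(const Point& p) {
    if (p.level == 0) return {};
    return {{p.level - 1, 2 * p.index}, {p.level - 1, 2 * p.index + 1}};
}

// The algorithm as a recursive function on the dependency structure:
// a leaf returns its input value, an inner point sums what it depends on.
double value(const Point& p, const std::vector<double>& input) {
    if (p.level == 0) return input[p.index];
    double sum = 0.0;
    for (const Point& q : requests(p)) sum += value(q, input);
    return sum;
}

// One possible embedding: level l is computed at time step l, and the point
// with index i on that level is placed on processor i * 2^l.
SpaceTime embed(const Point& p) {
    return {p.level, p.index << p.level};
}

// Checks that the embedding respects the dependencies below p: every point
// must be scheduled strictly later than the points it depends on.
bool embedding_is_causal(const Point& p) {
    const SpaceTime here = embed(p);
    for (const Point& q : requests(p)) {
        if (embed(q).time >= here.time) return false;
        if (!embedding_is_causal(q)) return false;
    }
    return true;
}
```

Trying a different embedding then amounts to replacing `embed` while leaving `requests` and `value` untouched, which is the separation of concerns described above.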
A prototype compiler was built to provide a simple way to generate parallel code from high-level DDA descriptions for high-performance computing architectures using the MPI message-passing library. We are now planning to enhance this compiler to generate parallel code for other architectures as well, e.g., for NVIDIA's CUDA and the Cell. This also promises an easy way to test the efficiency of different embeddings: since they can be reformulated at a high level, Sapphire can generate new parallel code for each one.
This semester, within a regular departmental course (inf329), we are taking a closer look at different Programming Models for Non-Traditional Architectures. A reading list is now available.
- Anya Helene Bagge, PhD Student (Programming Theory)
- Magne Haveraaen, Professor (Programming Theory)
- Daniele Mary de Jesus, Master Student (Algorithms)
- Fredrik Manne, Professor (Algorithms)
- Mary Sheeran, Professor (Chalmers University of Technology, Göteborg)
- Eva Burrows, PhD Student (Programming Theory)
- Workshop on Multi-Core Technology at NIK 2007
- Learning to Program the Cell/BE on the PlayStation 3: Anya, Daniele and Eva all attended the Cell Programming Seminar at NTNU, where IBM's Duc Vianney gave a two-day introduction to the Cell architecture and how to program it. Many thanks to NTNU and IBM for organizing the seminar, and to IBM Norway for taking all the attendees out to a very good dinner. The seminar was fast-paced and covered a lot of information (everything from a basic overview and release road maps to specifics on assembly instructions and profiling tools), so we all felt a need to look closer at the things that are most relevant for us. We therefore organized a seminar series on Cell programming in the autumn semester of 2007. Our department has several PS3 units available (distributed among the various groups), and we have also installed IBM's Cell SDK on some of our computers so that we can run programs in simulated mode.