h048z: Preliminary performance results on a Hitachi SR8000
Overview:
To mitigate the effect of the growing gap between CPU
speed and main memory performance, current computer
architectures use hierarchical memory designs
comprising one or more levels of cache memory. This is
especially true for the Hitachi SR8000-F1, which is
based on a Power3-like processor with a 128 kByte
on-chip L1 cache.
The general goal of our project h048z is to
investigate how numerical algorithms can be designed
and/or restructured so that they respect the
hierarchical memory structure of the underlying
architecture. In particular, we focus on algorithms
that operate on regular meshes.
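As an illustration of what such a restructuring can look like, the following minimal sketch in C shows a plain lexicographic Gauss-Seidel sweep on a regular 2D grid next to a loop-blocked (tiled) traversal of the same sweep. It is not taken from the project code: the model problem (5-point stencil for -Laplace(u) = f), the function names, the row-major storage and the tile size parameter bs are all assumptions chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of grid point (i, j) in a row-major n x n array. */
static size_t idx(int i, int j, int n)
{
    return (size_t)i * (size_t)n + (size_t)j;
}

/* Gauss-Seidel update of one interior point for the 5-point stencil
   of -Laplace(u) = f with mesh width h (assumed model problem). */
static void relax_point(double *u, const double *f, int n, double h,
                        int i, int j)
{
    u[idx(i, j, n)] = 0.25 * (u[idx(i - 1, j, n)] + u[idx(i + 1, j, n)]
                            + u[idx(i, j - 1, n)] + u[idx(i, j + 1, n)]
                            + h * h * f[idx(i, j, n)]);
}

static int min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Reference: one lexicographic sweep over all interior points.
   Boundary values stored in u are left untouched. */
void gs_sweep(double *u, const double *f, int n, double h)
{
    for (int i = 1; i < n - 1; ++i)
        for (int j = 1; j < n - 1; ++j)
            relax_point(u, f, n, h, i, j);
}

/* Same sweep, traversed tile by tile with tiles of at most bs x bs points.
   Tiles are visited in lexicographic order and points inside a tile are
   visited in lexicographic order as well. */
void gs_sweep_blocked(double *u, const double *f, int n, double h, int bs)
{
    assert(bs >= 1);
    for (int ib = 1; ib < n - 1; ib += bs) {
        int iend = min_int(ib + bs, n - 1);
        for (int jb = 1; jb < n - 1; jb += bs) {
            int jend = min_int(jb + bs, n - 1);
            for (int i = ib; i < iend; ++i)
                for (int j = jb; j < jend; ++j)
                    relax_point(u, f, n, h, i, j);
        }
    }
}
```

In this sketch the tiled traversal still updates the upper and left neighbours of a point before the point itself and the lower and right neighbours after it, so a single blocked sweep should produce the same values as the reference sweep. Whether tiling pays off, and which bs is appropriate, depends on the grid size and the cache configuration of the target machine and has to be measured.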
Our research on code performance optimization for the
Hitachi architecture yields techniques that are
immediately applied by the KONWIHR projects gridlib
and ParEXPDE.
Summary of results:
- Performance results for blocked Gauss-Seidel
code on regular grids
- Performance results for hierarchical hybrid
grids
- Porting the DiMEPACK library
A detailed description of the results in
PostScript format can be found here.
For more information on the DiME project, you may
visit our project homepage.