Data Local Iterative Methods For The Efficient Solution of Partial Differential Equations

A cooperation between LSS and LRR, funded by the DFG.

rb8.F Program Description

File Name:

rb8.F

Description:

rb8.F has the same data access behaviour as rb6.F. Since rb6.F suffers from register dependencies, the innermost loops were unrolled to enable better instruction scheduling. To help the compiler further, the update statements were also scheduled by hand.
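
The effect of unrolling and hand scheduling can be illustrated with a small sketch. This is not the code of rb8.F (which is Fortran and is not reproduced on this page). It is a C illustration under stated assumptions: a red-black Gauss-Seidel relaxation with a 5-point stencil on an n x n grid stored row by row, with the mesh-width factor already folded into rhs. All function names are hypothetical. The simple kernel computes each update as one long chain of dependent additions. The unrolled kernel handles two points of the same colour per iteration and interleaves their operations, so that consecutive instructions do not depend on each other.

```c
#include <stddef.h>

/* Baseline (assumed form): relax every second point of interior row i,
 * starting at column jstart (1 or 2, depending on the colour).
 * Each update is a single chain of dependent additions. */
void relax_row_simple(double *u, const double *rhs,
                      size_t n, size_t i, size_t jstart)
{
    for (size_t j = jstart; j + 1 < n; j += 2) {
        size_t c = i * n + j;
        u[c] = 0.25 * (u[c - n] + u[c + n] + u[c - 1] + u[c + 1] + rhs[c]);
    }
}

/* Unrolled by two: the points at columns j and j+2 have the same colour
 * and do not read each other, so their partial sums can be interleaved
 * by hand to keep independent operations next to each other. */
void relax_row_unrolled(double *u, const double *rhs,
                        size_t n, size_t i, size_t jstart)
{
    size_t j = jstart;
    for (; j + 3 < n; j += 4) {
        size_t c0 = i * n + j;   /* first point of the pair  */
        size_t c1 = c0 + 2;      /* second point of the pair */
        double a0 = u[c0 - n] + u[c0 + n];   /* vertical neighbours   */
        double a1 = u[c1 - n] + u[c1 + n];
        double b0 = u[c0 - 1] + u[c0 + 1];   /* horizontal neighbours */
        double b1 = u[c1 - 1] + u[c1 + 1];
        a0 += rhs[c0];
        a1 += rhs[c1];
        u[c0] = 0.25 * (a0 + b0);
        u[c1] = 0.25 * (a1 + b1);
    }
    /* Remainder: at most one point of this colour is left in the row. */
    for (; j + 1 < n; j += 2) {
        size_t c = i * n + j;
        u[c] = 0.25 * (u[c - n] + u[c + n] + u[c - 1] + u[c + 1] + rhs[c]);
    }
}

typedef void (*row_kernel)(double *, const double *, size_t, size_t, size_t);

/* One full sweep: first all points of one colour, then the other. */
void rb_sweep(double *u, const double *rhs, size_t n, row_kernel relax_row)
{
    for (size_t colour = 0; colour < 2; ++colour)
        for (size_t i = 1; i + 1 < n; ++i)
            relax_row(u, rhs, n, i, 1 + (i + colour) % 2);
}
```

Both kernels touch the same grid points in the same row order, which matches the statement that the data access behaviour is unchanged. The unrolled kernel adds the terms in a different order, so its results may differ from the simple kernel in the last bits of floating-point rounding.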

Comment:

The cache behaviour should be equivalent to that of rb6.F.

Results:

Memory access behaviour

                  % of all accesses which go into
Size  MBytes/sec       ±  1. Level  2. Level  3. Level  Memory

2 sweeps performed together
  16      1535.4    21.4      78.6       0.1       0.0     0.0
  32      1626.8    22.2      76.7       1.1       0.0     0.0
  64      1588.8    22.0      66.4      11.4       0.3     0.0
 128      1333.1    21.9      53.1      22.3       2.7     0.0
 256      1185.2    21.9      27.6      47.7       2.8     0.0
 512       713.6    21.5      27.7      47.4       1.6     1.8
1024       625.2    21.3      25.2      47.6       4.1     1.8
2048       405.0    20.9      31.9      33.5      11.9     1.8

3 sweeps performed together
  16      1551.3    20.5      79.4       0.0       0.0     0.0
  32      1639.3    21.4      77.5       1.0       0.1     0.0
  64      1598.8    21.5      66.9      11.4       0.3     0.0
 128      1338.5    21.7      53.4      22.2       2.7     0.0
 256      1190.8    21.6      27.8      47.8       2.8     0.0
 512       716.9    21.4      28.0      47.2       1.6     1.8
1024       628.9    21.2      25.3      47.7       4.0     1.8
2048       406.2    21.0      31.8      33.5      11.9     1.9

Runtime behaviour

                  % of cycles used for
Size  MFlops/sec      ±    Base   Exec  Cache    DTB  Branch  R dep   Nops

2 sweeps performed together
  16       348.8    8.1   106.6   63.5    0.3    4.4     5.1   18.4    6.8
  32       373.2    8.2   106.1   65.9    0.2    4.8     6.0   14.2    6.8
  64       363.5    7.8   105.5   65.5    0.7    4.1     7.5   14.4    5.5
 128       304.6    7.0   103.6   56.1   14.1    3.5     6.2   12.0    4.7
 256       271.1    6.4   103.2   50.5   18.0    8.0     5.4   10.7    4.2
 512       162.4    4.4   101.1   30.9   49.1    3.2     4.5    6.6    2.4
1024       141.8    5.0   101.0   25.9   50.5    9.6     2.1    5.4    2.5
2048        91.4    5.6   100.4   17.6   58.9    9.3     1.6    6.9    0.5

3 sweeps performed together
  16       348.7    8.3   106.3   61.9    0.2    4.2     5.7   19.6    6.4
  32       372.3    8.6   106.0   65.1    0.2    5.0     6.0   14.4    6.7
  64       363.6    8.0   105.5   65.7    0.7    4.0     7.2   14.4    5.5
 128       305.1    7.2   103.4   56.1   13.7    3.5     6.2   12.0    4.7
 256       271.2    6.0   102.0   50.3   19.4    4.9     7.3   10.1    4.0
 512       162.8    4.6   101.5   29.8   49.8    5.1     3.1    6.6    2.5
1024       142.5    4.9   101.0   26.0   50.6    9.5     2.1    5.4    2.5
2048        91.8    5.6   100.4   17.6   58.9    9.3     1.6    6.9    0.5

cs10-dime@fau.de
Last Modified: 10 January 2008