
Data Local Iterative Methods For The Efficient Solution of Partial Differential Equations

A cooperation between LSS and LRR. Funded by DFG.

rb7.F Program Description

File Name:

rb7.F

Description:

The data access behaviour of rb7.F is the same as that of rb6.F. Because rb6.F suffers from register dependencies, the innermost loops were unrolled to enable better instruction scheduling.
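The source of rb7.F is not reproduced on this page, so the following is only a minimal sketch of what unrolling an innermost loop looks like. It assumes a red-black Gauss-Seidel sweep with a 5-point stencil on a square grid, as the file name suggests. It is written in Python purely for illustration, and the function names, the grid layout, and the unroll factor of 2 are all hypothetical. In the unrolled variant, the two updates in the loop body read only black neighbours and are therefore independent of each other, which is the property a compiler or processor can exploit to schedule instructions around register dependencies. Python itself gains no speed from this transformation.

```python
def red_sweep_rolled(u, f, h2):
    """Update every interior red point (i + j even), one point per iteration."""
    n = len(u)
    for i in range(1, n - 1):
        start = 1 if i % 2 == 1 else 2  # first interior j with (i + j) even
        for j in range(start, n - 1, 2):
            u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                              + u[i][j - 1] + u[i][j + 1] + h2 * f[i][j])


def red_sweep_unrolled(u, f, h2):
    """Same update, with the innermost loop unrolled by a factor of 2 (assumed)."""
    n = len(u)
    for i in range(1, n - 1):
        j = 1 if i % 2 == 1 else 2
        # Main loop: two red points per iteration. Both values depend only on
        # black neighbours, so neither computation has to wait for the other.
        while j + 2 < n - 1:
            a = 0.25 * (u[i - 1][j] + u[i + 1][j]
                        + u[i][j - 1] + u[i][j + 1] + h2 * f[i][j])
            b = 0.25 * (u[i - 1][j + 2] + u[i + 1][j + 2]
                        + u[i][j + 1] + u[i][j + 3] + h2 * f[i][j + 2])
            u[i][j] = a
            u[i][j + 2] = b
            j += 4
        # Remainder loop: at most one red point is left in this row.
        while j < n - 1:
            u[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                              + u[i][j - 1] + u[i][j + 1] + h2 * f[i][j])
            j += 2
```

Both functions visit the same grid points in the same order and evaluate the same expression for each of them, so the memory access pattern is unchanged; only the amount of independent work per loop iteration differs.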

Comment:

The cache behaviour should be equivalent to that of rb6.F.

Results:

Memory access behaviour

                  % of all accesses which go into
Size  MBytes/sec     ±  1. Level  2. Level  3. Level  Memory

2 sweeps performed together
  16      1598.0  19.7      80.3       0.0       0.0     0.0
  32      1674.4  20.9      77.8       1.2       0.0     0.0
  64      1605.4  20.9      66.8      12.1       0.2     0.0
 128       844.3  21.1      61.5      14.6       2.7     0.0
 256      1219.6  21.3      32.3      43.7       2.8     0.0
 512       699.0  21.2      32.4      43.1       1.5     1.8
1024       611.2  21.1      30.8      42.2       4.1     1.8
2048       378.0  20.9      34.6      30.5      12.2     1.8

3 sweeps performed together
  16      1589.8  19.9      80.1       0.0       0.0     0.0
  32      1668.6  20.8      66.5      12.7       0.0     0.0
  64      1605.2  20.8      66.9      12.1       0.1     0.0
 128       851.1  21.1      61.5      14.6       2.7     0.0
 256      1210.4  21.4      32.1      43.7       2.8     0.0
 512       720.8  21.3      32.2      43.1       1.5     1.8
1024       612.4  21.1      30.7      42.2       4.1     1.8
2048       383.3  21.0      34.6      30.5      12.2     1.8

Runtime behaviour

                  % of cycles used for
Size  MFlops/sec     ±   Base  Exec  Cache   DTB  Branch  R dep  Nops

2 sweeps performed together
  16       355.3   7.5  108.7  64.6    0.3   4.6     5.1   19.6   7.0
  32       378.0   7.9  108.2  65.6    0.6   5.2     6.3   15.8   6.8
  64       362.5   7.2  107.4  66.7    0.8   4.1     7.3   15.7   5.6
 128       191.1   9.7  100.2  34.9    8.6   2.2     3.6   40.2   1.0
 256       276.7   5.0  105.4  51.2   19.5   5.5     6.2   11.3   6.7
 512       158.3   3.8  102.5  29.8   51.4   3.5     3.2    6.9   3.9
1024       138.3   3.9  101.9  25.2   51.0  11.1     2.3    5.9   2.5
2048        85.4   3.1  100.1  16.7   56.1   3.7     1.5   17.7   1.3

3 sweeps performed together
  16       354.2   7.0  108.7  63.6    0.3   4.2     5.8   21.0   6.8
  32       376.0   7.1  108.9  67.6    0.1   4.6     6.2   16.3   7.0
  64       362.1   7.4  107.8  67.0    0.6   4.0     7.4   15.8   5.6
 128       192.7   9.6  100.2  34.9    9.1   1.9     3.5   40.2   1.0
 256       275.0   5.1  105.5  51.5   19.2   5.6     5.8   11.5   6.8
 512       163.5   3.1  102.6  29.4   51.7   4.6     2.7    7.0   4.1
1024       138.7   3.9  101.9  25.5   51.5  10.2     2.4    5.9   2.5
2048        86.6   3.0  100.1  16.9   55.7   3.7     1.6   17.9   1.3

cs10-dime@fau.de
Last Modified: 10 January 2008