Cray T3D Applications Programming      Single PE Optimization Techniques
------------------------------------------------------------------------

                 T3D Load Times and Bandwidths
                                                       Bandwidth
                                                   (64-bit words)
        Source                    Latency (cp)      Wds/cp        MB/s
Cache                                        3         1/1        1200
DRAM -> Cache (in page)                     24        4/24         200
DRAM -> Cache (out of page)                 39        4/39         123
DRAM Read Ahead -> Cache                    15        4/15         320
DRAM -> Register (in page)                  22        1/22          55
DRAM -> Register (out of page)              37        1/37          32
Remote DRAM -> Cache (in page)             107       4/111          43
Remote DRAM -> Cache (out of page)         122       4/126          37
Remote DRAM -> Register (in page)           86        1/83          15
Remote DRAM -> Register (out of page)      101        1/98          12
Remote DRAM -> Prefetch (in page)          ~86        1/21          67
Remote DRAM -> Prefetch (out of page)     ~101        1/23          50
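
The MB/s column appears to follow from the Wds/cp column. The sketch
below shows that conversion under two assumptions that are not stated
on the page: a 150 MHz clock (inferred from the Cache row, where 1 word
per cp is listed as 1200 MB/s) and 8 bytes per 64-bit word. The function
name is hypothetical. A few printed entries, the prefetch rows for
example, do not match this conversion exactly, so treat it as an
approximation of how the column was derived, not as a correction.

```python
# Assumption: 150 MHz clock, inferred from the Cache row (1 word/cp = 1200 MB/s).
CLOCK_MHZ = 150
# A 64-bit word is 8 bytes.
BYTES_PER_WORD = 8


def bandwidth_mb_per_s(words, clock_periods):
    """Convert a 'words per clock periods' rate into MB/s (hypothetical helper)."""
    if clock_periods <= 0:
        raise ValueError("clock_periods must be positive")
    return words / clock_periods * BYTES_PER_WORD * CLOCK_MHZ


if __name__ == "__main__":
    # Rows taken from the table: (source, words, clock periods)
    rows = [
        ("Cache", 1, 1),
        ("DRAM -> Cache (in page)", 4, 24),
        ("DRAM -> Cache (out of page)", 4, 39),
        ("DRAM Read Ahead -> Cache", 4, 15),
    ]
    for source, words, cp in rows:
        print(f"{source:32s} {bandwidth_mb_per_s(words, cp):7.1f} MB/s")
```

For the four rows listed in the script, the rounded results agree with
the printed MB/s values.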

All remote DRAM values assume no network contention and nearest-neighbor
(not same node) communication. For more distant neighbors, add 1 cp for
each network switch passed and 2 cp for each change of direction
(maximum of 2).
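
The adjustment rule above can be sketched as a small function. This is
an illustration, not something from the manual: the function and
parameter names are hypothetical, "switches passed" is read as the
switches beyond the nearest-neighbor case, and "maximum of 2" is read as
a limit of two direction changes per route.

```python
def remote_latency_cp(base_cp, extra_switches=0, direction_changes=0):
    """Estimate remote load latency in clock periods (hypothetical helper).

    base_cp           -- nearest-neighbor latency taken from the table
    extra_switches    -- assumption: switches passed beyond the
                         nearest-neighbor case, 1 cp each
    direction_changes -- changes of direction on the route, 2 cp each;
                         assumption: at most 2 per route
    """
    if extra_switches < 0:
        raise ValueError("extra_switches must be non-negative")
    if not 0 <= direction_changes <= 2:
        raise ValueError("direction_changes must be 0, 1, or 2")
    return base_cp + 1 * extra_switches + 2 * direction_changes


if __name__ == "__main__":
    # Remote DRAM -> Register (in page) is 86 cp for a nearest neighbor.
    # Three further switches and one change of direction: 86 + 3 + 2.
    print(remote_latency_cp(86, extra_switches=3, direction_changes=1))
```

Like the table itself, the estimate assumes no network contention.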


TR-T3DAPPL B            Cray Research, Inc.             16-15


This was a written page, given to me by Ed Segall; I thought I would
scan it for future reference.
