You can read the most current version of the paper here.
The SPARC lets you do weird loads and stores through "alternate address spaces". With this functionality you can, in theory, issue non-faulting speculative loads and stores to keep your pipelines full, or get automatic byte-order (endian) conversions, and all sorts of wonderful things like this. An example bit of code that makes use of the endian conversions and speculative loads can be had here. Speculative stores do not seem to be implemented on the UltraSPARC IIi, but some other SPARC out there might implement them.
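To make the endian-conversion idea concrete, here is a minimal sketch (not the example code linked above). The function name `load_le32` is made up for illustration. It assumes a 64-bit SPARC V9 build with GCC-style inline assembly, and that ASI 0x88 is `ASI_PRIMARY_LITTLE` as in the V9 architecture; on any other machine it falls back to assembling the bytes in software, which is what the hardware conversion is meant to spare you.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Load a 32-bit value that is stored in little-endian byte order at p.
 * p must be 4-byte aligned (the SPARC path traps on a misaligned address).
 */
static uint32_t load_le32(const void *p)
{
    assert(p != NULL);
#if (defined(__sparc__) && defined(__arch64__)) || defined(__sparcv9)
    uint32_t v;
    /* Assumption: 0x88 is ASI_PRIMARY_LITTLE. lduwa loads an unsigned
     * word through that alternate space, so the hardware swaps the bytes. */
    __asm__ __volatile__("lduwa [%1] 0x88, %0" : "=r"(v) : "r"(p) : "memory");
    return v;
#else
    /* Portable fallback: build the value byte by byte, lowest byte first. */
    const unsigned char *b = (const unsigned char *)p;
    return ((uint32_t)b[0])
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
#endif
}
```

Given the bytes 0x78 0x56 0x34 0x12 in memory, both paths should yield 0x12345678 regardless of the host's native byte order.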
I redid the cache-timing programming project that we had earlier
in the semester in CS282. The premise is that you walk over an
array in memory and, from how long the walk takes, make empirical
guesses about the cache architecture of the system. The difference
this time is that instead of the system timing functions, I read
the time out of the SPARC %TICK register. This register increments
roughly once per processor clock cycle, so you can time things in
units of cycles, as opposed to seconds. Mildly interesting.
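A rough sketch of the measurement might look like the following. This is not the original project code: `read_ticks` and `time_walk` are hypothetical names, the inline assembly assumes a 64-bit SPARC V9 build with a GCC-style compiler and an OS that permits user-mode reads of %tick, and on other machines the sketch falls back to the standard `clock()` function, so the units there are `clock()` ticks rather than cycles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Read a free-running counter: %tick on 64-bit SPARC V9, clock() elsewhere. */
static uint64_t read_ticks(void)
{
#if (defined(__sparc__) && defined(__arch64__)) || defined(__sparcv9)
    uint64_t t;
    /* Assumption: the OS leaves %tick readable from user mode. */
    __asm__ __volatile__("rd %%tick, %0" : "=r"(t));
    return t;
#else
    return (uint64_t)clock();
#endif
}

/*
 * Touch every stride-th byte of buf and return the elapsed tick count.
 * The running sum is written to *sink so the compiler cannot discard
 * the loads; buf is volatile for the same reason.
 */
static uint64_t time_walk(volatile unsigned char *buf, size_t size,
                          size_t stride, unsigned *sink)
{
    assert(buf != NULL && sink != NULL && stride > 0);
    unsigned sum = 0;
    uint64_t start = read_ticks();
    for (size_t i = 0; i < size; i += stride)
        sum += buf[i];
    uint64_t end = read_ticks();
    *sink = sum;
    return end - start;
}
```

Calling `time_walk` over a range of array sizes and strides, and plotting ticks per access, is one way to look for the jumps that hint at cache size and line length.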
The data from the program seemed to match the data that I collected
the first time around, when I used the UNIX
clock() function to
gather timing information, with a couple of caveats. Apparently,
as the time it took to walk over the array grew, the odds that the
process was context switched in the middle grew significantly, and a
number of sample points in the test were really outliers.
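One common way to blunt this kind of outlier, which is not something the original experiment did, is to repeat each measurement several times and keep the smallest sample. The reasoning is that %TICK keeps counting while another process runs, so a context switch can only add cycles to a sample, never remove them. The helper name `min_sample` below is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the smallest of n timing samples. Interference such as a
 * context switch only inflates a sample, so the minimum is the one
 * least disturbed by other activity on the machine.
 */
static uint64_t min_sample(const uint64_t *samples, size_t n)
{
    assert(samples != NULL && n > 0);
    uint64_t best = samples[0];
    for (size_t i = 1; i < n; i++) {
        if (samples[i] < best)
            best = samples[i];
    }
    return best;
}
```

The minimum is a deliberately simple choice; a median would also discard the occasional huge sample while being less sensitive to a single unusually lucky run.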
You can grab