Subject: Re: The Pitiful Performance of Java Binary I/O Methods.
Newsgroups: comp.lang.java.api
Date: Tue, 19 Nov 1996 16:30:13 -0700
Organization: University of Utah -- Center for Design Systems

Thomas A. McGlynn wrote:
> ...
> the I/O packages seem to be quite crippled.  In particular
> methods like DataInputStream.readFloat() are extremely
> inefficient.  I've finally managed to bring performance
> up to something reasonable but I have to bypass these
> methods completely.
...
>    ds = new DataInputStream(new FileInputStream(...));
>
>    byte[] buffer[4096];
>    ds.readFully(buffer);
>    for (i=...)
>        x[i] = Float.intBitsToFloat(
>            (buffer[4*i]<<24 & -.....)|
>            (buffer[4*i+1]<<16 & ....)|
>            (buffer[4*i+2]<<8 & 256)|
>            (buffer[4*i+3));
>
> Amazingly this was 6 times faster than using the buffered stream
> bringing the speed up to 90000 words/s which is adequate if
...
> What gives?  Is the DataInputStream class so poorly crafted
> that this kludge can beat it so dramatically?  Are there gotcha's
> that I should be worrying about?  I just find it amazing that
> the I/O library can be the limiting factor even when reading
> a file over the Net.

Calling an instance method is a relatively slow operation, on par
with float operations.  Calling a synchronized method is abysmally
slow, sometimes slower than creating objects, usually about 50 times
slower than an unsynchronized method in a JITC, or about 9 times
slower without a JITC.

The problem is not with the way DataInputStream is crafted, but with
the crappy performance of synchronized methods.  The lesson here is
NEVER CALL SYNCHRONIZED METHODS IN A TIGHT LOOP!  Even unsynchronized
methods are good to avoid in a tight loop if you can help it.

If you're reading a few floats, then some Strings, a couple of ints,
from the same stream, DataInputStream was designed for you.  For
reading floats exclusively, you want to write your own code.

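As a rough sketch of what "your own code" might look like (the class
and method names here are made up for illustration, and it assumes
big-endian IEEE 754 floats, which is the layout readFloat() reads):
one readFully() call fills a byte array, and the loop then assembles
each float from four bytes.  Note the & 0xFF on every byte; Java
bytes are signed, so without the mask any byte of 0x80 or above
sign-extends and corrupts the higher bits.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of java.io.
final class BulkFloatReader {
    private BulkFloatReader() {}

    // Decodes `count` big-endian IEEE 754 floats from the start of buf.
    static float[] decodeFloats(byte[] buf, int count) {
        float[] out = new float[count];
        int off = 0; // running byte offset, advanced by 4 per float
        for (int i = 0; i < count; i++) {
            // Mask each byte to 0..255 before shifting; bytes are signed.
            int bits = ((buf[off]     & 0xFF) << 24)
                     | ((buf[off + 1] & 0xFF) << 16)
                     | ((buf[off + 2] & 0xFF) << 8)
                     |  (buf[off + 3] & 0xFF);
            out[i] = Float.intBitsToFloat(bits);
            off += 4;
        }
        return out;
    }

    // Reads `count` floats with a single readFully() call instead of
    // `count` separate readFloat() calls.
    static float[] readFloats(InputStream in, int count) throws IOException {
        byte[] buf = new byte[4 * count];
        new DataInputStream(in).readFully(buf);
        return decodeFloats(buf, count);
    }
}
```

Something like BulkFloatReader.readFloats(new FileInputStream(...), 1024)
would then replace 1024 separate readFloat() calls.
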
DataInputStream.readFloat() results in a call to
Float.intBitsToFloat(), a call to readInt(), and FOUR SYNCHRONIZED
calls to read(), which occasionally result in some other calls to
actually read more bytes into the buffer.  Now, multiply this number
of calls by the 1024 floats you read.  That's about 3,100 regular and
4,100 synchronized method calls.  Using our factor of 9 for
synchronized calls, that's an effective 40,000 unsynchronized calls
(3,100 + 9 x 4,100).

With your "kludge" (which is actually really smart), you make one
synchronized call to DataInputStream.readFully(), which might result
in, say, two synchronized calls to FileInputStream.read().  Then you
call intBitsToFloat 1024 times.  That's about 1,000 regular and only
3 synchronized method calls, or effectively about 1,050
unsynchronized calls.  Just going by these numbers, you'd expect a
38x speed improvement with your technique.

Unfortunately, these numbers are kind of hokey.  When you get down to
the nitty-gritty, your technique uses a lot of array accesses, which
are generally about 3 times slower than the local variables used in
DataInputStream, so it loses a little.  Then, you're repeating that
4*i calculation 4 times instead of storing it in a local variable and
incrementing it.  Taking these factors into account, your 6x
improvement is pretty reasonable.

Real improvements should come from JIT makers tackling synchronized
methods, not from rewriting DataInputStream.  In the meantime, the
best solution is to do what you've done -- implement your own more
efficient technique.  This is the traditional tradeoff of size vs.
speed, and coding effort vs. runtime performance, which affects any
language and any library.

Good luck,
-- Doug Erickson (http://www.mech.utah.edu/~erickson)