Consider the following code to copy one file to another:

import java.io.*;

public class FileCopy {

    public static void main(String[] args) throws Exception {
        try (InputStream is = new FileInputStream(args[0]);
             OutputStream os = new FileOutputStream(args[1])) {
            int octet;
            while ((octet = is.read()) != -1) {
                os.write(octet);
            }
        }
    }
}

(We have deliberately omitted normal argument checking, error reporting and so on because they are not relevant to the point of this example.)

If you compile the above code and use it to copy a huge file, you will notice that it is very slow. In fact, it will be at least a couple of orders of magnitude slower than the standard OS file copy utilities.

(Add actual performance measurements here!)

The primary reason that the example above is slow (in the large file case) is that it is performing one-byte reads and one-byte writes on unbuffered byte streams. The simple way to improve performance is to wrap the streams with buffered streams. For example:

import java.io.*;

public class FileCopy {

    public static void main(String[] args) throws Exception {
        try (InputStream is = new BufferedInputStream(
                     new FileInputStream(args[0]));
             OutputStream os = new BufferedOutputStream(
                     new FileOutputStream(args[1]))) {
            int octet;
            while ((octet = is.read()) != -1) {
                os.write(octet);
            }
        }
    }
}

These small changes will improve the data copy rate by at least a couple of orders of magnitude, depending on various platform-related factors. The buffered stream wrappers cause the data to be read and written in larger chunks. Each wrapper instance holds an internal buffer implemented as a byte array, so most one-byte reads and writes only touch that array rather than the underlying file stream.
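To make the "larger chunks" idea concrete, the following is a minimal sketch of doing the chunking by hand, with an explicit byte array instead of the buffered wrappers. The class name `ChunkedFileCopy`, the `copy` helper and the 8192-byte buffer size are assumptions made for this illustration; the size is not a tuned value.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ChunkedFileCopy {

    // Illustrative buffer size (an assumption, not a tuned value).
    private static final int BUFFER_SIZE = 8192;

    // Copies all bytes from 'in' to 'out' in chunks and returns the number copied.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int count;
        // read(byte[]) returns the number of bytes placed in the buffer,
        // or -1 when the end of the stream has been reached.
        while ((count = in.read(buffer)) != -1) {
            // Write only the bytes that were actually read.
            out.write(buffer, 0, count);
            total += count;
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        try (InputStream is = new FileInputStream(args[0]);
             OutputStream os = new FileOutputStream(args[1])) {
            copy(is, os);
        }
    }
}
```

Here each call on the file streams moves up to a whole buffer of data rather than a single byte, which is essentially what the buffered wrappers arrange behind the one-byte `read()` and `write(int)` calls.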

What about character-based streams?

As you should be aware, Java I/O provides different APIs for reading and writing binary and text data.

For text I/O, BufferedReader and BufferedWriter are the equivalents of BufferedInputStream and BufferedOutputStream.
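As a sketch of the text-based counterpart, the file copy example might look like this when written with readers and writers. The class name `TextFileCopy` and the `copy` helper are assumptions made for this illustration, and the example relies on the platform default charset, which may not be what a real application wants.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

public class TextFileCopy {

    // Copies characters one at a time; this is only efficient when the
    // reader and writer passed in are buffered.
    static void copy(Reader in, Writer out) throws IOException {
        int ch;
        while ((ch = in.read()) != -1) {
            out.write(ch);
        }
    }

    public static void main(String[] args) throws Exception {
        // FileReader and FileWriter use the default charset in this form.
        try (Reader reader = new BufferedReader(new FileReader(args[0]));
             Writer writer = new BufferedWriter(new FileWriter(args[1]))) {
            copy(reader, writer);
        }
    }
}
```

As with the byte stream version, closing the BufferedWriter (done here by try-with-resources) flushes any characters still held in its buffer.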

Why do buffered streams make this much difference?

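The real reason that buffered streams help performance has to do with the way that an application talks to the operating system. Each read or write on an unbuffered file stream results in a system call (syscall), and a typical syscall proceeds roughly as follows: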

1. Put the syscall arguments into registers.
2. Execute a SYSENTER trap instruction.
3. The trap handler switches to privileged state and changes the virtual memory mappings.  Then it dispatches to the code to handle the specific syscall.
4. The syscall handler checks the arguments, taking care that it isn't being told to access memory that the user process should not see.
5. The syscall specific work is performed.  In the case of a `read` syscall, this may involve:
   1. checking that there is data to be read at the file descriptor's current position,
   2. calling the file system handler to fetch the required data from disk (or wherever it is stored) into the buffer cache,
   3. copying data from the buffer cache to the JVM-supplied address,
   4. adjusting the file descriptor position.
6. Return from the syscall.  This entails changing the virtual memory mappings again and switching out of privileged state.
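The effect of buffering on the number of such calls can be sketched without touching the operating system at all. In the following example, an in-memory stream stands in for the file, and a hypothetical `CountingInputStream` wrapper (not part of the JDK) counts how many read requests reach it. With a real FileInputStream, each of those requests would be a syscall going through the steps above. The class and method names are assumptions made for this illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ReadCallCounter {

    // Hypothetical helper: counts the read requests that reach the wrapped stream.
    static class CountingInputStream extends FilterInputStream {
        long calls = 0;

        CountingInputStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            calls++;
            return super.read();
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            calls++;
            return super.read(b, off, len);
        }
    }

    // Reads 'data' one byte at a time, optionally through a BufferedInputStream,
    // and returns how many read requests reached the underlying stream.
    static long countCalls(byte[] data, boolean buffered) throws IOException {
        CountingInputStream counter =
                new CountingInputStream(new ByteArrayInputStream(data));
        try (InputStream is = buffered ? new BufferedInputStream(counter) : counter) {
            while (is.read() != -1) {
                // Consume one byte at a time, as in the file copy loop.
            }
        }
        return counter.calls;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1_000_000];
        System.out.println("unbuffered: " + countCalls(data, false));
        System.out.println("buffered:   " + countCalls(data, true));
    }
}
```

In the unbuffered case, the underlying stream sees one request per byte, plus a final one that reports end of stream. In the buffered case it sees only one request per buffer refill, so the fixed per-syscall overhead would be paid far less often.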