
In almost every Java example I see of reading from a file, a reader is used to read the file line by line. My thought was that this would be terribly inefficient, because it seems to require a system call per line.

What I'd been doing instead is using an input stream and grabbing the bytes directly. In my experiments, this is significantly faster. My test file was 1 MB.

    //Stream method
    try {
        Long startTime = new Date().getTime();

        InputStream is = new FileInputStream("test");
        byte[] b = new byte[is.available()];
        is.read(b);
        String text = new String(b);
        //System.out.println(text);

        Long endTime = new Date().getTime();
        System.out.println("Text length: " + text.length() + ", Total time: " + (endTime - startTime));

    }
    catch (Exception e) {
        e.printStackTrace();
    }

    //Reader method
    try {
        Long startTime = new Date().getTime();

        BufferedReader br = new BufferedReader(new FileReader("test"));
        String line = null;
        StringBuilder sb = new StringBuilder();
        while ((line = br.readLine()) != null) {
            sb.append(line);
            sb.append("\n");
        }
        String text = sb.toString();

        Long endTime = new Date().getTime();
        System.out.println("Text length: " + text.length() + ", Total time: " + (endTime - startTime));

    }
    catch (Exception e) {
        e.printStackTrace();
    }

This gives a result of:

    Text length: 1054631, Total time: 9
    Text length: 1034099, Total time: 22

So, why do people use readers instead of streams?

If I have a method that takes a text file and returns a String that contains all of the text, is it necessarily better to do it using a stream?

3 Comments
  • Your code is not correct: it is not guaranteed to read the whole file; see the documentation of the read and available methods. Commented Apr 22, 2012 at 16:49
  • Have you tried the java.nio.file package's Files.readAllLines(...) method? (A minimal sketch follows these comments.) Commented Apr 22, 2012 at 16:52
  • +1, learned something new. Commented Feb 7, 2013 at 20:29
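
For reference, here is a minimal sketch of the java.nio.file approach suggested in the comments, assuming Java 7+ and the same "test" file from the question; the UTF-8 charset is my assumption:

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;
    import java.util.List;

    public class NioReadExample {
        public static void main(String[] args) throws IOException {
            // Read all lines at once; the charset is explicit rather than the platform default.
            List<String> lines = Files.readAllLines(Paths.get("test"), StandardCharsets.UTF_8);

            // Or read the whole file into a single String.
            String text = new String(Files.readAllBytes(Paths.get("test")), StandardCharsets.UTF_8);

            System.out.println("Lines: " + lines.size() + ", text length: " + text.length());
        }
    }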

3 Answers


You are comparing apples to bananas. Reading one line at a time is going to be less efficient, even with a BufferedReader, than grabbing the data as fast as possible. Note that the use of available is discouraged, as it is not accurate in all situations. I found this out myself when I started using cipher streams.
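
As a rough illustration of that point (my own sketch, not part of the answer): instead of sizing a buffer from available(), loop until read returns -1, so the whole stream is consumed regardless of what available() reports.

    import java.io.ByteArrayOutputStream;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.charset.StandardCharsets;

    public class ReadFullyExample {
        // Copies every byte from the stream into memory, chunk by chunk.
        static byte[] readFully(InputStream is) throws IOException {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = is.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        }

        public static void main(String[] args) throws IOException {
            try (InputStream is = new FileInputStream("test")) {
                String text = new String(readFully(is), StandardCharsets.UTF_8);
                System.out.println("Text length: " + text.length());
            }
        }
    }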

6 Comments

That's very interesting. Is available dangerous when reading from a plain text file that exists on the local file system?
@Jeremy It is never correct to use available to allocate a buffer for the entirety of a stream.
@Jeffrey If you have it, I'd love to see any resources you have on that. Before now I had been using available quite happily without running into any issues. I believe you, but I wonder if there really is a situation where available is appropriate.
@Jeremy Read the documentation for available. I more or less quoted the second paragraph of the documentation in my last statement.
@Jeremy The problem with available is that it can only return the number of bytes available without blocking. If you are 100% sure that your InputStream's buffer contains your entire file and that your InputStream will return the correct number from available, then by all means use it. But if your file is larger than the InputStream's buffer, or your InputStream does not return the correct number, using it will fail.

FileReader is generally used in conjunction with a BufferedReader because it frequently makes sense to read a file line by line, especially if the file has a well-defined record structure where each record corresponds to a line.

Also, FileReader can simplify some of the work of dealing with character encodings and conversions, as stated in the javadocs:

Convenience class for reading character files. The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate ... FileReader is meant for reading streams of characters.
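
A minimal sketch of that line-by-line pattern, with the charset made explicit (FileReader itself would use the platform default); the file name and per-line processing below are placeholders:

    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    public class RecordReader {
        public static void main(String[] args) throws IOException {
            try (BufferedReader br = new BufferedReader(
                    new InputStreamReader(new FileInputStream("records.txt"), StandardCharsets.UTF_8))) {
                String line;
                while ((line = br.readLine()) != null) {
                    // Each line is one record; process it here.
                    System.out.println(line);
                }
            }
        }
    }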



Try increasing the BufferedReader buffer size. For example:

    BufferedReader br = new BufferedReader(new FileReader("test"), 2000000);

If you choose the right buffer size, this will be faster.

Also, in your Reader sample you spend time filling the StringBuilder. You have to read the file line by line if you need to process lines, but if you only need the text as a single String, read bigger chunks of text with public int read(char[] cbuf) and append the chunks to a StringWriter (or StringBuilder) initialized with a proper size.
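
A rough sketch of that chunked-read approach, using a pre-sized StringBuilder; the chunk size and initial capacity below are just illustrative guesses for a roughly 1 MB file:

    import java.io.FileReader;
    import java.io.IOException;
    import java.io.Reader;

    public class ChunkedRead {
        public static void main(String[] args) throws IOException {
            // Pre-size the builder so it does not have to grow repeatedly.
            StringBuilder sb = new StringBuilder(1100000);
            try (Reader reader = new FileReader("test")) {
                char[] chunk = new char[64 * 1024];
                int n;
                // read(char[]) returns the number of chars read, or -1 at end of stream.
                while ((n = reader.read(chunk)) != -1) {
                    sb.append(chunk, 0, n);
                }
            }
            String text = sb.toString();
            System.out.println("Text length: " + text.length());
        }
    }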

Choosing between an InputStream and a Reader should not depend on performance. Generally you use a Reader when you read text data, because a Reader lets you handle the charset more easily.

Another point: your code here

    byte[] b = new byte[is.available()];
    is.read(b);
    String text = new String(b);

is not correct. The documentation says:

Note that while some implementations of InputStream will return the total number of bytes in the stream, many will not. It is never correct to use the return value of this method to allocate a buffer intended to hold all data in this stream.

so pay attention: you need to fix it.
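
One possible fix (my sketch, not from the answer): size the buffer from the file length instead of available(), and use DataInputStream.readFully, which keeps reading until the array is completely filled:

    import java.io.DataInputStream;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;

    public class ReadWholeFile {
        public static void main(String[] args) throws IOException {
            File file = new File("test");
            byte[] b = new byte[(int) file.length()]; // size from the file system, not from available()
            try (DataInputStream in = new DataInputStream(new FileInputStream(file))) {
                in.readFully(b); // blocks until b is completely filled (or throws EOFException)
            }
            String text = new String(b, StandardCharsets.UTF_8);
            System.out.println("Text length: " + text.length());
        }
    }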

2 Comments

Manually supplying a buffer size only seemed to negatively impact performance for me.
How big is your file? How much heap do you dedicate to your JVM?
