I have to read a URLConnection response containing 2 MB of pretty-printed JSON in Java.

2 MB is not "small", but it is by no means large either. However, it is pretty-printed JSON with around 60k lines. A loop like

String lineAllOfIt = "";
String line;
while ((line = bufferedReader.readLine()) != null) {
    lineAllOfIt += line;
}

takes around 10 minutes to read this response. There must be something wrong with my approach, but I cannot see a better one.

  • I assume you mean MB, otherwise your file would be tiny at 2 milliBit :P Commented Jun 1, 2016 at 12:04
  • lineAllOfIt += line; is "wrong" since strings are immutable and you create new, increasingly large ones over and over again. Use a StringBuilder (see the sketch after these comments) or do it like stackoverflow.com/a/37079572/995891 Commented Jun 1, 2016 at 12:07
  • do you want to write an answer? this is the solution Commented Jun 1, 2016 at 12:14
  • what do you want to do with your JSON? parse it no? Commented Jun 1, 2016 at 12:15
  • I don't believe it is a good idea to load a 2 MB file into memory anyway, even into a StringBuilder, unless you only do it once and the operation cannot run in parallel; otherwise you will fill up your heap. Commented Jun 1, 2016 at 12:19
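
As the second comment notes, the repeated String concatenation is the likely bottleneck: every += copies the entire accumulated string, which is quadratic over roughly 60k lines. A minimal sketch of the StringBuilder alternative (assuming the same bufferedReader as in the question; the variable names here are only illustrative):

    StringBuilder allOfIt = new StringBuilder();
    String line;
    while ((line = bufferedReader.readLine()) != null) {
        allOfIt.append(line); // amortized O(1) append instead of copying the whole string each time
    }
    String lineAllOfIt = allOfIt.toString();

Note that, like the original loop, this drops the newline characters that readLine() strips off.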

1 Answer


For this particular case, I would cache the file locally. Using Java NIO you can transfer the file to disk with low memory overhead, then go through it line by line without loading it all into memory, pulling out just the data you need (or load it all at once from the local copy if you prefer).

EDIT: Fixed the variable names; I pulled this from my own code and forgot to neutralize the variables. Also, FileChannel transferTo/transferFrom can be much more efficient, since there are potentially fewer copies and, depending on the operation, data can go straight from the socket buffer to disk. See the FileChannel API.

    import java.io.FileOutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.net.URLConnection;
    import java.nio.channels.Channels;
    import java.nio.channels.FileChannel;
    import java.nio.channels.ReadableByteChannel;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    String fileUrlString = "http://update.domain.com/file.json"; // File URL path
    Path diskSaveLocation = Paths.get("file.json"); // Places the file in your working directory

    final URL url = new URL(fileUrlString);
    final URLConnection conn = url.openConnection();
    final long fileLength = conn.getContentLength();
    System.out.println(String.format("Downloading file... %s, Size: %d bytes.", fileUrlString, fileLength));
    try (
            FileOutputStream stream = new FileOutputStream(diskSaveLocation.toFile(), false);
            FileChannel fileChannel = stream.getChannel();
            ReadableByteChannel inChannel = Channels.newChannel(conn.getInputStream());
    ) {
        long read = 0;
        long readerPosition = 0;
        // Copy straight from the connection's stream to disk,
        // so the response body never has to be held on the heap
        while ((read = fileChannel.transferFrom(inChannel, readerPosition, fileLength)) >= 0
                && readerPosition < fileLength) {
            readerPosition += read;
        }
        if (fileLength != Files.size(diskSaveLocation)) {
            Files.delete(diskSaveLocation);
            System.out.println(String.format("File... %s did not download correctly, deleting file artifact!", fileUrlString));
        }
    }
    System.out.println(String.format("File Download... %s completed!", fileUrlString));
    ((HttpURLConnection) conn).disconnect();

You can now read this same file with NIO.2 methods that let you process it line by line without loading it into memory. Using Scanner or RandomAccessFile you can avoid pulling all the lines onto the heap at once. If you do want to read the whole file in, you can also do that locally from the cached copy using the utility methods in Java's Files class.
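
For example, here is a minimal sketch (assuming the file was cached to file.json by the code above; the non-blank line count is just an illustrative use) that streams the cached file line by line with Files.lines, so the whole document never has to sit on the heap at once:

    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.stream.Stream;

    Path cachedFile = Paths.get("file.json");
    try (Stream<String> lines = Files.lines(cachedFile, StandardCharsets.UTF_8)) {
        // Lines are read lazily from disk; filter or parse here instead of building one huge String
        long nonBlank = lines.filter(line -> !line.trim().isEmpty()).count();
        System.out.println("Read " + nonBlank + " non-blank lines without loading the file into memory");
    }

If you really do need the whole content as a single String, reading the cached file with Files.readAllBytes (or appending the streamed lines to a StringBuilder) is still far faster than concatenating 60k Strings.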

Java Read Large Text File With 70million line of text
