4

I am implementing the LZW algorithm. I have implemented it for strings and text files successfully and am currently modifying my code to work with binary files, such as images or executables (since I cannot read these files as strings).

I have replaced the String type in my code with the ArrayList<Byte> type. My code now compresses and decompresses binary files correctly, however it is at least 10x slower! This is unacceptable in a compression application where speed is a critical element.

Have I made the correct substitution of ArrayList<Byte> for String. Is there a faster alternative with similar functionality? Note that the LZW algorithm requires array resizing hence standard arrays[] are not suitable.

Regards.

1
  • you should show your code, there may be other obvious reasons for the performance hit. Commented Jan 21, 2014 at 0:22

4 Answers 4

6

Using a List<Byte> will box every byte into a separate object instance.
In bulk, that is one of the worst possible things you can do for performance.

By contrast, an array or string can occupy a solid block of memory.

Instead, you should use ByteArrayOutputStream, or use byte[] directly and resize as needed (you can make a wrapper class for that)

Sign up to request clarification or add additional context in comments.

1 Comment

+1 The "solid block of memory" is probably key here. A computationally intensive operation like compression is likely going to benefit a lot from the increased memory locality.
1

You are boxing bytes within an ArrayList, which uses much more memory than simple Strings. What this means is each byte is wrapped in a whole object, and referred to by a reference. Note that such a reference is itself 4 to 8 times larger than the original byte!

You would be much better off using primitive byte [] arrays, or perhaps a primitive collections library (which properly abstracts primitive arrays as collections) such as this or this.

Comments

0

You need to locate the section of the code that is causing the slow down. There is not enough information in the question to gain any useful answers.

You should use a profiler. See this thread: Which Java Profiling tool do you use and which tool you think is the best?

Comments

0

ArrayList implements an array so it is not ideal for lots of resizing. LinkedList should give better performance if resizing is creating the bottleneck.

https://stackoverflow.com/a/322742/1487030

2 Comments

If memory locality is the culprit (as I suspect), a LinkedList would perform even worse.
When I mentioned resizing in the question. I meant the equivalent of String = String + char; so ArrayList<Byte>.add(byte);

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.