Java custom serialization best practices

Question

I am attempting to improve our current serialization performance by switching from the Serializable interface to Externalizable, but I have not found much documentation on best practices for writing custom, performant serialization. My current solution is about twice as fast as the stock Java serialization, which, while good, isn't the vast improvement I was expecting (see "Benchmark of serialization techniques/libraries").

For anything but primitives I've taken the approach of writing a 0 or 1 to show the field exists, then reading the field if the value is 1:

if (in.read() == 1) {
    name = in.readUTF();
}
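For reference, here is a minimal sketch of the matching write and read sides of this presence-flag scheme. The Person class and the PresenceFlagDemo helper are hypothetical names made up for illustration; only the name field comes from the snippet above. It uses writeBoolean/readBoolean, which put the same single 0/1 byte on the wire as the manual flag.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical class for illustration; only the "name" field is from the question.
class Person implements Externalizable {
    String name; // may be null

    // Externalizable needs a public no-arg constructor for deserialization.
    public Person() {}

    Person(String name) {
        this.name = name;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeBoolean(name != null); // presence flag: one byte, 1 or 0
        if (name != null) {
            out.writeUTF(name);
        }
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        // Assign null explicitly when the flag says the field is absent.
        name = in.readBoolean() ? in.readUTF() : null;
    }
}

// Hypothetical helper that pushes a Person through the standard object streams.
class PresenceFlagDemo {
    static Person roundTrip(Person p) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(p);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Person) in.readObject();
        }
    }
}
```

The write and read sides have to stay in the same field order, since nothing in the stream identifies which field a flag belongs to. Note also that writeUTF rejects strings whose encoded form is longer than 65535 bytes, so long strings would need a different encoding.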

Does that sound about right? Are there better encodings to use? What about Maps, Lists, and other complex data structures? Is the default serialization for Enums fine?

Thanks.

Jon Skeet · Accepted Answer · 2012-04-20 16:35:05Z

2

Is there any reason not to use an existing serialization framework, just a better one than the one built into Java? My own preference is Protocol Buffers, but there are alternatives as well, such as Thrift. I'd try to avoid doing your own low-level serialization unless you really can't avoid it. The page you've linked to shows lots of alternatives.

You should consider both performance and maintainability. While Externalizable can give you great performance, it depends on how you implement it, in the end - and you could do a good job, or a bad job... but it'll all be manual.


1 Comment

Gandalf Over a year ago

The plan is actually to evaluate a few possible approaches and see which we like best based on performance and maintainability. We don't need the cross-language support that many of them offer, so custom Java code seemed like one possibility.

Peter Lawrey · Accepted Answer · 2012-04-20 17:53:05Z

0

From a maintainability point of view, I try to use generated Data Transfer Objects. This way you generate the toString, hashCode, equals, readObject, writeObject and possibly their Builder classes as well from a single definition.

In terms of speed, it depends on what your raw data types are. There are three main costs in serialization/deserialization:

1. Use of reflection. Avoiding it is the main benefit of custom serialization, because you can hard-code the fields and types.
2. Creating new objects. You can use recycled objects, but this can be tricky.
3. The number of bytes you read/write. Using more compact forms can help.
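As a rough illustration of the first and third points, here is a minimal sketch of an Externalizable class that hard-codes its fields and uses compact encodings. The Account class, the Status enum and the CompactDemo helper are hypothetical names made up for this example. It assumes both fields are non-null and that the enum has fewer than 256 constants. Writing the enum as its ordinal is one possible choice, not the only one: default serialization writes the constant's name, which is larger but survives reordering of the constants.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical enum for illustration.
enum Status { ACTIVE, SUSPENDED, CLOSED }

// Hypothetical class for illustration; assumes status and tags are never null.
class Account implements Externalizable {
    // Cache values() once: each call to values() allocates a new array.
    private static final Status[] STATUSES = Status.values();

    Status status = Status.ACTIVE;
    List<String> tags = new ArrayList<>();

    // Externalizable needs a public no-arg constructor for deserialization.
    public Account() {}

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        // One byte for the enum; breaks compatibility if constants are reordered.
        out.writeByte(status.ordinal());
        // Length prefix, then the elements, with no per-element type information.
        out.writeInt(tags.size());
        for (String tag : tags) {
            out.writeUTF(tag);
        }
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        status = STATUSES[in.readUnsignedByte()];
        int size = in.readInt();
        tags = new ArrayList<>(size); // presize to avoid regrowth while filling
        for (int i = 0; i < size; i++) {
            tags.add(in.readUTF());
        }
    }
}

// Hypothetical helper that pushes an Account through the standard object streams.
class CompactDemo {
    static Account roundTrip(Account a) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(a);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (Account) in.readObject();
        }
    }
}
```

The same length-prefix pattern extends to a Map by writing the size followed by alternating keys and values. Whether any of this beats a generated or framework-based approach would need to be measured on your own data.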
