In plain old Java, if a class implements the Serializable interface, Java takes care of serializing and deserializing its instances. Why do we need to supply separate Serializer and Deserializer classes when we use Kafka? Why can't Kafka use the same mechanism as the Java runtime to serialize and deserialize an instance when its class implements Serializable? Can you please explain this?
-
because Java serialization is bad – Michael, Aug 11, 2019 at 19:17
-
For reference to @Michael's comment: "Oracle plans to dump risky Java serialization" – Turing85, Aug 11, 2019 at 19:20
-
Java serialisation is slow, bloated, and insecure. It lacks backwards and forwards compatibility features, and has minimal versioning support. Its implementation as reflection on named private methods is questionable at best. Why would anyone building a highly scalable, high-throughput messaging system use such a system? – Boris the Spider, Aug 11, 2019 at 19:22
-
Keep in mind that Kafka is not limited to Java. It can be used with any other programming language, which will not understand Java's serialization. However, you can use Java's serialization for your Serializer and Deserializer if you want. – Progman, Aug 11, 2019 at 19:22
-
In addition to the above: I usually send JSON over Kafka, which is not supported by Java serialization. – daniu, Aug 12, 2019 at 13:42
1 Answer
To answer your question: Kafka uses separate Serializer and Deserializer classes so that applications can choose which serialization format to use. They are a pluggable extension point that adds flexibility.
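To make "pluggable" concrete, here is a minimal sketch. So that it compiles without the Kafka client library, it uses a local stand-in interface, `ValueSerializer`, shaped like the `serialize(topic, data)` method of Kafka's `Serializer<T>`. The class names and the `encodedSize` helper are hypothetical and exist only for this illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class PluggableSerializerSketch {

    // Local stand-in (an assumption for this sketch), shaped like the
    // serialize(topic, data) method of Kafka's Serializer<T>.
    interface ValueSerializer<T> {
        byte[] serialize(String topic, T data);
    }

    // One possible format: UTF-8 text.
    static final class Utf8Serializer implements ValueSerializer<String> {
        @Override
        public byte[] serialize(String topic, String data) {
            return data == null ? null : data.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Another possible format: a 4-byte big-endian integer.
    static final class IntSerializer implements ValueSerializer<Integer> {
        @Override
        public byte[] serialize(String topic, Integer data) {
            return data == null ? null : ByteBuffer.allocate(4).putInt(data).array();
        }
    }

    // Hypothetical "client" code: it only ever sees bytes, so the format
    // is decided entirely by whichever serializer is plugged in.
    static <T> int encodedSize(ValueSerializer<T> serializer, String topic, T value) {
        byte[] bytes = serializer.serialize(topic, value);
        return bytes == null ? 0 : bytes.length;
    }

    public static void main(String[] args) {
        System.out.println(encodedSize(new Utf8Serializer(), "demo-topic", "hello"));
        System.out.println(encodedSize(new IntSerializer(), "demo-topic", 7));
    }
}
```

The transport code never needs to know which format is in use. That is the flexibility a fixed, built-in mechanism such as Java serialization would take away.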
Kafka does "come with" some built-in serializers and deserializers, including ones for byte arrays and strings, and neither of these uses Java serialization. As other comments have mentioned, Java serialization is mostly seen where the writer and reader applications are tightly coupled, use the same versions of Java, and so on. Its issues become more apparent in environments where these assumptions can't be made, which are exactly the environments Kafka is used in. There is also the matter of Kafka supporting other programming languages in its APIs.
If this were a hard requirement, you could write each message as a byte array holding your Java-serialized object, read the messages back with the byte array deserializer, and post-process the bytes into objects yourself. This basically uses Kafka as a plain byte pass-through.
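A minimal sketch of that pass-through approach, using only the JDK. The `JavaSerializationBytes` helper and the `Order` type are hypothetical names made up for this illustration. The producer and consumer wiring is omitted and is assumed to be configured with Kafka's byte array serializer and deserializer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper, not part of Kafka: converts Serializable objects
// to and from the byte[] that would travel through the topic.
final class JavaSerializationBytes {

    // Hypothetical payload type used only for this illustration.
    static final class Order implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;
        final int quantity;

        Order(String id, int quantity) {
            this.id = id;
            this.quantity = quantity;
        }
    }

    // Producer side: Java-serialize the object before sending it as bytes.
    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Consumer side: post-process the received bytes back into an object.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] payload = toBytes(new Order("A-42", 3));
        Order copy = (Order) fromBytes(payload);
        System.out.println(copy.id + " x" + copy.quantity); // prints "A-42 x3"
    }
}
```

Note that both sides need the `Order` class on their classpath, and deserializing bytes from an untrusted topic carries the security risks raised in the comments.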