
I'm building a Java CLI utility application that processes data from a file.

Apart from reading the input file, all operations are done in memory. The in-memory processing part is taking a surprisingly long time, so I tried profiling it, but I could not pinpoint any specific function that performed particularly badly.

I suspected that the JIT was not able to optimize the program within a single run, so I benchmarked how the runtime changes across consecutive executions of the function containing all the program logic (including reading the input file). Sure enough, the runtime of the in-memory processing part drops over several executions and is almost 10 times smaller by the 5th run.

I tried shuffling the input data before every execution, but it has no visible effect. I'm not sure whether some caching or the JIT optimizations performed during the run are responsible for this improvement, but since the program is usually run just once, it always shows the worst-case performance.

Would it be possible to somehow get good performance on the first run? Is there a generic way to optimize performance for short-running Java applications?

2 Answers


You probably cannot optimize startup time and performance by changing your application1, 2, especially for a small application3. And I certainly don't think there are "generic" ways to do it, i.e. optimizations that will work in all cases.

However, there are a couple of JVM features that should improve performance for a short-lived JVM.

Class Data Sharing (CDS) is a feature that allows loaded class metadata to be cached in the file system (as a CDS archive) and reused by later runs of your application. This feature has been available since Java 5 (though with limitations in earlier Java releases).

The CDS feature is controlled using the -Xshare JVM option.

  • -Xshare:dump generates a CDS archive during the run
  • -Xshare:off, -Xshare:on, and -Xshare:auto control whether an existing CDS archive will be used.
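A typical command-line flow might look like the following sketch. The flag names are real JVM options, but app.jar and the archive name are placeholders for your own application:

```shell
# Regenerate the default CDS archive for the JDK's own classes:
java -Xshare:dump

# Run with the shared archive required (the default is -Xshare:auto):
java -Xshare:on -jar app.jar

# On JDK 13+, application classes can also be archived (dynamic AppCDS):
java -XX:ArchiveClassesAtExit=app.jsa -jar app.jar   # trial run records loaded classes
java -XX:SharedArchiveFile=app.jsa -jar app.jar      # later runs reuse the archive
```

The dynamic AppCDS variant is usually the more useful one for an application like yours, because it also covers your own classes, not just the JDK's.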

The other way to improve startup times for a HotSpot JVM is (was) to use Ahead Of Time (AOT) compilation. Basically, you compile your application's classes to a native shared library using the jaotc command, and then tell the java command to load it (via -XX:AOTLibrary) so that those methods don't have to be JIT compiled at runtime. The jaotc command is experimental and was introduced in Java 9.

It appears that jaotc was not included in the Java 16 builds published by Oracle, and is scheduled for removal in Java 17. (See JEP 410: Remove the Experimental AOT and JIT Compiler).

The current recommended way to get AOT compilation for Java is to use the GraalVM AOT Java compiler.


1 - You could convert it into a client-server application where the server stays "up" all of the time. However, that has other problems, and it doesn't eliminate the startup time issue for the client ... assuming the client is coded in Java.
2 - According to @apangin, there are some other application tweaks that could make your code more JIT friendly, though it will depend on what your code is currently doing.
3 - It is conceivable that the startup time for a large (long running) monolithic application could be improved by refactoring it so that subsystems of the application can be loaded and initialized only when they are needed. However, it doesn't sound like this would work for your use-case.


6 Comments

"You probably cannot optimize startup time and performance by changing your application" - It's almost always possible, especially when you know that JITted code performs much better. There are techniques to write JIT-friendly code, but they are highly application dependent; it's hard to advise without seeing the actual code. E.g. one typical reason of slow application startup is computation in static initializer.
Yea ... but the 10-fold improvement after 5 runs sounds >to me< like the difference in performance of JITed versus interpreted code.
Right. The post I linked above shows exactly this kind of problem - that the JVM interprets code that should have been compiled. And it shows how to rewrite the application code to help the JIT do its work.
Another counterintuitive example of making code JIT-friendly is un-inlining, i.e. extracting the small methods out of long nested loops (shown in this answer).
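A minimal sketch of what the comment means by "un-inlining" (names are hypothetical): a very large method body can exceed HotSpot's compilation heuristics and stay interpreted, whereas a small extracted method is a compact unit the JIT can compile and inline early:

```java
class Checksums {
    // The outer loop method stays small; imagine many more statements
    // here in the "before" version, all written inline.
    static long sumAll(int[][] rows) {
        long sum = 0;
        for (int[] row : rows) {
            for (int v : row) {
                sum += mix(v);      // hot work extracted into a small method
            }
        }
        return sum;
    }

    // The extracted method is an easy JIT compilation/inlining candidate.
    static long mix(int v) {
        long x = v * 0x9E3779B97F4A7C15L;
        return x ^ (x >>> 32);
    }
}
```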
I suggest you write about all of this in another answer (to this question).

You could run the processing as a service: when you need it, "just" make a network call to that service (easiest over HTTP, since there are easy ways to do that in Java). That way, the processing stays in the same long-lived JVM and will eventually get faster once the JIT kicks in.

Of course, since it could require significant development, this is only worthwhile if the processing itself:

  • is called often
  • has arguments that are easy to pass to the service (usually serialized as strings)
  • has arguments that don't require passing too much data to the service (e.g. not several MB of binary content)
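A minimal sketch of such a service using only the JDK's built-in com.sun.net.httpserver (the class name, port, path, and the process method are all placeholders for your real logic):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Keeps the processing in one long-lived JVM behind HTTP, so code that the
// JIT has compiled is reused across requests instead of starting cold.
class ProcessingService {

    public static void main(String[] args) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
        server.createContext("/process", exchange -> {
            byte[] input = exchange.getRequestBody().readAllBytes();
            byte[] result = process(input);          // stays "hot" across calls
            exchange.sendResponseHeaders(200, result.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(result);
            }
        });
        server.start();
    }

    // Placeholder for the real in-memory processing.
    static byte[] process(byte[] input) {
        return new String(input, StandardCharsets.UTF_8)
                .toUpperCase()
                .getBytes(StandardCharsets.UTF_8);
    }
}
```

The CLI then shrinks to a thin client that POSTs the input to http://localhost:8080/process and prints the response.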

