
I was curious about the performance of creating Java 8 lambda instances compared to the equivalent anonymous classes (measured on a win32 Java build 1.8.0-ea-b106). I created a very simple example and measured whether Java applies any optimization to the new operator when creating a lambda expression:

static final int MEASURES = 1000000;
static interface ICallback{
    void payload(int[] a);
}
/**
* force creation of anonymous class many times
*/
static void measureAnonymousClass(){
    final int arr[] = {0};
    for(int i = 0; i < MEASURES; ++i){
        ICallback clb = new ICallback() {
            @Override
            public void payload(int[] a) {
                a[0]++;
            }
        };
        clb.payload(arr);
    }
}
/**
* force creation of lambda many times 
*/
static void measureLambda(){ 
    final int arr[] = {0};
    for(int i = 0; i < MEASURES; ++i){
        ICallback clb = (a2) -> {
            a2[0]++;
        };
        clb.payload(arr);
    }
}

(Full code can be found here: http://codepad.org/Iw0mkXhD) The result is rather predictable - the lambda wins by a factor of about 2.

But a really small change, turning the lambda into a closure by capturing a local variable, shows very bad times for the lambda: the anonymous class wins by a factor of 10! The anonymous class now looks like this:

ICallback clb = new ICallback() {
        @Override
        public void payload() {
            arr[0]++;
        }
    };

And the lambda looks as follows:

ICallback clb = () -> {
            arr[0]++;
        };

(Full code can be found here: http://codepad.org/XYd9Umty) Can anybody explain why there is such a big (bad) difference in the handling of closures?

  • That's a quite naïve approach to microbenchmarking. At the very least, use System.nanoTime and introduce throwaway executions to warm up the JVM. Several System.gc() calls between executions would also be a good idea. Ideally, do this with Google Caliper or Oracle jmh. Commented Sep 25, 2013 at 9:39
  • @MarkoTopolnik - actually, I foresaw this note; that is why I performed two measurements, one where measureLambda runs first and one where it runs after measureAnonymousClass - with no impact at all! And nanoTime can show a difference in precise measurements, but not when I'm talking about a factor of 10. Commented Sep 25, 2013 at 9:41
  • The accuracy of currentTimeMillis is often at the level of a tenth of a second (platform-dependent). The accuracy of nanoTime is typically at the level of a microsecond. Also, just reordering executions doesn't prove anything: each code path must be warmed up on its own. Warm-up executions are the way to do it, and garbage collection must be controlled for. Commented Sep 25, 2013 at 9:53
  • Maybe you are missing the point of my comments so far: it is about falsifying a number of standard hypotheses about the common sources of error when benchmarking on the JVM. Only once you have those solidly cleared can you enter a serious discussion of the results. Commented Sep 25, 2013 at 12:03
  • Note that, besides the fact that this benchmark is far from the intended use case, just specifying the -server option at JVM start makes the recorded overhead go away entirely. Commented Nov 21, 2013 at 14:40

1 Answer


UPDATE

A few comments wondered whether my benchmark at the bottom was flawed. After introducing a lot of randomness (to prevent the JIT from optimising too much away), I still get similar results, so I tend to think it is fine.

In the meantime, I have come across this presentation by the lambda implementation team. Page 16 shows some performance figures: inner classes and capturing lambdas (closures) have similar performance, while non-capturing lambdas are up to 5× faster.
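One plausible reason for that non-capturing advantage (an observed HotSpot behaviour, not something the language spec guarantees) is that a non-capturing lambda can be linked once and its single instance cached, whereas a capturing lambda generally needs a fresh instance on every evaluation. A minimal sketch, using an `ICallback` interface like the one in the question:

```java
public class LambdaCaching {

    interface ICallback {
        void payload(int[] a);
    }

    // Captures nothing: on HotSpot the invokedynamic call site
    // links to a single cached instance.
    static ICallback nonCapturing() {
        return (a) -> a[0]++;
    }

    // Captures `captured`: each evaluation must allocate a new
    // instance to hold the captured state.
    static ICallback capturing(int[] captured) {
        return (a) -> captured[0]++;
    }

    public static void main(String[] args) {
        // Same instance both times on HotSpot (implementation detail):
        System.out.println(nonCapturing() == nonCapturing());
        int[] arr = {0};
        // Distinct instances: the capture forces a per-evaluation allocation:
        System.out.println(capturing(arr) == capturing(arr));
        nonCapturing().payload(arr);
        System.out.println(arr[0]);
    }
}
```

On a current HotSpot JVM this prints true, false, 1: only the capturing variant pays an allocation per iteration, which lines up with the figures above.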

And @StuartMarks posted this JVMLS 2013 talk by Sergey Kuksenko on lambda performance. The bottom line is that, after JIT compilation, lambdas and anonymous classes perform similarly on current HotSpot JVM implementations.


YOUR BENCHMARK

I have also run your test, as you posted it. The problem is that it runs for as little as 20 ms for the first method and 2 ms for the second. Although that is a 10:1 ratio, it is in no way representative, because the measurement time is far too short.

I then modified your test to allow for more JIT warm-up, and I get similar results to jmh (i.e. no difference between the anonymous class and the lambda):

public class Main {

    static interface ICallback {
        void payload();
    }
    static void measureAnonymousClass() {
        final int arr[] = {0};
        ICallback clb = new ICallback() {
            @Override
            public void payload() {
                arr[0]++;
            }
        };
        clb.payload();
    }
    static void measureLambda() {
        final int arr[] = {0};
        ICallback clb = () -> {
            arr[0]++;
        };
        clb.payload();
    }
    static void runTimed(String message, Runnable act) {
        long start = System.nanoTime();
        for (int i = 0; i < 10_000_000; i++) {
            act.run();
        }
        long end = System.nanoTime();
        System.out.println(message + ":" + (end - start));
    }
    public static void main(String[] args) {
        runTimed("as lambdas", Main::measureLambda);
        runTimed("anonymous class", Main::measureAnonymousClass);
        runTimed("as lambdas", Main::measureLambda);
        runTimed("anonymous class", Main::measureAnonymousClass);
        runTimed("as lambdas", Main::measureLambda);
        runTimed("anonymous class", Main::measureAnonymousClass);
        runTimed("as lambdas", Main::measureLambda);
        runTimed("anonymous class", Main::measureAnonymousClass);
    }
}

The last run takes about 28 seconds for both methods.


JMH MICRO BENCHMARK

I have run the same test with jmh, and the bottom line is that all four methods take the same time as this equivalent baseline:

void baseline() {
    arr[0]++;
}

In other words, the JIT inlines both the anonymous class and the lambda and they take exactly the same time.

Results summary:

Benchmark                Mean    Mean error    Units
empty_method             1.104        0.043  nsec/op
baseline                 2.105        0.038  nsec/op
anonymousWithArgs        2.107        0.028  nsec/op
anonymousWithoutArgs     2.120        0.044  nsec/op
lambdaWithArgs           2.116        0.027  nsec/op
lambdaWithoutArgs        2.103        0.017  nsec/op
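Identical steady-state timings don't mean the two constructs are the same thing, though. An anonymous class is compiled to an ordinary nested class file, while a lambda's class is generated at runtime by LambdaMetafactory behind an invokedynamic call site. A small sketch showing the difference (the exact generated class names are HotSpot implementation details):

```java
public class LambdaVsAnon {

    interface ICallback {
        void payload(int[] a);
    }

    public static void main(String[] args) {
        ICallback lambda = (a) -> a[0]++;
        ICallback anon = new ICallback() {
            @Override
            public void payload(int[] a) {
                a[0]++;
            }
        };
        // The anonymous class is a regular compile-time nested class:
        System.out.println(anon.getClass().getName());
        // The lambda's class is spun at runtime; on HotSpot its name
        // contains "$$Lambda" and it is marked synthetic:
        System.out.println(lambda.getClass().getName().contains("$$Lambda"));
        System.out.println(lambda.getClass().isSynthetic());
    }
}
```

On HotSpot this prints LambdaVsAnon$1 for the anonymous class and true/true for the lambda checks, which is why the once-common "a lambda is just an anonymous inner class" description no longer matches the implementation.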

25 Comments

That result would imply that the JIT completely eliminated the allocation of the actual lambda/anonymous class instances. However, if the OP is getting different results, then I'd proceed as described: separate the allocation from the invocation and see if the discrepancy is still there.
Anyway, I'm pretty sure that the OP's large discrepancy is due to one code path having the advantage of escape analysis and the other going through the full dynamic allocation. I don't see anything else that would explain a factor of 10 or more.
Thanks for writing down your performance investigations. +1. The original (mostly prototype) JDK 8 implementation of lambda was exactly an anonymous inner class, to get something working early so that we could explore language and library evolution. This seems to have spawned a myth that lambdas are nothing more than anonymous inner classes. More recently the implementation has been optimized so that lambda is almost always faster than the "equivalent" anonymous inner class.
Also, there are two great JVM Language Summit talks from the performance guys. First, Alexey Shipilev (author of jmh) talks about benchmarking and its many pitfalls. (This was voted best talk at JVMLS this year.) Second, Sergey Kuksenko talks about what he's been doing to optimize lambda performance. [1] medianetwork.oracle.com/video/player/2630310904001 [2] medianetwork.oracle.com/video/player/2623576348001
@Tuntable Links updated. Thanks for mentioning this.
