Why, when the condition is a compile-time constant, is one branch enough for definite assignment of a variable, but not enough for returning from that branch?

Question

We know that if an if statement's condition is a compile-time constant expression (or uses a variable holding a compile-time constant), the compiler can resolve this constant expression, and:

public void ctc1() {

    final int x = 1;
    String text;

    if (x > 0) text = "some text";

    System.out.println(text); //compiles fine, as compile-time constant is resolved during compilation phase;

}

would compile and work fine, without the "variable might not have been initialized" compiler error. No initialisation of text in an else branch (or after the if) is required, because the compiler "understands" that x > 0 (which ends up being 1 > 0) always evaluates to true.

Why, however, does the same resolution not work (or work differently) when we want to return from the method, as in:

public int ctc2() {

    final int x = 1;

    if (x > 0) return 1;

    //requires another explicit "return" for any other condition, than x > 0

}

or, moreover, as:

public int ctc2() {

    final int x = 1;

    if (1 > 0) return 1;

}

?

Why can't the compiler infer the identical semantics here and be sure that the return is always executed, so that the code is OK to compile?

In the case of initialisation in a branch guarded by a compile-time constant, the compiler can resolve the constant value(s), and since it knows they will never change, it is sure the variable is going to be initialised. So it allows use of the variable after the if statement.

Why does resolving a constant expression work differently in the return case, though? What is the point of this difference/limitation?

To me, this looks like two logically identical constructs being treated differently.

I'm using:

openjdk version "11.0.2" 2019-01-15
OpenJDK Runtime Environment 18.9 (build 11.0.2+9)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.2+9, mixed mode)

Frankly, I'm surprised the first one compiles. The answer is going to be "because the Java Language Specification says so", are you looking for references to that, or something deeper (like what technical limitation in the compiler might allow one but not the other, or the authors' intent)?
– kaya3, Commented Aug 22, 2021 at 9:19
See these two great answers by Eric Lippert to understand why "why" questions like these are not good questions: 1, 2.
– Sweeper, Commented Aug 22, 2021 at 9:26
We are concentrating on the question being asked, we are trying to get you to make the question specific enough that it can be answered without wasting a lot of time producing an answer that is not what you want. Because as written, there are multiple different things you might want by asking this question, and each of them would be a lot of work to write.
– kaya3, Commented Aug 22, 2021 at 9:31
Consider that I could have spent 10 minutes writing an answer with references to the parts of the JLS which specify these behaviours, but instead I spent 10 seconds writing a comment to check whether such an answer would be what you were looking for, and eventually you clarified that it would not. I have proposed at least two possibilities: are you asking what technical limitation of the compiler might require it to be this way, are you asking what the intentions of the language designers were when designing it this way, or something else? The Eric Lippert links have more possibilities still.
– kaya3, Commented Aug 22, 2021 at 9:36
Nobody is asking you to clarify what language behaviour you are asking about, we are asking you to clarify what you mean by "why", because whenever a "why" question is answered directly, you can always keep asking "why" again to look for a deeper explanation. So how deep are you wanting to go with this?
– kaya3, Commented Aug 22, 2021 at 9:38

Andy Turner · Accepted Answer · 2021-08-22 14:39:25Z


This is a difference (mismatch? inconsistency?) between the rules around definite assignment and the particular normal completion rules of the if statement.

Specifically, the definite assignment rules say:

In all, there are four possibilities for a variable V after a statement or expression has been executed:

V is definitely assigned and is not definitely unassigned.

(The flow analysis rules prove that an assignment to V has occurred.)

...

The "flow analysis rules" are not clearly specified with regard to branch pruning, but it doesn't seem unreasonable to assume that the flow analysis is able to take constant values into account when deciding whether to follow a branch, meaning it is able to determine that only one of the four states is possible (definitely assigned after the if statement).
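To make the definite assignment side concrete, here is a minimal sketch. The class and method names are made up for illustration; only the shape of the if statement comes from the question. The first method is accepted because the condition is a constant expression, while the second needs an else branch because its condition depends on a parameter.

```java
// Hypothetical names; a sketch of the definite assignment behaviour discussed above.
final class DefiniteAssignmentSketch {

    // Condition is a constant expression (x is a final local initialised
    // with a constant), so javac treats text as definitely assigned.
    static String withConstantCondition() {
        final int x = 1;
        String text;
        if (x > 0) text = "some text";
        return text; // accepted without an else branch
    }

    // Condition depends on a parameter, so it is not a constant expression.
    static String withNonConstantCondition(int x) {
        String text;
        if (x > 0) text = "some text";
        else text = "fallback"; // without this branch javac reports:
                                // "variable text might not have been initialized"
        return text;
    }

    public static void main(String[] args) {
        System.out.println(withConstantCondition());
        System.out.println(withNonConstantCondition(-1));
    }
}
```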

However, the reachability rules for an if statement say that:

An if-then statement can complete normally iff it is reachable.

Nothing about the expression value or flow analysis here. It's perhaps worth pointing out that this is itself different to the reachability rules for while, do and basic for loops, which do explicitly mention the case of a constant true expression. Any of these returns would be accepted in ctc2():

while (true) return 1;
do { return 1; } while (true);
for (;;) return 1;
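As a rough side-by-side, the sketch below puts the three loop forms and the if form into complete methods. The class and method names and the distinct return values are assumptions added only so the methods can be told apart. The loop-based methods compile with no trailing return, whereas the if-based one needs it.

```java
// Hypothetical names; a sketch contrasting loop and if reachability rules.
final class ReachabilitySketch {

    // A while loop with a constant true condition and no break
    // cannot complete normally, so no trailing return is needed.
    static int viaWhile() {
        while (true) return 1;
    }

    static int viaDoWhile() {
        do { return 2; } while (true);
    }

    static int viaFor() {
        for (;;) return 3;
    }

    // The if-then rule ignores the value of the condition, so the
    // method body is considered able to complete normally.
    static int viaIf() {
        if (1 > 0) return 4;
        return -1; // removing this line gives "missing return statement"
    }

    public static void main(String[] args) {
        System.out.println(viaWhile() + viaDoWhile() + viaFor() + viaIf());
    }
}
```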

So, the language is specified in such a way that it overlooks the fact that your if statement cannot complete normally because of a) the constant expression, b) the return statement, despite that being "obvious" to a human reader.

An example of this difference actually being desirable (or, at least, the reachability rules being desirable) is if you have a DEBUG boolean (as in, a constant-valued flag to trigger debug-only behaviour). You can imagine a method something like:

if (!DEBUG) {
  return value;
}
return otherValue;

If the "conditional" return were treated in the same way as definite assignment, at least one of the return statements would be unreachable, and an unreachable statement is a compile-time error.

This would be a pain for debugging-time alternate behaviour like this.
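A self-contained version of that pattern might look like the following sketch. The class name, method name, and parameters are placeholders, and DEBUG is assumed to be a static final boolean constant. It compiles whichever value DEBUG has, precisely because the if rule does not look at the condition.

```java
// Hypothetical names; a sketch of the DEBUG-flag pattern described above.
final class DebugFlagSketch {

    // Compile-time constant; flip to true for debug-only behaviour.
    static final boolean DEBUG = false;

    static int pick(int value, int otherValue) {
        if (!DEBUG) {
            return value;
        }
        // Still treated as reachable, although !DEBUG is a constant.
        // Writing "while (!DEBUG) { return value; }" instead would make
        // javac reject this line with "unreachable statement".
        return otherValue;
    }

    public static void main(String[] args) {
        System.out.println(pick(1, 2));
    }
}
```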

Of course, one might argue that you could instead use something that isn't a compile-time constant, e.g. invoke a method. You can do that, but I would argue that disallowing the dirt-simplest approach is unnecessarily restrictive, for the sake of avoiding a pretty rare "head-scratcher" in code.

edited Aug 22, 2021 at 14:39

answered Aug 22, 2021 at 11:08


2 Comments

kaya3 Over a year ago

The rules for if statements don't mention anything about when the condition is a constant expression, so this section is also relevant, specifically "V is [un]assigned after any constant expression (§15.29) whose value is true when false.".

Giorgi Tsiklauri Over a year ago

Thanks, @Andy, for your answer. I'm a bit busy right now, and will definitely examine this answer a bit later.
