The Java compiler often produces a large number of error messages, even if the cause is a single error, such as an undeclared variable. Why does this compiler continue to process the source file after an error has been detected, rather than just stopping?
2 Answers
For large projects, compilation can be quite slow, so it saves the programmer time if the compiler reports multiple errors in one run, rather than forcing them to fix one error, recompile, fix another error, recompile, and so on.
It's true that often a single mistake can cause many compiler errors, but it usually doesn't hurt to report many errors at once even if there is only one "real" mistake in the code. And sometimes there will be more than one "real" mistake.
By default, javac gives up compiling after 100 errors. If you really want it to stop after one error, you can set the command-line argument -Xmaxerrs 1.
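To see the effect of -Xmaxerrs without leaving Java, here is a minimal sketch that compiles a deliberately broken source string through the standard javax.tools API, once with no options and once with -Xmaxerrs 1, and counts the reported errors. The names MaxErrsDemo, StringSource, countErrors and the Broken sample class are made up for illustration; it assumes a full JDK (not just a JRE) is running the program.

```java
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.net.URI;
import java.util.List;

public class MaxErrsDemo {
    // Hypothetical sample input: two undeclared variables, x and y.
    static final String SOURCE =
            "public class Broken {\n"
          + "    void m() {\n"
          + "        int a = x;\n"
          + "        int b = y;\n"
          + "    }\n"
          + "}\n";

    /** In-memory source file, so the demo needs no files on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Compiles the source with the given javac options and counts ERROR diagnostics. */
    static long countErrors(String source, List<String> options) {
        // Returns null when no compiler is available (for example on a JRE-only install).
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler; run this on a JDK");
        }
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        JavaCompiler.CompilationTask task = compiler.getTask(
                null, null, diagnostics, options, null,
                List.of(new StringSource("Broken", source)));
        task.call(); // returns false here, because the source does not compile
        return diagnostics.getDiagnostics().stream()
                .filter(d -> d.getKind() == Diagnostic.Kind.ERROR)
                .count();
    }

    public static void main(String[] args) {
        System.out.println("default options: " + countErrors(SOURCE, List.of()));
        System.out.println("-Xmaxerrs 1:     " + countErrors(SOURCE, List.of("-Xmaxerrs", "1")));
    }
}
```

On the command line the equivalent is simply javac -Xmaxerrs 1 Broken.java.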
For the vast majority of programmers this isn't an issue at all, because if you use an IDE then the errors reported by javac will be highlighted in the code editor, and you can hover over each highlight to see the error message for that part of the code. This makes it much more manageable to deal with a large number of error messages. It's rare that you would have to run javac on the command line and read those error messages directly from the console.
This is a fair question. Traditionally, furnishing succinct information about as many errors as possible has been a goal of compilers because compilation has been expensive: initially in machine time and later in human time waiting for the compiler.
It's arguable that hardware improvements have made compilation so fast that the importance of finding more than one error per compiler run is gone. That's language specific (e.g. Python type checking is currently quite slow), and it's not a debate for SO.
Reporting more than one error per compiler run is a hard problem. All the logic of a compiler is designed to exploit the strict correctness of the input in order to transform it to a correct output. When the input contains an error, the only choice available to the compiler engineer is to print the error, then use a heuristic algorithm to "guess" a fix that makes the input correct. In the compiler design literature, this is called error recovery when it is applied to errors in the parser, but semantic analysis needs similar techniques.
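As a rough illustration of parser error recovery, here is a toy sketch of one common heuristic, often called panic mode: report the error, then throw away tokens until a likely statement boundary (here a semicolon) and resume parsing from there. The grammar (statements of the form NAME = NUMBER ;), the class name PanicModeDemo and the message format are all invented for this example; this is not how javac is implemented.

```java
import java.util.ArrayList;
import java.util.List;

public class PanicModeDemo {
    // Each statement must be: NAME = NUMBER ;
    private static final String[] PATTERNS = {"[A-Za-z_]\\w*", "=", "\\d+", ";"};
    private static final String[] DESCRIPTIONS = {"a name", "'='", "a number", "';'"};

    /** Parses whitespace-separated tokens and returns one message per detected error. */
    static List<String> parse(String input) {
        String[] tokens = input.trim().split("\\s+");
        List<String> errors = new ArrayList<>();
        int pos = 0;
        while (pos < tokens.length) {
            boolean ok = true;
            for (int i = 0; i < PATTERNS.length && ok; i++) {
                if (pos < tokens.length && tokens[pos].matches(PATTERNS[i])) {
                    pos++;
                } else {
                    String found = pos < tokens.length ? "'" + tokens[pos] + "'" : "end of input";
                    errors.add("token " + pos + ": expected " + DESCRIPTIONS[i]
                            + " but found " + found);
                    ok = false;
                }
            }
            if (!ok) {
                // Panic mode: discard tokens up to and including the next ';',
                // then carry on as if a fresh statement starts there.
                while (pos < tokens.length && !tokens[pos].equals(";")) {
                    pos++;
                }
                pos++;
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // Two of the four statements are malformed; both are reported in one run.
        parse("x = 1 ; y 2 ; z = ; w = 4 ;").forEach(System.out::println);
    }
}
```

The guess is crude: if a semicolon itself is missing, the recovery swallows the whole following statement, so any mistake in that statement goes unreported until the next run.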
Lots can go wrong with error recovery. A couple of examples:
The guess doesn't correct the mistake. The compiler tries again and "discovers" a new error and repeats the process. This is what causes several messages for one error, known as an error cascade.
The guess fixes the immediate problem but creates one elsewhere. For example, a type can be guessed for an undeclared variable, which later causes a type error because the guess wasn't the programmer's intended type.
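The second failure mode can be sketched with a toy checker. It assumes a made-up recovery rule, "treat every undeclared variable as int", and shows how that one guess turns a single missing declaration into several messages. The names GuessedTypeDemo, Use and check are hypothetical, and the messages only imitate javac's wording.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GuessedTypeDemo {
    /** One use of a variable: its name and the type the surrounding code requires. */
    static final class Use {
        final String name;
        final String requiredType;

        Use(String name, String requiredType) {
            this.name = name;
            this.requiredType = requiredType;
        }
    }

    /** Checks every use against the declared types and returns the error messages. */
    static List<String> check(Map<String, String> declared, List<Use> uses) {
        Map<String, String> types = new HashMap<>(declared);
        List<String> errors = new ArrayList<>();
        for (Use use : uses) {
            String type = types.get(use.name);
            if (type == null) {
                errors.add("cannot find symbol: " + use.name);
                type = "int"; // assumed recovery heuristic: guess a type and keep going
                types.put(use.name, type);
            }
            if (!type.equals(use.requiredType)) {
                errors.add("incompatible types: " + use.name + " is " + type
                        + " but " + use.requiredType + " is required");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // The only real mistake: 'msg' was never declared (it should be a String).
        List<Use> uses = List.of(new Use("msg", "String"), new Use("msg", "String"));
        check(Map.of(), uses).forEach(System.out::println);
    }
}
```

Only the first message points at the real mistake; the later ones are artifacts of the wrong guess.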
The upshot is that designers tend to focus on emitting at least one good message per error and not too many useless error cascades. At some point the heuristics work well enough that time is better spent on other things. Given that a stable compiler is likely to be a fairly old program, error behavior tends to be tailored to the development norms of quite a few years back.
2 Comments
javac, is missing the actual problem during the recovery.
-Xmaxerrs to control this.
javac manages to spit out one hundred error messages about generic signature mismatches or the infamous “non-static method cannot be referenced from a static context” error, but forgets to mention that it didn’t find the variable on which the entire construct is invoked. Sure, if the problem is two missing variables and the compiler manages to report both as missing, it would be great, but unfortunately, that’s not what happens. You can spend several minutes trying to fix reported generic signature mismatches, until you realize the real problem is a missing comma…