
I’m working on a DSL parser using Java CUP, and I’m getting this error when trying to compile my .cup grammar:

Error: Syntax error @ Symbol: PARSER (unknown:19/-5(-1) - unknown:19/1(-1))
Error : Internal error: Unexpected exception
Exception in thread "main" java.lang.NullPointerException
        at java_cup.runtime.lr_parser.symbl_name_from_id(lr_parser.java:456)
        at java_cup.runtime.lr_parser.report_expected_token_ids(lr_parser.java:446)
        at java_cup.runtime.lr_parser.syntax_error(lr_parser.java:433)
        at java_cup.runtime.lr_parser.parse(lr_parser.java:725)
        at java_cup.Main.parse_grammar_spec(Main.java:496)
        at java_cup.Main.main(Main.java:196)

This happens during grammar compilation, not at runtime.
CUP fails to report the real syntax error and triggers an NPE inside symbl_name_from_id, which makes it harder to locate the issue.

Below is the exact .cup grammar I’m compiling:

package edu.uelbosque.dsl.parser;

import java_cup.runtime.*;
import edu.uelbosque.dsl.ast.*;

terminal KW_PROTOCOLO, KW_META, KW_INCLUSION, KW_EXCLUSION, KW_OBJETIVOS, KW_PASOS,
         KW_NOMBRE, KW_CUANDO, KW_ACCIONES, KW_MEDIDA, KW_EDUCACION, KW_TAREA,
         KW_ORDEN_MED, KW_SEGURIDAD, KW_SEGUIMIENTO, KW_AJUSTE_DOSIS, KW_AGREGAR_MED,
         KW_LABS, KW_ALERTAS_GLOBALES, KW_AUTOR, KW_VERSION, KW_VIGENCIA, KW_EVIDENCIA, KW_INTERACCION,
         LBRACE, RBRACE, LSQUARE, RSQUARE, OPEN_BRACKET, CLOSE_BRACKET, COLON, COMMA,
         DOT, ARROW, EQ, LT, GT, LE, GE, AND, OR, STRING, NUMBER, BOOLEAN, IDENT;

non terminal Document, Protocol, ProtocolBody, ProtocolItem, Meta, Inclusion, Exclusion, Objectives, Steps,
             StepList, Step, StepBody, StepTail, ActionList, Actions, Action, Measure, Education, Task,
             OrderMed, OrderMedTail, FollowUp, AdjustDose, AddMed, Labs, GlobalAlerts, AlertList, Alert,
             Obj, Pairs, Pair, Key, Value, List, Values, StringList, Strings, CondList, Exprs, Expr,
             OrExpr, AndExpr, RelExpr, Primary, CallOrIdent, CallTail, IdentChain, ArgList;

parser code {:
  SymbolTable st = new SymbolTable();
:};


start with Document;

Document ::= Protocol EOF {: RESULT = new Document($1); :} ;

Protocol ::= KW_PROTOCOLO STRING LBRACE ProtocolBody RBRACE {: RESULT = new Protocol($2, $4); :} ;

ProtocolBody ::= ProtocolItem ProtocolBody {: RESULT = $2.prepend($1); :}
               | /* empty */ {: RESULT = new ProtocolBody(); :} ;

ProtocolItem ::= Meta | Inclusion | Exclusion | Objectives | Steps | GlobalAlerts ;

Meta ::= KW_META COLON Obj {: RESULT = new Meta($3); :} ;

Inclusion ::= KW_INCLUSION COLON CondList {: RESULT = new Inclusion($3); :} ;

Exclusion ::= KW_EXCLUSION COLON CondList {: RESULT = new Exclusion($3); :} ;

Objectives ::= KW_OBJETIVOS COLON StringList {: RESULT = new Objectives($3); :} ;

Steps ::= KW_PASOS COLON LSQUARE StepList RSQUARE {: RESULT = new Steps($4); :}
        | KW_PASOS COLON LSQUARE RSQUARE {: RESULT = new Steps(null); :} ;

StepList ::= Step {: RESULT = new StepList($1); :}
           | Step COMMA StepList {: RESULT = $3.prepend($1); :} ;

Step ::= LBRACE StepBody RBRACE {: RESULT = new Step($2); :} ;

StepBody ::= KW_NOMBRE COLON STRING StepTail {: RESULT = new StepBody($3, $4); :} ;

StepTail ::= COMMA KW_CUANDO COLON CondList COMMA KW_ACCIONES COLON ActionList {: RESULT = new StepTail($4, $8); :}
           | COMMA KW_ACCIONES COLON ActionList {: RESULT = new StepTail(null, $4); :} ;

ActionList ::= LSQUARE Actions RSQUARE {: RESULT = new ActionList($2); :}
             | LSQUARE RSQUARE {: RESULT = new ActionList(null); :} ;

Actions ::= Action {: RESULT = new Actions($1); :}
          | Action COMMA Actions {: RESULT = $3.prepend($1); :} ;

Action ::= Measure | Education | Task | OrderMed | FollowUp | AdjustDose | AddMed | Labs ;

Measure ::= LBRACE KW_MEDIDA COLON STRING RBRACE {: RESULT = new Measure($4); :} ;

Education ::= LBRACE KW_EDUCACION COLON STRING RBRACE {: RESULT = new Education($4); :} ;

Task ::= LBRACE KW_TAREA COLON STRING RBRACE {: RESULT = new Task($4); :} ;

OrderMed ::= LBRACE KW_ORDEN_MED COLON Obj OrderMedTail RBRACE {: RESULT = new OrderMed($4, $5); :}
           | LBRACE KW_ORDEN_MED COLON Obj RBRACE {: RESULT = new OrderMed($4, null); :} ;

OrderMedTail ::= COMMA KW_SEGURIDAD COLON CondList {: RESULT = new OrderMedTail($4); :} ;

FollowUp ::= LBRACE KW_SEGUIMIENTO COLON Obj RBRACE {: RESULT = new FollowUp($4); :} ;

AdjustDose ::= LBRACE KW_AJUSTE_DOSIS COLON Obj RBRACE {: RESULT = new AdjustDose($4); :} ;

AddMed ::= LBRACE KW_AGREGAR_MED COLON Obj RBRACE {: RESULT = new AddMed($4); :} ;

Labs ::= LBRACE KW_LABS COLON StringList RBRACE {: RESULT = new Labs($4); :} ;

GlobalAlerts ::= KW_ALERTAS_GLOBALES COLON LSQUARE AlertList RSQUARE {: RESULT = new GlobalAlerts($4); :}
               | KW_ALERTAS_GLOBALES COLON LSQUARE RSQUARE {: RESULT = new GlobalAlerts(null); :} ;

AlertList ::= Alert {: RESULT = new AlertList($1); :}
            | Alert COMMA AlertList {: RESULT = $3.prepend($1); :} ;

Alert ::= Expr ARROW STRING {: RESULT = new Alert($1, $3); :}
        | KW_INTERACCION COLON STRING ARROW STRING {: RESULT = new AlertInteraction($3, $5); :} ;

Obj ::= LBRACE Pairs RBRACE {: RESULT = new Obj($2); :}
     | LBRACE RBRACE {: RESULT = new Obj(null); :} ;

Pairs ::= Pair {: RESULT = new Pairs($1); :}
       | Pair COMMA Pairs {: RESULT = $3.prepend($1); :} ;

Pair ::= Key COLON Value {: RESULT = new Pair($1, $3); :} ;

Key ::= IDENT {: RESULT = new Key($1); :}
     | KW_AUTOR {: RESULT = new Key($1); :}
     | KW_VERSION {: RESULT = new Key($1); :}
     | KW_VIGENCIA {: RESULT = new Key($1); :}
     | KW_EVIDENCIA {: RESULT = new Key($1); :} ;

Value ::= STRING {: RESULT = new Value($1); :}
       | NUMBER {: RESULT = new Value($1); :}
       | BOOLEAN {: RESULT = new Value($1); :}
       | Obj {: RESULT = new Value($1); :}
       | List {: RESULT = new Value($1); :} ;

List ::= LSQUARE Values RSQUARE {: RESULT = new List($2); :}
      | LSQUARE RSQUARE {: RESULT = new List(null); :} ;

Values ::= Value {: RESULT = new Values($1); :}
        | Value COMMA Values {: RESULT = $3.prepend($1); :} ;

StringList ::= LSQUARE Strings RSQUARE {: RESULT = new StringList($2); :}
            | LSQUARE RSQUARE {: RESULT = new StringList(null); :} ;

Strings ::= STRING {: RESULT = new Strings($1); :}
         | STRING COMMA Strings {: RESULT = $3.prepend($1); :} ;

CondList ::= LSQUARE Exprs RSQUARE {: RESULT = new CondList($2); :}
           | LSQUARE RSQUARE {: RESULT = new CondList(null); :} ;

Exprs ::= Expr {: RESULT = new Exprs($1); :}
       | Expr COMMA Exprs {: RESULT = $3.prepend($1); :} ;

Expr ::= OrExpr {: RESULT = $1; :} ;

OrExpr ::= AndExpr {: RESULT = $1; :}
        | AndExpr OR OrExpr {: RESULT = new OrExpr($1, $3); :} ;

AndExpr ::= RelExpr {: RESULT = $1; :}
         | RelExpr AND AndExpr {: RESULT = new AndExpr($1, $3); :} ;

RelExpr ::= Primary {: RESULT = $1; :}
         | Primary EQ Primary {: RESULT = new RelExpr($1, "==", $3); :}
         | Primary LT Primary {: RESULT = new RelExpr($1, "<", $3); :}
         | Primary GT Primary {: RESULT = new RelExpr($1, ">", $3); :}
         | Primary LE Primary {: RESULT = new RelExpr($1, "<=", $3); :}
         | Primary GE Primary {: RESULT = new RelExpr($1, ">=", $3); :} ;

Primary ::= NUMBER {: RESULT = new Primary($1); :}
         | STRING {: RESULT = new Primary($1); :}
         | BOOLEAN {: RESULT = new Primary($1); :}
         | CallOrIdent {: RESULT = $1; :}
         | OPEN_BRACKET Expr CLOSE_BRACKET {: RESULT = $2; :} ;

CallOrIdent ::= IdentChain CallTail {: RESULT = new CallOrIdent($1, $2); :}
             | IdentChain {: RESULT = new CallOrIdent($1, null); :} ;

CallTail ::= OPEN_BRACKET ArgList CLOSE_BRACKET {: RESULT = new CallTail($2); :}
          | OPEN_BRACKET CLOSE_BRACKET {: RESULT = new CallTail(null); :} ;

IdentChain ::= IDENT {: RESULT = new IdentChain($1); :}
            | IDENT DOT IdentChain {: RESULT = $3.prepend($1); :} ;

ArgList ::= Expr {: RESULT = new ArgList($1); :}
         | Expr COMMA ArgList {: RESULT = $3.prepend($1); :} ;

What I’ve tried:

  1. Verified that all terminals and non-terminals are declared.

  2. Checked that all {: :} action blocks are balanced.

  3. Checked for missing semicolons or commas in production lists.

  4. Tried running CUP with -expect 1 to get better messages.

  5. Renamed some nonterminals (e.g., List, Document) to avoid collisions, but the error persists.

What kind of grammar issue causes CUP to emit:

  • Syntax error @ Symbol: PARSER

  • followed by a NullPointerException in symbl_name_from_id?

How can I locate the real syntax problem inside this grammar?

Any guidance on how to debug or a pointer to the likely broken production would be greatly appreciated.

1 Answer

You have a code part (parser code {: ... :};) after the lists of terminal and non-terminal symbols. Try moving it so that it comes before those lists.

The Java CUP grammar is itself parsed by a parser generated with Java CUP. I found a copy of that grammar at https://github.com/ultimate-pa/javacup/blob/master/cup/parser.cup . It may be for a later version of CUP, but the top-level production rule at line 231 lists code parts before symbols.

Java CUP is very old. I'm not sure which version you are using, but I used version 10k, the only version I could find on Maven, and that is 15 years old. I didn't get an NPE and can't explain why you did, but CUP 10k did at least reject your grammar with a message similar to yours. After I moved the code part before the lists of symbols (and also removed EOF from one of your production rules, although I'm not sure why that was necessary), CUP generated the parser.
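As a rough sketch of the two changes described above, the top of the specification could be reordered as follows. The symbol lists are abbreviated with comments and stand for the full declarations from the question; the action block of the Document rule is left out here and would stay as it was. This only illustrates the ordering and is not a complete grammar.

```
package edu.uelbosque.dsl.parser;

import java_cup.runtime.*;
import edu.uelbosque.dsl.ast.*;

/* Code parts come right after the imports, before any symbol declarations. */
parser code {:
  SymbolTable st = new SymbolTable();
:};

/* Symbol lists follow the code parts (abbreviated; use the full lists from the question). */
terminal KW_PROTOCOLO, /* ... */ IDENT;

non terminal Document, Protocol, /* ... */ ArgList;

start with Document;

/* EOF is no longer written in the start rule; the action block is omitted in this sketch. */
Document ::= Protocol ;
```

The remaining productions would follow unchanged after the Document rule.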


4 Comments

Appreciate the time you took for this! I'm using 11b, which is the latest version on the website. Strangely, it works for an example I got from a friend, and I tried to copy that example for my grammar, but it doesn't work and I don't know why. EOF appears to be part of a production rule that CUP generates automatically, so it ends up duplicated. I tried the version you linked in the answer, but I'm not sure how to use it. Should I run it with the same command I used for 11b? It just reports that there is no manifest attribute in java-cup-10k.jar.
The 10k version of CUP did complain about a missing Main-Class attribute in the MANIFEST.MF file, but I worked around that by running the CUP main class directly, using java -cp java-cup-10k.jar java_cup.Main grammar.cup. I wasn't too sure about the EOF thing: CUP complained that the symbol EOF wasn't declared, but when I tried to declare it, it complained about a duplicate terminal. Anyway, I've been able to reproduce the NPE with version 11b, so maybe I'll look into fixing it.
I doubt I'll bother looking into fixing the NPE. As I've already written, CUP is very old. The NPE is thrown from code added to 11b that isn't in 10k, in what appears to be an incomplete attempt to list what tokens the parser was expecting at the point it encountered a syntax error.
Yay, just figured it out! Thank you so much! As you said, it's weird that these little changes made it work, but hey, it's what I needed!
