I have a very simple grammar for Antlr4:

grammar settings;

query
    : COLUMN OPERATOR (SETTING|SCALAR)
    ;

COLUMN
    : [a-z_]+
    ;

OPERATOR
    : ('='|'>'|'<')
    ;

SETTING
    : 'setting(' [a-z_]+ ')'
    ;

SCALAR
    : [a-z_]+
    ;

Given an input string such as total_sales>setting(min_total_sales) (a database column name, an operator and a value), I would like to determine which part is the column name, which is the operator and which is the value. I wrote the following Python code for that:

import re

from antlr4 import InputStream, CommonTokenStream

from settingsLexer import settingsLexer
from settingsParser import settingsParser

settings = {
    'min_total_sales': 1000
}

conditions = 'total_sales>setting(min_total_sales)'

lexer = settingsLexer(InputStream(conditions))
stream = CommonTokenStream(lexer)
parser = settingsParser(stream)
tree = parser.query()

regex = re.compile(r'^setting\((?P<setting_name>[a-z_]+)\)$')

column = None
operator = None
value = None

for child in tree.getChildren():
    text = child.getText()

    # how to match what is child: column or operator or value???

    # this for value defining
    if match := regex.match(text):
        setting_name = match.group('setting_name')
        print(f'We should get value from setting named `{setting_name}`')
        min_total_sales = settings[setting_name]
    else:
        print(f'We got a simple scalar value: {text}')
        min_total_sales = int(text)

How can I tell whether a child is the column name, the operator or the value?

1 Answer

Why are you involving regex? Once the input has been parsed, each context object in the tree has accessor methods for the tokens and rules that appear in the parser rule it matched. So, the object returned by parser.query(), which corresponds to the parser rule:

query
    : COLUMN OPERATOR (SETTING|SCALAR)
    ;

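will have 4 methods: COLUMN(), OPERATOR(), SETTING() and SCALAR(). Each returns the matching terminal node, or None if that token is not present in the parsed input.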

Use them to extract the data you want:

tree = parser.query()

column = tree.COLUMN()
operator = tree.OPERATOR()
setting = tree.SETTING()

print(f"column={column}, operator={operator}, setting={setting}")
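Turning those nodes into an actual value could look roughly like the sketch below. It assumes the original grammar, where the SETTING token text has the form setting(name). The helper names node_text and resolve_value are hypothetical and not part of the ANTLR runtime; they work on plain strings so the lookup logic can be tried without a generated parser.

```python
from typing import Optional, Union


def node_text(node) -> Optional[str]:
    """Return the text of a terminal node, or None if the token was absent.

    With the generated parser this would be called as node_text(tree.SETTING()).
    """
    return None if node is None else node.getText()


def resolve_value(setting_text: Optional[str],
                  scalar_text: Optional[str],
                  settings: dict) -> Union[int, str]:
    """Pick the value of a query from whichever alternative was matched."""
    if setting_text is not None:
        # Assumed token shape: 'setting(<name>)', so strip the wrapper.
        name = setting_text[len("setting("):-1]
        return settings[name]
    if scalar_text is not None:
        return scalar_text
    raise ValueError("query has neither a SETTING nor a SCALAR value")


settings = {"min_total_sales": 1000}
value = resolve_value("setting(min_total_sales)", None, settings)
print(f"value={value}")
```

With a real parse tree the call would presumably be resolve_value(node_text(tree.SETTING()), node_text(tree.SCALAR()), settings).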

Also, I would not glue setting and min_total_sales together into one big token; let the parser combine them instead. Otherwise input like total_sales>setting ( min_total_sales ) will not be matched because of the spaces.

grammar settings;

query
    : COLUMN OPERATOR value EOF
    ;

value
    : setting
    | SCALAR
    ;

setting
    : SETTING '(' SCALAR ')'
    ;

COLUMN
    : [a-z_]+
    ;

OPERATOR
    : ('='|'>'|'<')
    ;

SETTING
    : 'setting'
    ;

SCALAR
    : [a-z_]+
    ;

SPACES
    : [ \t\r\n] -> skip
    ;

2 Comments

Oh, thanks very much. Could you please say whether that worked for you? I'm getting an error (Python 3): mismatched input 'setting' expecting {'setting', SCALAR}. I tried different inputs: "total_sales>setting(min_total_sales)", "total_sales>val", etc. The error message is always the same.
The code snippet I posted works with your grammar. The code will not work with my suggested grammar changes.
