4

I'd like a library that can take the string representation of a regexp and convert that into a syntax tree for easy programmatic manipulation. Something that would transform:

(\s?)bla[a-z]

into something like:

PARENTHESIS
  CHAR:SPACE
    WILD
WORD:bla
CHAR:a-z

2 Answers

2

Looks like what you're looking for is a syntax parser, right?

I would take a look at ANTLR (http://www.antlr.org/): you can create grammars, and it will generate a parser that builds a syntax tree which you can walk, translate, etc.

7 Comments

I am looking for a syntax parser FOR regular expressions, not WITH regular expressions. I modified the question to make that clearer.
And there are some regex grammars available.
@jp Nobody said "with", it has some regex grammars.
@DaveNewton Yes, I could do with a regex grammar that I can pass to JavaCC or ANTLR. But it's hard to search for one, as the search results are all about using regexes in ANTLR or JavaCC...
@jp There's a page with all the antlr grammars--maybe look there.
0

Parboiled looks like a good choice for what you want to make.

It lets you write grammars directly in Java code, which is considerably easier than with ANTLR or JavaCC.

Sample:

Rule Digit()
{
    return CharRange('0', '9');
}

Rule Integer()
{
    return OneOrMore(Digit());
}

Rule WhiteSpace()
{
    return ZeroOrMore(AnyOf(" \t"));
}

Rule NToMQuantifier()
{
    // Matches {n}, {n,} and {n,m}; the comma separates the two bounds
    return Sequence(
        '{',
        WhiteSpace(),
        Integer(),
        Optional(
            WhiteSpace(),
            ',',
            WhiteSpace(),
            Optional(Integer())
        ),
        '}'
    );
}

Rule OtherQuantifiers()
{
    return Sequence(AnyOf("+?*"), Optional(AnyOf("+?")));
}

Rule Quantifier()
{
    return FirstOf(OtherQuantifiers(), NToMQuantifier());
}
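
To make the idea concrete without pulling in any library, here is a minimal hand-written sketch in plain Java that turns a small regex subset into a tree. The names Node and RegexTreeSketch and the node kinds (REGEX, GROUP, QUANTIFIER, ESCAPE, CHAR, CLASS) are made up for illustration; they are not part of Parboiled or ANTLR. It only handles groups, bracket classes (kept as raw text), backslash escapes, literal characters, and the +, ? and * quantifiers with an optional lazy/possessive suffix as in OtherQuantifiers() above. Alternation, anchors and {n,m} are assumed out of scope.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tree node: a kind, an optional text payload, and children. */
final class Node {
    final String kind;
    final String text;
    final List<Node> children = new ArrayList<>();

    Node(String kind, String text) { this.kind = kind; this.text = text; }

    Node add(Node child) { children.add(child); return this; }

    /** Renders the tree one node per line, indented two spaces per level. */
    String dump() {
        StringBuilder sb = new StringBuilder();
        dump(sb, 0);
        return sb.toString();
    }

    private void dump(StringBuilder sb, int depth) {
        for (int i = 0; i < depth; i++) sb.append("  ");
        sb.append(kind);
        if (text != null) sb.append(':').append(text);
        sb.append('\n');
        for (Node c : children) c.dump(sb, depth + 1);
    }
}

/** Recursive descent parser for a small, assumed regex subset. */
final class RegexTreeSketch {
    private final String src;
    private int pos = 0;

    private RegexTreeSketch(String src) { this.src = src; }

    static Node parse(String regex) {
        RegexTreeSketch p = new RegexTreeSketch(regex);
        Node root = p.sequence("REGEX");
        if (p.pos != regex.length()) {
            throw new IllegalArgumentException("Unexpected ')' at " + p.pos);
        }
        return root;
    }

    // sequence := (atom quantifier?)*   stops at ')' or end of input
    private Node sequence(String kind) {
        Node seq = new Node(kind, null);
        while (pos < src.length() && src.charAt(pos) != ')') {
            seq.add(quantified(atom()));
        }
        return seq;
    }

    // atom := '(' sequence ')' | '[' raw-text ']' | '\' char | literal
    private Node atom() {
        char c = src.charAt(pos++);
        switch (c) {
            case '(': {
                Node group = sequence("GROUP");
                expect(')');
                return group;
            }
            case '[': {
                int end = src.indexOf(']', pos);
                if (end < 0) throw new IllegalArgumentException("Unclosed '['");
                Node cls = new Node("CLASS", src.substring(pos, end));
                pos = end + 1;
                return cls;
            }
            case '\\':
                if (pos >= src.length()) throw new IllegalArgumentException("Dangling '\\'");
                return new Node("ESCAPE", String.valueOf(src.charAt(pos++)));
            case '*': case '+': case '?':
                throw new IllegalArgumentException("Nothing to repeat at " + (pos - 1));
            default:
                return new Node("CHAR", String.valueOf(c));
        }
    }

    // quantifier := [+?*] [+?]?   wraps the preceding atom when present
    private Node quantified(Node atom) {
        int start = pos;
        if (pos < src.length() && "+?*".indexOf(src.charAt(pos)) >= 0) {
            pos++;
            if (pos < src.length() && "+?".indexOf(src.charAt(pos)) >= 0) pos++;
            return new Node("QUANTIFIER", src.substring(start, pos)).add(atom);
        }
        return atom;
    }

    private void expect(char c) {
        if (pos >= src.length() || src.charAt(pos) != c) {
            throw new IllegalArgumentException("Expected '" + c + "' at " + pos);
        }
        pos++;
    }
}
```

Calling RegexTreeSketch.parse("(\\s?)bla[a-z]").dump() yields a REGEX root with a GROUP (holding a QUANTIFIER:? around ESCAPE:s), three CHAR nodes for b, l and a, and a CLASS:a-z node. Literals are kept one node per character because a quantifier binds only to the character right before it; merging runs into a single WORD node, as in the question, would be a separate post-processing pass.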
