6

As part of a software package I'm working on, I need to implement a parser for application-specific text files. I've already specified the grammar for these files on paper, but am having a hard time translating it into easily readable/updatable code (right now it just passes each line through a huge number of switch statements).

So, are there any good design patterns for implementing a parser in a Java-style OO environment?

    Parsers are one of the areas where a functional style really shines. Parser combinator libraries are among the most expressive ones out there. Have a look at Haskell's Parsec or Boost.Spirit. Commented Jan 23, 2012 at 23:12

3 Answers

7

An easy way to break a massive switch into an OO design would be to have one class per token type:

pseudo code

class XTokenType {
     public bool isToken(string data);
}

class TokenParse {
     public void parseTokens(string data) {
          for each step in data {
               for each tokenType in tokenTypes {
                    if (tokenType.isToken(step)) {
                         parsedTokens[len] = new tokenType(step);
                    }
                    ...
               }
          }
          ...
     }
}

Here you're breaking each switch case into a method on that token type object, which detects whether the next bit of the string is of that token type.

Previously:

class TokenParse {
     public void parseTokens(string data) {
          for each step in data {
               switch (step) {
                    case x: 
                         ...
                    case y:
                         ...
                    ...
               }
          }
          ...
     }
}
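For comparison, here is a minimal Java sketch of the token-type version. It is only an illustration under some assumptions: the names `TokenType`, `RegexTokenType`, `Token` and `Tokenizer` are made up for this example, and recognizing tokens with regular expressions is one possible choice, not something the pseudo code prescribes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One token type: knows how to recognize itself at a given position. */
interface TokenType {
    String name();

    /** Length of the match starting at pos, or -1 if this type does not match there. */
    int match(String data, int pos);
}

/** Hypothetical token type backed by a regular expression. */
final class RegexTokenType implements TokenType {
    private final String name;
    private final Pattern pattern;

    RegexTokenType(String name, String regex) {
        this.name = name;
        this.pattern = Pattern.compile(regex);
    }

    public String name() { return name; }

    public int match(String data, int pos) {
        Matcher m = pattern.matcher(data);
        m.region(pos, data.length());
        // lookingAt() only succeeds if the match begins at the region start.
        return m.lookingAt() ? m.end() - pos : -1;
    }
}

/** A recognized token: the type that matched and the text it covered. */
final class Token {
    final String type;
    final String text;

    Token(String type, String text) { this.type = type; this.text = text; }

    @Override public String toString() { return type + "(" + text + ")"; }
}

/** Replaces the big switch: asks each token type in turn whether it matches. */
final class Tokenizer {
    private final List<? extends TokenType> tokenTypes;

    Tokenizer(List<? extends TokenType> tokenTypes) { this.tokenTypes = tokenTypes; }

    List<Token> parseTokens(String data) {
        List<Token> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < data.length()) {
            int matched = -1;
            for (TokenType type : tokenTypes) {
                int length = type.match(data, pos);
                if (length > 0) { // ignore empty matches so the loop always advances
                    tokens.add(new Token(type.name(), data.substring(pos, pos + length)));
                    matched = length;
                    break;
                }
            }
            if (matched < 0) {
                throw new IllegalArgumentException("Unexpected input at position " + pos);
            }
            pos += matched;
        }
        return tokens;
    }
}
```

Adding a new kind of token then means adding one more `TokenType` to the list passed to `Tokenizer`, rather than another `case` in the switch. The order of the list decides which type wins when several could match.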

3 Comments

Just what I was looking for :)
@zergylord if you're a GoF fan, this may be called the Command pattern, but that might be a lie
This isn't object oriented, it's wrapping iterative programming in objects. "isToken" is a method on what object? The XTokenType? Sorry, but if it "is a token", shouldn't it have the class of a token, as classes represent the "is-a" relationship? Additionally, "TokenParse" is an action, not a class. If the program was a collection of actions, then I'd be OK with the gerund of treating an action like a noun, but this is the only one. This is iterative design, wrapped in object clothing.
1

One suggestion is to create a properties file in which you define the rules. Load it at run time and use an if/else chain (since switch statements do much the same thing internally). This way, if you want to change some parsing rules, you only have to change the .properties file, not the code. :)
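A rough Java sketch of that idea, with some assumptions spelled out: each property is taken to be `ruleName=regex`, the `RuleSet` class and its `classify` method are hypothetical names, and rules are tried in alphabetical order of their names because `java.util.Properties` does not preserve file order.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;
import java.util.regex.Pattern;

/** Hypothetical rule table loaded from a .properties source at run time. */
final class RuleSet {
    // TreeMap keeps rules sorted by name, so matching order is deterministic.
    private final Map<String, Pattern> rules = new TreeMap<>();

    /** Reads "ruleName=regex" entries; the regex format is an assumption of this sketch. */
    static RuleSet load(Reader source) throws IOException {
        Properties props = new Properties();
        props.load(source);
        RuleSet set = new RuleSet();
        for (String name : props.stringPropertyNames()) {
            set.rules.put(name, Pattern.compile(props.getProperty(name)));
        }
        return set;
    }

    /** Name of the first rule whose pattern matches the whole line, or null if none does. */
    String classify(String line) {
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            if (rule.getValue().matcher(line).matches()) {
                return rule.getKey();
            }
        }
        return null;
    }
}
```

In real use the `Reader` would typically be a `FileReader` over the rules file. Note that backslashes are escape characters in the properties format, so a regex such as `\d+` would have to be written as `\\d+` in the file.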


0

You need to learn how to express context-free grammars. You should be thinking about the GoF Interpreter pattern and parser generators like Bison, ANTLR, lex/yacc, etc.
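To make the Interpreter idea concrete, here is a small Java sketch. The grammar is a stand-in chosen for the example (simple arithmetic), since the asker's grammar isn't shown, and all class names are hypothetical. Each grammar rule becomes a class with an `interpret()` method, and a hand-written recursive-descent parser builds the tree; a generator such as ANTLR would produce the parser part for you.

```java
/** Interpreter pattern: one node class per grammar rule, each able to evaluate itself. */
interface Expr {
    int interpret();
}

final class Num implements Expr {
    private final int value;
    Num(int value) { this.value = value; }
    public int interpret() { return value; }
}

final class Add implements Expr {
    private final Expr left;
    private final Expr right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    public int interpret() { return left.interpret() + right.interpret(); }
}

final class Mul implements Expr {
    private final Expr left;
    private final Expr right;
    Mul(Expr left, Expr right) { this.left = left; this.right = right; }
    public int interpret() { return left.interpret() * right.interpret(); }
}

/**
 * Recursive-descent parser for the example grammar:
 *   expr   := term ('+' term)*
 *   term   := factor ('*' factor)*
 *   factor := NUMBER | '(' expr ')'
 */
final class ExprParser {
    private final String input;
    private int pos = 0;

    private ExprParser(String input) { this.input = input.replace(" ", ""); }

    static Expr parse(String input) {
        ExprParser parser = new ExprParser(input);
        Expr result = parser.expr();
        if (parser.pos != parser.input.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + parser.pos);
        }
        return result;
    }

    private Expr expr() {
        Expr left = term();
        while (peek() == '+') { pos++; left = new Add(left, term()); }
        return left;
    }

    private Expr term() {
        Expr left = factor();
        while (peek() == '*') { pos++; left = new Mul(left, factor()); }
        return left;
    }

    private Expr factor() {
        if (peek() == '(') {
            pos++; // consume '('
            Expr inner = expr();
            if (peek() != ')') throw new IllegalArgumentException("Expected ')' at position " + pos);
            pos++; // consume ')'
            return inner;
        }
        int start = pos;
        while (Character.isDigit(peek())) pos++;
        if (start == pos) throw new IllegalArgumentException("Expected a number at position " + pos);
        return new Num(Integer.parseInt(input.substring(start, pos)));
    }

    /** Current character, or '\0' at end of input. */
    private char peek() { return pos < input.length() ? input.charAt(pos) : '\0'; }
}
```

Each method of the parser mirrors one rule of the grammar, which is what keeps this style readable when the grammar changes: a new rule means a new method and, usually, a new node class.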

