5

Suppose I need a parser to handle tree-like input (e.g. scala.xml.Elem) instead of strings. I would like to use the parser combinators from this article. If I linearize the input tree I can write such a parser easily.

type Parser[A] = seq: Seq[Elem] => List[(A,Seq[Elem])] 

I can add parsers return, failure, item, etc. and finally write my parsers on top of them.

Now I wonder if I can make a parser without linearizing the input tree. Is it possible ?

5
  • Can you share why you want to avoid linearizing the input tree? You can always create a linear view of a tree structure. Also, parsing is commonly used to convert a linear structure into a tree!...what do you want out of your parsing that isn't already present in the tree structure? Commented Aug 25, 2014 at 19:55
  • @RexKerr I would like to check an XML structure: e.g. suppose I have functions that check if the input is an element with a certain given label. Now I would like to compose those functions to check if the input element is <a><a1/><a2/></a>. Commented Aug 25, 2014 at 20:29
  • @RexKerr Linearizing is Ok. I am just checking alternatives. Commented Aug 25, 2014 at 20:32
  • You might be more lucky if you search for "tree matching" instead of parsing. A quick Google search for "tree matching" revealed quite a few papers and articles. I don't know how much tree matching is related to pattern matching, though. It's probably some special case? Commented Aug 26, 2014 at 12:33
  • @IonuțG.Stan Thanks. I will search for "tree matching". I do not know how it is related to pattern matching though. Commented Aug 26, 2014 at 13:26

1 Answer 1

3

Great question. It is absolutely possible to do this, and I've been looking for a tool that does it for a while.

I think the key is that the structure of your tree will be reflected in your primitive combinators. For instance, the primitive item parser is tied to the [] container type, and provides the ability to sequentially walk down a list by first/resting it. return and failure do not depend on the [] container type, so they shouldn't have to be changed to support tree parsing.

You'll need to replace it with one or more combinators that allow you to traverse a tree. I would guess that you'd want one combinator to let you move between siblings (i.e. children of the same parent node, at the same depth), and a second combinator to allow you to go deeper into the tree.

What I'm not sure about is whether you'll need duplicate combinators to correspond to capture the patterns of sequencing, alternation, lookahead, and so on. Having to implement each of those twice could get pretty obnoxious.

Sign up to request clarification or add additional context in comments.

3 Comments

Thank you ! Two combinators -- nextbro to move to the next sibling and oldson to move to the "oldest" child -- are what I am thinking of.
@Michael exactly! I hope to get some time to implement this soon; if so, will update accordingly.
:) Maybe I will do it too. In Scala.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.