0

Is there a way to recursively parse the string to get the dict?

string:

string = 'a {\
    b: text;\
    c {\
        d: text;\
    }\
}';

out:

{
    'a' : {
        'b': 'text',
        'c': {
            'd' : 'text';
        }
    }
}

upd: I'm new in Python, and I don't have a ready solution in the form of a library and etc, I want to understand the logic (any code if it possible or theory) of the algorithm for this problem

6
  • Can you show us the code you have problems so this turns into a specific programming-related question? Commented Feb 6, 2012 at 18:44
  • @Niklas B., I'm new in Python, but i want to write a small parser which transforms a string into dict. And I want to understand the algorithm Commented Feb 6, 2012 at 18:52
  • So you haven't tried anything yourself? That's not a good starting point for asking a question on this site. Commented Feb 6, 2012 at 18:53
  • @Yeah, I tried to solve this problem myself, but I don't have the skills to understand the logic of the algorithm Commented Feb 6, 2012 at 18:55
  • It's no problem to add non-functional code to the question. We will help you fix and adjust it, if that's at all possible. Commented Feb 6, 2012 at 18:56

3 Answers 3

3

You should have a look at the pyparsing module.

The pyparsing module is an alternative approach to creating and executing simple grammars, vs. the traditional lex/yacc approach, or the use of regular expressions. The pyparsing module provides a library of classes that client code uses to construct the grammar directly in Python code.

For a case like this, you would write the grammar that describes the string you are trying to parse, and avoid writing the parser.

Update (per comment): If one was interested in learning about the theoretical basis behind parsing strings such as these, then you need to understand what your goal is. This problem, in a more general form, is parsing a context-free language. You have a set of rules, known as a grammar that dictates the hierarchy of the data structure from the input. A good place to start reading (from an educational perspective) is the Backus–Naur Form.

Sign up to request clarification or add additional context in comments.

4 Comments

If it's really practical problem, I'd use JSON or YAML.
@Hooked, Thanks, but I don't have a ready solution in the form of a library, I want to understand the logic of the algorithm for this problem
@tomas I've found that almost always, writing your own parser is difficult, buggy and often not-maintainable. What you want to write is a grammar, a set of rules for parsing your data and let a parsing library take care of it. You'll end up learning more if you can make the distinction between the two!
@Hooked, My main goal - a theoretical basis for how to do it myself. I need more practice :)
2

Writing a recursive descent parser for this shouldn't be too difficult. There are tutorials out there that describe how to do it.

Comments

1

Using lepl parser library:

#!/usr/bin/env python
from lepl import AnyBut, Delayed, Drop, DroppedSpace

def Parser():
    dict_ = Delayed()
    str_ = AnyBut('{}:;')[1::'n',...]

    with DroppedSpace():
        pair = str_ & Drop(':') & str_ & Drop(';') > tuple
        value = str_ & dict_ > tuple
        dict_ += Drop('{') & (pair | value)[:] & Drop('}') > dict
        return value > dict

print(Parser().parse("a {  b: text;  c { d: text; }}")[0])

Output

{'a': {'c': {'d': 'text'}, 'b': 'text'}}

See also Python parsing tools.

To understand the theory behind the code you could read a book that talks about lexical analysis (regular expressions and finite automata), syntax analysis (EBNF, context-free grammars, LL parsers).

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.