For my Python program, I have an input that represents a serialized object, which can contain primitive types, arrays, and structures.

A sample input can look like this:

Struct(1.5, false, Struct2("text"), [1, 2, 3])

The corresponding output would be:

{
    "type": "Struct",
    "args": [
        1.5,
        False,
        {
            "type": "Struct2",
            "args": [ "text" ]
        },
        [ 1, 2, 3 ]
    ]
}

So, the input string can have:

  • Primitive types (integers, floats, booleans, and string literals)
  • Arrays
  • Structures (structure name and a list of arguments)

The input format is quite logical, but I couldn't find any readily available libraries or code snippets to parse such a format.

1 Answer

This isn't a very clean implementation, and I'm not 100% sure if it does exactly what you're looking for, but I would recommend the Lark library for doing this.

Rather than hunting for a ready-made parser for this exact format, just define a small grammar of your own. To save startup time, Lark has "save" and "load" features, so you can save a serialized version of the parser and load it on each run instead of rebuilding the entire parser every time. Hope this helps :)

from lark import Lark, Transformer

grammar = """
%import common.WS
%import common.ESCAPED_STRING
%import common.SIGNED_NUMBER

%ignore WS

start : struct

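// (...)? is used instead of [...]: with maybe_placeholders enabled (the default
// in Lark 1.x), an unmatched [...] adds a None child, e.g. args == [None].
struct  : NAME "(" (element ("," element)*)? ")"
element : struct | array | primitive

array : "[" (element ("," element)*)? "]"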
primitive : number
          | string
          | boolean

string : ESCAPED_STRING
number : SIGNED_NUMBER

boolean : TRUE | FALSE

NAME : /[a-zA-Z][a-zA-Z0-9]*/

TRUE : "true"
FALSE : "false"
"""

class T(Transformer):
    def start(self, s):
        return s[0]

    def string(self, s):
        return s[0][1:-1].replace('\\"', '"')

    def primitive(self, s):
        return s[0]

    def struct(self, s):
        return { "type": s[0].value, "args": s[1:] }

    def boolean(self, s):
        return s[0].value == "true"

    def element(self, s):
        return s[0]
    
    array = list

    def number(self, s):
        # Prefer int; fall back to float for values such as "1.5" or "1e3".
        try:
            return int(s[0].value)
        except ValueError:
            return float(s[0].value)

parser = Lark(grammar, parser="lalr", transformer=T())

test = """
Struct(1.5, false, Struct2("text"), [1, 2, 3])
"""

print(parser.parse(test))
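
If adding a dependency is not an option, the format is small enough to parse by hand. Below is a rough standard-library sketch of a recursive-descent parser. It is a different technique from the Lark grammar above, not a drop-in replacement: the names `tokenize`, `MiniParser`, and `parse_serialized` are made up for illustration, and unlike the grammar it accepts any element at the top level, not only a struct.

```python
import re

# One named group per token kind; leading whitespace is skipped.
TOKEN_RE = re.compile(r"""
    \s*(?:
        (?P<number>[+-]?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)
      | (?P<string>"(?:\\.|[^"\\])*")
      | (?P<name>[A-Za-z][A-Za-z0-9]*)
      | (?P<punct>[()\[\],])
    )""", re.VERBOSE)


def tokenize(text):
    """Split the input into (kind, text) pairs."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at position {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


class MiniParser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return (None, None)  # end of input

    def next(self):
        token = self.peek()
        self.pos += 1
        return token

    def items(self, closer):
        """Parse comma-separated elements up to and including `closer`."""
        result = []
        if self.peek() == ("punct", closer):
            self.next()
            return result
        while True:
            result.append(self.element())
            token = self.next()
            if token == ("punct", closer):
                return result
            if token != ("punct", ","):
                raise ValueError(f"expected ',' or {closer!r}, got {token[1]!r}")

    def element(self):
        kind, text = self.next()
        if kind == "number":
            try:
                return int(text)
            except ValueError:
                return float(text)
        if kind == "string":
            # Same simplification as the Lark transformer: only \" is unescaped.
            return text[1:-1].replace('\\"', '"')
        if kind == "name":
            if text in ("true", "false"):
                return text == "true"
            if self.next() != ("punct", "("):
                raise ValueError(f"expected '(' after {text!r}")
            return {"type": text, "args": self.items(")")}
        if (kind, text) == ("punct", "["):
            return self.items("]")
        raise ValueError(f"unexpected token {text!r}")


def parse_serialized(text):
    parser = MiniParser(tokenize(text))
    value = parser.element()
    if parser.pos != len(parser.tokens):
        raise ValueError("trailing input after the first element")
    return value


print(parse_serialized('Struct(1.5, false, Struct2("text"), [1, 2, 3])'))
```

The Lark version is still easier to extend (for example with named arguments or comments); the hand-written one mainly avoids the extra dependency.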

1 Comment

Thank you, this looks very promising!
