2

I'm trying to find a way to parse a string that can contain variables, function calls, lists, or dicts written in Python syntax and separated by ",". Whitespace may appear anywhere, so the string should be split on "," only when the comma is not inside (), [] or {}.

Example string: "variable, function1(1,3), function2([1,3],2), ['list_item_1','list_item_2'],{'dict_key_1': "dict_item_1"}"

Another example string: "variable,function1(1, 3) , function2( [1,3],2), ['list_item_1','list_item_2'],{'dict_key_1': "dict_item_1"}"

Example output: ["variable", "function1(1,3)", "function2([1,3],2)", "['list_item_1','list_item_2']", "{'dict_key_1': "dict_item_1"}"]

Edit: The reason for the code is to parse the string and then run each item with exec("var = %s" % list[x]). (Yes, I know this might not be the recommended way to do things.)

0

3 Answers

2

I guess the main problem here is that the arrays and dicts also have commas in them, so just using str.split(",") wouldn't work. One way of doing it is to parse the string one character at a time, and keep track of whether all brackets are closed. If they are, we can append the current result to an array when we come across a comma. Here's my attempt:

s = "variable, function1(1,3),function2([1,3],2),['list_item_1','list_item_2'],{'dict_key_1': 'dict_item_1'}"

tokens = []
current = ""
open_brackets = 0

for char in s:
    current += char

    if char in "({[":
        open_brackets += 1
    elif char in ")}]":
        open_brackets -= 1
    elif (char == ",") and (open_brackets == 0):
        tokens.append(current[:-1].strip())
        current = ""

# Append the final token (no trailing comma follows it), stripped like the others
tokens.append(current.strip())

for t in tokens:
    print(t)

"""
    variable
    function1(1,3)
    function2([1,3],2)
    ['list_item_1','list_item_2']
    {'dict_key_1': 'dict_item_1'}
"""

2 Comments

I was thinking of the same idea using a list as a stack of brackets, but the open_brackets counter works the same and is simpler.
Yeah, I just thought that regex or Python would have a built-in way of doing it instead of my writing the algorithm myself. I suppose I will have to do that, then.
0

Regular expressions aren't well suited to parsing arbitrary code. What exactly are you trying to accomplish? You can (unsafely) use eval to evaluate the string as code. Or, if you're trying to understand it without evaluating it, you can use the ast or dis modules for various forms of inspection.
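As a rough illustration of the ast route: a comma-separated string like the one in the question parses as a tuple expression, and each element's source text can be recovered without running anything. This is only a sketch. The helper name split_with_ast is made up, it assumes Python 3.8+ (for ast.get_source_segment) and input that is valid Python, and a lone parenthesized tuple such as "(1, 2)" would be split into its elements:

```python
import ast


def split_with_ast(text):
    """Return the source text of each top-level comma-separated item."""
    source = text.strip()
    # mode="eval" parses a single expression; top-level commas make it a tuple.
    node = ast.parse(source, mode="eval").body
    if not isinstance(node, ast.Tuple):
        # No top-level comma: the whole string is one item.
        return [source]
    # get_source_segment slices the original text using each node's position.
    return [ast.get_source_segment(source, element) for element in node.elts]


s = "variable,function1(1, 3) , function2( [1,3],2), ['list_item_1','list_item_2'],{'dict_key_1': \"dict_item_1\"}"
for item in split_with_ast(s):
    print(item)
```

Because the parser does the work, commas inside strings or nested brackets are handled, and invalid input raises SyntaxError instead of being split silently.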

6 Comments

The reason for the code is to parse the string into a list and then run each item with exec(var = list[x]).
@SacredCoconut: Is there a reason not to bulk parse it all at once?
Sorry, I don't know what you mean by bulk parse.
@SacredCoconut: As in, why not just do allvals = eval(thestring), then index the resulting tuple for the results?
Oh, that might work. I have most likely misunderstood how eval works.
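To make the allvals = eval(thestring) suggestion concrete, here is a minimal sketch. The definitions of variable, function1 and function2 are placeholders invented for the example, since the question does not say what they are. eval runs arbitrary code, so this is only reasonable for strings you trust:

```python
# Hypothetical stand-ins for the names used in the question's string.
def function1(a, b):
    return a + b


def function2(items, factor):
    return [item * factor for item in items]


namespace = {"variable": 10, "function1": function1, "function2": function2}

thestring = "variable, function1(1,3), function2([1,3],2), ['list_item_1','list_item_2'],{'dict_key_1': 'dict_item_1'}"

# The top-level commas make the whole string a tuple expression,
# so a single eval call yields every value at once.
allvals = eval(thestring, namespace)

for value in allvals:
    print(value)
```

If the string only ever contains literals (numbers, strings, lists, dicts) and no names or calls, ast.literal_eval is a safer replacement for eval.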
0

Have you tried using split?

>>> teststring = "variable, function1(1,3), function2([1,3],2), ['list_item_1','list_item_2'],{'dict_key_1': 'dict_item_1'}"
>>> teststring.split(", ")
['variable', 'function1(1,3)', 'function2([1,3],2)', "['list_item_1','list_item_2'],{'dict_key_1': 'dict_item_1'}"]

1 Comment

Oh yeah, I forgot to mention that there might or might not be whitespace after ",". For example "['variable', function1(1, 3)'" would not work.
