
So, I would like to convert my string input

'f(g,h(a,b),a,b(g,h))' 

into the following list

['f',['g','h',['a','b'],'a','b',['g','h']]]

Essentially, I would like to replace every '(' with [ and every ')' with ].

I have tried to do this recursively, without success. My idea was to iterate over the characters of the string: when I hit a '(', I would create a new list and start adding the following values to it; when I hit a ')', I would stop adding to that list and append it to the closest outer list. But I am very new to recursion, so I am struggling to work out how to do it.

word='f(a,f(a))'
empty=[]
def newlist(word):
    listy=[]
    for i, letter in enumerate(word):
        if letter=='(':
            return newlist([word[i+1:]])
        if letter==')':
            listy.append(newlist)
        else:
            listy.extend(letter)
        
    return empty.append(listy)

 
4 Comments
  • Hello and welcome to StackOverflow. Your idea looks quite good. Can you add the output you are getting? Commented Oct 15, 2022 at 8:23
  • Note: list.append() doesn't return anything. Commented Oct 15, 2022 at 8:25
  • Note that your expected output is not a valid list ('h'['a', 'b']). Commented Oct 15, 2022 at 8:49
  • Thank you for your feedback! gog, you are correct, I had a typo in the expected list and have updated it. Daaran, the output I am getting is [['a', ',', 'f', '(', 'a', ')', ')']]. I am having trouble appending my new list to a list created in the previous function call, so that is where I am having the most trouble. Commented Oct 15, 2022 at 9:11

3 Answers

1

Assuming your input is something like this:

a = 'f,(g,h,(a,b),a,b,(g,h))'

We start by splitting it into primitive parts ("tokens"). Since your tokens are always a single symbol, this is rather easy:

tokens = list(a)

Now we need two functions to work with the list of tokens: next_token tells us which token we're about to process and pop_token marks a token as processed and removes it from the list:

def next_token():
    return tokens[0] if tokens else None


def pop_token():
    tokens.pop(0)

Your input consists of "items", separated by commas. Schematically, it can be expressed as

items = item ( ',' item )*

In Python, we first read one item and then keep reading further items while the next token is a comma:

def items():
    result = [item()]
    while next_token() == ',':
        pop_token()
        result.append(item())
    return result

An "item" is either a sublist in parentheses or a letter:

def item():
    return sublist() or letter()

To read a sublist, we check if the token is a '(', then use items above to read the content, and finally check for the ')' and panic if it is not there:

def sublist():
    if next_token() == '(':
        pop_token()
        result = items()
        if next_token() == ')':
            pop_token()
            return result
        raise SyntaxError()

letter simply returns the next token. You might want to add some checks here to make sure it's indeed a letter:

def letter():
    result = next_token()
    pop_token()
    return result

You can organize the above code like this: have one function parse that accepts a string and returns a list and put all functions above inside this function:

def parse(input_string):
    
    def items():
        ...

    def sublist():
        ...
    
    # ...etc. (the remaining functions from above)
    
    tokens = list(input_string)
    return items()
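
To start processing, you call parse with your string; nothing else needs to be passed in. Below is a minimal runnable sketch that assembles the pieces above into one function. The error messages, the letter check, the explicit `return None`, and the check for leftover tokens are additions for illustration and are not part of the outline above.

```python
def parse(input_string):
    # The token list is shared by all the nested helper functions below.
    tokens = list(input_string)

    def next_token():
        return tokens[0] if tokens else None

    def pop_token():
        tokens.pop(0)

    def items():
        # items = item ( ',' item )*
        result = [item()]
        while next_token() == ',':
            pop_token()
            result.append(item())
        return result

    def item():
        # sublist() returns None when the next token is not '('.
        return sublist() or letter()

    def sublist():
        if next_token() == '(':
            pop_token()
            result = items()
            if next_token() == ')':
                pop_token()
                return result
            raise SyntaxError("expected ')'")
        return None

    def letter():
        result = next_token()
        # Added check (an assumption): only alphabetic tokens count as letters.
        if result is None or not result.isalpha():
            raise SyntaxError(f"expected a letter, got {result!r}")
        pop_token()
        return result

    result = items()
    # Added check (an assumption): reject input with unparsed trailing tokens.
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return result


print(parse('f,(g,h,(a,b),a,b,(g,h))'))
# ['f', ['g', 'h', ['a', 'b'], 'a', 'b', ['g', 'h']]]
```

This grammar expects a comma before each opening parenthesis. If your string is written as 'f(g,h(a,b),a,b(g,h))', one option is to call `parse(s.replace('(', ',('))`, which turns it into the form assumed above.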
 

2 Comments

Thank you for your help! Sorry, I am a noob to coding, so how do I start using these functions? Do I type item(tokens) for this code to start processing or something?
So, in other words, what do I need to input to start processing my string? Is the start function items(tokens) for example?
0

Quite an interesting question, and one I originally misinterpreted, but this solution now works as intended. Note that I have used the list concatenation operator + (which you usually want to avoid), so feel free to improve upon it however you see fit.

Good luck, and I hope this helps!

# Set some global values. I prefer to keep each one
# as a set in case you need to add functionality,
# e.g. if you also want {{a},b} or [ab<c>ed] to work.
OPEN_PARENTHESIS = set(["("])
CLOSE_PARENTHESIS = set([")"])
SPACER = set([","])

def recursive_solution(input_str, index):

    # base case A: when index exceeds or equals len(input_str)
    if index >= len(input_str):
        return [], index
    
    char = input_str[index]

    # base case B: when we reach a closed parenthesis stop this level of recursive depth
    if char in CLOSE_PARENTHESIS:
        return [], index

    # do the next recursion, return its value and the index it stops at
    recur_val, recur_stop_i = recursive_solution(input_str, index + 1)

    # with an open parenthesis, we want to continue the recursion after its associated
    # closed parenthesis; the recur_val should also be nested one level deeper in the list
    if char in OPEN_PARENTHESIS:
        continued_recur_val, continued_recur_stop_i = recursive_solution(input_str, recur_stop_i + 1)
        return [recur_val] + continued_recur_val, continued_recur_stop_i
    
    # spacers, e.g. ",", are simply ignored
    if char in SPACER:
        return recur_val, recur_stop_i
    
    # and finally, a normal character is simply prepended to the result
    return [char] + recur_val, recur_stop_i
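
The function returns a pair: the nested list and the index at which the recursion stopped. If you only want the list, a thin wrapper can start the recursion at index 0 and drop the index. The sketch below uses a hypothetical name, `to_nested_list`, and repeats the function in condensed form only so that the snippet runs on its own.

```python
OPEN_PARENTHESIS = {"("}
CLOSE_PARENTHESIS = {")"}
SPACER = {","}


def recursive_solution(input_str, index):
    # Same logic as the function above, repeated here so the snippet is self-contained.
    if index >= len(input_str):
        return [], index
    char = input_str[index]
    if char in CLOSE_PARENTHESIS:
        return [], index
    recur_val, recur_stop_i = recursive_solution(input_str, index + 1)
    if char in OPEN_PARENTHESIS:
        continued_val, continued_stop_i = recursive_solution(input_str, recur_stop_i + 1)
        return [recur_val] + continued_val, continued_stop_i
    if char in SPACER:
        return recur_val, recur_stop_i
    return [char] + recur_val, recur_stop_i


def to_nested_list(input_str):
    # Hypothetical wrapper: start at index 0 and discard the stop index.
    nested, _stop_index = recursive_solution(input_str, 0)
    return nested


print(to_nested_list('f(g,h(a,b),a,b(g,h))'))
# ['f', ['g', 'h', ['a', 'b'], 'a', 'b', ['g', 'h']]]
```

Because the function recurses once per character, very long inputs may run into Python's recursion limit.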

1 Comment

This works pretty well! Thank you so much! Just need to find a way to remove the index in the final iteration. Thank you man
0

You can get the expected answer using the following code, but the result is still a string, not a list.

import re

a='(f(g,h(a,b),a,b(g,h))' 
ans=[]
sub=''
def rec(i,sub):
    if i>=len(a):
        return sub
    if a[i]=='(':
        if i==0:
            
            sub=rec(i+1,sub+'[')
        else:
            sub=rec(i+1,sub+',[')
            
    elif a[i]==')':
        sub=rec(i+1,sub+']')
    else:
        sub=rec(i+1,sub+a[i])
    return sub


b=rec(0,'')
print(b)
b=re.sub(r"([a-z]+)", r"'\1'", b)
print(b,type(b))

Output

[f,[g,h,[a,b],a,b,[g,h]]
['f',['g','h',['a','b'],'a','b',['g','h']] <class 'str'>
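
One possible way to turn such a string into a real list is `ast.literal_eval` from the standard library, which evaluates a string containing a Python literal. Note that the sample string printed above has four opening brackets but only three closing ones (the sample input starts with an unmatched '('), so it cannot be parsed as it stands. The sketch below therefore builds a balanced string directly from the original input; `string_to_list` is a hypothetical helper name, and it assumes lowercase names with every '(' directly following a name.

```python
import ast
import re


def string_to_list(text):
    # Hypothetical helper. Wrap the whole input in brackets and turn each
    # "(" into ",[" and each ")" into "]" so that the brackets stay balanced.
    bracketed = "[" + text.replace("(", ",[").replace(")", "]") + "]"
    # Quote every run of lowercase letters so it becomes a string literal.
    quoted = re.sub(r"([a-z]+)", r"'\1'", bracketed)
    # literal_eval only accepts Python literals; it does not run arbitrary code.
    return ast.literal_eval(quoted)


result = string_to_list('f(g,h(a,b),a,b(g,h))')
print(result, type(result))
# ['f', ['g', 'h', ['a', 'b'], 'a', 'b', ['g', 'h']]] <class 'list'>
```

If the parentheses in the input are not balanced, `ast.literal_eval` raises an error instead of returning a list.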

2 Comments

Would it be possible to turn "[f,[g,h,[a,b],a,b,[g,h]]" into a list somehow? It seems very close to what I want, but it needs to be a list.
Yes, I was trying to do that using the json package in Python, but it's giving an error.
