Split string row-wise with condition in Python

Question

I have some strings in a column, and I want to explode the words into separate rows only when they are not inside brackets. The column looks like this:

pd.DataFrame(data={'a': ['first,string','(second,string)','third,string (another,string,here)']})

and I want the output to look like this:

pd.DataFrame(data={'a': ['first','string','(second,string)','third','string','(another,string,here)']})

This sort of works, but I would like to avoid typing the row number each time:

re.split(r',(?![^()]*\))', x['a'][0])
re.split(r',(?![^()]*\))', x['a'][1])
re.split(r',(?![^()]*\))', x['a'][2])

I thought I could do it with a lambda function, but I cannot get it to work. Thanks for checking this out.

x['a'].apply(lambda i: re.split(r',(?![^()]*\))', i))

nikeros · Accepted Answer · 2021-12-13 11:17:20Z

It is not clear to me whether the elements in your DataFrame may contain multiple bracketed groups. Given that doubt, I have implemented the following:

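import re

import pandas as pd

df = pd.DataFrame(data={'a': ['first,string','(second,string)','third,string (another,string,here)']})

# group 1: text before the first "(", group 2: one bracketed group (if any),
# group 3: whatever follows that group
pattern = re.compile(r"([^(]*)(\([^)]*\)?)?(.*)")

def findall(ar, res=None):
    if res is None:
        res = []
    m = pattern.findall(ar)[0]
    if len(m[0]) > 0:
        # split the unbracketed part on commas, dropping stray whitespace
        res.extend(w.strip() for w in m[0].split(",") if w.strip())
    if len(m[1]) > 0:
        # keep the bracketed group as a single item
        res.append(m[1])
    if len(m[2]) > 0:
        # recurse on the text that follows the bracketed group
        return findall(m[2], res=res)
    else:
        return res

res = []
for x in df["a"]:
    res.extend(findall(x))

print(pd.DataFrame(data={"a": res}))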

Essentially, you recursively scan the last part of the match until no more bracketed groups are found. If the order of the output were not an issue, the solution would be simpler.
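
For comparison, here is a minimal pandas-only sketch that also keeps the original order. It is not the approach above but a swapped-in technique: rather than splitting on separators, it matches the tokens themselves with `Series.str.findall` and then flattens them with `explode`. The `TOKEN` pattern and the `out` name are my own, and the sketch assumes brackets are balanced and never nested; a stray unmatched bracket would simply be dropped.

```python
import pandas as pd

df = pd.DataFrame(
    data={"a": ["first,string", "(second,string)", "third,string (another,string,here)"]}
)

# Assumption: brackets are balanced and not nested.
# A token is either a whole bracketed group, or a run of characters
# that contains no comma, whitespace, or bracket.
TOKEN = r"\([^()]*\)|[^,\s()]+"

out = (
    df["a"]
    .str.findall(TOKEN)        # one list of tokens per original row
    .explode()                 # one token per row, in the original order
    .reset_index(drop=True)    # discard the repeated source-row index
    .to_frame()                # back to a DataFrame with column "a"
)
print(out)
```

Because the bracketed alternative comes first in the pattern, a group such as `(another,string,here)` is consumed whole before the second alternative gets a chance to split it.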

edited Dec 13, 2021 at 11:17

answered Dec 13, 2021 at 11:09
