I have a string in Python in this format:

"['col1', 'col2', 'col3','', 'row1', 'row2']"

The surrounding double quotes show that the whole thing is a string. I want to parse it up to the first '' (the empty element after col3 in the example above) and build a list from the elements before it. For the sample above, the list I should get is ['col1', 'col2', 'col3']. How can I do this in Python?

3 Answers

3

Convert the string to a list, then use list operations to slice the list at the first empty string:

>>> s = "['col1', 'col2', 'col3','', 'row1', 'row2']"
>>> import ast
>>> L = ast.literal_eval(s)
>>> L
['col1', 'col2', 'col3', '', 'row1', 'row2']
>>> L.index('')
3
>>> L[:L.index('')]
['col1', 'col2', 'col3']

6 Comments

This is the first time I've seen the ast library. It seems useful.
@GLHF it is the "abstract syntax tree" module, and it understands Python's grammar. ast.literal_eval is a safe form of eval: it evaluates a string containing a Python literal (strings, numbers, tuples, lists, dicts, sets, booleans, None) and nothing else. It won't execute "del /s c:\*", for example.
Does ast.literal_eval also convert strings to sets, tuples, etc.?
Just the answer I needed.
@GLHF yes, "{1,2,3}" and "(1,2,3)" would create a set and a tuple.
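
A quick way to check both claims from the comments above, assuming Python 3: literals come back as the matching container type, while an expression that is not a literal (here a function call) is rejected with `ValueError` rather than executed. The variable names are illustrative only.

```python
import ast

as_set = ast.literal_eval("{1,2,3}")
as_tuple = ast.literal_eval("(1,2,3)")
print(type(as_set), type(as_tuple))

# A function call is not a literal, so literal_eval refuses it.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print("call rejected:", rejected)
```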
0
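import re
import numpy as np
l = "['col1', 'col2', 'col3','', 'row1', 'row2']"
pattern = r"'([A-Za-z0-9_\./\\-]*)'"
m = re.findall(pattern, l)  # every quoted item, including the empty one
mn = np.array(m)
# np.where returns a tuple of index arrays; [0] selects the indices along axis 0
rslt = np.split(mn, np.where(mn == '')[0])[0].tolist()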

Output:

rslt
Out[75]: ['col1', 'col2', 'col3']

Explanation:

In [78]: pattern = r"'([A-Za-z0-9_\./\\-]*)'"
    ...: m = re.findall(pattern, l)
    ...: 

In [79]: m
Out[79]: ['col1', 'col2', 'col3', '', 'row1', 'row2']

In [80]: mn = np.array(m)

In [81]: [x.tolist() for x in np.split(mn, np.where(mn == '')[0])]
Out[81]: [['col1', 'col2', 'col3'], ['', 'row1', 'row2']]
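
For comparison, the same regex result can be cut at the first empty string without NumPy. This is a sketch that swaps in `itertools.takewhile` from the standard library instead of `np.split`; the names `text`, `items`, and `head` are illustrative.

```python
import re
from itertools import takewhile

text = "['col1', 'col2', 'col3','', 'row1', 'row2']"
pattern = r"'([A-Za-z0-9_\./\\-]*)'"

items = re.findall(pattern, text)
# Keep items until the first empty string is reached.
head = list(takewhile(lambda item: item != '', items))
print(head)
```

Unlike `list.index`, `takewhile` simply returns every item when no empty string is present.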

3 Comments

@downvoter, could you please explain why you downvoted?
I'm not the downvoter, but perhaps they were balking at the use of numpy and re when ast.literal_eval is sufficient.
@PaulMcGuire, I know you're not, otherwise I would downvote you back (haha, just kidding). Anyway, re works for general cases of string parsing, but in this particular case ast is indeed the most suitable. I also learned something new through your question. Thanks!
0

I wanted to parse your string directly with json, but JSON only accepts double-quoted strings, so I get this error:

>>> import json
>>> json.loads("['col1', 'col2', 'col3','', 'row1', 'row2']")

...
ValueError: No JSON object could be decoded

So, I first replaced the single quotes with double quotes:

>>> s = "['col1', 'col2', 'col3','', 'row1', 'row2']"
>>> m = json.loads(s.replace("'", '"'))
>>> m
[u'col1', u'col2', u'col3', u'', u'row1', u'row2']  

# find the first index of the empty string, then slice the list
>>> m[:m.index('')]
[u'col1', u'col2', u'col3']
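
The session above is from Python 2 (note the `u''` prefixes). A rough Python 3 equivalent is sketched below; there the failure surfaces as `json.JSONDecodeError`, a subclass of `ValueError`. The quote replacement assumes that no item contains an apostrophe or a double quote; if one does, the replaced string is no longer valid JSON.

```python
import json

s = "['col1', 'col2', 'col3','', 'row1', 'row2']"

try:
    json.loads(s)
    direct_parse_failed = False
except json.JSONDecodeError as exc:
    # Single-quoted strings are not valid JSON.
    direct_parse_failed = True
    print("not valid JSON:", exc)

# Assumption: items contain no quote characters of their own.
items = json.loads(s.replace("'", '"'))
head = items[:items.index('')]
print(head)
```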
