Convert and slice Python string to list?

Question

I am given a raw string that is a path, or "direction", to a value in JSON. I need the following string converted to a list containing dictionaries.

st = """data/policy/line[Type="BusinessOwners"]/risk/coverage[Type="FuelHeldForSale"]/id"""

The list should look like this

paths = ['data','policy','line',{'Type':'BusinessOwners'},'risk','coverage',{"Type":"FuelHeldForSale"},"id"]

I then iterate over this list to find the object in the JSON (which is in a Spark RDD)
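For illustration only, that iteration might look like the sketch below. It assumes the JSON has already been loaded into plain Python dicts and lists (not a Spark RDD), and that a dictionary in the path means "pick the first list element whose keys match". The function name `follow` and the sample document are hypothetical; the real data's shape is not shown here.

```python
def follow(doc, paths):
    """Walk a nested dict/list structure using a parsed path list."""
    node = doc
    for step in paths:
        if isinstance(step, dict):
            # Assumed meaning: first list element matching every key/value pair.
            node = next(item for item in node
                        if all(item.get(k) == v for k, v in step.items()))
        else:
            node = node[step]
    return node

# Hypothetical document, made up for this example.
doc = {'data': {'policy': {'line': [
    {'Type': 'Auto', 'risk': {}},
    {'Type': 'BusinessOwners', 'risk': {'coverage': [
        {'Type': 'FuelHeldForSale', 'id': 'cov-1'},
    ]}},
]}}}

paths = ['data', 'policy', 'line', {'Type': 'BusinessOwners'},
         'risk', 'coverage', {'Type': 'FuelHeldForSale'}, 'id']
print(follow(doc, paths))  # cov-1
```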

I attempted st.split('/'), which gave me

st.split('/')
Out[370]: 
['data',
 'policy',
 'line[Type="BusinessOwners"]',
 'risk',
 'coverage[Type="FuelHeldForSale"]',
 'id']

But how do I convert and split items like 'line[Type="BusinessOwners"]' to 'line',{'Type':'BusinessOwners'} ?

Hi. Did you try using eval()? Can you try this out: st_new=eval(st) Then print st_new. I hope this works!
– Shrinivas Deshmukh, Commented Mar 16, 2018 at 4:35
Hi! That did not work, @ShrinivasDeshmukh. It raises: data/policy/line[Type="BusinessOwners"]/risk/coverage[Type="FuelHeldForSale"]/id ^ SyntaxError: invalid syntax
– mdeonte001, Commented Mar 16, 2018 at 4:37
Please refer to this link, a similar problem has been discussed here: stackoverflow.com/questions/36068779/…
– Shrinivas Deshmukh, Commented Mar 16, 2018 at 4:43
@mdeonte001 You should be a lot more specific about what you want if you want people to use their time to solve your problem. If you want a dictionary in your list, then state it instead of leaving others to read your mind!
– Michael Swartz, Commented Mar 16, 2018 at 5:25
@MichaelSwartz Please see above: I state 'list containing dictionaries' and my example shows a dictionary.
– mdeonte001, Commented Mar 16, 2018 at 15:37

Rahul · Accepted Answer · 2018-03-16 05:21:37Z

1

import json

first_list = st.replace('[', '/{"').replace(']', '}').replace('="', '": "').split('/')
[item if "{" not in item else json.loads(item) for item in first_list]

or using ast.literal_eval

import ast

[item if "{" not in item else ast.literal_eval(item) for item in first_list]


out:
['data',
 'policy',
 'line',
 {'Type': 'BusinessOwners'},
 'risk',
 'coverage',
 {'Type': 'FuelHeldForSale'},
 'id']

edited Mar 16, 2018 at 5:21

answered Mar 16, 2018 at 4:42

Rahul


6 Comments

mdeonte001 Over a year ago

Hi, I ran this and got the error AttributeError: 'str' object has no attribute 'literal_eval'. What is ast.literal_eval(item)? Should that be st.literal?

Rahul Over a year ago

sorry. you need to import ast first. please check again.

Mad Physicist Over a year ago

Not a huge fan of using literal_eval, but this is much better.

mdeonte001 Over a year ago

This works! Thank you. @MadPhysicist care to share why you dislike literal_eval?

Mad Physicist Over a year ago

It certainly does work. I am wary of literal_eval because of things like this gist. I am not 100% sure if it can be done exactly with literal_eval, but I would rather not take a chance.


TTT · Accepted Answer · 2018-03-16 04:38:04Z

1

This would be more efficient if it weren't a one-liner, but I'll let you figure it out from here. You'll probably want a more robust regex-based parsing engine if your input varies more than your given schema. Or just use a standardized data model like JSON.

import re
[word if '=' not in word else {word.split('=')[0]:word.split('=')[1]} for word in re.split(r'[/\[]', st.replace(']','').replace('"',''))]

['data', 'policy', 'line', {'Type': 'BusinessOwners'}, 'risk', 'coverage', {'Type': 'FuelHeldForSale'}, 'id']
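A multi-line version of the same idea, as a rough sketch: it splits each key=value token only once instead of twice. The function name `parse_path` is made up for illustration, and it assumes the same input shape as above (plain segments never contain `=`, and values never contain `/` or `[`).

```python
import re

def parse_path(st):
    """Split a path like a/b[K="v"]/c into ['a', 'b', {'K': 'v'}, 'c']."""
    # Drop the closing brackets and quotes, then split on '/' or '['.
    words = re.split(r'[/\[]', st.replace(']', '').replace('"', ''))
    paths = []
    for word in words:
        if '=' in word:
            # Split on the first '=' only, and do it a single time.
            key, value = word.split('=', 1)
            paths.append({key: value})
        else:
            paths.append(word)
    return paths

st = 'data/policy/line[Type="BusinessOwners"]/risk/coverage[Type="FuelHeldForSale"]/id'
print(parse_path(st))
```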

answered Mar 16, 2018 at 4:38

TTT


Aaditya Ura · Accepted Answer · 2018-03-16 07:16:48Z

0

Let's do it in one line :

import re

pattern=r'(?<=Type=)\"(\w+)'
data="""data/policy/line[Type="BusinessOwners"]/risk/coverage[Type="FuelHeldForSale"]/id"""


print([{'Type':re.search(pattern,i).group().replace('"','')} if '=' in i else i for i in re.split(r'/|\[',data)])

output:

['data', 'policy', 'line', {'Type': 'BusinessOwners'}, 'risk', 'coverage', {'Type': 'FuelHeldForSale'}, 'id']

answered Mar 16, 2018 at 7:16

Aaditya Ura


Mad Physicist · Accepted Answer · 2018-03-16 17:01:50Z

Regular expressions may be a good tool here. It looks like you want to transform elements that look like text1[text2="text3"] into text1, {text2: text3}. The regex would look something like this:

(\w+)\[(\w+)=\"(\w+)\"\]

You can modify this expression in any number of ways. For example, you could use something other than \w+ for the names, and insert \s* to allow optional whitespace wherever you want.
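As one possible illustration of such a modification (not the only way to do it), the sketch below swaps \w+ for character classes that exclude brackets, `=` and quotes, and adds \s* around the `=` and inside the brackets. The name `loose` and the sample input are made up for this example.

```python
import re

# Looser variant: names may contain characters such as '-', the value may
# contain spaces, and whitespace is allowed around '=' and inside the brackets.
loose = re.compile(r'([^\[\]=]+)\[\s*([^\[\]=\s]+)\s*=\s*"([^"]*)"\s*\]')

match = loose.fullmatch('my-line[ Type = "Business Owners" ]')
print(match.groups())  # ('my-line', 'Type', 'Business Owners')
```

The stricter \w+ pattern would reject this input because of the hyphen and the spaces.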

The next thing to keep in mind is that when you do find a match, you need to expand your list. The easiest way to do that would be to just create a new list and append/extend it:

import re

paths = []
pattern = re.compile(r'(\w+)\[(\w+)=\"(\w+)\"\]')
for item in st.split('/'):
    match = pattern.fullmatch(item)
    if match:
        paths.append(match.group(1))
        paths.append({match.group(2): match.group(3)})
    else:
        paths.append(item)

This produces a paths list that is

['data', 'policy', 'line', {'Type': 'BusinessOwners'}, 'risk', 'coverage', {'Type': 'FuelHeldForSale'}, 'id']

[IDEOne Link]

I personally like to split the functionality of my code into pipelines of functions. In this case, I would have the main loop accumulate the paths list based on a function that returned replacements for the split elements:

def get_replacement(item):
    match = pattern.fullmatch(item)
    if match:
        return match.group(1), {match.group(2): match.group(3)}
    return item,

paths = []
for item in st.split('/'):
    paths.extend(get_replacement(item))

The comma in return item, is very important. It makes the return value into a tuple, so you can use extend on whatever the function returns.
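A small demonstration of why the tuple matters: extend iterates over its argument, so a bare string would be added one character at a time.

```python
paths = []
paths.extend(('data',))  # one-element tuple: the string is added as a single item
paths.extend('id')       # bare string: extend iterates over its characters
print(paths)  # ['data', 'i', 'd']
```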

[IDEOne Link]
