0

I have a csv file with columns like this

enter image description here

I need to separate column (B) values into separate columns and multiple rows like this enter image description here

This is what I tried (the data here in below code is same as csv data above) and did not work

data = [{"latlong":'{lat: 15.85173248  ,  lng: 78.6216129},{lat: 15.85161765  ,  lng: 78.61982138},{lat: 15.85246304  ,  lng: 78.62031075},{lat: 15.85250474  ,  lng: 78.62034441},{lat: 15.85221891  ,  lng: 78.62174507},', "Id": 1},
        {"latlong": '{lat: 15.8523723  ,  lng: 78.62177758},{lat: 15.85236637  ,  lng: 78.62179098},{lat: 15.85231281  ,  lng: 78.62238316},{lat: 15.8501259  ,  lng: 78.62201676},', "Id":2}]
df = pd.DataFrame(data)
df
df.latlong.apply(pd.Series)

This works in this case

data1 = [{'latlong':[15.85173248, 78.6216129, 1]},{'latlong': [15.85161765, 78.61982138, 1]},{'latlong': [15.85246304, 78.62031075, 1]},
    {'latlong': [15.85250474, 78.62034441, 1]}, {'latlong': [15.85221891, 78.62174507, 1]},{'latlong': [15.8523723, 78.62177758, 2]},
    {'latlong': [15.85236637, 78.62179098, 2]}, {'latlong': [15.85231281, 78.62238316, 2]},{'latlong': [15.8501259,78.62201676, 2]}]
df1 = pd.DataFrame(data1)
df1

df1 = df1['latlong'].apply(pd.Series)
df1.columns = ['lat', 'long', 'Id']
df1

How can I achieve this with Python ?

New to python. I tried following links... Could not understand how to apply it to my case. Splitting dictionary/list inside a Pandas Column into Separate Columns

python split data frame columns into multiple rows

4
  • the first link you have provided has the answer to your question, can you be more specific about what you don't understand? Commented May 16, 2017 at 5:33
  • @maxymoo: Here in my example case, each single cell (csv/ excel) has multiple pairs of lat, lngs and each pair of lat, lng is enclosed in {}. No clue, how do I separate each pair of lat, lng into to separate columns named lat, long. Thanks Commented May 19, 2017 at 12:25
  • df.latlong.apply(pd.Series) Commented May 20, 2017 at 23:17
  • @ maxymoo: Thanks for your response. I have tried this before posting my problem. This didn’t work in my case. Also, edited my problem with the code what I tried .... where it worked & it didn't. Thanks Commented May 23, 2017 at 11:10

1 Answer 1

0

Your data is in a very strange format ... the entries of latlong aren't actually valid JSON (there is a trailing comma at the end, and there are no quotes around the field names), so I would probably actually use a regular expression to split out the columns, and a list comprehension to split out the rows:

In [39]: pd.DataFrame(
             [{'Id':r['Id'], 'lat':lat, 'long':long}  
                for r in data 
                for lat,long in re.findall("lat: ([\d.]+).*?lng: ([\d.]+)",
                                           r['latlong'])])
Out[39]:
   Id          lat         long
0   1  15.85173248   78.6216129
1   1  15.85161765  78.61982138
2   1  15.85246304  78.62031075
3   1  15.85250474  78.62034441
4   1  15.85221891  78.62174507
5   2   15.8523723  78.62177758
6   2  15.85236637  78.62179098
7   2  15.85231281  78.62238316
8   2   15.8501259  78.62201676
Sign up to request clarification or add additional context in comments.

2 Comments

@ maxymoo: Thanks. It gives what I want. Being a beginner the solution looks complex for me. I need to learn how the code is working.
read about list comprehension and regular expressions, you'll be able to work it out

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.