I have a CSV file in which a single "id" can have multiple "values" in a complex form, and I want to split those values into separate rows, each keeping its "id".

My CSV file:

# Read with: df1 = pandas.read_csv('krish.csv', encoding="ISO-8859-1")
# The file can also contain values like 1.50% (P,KR,AU) 0.2¢/kg (AX,AU)
id  value
100.3   Free (A+,BH,CA) 0.1¢/kg (AX)
200.1   Free (MA, MX,OM)
321.5   Free (BH,CA) 1.70% (P) 7% (PE) 12.3% (KR)

OUTPUT I WANT FOR MY INPUT GIVEN ABOVE:

[image: Required output]

OUTPUT I GOT FROM MY CODE, AND WHAT I TRIED:

[image: Required output]

  • Is it only the Free tag that can have multiple elements? Commented May 31, 2019 at 11:08
  • No, other tags can too, for example 1.70% (P,KR,AU). Commented May 31, 2019 at 11:15
  • OK, have a look at my answer and let me know if it fits your needs. Commented May 31, 2019 at 11:30
  • Hey SS, would you accept my answer if you don't have anything more to add? Thanks. Commented Jun 3, 2019 at 10:56

1 Answer

There are probably more efficient or elegant ways, but this should work:

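import pandas as pd

def split_elements(s):
    # s looks like 'Free (A+,BH,CA)': take the comma-separated codes inside the parentheses
    elements = s[s.find('(')+1:-1].split(',')
    # the text before '(' is the rate; strip it so the output has a single space
    key = s[:s.find('(')].strip()
    return ['{} ({})'.format(key, el) for el in elements]

input_data = {'values': ['Free (A+,BH,CA) 0.1¢/kg (AX)', 'Free (MA, MX,OM)', 'Free (BH,CA) 1.70% (P) 7% (PE) 12.3% (KR)'], 'ids': [100.3, 200.1, 321.5]}
df = pd.DataFrame(input_data)

temp_values = []
temp_ids = []
# iterate through rows
for idr, r in df.iterrows():
    # extract the 'rate (codes)' groups, skipping the empty piece left after the last ')'
    elements = [el.strip()+')' for el in r['values'].split(')') if el.strip() != '']
    # split subelements
    for element in elements:
        split_el = split_elements(element)
        temp_values.extend(split_el)
        # repeat the row's id once per split value
        temp_ids.extend([r['ids']]*len(split_el))
# create dataset
df1 = pd.DataFrame({'ids': temp_ids, 'values': temp_values})
# set_index returns a new DataFrame, so keep the result
df1 = df1.set_index('ids')
print(df1)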

Which gives

ids     values
100.3   Free (A+)
100.3   Free (BH)
100.3   Free (CA)
100.3   0.1¢/kg (AX)
200.1   Free (MA)
200.1   Free ( MX)
200.1   Free (OM)
321.5   Free (BH)
321.5   Free (CA)
321.5   1.70% (P)
321.5   7% (PE)
321.5   12.3% (KR)
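
As a possible alternative, the same splitting can be sketched with a regular expression instead of manual string slicing. This is only a sketch, not the method used above: the names GROUP_RE, split_value and expand_rows are made up for illustration, and it assumes every value is a sequence of "rate (code, code, ...)" groups with no nested parentheses. Unlike the code above, it also strips whitespace around each code, so "Free ( MX)" becomes "Free (MX)".

```python
import re

# Hypothetical helper names; assumes groups of the form "<rate> (<codes>)"
# with no nested parentheses, e.g. "Free (A+,BH,CA)" or "0.1¢/kg (AX)".
GROUP_RE = re.compile(r"\s*([^()]+?)\s*\(([^()]*)\)")


def split_value(value):
    """Return one '<rate> (<code>)' string per code found in `value`."""
    out = []
    for rate, codes in GROUP_RE.findall(value):
        for code in codes.split(","):
            code = code.strip()  # drop stray spaces such as in "MA, MX"
            if code:
                out.append("{} ({})".format(rate, code))
    return out


def expand_rows(ids, values):
    """Pair every split value with the id of the row it came from."""
    return [(i, part) for i, v in zip(ids, values) for part in split_value(v)]


ids = [100.3, 200.1, 321.5]
values = [
    "Free (A+,BH,CA) 0.1¢/kg (AX)",
    "Free (MA, MX,OM)",
    "Free (BH,CA) 1.70% (P) 7% (PE) 12.3% (KR)",
]
rows = expand_rows(ids, values)
for row in rows:
    print(row)
```

The resulting list of (id, value) tuples can be passed to pd.DataFrame(rows, columns=['ids', 'values']) if a DataFrame is needed.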
7 Comments

I have an ID column, as I mentioned in my input.
I don't understand what you mean.
Your code uses only my "value" column as input, but I need my "id" data to stay the same. I have just edited my id values to show what they look like; can you please have a look, sir?
Please have a look at my input, sir.
I still don't understand. The picture you posted after the message "OUTPUT I WANT FOR MY INPUT GIVEN ABOVE" shows what my code already produces. Update that picture to reflect the result you're looking for, and I'll update my answer accordingly.