How are you?

I have a dataset where some rows contain more than one product, separated by commas, as in the example below (there are other columns, but to keep it simple I only included these three).

id  produdct                      value
47  product1, product 2           12000.0
48  product3                      48000.0
49  product4, product1, product2  28800.0
50  product1                      2000.0
51  product5, product2            32000.0
53  product3                      128000.0
54  product2                      15000.0
55  product4, product2, product5  96000.0

I need to split the products so that each one gets its own copy of the row. I tried functions such as explode and json_normalize, and I tried creating a list of lists, but nothing worked. Can you help me?

1 Answer

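Just use str.split and explode:

# Turn each comma-separated string into a list of product names
df['produdct'] = df['produdct'].str.split(', ')
# Emit one row per list element; the other columns are repeated
new_df = df.explode('produdct')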

   id   produdct     value
0  47   product1   12000.0
0  47  product 2   12000.0
1  48   product3   48000.0
2  49   product4   28800.0
2  49   product1   28800.0
2  49   product2   28800.0
3  50   product1    2000.0
4  51   product5   32000.0
4  51   product2   32000.0
5  53   product3  128000.0
6  54   product2   15000.0
7  55   product4   96000.0
7  55   product2   96000.0
7  55   product5   96000.0
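
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the first three rows of the question's data by hand and keeps the column name `produdct` as spelled there. Splitting on a bare comma and stripping whitespace afterwards, and resetting the index, are my own additions rather than part of the answer above; they are only meant to tolerate inconsistent spacing after the commas and to avoid the repeated index labels.

```python
import pandas as pd

# First three rows of the question's data, typed in by hand
df = pd.DataFrame({
    "id": [47, 48, 49],
    "produdct": ["product1, product 2", "product3", "product4, product1, product2"],
    "value": [12000.0, 48000.0, 28800.0],
})

new_df = (
    df.assign(produdct=df["produdct"].str.split(","))  # string -> list of pieces
      .explode("produdct")                             # one row per list element
      .reset_index(drop=True)                          # replace the repeated index labels
)

# Remove the space left over after each comma
new_df["produdct"] = new_df["produdct"].str.strip()

print(new_df)
```

Using `assign` leaves the original `df` untouched, which may be handy if the unsplit column is still needed elsewhere.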

2 Comments

Dude, it worked perfectly, thank you very much!
No problem, and good luck. You were on the right track; explode just works on list-likes, i.e., lists, tuples, sets, Series, and np.ndarray. Your column currently holds strings, so you needed to convert them to lists before using explode.
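
A small sketch of that difference, using a made-up two-element Series (the values are hypothetical, not from the question): calling explode on plain strings leaves them as they are, while splitting first produces one row per item.

```python
import pandas as pd

s = pd.Series(["a, b", "c"])

# Strings are treated as scalars, so nothing is expanded here
unchanged = s.explode()

# After str.split each element is a list, so explode expands it
expanded = s.str.split(", ").explode()

print(unchanged.tolist())
print(expanded.tolist())
```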
