6

I have a DataFrame like this:

Name asn
Org1 asn1,asn2
org2 asn3
org3 asn4,asn5

I would like to convert my DataFrame to look like this:

Name asn
Org1 asn1
Org1 asn2
org2 asn3
org3 asn4
org3 asn5

Does anybody know how I can do that?

2 Answers

6

Assuming your starting DataFrame is named df, you could write:

>>> df2 = df.asn.str.split(',').apply(pd.Series)          # break df.asn into columns
>>> df2.index = df.Name                                   # set the index as df.Name
>>> df2 = df2.stack().reset_index('Name')                 # stack and reset_index
>>> df2
    Name       0
0   Org1    asn1
1   Org1    asn2
0   org2    asn3
0   org3    asn4
1   org3    asn5

All that's left to do is rename the column:

df2.rename(columns={0: 'asn'}, inplace=True)

Depending on your next move, you may also want to set a more useful index.
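For reference, here is a self-contained sketch of the steps above, rebuilt from the sample data in the question. Two things in it are my own substitutions rather than part of the original answer: it uses str.split(..., expand=True) in place of apply(pd.Series), and it adds an explicit dropna() so the padding cells are removed regardless of how the installed pandas version handles missing values in stack(). The final reset_index(drop=True) is just one possible choice of "more useful index".

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Org1", "org2", "org3"],
    "asn": ["asn1,asn2", "asn3", "asn4,asn5"],
})

# Break df.asn into one column per value (assumption: expand=True is
# used here instead of apply(pd.Series); shorter rows are padded).
wide = df["asn"].str.split(",", expand=True)
wide.index = df["Name"]                  # set the index as df.Name

df2 = (
    wide.stack()                         # one row per (Name, position)
        .dropna()                        # drop padding cells explicitly
        .rename("asn")                   # name the values column up front
        .reset_index("Name")             # move Name back into a column
        .reset_index(drop=True)          # plain 0..n-1 index
)

print(df2)
```

Naming the stacked Series before calling reset_index means the separate rename step is not needed in this variant.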

3 Comments

Nice. You can also avoid the drop('level_1', axis=1) by using reset_index('Name').
Thanks @unutbu, that looks a lot neater.
@ajcr Thanks. One question: what if I have three columns, with a third column that I would like to be treated like the 'Name' column?
3

Just spent hours dealing with this and discovered that the explode function is a much simpler solution.

First replace the strings in the multi-valued cells with lists like this:

    asn_lists = df.asn.str.split(',')         # split strings into list
    df.asn = asn_lists                        # replace strings with lists in the dataframe

And then just use the explode function:

    df2 = df.explode('asn')                   # explode based on the asn column

This solution will also work for larger DataFrames with extra columns, because explode repeats the values of every other column for each new row.
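As a minimal sketch of the extra-column case, here is the same approach on the question's data with a third column added. The "country" column and its values are hypothetical, made up purely for illustration. The sketch uses assign so the original df is left unmodified, and it assumes a pandas version that provides DataFrame.explode (0.25 or later).

```python
import pandas as pd

# Sample data from the question, plus a hypothetical third column.
df = pd.DataFrame({
    "Name": ["Org1", "org2", "org3"],
    "asn": ["asn1,asn2", "asn3", "asn4,asn5"],
    "country": ["US", "DE", "FR"],       # hypothetical, for illustration only
})

df2 = (
    df.assign(asn=df["asn"].str.split(","))  # strings -> lists, df untouched
      .explode("asn")                        # one row per list element
      .reset_index(drop=True)                # replace the repeated index labels
)

print(df2)
```

Each exploded row keeps the Name and country values of the row it came from, so no separate handling of the third column is needed.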
