I have a dataframe with several columns that each contain a list. I want to split these lists into separate columns. I found a related question on Stack Overflow, but it seems to split the list in only one column, whereas I want to apply this to multiple columns whose lists hold unequal numbers of items.

My df looks something like this:

     ID |  value_0  |  value_1  |  value_2  | value_3   | value_4
0   1001|[1001,1002]|   None    |   None    |   None    |  None 
1   1010|[1010,2001]|[2526,1000]|   None    |   None    |  None  
2   1100|[1234,5678]|[9101,1121]|[3141,5161]|[1718,1920]|[2122,2324]

I want to transform it to:

     ID | 0  | 1  |  2   |  3   | 4
0   1001|1001|1002| None | None | None 
1   1010|1010|2001| 2526 | 1000 | None  
2   1100|1234|5678| 9101 | 1121 | 3141 ....etc.

Currently this is my code, but it only outputs a dataframe containing "None" values. I'm not sure how to fix it, because it seems to pick up only the last column without actually splitting the list.

length = len(list(df.columns.values))-1

for i in range(length):
    temp = "value_" + str(i)
    x = df[temp]
    new_df = pd.DataFrame(df[temp].values.tolist())

The new_df that I get is:

   | 0
  0| None
  1| None
  2| [2122,2324]

However, if I focus on only one column (i.e. value_0), it splits the list just fine.

new_df = pd.DataFrame(df['value_0'].values.tolist())

Any help is very much appreciated

2 Answers

The idea is to reshape the values with DataFrame.stack, which removes the None values so that the DataFrame constructor can be used, then reshape back with Series.unstack, sort the columns, and set default column names:

import ast
import numpy as np
import pandas as pd

# if the columns hold strings instead of lists, parse them first
#df.iloc[:, 1:] = df.iloc[:, 1:].applymap(ast.literal_eval)

s = df.set_index('ID', append=True).stack()

df = pd.DataFrame(s.values.tolist(), index=s.index).unstack().sort_index(axis=1, level=1)
df.columns = np.arange(len(df.columns))

df = df.reset_index(level=1)
print (df)
     ID       0       1       2       3       4       5       6       7  \
0  1001  1001.0  1002.0     NaN     NaN     NaN     NaN     NaN     NaN   
1  1010  1010.0  2001.0  2526.0  1000.0     NaN     NaN     NaN     NaN   
2  1100  1234.0  5678.0  9101.0  1121.0  3141.0  5161.0  1718.0  1920.0   

        8       9  
0     NaN     NaN  
1     NaN     NaN  
2  2122.0  2324.0  

For pandas 0.24+, to keep integers alongside missing values, cast to the nullable Int64 dtype in place of the plain reset_index step above:

df = df.astype('Int64').reset_index(level=1)
print (df)
     ID     0     1     2     3     4     5     6     7     8     9
0  1001  1001  1002   NaN   NaN   NaN   NaN   NaN   NaN   NaN   NaN
1  1010  1010  2001  2526  1000   NaN   NaN   NaN   NaN   NaN   NaN
2  1100  1234  5678  9101  1121  3141  5161  1718  1920  2122  2324
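
For anyone who wants to run the steps above from start to finish, here is a minimal sketch on a sample frame. The frame is hand-built to mirror the question's layout and the name `wide` is only illustrative. The explicit `dropna()` is an addition that does not rely on `stack` removing the None cells by itself.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame mirroring the layout in the question.
df = pd.DataFrame({
    "ID": [1001, 1010, 1100],
    "value_0": [[1001, 1002], [1010, 2001], [1234, 5678]],
    "value_1": [None, [2526, 1000], [9101, 1121]],
    "value_2": [None, None, [3141, 5161]],
    "value_3": [None, None, [1718, 1920]],
    "value_4": [None, None, [2122, 2324]],
})

# Long format: one list per (row, ID, column) entry.
# dropna() removes the None cells explicitly.
s = df.set_index("ID", append=True).stack().dropna()

# Expand every list into columns, then move the value_* level back to columns.
wide = (
    pd.DataFrame(s.values.tolist(), index=s.index)
    .unstack()
    .sort_index(axis=1, level=1)
)

# Replace the two-level column labels with 0..n-1.
wide.columns = np.arange(len(wide.columns))

# Nullable integers keep whole numbers next to missing values.
wide = wide.astype("Int64").reset_index(level=1)
print(wide)
```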
2 Comments

Hi, when I tried to view the whole dataframe, there seem to be NaN values in between columns that have a value. For example, there would be a value in columns 0 and 1, then NaN in columns 3-4, then values again in 5-6. How can I remove these NaN values in between?
@Funky - Sorry, I was offline, so I have added an answer now. Btw, the accepted answer is not recommended - check this. It is fine only if performance is not important or the DataFrame is small.
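
One possible way to close such gaps, which is not necessarily what the answerer had in mind, is to move the non-missing values of each row to the left. This is a minimal sketch: the helper `shift_left` and the small frame `wide` are hypothetical, and a row-wise `apply` is likely to be slow on large frames.

```python
import numpy as np
import pandas as pd

def shift_left(row: pd.Series) -> pd.Series:
    """Move the non-missing values of a row to the front, keeping their order."""
    kept = row.dropna().tolist()
    padded = kept + [pd.NA] * (len(row) - len(kept))
    return pd.Series(padded, index=row.index)

# Hypothetical widened frame with a gap between filled columns.
wide = pd.DataFrame(
    [[1, 2, np.nan, 3],
     [5, np.nan, np.nan, 7]]
).astype("Int64")

compact = wide.apply(shift_left, axis=1).astype("Int64")

# Drop columns that are now entirely missing.
compact = compact.dropna(axis=1, how="all")
print(compact)
```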
First, use pd.concat and pd.Series to expand each list into separate columns and append them to the original df, then drop the original columns:

for i in df.columns:
    df = pd.concat([df, df[i].apply(pd.Series)], axis=1)

df.drop(['ID','value_0','value_1','value_2','value_3','value_4'], axis=1, inplace=True)

Output

          0     0     1       0       1       0       1       0       1  \
   0   1001  1001  1002     NaN     NaN     NaN     NaN     NaN     NaN   
   1   1010  1010  2001  2526.0  1000.0     NaN     NaN     NaN     NaN   
   2   1100  1234  5678  9101.0  1121.0  3141.0  5161.0  1718.0  1920.0   

           0       1  
   0     NaN     NaN  
   1     NaN     NaN  
   2  2122.0  2324.0 
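
The output above has repeated column labels (0, 1, 0, 1, ...). A different technique that avoids this is to flatten each row's lists in plain Python and build the frame once. This is a minimal sketch, not the answerer's method. The sample frame and the names `flatten`, `flat` and `result` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample frame mirroring the layout in the question.
df = pd.DataFrame({
    "ID": [1001, 1010, 1100],
    "value_0": [[1001, 1002], [1010, 2001], [1234, 5678]],
    "value_1": [None, [2526, 1000], [9101, 1121]],
    "value_2": [None, None, [3141, 5161]],
    "value_3": [None, None, [1718, 1920]],
    "value_4": [None, None, [2122, 2324]],
})

value_cols = [c for c in df.columns if c.startswith("value_")]

def flatten(cells):
    """Concatenate the list cells of one row, skipping None cells."""
    out = []
    for cell in cells:
        if isinstance(cell, list):
            out.extend(cell)
    return out

rows = [flatten(r) for r in df[value_cols].itertuples(index=False)]

# Rows of unequal length are padded with missing values by the constructor.
flat = pd.DataFrame(rows, index=df.index).astype("Int64")
result = pd.concat([df[["ID"]], flat], axis=1)
print(result)
```

Because None cells are skipped while flattening, each row's values end up contiguous from column 0 with no missing values in between.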
