I have the following code.

I want to go through the 'outlierdataframe' dataframe row by row and explode the values in the 'x' and 'y' columns.

For each exploded row, I then want to store this exploded row as its own dataframe, with columns 'newID', 'x' and 'y'.

However, the following code prints everything in one column, instead of putting the exploded 'x' values in one column and the exploded 'y' values in another.

I would be so grateful for a helping hand!

individualframe = outlierdataframe.iloc[0]
individualoutliers = individualframe.explode(list('xy'))
newframe = pd.DataFrame(individualoutliers)
print(newframe)

indexing first line of outlier dataframe:

outlierdataframe.iloc[0]

index                                                      24
subID                                         Prolific_610020
level                                                       1
complete                                                False
duration                                            20.015686
map_view                                            12.299759
distance                                           203.426697
x           [55, 55, 55, 60, 60, 60, 65, 70, 70, 75, 80, 8...
y           [60, 60, 60, 60, 65, 65, 70, 70, 75, 75, 80, 8...
r           [10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 1...
batch                                                       1
newID                                                  610020
Name: 24, dtype: object

newframe = pd.DataFrame(individualoutliers)
print(newframe)

                24
0                 24
1    Prolific_610020
2                  1
3              False
4          20.015686
..               ...
121               55
122               55
123               55
124                1
125           610020

2 Answers

You can use pandas.DataFrame.apply with pandas.Series.explode to explode the selected list columns (e.g., x and y).

Try this:

out = (
        df
          .loc[:, ["newID", "x", "y"]]
          .apply(lambda x: pd.Series(x).explode())
      )

# Output :

print(out)

    newID    x    y
0  610020  100   60
0  610020   55   60
0  610020   55   60
0  610020   60   60
0  610020   60   65
0  610020   60   65
0  610020   65   70
0  610020   70   70
0  610020   70   75
0  610020   75   75
0  610020   80   80
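
As a side note on why the code in the question misbehaves: `outlierdataframe.iloc[0]` returns a Series, and `Series.explode` has no column argument, so every list in the row appears to be flattened into one long column. A minimal sketch of an alternative, assuming pandas 1.3 or later (where `DataFrame.explode` accepts a list of columns) and assuming the lists in 'x' and 'y' have the same length within a row. The small frame below is made-up stand-in data, not the real `outlierdataframe`:

```python
import pandas as pd

# Hypothetical stand-in for outlierdataframe; the values are illustrative only.
df = pd.DataFrame({
    "newID": [610020, 610021],
    "x": [[55, 55, 60], [10, 20]],
    "y": [[60, 60, 65], [15, 25]],
})

# Double brackets keep a one-row DataFrame; iloc[0] would return a Series.
first_row = df.iloc[[0]]

# Multi-column explode (pandas >= 1.3) pairs up x and y element by element.
exploded = first_row[["newID", "x", "y"]].explode(["x", "y"], ignore_index=True)
print(exploded)
```

Calling `explode` on `df` itself instead of `first_row` would process all rows in one go.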

If you need to assign a separate dataframe (with a pattern name, df_newID) to each group, use this:

for k, g in out.groupby("newID"):
    globals()['df_' + str(k)] = g
    
print(df_610020, type(df_610020))

    newID    x    y
0  610020  100   60
0  610020   55   60
0  610020   55   60
0  610020   60   60
0  610020   60   65
0  610020   60   65
0  610020   65   70
0  610020   70   70
0  610020   70   75
0  610020   75   75
0  610020   80   80 <class 'pandas.core.frame.DataFrame'>
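
The `globals()` loop above creates module-level variables from strings. A dictionary keyed by newID is a common alternative that keeps the frames together and easy to loop over. A rough sketch, where `out` is a made-up stand-in for the exploded frame and `frames` is a name chosen here for illustration:

```python
import pandas as pd

# Hypothetical exploded frame standing in for `out`; values are illustrative only.
out = pd.DataFrame({
    "newID": [610020, 610020, 610021],
    "x": [55, 60, 10],
    "y": [60, 65, 15],
})

# One DataFrame per newID, stored under the ID rather than a generated variable name.
frames = {
    new_id: group.reset_index(drop=True)
    for new_id, group in out.groupby("newID")
}

print(frames[610020])
```

Each frame is then looked up as `frames[610020]` instead of `df_610020`.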

2 Comments

Thanks so much - that works too! :) What is the need for the 'globals()'?
To create a variable with a name based on a string. And that’s what you need, you need a single dataframe for each newID.
The following solution works:

individualframe = outlierdataframe.iloc[0]
# Series.explode takes no column name; ignore_index=True gives a fresh 0..n-1 index
individualoutliers1 = pd.DataFrame(individualframe[['x']].explode(ignore_index=True))
individualoutliers2 = pd.DataFrame(individualframe[['y']].explode(ignore_index=True))
newID = individualframe['newID']
newframe = pd.concat([individualoutliers1, individualoutliers2], axis=1)
# Both columns carry the row label as their name, so set the names by position
# (renaming through a dict would map the same key twice and yield two 'y' columns)
newframe.columns = ['x', 'y']
newframe['newID'] = newID
print(newframe)

      x    y   newID
0    55   60  610020
1    55   60  610020
2    55   60  610020
3    60   60  610020
4    60   65  610020
5    60   65  610020
6    65   70  610020
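
For the row-by-row goal stated in the question, each frame can also be built directly from the row, since the `pd.DataFrame` constructor broadcasts a scalar against list columns. This is only a sketch: the data is made up, `per_row_frames` is a hypothetical name, and it assumes 'x' and 'y' have equal length in every row (the constructor raises a ValueError otherwise).

```python
import pandas as pd

# Hypothetical stand-in for outlierdataframe; the values are illustrative only.
outlierdataframe = pd.DataFrame({
    "newID": [610020, 610021],
    "x": [[55, 55, 60], [10, 20]],
    "y": [[60, 60, 65], [15, 25]],
})

per_row_frames = []
for _, row in outlierdataframe.iterrows():
    # The scalar newID is repeated to match the length of the x and y lists.
    frame = pd.DataFrame({"newID": row["newID"], "x": row["x"], "y": row["y"]})
    per_row_frames.append(frame)

print(per_row_frames[0])
```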
