Create a new column based on previous row value and delete the current row

Question

I have an input dataframe that can be generated with the code below:

  import pandas as pd

  df = pd.DataFrame({'subjectID': [1, 1, 2, 2],
                     'keys': ['H1Date', 'H1', 'H2Date', 'H2'],
                     'Values': ['10/30/2006', 4, '8/21/2006', 6.4]})

The input dataframe looks like this:

     subjectID    keys      Values
  0          1  H1Date  10/30/2006
  1          1      H1           4
  2          2  H2Date   8/21/2006
  3          2      H2         6.4

This is what I tried:

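# Stack every non-index column into long form; the original column labels end up in 'level_1'
s1 = df.set_index('subjectID').stack().reset_index()
s1.rename(columns={0: 'values'}, inplace=True)

# Split rows whose stacked label contains 'Date' from the rest (.copy() avoids SettingWithCopyWarning below)
d1 = s1[s1['level_1'].str.contains('Date')].copy()
d2 = s1[~s1['level_1'].str.contains('Date')].copy()

# Running position within each subject, used to pair the nth row of d1 with the nth row of d2
d1['g'] = d1.groupby('subjectID').cumcount()
d2['g'] = d2.groupby('subjectID').cumcount()

d3 = pd.merge(d1, d2, on=['subjectID', 'g'], how='left').drop(['g', 'level_1_x', 'level_1_y'], axis=1)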

Though it works, I am afraid this may not be the best approach, since we might have more than 200 columns and 50k records. Any help improving my code would be appreciated.

I expect my output dataframe to look like the one shown below.

anky · Accepted Answer · 2019-06-29 07:47:27Z


Maybe something like:

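# Number the rows within each block; a block starts at every key containing 'Date'
s = df.groupby(df['keys'].str.contains('Date').cumsum()).cumcount() + 1

# Move each within-block position into its own set of columns, ordered by position
final = (df.assign(s=s.astype(str))
           .set_index(['subjectID', 's'])
           .unstack()
           .sort_values(by='s', axis=1))
# Flatten the two-level column labels, e.g. ('keys', '1') -> 'keys1'
final.columns = final.columns.map(''.join)
print(final)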

           keys1     Values1 keys2 Values2
subjectID                                  
1          H1Date  10/30/2006    H1       4
2          H2Date   8/21/2006    H2     6.4
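
To make the intermediate steps easier to follow, here is a minimal, self-contained sketch of the same idea on the sample data. The names `block`, `pos`, `ordered` and `wide` are only illustrative, and the explicit column ordering is a swapped-in alternative to the `sort_values(by='s', axis=1)` call above, not part of the original answer. It assumes, as in the sample, that each `subjectID` has only one block starting with a `Date` key; if a subject had several such blocks, the (`subjectID`, `pos`) pairs would repeat and `unstack()` would raise on the duplicate index entries.

```python
import pandas as pd

df = pd.DataFrame({
    "subjectID": [1, 1, 2, 2],
    "keys": ["H1Date", "H1", "H2Date", "H2"],
    "Values": ["10/30/2006", 4, "8/21/2006", 6.4],
})

# A row whose key contains "Date" opens a new block; cumsum numbers the blocks.
block = df["keys"].str.contains("Date").cumsum()
# 1-based position of each row inside its block.
pos = df.groupby(block).cumcount() + 1

print(block.tolist())  # [1, 1, 2, 2]
print(pos.tolist())    # [1, 2, 1, 2]

# One row per subject; every original column is repeated once per position.
wide = df.assign(pos=pos).set_index(["subjectID", "pos"]).unstack()

# Spell out the column order instead of sorting: position first, then keys/Values.
ordered = [(name, p) for p in sorted(set(pos.tolist())) for name in ("keys", "Values")]
wide = wide[ordered]

# Flatten the two-level labels, e.g. ("keys", 1) -> "keys1".
wide.columns = [f"{name}{p}" for name, p in wide.columns]

print(wide.columns.tolist())  # ['keys1', 'Values1', 'keys2', 'Values2']
```

Keeping `pos` as an integer (rather than casting to `str`) means positions beyond 9 still sort numerically when the column order is built.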
