How to combine rows with the same timestamp?

Question

I'm trying to combine all rows of a DataFrame that have the same timestamp into a single row. The DataFrame is 5k rows by 20 columns.

             A      B      ...
 timestamp
    11:00    NaN    10     ...
    11:00    5      NaN    ...
    12:00    15     20     ...
    ...      ...    ...

I would like to group the two 11:00 rows as follows:

             A      B        ...
timestamp
    11:00    5      10       ...
    12:00    15     20       ...
    ...      ...    ...

Any help would be appreciated. Thank you.

I have tried:

df.groupby(df.index).sum()

Are you asking what happens if the NaNs in the example above were values instead? In my case, for each unique timestamp (the two 11:00 rows in the example above) there will only be one value per column. I initially tried a groupby on the index followed by sum, but this left me with all NaNs.
– pmdaly, commented May 28, 2015 at 20:41

selwyth · Accepted Answer · 2015-05-29 01:39:01Z

You could melt ('unpivot') the DataFrame to convert it from wide form to long form, remove the null values, then aggregate via groupby.

import pandas as pd

df = pd.DataFrame({
    'timestamp': ['11:00', '11:00', '12:00'],
    'A': [None, 5, 15],
    'B': [10, None, 20],
})

    A   B   timestamp
0   NaN 10  11:00
1   5   NaN 11:00
2   15  20  12:00

df2 = pd.melt(df, id_vars='timestamp')  # specify the value_vars if needed

    timestamp   variable    value
0   11:00       A           NaN
1   11:00       A           5
2   12:00       A           15
3   11:00       B           10
4   11:00       B           NaN
5   12:00       B           20

# drop the rows whose value is null, then sum within each (timestamp, variable) pair
df2.dropna(inplace=True)
df3 = df2.groupby(['timestamp', 'variable']).sum()

                        value
timestamp   variable    
11:00       A           5
            B           10
12:00       A           15
            B           20

df3.unstack()

            value
variable    A   B
timestamp       
11:00       5   10
12:00       15  20
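The steps above can also be chained into a single expression. The following is a minimal sketch using the same toy data; the name `combined` is introduced here for illustration, and selecting the `value` column before summing is a small variation that leaves plain `A`/`B` columns after `unstack()` rather than a two-level column header.

```python
import pandas as pd

df = pd.DataFrame({
    'timestamp': ['11:00', '11:00', '12:00'],
    'A': [None, 5, 15],
    'B': [10, None, 20],
})

combined = (
    pd.melt(df, id_vars='timestamp')        # wide -> long
      .dropna()                             # drop the null entries
      .groupby(['timestamp', 'variable'])['value']
      .sum()                                # one value per (timestamp, variable)
      .unstack()                            # long -> wide again
)
print(combined)
```

The result is indexed by `timestamp`, with one column per original variable.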

selwyth · Accepted Answer · 2015-05-29 01:44:02Z


Alternatively, use groupby after replacing the NaN values with 0s.

df.fillna(0, inplace=True)
df.groupby(df.index).sum()
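As a self-contained sketch of this approach, assuming the timestamps are the index as in the question (the toy data and the name `result` are illustrative only):

```python
import pandas as pd

# Toy frame mirroring the question: timestamps in the index, NaN where a value is missing.
df = pd.DataFrame(
    {'A': [None, 5, 15], 'B': [10, None, 20]},
    index=pd.Index(['11:00', '11:00', '12:00'], name='timestamp'),
)

# Non-inplace variant: fill the gaps with 0, then sum rows sharing an index label.
result = df.fillna(0).groupby(level=0).sum()
print(result)
```

One caveat: because the gaps are filled with 0 first, a timestamp whose column is entirely NaN comes out as 0 rather than NaN.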


Alexander · Accepted Answer · 2015-05-28 21:08:04Z


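Try using resample (this requires a datetime-like index; in current pandas the aggregation is a method call, since the older how= argument has been removed):

>>> df.resample('60Min').sum()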
                      A   B
2015-05-28 11:00:00   5  10
2015-05-28 12:00:00  15  20

More examples can be found in the Pandas Documentation.
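A runnable sketch of the resample approach, under the assumption that the timestamps can be parsed into a DatetimeIndex (the date 2015-05-28 is taken from the output above; the name `hourly` is illustrative):

```python
import pandas as pd

# resample needs a datetime-like index, so build one explicitly.
idx = pd.to_datetime([
    '2015-05-28 11:00',
    '2015-05-28 11:00',
    '2015-05-28 12:00',
])
df = pd.DataFrame({'A': [None, 5, 15], 'B': [10, None, 20]}, index=idx)

# Bucket rows into 60-minute bins and sum each bin (NaN is skipped by default).
hourly = df.resample('60min').sum()
print(hourly)
```

Note that resample also emits a row for every empty bin between the first and last timestamp, which groupby on the index does not.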


J.J · Accepted Answer · 2015-05-28 20:56:35Z


Adding a number to a NaN in Python gives NaN, not the number (although pandas' sum() skips NaN by default). You probably need to use .aggregate() :)
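The answer does not say which aggregation to pass. One possibility, offered here as an assumption rather than as what the answerer had in mind, is `'first'`, which takes the first non-null entry of each column within a group. That fits the asker's statement that each timestamp has only one value per column. The toy data and the name `combined` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {'A': [None, 5, 15], 'B': [10, None, 20]},
    index=pd.Index(['11:00', '11:00', '12:00'], name='timestamp'),
)

# 'first' picks the first non-null value per column in each group,
# so the two 11:00 rows collapse into one without any arithmetic.
combined = df.groupby(level=0).aggregate('first')
print(combined)
```

Unlike a sum, this leaves a column as NaN when a timestamp has no value for it at all.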


Yeah, I've been messing around with aggregate also, but I can't seem to figure it out.
– pmdaly
