2

Context

Lets say I have a pandas-DataFrame like this:

>>> data.head()
                            values  atTime
date        
2006-07-01 00:00:00+02:00   15.10   0000
2006-07-01 00:15:00+02:00   16.10   0015
2006-07-01 00:30:00+02:00   17.75   0030
2006-07-01 00:45:00+02:00   17.35   0045
2006-07-01 01:00:00+02:00   17.25   0100

atTime represents the hour and minute of the timestamp used as index. I want to transpose the atTime-column to a binary matrix (making it sparse is also an option), which will be used as nominal feature in a machine learning approach.

The desired result should look like:

>>> data.head()
                            values  0000  0015  0030  0045  0000
date        
2006-07-01 00:00:00+02:00   15.10   1     0     0     0     0
2006-07-01 00:15:00+02:00   16.10   0     1     0     0     0
2006-07-01 00:30:00+02:00   17.75   0     0     1     0     0
2006-07-01 00:45:00+02:00   17.35   0     0     0     1     0
2006-07-01 01:00:00+02:00   17.25   0     0     0     0     1

As might be anticipated, this matrix will be much larger when concidering all values in atTime.


My question

I can achieve the desired result with workarounds using apply and using the timestamps in order to create the new columns beforehand.

However, is there a build-in option in pandas (or via numpy, concidering atTime as numpy-array) to achieve the same without a workaround?

1 Answer 1

8

This is a use case for get_dummies:

pd.get_dummies(df, columns=["atTime"]) 
                           values  atTime_0  atTime_15  atTime_30  atTime_45  atTime_100
date                                                                                    
2006-07-01 00:00:00+02:00   15.10         1          0          0          0           0
2006-07-01 00:15:00+02:00   16.10         0          1          0          0           0
2006-07-01 00:30:00+02:00   17.75         0          0          1          0           0
2006-07-01 00:45:00+02:00   17.35         0          0          0          1           0
2006-07-01 01:00:00+02:00   17.25         0          0          0          0           1

Solution updated with OP's recommendation. Thanks!

Sign up to request clarification or add additional context in comments.

2 Comments

Oh.. I never thougth that something like this would be behind such a name.. Thank you very much for the fast reply!
Just to add since I needed this again: With the current pandas-package you can run pd.get_dummies(data, columns=["atTime"]) as well with the same result and saving the join :)

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.