0

I create a dataframe df_energy :

df_energy=pd.read_csv('C:/Users/Demonstrator/Downloads/power.csv', delimiter=';', parse_dates=[0], infer_datetime_format = True)

with this structure :

 df_energy.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 43229 entries, 0 to 43228
Data columns (total 6 columns):
TIMESTAMP        43229 non-null datetime64[ns]
P_ACT_KW         40376 non-null float64
PERIODE_TARIF    43209 non-null object
P_SOUSCR         37501 non-null float64
SITE             43229 non-null object
TARIF            43229 non-null object
dtypes: datetime64[ns](1), float64(2), object(3)
memory usage: 2.0+ MB


TIMESTAMP P_ACT_KW PERIODE_TARIF P_SOUSCR SITE TARIF 
2015-07-31 23:00:00 12.0 HC NaN ST GEREON TURPE_HTA5 
2015-07-31 23:10:00 466.0 HC 425.0 ST GEREON TURPE_HTA5 
2015-07-31 23:20:00 18.0 HC 425.0 ST GEREON TURPE_HTA5 
2015-07-31 23:30:00 17.0 HC 425.0 ST GEREON TURPE_HTA5

As I am starting learning python, I would like to know can I add three new columns : High_energy, Medium_energy and low_energy .

High_energy contains the P_ACT_KW value if P_ACT_KW > 400, Medium_energy contains the P_ACT_KW value if P_ACT_KW is between 200 and 400, Low_energy contains the P_ACT_KW value if P_ACT_KW < 200. For example :

TIMESTAMP P_ACT_KW PERIODE_TARIF P_SOUSCR SITE TARIF High_energy Medium_energy Low_energy
2015-07-31 23:00:00 12.0 HC NaN ST GEREON TURPE_HTA5 0 0 12
2015-07-31 23:10:00 466.0 HC 425.0 ST GEREON TURPE_HTA5 466 0 0
2015-07-31 23:20:00 18.0 HC 425.0 ST GEREON TURPE_HTA5 0 0 18
2015-07-31 23:30:00 17.0 HC 425.0 ST GEREON TURPE_HTA5 0 0 17

Thank you

Kind regards

1
  • No they are 6 : 2015-07-31 23:30:00 | 17.0 | HC | 425.0 | ST GEREON | TURPE_HTA5 Commented Aug 23, 2016 at 18:02

1 Answer 1

2

you can use np.where from numpy as:
Sample df:

Out[71]: 
             TIMESTAMP  P_ACT_KW PERIODE_TARIF  P_SOUSCR       SITE  \
0  2015-07-31 23:00:00      12.0            HC       NaN  ST GEREON   
1  2015-07-31 23:10:00     466.0            HC     425.0  ST GEREON   
2  2015-07-31 23:20:00      18.0            HC     425.0  ST GEREON   
3  2015-07-31 23:30:00      17.0            HC     425.0  ST GEREON   

        TARIF  
0  TURPE_HTA5  
1  TURPE_HTA5  
2  TURPE_HTA5  
3  TURPE_HTA5

df['high_energy']=np.where(df['P_ACT_KW']>400,df['P_ACT_KW'],0)

df['medium_energy']=np.where((df['P_ACT_KW']>200)&(df['P_ACT_KW']<400),df['P_ACT_KW'],0)

df['low_energy']=np.where(df['P_ACT_KW']<200,df['P_ACT_KW'],0)

Out[72]: 
             TIMESTAMP  P_ACT_KW PERIODE_TARIF  P_SOUSCR       SITE  \
0  2015-07-31 23:00:00      12.0            HC       NaN  ST GEREON   
1  2015-07-31 23:10:00     466.0            HC     425.0  ST GEREON   
2  2015-07-31 23:20:00      18.0            HC     425.0  ST GEREON   
3  2015-07-31 23:30:00      17.0            HC     425.0  ST GEREON   

        TARIF  high_energy  medium_energy  low_energy  
0  TURPE_HTA5          0.0            0.0        12.0  
1  TURPE_HTA5        466.0            0.0         0.0  
2  TURPE_HTA5          0.0            0.0        18.0  
3  TURPE_HTA5          0.0            0.0        17.0 
Sign up to request clarification or add additional context in comments.

2 Comments

Another question please : Is it possible to use "np.where" for more than 1 condition , for example if (x< ..) else if (y <...), else .....?
yes it is possible using nested np.where like np.where(cond,value,np.where(cond1,value,value)).

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.