
I have df:

                     Voltage
01-02-2017 00:00       13.1
01-02-2017 00:01       13.2
01-02-2017 00:02       13.3
01-02-2017 00:03       14.1
01-02-2017 00:04       14.3
01-02-2017 00:04       13.5

I would like the time (hh:mm) of the first row where the value in the Voltage column is >= 14.0. The 'Time of Full Charge' column should contain only that one time value.

                     Voltage   Time of Full Charge
01-02-2017 00:00       13.1
01-02-2017 00:01       13.2
01-02-2017 00:02       13.3
01-02-2017 00:03       14.1         00:03
01-02-2017 00:04       14.3
01-02-2017 00:04       13.5

I am trying something along these lines, but cannot figure it out:

df.index = pd.to_datetime(df.index)
df.['Time of Full Charge'] = np.where(df.['Voltage'] >= 14.0), (df.index.hour:df.index.minute))    

2 Answers

Use idxmax to get the index label of the first row that satisfies the condition. The only requirement is that the index be unique:

mask = df['Voltage'] >= 14.0
# label of the first True in the boolean mask
idx = mask.idxmax()
df.loc[idx, 'Time of Full Charge'] = idx.strftime('%H:%M')
print (df)
                     Voltage Time of Full Charge
2017-01-02 00:00:00     13.1                 NaN
2017-01-02 00:01:00     13.2                 NaN
2017-01-02 00:02:00     13.3                 NaN
2017-01-02 00:03:00     14.1               00:03
2017-01-02 00:04:00     14.3                 NaN
2017-01-02 00:04:00     13.5                 NaN

Or:

idx = (df['Voltage'] >= 14.0).idxmax()
df['Time of Full Charge'] = np.where(df.index == idx, idx.strftime('%H:%M'), '')
print (df)
                     Voltage Time of Full Charge
2017-01-02 00:00:00     13.1                    
2017-01-02 00:01:00     13.2                    
2017-01-02 00:02:00     13.3                    
2017-01-02 00:03:00     14.1               00:03
2017-01-02 00:04:00     14.3                    
2017-01-02 00:04:00     13.5     

For a non-unique index, it is possible to use a MultiIndex:

df.index = [np.arange(len(df.index)), df.index]

idx = (df['Voltage'] >= 14.0).idxmax()
df['Time of Full Charge'] = np.where(df.index.get_level_values(0) == idx[0], 
                                     idx[1].strftime('%H:%M'),
                                     '')

df.index = df.index.droplevel(0)
print (df)
                     Voltage Time of Full Charge
2017-01-02 00:00:00     13.1                    
2017-01-02 00:01:00     13.2                    
2017-01-02 00:02:00     13.3                    
2017-01-02 00:03:00     14.1               00:03
2017-01-02 00:04:00     14.3                    
2017-01-02 00:04:00     13.5                    
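
As a hedged alternative to the MultiIndex round trip, the sketch below works by integer position instead of by label, so duplicated timestamps do not matter. It is not part of the original answer: it swaps in NumPy's argmax on the boolean mask plus iloc, and the names mask, pos and col are only illustrative. It also guards against the case where no row reaches 14.0, because idxmax and argmax return the first position when every value is False.

```python
import pandas as pd

# Sample data from the question, including the duplicated 00:04 timestamp.
df = pd.DataFrame(
    {"Voltage": [13.1, 13.2, 13.3, 14.1, 14.3, 13.5]},
    index=pd.to_datetime(
        ["2017-01-02 00:00", "2017-01-02 00:01", "2017-01-02 00:02",
         "2017-01-02 00:03", "2017-01-02 00:04", "2017-01-02 00:04"]
    ),
)

# Boolean mask as a plain NumPy array: True where the threshold is reached.
mask = (df["Voltage"] >= 14.0).to_numpy()
df["Time of Full Charge"] = ""

# argmax returns 0 when nothing matches, so check for a match first.
if mask.any():
    pos = int(mask.argmax())  # integer position of the first True
    col = df.columns.get_loc("Time of Full Charge")
    df.iloc[pos, col] = df.index[pos].strftime("%H:%M")

print(df)
```

Only the 00:03 row is filled in; if the threshold is never reached, the column stays empty rather than marking the first row by accident.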

8 Comments

Thanks @jezrael. I only need the first instance of when that column reaches 14 or above (there should only be one value in the new column). Is this possible?
Yes, index is essentially a 24 hour day, so will be unique. thanks!
Shouldn't this be idxmin() instead of idxmax()? When you write df['Voltage'] >= 14, only those rows with values greater than or equal to 14 will be present. Among those rows, we just need the minimum one. Kindly let me know where I am wrong.
@ArchanJoshi - It is a bit different, because you are working with a boolean mask of Trues and Falses. To match the first True you need idxmax, because True is processed like 1 and False like 0, so idxmax extracts the index of the first 1 (True). And there is no filtering: (df['Voltage'] >= 14.0) does not filter.
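
To make the point about the boolean mask concrete, here is a small sketch using the Voltage values from the question on a default integer index (an assumption made to keep the example short). The names voltage, first_true and first_false are only illustrative.

```python
import pandas as pd

voltage = pd.Series([13.1, 13.2, 13.3, 14.1, 14.3, 13.5])

# The comparison keeps every row; it only marks each one True or False.
mask = voltage >= 14.0
print(mask.tolist())

# True compares greater than False, so the first maximum is the first True
# and the first minimum is the first False.
first_true = mask.idxmax()
first_false = mask.idxmin()
print(first_true, first_false)
```

Here idxmin would point at the first row below the threshold, which is not what the question asks for.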
Source doesn't matter. Just pick any good one and start. You'll be done in no time. It is not at all a difficult language.

You can use numpy.searchsorted() if the Voltage column is sorted:

In [260]: df.index[np.searchsorted(df.Voltage, 14)]
Out[260]: DatetimeIndex(['2017-01-02 00:03:00'], dtype='datetime64[ns]', freq=None)
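
A minimal sketch of the same idea with the sorted precondition made explicit. The data below is hypothetical (the question's readings without the final 13.5, so that the column really is increasing), and pos and first_time are illustrative names. searchsorted returns len(df) when every value is below the threshold, so that case is checked before indexing.

```python
import numpy as np
import pandas as pd

# Hypothetical, monotonically increasing readings (searchsorted needs sorted input).
df = pd.DataFrame(
    {"Voltage": [13.1, 13.2, 13.3, 14.1, 14.3]},
    index=pd.to_datetime(
        ["2017-01-02 00:00", "2017-01-02 00:01", "2017-01-02 00:02",
         "2017-01-02 00:03", "2017-01-02 00:04"]
    ),
)

# side="left" (the default) gives the first position whose value is >= 14.0.
pos = int(np.searchsorted(df["Voltage"].to_numpy(), 14.0, side="left"))

# pos == len(df) means the threshold was never reached.
first_time = df.index[pos].strftime("%H:%M") if pos < len(df) else None
print(first_time)
```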
