How to create new values in a pandas dataframe column based on values from another column

Question

I have a pandas dataframe of values I read in from a csv file. I have a column labeled 'SleepQuality' and the values are float from 0.0 - 100.0. I want to create a new column labeled 'SleepQualityGroup' where values from the original column btw 0 - 49 have a value of 0 in the new column, 50 - 59 = 1 , 60 - 69 = 2, 70 - 79 = 3, 80 - 89 = 4, and 90 - 100 = 5

What would be the best formula to use in order to do this? I am stuck on the logic needed to identify all values in each range and assign to the new value.

An example of what the output would like like below in the new 'SleepQualityGroup' column.

SleepQuality    SleepQualityGroup
80.4              4
90.1              5
66.4              2
50.3              1
86.2              4
75.4              3
45.7              0
91.5              5
61.3              2 
54                1
58.2              1

Bharath M Shetty · Accepted Answer · 2017-10-03 14:41:03Z

12

Use pd.cut i.e

df['new'] = pd.cut(df['SleepQuality'],bins=[0,50 , 60, 70 , 80 , 90,100], labels=[0,1,2,3,4,5])

Output:

        SleepQuality  SleepQualityGroup new
0           80.4                  4   4
1           90.1                  5   5
2           66.4                  2   2
3           50.3                  1   1
4           86.2                  4   4
5           75.4                  3   3
6           45.7                  0   0
7           91.5                  5   5
8           61.3                  2   2
9           54.0                  1   1
10          58.2                  1   1

answered Oct 3, 2017 at 14:41

Bharath M Shetty

30.6k6 gold badges65 silver badges111 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

Bharath M Shetty · Accepted Answer · 2018-01-09 05:22:33Z

6

That's basically a binning operation. As such two tools could be used here.

Using np.searchsorted -

bins = np.arange(50,100,10)
df['SleepQualityGroup'] = bins.searchsorted(df.SleepQuality)

Using np.digitize -

df['SleepQualityGroup'] = np.digitize(df.SleepQuality, bins)

Sample output -

In [866]: df
Out[866]: 
    SleepQuality  SleepQualityGroup
0           80.4                  4
1           90.1                  5
2           66.4                  2
3           50.3                  1
4           86.2                  4
5           75.4                  3
6           45.7                  0
7           91.5                  5
8           61.3                  2
9           54.0                  1
10          58.2                  1

Runtime test -

In [921]: df
Out[921]: 
    SleepQuality  SleepQualityGroup
0           80.4                  4
1           90.1                  5
2           66.4                  2
3           50.3                  1
4           86.2                  4
5           75.4                  3
6           45.7                  0
7           91.5                  5
8           61.3                  2
9           54.0                  1
10          58.2                  1

In [922]: df = pd.concat([df]*10000,axis=0)

# @Dark's soln using pd.cut
In [923]: %timeit df['new'] = pd.cut(df['SleepQuality'],bins=[0,50 , 60, 70 , 80 , 90,100], labels=[0,1,2,3,4,5])
1000 loops, best of 3: 1.04 ms per loop

In [926]: %timeit df['SleepQualityGroup'] = bins.searchsorted(df.SleepQuality)
1000 loops, best of 3: 591 µs per loop

In [927]: %timeit df['SleepQualityGroup'] = np.digitize(df.SleepQuality, bins)
1000 loops, best of 3: 538 µs per loop

edited Jan 9, 2018 at 5:22

Bharath M Shetty

30.6k6 gold badges65 silver badges111 bronze badges

answered Oct 3, 2017 at 14:38

Divakar

222k19 gold badges273 silver badges374 bronze badges

2 Comments

Bharath M Shetty Over a year ago

How can you be so fast ?

Wiggy A. Over a year ago

He has hacks on.

Collectives™ on Stack Overflow

How to create new values in a pandas dataframe column based on values from another column

2 Answers 2

Comments

2 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

2 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related