Pandas: sum up multiple columns into one column without last column

Question

If I have a dataframe similar to this one

Apples   Bananas   Grapes   Kiwis
2        3         nan      1
1        3         7        nan
nan      nan       2        3

I would like to add a column like this

Apples   Bananas   Grapes   Kiwis   Fruit Total
2        3         nan      1        6
1        3         7        nan      11
nan      nan       2        3        5

I guess you could use df['Apples'] + df['Bananas'] and so on, but my actual dataframe is much larger than this. I was hoping a one-line formula like df['Fruit Total']=df[-4:-1].sum could do the trick, but that didn't work. Is there any way to do it without explicitly summing up all the columns?

Look there. stackoverflow.com/questions/25748683/…

– konstov, Commented Feb 6, 2017 at 8:57

jezrael · Accepted Answer · 2019-09-14 11:33:21Z

144

You can first select the columns by position with iloc and then sum across each row:

df['Fruit Total'] = df.iloc[:, -4:-1].sum(axis=1)
print(df)
   Apples  Bananas  Grapes  Kiwis  Fruit Total
0     2.0      3.0     NaN    1.0          5.0
1     1.0      3.0     7.0    NaN         11.0
2     NaN      NaN     2.0    3.0          2.0
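
For context on why the one-liner in the question did not work, here is a minimal sketch. It rebuilds the example frame (the construction code is an assumption, since the question only shows the printed table) and compares a bare slice with an iloc column slice.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (construction is assumed).
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# A bare slice inside df[...] selects ROWS, not columns.
# With only 3 rows, -4:-1 resolves to the first two rows.
row_slice = df[-4:-1]

# .iloc[:, -4:-1] keeps every row and selects columns by position:
# from the 4th-from-last up to, but excluding, the last one.
col_slice = df.iloc[:, -4:-1]

print(row_slice.shape)          # (2, 4)
print(list(col_slice.columns))  # ['Apples', 'Bananas', 'Grapes']

# sum has to be called with parentheses; axis=1 adds across the
# columns within each row.
partial_total = col_slice.sum(axis=1)
print(partial_total.tolist())   # [5.0, 11.0, 2.0]
```

The other problem with the original attempt is that .sum without parentheses refers to the method itself rather than calling it.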

To sum all columns, use:

df['Fruit Total'] = df.sum(axis=1)
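
One thing worth knowing about the row-wise sum is how it treats NaN. The sketch below uses a variation of the question's data in which the last row is entirely missing (that variation is an assumption made here to show the edge case) and compares the default behaviour with the min_count and skipna parameters of DataFrame.sum.

```python
import numpy as np
import pandas as pd

# Variation on the question's data: the last row is all NaN
# (hypothetical, chosen to show the edge case).
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, np.nan],
    "Kiwis": [1, np.nan, np.nan],
})

# Default: NaN values are skipped, so an all-NaN row sums to 0.0.
default_total = df.sum(axis=1)

# min_count=1 requires at least one real value, otherwise the result is NaN.
strict_total = df.sum(axis=1, min_count=1)

# skipna=False makes any NaN in a row turn the whole total into NaN.
propagate_total = df.sum(axis=1, skipna=False)

print(default_total.tolist())   # [6.0, 11.0, 0.0]
print(strict_total.tolist())    # [6.0, 11.0, nan]
print(propagate_total.tolist()) # [nan, nan, nan]
```

Which of the three is appropriate depends on whether a missing value should count as zero in your data.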

edited Sep 14, 2019 at 11:33

answered Feb 6, 2017 at 8:56

jezrael


4 Comments

Tuutsrednas Over a year ago

Excellent. iloc is the thing I was looking for.

Jinhua Wang Over a year ago

This answer doesn't add the last column, and therefore is a little confusing.

jezrael Over a year ago

@JinhuaWang - title was changed.

Jinhua Wang Over a year ago

Oh ok sorry about that

kelkka · Accepted Answer · 2021-07-28 14:36:52Z

This may be helpful for beginners, so for the sake of completeness, if you know the column names (e.g. they are in a list), you can use:

column_names = ['Apples', 'Bananas', 'Grapes', 'Kiwis']
df['Fruit Total']= df[column_names].sum(axis=1)

This gives you flexibility over which columns are summed: you only have to edit the list column_names, and you can do things like pick only the columns with the letter 'a' in their name. Another benefit is that explicit column names make the code easier for people to read. Combine this with list(df.columns) to get the column names as a list. Thus, if you want to drop the last column, all you have to do is:

column_names = list(df.columns)
df['Fruit Total']= df[column_names[:-1]].sum(axis=1)
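
As a rough illustration of the "letter 'a'" idea mentioned above, the list of names can be built with a list comprehension. This is only a sketch: the frame construction and the variable name a_total are assumptions, and the test is case-sensitive, so 'Apples' (capital A only) is not matched.

```python
import numpy as np
import pandas as pd

# Example frame from the question (construction is assumed).
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# Keep only the column names that contain a lowercase 'a'.
column_names = [name for name in df.columns if "a" in name]
print(column_names)  # ['Bananas', 'Grapes']

# Sum just those columns, row by row (NaN is skipped by default).
a_total = df[column_names].sum(axis=1)
print(a_total.tolist())  # [3.0, 10.0, 2.0]
```

Using "a" in name.lower() instead would make the match case-insensitive and pull in 'Apples' as well.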

Ramon · Accepted Answer · 2022-05-14 09:04:30Z

19

It is possible to do it without knowing the number of columns and even without iloc:

print(df)
   Apples  Bananas  Grapes  Kiwis
0     2.0      3.0     NaN    1.0
1     1.0      3.0     7.0    NaN
2     NaN      NaN     2.0    3.0

cols_to_sum = df.columns[ : df.shape[1]-1]

df['Fruit Total'] = df[cols_to_sum].sum(axis=1)

print(df)
   Apples  Bananas  Grapes  Kiwis  Fruit Total
0     2.0      3.0     NaN    1.0          5.0
1     1.0      3.0     7.0    NaN         11.0
2     NaN      NaN     2.0    3.0          2.0
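
A shorter spelling of the same selection is a plain negative slice on df.columns. The sketch below (frame construction assumed) checks that both forms pick the same columns before summing.

```python
import numpy as np
import pandas as pd

# Example frame from the question (construction is assumed).
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# Both expressions mean "every column except the last one".
cols_by_shape = df.columns[: df.shape[1] - 1]
cols_by_slice = df.columns[:-1]
print(cols_by_shape.equals(cols_by_slice))  # True

# Row-wise sum over Apples, Bananas and Grapes only.
df["Fruit Total"] = df[cols_by_slice].sum(axis=1)
print(df["Fruit Total"].tolist())  # [5.0, 11.0, 2.0]
```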

edited May 14, 2022 at 9:04

answered May 3, 2020 at 4:07

Ramon


1 Comment

ReneBt Over a year ago

I like this one as it doesn't require recoding if I decide to expand my dataframe

Francisco Dura · Accepted Answer · 2018-08-23 07:42:29Z

13

Using df['Fruit Total'] = df.iloc[:, -4:-1].sum(axis=1) on your original df won't add the last column ('Kiwis'). Use df.iloc[:, -4:] instead to select all four columns:

print(df)
   Apples  Bananas  Grapes  Kiwis
0     2.0      3.0     NaN    1.0
1     1.0      3.0     7.0    NaN
2     NaN      NaN     2.0    3.0

df['Fruit Total']=df.iloc[:,-4:].sum(axis=1)

print(df)
   Apples  Bananas  Grapes  Kiwis  Fruit Total
0     2.0      3.0     NaN    1.0          6.0
1     1.0      3.0     7.0    NaN         11.0
2     NaN      NaN     2.0    3.0          5.0

answered Aug 23, 2018 at 7:42

Francisco Dura


2 Comments

RAVI D PARIKH Over a year ago

Thanks for the answer. I, however, did not understand what is the benefit of having the negative sign in the iloc statement. iloc[:,1,5] seems to be a simpler and less confusing way. I am learning Python and Pandas. By trial and error, I realized that iloc[1:4] just sums the first 3 columns while iloc[:,1,5] sums the first 4

Francisco Dura Over a year ago

Using iloc[:, -4:] you are telling it to take the last 4 columns. Positions in iloc are zero-based, so in this case iloc[:, -4:] is the same as iloc[:, 0:4]. Which one you use depends on how specific or open you want to be in your statement.
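
Since the slice notation in this exchange is easy to mix up, here is a small sketch of how iloc positions behave on the 4-column example frame (the construction code is an assumption). It relies on iloc taking a row selector first and a column selector second, both zero-based.

```python
import numpy as np
import pandas as pd

# Example frame from the question (construction is assumed).
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# "Last four columns" and "columns at positions 0 to 3" are the same
# selection on a 4-column frame.
by_negative = df.iloc[:, -4:]
by_positive = df.iloc[:, 0:4]
print(by_negative.equals(by_positive))  # True

# A single integer without a colon picks ONE column and returns a Series.
single_col = df.iloc[:, -4]
print(single_col.name)  # Apples

# Without the leading ":," the slice applies to rows, not columns.
rows = df.iloc[1:4]
print(rows.shape)  # (2, 4)
```

The negative form keeps working if columns are later added on the left, while the positive form keeps working if columns are added on the right.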

JLK · Accepted Answer · 2021-02-05 22:17:59Z

I want to build on Ramon's answer for the case where you want the total without knowing the shape/size of the dataframe. I will use his answer below but fix one thing: it didn't include the last column in the total. I have removed the -1 from the shape, changing this:

cols_to_sum = df.columns[ : df.shape[1]-1]

To this:

cols_to_sum = df.columns[ : df.shape[1]]

print(df)
   Apples  Bananas  Grapes  Kiwis
0     2.0      3.0     NaN    1.0
1     1.0      3.0     7.0    NaN
2     NaN      NaN     2.0    3.0

cols_to_sum = df.columns[ : df.shape[1]]

df['Fruit Total'] = df[cols_to_sum].sum(axis=1)

print(df)
   Apples  Bananas  Grapes  Kiwis  Fruit Total
0     2.0      3.0     NaN    1.0          6.0
1     1.0      3.0     7.0    NaN         11.0
2     NaN      NaN     2.0    3.0          5.0

Which then gives you the correct total without skipping the last column.
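
One caveat with any "sum every column" approach is that running it a second time would also add the existing total column into the new total. The helper below is a hypothetical sketch (the function name add_fruit_total and its total_col parameter are made up for illustration) that excludes the total column by name so the step can be re-run safely.

```python
import numpy as np
import pandas as pd


def add_fruit_total(frame, total_col="Fruit Total"):
    """Return a copy with a row-wise total that ignores an existing total column."""
    out = frame.copy()
    # Sum every column except the total itself, if it is already present.
    value_cols = [col for col in out.columns if col != total_col]
    out[total_col] = out[value_cols].sum(axis=1)
    return out


# Example frame from the question (construction is assumed).
df = pd.DataFrame({
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

once = add_fruit_total(df)
twice = add_fruit_total(once)
print(twice["Fruit Total"].tolist())  # [6.0, 11.0, 5.0]

# For comparison: re-running the plain all-columns sum doubles the totals.
naive = once.copy()
naive["Fruit Total"] = naive[naive.columns[: naive.shape[1]]].sum(axis=1)
print(naive["Fruit Total"].tolist())  # [12.0, 22.0, 10.0]
```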

iffy · Accepted Answer · 2023-03-04 16:26:15Z

1

This may be an easier approach. Passing numeric_only=True also makes the sum skip any non-numeric columns, which you would not want in the total:

df['Fruit Total'] = df.sum(axis=1, numeric_only=True)

print(df)
   Apples  Bananas  Grapes  Kiwis  Fruit Total
0     2.0      3.0     NaN    1.0          6.0
1     1.0      3.0     7.0    NaN         11.0
2     NaN      NaN     2.0    3.0          5.0
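
The effect of numeric_only is easier to see when the frame actually contains a non-numeric column. In the sketch below, the 'Store' column and its values are hypothetical additions, not part of the original question.

```python
import numpy as np
import pandas as pd

# Question's data plus a hypothetical text column named 'Store'.
df = pd.DataFrame({
    "Store": ["North", "South", "East"],
    "Apples": [2, 1, np.nan],
    "Bananas": [3, 3, np.nan],
    "Grapes": [np.nan, 7, 2],
    "Kiwis": [1, np.nan, 3],
})

# numeric_only=True restricts the row-wise sum to the numeric columns,
# so the text column is left out of the total.
df["Fruit Total"] = df.sum(axis=1, numeric_only=True)
print(df["Fruit Total"].tolist())  # [6.0, 11.0, 5.0]
```

Without numeric_only=True, the behaviour on mixed text and number columns is not something to rely on, so being explicit seems the safer choice here.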

edited Mar 4, 2023 at 16:26

Adrian Mole


answered Feb 28, 2023 at 16:55

iffy

