In today’s data analysis tutorial we’ll learn how to sum across rows in Pandas DataFrame columns and add a total summary row.
Create Example Data
We will start by creating some sample data based on a fictitious human resources dataset.
Adding a Total Row to the DataFrame
We can use the sum() DataFrame method to quickly total all DataFrame columns. The result will be a Series.
row_sum = hr_df.sum() row_sum
The result looks as following:
This is a step forward but not exactly what we aimed for. Let’s first of all ensure that we are adding only numeric values.
row_sum = hr_df.sum(numeric_only = True) print (row_sum)
Our result will be:
interview_1 665 interview_2 428 interview_3 98 dtype: int64
You can easily export the series to a list for further processing if needed.
We can also use the iloc accessor to filter out the specific columns to sum:
row_sum = hr_df.iloc[:,1:4].sum()
Next we will append the resulting total Series as a new row to the DataFrame. We can do it using the loc accessor:
hr_df.loc['Total'] = row_sum hr_df.fillna('')
As you can see below, a new row, labeled Total was added to your DataFrame.
Adding a total column
For completeness, here is a short snippet that adds a total column to your DataFrame. Note the use of the axis = 1 parameter to ensure that the sum is made across the columns, rather than by default across rows.
hr_df['Total'] = hr_df.sum(numeric_only = True, axis =1)