Remove last column from pandas DataFrame
In short, you can drop the last column of your DataFrame using either of the following techniques:
By column index:
your_df.drop(columns=your_df.columns[-1], inplace=True)
or alternatively by column name:
your_df = your_df.drop(columns='last_column_name')
Create example data
First, we will import pandas and create some data that you can use to follow along:
import pandas as pd
language = ['Go', 'Java', 'R', 'Python', 'Python']
signups = [120, 151, 186, 107, 156]
attendees = [80, 22, 134, 45, 76]
campaign = pd.DataFrame(dict(language=language, signups=signups, attendees=attendees))
campaign.head()
Let’s look at the data:
| | language | signups | attendees |
|---|---|---|---|
| 0 | Go | 120 | 80 |
| 1 | Java | 151 | 22 |
| 2 | R | 186 | 134 |
| 3 | Python | 107 | 45 |
| 4 | Python | 156 | 76 |
We can look into the column index:
campaign.columns
This returns the following Index object:
Index(['language', 'signups', 'attendees'], dtype='object')
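Because the column Index supports positional lookups much like a list, we can pull out labels by position before dropping anything. Here is a minimal sketch that rebuilds the example data so it runs on its own; the variable names last_column and last_two are only illustrative:

```python
import pandas as pd

campaign = pd.DataFrame({
    'language': ['Go', 'Java', 'R', 'Python', 'Python'],
    'signups': [120, 151, 186, 107, 156],
    'attendees': [80, 22, 134, 45, 76],
})

# Position -1 is the last column label, just like the last item of a list.
last_column = campaign.columns[-1]

# A slice returns several labels at once; here, the last two.
last_two = list(campaign.columns[-2:])

print(last_column)  # attendees
print(last_two)     # ['signups', 'attendees']
```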
Remove the last column in pandas
Now that we have a DataFrame, we can index into its columns in the same way we would with a plain Python list.
The last column sits at position -1, so the following code drops it from our DataFrame:
campaign.drop(columns=campaign.columns[-1], inplace=True)
Because we passed inplace=True, the call itself returns None and campaign is modified directly. It now looks like this:
| | language | signups |
|---|---|---|
| 0 | Go | 120 |
| 1 | Java | 151 |
| 2 | R | 186 |
| 3 | Python | 107 |
| 4 | Python | 156 |
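If you would rather keep the original DataFrame intact, one option is to leave out inplace=True and assign the result to a new name. The sketch below assumes the same example data; trimmed and in_place_result are hypothetical names used only for illustration:

```python
import pandas as pd

campaign = pd.DataFrame({
    'language': ['Go', 'Java', 'R', 'Python', 'Python'],
    'signups': [120, 151, 186, 107, 156],
    'attendees': [80, 22, 134, 45, 76],
})

# Without inplace=True, drop() returns a new DataFrame
# and leaves the original untouched.
trimmed = campaign.drop(columns=campaign.columns[-1])

print(list(trimmed.columns))   # ['language', 'signups']
print(list(campaign.columns))  # ['language', 'signups', 'attendees']

# With inplace=True, the return value is None (run here on a copy
# so that campaign keeps all three columns).
in_place_result = campaign.copy().drop(columns='attendees', inplace=True)
print(in_place_result)         # None
```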
Delete the last column by label (name)
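We can accomplish the same result by passing the last column's name to the drop method. Run this against the original three-column DataFrame: once attendees has already been dropped, calling it again raises a KeyError.
campaign.drop(columns='attendees', inplace=True)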
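Dropping by label fails when the label is no longer there, which is easy to hit when re-running a notebook cell. A small sketch, assuming the same example data, of how the errors='ignore' option of drop sidesteps that; the names without_attendees and still_fine are illustrative:

```python
import pandas as pd

campaign = pd.DataFrame({
    'language': ['Go', 'Java', 'R', 'Python', 'Python'],
    'signups': [120, 151, 186, 107, 156],
    'attendees': [80, 22, 134, 45, 76],
})

without_attendees = campaign.drop(columns='attendees')

# Dropping the same label a second time raises KeyError by default.
try:
    without_attendees.drop(columns='attendees')
except KeyError as err:
    print('KeyError:', err)

# errors='ignore' silently skips labels that are not present.
still_fine = without_attendees.drop(columns='attendees', errors='ignore')
print(list(still_fine.columns))  # ['language', 'signups']
```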
Drop the last two columns using iloc
We can also subset the DataFrame's columns using the iloc indexer. Suppose, for example, that we want to save all columns except the last two into a new DataFrame. Using standard Python sequence slicing:
campaign_2 = campaign.iloc[:, :-2]
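To see what the slice keeps, here is a short sketch run against the full three-column example data, which is rebuilt here so the snippet stands alone:

```python
import pandas as pd

campaign = pd.DataFrame({
    'language': ['Go', 'Java', 'R', 'Python', 'Python'],
    'signups': [120, 151, 186, 107, 156],
    'attendees': [80, 22, 134, 45, 76],
})

# All rows (:) and every column except the last two (:-2).
campaign_2 = campaign.iloc[:, :-2]

print(list(campaign_2.columns))  # ['language']
print(campaign.shape)            # (5, 3), the original is unchanged
```

If you intend to modify campaign_2 afterwards, appending .copy() to the slice is a common way to make it independent of the original.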
Remove only first and last columns
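The last case for today is to drop the first and the last columns, keeping everything in between. Here we pass a list of positions to drop: position 0 is the first column and position -1 is the last one. Since inplace=True is not set, drop returns a new DataFrame and campaign itself is left unchanged.
campaign.drop(columns=campaign.columns[[0, -1]])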
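As a quick check, the sketch below applies the same call to the full three-column example data and compares it with an equivalent iloc slice. The names middle and middle_iloc are illustrative:

```python
import pandas as pd

campaign = pd.DataFrame({
    'language': ['Go', 'Java', 'R', 'Python', 'Python'],
    'signups': [120, 151, 186, 107, 156],
    'attendees': [80, 22, 134, 45, 76],
})

# Drop the first (0) and last (-1) columns by position.
middle = campaign.drop(columns=campaign.columns[[0, -1]])

# Equivalent positional selection: keep columns 1 up to, but excluding, the last.
middle_iloc = campaign.iloc[:, 1:-1]

print(list(middle.columns))        # ['signups']
print(middle.equals(middle_iloc))  # True
```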
Removing the first column of a Pandas DataFrame
Let’s assume that we have a DataFrame with a couple of columns and a sequential index:
# import Pandas library
import pandas as pd
# define example data
interview_dict = {'language': ['Python', 'R', 'Scala', 'Java', 'SQL'],
                  'salary': [130, 110, 85, 95, 77]}
interviews = pd.DataFrame(data=interview_dict)
interviews.head()
The DataFrame that we have just created has two data columns, language and salary, plus a default sequential index running from 0 to 4.
Removing the index column
If we want to get rid of the index column (the leftmost column, which labels each row of the DataFrame), we can follow the steps below. A DataFrame always has an index, so what these steps actually do is replace the default one with one of our data columns.
- First, we export the DataFrame to a comma-separated values (CSV) file using the to_csv() method. We pass index=False so that the index is not written alongside the data columns.
interviews.to_csv('interviews.csv', index=False)
- Next, we read the CSV file back in, explicitly designating another column in our dataset as the index:
iv1 = pd.read_csv('interviews.csv', index_col='language')
iv1.head()
The result is a DataFrame indexed by language, with salary as its only data column.
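The CSV round trip is not the only route. As a sketch of an alternative, set_index promotes an existing column to the index without touching the disk, and the round trip itself can be tried with an in-memory buffer. The names iv_indexed, buffer and iv_from_csv are illustrative:

```python
import io
import pandas as pd

interviews = pd.DataFrame({
    'language': ['Python', 'R', 'Scala', 'Java', 'SQL'],
    'salary': [130, 110, 85, 95, 77],
})

# Option 1: promote an existing column to the index, no file needed.
iv_indexed = interviews.set_index('language')

# Option 2: the same CSV round trip, using an in-memory buffer
# instead of a file on disk.
buffer = io.StringIO()
interviews.to_csv(buffer, index=False)
buffer.seek(0)
iv_from_csv = pd.read_csv(buffer, index_col='language')

print(list(iv_indexed.columns))  # ['salary']
print(list(iv_indexed.index))    # ['Python', 'R', 'Scala', 'Java', 'SQL']
```

If the goal is only to hide the index when displaying or exporting, interviews.to_string(index=False) or to_csv(index=False) may be enough on their own.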
Drop the first data column in a DataFrame
You might also want to drop the first column of your data table. You can refer to it either by its label (name) or by its position in the column index. By position:
cols = interviews.columns[0]
iv2 = interviews.drop(columns=cols)
Or alternatively by label:
cols = ['language']
iv3 = interviews.drop(columns=cols)
Both produce the same result: a DataFrame that contains only the salary column.
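To confirm that the two approaches agree, here is a brief sketch that compares them with DataFrame.equals. The names by_position and by_label are illustrative:

```python
import pandas as pd

interviews = pd.DataFrame({
    'language': ['Python', 'R', 'Scala', 'Java', 'SQL'],
    'salary': [130, 110, 85, 95, 77],
})

# Drop the first column by position, then by label.
by_position = interviews.drop(columns=interviews.columns[0])
by_label = interviews.drop(columns=['language'])

print(list(by_position.columns))     # ['salary']
print(by_position.equals(by_label))  # True
```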