How to drop the first rows from your pandas dataframe?

In today’s quick data analysis tutorial we’ll learn how to remove the first row, or the first few rows, of a pandas DataFrame.

Remove first rows in pandas – Summary

Option 1: drop by row label

mydf.drop(labels='label_first_row', inplace=True)

Option 2: using the row index (0 is the first label when the DataFrame has the default integer index)

mydf.drop(index=0, inplace=True)

Option 3: using the iloc accessor

mydf.iloc[1:]
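As a quick sanity check, the sketch below applies the three options to a small made-up DataFrame and confirms that they leave the same rows. The column x and the labels a, b, c are illustrative assumptions, not part of the tutorial data. Because the labels here are strings, Option 2 looks up the first label with mydf.index[0] instead of hard-coding 0.

```python
import pandas as pd

# Hypothetical frame with string row labels (not the tutorial's data).
mydf = pd.DataFrame({"x": [10, 20, 30]}, index=["a", "b", "c"])

# Option 1: drop by row label.
by_label = mydf.drop(labels="a")

# Option 2: drop via the index keyword; index[0] is the first label.
by_index = mydf.drop(index=mydf.index[0])

# Option 3: slice by position, skipping the first row.
by_position = mydf.iloc[1:]

# All three return new DataFrames with the same content.
same = by_label.equals(by_index) and by_label.equals(by_position)
print(same)
print(by_position)
```

None of these calls uses inplace=True, so mydf itself still holds all three rows afterwards.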

Delete the first row in pandas – Example

Let’s look at some use cases in which you need to drop one or more rows.

Creating the DataFrame

We will start by importing the pandas library. Next we’ll define a sample dataset.

import pandas as pd

language = ['R', 'Javascript', 'R', 'Python', 'R', 'Javascript']
area = ['Paris', 'Rio de Janeiro', 'Buenos Aires', 'New York', 'Buenos Aires', 'London']
salary = [149.0, 157.0, 117.0, 146.0, 130.0, 191.0]
emp = dict(area=area, language=language, salary=salary)

emp_df = pd.DataFrame(data=emp)

# visualize the first 5 rows of the DataFrame
emp_df.head()

Here’s the output:

             area    language  salary
0           Paris           R   149.0
1  Rio de Janeiro  Javascript   157.0
2    Buenos Aires           R   117.0
3        New York      Python   146.0
4    Buenos Aires           R   130.0

Drop first row from the DataFrame

We can use the DataFrame drop() method here (the same method can be used to select and drop any single record):

emp_df.drop(index=0)

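Note: drop() returns a new DataFrame and leaves emp_df untouched. To persist the change, either assign the result to a variable or add the inplace = True parameter. The outputs shown later in this tutorial were produced from the original six-row emp_df, so re-create it if you run the in-place version.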


emp_df.drop(index=0, inplace = True)
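To make the difference concrete, here is a minimal sketch on a throwaway DataFrame (the name df and its single salary column are assumptions made for illustration). It shows that a plain drop() leaves the original intact, while the in-place call modifies it and returns None.

```python
import pandas as pd

# Throwaway frame for illustration only.
df = pd.DataFrame({"salary": [149.0, 157.0, 117.0]})

# Without inplace, drop() returns a new DataFrame.
result = df.drop(index=0)
rows_before = len(df)  # df still has all of its rows here

# With inplace=True, df itself is modified and the call returns None.
ret = df.drop(index=0, inplace=True)

print(rows_before, len(result), len(df), ret)
```

Assigning the result back (df = df.drop(index=0)) has the same effect as the in-place call and is easier to chain with other methods.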

Alternatively:

emp_df.iloc[1:]

When using iloc you can persist a change by assigning to a new DataFrame:

emp_df_2 = emp_df.iloc[1:]

Remove first two rows

Using the iloc accessor:

emp_df.iloc[2:]

Here’s our output:


           area    language  salary
2  Buenos Aires           R   117.0
3      New York      Python   146.0
4  Buenos Aires           R   130.0
5        London  Javascript   191.0
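The same idea generalizes to the first n rows. The sketch below is one possible way to write it, using a cut-down version of the salary column and a hypothetical variable n. It also shows reset_index(drop=True), which is not used elsewhere in this tutorial but is handy if you want the remaining rows renumbered from 0.

```python
import pandas as pd

# Cut-down version of the example data: salary column only.
emp_df = pd.DataFrame({"salary": [149.0, 157.0, 117.0, 146.0, 130.0, 191.0]})

n = 2  # number of leading rows to remove (illustrative value)

# Positional slice: keep everything from position n onwards.
via_iloc = emp_df.iloc[n:]

# Equivalent with drop(): look up the first n labels and drop them.
via_drop = emp_df.drop(index=emp_df.index[:n])

# Optional: renumber the remaining rows starting from 0.
renumbered = via_iloc.reset_index(drop=True)

print(via_iloc.equals(via_drop))
print(renumbered)
```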

Delete multiple rows off your DataFrame

Removing several rows at once is just as easy. We already saw how to slice off the leading rows with the iloc accessor; to remove arbitrary rows, pass a list of index labels to drop():

emp_df.drop(index=[0, 2, 3])

# or alternatively

rows = [0, 2, 3]
emp_df.drop(index=rows)

The relevant rows are removed, as can be seen below:

             area    language  salary
1  Rio de Janeiro  Javascript   157.0
4    Buenos Aires           R   130.0
5          London  Javascript   191.0
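Keep in mind that drop() matches index labels, not positions. The list [0, 2, 3] works above only because emp_df has the default integer index. For a DataFrame with other labels, one option is to translate positions into labels first, as in this sketch (the DataFrame and its month labels are made up for illustration).

```python
import pandas as pd

# Hypothetical frame whose index holds month names instead of integers.
df = pd.DataFrame(
    {"score": [71, 74, 76, 80]},
    index=["June", "November", "December", "March"],
)

positions = [0, 2, 3]  # row positions we want to remove

# df.index[positions] converts the positions into their labels,
# which is what drop() expects.
remaining = df.drop(index=df.index[positions])

print(remaining)
```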

Remove first duplicated row

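This one is a tangent that came from a reader’s question: how do you get rid of the first duplicated row? By default, drop_duplicates() keeps the first occurrence of each duplicated row and removes the later ones. The trick is to use the keep = 'last' parameter, which keeps the last occurrence and removes the earlier ones instead. Note that using keep=False deletes all duplicated records. Rows are compared across all columns unless you pass the subset parameter.

emp_df.drop_duplicates(keep='last')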
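Here is a small sketch of the three keep settings. It uses a reduced, made-up version of the example data with only the area and language columns, so that rows 1 and 2 are exact duplicates.

```python
import pandas as pd

# Reduced example: rows 1 and 2 are identical.
df = pd.DataFrame({
    "area": ["Paris", "Buenos Aires", "Buenos Aires", "London"],
    "language": ["R", "R", "R", "Javascript"],
})

keep_first = df.drop_duplicates()             # default: keep='first'
keep_last = df.drop_duplicates(keep="last")   # drops the first duplicate
keep_none = df.drop_duplicates(keep=False)    # drops every duplicated row

print(list(keep_first.index))  # [0, 1, 3]
print(list(keep_last.index))   # [0, 2, 3]
print(list(keep_none.index))   # [0, 3]
```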

Get and Write Pandas DataFrame first rows

Next in this tutorial we’ll quickly find out how to extract the first row of a Pandas DataFrame to a list.

Create our example DataFrame

We will get started by quickly importing the Pandas library and creating a simple DataFrame that we can use for this example.

import pandas as pd

# Define data using lists
month = ['June', 'November', 'December']
language = ['R', 'Swift', 'Ruby']
first_interview = [71, 74, 76]
second_interview = [68, 53, 56]

#Constructing the DataFrame
hr_data = dict(language=language, interview_1=first_interview, interview_2=second_interview)
hr_df = pd.DataFrame(data=hr_data, index=month)


Get the first row of a Pandas DataFrame

To look at the first row of our data we’ll use the head() method:

hr_df.head(1)

Here’s the output:

      language  interview_1  interview_2
June         R           71           68

Exporting the first DataFrame row as list

There are several options here; we’ll focus on using the iloc and loc indexers.

Using iloc to fetch the first row by integer location (in this case 0):

first_rec = hr_df.iloc[0]

Using loc to select the first row using its label:

first_rec = hr_df.loc['June']

In both cases the first row values will be retrieved into a Pandas Series. We can then use the to_list() method to export the Series to a list:

first_rec.to_list()

And our result will be:

['R', 71, 68]
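Putting the pieces together, the following sketch rebuilds the example DataFrame and checks that both indexers give the same list. The last line shows one way to look up the first label when you don't know it in advance; the variable names are illustrative.

```python
import pandas as pd

# Same data as the hr_df example above.
hr_df = pd.DataFrame(
    {
        "language": ["R", "Swift", "Ruby"],
        "interview_1": [71, 74, 76],
        "interview_2": [68, 53, 56],
    },
    index=["June", "November", "December"],
)

by_position = hr_df.iloc[0].to_list()    # first row by integer location
by_label = hr_df.loc["June"].to_list()   # first row by its label

# The first label can be read from the index instead of being hard-coded.
first_label = hr_df.index[0]

print(by_position == by_label)
print(first_label)
```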

Exporting the first record to an array

We can use the to_numpy() method to retrieve the row values as a NumPy array:

first_rec.to_numpy()

#This will result in

array(['R', 71, 68], dtype=object)

Get first DataFrame column to a list

For completeness, I have added a simple snippet that uses the iloc indexer to export the first column (position 0) to a list.

hr_df.iloc[:,0].to_list()
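If you already know the column name, selecting by name is an equivalent alternative. This sketch rebuilds the example DataFrame and compares the two approaches; the variable names are illustrative.

```python
import pandas as pd

# Same data as the hr_df example above.
hr_df = pd.DataFrame(
    {
        "language": ["R", "Swift", "Ruby"],
        "interview_1": [71, 74, 76],
        "interview_2": [68, 53, 56],
    },
    index=["June", "November", "December"],
)

first_col_by_position = hr_df.iloc[:, 0].to_list()   # column at position 0
first_col_by_name = hr_df["language"].to_list()      # same column, by name

# The name of the first column can be read from hr_df.columns.
column_name = hr_df.columns[0]

print(first_col_by_position)
print(column_name)
```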