How to make an empty Pandas DataFrame with Python and append data to it?

In today’s quick tutorial we’ll learn how to initialize Python Pandas DataFrames from scratch.

We’ll focus on several common use cases that are worth getting familiar with, as they’ll be very useful in your data preparation process.

  1. New DataFrame with column names
  2. Setting the size of the empty DataFrame
  3. Create DataFrame with index
  4. Append data to the new DataFrame
  5. Create empty column

Preparation

Let’s get started by importing the Pandas library:

import pandas as pd

Note: If Pandas is not properly installed on your system, you will receive a ModuleNotFoundError. If that’s the case, you need to install Pandas first, for example by running pip install pandas.

Now let’s define the column names that we’ll use throughout the tutorial:

df_cols = ['city', 'month', 'year', 'min_temp', 'max_temp']

1. Empty DataFrame with column names

Let’s first create a DataFrame from scratch with the predefined columns we introduced in the preparatory step:

# with column names
new_df = pd.DataFrame(columns=df_cols)

We can now verify that the DataFrame is indeed empty using its empty attribute, which returns True here:

new_df.empty
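
As a minimal sketch of what the check above reports, the snippet below also looks at the shape of the DataFrame. The names is_empty, n_rows and n_cols are introduced here for illustration only:

```python
import pandas as pd

df_cols = ['city', 'month', 'year', 'min_temp', 'max_temp']
new_df = pd.DataFrame(columns=df_cols)

# empty is True when at least one axis has length 0
is_empty = new_df.empty

# shape is (rows, columns): no rows yet, but the five columns exist
n_rows, n_cols = new_df.shape

print(is_empty, n_rows, n_cols)  # → True 0 5
```

Note that the column labels are preserved even though the DataFrame holds no rows.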

2. Make a DF with specific size

Passing a range as the index preallocates that many rows, each filled with NaN:

num_rows = 5
new_df = pd.DataFrame(index=range(num_rows), columns=df_cols)
new_df
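
One thing worth keeping in mind: a DataFrame that was given a size is no longer "empty" as far as the empty attribute is concerned, even though every cell is missing. A short sketch (sized_df is a name chosen for this example):

```python
import pandas as pd

df_cols = ['city', 'month', 'year', 'min_temp', 'max_temp']
num_rows = 5

# Preallocate five rows; all cells start out as NaN
sized_df = pd.DataFrame(index=range(num_rows), columns=df_cols)

print(sized_df.shape)               # → (5, 5)
print(sized_df.empty)               # → False, because it has rows
print(sized_df.isna().all().all())  # → True, every cell is missing
```

To test whether such a DataFrame holds any actual values, checking isna() is therefore more informative than checking empty.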

3. Create new DataFrame with index

In the snippet below we’ll define an index for the DataFrame and pass it to the pd.DataFrame constructor.

idx = ['station_id']
new_df = pd.DataFrame(index=idx, columns = df_cols)
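
Note that a list passed as the index supplies row labels, so the snippet above produces a single row labeled 'station_id'. If the intent is instead an empty DataFrame whose index is named station_id (an assumption, as the tutorial does not say), a sketch along these lines may be closer. The names one_row_df, named_idx and indexed_df are made up for this example:

```python
import pandas as pd

df_cols = ['city', 'month', 'year', 'min_temp', 'max_temp']

# A list of labels creates one row per label
one_row_df = pd.DataFrame(index=['station_id'], columns=df_cols)
print(len(one_row_df))  # → 1

# Assumption: 'station_id' is meant as the index name, not a row label
named_idx = pd.Index([], name='station_id')
indexed_df = pd.DataFrame(index=named_idx, columns=df_cols)

print(indexed_df.index.name)  # → station_id
print(len(indexed_df))        # → 0
```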

4. Append Data to your DataFrame

Next we’ll append data. We could just as easily import data from a CSV, JSON or text file. For the sake of simplicity, we’ll add a list as a row to the DataFrame:

new_row = ['NYC', 12, 2022, 19, 65]
new_df = pd.DataFrame(columns=df_cols)

# using the loc indexer
new_df.loc[0] = new_row
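
The same idea extends to several rows. The sketch below appends rows one at a time with loc and then adds a batch with pd.concat (the DataFrame.append method was removed in pandas 2.0). All sample values other than the NYC row are hypothetical and serve only as an illustration:

```python
import pandas as pd

df_cols = ['city', 'month', 'year', 'min_temp', 'max_temp']
new_df = pd.DataFrame(columns=df_cols)

# Sample rows; the Boston values are invented for this example
rows = [
    ['NYC', 12, 2022, 19, 65],
    ['Boston', 12, 2022, 15, 58],
]

# Append one row at a time at the next free integer label
for row in rows:
    new_df.loc[len(new_df)] = row

# Append a batch of rows by concatenating a second DataFrame
# (the Chicago values are also invented)
more_rows = pd.DataFrame([['Chicago', 12, 2022, 10, 50]], columns=df_cols)
new_df = pd.concat([new_df, more_rows], ignore_index=True)

print(len(new_df))  # → 3
```

Row-by-row assignment is fine for a handful of rows. For larger volumes it is usually faster to collect the rows in a list and build or concatenate the DataFrame once.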

5. New empty columns

We’ll wrap up this tutorial by showing how to add an empty column to your DataFrame:

import numpy as np
new_df['empty_col'] = np.nan
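
As a self-contained sketch of the step above, the snippet below adds the empty column to a one-row DataFrame and confirms that it holds only missing values. It also shows assign, which returns a new DataFrame instead of modifying the original. The column name notes and the variable with_notes are hypothetical:

```python
import numpy as np
import pandas as pd

df_cols = ['city', 'month', 'year', 'min_temp', 'max_temp']
new_df = pd.DataFrame(columns=df_cols)
new_df.loc[0] = ['NYC', 12, 2022, 19, 65]

# Assigning a scalar broadcasts it to every existing row
new_df['empty_col'] = np.nan

# assign() returns a new DataFrame and leaves new_df unchanged
# ('notes' is a hypothetical column name)
with_notes = new_df.assign(notes=np.nan)

print(new_df['empty_col'].isna().all())  # → True
print('notes' in new_df.columns)         # → False
```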