How to read one or multiple text files into a DataFrame with Python?

When wrangling data with Pandas, you’ll eventually work with multiple types of data sources. We already covered how to get Pandas to interact with Excel spreadsheets, SQL databases, and so on. In today’s tutorial, we will learn how to use Python 3 to import text (.txt) files into a Pandas DataFrame. As expected, the process is relatively simple to follow.

Example: Reading a text file to a DataFrame in Pandas

Suppose that you have a text file named interviews.txt, which contains tab-delimited data and no header row.

We’ll go ahead and load the text file using pd.read_csv():

import pandas as pd

hr = pd.read_csv('interviews.txt', names =['month', 'first', 'second'])

hr.head()

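The result will look distorted because you haven’t specified the tab as your column delimiter. By default, pd.read_csv() splits on commas, so each line is read as a single value into the first column and the remaining columns are filled with NaN.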

Specifying the \t escape sequence (the tab character) as your delimiter will fix your DataFrame:

hr = pd.read_csv('interviews.txt', delimiter='\t', names =['month', 'first', 'second'])

hr.head()
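If you don’t have interviews.txt at hand, the difference between the two calls above can be reproduced with a small throwaway file. The following is a minimal sketch: the sample rows and the temporary file location are made up for illustration and are not the tutorial’s actual data. It uses sep, which read_csv() accepts interchangeably with delimiter.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample rows standing in for interviews.txt
# (tab-delimited, no header row)
sample = "January\t12\t7\nFebruary\t15\t9\nMarch\t9\t4\n"
columns = ["month", "first", "second"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "interviews.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # Default comma delimiter: each line is read as one field,
    # so 'first' and 'second' end up empty (NaN)
    hr_raw = pd.read_csv(path, names=columns)

    # Tab delimiter: each line is split into three fields
    hr = pd.read_csv(path, sep="\t", names=columns)

print(hr_raw)
print(hr)
```

Comparing the two printed frames shows why the delimiter matters: only the second one has values in all three columns.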

Importing multiple text files to Pandas

This is a more interesting case, in which you need to import several text files located in one directory into a single DataFrame. The files might contain data extracted from a third-party system, a database and so forth.

Before we go on, we’ll need to import a couple of modules from the Python standard library:

import os, glob

Now use the following code:

# Define relative path to folder containing the text files
files_folder = "../data/"

# Create dataframe list by using a list comprehension
files = [pd.read_csv(file, delimiter='\t', names=['month', 'first', 'second']) for file in glob.glob(os.path.join(files_folder, "*.txt"))]

# Concatenate dataframe list
files_df = pd.concat(files)
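The same steps can also be written as an explicit loop, which leaves room for a few optional refinements. The sketch below is one possible variation, not the only way to do it. It sorts the matched paths, because glob.glob() does not guarantee any particular order. It passes ignore_index=True to pd.concat() so the combined frame gets a fresh 0..n-1 index instead of repeating each file’s own row numbers. It also records where each row came from. The sample files, their contents and the source_file column name are assumptions made up for this illustration.

```python
import glob
import os
import tempfile

import pandas as pd

columns = ["month", "first", "second"]

with tempfile.TemporaryDirectory() as files_folder:
    # Hypothetical sample files standing in for the real data folder
    samples = {
        "team_a.txt": "January\t12\t7\nFebruary\t15\t9\n",
        "team_b.txt": "January\t3\t1\n",
    }
    for name, content in samples.items():
        with open(os.path.join(files_folder, name), "w", encoding="utf-8") as f:
            f.write(content)

    frames = []
    # Sort the paths so the files are read in a predictable order
    for path in sorted(glob.glob(os.path.join(files_folder, "*.txt"))):
        df = pd.read_csv(path, sep="\t", names=columns)
        # Keep track of which file each row was read from
        df["source_file"] = os.path.basename(path)
        frames.append(df)

    # ignore_index=True renumbers the rows of the combined frame
    files_df = pd.concat(frames, ignore_index=True)

print(files_df)
```

Without ignore_index=True, the concatenated frame keeps the original row labels of every file, so labels such as 0 and 1 appear more than once.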

Once your DataFrame is populated, you can further analyze and visualize your data.

Additional learning