How to read one or multiple text files into a Pandas DataFrame?

When wrangling data with Pandas, you’ll eventually work with multiple types of data sources. We have already covered how to get Pandas to interact with Excel spreadsheets, SQL databases and so on. In today’s tutorial, we will learn how to use Python 3 to import text (.txt) files into a Pandas DataFrame. The process is relatively simple to follow.

Example: Reading one text file to a DataFrame in Python

Suppose that you have a text file named interviews.txt, which contains tab-delimited data.

We’ll go ahead and load the text file using pd.read_csv():

import pandas as pd

hr = pd.read_csv('interviews.txt', names=['month', 'first', 'second'])

hr.head()

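The result will look distorted because you haven’t specified the tab as your column delimiter. By default, pd.read_csv() splits on commas, so each tab-separated line is not broken into the three columns.

Specifying the \t escape sequence as your delimiter will fix your DataFrame: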

hr = pd.read_csv('interviews.txt', delimiter='\t', names=['month', 'first', 'second'])

hr.head()
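To make the difference concrete, here is a minimal, self-contained sketch that compares both calls. The sample rows and the temporary file are hypothetical stand-ins for interviews.txt, whose real contents are not shown here; sep is used as the equivalent of the delimiter argument.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample rows standing in for the real interviews.txt
sample = "January\t12\t7\nFebruary\t9\t4\nMarch\t15\t11\n"
columns = ["month", "first", "second"]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "interviews.txt")
    with open(path, "w") as handle:
        handle.write(sample)

    # Default comma separator: each line is read as one unsplit field
    without_tab = pd.read_csv(path, names=columns)

    # Tab separator: each line is split into the three named columns
    with_tab = pd.read_csv(path, sep="\t", names=columns)

print(without_tab)
print(with_tab)
```

Printing both DataFrames side by side is a quick way to confirm that the delimiter is the cause when an import looks wrong.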

Importing multiple text files to Python Pandas DataFrames

This is a more interesting case, in which you need to import several text files located in one directory of your file system into a single Pandas DataFrame. Your text files could contain data extracted from a third-party system, a database and so forth.

Before we go on, we’ll need to import a couple of modules from the Python standard library:

import os, glob

Now use the following code:

# Define the relative path to the folder containing the text files
files_folder = "../data/"

# Read each text file into its own DataFrame using a list comprehension;
# sorted() makes the load order predictable
files = [pd.read_csv(file, delimiter='\t', names=['month', 'first', 'second']) for file in sorted(glob.glob(os.path.join(files_folder, "*.txt")))]

# Concatenate the list of DataFrames into one, renumbering the rows
files_df = pd.concat(files, ignore_index=True)
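When several files are stacked together, it is often useful to remember which file each row came from. The sketch below is one possible way to do that, not the only one: the file names q1.txt and q2.txt, their contents and the source_file column are all hypothetical, and a temporary directory stands in for the data folder.

```python
import glob
import os
import tempfile

import pandas as pd

columns = ["month", "first", "second"]

# Hypothetical sample files standing in for the real data folder
samples = {
    "q1.txt": "January\t12\t7\nFebruary\t9\t4\n",
    "q2.txt": "April\t8\t3\n",
}

with tempfile.TemporaryDirectory() as files_folder:
    for name, content in samples.items():
        with open(os.path.join(files_folder, name), "w") as handle:
            handle.write(content)

    # Sort the matches so the files are read in a predictable order
    paths = sorted(glob.glob(os.path.join(files_folder, "*.txt")))

    # Read each file and tag its rows with the file name (assumed column name)
    frames = [
        pd.read_csv(path, sep="\t", names=columns).assign(
            source_file=os.path.basename(path)
        )
        for path in paths
    ]

# Stack the per-file DataFrames and renumber the rows from 0
combined = pd.concat(frames, ignore_index=True)
print(combined)
```

Note that pd.concat() raises a ValueError when given an empty list, so it is worth checking that the glob pattern matched at least one file before concatenating.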

Once you have your DataFrame populated, you can further analyze and visualize your data using Pandas.
