When wrangling data with Pandas, you’ll eventually work with multiple types of data sources. We already covered how to get Pandas to interact with Excel spreadsheets, SQL databases, and so on. In today’s tutorial, we will learn how to use Python 3 to import text (.txt) files into a Pandas DataFrame. The process, as expected, is relatively simple to follow.
Example: Reading a text file to a DataFrame in Pandas
Suppose that you have a text file named interviews.txt, which contains tab delimited data.
We’ll go ahead and load the text file using pd.read_csv():
import pandas as pd

hr = pd.read_csv('interviews.txt', names=['month', 'first', 'second'])
hr.head()
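The result will look distorted because you haven’t specified the tab as your column delimiter. By default, read_csv() splits on commas, so each line lands whole in the first column and the remaining columns are filled with NaN.
Specifying the \t escape sequence (the tab character) as your delimiter will fix your DataFrame data: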
hr = pd.read_csv('interviews.txt', delimiter='\t', names=['month', 'first', 'second'])
hr.head()
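To see the difference between the two calls without having interviews.txt at hand, here is a minimal sketch that writes a small tab-delimited file to a temporary folder and reads it both ways. The sample rows and the variable names hr_default and hr_tab are made up for illustration; they are not the data from the original file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample rows standing in for interviews.txt (not the article's data)
sample = "January\t12\t7\nFebruary\t9\t4\nMarch\t15\t11\n"
columns = ['month', 'first', 'second']

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'interviews.txt')
    with open(path, 'w') as handle:
        handle.write(sample)

    # Default comma delimiter: each whole line lands in the 'month' column
    hr_default = pd.read_csv(path, names=columns)

    # Tab delimiter: the three fields are split into their own columns
    hr_tab = pd.read_csv(path, delimiter='\t', names=columns)

print(hr_default)
print(hr_tab)
```

In the first DataFrame the 'first' and 'second' columns contain only NaN; in the second, each field sits in its own column. Note that delimiter is an alias for the sep parameter of read_csv(), so sep='\t' works the same way.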
Importing multiple text files to Pandas
This is a more interesting case, in which you need to import several text files located in one directory into a single DataFrame. Those files could contain data extracted from a third-party system, a database, and so forth.
Before we go on we’ll need to import a couple of Python libraries:
import os, glob
Now use the following code:
# Define relative path to folder containing the text files
files_folder = "../data/"

# Create dataframe list by using a list comprehension
files = [pd.read_csv(file, delimiter='\t', names=['month', 'first', 'second'])
         for file in glob.glob(os.path.join(files_folder, "*.txt"))]

# Concatenate dataframe list
files_df = pd.concat(files)
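The same approach can be tried end to end with the sketch below, which creates two small tab-delimited files in a temporary folder and then combines them. The file names, the sample rows and the extra source column are assumptions added for illustration. Two optional refinements are included: sorted() makes the file order predictable, and ignore_index=True gives the combined DataFrame a fresh 0..n-1 index instead of repeating each file's own row numbers.

```python
import glob
import os
import tempfile

import pandas as pd

columns = ['month', 'first', 'second']

# Hypothetical file names and contents, used only for this illustration
samples = {
    'q1.txt': "January\t12\t7\nFebruary\t9\t4\n",
    'q2.txt': "April\t8\t3\n",
}

with tempfile.TemporaryDirectory() as files_folder:
    for name, text in samples.items():
        with open(os.path.join(files_folder, name), 'w') as handle:
            handle.write(text)

    # glob.glob() returns paths in arbitrary order, so sort them first
    paths = sorted(glob.glob(os.path.join(files_folder, '*.txt')))

    # Read each file and record which file every row came from
    frames = [
        pd.read_csv(path, delimiter='\t', names=columns)
          .assign(source=os.path.basename(path))
        for path in paths
    ]

# ignore_index=True renumbers the rows of the combined DataFrame
files_df = pd.concat(frames, ignore_index=True)
print(files_df)
```

Keeping the source file name in a column is a design choice rather than a requirement; it helps trace a suspicious row back to the file it was read from.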
Once you have your DataFrame populated, you can further analyze and visualize your dataset.