Our task for today is to read the contents of a text or CSV file into a list. Each list item should contain one line of text.
Read a text file contents to a list
We’ll first define the file system path to the file we would like to read, then open it in read mode ('r'). Next, we’ll call the readlines method of the resulting TextIOWrapper object, which returns the file contents as a list with one element per line.
Here’s a simple snippet that accomplishes that:
from pathlib import Path
# define the path to the text file we would like to read
# (raw string, so the backslash is not treated as an escape character)
dir_path = Path(r'C:\WorkDir')
file_name = 'file_to_read.txt'
file_path = dir_path.joinpath(file_name)
# check if the file exists. Read it line by line if it does
if file_path.is_file():
    with open(file_path, 'r') as f:
        text_list = f.readlines()
    print(text_list)
else:
    print("Your input file doesn't exist")
As expected, the result is a list:
['This is a sample text file that i have just created.\n', 'This is the second line.\n', 'And this is the third.']
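If you want to try this without a file under C:\WorkDir, the following is a minimal sketch that writes a throwaway sample file to a temporary directory and reads it back. The helper name read_lines, the sample text and the explicit utf-8 encoding are assumptions made for illustration; they are not part of the snippet above.

```python
import tempfile
from pathlib import Path


def read_lines(file_path, encoding='utf-8'):
    """Return the lines of a text file as a list, keeping trailing newlines."""
    with open(file_path, 'r', encoding=encoding) as f:
        return f.readlines()


# create a throwaway sample file so the sketch runs on any machine
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = Path(tmp_dir) / 'file_to_read.txt'
    sample_path.write_text(
        'This is the first line.\nThis is the second line.\nAnd this is the third.',
        encoding='utf-8',
    )
    text_list = read_lines(sample_path)

print(text_list)
```

Passing the encoding explicitly is a design choice: without it, open falls back to a platform-dependent default, which can differ between Windows and other systems.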
Removing the newlines
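To get rid of the newlines (\n) in the list elements, we can create a new list using a list comprehension. Note that strip() also removes any other leading and trailing whitespace; use rstrip('\n') if you want to remove only the trailing newline: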
no_newlines_list = [element.strip() for element in text_list]
print(no_newlines_list)
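To make the difference between the available options concrete, here is a small sketch comparing strip(), rstrip('\n') and splitlines(). The sample list, including its indented first element, is made up for illustration.

```python
# hypothetical sample data: the first element has extra spaces around the text
text_list = ['  indented line \n', 'This is the second line.\n', 'And this is the third.']

# strip() removes whitespace on both sides, so the indentation is lost too
stripped_list = [element.strip() for element in text_list]

# rstrip('\n') removes only trailing newline characters
rstripped_list = [element.rstrip('\n') for element in text_list]

# splitlines() works on the whole text at once and drops the line endings
split_list = ''.join(text_list).splitlines()

print(stripped_list)
print(rstripped_list)
print(split_list)
```

Which one fits depends on whether leading and trailing spaces in a line carry meaning for you; if they do, rstrip('\n') or splitlines() is the safer choice.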
Read multiple text files into list of lists
Our next task is to read more than one file. As we saw before, the readlines() method of the TextIOWrapper object returns the lines of a file as a list. As we’ll be accessing multiple files, we’ll append the list for each file to an outer list, which gives us a list of lists.
Here’s a simple snippet that you can use:
import glob
import os
# define work directory (raw string, so the backslashes are kept literally)
path_dir = r'C:\WorkDir\WorkFolderTxt'
# define list of files to access in a specific directory
txt_file_list = glob.glob(os.path.join(path_dir, '*.txt'))
# define list to hold the lines of all files
mult_text_list = []
# read through all files and append the content of each to the list of lists
for file in txt_file_list:
    with open(file, 'r') as f:
        s_text_list = f.readlines()
    mult_text_list.append(s_text_list)
print(mult_text_list)
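The same idea can also be sketched with pathlib instead of the glob module. The helper name read_files_to_lists, the sample file names and the utf-8 encoding below are assumptions for illustration. The files are sorted because the order in which glob yields its matches is not guaranteed.

```python
import tempfile
from pathlib import Path


def read_files_to_lists(dir_path, pattern='*.txt'):
    """Return one list of lines per matching file, in sorted file-name order."""
    mult_text_list = []
    for file_path in sorted(Path(dir_path).glob(pattern)):
        with open(file_path, 'r', encoding='utf-8') as f:
            mult_text_list.append(f.readlines())
    return mult_text_list


# create a few throwaway files so the sketch runs on any machine
with tempfile.TemporaryDirectory() as tmp_dir:
    tmp_path = Path(tmp_dir)
    (tmp_path / 'a.txt').write_text('first file, line 1\nfirst file, line 2\n', encoding='utf-8')
    (tmp_path / 'b.txt').write_text('second file, line 1\n', encoding='utf-8')
    (tmp_path / 'notes.csv').write_text('not matched by the pattern\n', encoding='utf-8')
    mult_text_list = read_files_to_lists(tmp_path)

print(mult_text_list)
```

Each inner list holds the lines of one file, so mult_text_list[0] contains the lines of the first matching file.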
Additional learning
How can you get rid of newline characters in a Python string?