How to open and read an excel file with Python?

Problem

Working with spreadsheets is essential when dealing with data. In this tutorial we would like to show you how to open and read excel worksheets (xls / xlsx file formats) using Python 3.

Solution

We have two libraries in Python that can help us to work with the excel sheet. We will discuss more on how to utilize the libraries in our code to read the excel sheet.

Using xlrd module:

We can import xldr to open and read an excel file in python code. Make sure that we are not reading the .xlsx sheet as it is not supported. Using the rows and columns count, we can easily read data from an excel sheet. Let’s display the output of an excel using a sample example code.

Note:

  • pip install xlrd – Before we import xlrd in the code, make sure to install the xldr in the system.

Code:

import xlrd
list_of_rows = []
file_path = "D:/applications.xls"
excel_workbook = xlrd.open_workbook(file_path)
excel_tab = excel_workbook.sheet_by_index(0)
def getrowvalue(value,count): #function to get the value
    print("Row",count,":",value)
for i in range(excel_tab.nrows):
    for j in range(excel_tab.ncols):
        list_of_rows.append(excel_tab.cell_value(i,j))
    getrowvalue(list_of_rows,i)
    list_of_rows = []

Output:

Row 0 : ['Name', 'OS']
Row 1 : ['ABC-APP', 'Windows']
Row 2 : ['CDA-APP ', 'Unix']
Row 3 : ['LKUYKI', 'Windows']
Row 4 : ['CRYSTAL', 'Windows']

Using openpyxl module:

The openpyxl works the same way as xlrd. The difference between the xlrd is that we can read the .xlsx file using this module and the range for the number of columns and rows starts with 1.

Note:

  • pip install openpyxl before importing openpyxl in the code.

Code:

import openpyxl
list_of_rows = []
file_path = "D:/applications.xlsx"
excel_workbook = openpyxl.load_workbook(file_path)
excel_workbook.active = 0
excel_tab = excel_workbook.active
print(excel_tab)
def getrowvalue(value,count):
    print("Row",count,":",value)
for i in range(1, excel_tab.max_row+1):
    for j in range(1, excel_tab.max_column+1):
        list_of_rows.append(excel_tab.cell(row = i, column = j).value)
    getrowvalue(list_of_rows,i)
    list_of_rows = []

Output:

Row 1 : ['Name', 'OS']
Row 2 : ['ABC-APP', 'Windows']
Row 3 : ['CDA-APP ', 'Unix']
Row 4 : ['LKUYKI', 'Windows']
Row 5 : ['CRYSTAL', 'Windows']

Note: On top of the recipes shown above, the Pandas Data Analysis library offers very powerful functions for working with spreadsheets. Here’s a good example to get you started with Pandas and Excel.