Readers of this blog know that we are pretty big on finding ways to automate mundane and boring tasks. Python is great for automation, specially when it pertains to the Data Analysis domain.
Lately, we are increasingly spending more an more time with the Pandas data Analysis library, and using it for data munging and visualization. We rather work in Python instead of dealing with the complexity of Excel Visual Basic for Applications (VBA) when manipulating data.
That said, there are situations in which we choose to feed data from Pandas back into an Excel workbook. This allows us to combine Python and Pandas flexibility and speed with great visualization capabilities and the obvious ubiquity of Excel.
Today we’ll show you how to write and export data from a Pandas DataFrame to an Excel file (xlsx). We’ll deal with two scenarios:
- Save a Pandas DataFrame to one Excel worksheet.
- Write Pandas DataFrames to multiple worksheets in a workbook.
Note: This tutorial requires some basic knowledge of Python programming and specifically the Pandas library.
Export and Write Pandas DataFrame to Excel
Here’s the process in a nutshell:
- First off, ensure that you have installed the pandas, openpyxl and xlsxwriter libraries into your environment. Here’s how to install Pandas in your Python development environment.
- Initialize / Load data your Pandas DataFrame.
- Use the DataFrame.to_excel method to export your data
Load add-on libraries
Before we start, we’ll need to import a few libraries into Python as shown below. Go ahead and type this Python 3 code into you favorite Python editor.
import pandas as pd import openpyxl import xlsxwriter
Defining our DataFrame
Now let’s create the data that we’ll be using in this tutorial
Note: If you want to learn how to aggregate Data in Python you can look into our tutorial on grouping Python data according to one or multiple columns.
Pandas export to new and existing Excel xls and xlsx files
Now, we would like to export the DataFrame that we just created to an Excel workbook. Pandas has a very handy to_excel method that allows to do exactly that. Let’s use it:
The code will create the languages.xlsx file and export the dataset into Sheet1
Note: In case that you don’t have the openpyxl package installed in your Python environment, you’ll get the following error: ModuleNotFoundError: No module named ‘openpyxl’ . The solution is relatively simple: from your terminal or if you are using Anaconda, from the Anaconda Navigator, type: pip install openpyxl. Then re-run your code again.
If you want to get more fancy you can use the parameters of the to_excel method to customize your data transfer. In our case we set the sheet name, chose not to export the DataFrame index picked the columns to export and even define frozen panes in the worksheet.
df.to_excel("languages1.xlsx", sheet_name="Languages", index=False, , freeze_panes=(1,1), columns=["avg_salary", "language"])
Python to Export Pandas DataFrames to multiple sheets
So far so good, but what if we would like to import data into several worksheets? That’s possible as well, although a bit more elaborated. Note the usage of the xlsxwriter Python library.
We’ll first split our data to three different DataFrames:
#we define three lists S1= data["language"] S2= data["avg_salary"] S3= data["applications"] # We then create three dataframes df1=pd.DataFrame(S1, columns=["language"]) df2=pd.DataFrame(S2 , columns=["avg_salary"]) df3=pd.DataFrame(S3, columns=["applications"]) # We then group the dataframes into a list for more efficient processing dflist= [df1,df2,df3]
Now we’ll import multiple dataframes into Excel (in this case our three dfs to three different worksheets).
# We'll define an Excel writer object and the target file Excelwriter = pd.ExcelWriter("languages_multiple.xlsx",engine="xlsxwriter") #We now we'll loop the list of dataframes for i, df in enumerate (dflist): df.to_excel(Excelwriter, sheet_name="Sheet" + str(i+1),index=False) #And finally we save the file Excelwriter.save()
Voi’la, now go ahead to your directory and look into the languages_multiple.xlsx file. You are done :-).