How to plot dictionary data with Python and Pandas?

In this tutorial we explain how to plot a dictionary with multiple values per key using pandas, a popular third-party data analysis library (it is not part of the Python standard library).

Step #1: Import Pandas

First, we need the pandas library. If pandas is not installed in your Python development environment, install it first (for example with pip install pandas) to avoid a ModuleNotFoundError.

Once installed, import pandas into your Python program:

import pandas as pd

Step #2: Create a dictionary

We’ll now create the dictionary containing the key-value pairs that we would like to plot. In the next step we will transform the dictionary into a DataFrame object: the keys become the column names, and the values, which are lists, become the respective column values.

sales_dict = {
    'area': ['B2B', 'Online', 'Retail', 'B2C'],
    'direct': [441, 463, 382, 409],
    'telesales': [324, 201, 184, 285]
}

Step #3: Create a DataFrame

Next, we initialize a DataFrame from the dictionary:

data = pd.DataFrame(sales_dict)
data.head()

Let’s look at the DataFrame contents:

     area  direct  telesales
0     B2B     441        324
1  Online     463        201
2  Retail     382        184
3     B2C     409        285
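
To confirm how the dictionary maps onto the DataFrame, the short check below reuses the sales_dict from Step #2 and inspects the column names and the shape. It is only a sketch for verification; the variable names columns, n_rows and n_cols are introduced here for illustration.

```python
import pandas as pd

sales_dict = {
    'area': ['B2B', 'Online', 'Retail', 'B2C'],
    'direct': [441, 463, 382, 409],
    'telesales': [324, 201, 184, 285],
}

data = pd.DataFrame(sales_dict)

# each dictionary key becomes a column, in insertion order
columns = list(data.columns)

# each list element becomes one row
n_rows, n_cols = data.shape

print(columns)
print(n_rows, n_cols)
```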

Step #4: Plot a Line Chart

The pandas DataFrame object has many useful built-in methods, one of which is plot(). We can use it to render a quick chart of our DataFrame data:

data.plot(x='area', title = 'Direct vs Tele sales', colormap = 'viridis');

Here’s our simple line plot:

Step #5: Plot a Bar Chart

Plotting a bar chart is similarly easy:

data.plot(kind='bar', x='area', title = 'Direct vs Tele sales', colormap = 'viridis');

Here’s the bar graph:

Creating a stacked bar chart is also possible by passing the stacked=True parameter:

data.plot(kind='bar', x='area', stacked=True, title = 'Direct vs Tele sales', colormap = 'viridis');

And here’s the stacked plot:
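
As a numeric cross-check of what stacking means, the sketch below compares the top edge of each stacked bar with the row totals of the two numeric columns. The Agg backend line is an assumption made so the code also runs without a display; the names totals and top_edges are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, drop this line in a notebook
import pandas as pd

data = pd.DataFrame({
    'area': ['B2B', 'Online', 'Retail', 'B2C'],
    'direct': [441, 463, 382, 409],
    'telesales': [324, 201, 184, 285],
})

ax = data.plot(kind='bar', x='area', stacked=True)

# expected total height of each stacked bar: direct + telesales per area
totals = data[['direct', 'telesales']].sum(axis=1).tolist()

# the second series (telesales) is drawn on top of the first (direct),
# so its top edge is the full height of the stack
top_edges = [bar.get_y() + bar.get_height() for bar in ax.containers[1]]

print(totals)
print(top_edges)
```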

Case Study 2: Create charts based on two Pandas DataFrame columns

We will start by importing the pandas library and then create an example DataFrame.

import pandas as pd

# create data for the DataFrame
biz_date = pd.Series(pd.date_range(start='2/10/23', end='2/23/23', freq='B'))
revenues = [4746, 3371, 4747, 4601, 4501, 3438, 2899, 3860, 3960, 3337]
expenses = [2268, 2863, 3298, 2339, 3072, 2491, 3446, 2733, 2894, 2673]

# construct the DataFrame (the lists use their own names so the
# DataFrame variable does not overwrite the raw revenue figures)
revenue = pd.DataFrame(dict(revenues=revenues, expenses=expenses), index=biz_date)

Plot two columns in the same scatter chart

Creating a scatter plot with pandas is relatively simple. We’ll first build the chart, then assign it a title using the set_title() method. Make sure to pass your column names (as strings) to the x and y parameters.

perf_scatter = revenue.plot.scatter(x='revenues', y='expenses')
perf_scatter.set_title("Revenue vs Expenses");

Here’s our chart:

Draw columns against each other in a bar chart

An effective way to compare two key performance indicators is to plot them side by side on the same axis in a bar chart.

We first create the chart and assign it a custom colormap. We then reformat the x axis labels to make them more legible. Last, we add a title.

perf_bar = revenue.plot.bar(cmap='Dark2')
perf_bar.set_xticklabels([d.strftime('%m-%d-%y') for d in revenue.index])
perf_bar.set_title("Revenue vs Expenses");

Here’s our bar chart:
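
To see what the tick label formatting produces without drawing anything, the sketch below applies the same strftime pattern to the business-day range used in this case study. The variable name labels is illustrative.

```python
import pandas as pd

# same business-day range as in the example DataFrame
biz_date = pd.date_range(start='2/10/23', end='2/23/23', freq='B')

# month-day-year strings, one per business day
labels = [d.strftime('%m-%d-%y') for d in biz_date]

print(len(labels))
print(labels[0], labels[-1])
```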

Plot two columns on different figure axes

The last example draws the two columns on separate axes of the same figure.

import matplotlib.pyplot as plt

fig, ax = plt.subplots(1,2, figsize = (10,6));
# rotates the xticks by 45 degrees
fig.autofmt_xdate(rotation=45)
# titles
ax[0].set_title('Revenue')
ax[1].set_title('Expenses')

# draw line charts
ax[0].plot(revenue.index, revenue['revenues']) 
ax[1].plot(revenue.index, revenue['expenses']);
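
A similar layout can also be obtained without calling matplotlib directly, by letting pandas create one subplot per column. The sketch below is an alternative to the code above, not a replacement for it; the DataFrame name perf is hypothetical and the Agg backend is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, drop this line in a notebook
import pandas as pd

biz_date = pd.date_range(start='2/10/23', end='2/23/23', freq='B')
perf = pd.DataFrame(
    {
        'revenues': [4746, 3371, 4747, 4601, 4501, 3438, 2899, 3860, 3960, 3337],
        'expenses': [2268, 2863, 3298, 2339, 3072, 2491, 3446, 2733, 2894, 2673],
    },
    index=biz_date,
)

# subplots=True draws every column on its own axes;
# layout=(1, 2) places them side by side, rot=45 rotates the x tick labels
axes = perf.plot(
    subplots=True,
    layout=(1, 2),
    figsize=(10, 6),
    rot=45,
    title=['Revenue', 'Expenses'],
    legend=False,
)

titles = [a.get_title() for a in axes.ravel()]
print(titles)
```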

Plotting two columns as histograms

In the same fashion we can plot our columns as two side-by-side histogram charts:

import matplotlib.pyplot as plt

fig, ax = plt.subplots(1,2, figsize = (10,6));
# rotates the xticks by 45 degrees
fig.autofmt_xdate(rotation=45)
# titles
ax[0].set_title('Revenue frequency')
ax[1].set_title('Expense frequency')

# draw histograms
ax[0].hist(revenue['revenues'])
ax[1].hist(revenue['expenses']);

Here’s our chart:

Case Study 3: Drawing Line Plots based on Pandas DataFrames

We’ll start by defining a simple example DataFrame that you can use to follow along.

import pandas as pd

dates = pd.Series(pd.date_range(start='3/1/23', end = '3/14/23', freq='B'))
interviews = [37, 33, 31, 34, 39, 35, 36, 32, 30, 38]
hired = [16, 12, 18, 19, 10, 11, 15, 13, 14, 17]

revenue = pd.DataFrame(dict(interviews=interviews, hired=hired), index = dates)

Pandas line plot example

Let’s start by drawing a simple graph. We will use the DataFrame plot() method:


# define the line properties
kwargs = dict (linestyle='dashed', color='green', linewidth=1.2)
# x axes will by default show the df index
line_plot = revenue.plot( y = 'interviews', figsize= (10,6),**kwargs ) 
line_plot.set_title('Python programming daily interviews')
line_plot.grid()
line_plot.set_xlabel('Date')
line_plot.set_ylabel('Interviews');

Here we go:

Line plot for multiple columns and lines with pandas

Most likely you’ll need to draw graphs that represent multiple columns. The trick is to pass a list of column names to the y parameter.

# multiple columns line plot

kwargs= dict (linestyle='dashed', color=['red', 'green'], linewidth=1.2)
line_plot = revenue.plot( y = ['interviews', 'hired'], figsize= (10,6),**kwargs ) 
line_plot.set_title('Python programming daily interviews vs hires')
line_plot.grid()
line_plot.set_xlabel('Date')
line_plot.set_ylabel('Count');

Here’s the chart:

Line plots with markers

If we would like to draw markers on our lines, we need to modify the line parameters. We use the marker parameter and pass 'x', 'o' or any other marker style.

kwargs= dict (linestyle='dashed', color=['red', 'green'], linewidth=1.2, marker='x')
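
For completeness, here is a sketch of the full marker example, rebuilding the example DataFrame so it runs on its own. The DataFrame keeps the name revenue used in this case study; the Agg backend line and the markers variable are assumptions added for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, drop this line in a notebook
import pandas as pd

dates = pd.date_range(start='3/1/23', end='3/14/23', freq='B')
revenue = pd.DataFrame(
    {
        'interviews': [37, 33, 31, 34, 39, 35, 36, 32, 30, 38],
        'hired': [16, 12, 18, 19, 10, 11, 15, 13, 14, 17],
    },
    index=dates,
)

# marker='x' draws an x at every data point of both lines
kwargs = dict(linestyle='dashed', color=['red', 'green'], linewidth=1.2, marker='x')
line_plot = revenue.plot(y=['interviews', 'hired'], figsize=(10, 6), **kwargs)
line_plot.set_title('Python programming daily interviews vs hires')
line_plot.grid()

# read the marker style back from the drawn lines
markers = [line.get_marker() for line in line_plot.get_lines()]
print(markers)
```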

Groupby and line plot with pandas

In this section we group our interview data by ISO week number, derived from the DatetimeIndex, and then plot the weekly totals.

rev_week = revenue.groupby(revenue.index.isocalendar().week).agg(number_of_interviews= ('interviews', 'sum'))
rev_week_plot = rev_week.plot()
rev_week_plot.set_xticks([9,10,11]);

Here we go:
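
The weekly totals behind that plot can also be inspected directly. The sketch below repeats the grouping on the same interview figures and prints the result; it assumes a pandas version that provides DatetimeIndex.isocalendar(), and the names weeks and weekly_totals are illustrative.

```python
import pandas as pd

dates = pd.date_range(start='3/1/23', end='3/14/23', freq='B')
revenue = pd.DataFrame(
    {'interviews': [37, 33, 31, 34, 39, 35, 36, 32, 30, 38]},
    index=dates,
)

# group the rows by ISO week number and sum the interviews in each week
rev_week = revenue.groupby(revenue.index.isocalendar().week).agg(
    number_of_interviews=('interviews', 'sum')
)

weeks = [int(w) for w in rev_week.index]
weekly_totals = [int(v) for v in rev_week['number_of_interviews']]

print(weeks)
print(weekly_totals)
```

The three week numbers are the values passed to set_xticks() in the plotting code.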

More questions/queries:

What are the plot components that I can customize?

You can customize the look and feel of several plot components, such as the title, background color, size, fonts, line style, markers and legend.
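
As a rough illustration, the sketch below touches each of those components on the sales data from the first example. The specific values (colors, sizes, the legend title 'Channel') are arbitrary choices made for this sketch, and the Agg backend is an assumption for running without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, drop this line in a notebook
import pandas as pd

data = pd.DataFrame({
    'area': ['B2B', 'Online', 'Retail', 'B2C'],
    'direct': [441, 463, 382, 409],
    'telesales': [324, 201, 184, 285],
})

ax = data.plot(
    x='area',
    title='Direct vs Tele sales',  # title
    figsize=(8, 5),                # size in inches
    linestyle='--',                # line style
    marker='o',                    # markers
    fontsize=12,                   # tick label font size
)

ax.set_facecolor('whitesmoke')                 # background color of the plot area
ax.legend(title='Channel', loc='upper right')  # legend title and position

legend_title = ax.get_legend().get_title().get_text()
print(ax.get_title(), legend_title)
```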

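Can I save the plot to a file instead of showing it on screen?

savefig() is a method of the matplotlib figure, not of the DataFrame plot accessor. Use the following command to save your plot to an image file (for example png, jpg or pdf):

data.plot().get_figure().savefig('my_plot.png')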
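
A slightly fuller sketch of saving a plot is shown below. It draws the chart, retrieves the matplotlib figure from the returned axes and writes it to a temporary directory; the output path, the dpi value and the Agg backend are assumptions chosen for this example.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, nothing is shown on screen
import pandas as pd

data = pd.DataFrame({
    'area': ['B2B', 'Online', 'Retail', 'B2C'],
    'direct': [441, 463, 382, 409],
    'telesales': [324, 201, 184, 285],
})

# DataFrame.plot() returns a matplotlib Axes; the figure owns savefig()
ax = data.plot(x='area', title='Direct vs Tele sales')
fig = ax.get_figure()

# hypothetical output location: a fresh temporary directory
out_path = os.path.join(tempfile.mkdtemp(), 'my_plot.png')
fig.savefig(out_path, dpi=150, bbox_inches='tight')

print(out_path)
```

The file format is inferred from the extension, so changing the name to my_plot.pdf would write a PDF instead.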

What if columns representing numbers contain text or string data?

Before plotting your data you will need to convert it to a numeric type. With errors='coerce', values that cannot be parsed as numbers become NaN:

my_df['my_col'] = pd.to_numeric(my_df['my_col'], errors = 'coerce')
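
The effect of the conversion can be seen on a small made-up column. In the sketch below, my_df and its values are hypothetical sample data; the 'n/a' entry cannot be parsed and is turned into NaN, which can then be dropped before plotting.

```python
import pandas as pd

# hypothetical sample: numbers stored as text, plus one unparseable entry
my_df = pd.DataFrame({'my_col': ['441', '463', 'n/a', '409']})

# convert text to numbers; anything that is not a number becomes NaN
my_df['my_col'] = pd.to_numeric(my_df['my_col'], errors='coerce')

# optionally drop the rows that could not be converted
cleaned = my_df.dropna(subset=['my_col'])

print(my_df['my_col'].dtype)
print(cleaned['my_col'].tolist())
```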