How to plot string data in Python with Pandas and Matplotlib?

Plot string data on the x axis in Python

Assume that we have the following two Python lists:

team = ['A', 'B', 'C', 'D']        # list of strings
revenue = [140, 170, 180, 145]     # list of integers

We would like to create a chart based on the string data in the team list. We will use the matplotlib library to render our chart. The team list will be plotted on the x axis and the revenue list on the y axis.

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(team, revenue)
ax.set_title('Revenue by Team');

This will render a line chart with the team names as labels on the x axis and the revenue values on the y axis.

Plot Python strings as numbers

Let’s now look at the following two lists of string objects:

team = ['A', 'B', 'C', 'D']
sales = ['140', '170', '180', '145']

We will try to render a bar chart using this data:

fig, ax = plt.subplots()
ax.bar(team, sales);

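Here’s the result: the sales list elements were interpreted as strings instead of as numeric data. Matplotlib treats strings as categories and places them on the axis in order of first appearance, so the bar heights do not reflect the actual sales figures.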
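To see why such a chart is misleading, it helps to imitate what happens to string values on a categorical axis: each distinct string is numbered in order of first appearance, regardless of the number it spells. The sketch below reproduces that numbering in plain Python. It does not call matplotlib, and category_positions is a hypothetical helper name used only for illustration.

```python
# Illustrative only: imitate how string values are positioned on a
# categorical axis, i.e. numbered 0, 1, 2, ... in order of first appearance.
def category_positions(values):
    """Map each value to the index at which it first appeared."""
    positions = {}
    for value in values:
        if value not in positions:
            positions[value] = len(positions)
    return [positions[value] for value in values]

sales = ['140', '170', '180', '145']

print(category_positions(sales))   # [0, 1, 2, 3]
print([int(e) for e in sales])     # [140, 170, 180, 145]
```

Note that '145' ends up above '180' simply because it appears later in the list, not because of its value.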

Related: No numeric data to plot in Python and Pandas.

We can create a new list of integers using a list comprehension and plot it against the team list:

# define new list of integers
sales_numbers = [int(e) for e in sales]

# plot
fig, ax = plt.subplots()
ax.bar(team, sales_numbers)
ax.set_title('Sales by Team');

Here’s the result: the bars are now scaled by the numeric sales values.
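The list comprehension above assumes that every entry is a clean integer string; int() raises a ValueError otherwise. If the data might contain stray whitespace, thousands separators, or placeholders, a slightly more defensive conversion could look like the sketch below. The to_number helper and the raw_sales sample values are made up for illustration, and treating commas as thousands separators is an assumption that may not hold for your data.

```python
def to_number(text):
    """Return text as a float, or None if it cannot be parsed."""
    try:
        # Assumption: commas are thousands separators, not decimal marks.
        return float(text.replace(',', '').strip())
    except ValueError:
        return None

team = ['A', 'B', 'C', 'D']
raw_sales = ['140', ' 170 ', '1,180', 'n/a']   # made-up sample values

# Pair each team with its converted value and drop entries that failed to parse.
pairs = [(t, to_number(s)) for t, s in zip(team, raw_sales)]
teams_clean = [t for t, v in pairs if v is not None]
sales_clean = [v for t, v in pairs if v is not None]

print(teams_clean)   # ['A', 'B', 'C']
print(sales_clean)   # [140.0, 170.0, 1180.0]
```

The cleaned lists can then be passed to ax.bar() in the same way as team and sales_numbers.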

Render a chart from a DataFrame containing strings

As we already learned, the pandas data analysis library wraps matplotlib’s plotting capabilities, which allows us to quickly render charts while wrangling our data. We will first import the pandas library, then create a DataFrame. The next step will be to call the plot() DataFrame method to render our chart.

import pandas as pd
sales_df = pd.DataFrame(dict(team=team, sales_numbers=sales_numbers))

# plots the numeric sales_numbers column as a line against the row index
sales_df.plot();
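When the numbers arrive in the DataFrame as strings, the column can be converted before plotting, and the string column can be used for the x axis labels. The following is a minimal sketch under a few assumptions: the column names are chosen for illustration, and the Agg backend is selected only so the code runs without a display (it is not needed in a notebook).

```python
import matplotlib
matplotlib.use("Agg")   # non-interactive backend; optional in a notebook
import pandas as pd

team = ['A', 'B', 'C', 'D']
sales = ['140', '170', '180', '145']   # numbers stored as strings

sales_df = pd.DataFrame({'team': team, 'sales': sales})

# Convert the string column to a numeric dtype. Passing errors='coerce'
# would turn unparseable entries into NaN instead of raising an error.
sales_df['sales'] = pd.to_numeric(sales_df['sales'])

# Use the string column for the x axis and the numeric column for bar heights.
ax = sales_df.plot(x='team', y='sales', kind='bar',
                   title='Sales by Team', legend=False)
```

Passing x='team' explicitly matters here: without it, plot() uses the row index for the x axis and the team names do not appear on the chart.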