In this tutorial we’ll learn the basics of plotting a bar chart from a DataFrame using Python. If you already use Pandas for data wrangling and all you need is a simple chart, you can use the basic built-in Pandas plots.
Create Bar plot from Pandas DataFrame
Proceed as follows to plot a bar chart in pandas:
- Create a pandas DataFrame from a file, database or dictionary.
- Use the DataFrame plot() method to define your chart.
- Customize your chart as needed: resize it, add a legend, set the chart title, set the axis tick labels, etc.
Importing data to the Python DataFrame
We’ll start by creating a pandas DataFrame, importing the data with the read_csv() function.
import pandas as pd
my_df = pd.read_csv('interviews.csv')
print(my_df)
Here’s our DataFrame populated with some HR interview data:
| | language | first_interview | second_interview |
|---|---|---|---|
| 1 | Kotlin | 70.0 | 81.0 |
| 2 | VisualBasic | 84.0 | 75.0 |
| 3 | PHP | 82.0 | 91.0 |
| 4 | Python | 86.0 | 81.0 |
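If you don’t have an interviews.csv file at hand, the same DataFrame can be built from a dictionary. This is a minimal sketch: the values are typed in by hand from the table above, and the `interviews` variable name is only an illustration.

```python
import pandas as pd

# Same values as in the table above, typed in by hand instead of read from a CSV file.
interviews = {
    "language": ["Kotlin", "VisualBasic", "PHP", "Python"],
    "first_interview": [70.0, 84.0, 82.0, 86.0],
    "second_interview": [81.0, 75.0, 91.0, 81.0],
}
my_df = pd.DataFrame(interviews)

# Note: the index created here starts at 0, while the table above shows labels 1 to 4.
print(my_df)
```

The row labels differ from the table, but that has no effect on the charts below, since they take their categories from the language column.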
Make a bar graph from a DataFrame
If we want to create a simple chart we can use the df.plot() method. Note that we’ll use the kind= parameter to specify the chart type. Several chart types are available, such as histograms, pies, area, scatter and density plots.
my_df.plot(kind='bar', x='language');
Here’s the early version of our very simple side by side bar chart:
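The kind= argument also has a method-style spelling: pandas exposes each chart type as an accessor on plot, so `plot.bar()` draws the same chart as `plot(kind='bar')`. The sketch below assumes the sample data typed in by hand and a non-interactive matplotlib backend, so it can run as a plain script.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import pandas as pd

# Sample data copied by hand from the table above.
my_df = pd.DataFrame({
    "language": ["Kotlin", "VisualBasic", "PHP", "Python"],
    "first_interview": [70.0, 84.0, 82.0, 86.0],
    "second_interview": [81.0, 75.0, 91.0, 81.0],
})

ax_kind = my_df.plot(kind="bar", x="language")
ax_accessor = my_df.plot.bar(x="language")  # same chart, spelled as a method

# Both calls draw one rectangle per language and numeric column (4 x 2 here).
print(len(ax_kind.patches), len(ax_accessor.patches))
```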
Resizing Pandas charts and adding a title
Modifying the Pandas chart size is easy using the figsize parameter, which takes a (width, height) tuple in inches. We also use the title parameter to set the chart heading.
my_df.plot(kind='bar', x='language', title='Interviews per month', figsize=(11, 6));
Here’s the chart:
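One way to check that these settings were applied is to look at the object that plot() returns, which is a matplotlib Axes. The sketch below rebuilds the sample data by hand and uses a non-interactive backend so it can run outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import pandas as pd

# Sample data copied by hand from the table above.
my_df = pd.DataFrame({
    "language": ["Kotlin", "VisualBasic", "PHP", "Python"],
    "first_interview": [70.0, 84.0, 82.0, 86.0],
    "second_interview": [81.0, 75.0, 91.0, 81.0],
})

ax = my_df.plot(kind="bar", x="language", title="Interviews per month", figsize=(11, 6))

# The title lives on the Axes; the size belongs to the parent Figure.
chart_title = ax.get_title()
width, height = ax.figure.get_size_inches()
print(chart_title, width, height)
```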
Making a horizontal Python graph
By using the barh chart type, we can render the bars horizontally.
my_df.plot(kind='barh', x='language', title='Interviews per month');
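With barh the call stays the same, but the categories passed through x= end up on the vertical axis. A small sketch, assuming the hand-typed sample data and a non-interactive backend:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import pandas as pd

# Sample data copied by hand from the table above.
my_df = pd.DataFrame({
    "language": ["Kotlin", "VisualBasic", "PHP", "Python"],
    "first_interview": [70.0, 84.0, 82.0, 86.0],
    "second_interview": [81.0, 75.0, 91.0, 81.0],
})

ax = my_df.plot(kind="barh", x="language", title="Interviews per month")

# The language names are now the tick labels of the y-axis.
y_labels = [tick.get_text() for tick in ax.get_yticklabels()]
print(y_labels)
```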
Stacked Python plot with Pandas
We can easily stack our bar chart as needed using the stacked parameter.
# stacked pandas bar graph
my_df.plot(kind='bar', x='language', stacked=True, figsize=(11, 6));
Here’s the chart:
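To see what stacking does to the geometry, we can inspect the rectangles that were drawn: the second column starts where the first one ends, so the top of each stacked bar equals the row total. This is an illustrative sketch using the hand-typed sample data; it relies on matplotlib keeping one bar container per plotted column.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import pandas as pd

# Sample data copied by hand from the table above.
my_df = pd.DataFrame({
    "language": ["Kotlin", "VisualBasic", "PHP", "Python"],
    "first_interview": [70.0, 84.0, 82.0, 86.0],
    "second_interview": [81.0, 75.0, 91.0, 81.0],
})

ax = my_df.plot(kind="bar", x="language", stacked=True, figsize=(11, 6))

# ax.containers holds one group of rectangles per plotted column.
second_series = ax.containers[1]

# Top edge of each upper rectangle = bottom (y) + height.
tops = [rect.get_y() + rect.get_height() for rect in second_series]
totals = (my_df["first_interview"] + my_df["second_interview"]).tolist()
print(tops)
print(totals)
```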
Change the color of our Python chart
The DataFrame.plot() method can also map the chart to a named color map (cmap). Here’s an example:
# change the color map
my_df.plot(kind='bar', x='language', stacked=True, cmap='Dark2', figsize=(11, 6));
And here’s the chart:
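If a color map doesn’t give you the exact colors you want, plot() also accepts a color argument with one color per plotted column. In this sketch the two hex codes are arbitrary picks for illustration, and the sample data is typed in by hand.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import pandas as pd
from matplotlib.colors import to_hex

# Sample data copied by hand from the table above.
my_df = pd.DataFrame({
    "language": ["Kotlin", "VisualBasic", "PHP", "Python"],
    "first_interview": [70.0, 84.0, 82.0, 86.0],
    "second_interview": [81.0, 75.0, 91.0, 81.0],
})

# One explicit color per numeric column (arbitrary example values).
ax = my_df.plot(kind="bar", x="language", stacked=True,
                color=["#1b9e77", "#d95f02"], figsize=(11, 6))

# Read back the fill color of the first rectangle in each series.
used_colors = [to_hex(series[0].get_facecolor()) for series in ax.containers]
print(used_colors)
```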
Modify the chart axes labels
We can also change the style of the axis tick labels. In this example we rotate the x-axis tick labels with rot and enlarge the tick label font size with fontsize.
my_df.plot(kind='bar', x='language', stacked=True, cmap='Dark2', rot=30, fontsize=13, figsize=(11, 6));
And here’s the output:
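The same kind of styling can be applied after the chart has been created, through the matplotlib Axes that plot() returns. The sketch below uses matplotlib’s tick_params() and, unlike the fontsize argument above, only touches the x-axis. The sample data is typed in by hand.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import pandas as pd

# Sample data copied by hand from the table above.
my_df = pd.DataFrame({
    "language": ["Kotlin", "VisualBasic", "PHP", "Python"],
    "first_interview": [70.0, 84.0, 82.0, 86.0],
    "second_interview": [81.0, 75.0, 91.0, 81.0],
})

ax = my_df.plot(kind="bar", x="language", stacked=True, cmap="Dark2", figsize=(11, 6))

# Rotate and enlarge only the x-axis tick labels.
ax.tick_params(axis="x", labelrotation=30, labelsize=13)

first_label = ax.get_xticklabels()[0]
print(first_label.get_rotation(), first_label.get_fontsize())
```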
Stacked columns plotting in Pandas – practical example
Step 1: Arrange your DataFrame
The first step is to create a simple DataFrame containing the data that you would like to plot. In our case those are made-up figures from a hiring campaign.
import pandas as pd
area = ['Java', 'R', 'Python', 'Javascript', 'Python']
applications = [150, 168, 158, 75, 98]
hired = [17,14, 17, 12, 15]
campaign = dict(area = area, hired = hired, applications = applications )
hrdf = pd.DataFrame(data=campaign)
hrdf.head()
Let’s look at the data:
| | area | hired | applications |
|---|---|---|---|
| 0 | Java | 17 | 150 |
| 1 | R | 14 | 168 |
| 2 | Python | 17 | 158 |
| 3 | Javascript | 12 | 75 |
| 4 | Python | 15 | 98 |
Our goal is to create a simple stacked plot showing the number of hires vs. applications for every programming language. Looking into the DataFrame rows, we can see that two rows (indexes 2 and 4) pertain to the Python area.
Hence we will first group our DataFrame rows by area:
hr_grp = hrdf.groupby(['area']).sum()
print(hr_grp)
Here’s our grouped data:
| area | hired | applications |
|---|---|---|
| Java | 17 | 150 |
| Javascript | 12 | 75 |
| Python | 32 | 256 |
| R | 14 | 168 |
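The grouping step can be checked in isolation. The sketch below rebuilds the hiring data, shows that Python appears twice before grouping, and then sums per area. Selecting the two numeric columns explicitly is an optional variation on the call above that makes the intent clearer.

```python
import pandas as pd

hrdf = pd.DataFrame({
    "area": ["Java", "R", "Python", "Javascript", "Python"],
    "hired": [17, 14, 17, 12, 15],
    "applications": [150, 168, 158, 75, 98],
})

# 'Python' occurs twice, so plotting hrdf directly would draw two separate Python bars.
print(hrdf["area"].value_counts())

# Summing per area collapses the duplicates into a single row per language.
hr_grp = hrdf.groupby("area")[["hired", "applications"]].sum()
print(hr_grp.loc["Python"])
```

After grouping, area becomes the index of the DataFrame, which is why the plotting calls in the next step don’t need an x= argument.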
Step 2: Render your bar chart with Pandas
Now we’ll render our chart. Pandas plotting is built on top of matplotlib, so matplotlib has to be installed, but we don’t need to import matplotlib or Seaborn explicitly to render simple charts.
hr_grp.plot(kind='bar', stacked=True, title='Hired vs applications by Area');
Here’s our chart:
Note the usage of the stacked=True parameter. Without it, the columns show up one next to the other:
hr_grp.plot(kind='bar', title='Hired vs applications by Area');
Step 3: Customize your stacked plot
Now it is time to fix up the look and feel of our chart. We will change the chart colors, reposition the legend and tidy up our chart axes.
bar_chart = hr_grp.plot(kind='bar', stacked=True, cmap='viridis', title='Hired vs applications by Area')  #1
bar_chart.legend(bbox_to_anchor=(1.3, 1))  #2
bar_chart.set_xlabel('Programming Language', fontsize=13);  #3
bar_chart.set_ylabel('Hired vs Applicants', fontsize=13);  #4
Here’s our nice stacked chart:
Explanation
- #1 – setting the chart color map
- #2 – repositioning the legend, here anchored to the right of the plot area
- #3 – setting the x-axis label text and font size
- #4 – setting the y-axis label text and font size
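Because the legend is anchored outside the plot area, it can end up partly cut off when the chart is saved to a file. One possible way around this is to pass bbox_inches='tight' to matplotlib’s savefig(). The sketch below repeats steps 1 to 3 in a single script. The output file name and the use of the system temp directory are assumptions made for the example.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import pandas as pd

hrdf = pd.DataFrame({
    "area": ["Java", "R", "Python", "Javascript", "Python"],
    "hired": [17, 14, 17, 12, 15],
    "applications": [150, 168, 158, 75, 98],
})
hr_grp = hrdf.groupby("area")[["hired", "applications"]].sum()

bar_chart = hr_grp.plot(kind="bar", stacked=True, cmap="viridis",
                        title="Hired vs applications by Area")
bar_chart.legend(bbox_to_anchor=(1.3, 1))
bar_chart.set_xlabel("Programming Language", fontsize=13)
bar_chart.set_ylabel("Hired vs Applicants", fontsize=13)

# bbox_inches="tight" asks matplotlib to grow the saved area so that
# elements placed outside the axes, such as this legend, are included.
out_path = os.path.join(tempfile.gettempdir(), "hired_vs_applications.png")  # hypothetical file name
bar_chart.figure.savefig(out_path, bbox_inches="tight")
print(out_path)
```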