In this step-by-step tutorial, I will explain how to quickly create a bar chart and a line chart from data stored in a JSON (JavaScript Object Notation) file using pandas and matplotlib.
Step #1: Acquire your JSON file
In this example, we will assume that we have already acquired the JSON file from an API or a legacy database.
This JSON file, which holds actual sales versus target by area, is quite simple. Here are its contents:
[{
"area": "North",
"target": 350,
"sales": 145
},
{
"area": "West",
"target": 320,
"sales": 113
},
{
"area": "East",
"target": 310,
"sales": 165
},
{
"area": "South",
"target": 300,
"sales": 147
}]
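If you do not have such a file at hand, the short sketch below writes the same records to disk so you can follow along. The relative file name `sales_targets.json` and the variable names are assumptions made for illustration; adjust the path to your own file system.

```python
import json
from pathlib import Path

# Sample records copied from the JSON listing above.
records = [
    {"area": "North", "target": 350, "sales": 145},
    {"area": "West", "target": 320, "sales": 113},
    {"area": "East", "target": 310, "sales": 165},
    {"area": "South", "target": 300, "sales": 147},
]

# Assumed output location (current working directory).
json_file_path = Path("sales_targets.json")
json_file_path.write_text(json.dumps(records, indent=2), encoding="utf-8")

# Read the file back to confirm that the round trip preserves the data.
loaded = json.loads(json_file_path.read_text(encoding="utf-8"))
print(len(loaded), "records written to", json_file_path)
```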
Step #2: Create a DataFrame from a JSON file
The next step is to import the pandas library and call pd.read_json to load the file into a DataFrame.
import pandas as pd
# define the path to the JSON file in your file system
# (raw string, so the backslashes are not treated as escape sequences)
json_file_path = r'C:\Temp\Examples\sales_targets.json'
sales = pd.read_json(json_file_path)
Note: Make sure that your JSON file is correctly formatted. If it contains invalid characters, pandas will raise a ValueError such as the following:
ValueError: Unexpected character found when decoding array value
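One way to track down such a problem is to check the text with Python's built-in `json` module before handing it to pandas, since its error reports the position of the first problem. The sketch below is only an illustration; the helper name `check_json_text` and the sample strings are made up for this example.

```python
import json


def check_json_text(text):
    """Return None if text is valid JSON, else a short description of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno, colno and msg are attributes of json.JSONDecodeError
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


good = '[{"area": "North", "target": 350, "sales": 145}]'
bad = '[{"area": "North", "target": 350, "sales": 145,}]'  # trailing comma is invalid JSON

print(check_json_text(good))
print(check_json_text(bad))
```

For a file on disk, you could pass the result of reading the file as text to the same helper.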
We now have a pandas DataFrame. Let's look at its contents:
sales.head()
| | area | target | sales |
|---|---|---|---|
| 0 | North | 350 | 145 |
| 1 | West | 320 | 113 |
| 2 | East | 310 | 165 |
| 3 | South | 300 | 147 |
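If your JSON arrives as text (for example, straight from an API response) rather than as a file, a possible alternative is to parse it with the `json` module and pass the resulting list of records to the `pd.DataFrame` constructor. The sketch below assumes the same four records as above; the variable `json_text` is a stand-in for whatever your source returns.

```python
import json

import pandas as pd

# Stand-in for JSON text received from an API or read from a file.
json_text = """
[{"area": "North", "target": 350, "sales": 145},
 {"area": "West", "target": 320, "sales": 113},
 {"area": "East", "target": 310, "sales": 165},
 {"area": "South", "target": 300, "sales": 147}]
"""

# Each JSON object becomes one row; the keys become the column names.
records = json.loads(json_text)
sales = pd.DataFrame(records)

# Quick sanity checks: dimensions and column types.
print(sales.shape)
print(sales.dtypes)
```

Checking `dtypes` is worthwhile before plotting, because numeric columns that were quoted as strings in the JSON would not plot as numbers.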
Step #3: Plot a stacked column chart with pandas
Our next step is to render a chart. Luckily, a pandas DataFrame has a plot() method that lets us do exactly that:
sales.plot(x = 'area', kind='bar', stacked=True, cmap = 'viridis');
Note: we have plenty of tutorials on Python data visualization that cover the basics of plotting with the pandas and matplotlib libraries.
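If you are running this as a plain script rather than in a notebook, the chart will not appear inline. The following self-contained sketch builds the same DataFrame in memory and saves the stacked chart to an image file. The `Agg` backend and the output file name `sales_stacked.png` are assumptions chosen so that the script runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; assumed so no display is needed

import pandas as pd

# Same data as the JSON file, built in memory for a self-contained example.
sales = pd.DataFrame(
    {
        "area": ["North", "West", "East", "South"],
        "target": [350, 320, 310, 300],
        "sales": [145, 113, 165, 147],
    }
)

# plot() returns a matplotlib Axes, which we can use to save the figure.
ax = sales.plot(x="area", kind="bar", stacked=True, colormap="viridis")
ax.figure.savefig("sales_stacked.png", bbox_inches="tight")  # assumed file name
```

Note that `stacked=True` places the sales bar on top of the target bar for each area, so the total bar height is the sum of the two columns.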
Here’s the stacked chart we created based on the json contents:
Step #4: Create a multiple line plot with matplotlib
Our last step is to show how we can also use matplotlib directly to render a chart. In this case, we would like to draw a plot with one line for targets and one for sales.
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(sales['area'], sales['target'], color = 'green', label='Targets')
ax.plot(sales['area'], sales['sales'], color = 'blue', label='Sales');
ax.set_title('Sales vs Targets');
ax.legend();
Here’s our line plot:
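To reproduce the line plot outside a notebook, a minimal self-contained script might look like the sketch below. It rebuilds the data in memory instead of reading the JSON file; the `Agg` backend, the point markers, the axis label, and the output file name `sales_vs_targets.png` are additions made for illustration and are not part of the steps above.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; assumed so no display is needed

import matplotlib.pyplot as plt
import pandas as pd

# Same data as the JSON file, built in memory for a self-contained example.
sales = pd.DataFrame(
    {
        "area": ["North", "West", "East", "South"],
        "target": [350, 320, 310, 300],
        "sales": [145, 113, 165, 147],
    }
)

fig, ax = plt.subplots()
# One line per series; string x values are treated as categories.
ax.plot(sales["area"], sales["target"], color="green", marker="o", label="Targets")
ax.plot(sales["area"], sales["sales"], color="blue", marker="o", label="Sales")
ax.set_title("Sales vs Targets")
ax.set_xlabel("Area")
ax.legend()

fig.savefig("sales_vs_targets.png", bbox_inches="tight")  # assumed file name
```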