How to create a Scatter plot in Pandas with Python?

Here’s how to quickly render a scatter chart with Matplotlib, a data visualization library. This assumes that you have already defined the X and Y column data:

import matplotlib.pyplot as plt
plt.scatter(x_col_data,y_col_data, marker = 'o');

Python scatter plots example – a step-by-step guide

Importing libraries

We will start by importing the libraries, seeding NumPy's random number generator so that the example data is reproducible, and setting the plot style:

import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
np.random.seed(10)
plt.style.use('ggplot')

Creating data for the Chart


We’ll define the x and y variables as well as create a DataFrame.

# define x and y
x = np.random.poisson(80, 200)
y = np.random.poisson(32, 200)

# Wrap x,y values into a DataFrame
my_data = pd.DataFrame.from_dict({'Duration':x, 'Cost':y})
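Before plotting, it can help to confirm that the DataFrame has the expected shape and columns. The following is a minimal sketch that rebuilds the same data so it runs on its own; the use of describe() is our suggestion and not part of the original steps.

```python
import numpy as np
import pandas as pd

np.random.seed(10)

# Same data as above: 200 Poisson-distributed samples per column
x = np.random.poisson(80, 200)
y = np.random.poisson(32, 200)
my_data = pd.DataFrame({'Duration': x, 'Cost': y})

print(my_data.shape)       # (200, 2)
print(my_data.head())      # first five rows
print(my_data.describe())  # count, mean, std and quartiles per column
```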

Drawing a chart with Pandas

Once we have our DataFrame, we can invoke the DataFrame.plot.scatter() method to render the scatter using the built-in plotting capabilities of Pandas, which are based on Matplotlib.

my_data.plot.scatter(x='Duration', y='Cost', title= 'Simple scatter with Pandas');
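The call above returns a Matplotlib Axes object, so the chart can be adjusted further after it is drawn. The sketch below assumes a non-interactive backend (Agg) so that it also runs outside Jupyter; the figure size and axis label texts are illustrative choices.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: no display available; omit this line in Jupyter
import numpy as np
import pandas as pd

np.random.seed(10)
my_data = pd.DataFrame({'Duration': np.random.poisson(80, 200),
                        'Cost': np.random.poisson(32, 200)})

# plot.scatter returns the Axes it drew on
ax = my_data.plot.scatter(x='Duration', y='Cost',
                          title='Simple scatter with Pandas', figsize=(8, 5))

# Adjust the axis labels through the returned Axes (label texts are illustrative)
ax.set_xlabel('Trip duration')
ax.set_ylabel('Trip cost')
```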

Here’s our chart:

Changing the plot colors

We can change the color of our scatter points by passing the c parameter.

# color change
my_data.plot.scatter(x='Duration', y='Cost', title= 'Simple scatter with Pandas', c='green');
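Besides a single color name, the c parameter can also take a column name, in which case each point is colored by that column's value. The following is a sketch under that assumption; coloring by Cost and using the viridis colormap are illustrative choices.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: no display available; omit this line in Jupyter
import numpy as np
import pandas as pd

np.random.seed(10)
my_data = pd.DataFrame({'Duration': np.random.poisson(80, 200),
                        'Cost': np.random.poisson(32, 200)})

# Color each point by its Cost value, mapped through the 'viridis' colormap
ax = my_data.plot.scatter(x='Duration', y='Cost', c='Cost', colormap='viridis',
                          title='Scatter colored by Cost')

# The scatter points are stored as the first collection on the Axes
points = ax.collections[0]
```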

Displaying the scatter legend in Pandas

We use the label parameter to define the legend text. Because a scatter plot draws a single series, the label should be a single string and not a list. Note the usage of the bbox_to_anchor parameter to offset the legend from the chart.

my_data.plot.scatter(x='Duration', y='Cost', title='Simple scatter with Pandas', label='Trip cost vs. duration').legend(bbox_to_anchor=(1.02, 1));

Rendering a Plot with Matplotlib

Matplotlib offers a rich set of capabilities to create static charts. Here’s a simple example:


import matplotlib.pyplot as plt
plt.scatter(x,y)
plt.title('Simple scatter with Matplotlib');
plt.xlabel('Duration')
plt.ylabel('Cost');
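The same chart can also be written with Matplotlib's object-oriented interface, which keeps an explicit reference to the figure and axes. This is a sketch of an alternative style; it rebuilds x and y so that it runs on its own and assumes a non-interactive backend.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: no display available; omit this line in Jupyter
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(10)
x = np.random.poisson(80, 200)
y = np.random.poisson(32, 200)

# Create the figure and axes explicitly, then draw on the axes
fig, ax = plt.subplots(figsize=(8, 5))
ax.scatter(x, y)
ax.set_title('Simple scatter with Matplotlib')
ax.set_xlabel('Duration')
ax.set_ylabel('Cost')
```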

Change the marker type and size

We can modify the marker style with the marker parameter and the marker size with the s parameter.

# marker style and size
plt.scatter(x,y, marker = 'x', s=70 );
plt.title('Scatter example with custom markers');
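The s parameter also accepts one value per point, which makes it possible to scale each marker by a data value. The sketch below is an illustration only: the scaling factor of 3 and the alpha value are arbitrary assumptions chosen for readability.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: no display available; omit this line in Jupyter
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(10)
x = np.random.poisson(80, 200)
y = np.random.poisson(32, 200)

fig, ax = plt.subplots()
# One marker size per point, proportional to y; alpha makes overlapping points visible
points = ax.scatter(x, y, marker='o', s=y * 3, alpha=0.5)
ax.set_title('Scatter example with marker size scaled by Cost')
```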

Adding a legend to the chart

A legend only lists plot elements that have a label, so pass the label parameter to plt.scatter and then call plt.legend():

plt.scatter(x, y, label='Trips')
plt.legend();
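A legend is most useful when the chart contains more than one labeled series. The following sketch splits the data into two groups around the median cost; the split and the label texts are hypothetical choices made for illustration only.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: no display available; omit this line in Jupyter
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(10)
my_data = pd.DataFrame({'Duration': np.random.poisson(80, 200),
                        'Cost': np.random.poisson(32, 200)})

# Hypothetical grouping: rows at or below the median cost vs. rows above it
median_cost = my_data['Cost'].median()
low = my_data[my_data['Cost'] <= median_cost]
high = my_data[my_data['Cost'] > median_cost]

fig, ax = plt.subplots()
ax.scatter(low['Duration'], low['Cost'], label='Cost at or below median')
ax.scatter(high['Duration'], high['Cost'], label='Cost above median')

# Anchor the legend's upper-left corner just outside the right edge of the axes
legend = ax.legend(bbox_to_anchor=(1.02, 1), loc='upper left')
```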

Scatter plot in Python with Seaborn

For completeness, we are including a simple example that leverages the Seaborn library (also built on Matplotlib).

Note that you will need to ensure that the Seaborn library is installed in your Python development environment before using it in Jupyter or another Python IDE.

import seaborn as sns

x = my_data['Duration']
y = my_data['Cost']
ax = sns.scatterplot(x=x, y=y)
ax.set_title('Scatter example')
ax.set_xlabel('Duration')
ax.set_ylabel('Cost');

Here’s our chart: