In today’s short tutorial we’ll learn how to easily convert DataFrame columns to different types.
Let’s start by defining a very simple DataFrame built from a dictionary of lists. You might want to follow along by running the code in your Jupyter Notebook.
import pandas as pd
employee = ['Eric', 'Tom', 'John', 'Dean']
sales = [140.3, 140.2, 180.2, 167.1]
salary = [87, 92, 76, 98]
# create the dataframe
revenue = pd.DataFrame(dict(emp=employee, sales=sales, sal=salary))
Let’s look into the column data types:
revenue.dtypes
We’ll receive the following output:
emp object
sales float64
sal int64
dtype: object
Change column to integer
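We’ll start by using the astype method to convert the sales column to the int data type. Note that astype returns a new Series rather than modifying the column in place, and that converting floats to integers discards the fractional part. Run the following code: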
# convert to int
revenue['sales'].astype('int')
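To make the behavior of the conversion concrete, here is a minimal, self-contained sketch. It rebuilds the same DataFrame and uses the explicit 'int64' type instead of 'int'; the variable name sales_int is introduced here only for illustration.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this snippet runs on its own.
revenue = pd.DataFrame({
    "emp": ["Eric", "Tom", "John", "Dean"],
    "sales": [140.3, 140.2, 180.2, 167.1],
    "sal": [87, 92, 76, 98],
})

# astype returns a new Series; the fractional part is dropped, not rounded.
sales_int = revenue["sales"].astype("int64")
print(sales_int.tolist())       # [140, 140, 180, 167]

# The original column keeps its float64 dtype.
print(revenue["sales"].dtype)   # float64
```

If you need rounding rather than truncation, one option is to call round() on the column before converting it.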
Change column to float in Pandas
The next example sets the column type to float:
revenue['sal'].astype('float')
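Because astype does not change the DataFrame itself, you may want to store the converted column. The sketch below shows one way to do that on a copy, so the original data stays untouched; revenue_copy is a name assumed for this example only.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this snippet runs on its own.
revenue = pd.DataFrame({
    "emp": ["Eric", "Tom", "John", "Dean"],
    "sales": [140.3, 140.2, 180.2, 167.1],
    "sal": [87, 92, 76, 98],
})

# Work on a copy and assign the converted column back to it.
revenue_copy = revenue.copy()
revenue_copy["sal"] = revenue_copy["sal"].astype("float64")

print(revenue_copy["sal"].tolist())  # [87.0, 92.0, 76.0, 98.0]
print(revenue_copy["sal"].dtype)     # float64
print(revenue["sal"].dtype)          # int64 (original is unchanged)
```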
Convert column to string type
The third example converts a column to the string type, which you can do with the following snippet:
revenue['emp'].astype('string')
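As a quick check, the following sketch confirms that the converted column is recognized as a string column and that the usual .str methods work on it. The name emp_str is introduced here for illustration.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this snippet runs on its own.
revenue = pd.DataFrame({
    "emp": ["Eric", "Tom", "John", "Dean"],
    "sales": [140.3, 140.2, 180.2, 167.1],
    "sal": [87, 92, 76, 98],
})

# Convert the employee names to the pandas string type.
emp_str = revenue["emp"].astype("string")

# The result is recognized as a string column.
print(pd.api.types.is_string_dtype(emp_str))  # True

# String methods are available through the .str accessor.
print(emp_str.str.upper().tolist())  # ['ERIC', 'TOM', 'JOHN', 'DEAN']
```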
Converting multiple columns to float, int and string
You can change the type of multiple columns at once by passing the astype method a dictionary that maps each column name to its target type. We’ll persist the changes by assigning the result to a new DataFrame.
# putting everything together
revenue_2 = revenue.astype({'emp':'string', 'sales':'int', 'sal':'float'})
Now we can easily check the dtypes:
revenue_2.dtypes
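The result will look similar to the following. The exact integer width you get for 'int' depends on your platform and on your pandas and NumPy versions, so you may see int64 instead of int32, and the string column may display as string[python]: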
emp string
sales int32
sal float64
dtype: object
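If you want the same dtypes regardless of platform, one option is to spell out the width explicitly. The sketch below is a variation on the snippet above that uses 'int64' and 'float64' instead of 'int' and 'float'.

```python
import pandas as pd

# Rebuild the tutorial's DataFrame so this snippet runs on its own.
revenue = pd.DataFrame({
    "emp": ["Eric", "Tom", "John", "Dean"],
    "sales": [140.3, 140.2, 180.2, 167.1],
    "sal": [87, 92, 76, 98],
})

# Map each column name to an explicit target type.
revenue_2 = revenue.astype({"emp": "string", "sales": "int64", "sal": "float64"})

print(revenue_2["sales"].dtype)  # int64
print(revenue_2["sal"].dtype)    # float64

# The source DataFrame is not modified.
print(revenue["sales"].dtype)    # float64
```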