In this short data analysis tutorial, we'll learn how to access the first column of a pandas DataFrame in Python so that we can process it further as needed.
Creating an example DataFrame
import pandas as pd
hiring_dict = {'month' : ['Jan', 'Feb', 'March', 'April'], 'salary':[140, 145, 145, 190], 'days_to_hire': [45, 34, 23, 22]}
hiring = pd.DataFrame(hiring_dict)
print(hiring)
Here’s our data:
| | month | salary | days_to_hire |
|---|---|---|---|
| 0 | Jan | 140 | 45 |
| 1 | Feb | 145 | 34 |
| 2 | March | 145 | 23 |
| 3 | April | 190 | 22 |
Get the first DataFrame column by name (without index)
We can select a single DataFrame column as a Series by passing its name in brackets:
month_s = hiring['month']
print(month_s)
0      Jan
1      Feb
2    March
3    April
Name: month, dtype: object
As mentioned before, the result is a Pandas Series object.
type(month_s)
The result will be:
pandas.core.series.Series
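If the name of the first column is not known in advance, it can be looked up from `hiring.columns`. The following is a minimal sketch that rebuilds the example data; the variable names `first_label`, `first_series` and `first_frame` are illustrative only. It also shows that passing a list of labels returns a one-column DataFrame rather than a Series.

```python
import pandas as pd

hiring = pd.DataFrame({
    'month': ['Jan', 'Feb', 'March', 'April'],
    'salary': [140, 145, 145, 190],
    'days_to_hire': [45, 34, 23, 22],
})

# Look up the label of the first column instead of hard-coding it.
first_label = hiring.columns[0]

# Single brackets with one label return a Series.
first_series = hiring[first_label]

# Brackets with a list of labels return a one-column DataFrame.
first_frame = hiring[[first_label]]

print(first_label)                  # month
print(type(first_series).__name__)  # Series
print(type(first_frame).__name__)   # DataFrame
```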
Select the first column by index (without name)
We can select one or more columns by integer position using the iloc indexer:
month_s = hiring.iloc[:,0]
In order to subset the first two columns we’ll make a very small tweak to our Python code:
subset = hiring.iloc[:,0:2]
The result in this case will be a DataFrame containing the two leftmost columns, month and salary. Note that the end position of the slice (2) is excluded.
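The type returned by `iloc` depends on how the column position is written. The sketch below, which rebuilds the example data and uses illustrative variable names, compares a scalar position, a one-element list and a slice.

```python
import pandas as pd

hiring = pd.DataFrame({
    'month': ['Jan', 'Feb', 'March', 'April'],
    'salary': [140, 145, 145, 190],
    'days_to_hire': [45, 34, 23, 22],
})

# A scalar column position returns a Series.
first_col = hiring.iloc[:, 0]

# A one-element list keeps the result as a DataFrame.
first_col_df = hiring.iloc[:, [0]]

# A slice selects positions 0 and 1; the stop value 2 is excluded.
first_two = hiring.iloc[:, 0:2]

print(type(first_col).__name__)     # Series
print(type(first_col_df).__name__)  # DataFrame
print(list(first_two.columns))      # ['month', 'salary']
```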
Get a Pandas DataFrame index
Next, we’ll learn how to retrieve a Pandas DataFrame index.
hir_idx = hiring.index
We can now export the index to a list or to a Numpy array:
my_lst = hir_idx.to_list()
#or
my_array = hir_idx.to_numpy()
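As a quick check of the index export, here is a small self-contained sketch. It assumes the DataFrame was built without an explicit index, so pandas assigns a default `RangeIndex`; the names `idx_list` and `idx_array` are illustrative.

```python
import pandas as pd

hiring = pd.DataFrame({
    'month': ['Jan', 'Feb', 'March', 'April'],
    'salary': [140, 145, 145, 190],
    'days_to_hire': [45, 34, 23, 22],
})

hir_idx = hiring.index

# With no explicit index, pandas uses a RangeIndex starting at 0.
print(type(hir_idx).__name__)  # RangeIndex

# Export the index labels to a plain Python list ...
idx_list = hir_idx.to_list()

# ... or to a NumPy array (to_numpy needs no arguments here).
idx_array = hir_idx.to_numpy()

print(idx_list)  # [0, 1, 2, 3]
```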
Get the first column values to a Python list
In a similar fashion we can get and export the first column of our DataFrame to a list object:
months = hiring['month'].to_list()
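The same export also works with a positional selection, which avoids hard-coding the column name. This is a minimal sketch on the example data; `first_col_values` and `first_col_array` are illustrative names.

```python
import pandas as pd

hiring = pd.DataFrame({
    'month': ['Jan', 'Feb', 'March', 'April'],
    'salary': [140, 145, 145, 190],
    'days_to_hire': [45, 34, 23, 22],
})

# Select the first column by position, then convert it to a list.
first_col_values = hiring.iloc[:, 0].to_list()

# A NumPy array is available through to_numpy() if preferred.
first_col_array = hiring.iloc[:, 0].to_numpy()

print(first_col_values)  # ['Jan', 'Feb', 'March', 'April']
```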
Select the first row of our Pandas DataFrame
For completeness, here’s how you can get the first DataFrame row. For more information read our select first DataFrame row tutorial.
hiring.iloc[0]
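Note that `hiring.iloc[0]` returns the row as a Series whose index holds the column names. The sketch below, using illustrative variable names on the example data, also shows two ways to keep the first row as a one-row DataFrame.

```python
import pandas as pd

hiring = pd.DataFrame({
    'month': ['Jan', 'Feb', 'March', 'April'],
    'salary': [140, 145, 145, 190],
    'days_to_hire': [45, 34, 23, 22],
})

# A scalar row position returns a Series indexed by column name.
first_row = hiring.iloc[0]
print(first_row['month'])  # Jan

# A one-element list keeps the result as a one-row DataFrame.
first_row_df = hiring.iloc[[0]]

# head(1) is another way to get a one-row DataFrame.
first_row_head = hiring.head(1)

print(first_row_df.shape)  # (1, 3)
```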