To remove the first column of your R DataFrame, you can run the following code in your R editor (RStudio, Jupyter, etc.):
# with base R
my_df_subset <- my_df[2:ncol(my_df)]
# with dplyr / tidyverse
library(dplyr)
my_df_subset <- my_df %>% select(-1)
I suggest going through the following practical examples to learn more about this fundamental capability in R.
Example DataFrame
We will start by creating a simple R DataFrame that you can use to follow along with these examples.
month <- c('November', 'July', 'December', 'July', 'September', 'June')
office <- c('Paris', 'Toronto', 'Los Angeles', 'Bangkok', 'Hong Kong', 'Bangalore')
language <- c('Python', 'Javascript', 'R', 'Java', 'Javascript', 'R')
salary <- c(127.0, 124.0, 166.0, 134.0, 94.0, 121.0)
interviews_data <- data.frame(month = month, office = office, language = language, salary = salary)
Let’s take a look at our DataFrame columns:
print(names(interviews_data))
This will return the following output:
[1] "month" "office" "language" "salary"
Our goal will be to remove the month column from the DataFrame.
Delete the first column in R DataFrame
There are several ways to get rid of the leftmost column in our DataFrame.
Subset R column by index
In this example we use the ncol function, which returns the number of columns in our DataFrame, to select the required columns (all except the first):
interviews_subset <- interviews_data[2:ncol(interviews_data)]
We can accomplish the same result using the length function. A DataFrame is a list of columns, so its length is the number of columns.
interviews_subset <- interviews_data[2:length(interviews_data)]
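A shorter alternative, not used in the examples above, is a negative index, which drops the column at the given position. The sketch below rebuilds the example DataFrame so that it runs on its own and checks that the three forms agree. It also illustrates a caveat: on a DataFrame with a single column, 2:ncol(df) evaluates to 2:1 and asks for a column that does not exist, while a negative index simply returns zero columns.

```r
# Rebuild the example data frame so this block runs on its own
interviews_data <- data.frame(
  month    = c('November', 'July', 'December', 'July', 'September', 'June'),
  office   = c('Paris', 'Toronto', 'Los Angeles', 'Bangkok', 'Hong Kong', 'Bangalore'),
  language = c('Python', 'Javascript', 'R', 'Java', 'Javascript', 'R'),
  salary   = c(127.0, 124.0, 166.0, 134.0, 94.0, 121.0)
)

# Three base R ways to keep everything except the first column
by_ncol     <- interviews_data[2:ncol(interviews_data)]
by_length   <- interviews_data[2:length(interviews_data)]
by_negative <- interviews_data[-1]   # negative index: drop column 1

# All three should describe the same data frame
stopifnot(
  isTRUE(all.equal(by_ncol, by_length)),
  isTRUE(all.equal(by_ncol, by_negative))
)
print(names(by_negative))

# Single-column case: the negative index leaves a data frame with zero columns
one_col <- interviews_data['month']
print(ncol(one_col[-1]))
```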
Subset R DataFrame by column name
In a similar fashion we can filter out a column by name:
interviews_subset <- interviews_data[!(names(interviews_data) %in% 'month')]
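Two other base R idioms give the same name-based result. They are sketched here as alternatives, not as the method shown above: setdiff keeps every column name except the listed ones, and assigning NULL deletes a column from a copy of the DataFrame.

```r
# Rebuild the example data frame so this block runs on its own
interviews_data <- data.frame(
  month    = c('November', 'July', 'December', 'July', 'September', 'June'),
  office   = c('Paris', 'Toronto', 'Los Angeles', 'Bangkok', 'Hong Kong', 'Bangalore'),
  language = c('Python', 'Javascript', 'R', 'Java', 'Javascript', 'R'),
  salary   = c(127.0, 124.0, 166.0, 134.0, 94.0, 121.0)
)

# Approach from the text: logical mask over the column names
by_mask <- interviews_data[!(names(interviews_data) %in% 'month')]

# Alternative 1: keep every column name except 'month'
by_setdiff <- interviews_data[setdiff(names(interviews_data), 'month')]

# Alternative 2: delete the column from a copy by assigning NULL
by_null <- interviews_data
by_null$month <- NULL

# All three should hold the same columns and values
stopifnot(
  isTRUE(all.equal(by_mask, by_setdiff)),
  isTRUE(all.equal(by_mask, by_null))
)
print(names(by_null))
```

The name-based forms keep working if the column order changes, which position-based indexing does not.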
Drop first column with tidyverse / dplyr
In dplyr we use the select function to subset columns. A negative position drops the column at that position, so in this case:
library(dplyr)
interviews_subset <- interviews_data %>% select(-1)
or alternatively, all columns from the second one onward:
library(dplyr)
interviews_subset <- interviews_data %>% select(seq(2, ncol(interviews_data)))
Here too we can drop the column by name:
library(dplyr)
interviews_subset <- interviews_data %>% select(-month)
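The three dplyr forms above should give the same result, which the sketch below checks on the example data. It also shows all_of(), a tidyselect helper re-exported by dplyr, for the case where the column name is stored in a character variable. The variable name col_to_drop is hypothetical and used only for illustration.

```r
library(dplyr)

# Rebuild the example data frame so this block runs on its own
interviews_data <- data.frame(
  month    = c('November', 'July', 'December', 'July', 'September', 'June'),
  office   = c('Paris', 'Toronto', 'Los Angeles', 'Bangkok', 'Hong Kong', 'Bangalore'),
  language = c('Python', 'Javascript', 'R', 'Java', 'Javascript', 'R'),
  salary   = c(127.0, 124.0, 166.0, 134.0, 94.0, 121.0)
)

by_position <- interviews_data %>% select(-1)
by_range    <- interviews_data %>% select(seq(2, ncol(interviews_data)))
by_name     <- interviews_data %>% select(-month)

# When the name lives in a character variable, wrap it in all_of()
col_to_drop <- 'month'   # hypothetical variable, for illustration only
by_variable <- interviews_data %>% select(-all_of(col_to_drop))

# All four selections should describe the same data frame
stopifnot(
  isTRUE(all.equal(by_position, by_range)),
  isTRUE(all.equal(by_position, by_name)),
  isTRUE(all.equal(by_position, by_variable))
)
print(names(by_variable))
```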
Drop first n columns in R
Similarly, we can drop the first n columns of the DataFrame. In this example we’ll select all but the two leftmost columns:
With base R:
cols_to_remove <- 2
interviews_subset <- interviews_data[(cols_to_remove + 1):ncol(interviews_data)]
With dplyr / tidyverse:
library(dplyr)
cols_to_remove <- 2
interviews_subset <- interviews_data %>% select(seq(cols_to_remove + 1, ncol(interviews_data)))
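Both snippets above assume that n is smaller than the number of columns; when n equals the column count, the sequence from n + 1 to ncol counts backwards and the selection fails. A minimal sketch of a more defensive variant in base R follows. The helper name drop_first_n is hypothetical; it builds a logical mask, so it also behaves sensibly when n is 0 or equal to the number of columns.

```r
# Hypothetical helper: drop the first n columns of a data frame
drop_first_n <- function(df, n) {
  stopifnot(is.data.frame(df), n >= 0, n <= ncol(df))
  # Logical mask: FALSE for the first n positions, TRUE for the rest
  keep <- seq_len(ncol(df)) > n
  df[keep]
}

# Rebuild the example data frame so this block runs on its own
interviews_data <- data.frame(
  month    = c('November', 'July', 'December', 'July', 'September', 'June'),
  office   = c('Paris', 'Toronto', 'Los Angeles', 'Bangkok', 'Hong Kong', 'Bangalore'),
  language = c('Python', 'Javascript', 'R', 'Java', 'Javascript', 'R'),
  salary   = c(127.0, 124.0, 166.0, 134.0, 94.0, 121.0)
)

# Drop the two leftmost columns
print(names(drop_first_n(interviews_data, 2)))

# Edge cases: n = 0 keeps every column, n = ncol(df) keeps none
print(ncol(drop_first_n(interviews_data, 0)))
print(ncol(drop_first_n(interviews_data, ncol(interviews_data))))
```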