Dplyr in R

dplyr is an R package for data exploration, data manipulation, and data cleaning. It provides a small set of easy-to-use functions that are very handy and perform well on data frames.

Below are the important functions of dplyr:
| Function | Description |
|---|---|
| mutate() | Adds new columns, computed as functions of existing columns, at the end of the data frame. |
| select() | Picks only the required columns by name, which helps with a huge dataset that has thousands of variables. |
| filter() | Subsets observations (rows) based on their values. |
| arrange() | Changes the ordering of the rows. It takes a data frame and a set of column names (or more complicated expressions) to order by. |
| summarise() | Reduces multiple values down to a single summary. |
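To make the five functions concrete, here is a minimal sketch that applies each one in turn. It assumes dplyr is already installed and uses the mtcars data frame that ships with R; the kpl column and the variable names (cars, slim, four_cyl, ranked, overview) are made up for illustration.

```r
library(dplyr)

# mtcars is a built-in data frame, so no external data is needed.
# mutate(): add a column computed from an existing one
# (kpl is a hypothetical name; 1 US mpg is about 0.425144 km per litre)
cars <- mutate(mtcars, kpl = mpg * 0.425144)

# select(): keep only the named columns
slim <- select(cars, mpg, cyl, kpl)

# filter(): keep only the rows that satisfy a condition
four_cyl <- filter(slim, cyl == 4)

# arrange(): reorder the rows, here by descending mpg
ranked <- arrange(four_cyl, desc(mpg))

# summarise(): collapse many rows into a single summary row
overview <- summarise(ranked, n_cars = n(), mean_mpg = mean(mpg))

print(head(ranked))
print(overview)
```

Each function takes a data frame as its first argument and returns a new data frame, so the steps can also be chained with the %>% pipe that dplyr makes available.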

Installation:
1. One way to install dplyr is to install the whole tidyverse package, which includes it:
    install.packages("tidyverse")
2. The other way is to install the dplyr package alone:
    install.packages("dplyr")
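Installing is a one-time step, while loading has to happen in every R session. The sketch below shows one common pattern, assuming a CRAN mirror is configured: it installs dplyr only if it is missing, then loads it and prints the installed version.

```r
# Install dplyr only when it is not already available in the library
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr")
}

# Load the package for the current session
library(dplyr)

# Show which version was loaded
print(packageVersion("dplyr"))
```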
In RStudio, there is a cheatsheet for dplyr and tidyr. Below is the path:

The document is downloaded and added here for easy reference.

Thank You!
