The tidyr package is used in R to “tidy” data. Tidy data is clean data that is easy to work with because it follows three rules:
- Each variable is in a column.
- Each observation is in a row.
- Each value is in a cell.
Steps to make messy data tidy:
- Identify the variables in the dataset.
- Use the tools provided by tidyr to move them into columns.
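The two steps above can be sketched on a small example. The data frame `messy` and its column names are hypothetical, made up for illustration: the year is stored in the column headers instead of in its own column, so the variables are really `country`, `year`, and `cases`.

```r
library(tidyr)

# Hypothetical messy data: the variable "year" is hidden in the column names.
messy <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(15, 25),
  check.names = FALSE  # keep "2019" and "2020" as literal column names
)

# Step 1: identify the variables (country, year, cases).
# Step 2: move year and cases into their own columns.
# "-country" means: gather every column except country.
tidy <- gather(messy, key = "year", value = "cases", -country)

print(tidy)
```

After this call each row holds one observation (one country in one year), which matches the three rules of tidy data.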
tidyr provides four main functions for tidying messy data:
| Function | Description |
|---|---|
| gather() | Takes multiple columns and gathers them into key-value pairs. |
| spread() | Takes two columns (key and value) and spreads them into multiple columns. |
| unite() | Merges two or more variables into one variable. |
| separate() | Splits one column into several. Useful when two variables are clumped together in one column. |
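As a rough illustration of the four functions, the sketch below runs each one on small made-up data frames. The data and the names `wide`, `long`, `dates`, and so on are assumptions for the example, not taken from any real dataset.

```r
library(tidyr)

# Hypothetical wide data: one column per subject.
wide <- data.frame(
  name = c("Ann", "Bob"),
  math = c(90, 80),
  art  = c(85, 95)
)

# gather(): collapse the math and art columns into key-value pairs.
long <- gather(wide, key = "subject", value = "score", math, art)

# spread(): the inverse, turning the key column back into separate columns.
wide_again <- spread(long, key = subject, value = score)

# unite(): merge two columns into one, joined by a separator.
dates <- data.frame(year = c(2020, 2021), month = c(1, 12))
united <- unite(dates, col = "year_month", year, month, sep = "-")

# separate(): split the combined column back into two columns.
# The resulting columns are character vectors unless convert = TRUE is set.
separated <- separate(united, col = year_month,
                      into = c("year", "month"), sep = "-")

print(long)
print(wide_again)
print(united)
print(separated)
```

Note that gather() and spread() are inverses of each other, as are unite() and separate(). In tidyr 1.0.0 and later, gather() and spread() still work but are marked as superseded by pivot_longer() and pivot_wider().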
Installation:
1. One way to install tidyr is to install the whole tidyverse, which includes it:
install.packages("tidyverse")
2. The other way is to install the tidyr package alone:
install.packages("tidyr")
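Installing a package does not load it. A minimal sketch of the next step, assuming tidyr has already been installed by one of the two commands above:

```r
# Load tidyr into the current R session.
library(tidyr)

# Show which version of tidyr is installed.
print(packageVersion("tidyr"))
```

The library() call is needed once per R session before functions such as gather() can be used without the tidyr:: prefix.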
RStudio includes a cheatsheet for dplyr and tidyr. Follow the path below: