dplyr is a data manipulation package for R and is described in detail here.
Problem:
Using dplyr's summarise() in R
Solution:
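summarise() reduces multiple values down to a single summary value. Applied to an ungrouped data frame, it returns one row with one column per summary you request.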
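As a minimal sketch of that idea, the snippet below collapses a small made-up data frame into a single summary row. The `scores` data frame and its column names are hypothetical and used only for illustration; they are not part of the flights data.

```r
library(dplyr)

# Hypothetical toy data, not part of nycflights13
scores <- data.frame(
  student = c("a", "b", "c", "d"),
  points  = c(10, 20, 30, 40)
)

# Four input rows collapse into one row, with one column per summary
score_summary <- summarise(
  scores,
  mean_points  = mean(points),
  total_points = sum(points),
  record_count = n()
)

print(score_summary)
```

The result has a single row holding the mean (25), the total (100) and the row count (4).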
Example:
For this example, we use the flights data set from the nycflights13 package, which covers flights that departed from NYC in 2013.
We have installed the dataset in our previous post and you can get the code here.
Figure: the flights dataset
Summary Functions in R:
| Function | Description |
|---|---|
| min(), max() | Minimum and maximum values |
| mean() | Mean value |
| median() | Median value |
| sum() | Sum of values |
| var(), sd() | Variance and standard deviation of a vector |
| first() | First value in a vector |
| last() | Last value in a vector |
| nth() | Nth value in a vector |
| n() | Number of rows in the current group |
| n_distinct() | Number of distinct values in a vector |
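To show how several of these functions behave inside a single summarise() call, here is a short sketch on made-up data. The `readings` data frame and the output column names are hypothetical, chosen only for this illustration.

```r
library(dplyr)

# Hypothetical toy data: five values, one of them repeated
readings <- data.frame(value = c(4, 8, 8, 15, 23))

reading_summary <- summarise(
  readings,
  smallest  = min(value),
  largest   = max(value),
  middle    = median(value),
  spread_sd = sd(value),
  first_val = first(value),       # first element in row order
  last_val  = last(value),        # last element in row order
  third_val = nth(value, 3),      # element at position 3
  n_rows    = n(),                # n() takes no arguments
  n_unique  = n_distinct(value)   # the repeated 8 is counted once
)

print(reading_summary)
```

Note that first(), last(), nth(), n() and n_distinct() come from dplyr, while min(), max(), mean(), median(), sum(), var() and sd() are ordinary base R functions.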
summarise() Code:
    library(nycflights13)
    library(dplyr)

    # Check the flights dataset
    View(flights)

    #-----------------------------------------------------------------
    # Summarise data to calculate the mean scheduled departure hour
    # (the hour column) and the total record count
    flight_mean_hour <- summarise(flights, mean_hr = mean(hour), record_count = n())
    View(flight_mean_hour)

    #-----------------------------------------------------------------
    # Summarise data to find the count of distinct origin airports
    flight_dist_origin <- summarise(flights, distinct_origin = n_distinct(origin))
    View(flight_dist_origin)
Figure: flight_mean_hour

Figure: flight_dist_origin
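One thing to keep in mind when applying these summary functions to other columns: functions such as mean() and sum() return NA if any input value is missing, unless you pass na.rm = TRUE. The sketch below uses a made-up `delays` data frame (a hypothetical stand-in, not the flights data) to show the difference. In flights, columns such as dep_delay contain missing values, so the same argument would be needed there.

```r
library(dplyr)

# Hypothetical toy data with one missing value
delays <- data.frame(delay_min = c(5, NA, 15, 10))

delay_summary <- summarise(
  delays,
  mean_with_na    = mean(delay_min),                # NA, because one value is missing
  mean_without_na = mean(delay_min, na.rm = TRUE),  # mean of the non-missing values
  missing_count   = sum(is.na(delay_min))           # how many values were missing
)

print(delay_summary)
```

Reporting the count of missing values next to the mean makes it clear how many rows were dropped from the calculation.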
Thank You!