Fascinating introduction to ggplot2 in R

A visual or graphical representation of a dataset can be intriguing. Before getting into any formal methods, it helps one to form an intuition about the important characteristics and statistical properties of a data set.

ggplot2 is the perfect go-to tool for this.

What fascinates me about ggplot2 is its versatility. We can make a simple plot or we can keep on adding layers, themes, scales, coordinates, and facets with a + and thus enhance the plot.

ggplot2 allows us to create graphs that represent both univariate and multivariate numerical and categorical data in a straightforward way. Grouping can be represented by color, symbol, size, and even transparency.
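As a rough illustration of grouping by color, symbol, size, and transparency, here is a minimal sketch. It assumes ggplot2 is already installed and uses the mtcars data set that ships with R, not the data used later in this post. The choice of columns is only for demonstration.

```r
library(ggplot2)

# mtcars ships with R; cyl and am are stored as numbers, so they are
# converted to factors to get discrete colors and shapes.
p_groups <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(
    aes(color = factor(cyl), shape = factor(am), size = hp),
    alpha = 0.6  # fixed transparency so overlapping points stay visible
  ) +
  labs(color = "Cylinders", shape = "Transmission (am)", size = "Horsepower")

print(p_groups)
```

Here transparency is set to a constant. Moving alpha inside aes() would map it to a variable instead, in the same way as color or size.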

Installation:

The easiest way to get ggplot2 is to install the whole “tidyverse”, a collection of packages that includes ggplot2.

install.packages("tidyverse")

Alternatively, to install the ggplot2 package alone, we can use the following:

install.packages("ggplot2")

And then to load it, we have to use the following:

library(ggplot2)

With ggplot2, we begin a plot with the function ggplot(). It creates a coordinate system, and we then add layers to it. For reference, you can also check the ggplot2 guidelines or the cheatsheets.
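To make the idea concrete, here is a small sketch of starting with ggplot() and adding components with +. It assumes the mpg data set bundled with ggplot2. The variables and the theme are arbitrary choices for illustration.

```r
library(ggplot2)

# Step 1: ggplot() alone sets up the data and the aesthetic mappings.
# Printing this object shows axes but no points, since no layer exists yet.
base_plot <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy))

# Step 2: each + adds one component on top of the base.
layered_plot <- base_plot +
  geom_point() +                                            # layer: one point per car
  facet_wrap(~ class) +                                     # facet: one panel per vehicle class
  scale_y_continuous(name = "Highway miles per gallon") +   # scale: y-axis title
  theme_minimal()                                           # theme: non-data appearance

print(layered_plot)
```

Because each component is added separately, base_plot can be reused with a different geom or theme without repeating the data and mappings.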

How to get Data?

To practice data visualization, we need a real data set.

Let’s work with the famous Titanic data set from Kaggle. The steps to import the data set into R are explained here.

Prerequisite:

Let’s load the data, clean it by removing the NA values, and add some basic features.

Now, the data is clean and ready to play with. Let’s have a look at the new data structure.
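The loading, cleaning, and structure check described above might look roughly like the sketch below. Several things here are assumptions rather than the original code: the file name train.csv, the helper clean_titanic(), and the FamilySize feature are all hypothetical. The small data frame is a made-up stand-in that only mimics a few of the Kaggle column names so that the sketch runs without the download.

```r
# Assumption: Kaggle's train.csv has been downloaded to the working directory.
# titanic_raw <- read.csv("train.csv", stringsAsFactors = FALSE)

# Toy stand-in so the sketch runs without the file. These rows are made up
# and only mimic a subset of the Kaggle column names.
titanic_raw <- data.frame(
  Survived = c(0, 1, 1, 0, 1),
  Pclass   = c(3, 1, 3, 1, 2),
  Sex      = c("male", "female", "female", "male", "female"),
  Age      = c(22, 38, NA, 54, 27),
  SibSp    = c(1, 1, 0, 0, 1),
  Parch    = c(0, 0, 0, 0, 2),
  Fare     = c(7.25, 71.28, 7.92, 51.86, 11.13),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop incomplete rows and add a few basic features.
clean_titanic <- function(df) {
  df <- na.omit(df)                          # remove rows containing any NA
  df$Survived <- factor(df$Survived, levels = c(0, 1),
                        labels = c("No", "Yes"))
  df$Pclass <- factor(df$Pclass)             # class as a category, not a number
  df$Sex <- factor(df$Sex)
  df$FamilySize <- df$SibSp + df$Parch + 1   # passenger plus relatives aboard
  df
}

titanic <- clean_titanic(titanic_raw)

# Inspect the structure of the cleaned data.
str(titanic)
```

Note that na.omit() only drops rows where R sees NA. With the read.csv() defaults, empty strings in text columns are not treated as NA unless na.strings is set.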

As a beginner, I have seen and used the plots below in day-to-day work. I will keep adding to this list as I learn about new ones.

  1. Scatter Plot
  2. Jitter Plot
  3. Histogram
  4. Bar Plot
  5. Box Plot
  6. Line Chart
  7. Area Chart
  8. Heat Map
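As a quick orientation, each plot type in the list above corresponds to a geom in ggplot2. The sketch below builds one minimal example of each. So that it runs on its own, it uses the built-in mtcars data and ggplot2's economics data instead of the Titanic data, and the variable choices are arbitrary.

```r
library(ggplot2)

# Heat maps need one row per cell, so the correlation matrix of a few
# mtcars columns is reshaped into long form with base R.
cors <- cor(mtcars[, c("mpg", "wt", "hp", "disp")])
cor_long <- as.data.frame(as.table(cors))  # columns: Var1, Var2, Freq

plots <- list(
  scatter   = ggplot(mtcars, aes(wt, mpg)) + geom_point(),
  jitter    = ggplot(mtcars, aes(factor(cyl), mpg)) + geom_jitter(width = 0.2),
  histogram = ggplot(mtcars, aes(mpg)) + geom_histogram(bins = 10),
  bar       = ggplot(mtcars, aes(factor(cyl))) + geom_bar(),
  box       = ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot(),
  line      = ggplot(economics, aes(date, unemploy)) + geom_line(),
  area      = ggplot(economics, aes(date, unemploy)) + geom_area(),
  heatmap   = ggplot(cor_long, aes(Var1, Var2, fill = Freq)) + geom_tile()
)

# Draw any one of them by name.
print(plots$scatter)
```

Keeping the plots in a named list makes it easy to draw them one at a time, for example with print(plots$heatmap).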

Thank You!
