Replace a Substring with another Substring in R

Problem:

Find an occurrence of a substring in an R string by pattern matching and replace it with another substring.

Solution:

The Replacement Functions in R are as below:

  1. sub(): Replaces the first occurrence of a pattern
  2. gsub(): Replaces all occurrences of a pattern

The syntax for the sub() and the gsub() Functions:

******************************************************************************

sub(pattern, replacement, x, ignore.case = FALSE)

gsub(pattern, replacement, x, ignore.case = FALSE)

******************************************************************************

pattern – A regular expression to search for, i.e. the old substring that needs to be replaced.
replacement – A new substring to replace the occurrence (or occurrences, for gsub()) of the pattern.
x – A character vector where the matches are sought.
ignore.case – When TRUE, case is ignored during matching. The default is FALSE.

A ‘regular expression’ is a pattern that describes a set of strings. It is useful for cleaning data in R.
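As a minimal sketch of how the arguments fit together, the snippet below applies both functions to a small character vector. The vector and the replacement strings are made up for illustration and do not come from the original post.

```r
# Hypothetical input: a small character vector of order labels
x <- c("Order 12", "order 7 and 8")

# sub() replaces only the first match in each element;
# ignore.case = TRUE lets "order" match "Order" as well
first_only <- sub("order", "Item", x, ignore.case = TRUE)

# gsub() replaces every match; [0-9]+ is a regular expression
# that describes a run of one or more digits
all_digits <- gsub("[0-9]+", "#", x)

print(first_only)  # "Item 12" "Item 7 and 8"
print(all_digits)  # "Order #" "order # and #"
```

Both functions are vectorized over x, so each element of the vector is processed independently.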

Syntax Of Regular Expressions:

The position of pattern within the string:

  • ^: matches the start of the string.
  • $: matches the end of the string.
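  • \b: matches the empty string at either edge of a word. Don’t confuse it with ^ and $, which mark the edges of the string. In an R string literal the backslash is doubled: "\\b".
  • \B: matches the empty string provided it is not at an edge of a word (written "\\B" in an R string literal).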
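The position markers above can be tried on a short string. This is only a sketch: the input string and variable names are hypothetical, chosen so that "cat" appears at the start, inside a word, and at the end.

```r
# Hypothetical input: "cat" at the start, inside "concat", and at the end
x <- "cat concat cat"

# ^ anchors the match to the start of the string
at_start <- sub("^cat", "dog", x)

# $ anchors the match to the end of the string
at_end <- sub("cat$", "dog", x)

# \\b on both sides matches "cat" only as a whole word
whole_word <- gsub("\\bcat\\b", "dog", x)

# \\B requires that the match does not start at a word edge,
# so only the "cat" inside "concat" qualifies
inside_word <- gsub("\\Bcat", "dog", x)

print(at_start)     # "dog concat cat"
print(at_end)       # "cat concat dog"
print(whole_word)   # "dog concat dog"
print(inside_word)  # "cat condog cat"
```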

The quantifiers specify how many times the preceding pattern is repeated:

  • *: matches at least 0 times.
  • +: matches at least 1 time.
  • ?: matches at most 1 time.
  • {n}: matches exactly n times.
  • {n,}: matches at least n times.
  • {n,m}: matches between n and m times.
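To see how the quantifiers differ, the sketch below applies each one to the letter "o" following an "l". The input string is a made-up example. Each quantifier applies to the item immediately before it and matches as many repetitions as it is allowed to.

```r
# Hypothetical input: "l" followed by zero, one, two and three "o"s
x <- "l lo loo looo"

zero_or_more <- gsub("lo*", "X", x)      # "l" then any number of "o"
one_or_more  <- gsub("lo+", "X", x)      # "l" then at least one "o"
at_most_one  <- gsub("lo?", "X", x)      # "l" then zero or one "o"
exactly_two  <- gsub("lo{2}", "X", x)    # "l" then exactly two "o"
two_or_more  <- gsub("lo{2,}", "X", x)   # "l" then at least two "o"
one_to_two   <- gsub("lo{1,2}", "X", x)  # "l" then one or two "o"

print(zero_or_more)  # "X X X X"
print(one_or_more)   # "l X X X"
print(at_most_one)   # "X X Xo Xoo"
print(exactly_two)   # "l lo X Xo"
print(two_or_more)   # "l lo X X"
print(one_to_two)    # "l X X Xo"
```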
Examples:

sub() Output

The sub() function replaced only the first occurrence of the term “Datalogy”.

The gsub() function replaced both occurrences of the term “Datalogy”, ignoring case.

gsub() Output
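The behaviour described in this section can be reproduced with a sketch like the one below. The input sentence and the replacement term "DataScience" are assumptions made for illustration, not the original example data. The only detail taken from the text is that “Datalogy” occurs twice, once in a different case.

```r
# Hypothetical input: "Datalogy" appears twice, the second time in lower case
x <- "Datalogy is a blog on R. Visit datalogy to learn more."

# sub() replaces only the first occurrence
sub_result <- sub("Datalogy", "DataScience", x)

# gsub() with ignore.case = TRUE replaces both occurrences,
# regardless of case
gsub_result <- gsub("Datalogy", "DataScience", x, ignore.case = TRUE)

print(sub_result)   # "DataScience is a blog on R. Visit datalogy to learn more."
print(gsub_result)  # "DataScience is a blog on R. Visit DataScience to learn more."
```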