Problem: “NAs introduced by coercion” Error
Solution:
I started building a Random Forest model on the well-known Titanic dataset and got stuck. I got the below error:
What’s wrong?
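Well, the “NAs introduced by coercion” message (strictly speaking, a warning in R) appears when R tries to convert values to numbers and cannot. Here it occurs because some features in the dataset have class ‘character’ and we are trying to build a model using those features.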
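To see where the message comes from, here is a minimal sketch in base R. The values in `sex` are made up to resemble the “Sex” column and are not taken from the real data; the variable names are only for illustration.

```r
# Hypothetical values resembling the "Sex" column (not the real data)
sex <- c("male", "female", "female")

# as.numeric() cannot parse these strings, so every element becomes NA.
# suppressWarnings() hides the warning here so we can inspect the result.
sex_num <- suppressWarnings(as.numeric(sex))
print(sex_num)

# Catch the warning instead, to look at its text
msg <- tryCatch(as.numeric(sex), warning = function(w) conditionMessage(w))
print(msg)
```

Any modelling function that needs numeric input and receives a character column is likely to hit the same conversion.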
So first, let’s take a look at the current dataset with glimpse() or str(). str() is available in base R; to use glimpse(), we need to load the “dplyr” library.
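As a rough sketch of that inspection step, the snippet below builds a tiny stand-in data frame (the name `titanic` and its four rows are assumptions for illustration, using column names from the Titanic training data) and lists which columns are character.

```r
# Tiny hypothetical sample; the real dataset has many more rows and columns
titanic <- data.frame(
  Survived = c(0L, 1L, 1L, 0L),
  Pclass   = c(3L, 1L, 3L, 2L),
  Sex      = c("male", "female", "female", "male"),
  Age      = c(22, 38, 26, 35),
  Embarked = c("S", "C", "S", "Q"),
  stringsAsFactors = FALSE
)

# Base R overview of column types
str(titanic)
# With dplyr installed, dplyr::glimpse(titanic) gives a similar overview.

# Pick out the columns whose class is "character"
col_classes <- sapply(titanic, class)
char_cols <- names(col_classes)[col_classes == "character"]
print(char_cols)
```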
As we can see, the variables “Sex” and “Embarked” are of class “character”. So, let’s convert these two variables into factors.
Internally, R stores a factor as integer codes plus a set of level labels, whereas a character column is plain text. So, for this example, we will convert the below variables into factors:
1. Survived
2. Pclass
3. Sex
4. Embarked
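One way to do this conversion in base R is sketched below. It again uses a small made-up data frame called `titanic` (an assumption for illustration), so substitute your own data frame name.

```r
# Tiny hypothetical sample with the four columns to convert
titanic <- data.frame(
  Survived = c(0L, 1L, 1L, 0L),
  Pclass   = c(3L, 1L, 3L, 2L),
  Sex      = c("male", "female", "female", "male"),
  Embarked = c("S", "C", "S", "Q"),
  stringsAsFactors = FALSE
)

# Convert the listed columns to factors in one step
factor_cols <- c("Survived", "Pclass", "Sex", "Embarked")
titanic[factor_cols] <- lapply(titanic[factor_cols], factor)
str(titanic)

# A factor is stored as integer codes that index into its levels
print(levels(titanic$Sex))
print(as.integer(titanic$Sex))
```

Converting “Survived” to a factor also signals to modelling functions that the target is a category rather than a number.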
After converting these variables to factors, the issue was resolved and I was able to build my very first Random Forest model for the Titanic dataset. The post is here.
Keep coding and keep learning!