Sentiment analysis, or emotion detection from text, is useful in many ways. A machine can really help you understand your customers, power a chatbot, or analyze feedback and reviews. But emotions are complex! Sometimes I really wonder whether they are a blessing or a curse.
My 5-year-old got an assignment from school to identify emotions from different pictures. There were 6 categories:
- Happy
- Sad
- Scared
- Frustrated
- Angry
- Surprise
Her first question was:
What is emotion?
Emotion, a complex experience of consciousness, bodily sensation, and behavior that reflects the personal significance of a thing, an event, or a state of affairs.
Encyclopedia Britannica
Now, that is a bit much for a 5-year-old child, right? It is difficult for children to tell different emotions apart. Anger, sadness, and frustration get mixed up all the time. Even pride, love, and excitement get tangled. She just knows that when she opens her favorite gift, answers all her math questions correctly, or gets a big hug, she feels HAPPY. And when a friend breaks her toy while playing, she drops her treat while eating, or she wakes up at night after a bad dream, she feels SAD. So, she has only 2 colors for her emotions. It is easier to just broadly divide our emotions into 2 categories:
- Positive Emotion – Happiness, Excitement, Surprise, Joy, Love, Pride, etc.
- Negative Emotion – Anger, Sadness, Frustration, Fear, Shame, etc.
In this project, we will also do the same. We will input a statement/text and will let our program identify its color.
Emotion Detection (Sentiment Analysis) from Text Input
In this project, I decided to practice Python inheritance. Inheritance allows us to define a class that inherits all the methods and properties of another class.
The parent class is the class being inherited from, also called the base class. The child class is the class that inherits from another class, also called the derived class.
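To make the parent/child idea concrete, here is a minimal sketch. The class and method names (`TextProcessor`, `ReviewProcessor`, `clean`, `word_count`) are made up for illustration and are not the ones used in this project.

```python
class TextProcessor:
    """Parent (base) class: holds behavior shared by every child."""

    def __init__(self, name):
        self.name = name

    def clean(self, text):
        # Shared utility: trim surrounding whitespace and lowercase.
        return text.strip().lower()


class ReviewProcessor(TextProcessor):
    """Child (derived) class: inherits clean() and adds its own method."""

    def word_count(self, text):
        # Reuses the inherited clean() before counting words.
        return len(self.clean(text).split())


processor = ReviewProcessor("reviews")
print(processor.clean("  What a WONDERFUL day!  "))       # what a wonderful day!
print(processor.word_count("  What a WONDERFUL day!  "))  # 4
print(isinstance(processor, TextProcessor))                # True
```

The child class never defines `__init__` or `clean`, yet both work on it, because they are inherited from the parent.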
Approach:
I will train a model on text data paired with a label (positive vs. negative). Then I will test the model on some unseen data to measure its accuracy. Finally, I will enter a random text to check the polarity of the statement. Because the model learns from labeled examples, this sentiment analysis or emotion detection model is an example of supervised learning.
Data:
I am using the Large Movie Review Dataset from Stanford to train the model. The link to the data is right here. This dataset is for binary sentiment classification. It contains 25,000 highly polar movie reviews for training and 25,000 for testing.
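Reading the reviews from disk could look roughly like the sketch below. It assumes the archive has been extracted so that each split folder (for example `aclImdb/train`) contains a `pos` and a `neg` subfolder with one review per `.txt` file. The function name `load_reviews` is hypothetical and is not taken from my project code.

```python
from pathlib import Path


def load_reviews(split_dir):
    """Return (texts, labels) for one split; label 1 = positive, 0 = negative.

    Assumes split_dir contains 'pos' and 'neg' subfolders of .txt files.
    Any other subfolder is ignored.
    """
    texts, labels = [], []
    for folder, label in (("pos", 1), ("neg", 0)):
        # sorted() keeps the file order reproducible across runs.
        for path in sorted(Path(split_dir, folder).glob("*.txt")):
            texts.append(path.read_text(encoding="utf-8"))
            labels.append(label)
    return texts, labels
```

With that layout, calling `load_reviews("aclImdb/train")` and `load_reviews("aclImdb/test")` would give the training and test sets as parallel lists of texts and labels.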
So, let’s analyze some sentiments from some text!
Now, before diving into Collecting Data and Building a Classification Model, I would like to try another approach.
Emotion Detection from an Input Text using TextBlob
I previously posted another blog on sentiment analysis of Twitter data using TextBlob, where I was really impressed by the performance of TextBlob. Let’s take a look at how it works.
from textblob import TextBlob
from textblob.sentiments import NaiveBayesAnalyzer

def textblob_Sentiment(text):
    print("Starting TextBlob Sentiment Analysis")
    # With NaiveBayesAnalyzer, .sentiment holds (classification, p_pos, p_neg)
    score = TextBlob(text, analyzer=NaiveBayesAnalyzer()).sentiment
    return score
Now, let’s take a text input and check its polarity.
input_sen = str(input("Enter Text: "))
print(" ")
score = textblob_Sentiment(input_sen)
print("Score: ", score)
Test Cases:
Example Text1: “The Book was so boring that I slept after reading the first 2 pages and never opened it again.”
Expected Polarity: Negative
Actual Result: Negative
Example Text2: “What a wonderful day!”
Expected Polarity: Positive
Actual Result: Positive
Not bad, right? We do not need to load data or spend time on data cleaning or modeling. The results are instantaneous. I actually love it. Now, how about a longer text input?
Example Text3: “A Christmas Together actually came before my time, but I’ve been raised on John Denver and the songs from this special were always my family’s Christmas music. For years we had a crackling cassette made from a record that meant it was Christmas. A few years ago, I was finally able to track down a video of it on Ebay, so after listening to all the music for some 21 years, I got to see John and the Muppets in action for myself. If you ever get the chance, it’s a lot of fun–great music, heart-warming and cheesy. It’s also interesting to see the 70’s versions of the Muppets and compare them to their newer versions today. I believe Denver actually took some heat for doing a show like this–I guess normally performers don’t compromise their images by doing sing-a-longs with the Muppets, but I’m glad he did. Even if you can’t track down the video, the soundtrack is worth it too. It has some Muppified traditional favorites, but also some original Denver tunes as well.”
Expected Polarity: Positive
Actual Result: Positive
Hmmm, impressive! Finally, let’s start building a Model and predict the sentiment polarity from an Input Text.
Emotion Detection from an Input Text using the Logistic Regression Model
Here, I will divide the whole task into 2 classes to practice inheritance.
Base Class: This class holds the common utilities:
- Load Training Data
- Load Test Data
- Data Preprocessing
- Data Tokenization
Child Class: The child class inherits from the base class and has the following functions:
- Build a Model
- Check accuracy of the Model
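The base/child split from the two lists above can be sketched as follows. This is a standard-library stand-in that trains a tiny logistic regression by gradient steps on word-presence features. It is not my project code, which this sketch does not try to reproduce. All names (`SentimentBase`, `LogisticSentiment`, `build_model`, and so on) are hypothetical, the training sentences are toy examples rather than the movie reviews, and data loading is left out to keep it short.

```python
import math
import re


class SentimentBase:
    """Base class: common text utilities shared by any model."""

    def preprocess(self, text):
        # Lowercase, drop HTML-like tags, keep only letters and apostrophes.
        text = re.sub(r"<[^>]+>", " ", text.lower())
        return re.sub(r"[^a-z']+", " ", text).strip()

    def tokenize(self, text):
        return self.preprocess(text).split()


class LogisticSentiment(SentimentBase):
    """Child class: inherits the utilities, adds the model and its accuracy."""

    def __init__(self, epochs=50, learning_rate=0.1):
        self.epochs = epochs
        self.learning_rate = learning_rate
        self.weights = {}  # one weight per word seen in training
        self.bias = 0.0

    def _probability(self, tokens):
        # Sigmoid of bias plus the weights of the distinct words present.
        z = self.bias + sum(self.weights.get(t, 0.0) for t in set(tokens))
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-z))

    def build_model(self, texts, labels):
        data = [(self.tokenize(t), y) for t, y in zip(texts, labels)]
        for _ in range(self.epochs):
            for tokens, y in data:
                # Gradient step on the log-loss for one example.
                step = self.learning_rate * (y - self._probability(tokens))
                self.bias += step
                for t in set(tokens):
                    self.weights[t] = self.weights.get(t, 0.0) + step
        return self

    def predict(self, text):
        return 1 if self._probability(self.tokenize(text)) >= 0.5 else 0

    def accuracy(self, texts, labels):
        hits = sum(self.predict(t) == y for t, y in zip(texts, labels))
        return hits / len(labels)


# Toy data for illustration only (1 = positive, 0 = negative).
train_texts = [
    "What a wonderful day",
    "I loved this great movie",
    "The book was boring",
    "A terrible, awful film",
]
train_labels = [1, 1, 0, 0]

model = LogisticSentiment().build_model(train_texts, train_labels)
print(model.predict("a wonderful movie"))
print(model.accuracy(train_texts, train_labels))
```

In practice a library implementation of logistic regression with a proper vectorizer would be the usual choice. The point here is only how the child class reuses `tokenize` from the base class.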
The accuracy of the Logistic Regression model is 88%, which is really high. Now, let’s run some test cases.
So, all the emotions are identified correctly, as expected. I even tried a pinch of sarcasm and the result was accurate.
Although I chose the Logistic Regression Model for this classification task, I tried and tested some other Models. That code is here.
Thus, a simple sentiment analysis or emotion detection tool is pretty straightforward and accurate. But human emotions and speech are not that simple. How about a restaurant review like the one below:
“The food was great but the restaurant was smelly.”
Is that a positive review or a negative one? Maybe it’s a neutral one? The model above identified it as positive.
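One way to see why a binary model has to pick a side: a linear model such as logistic regression adds up one weight per word and turns the sum into a single probability. The sketch below uses two made-up weights purely for illustration. They are not taken from my trained model, and `positive_probability` is a hypothetical helper.

```python
import math

# Hypothetical word weights, invented for illustration only.
# They are NOT values from the trained model in this post.
weights = {"great": 2.0, "smelly": -1.2}


def positive_probability(text, weights, bias=0.0):
    # Sum the weights of the known words, then squash with a sigmoid.
    tokens = text.lower().replace(".", " ").split()
    z = bias + sum(weights.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-z))


review = "The food was great but the restaurant was smelly."
p = positive_probability(review, weights)
print(round(p, 2))  # 0.69 with these made-up weights
```

With these numbers the positive word simply outweighs the negative one, so the whole sentence gets one positive label, even though a human reader sees two opinions in it.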
Where can I try sentiment analysis for free?
I would like to build a tool/app where I can input a text and have the sentiment behind the statement identified correctly. But for now, you can get started with the Stanford NLP or Natural Language Toolkit (NLTK) open-source distributions.
I hope you enjoyed reading this story. And if you did, feel free to clap for it.
Thank You!
Amirtha
Hi,
Great work! I am a research scholar and I have a doubt about emotion labelling.
Was the dataset already labelled as positive and negative? Was it manually annotated? If we are working on a huge volume of data, is it possible to annotate each item individually? What is the solution or code for automatically annotating labels for real-world data?