Imagine you are launching a new design for a product. How do you know that the new design is better than the previous one? Or maybe you are trying to upgrade your application. What exactly should you change in it? What if your clients don’t like the changes? The obvious answer to these questions is to find out what your clients think about your product. People express their opinions all the time. Data is all around you: in online reviews, blogs, forums, and social media. This is where opinion mining comes onto the stage.
Opinion mining, or sentiment analysis, is a way to extract opinions from written language automatically, i.e. a process of teaching a machine to do it. It is a part of Natural Language Processing (read more about it in our previous article) and it analyses the subjectivity of a text, including the general attitude of a person towards a given topic. In this series of articles, we’ll try to bring sentiment analysis closer to you and tell you what you need to get the maximum benefit from it.
What is an opinion? And how do we mine it?
Textual information can be roughly classified into two main categories: facts and opinions. A fact is an objective statement about something, a description of the real state of things without any alteration by the person stating it. An opinion, in contrast, is usually a subjective expression of personal attitude, impression, or feeling toward some topic. A fact can be proven, and an opinion becomes more valuable if a fact supports it.
If you have a review, how can you find out what its author thinks without actually reading the text? Well, you can teach your computer to do it, that is, to classify text as positive or negative! In this sense, opinion mining is a text classification problem (we talked about it in our previous article), where each sentence can be classified by subjectivity (objective or subjective) and by polarity (positive, negative or neutral).
Sometimes you can treat it as a regression problem and assign each sentence a polarity value ranging from -1 (very negative) to 1 (very positive). You can apply sentiment analysis on three levels: document (sentiment of the entire document), sentence (sentiment of a single sentence) or sub-sentence (sentiment of a sub-expression within a sentence). Everything depends on what kind of training data you have and what you want to achieve.
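To make the difference between sentence-level and document-level scores concrete, here is a minimal sketch in Python. It does not train a regression model. As a stand-in, it averages hand-picked word weights, and the lexicon, its weights, and the function names are all assumptions made up for this illustration.

```python
import re

# Hypothetical toy lexicon: word -> polarity weight in [-1, 1].
LEXICON = {
    "great": 0.8, "love": 0.9, "good": 0.5,
    "slow": -0.4, "bad": -0.6, "terrible": -0.9,
}

def sentence_polarity(sentence):
    """Average the weights of known words; 0.0 if no word is known."""
    words = re.findall(r"[a-z']+", sentence.lower())
    weights = [LEXICON[w] for w in words if w in LEXICON]
    return sum(weights) / len(weights) if weights else 0.0

def document_polarity(document):
    """Score each sentence, then average the scores for the document."""
    # Naive sentence splitting on ., ! and ? (an assumption of this sketch).
    sentences = [s.strip() for s in re.split(r"[.!?]+", document) if s.strip()]
    scores = [sentence_polarity(s) for s in sentences]
    doc_score = sum(scores) / len(scores) if scores else 0.0
    return doc_score, list(zip(sentences, scores))

doc_score, per_sentence = document_polarity(
    "I love the screen. The battery is terrible!"
)
```

In this example one sentence is clearly positive and the other clearly negative, yet they cancel out at the document level. That is one reason the choice of level matters.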
Types of sentiment analysis
With sentiment analysis you can focus on different things in a text: polarity, emotions (happy, sad, etc.) or intentions (like interested vs. not interested). In this article, we’ll cover the most important ones.
Polarity sentiment analysis
Polarity analysis considers the number of positive and negative terms that appear in a given text. It’s excellent for quick, coarse segmentation and quite simple to implement, but not so good for advanced insights. Fine-grained sentiment analysis is the most popular subtype, with a total of five categories: very positive, positive, neutral, negative and very negative. These can also be mapped onto a 5-star rating: 5 = very positive, 1 = very negative.
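As a rough sketch of this idea in Python, the code below counts positive and negative terms and maps the result onto the five categories and a star rating. The word lists and the thresholds between categories are arbitrary assumptions chosen for illustration. A real system would tune them or learn them from data.

```python
import re

# Hypothetical word lists; real lexicons are far larger.
POSITIVE = {"good", "great", "love", "excellent", "fast"}
NEGATIVE = {"bad", "slow", "hate", "terrible", "broken"}

LABELS = ["very negative", "negative", "neutral", "positive", "very positive"]

def polarity_score(text):
    """Return (pos - neg) / (pos + neg), a value in [-1, 1]."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return (pos - neg) / total if total else 0.0

def to_stars(score):
    """Map a score in [-1, 1] to 1..5 stars (thresholds are assumptions)."""
    if score <= -0.6:
        return 1
    if score <= -0.2:
        return 2
    if score < 0.2:
        return 3
    if score < 0.6:
        return 4
    return 5

def fine_grained(text):
    """Return the category label and the matching star rating."""
    stars = to_stars(polarity_score(text))
    return LABELS[stars - 1], stars

print(fine_grained("Great phone, I love it"))
print(fine_grained("The screen is good but the battery is bad"))
```

The second call shows the limit of pure counting: one positive and one negative term cancel out to "neutral", although the reviewer clearly has opinions about two different things.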
Emotion detection
Emotion detection aims to identify emotions and feelings like happiness, frustration, anger or sadness. These systems use lexicons or complex machine learning algorithms. Lexicons, or simply lists of words, are much simpler to apply, but they struggle with the wide variety of ways in which words are used. As we all know, people can be quite creative and vocal when they express their emotions. Some words that typically express anger, like kill (e.g. “Your customer support is killing me!”), might also express happiness (e.g. “You are killing it!”).
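The lexicon approach and its weakness can be sketched in a few lines of Python. The emotion lexicon below is a made-up toy example, not a published resource, and the function name is hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy lexicon: word -> emotion.
EMOTION_LEXICON = {
    "happy": "happiness", "delighted": "happiness", "love": "happiness",
    "angry": "anger", "furious": "anger", "kill": "anger", "killing": "anger",
    "sad": "sadness", "disappointed": "sadness",
    "annoying": "frustration", "stuck": "frustration",
}

def detect_emotions(text):
    """Count the emotions of all lexicon words found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(EMOTION_LEXICON[w] for w in words if w in EMOTION_LEXICON)

print(detect_emotions("Your customer support is killing me!"))
print(detect_emotions("You are killing it!"))
```

Both calls report anger, because the lexicon sees only the word "killing" and not the context around it. The second sentence is actually praise, which is exactly the kind of case where a plain word list falls short.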
Topic-based sentiment analysis
Sometimes you want to know not only whether users like or dislike your product, but also which features they like or dislike. Topic-based sentiment analysis systems receive as input a set of texts (like product reviews or messages from social media) discussing a particular entity (e.g. a new model of a mobile phone). The systems attempt to detect the main (the most frequently discussed) features (e.g. battery, screen) and to estimate the average sentiment of the texts per feature (e.g. how positive or negative the opinions are on average for each aspect).
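A minimal Python sketch of this pipeline follows. It makes several simplifying assumptions: the candidate features are given up front as a fixed list instead of being discovered from the texts, sentiment comes from a tiny made-up word list, and only sentences that contain an opinion word count as a mention. All names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical candidate features and polarity weights.
ASPECTS = {"battery", "screen", "camera"}
POLARITY = {"great": 1.0, "good": 0.5, "poor": -0.5, "terrible": -1.0}

def aspect_sentiment(reviews, top_k=2):
    """Average sentence polarity per feature, for the top_k most discussed."""
    scores = defaultdict(list)
    for review in reviews:
        for sentence in re.split(r"[.!?]+", review):
            words = re.findall(r"[a-z']+", sentence.lower())
            weights = [POLARITY[w] for w in words if w in POLARITY]
            if not weights:
                continue  # no opinion word, so nothing to attribute
            sentence_score = sum(weights) / len(weights)
            # Credit the sentence score to every feature it mentions.
            for aspect in ASPECTS.intersection(words):
                scores[aspect].append(sentence_score)
    # Keep the features with the most opinionated mentions.
    top = sorted(scores, key=lambda a: len(scores[a]), reverse=True)[:top_k]
    return {a: sum(scores[a]) / len(scores[a]) for a in top}

reviews = [
    "The battery is great. The screen is poor.",
    "Battery life is good!",
    "The screen is terrible. The camera is good.",
]
print(aspect_sentiment(reviews))
```

For these three reviews the battery and the screen are each mentioned twice and the camera once, so with top_k=2 the result covers battery (average 0.75) and screen (average -0.75).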
Intent analysis
Intent analysis detects what people want to do with a text rather than what people say with that text. It can be any intention, such as the intention to sell, to purchase or to complain. In customer service, there is a need to understand the customer’s intention behind the text and which actions will follow. This analysis can benefit both the company and the consumers, e.g. by helping you understand customers’ feedback or improve the service you provide. Most of the time, the intended action can be inferred from the text, but sometimes it requires some contextual knowledge.
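One very simple way to approximate this is with keyword rules, sketched below in Python. The intent names follow the examples in the text, but the patterns are assumptions invented for this illustration. Production systems usually rely on trained classifiers instead of hand-written rules.

```python
import re

# Hypothetical keyword patterns, one per intent.
INTENT_PATTERNS = {
    "purchase": r"\b(buy|purchase|order|looking for)\b",
    "sell": r"\b(sell|selling|for sale)\b",
    "complain": r"\b(refund|complain|complaint|broken|not working)\b",
}

def detect_intents(text):
    """Return every intent whose pattern matches, or ["unknown"]."""
    lowered = text.lower()
    matched = [
        intent
        for intent, pattern in INTENT_PATTERNS.items()
        if re.search(pattern, lowered)
    ]
    return matched or ["unknown"]

print(detect_intents("Where can I buy the new model?"))
print(detect_intents("My phone is broken, I want a refund"))
print(detect_intents("It stopped after the last update"))
```

The last message is most likely a complaint, but none of the keywords appear in it, so the rules return "unknown". This is the case where the intent cannot be read off the words alone and some contextual knowledge is needed.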
That’s all for now. In our next article, we’ll go through the problems of opinion mining.