It is often said that the majority of the web consists of unstructured data: images, videos and text. Companies looking into big data science nowadays tend to focus mainly on their own structured data. This is no surprise, because analyzing unfamiliar, unstructured data is an order of magnitude harder than analyzing structured data, where domain experts are usually within reach to explain it. But leaving out unstructured data means leaving out tremendous opportunities, which are often underestimated because of unfamiliarity with the possibilities. Let me explain how analyzing unstructured data, textual data in particular, can help you by highlighting some of those possibilities.
It is often claimed that communication is roughly 70% posture, 20% intonation and only 10% actual content. Every organization has customers out there (in a B2B setting, a customer is simply a business) that it communicates with. Similarly, since we are all human, we also communicate with each other within an organization. The only digital residue, the data, that we get from this communication is the actual content, which by that estimate is about 10% of the entire message. Still, we can learn a great deal from just this 10%.
Our Emotional Customer
Obtaining sentiment from text, called sentiment analysis, already goes far beyond measuring positivity or negativity and dives deep into layers of emotion and affect. While we like to think that our communication is rational, it often isn’t: it is well understood that even news articles are subjective to varying degrees.
Learning the deeper layer of sentiment that exists around your organization, brand or product allows you to help your customers better. Ambassadors can be found by looking for those who express trust and anticipation of their own accord. Churn risk can even be detected by looking at emotions such as anger and negative surprise.
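To make the idea of looking for trust, anticipation and anger concrete, here is a minimal sketch of a lexicon-based emotion counter. It is a toy illustration, not the method used by any particular product: the word lists, the `emotion_counts` and `flag_customer` names and the decision rule are all assumptions made up for this example, and negative surprise is left out because it needs polarity information that a plain word list does not carry.

```python
import re
from collections import Counter

# Hypothetical, hand-made lexicon for illustration only. A real system
# would rely on a far larger resource or a trained model.
EMOTION_LEXICON = {
    "trust": {"trust", "reliable", "dependable", "recommend"},
    "anticipation": {"excited", "eager", "await", "upcoming"},
    "anger": {"angry", "furious", "unacceptable", "outraged"},
}


def emotion_counts(text: str) -> Counter:
    """Count how many tokens in the text belong to each emotion word list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for emotion, words in EMOTION_LEXICON.items():
        counts[emotion] = sum(1 for token in tokens if token in words)
    return counts


def flag_customer(text: str) -> str:
    """Apply a deliberately simple rule: anger outweighs positive emotions."""
    counts = emotion_counts(text)
    positive = counts["trust"] + counts["anticipation"]
    if counts["anger"] > positive:
        return "churn risk"
    if positive > 0:
        return "potential ambassador"
    return "neutral"


if __name__ == "__main__":
    print(flag_customer("I trust this brand and I am excited about the upcoming release"))
    print(flag_customer("This is unacceptable, I am furious"))
```

A word list like this ignores negation, sarcasm and context, which is exactly why serious emotion analysis needs more than keyword matching. The sketch only shows the shape of the output: a per-emotion signal that can be turned into a flag per customer.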
In our increasingly digital era, understanding your customers’ needs is no longer just helpful: customers will refuse to accept it if you don’t. The power in markets is shifting from vendors and providers to consumers, the masses, and they don’t accept not being understood.
Still Doing NPS Wrong?
NPS is an abbreviation for the Net Promoter Score, which captures the likelihood that an existing customer or user of your product or service would recommend you to a peer. All forms of criticism aside, the NPS is a widely adopted metric for measuring satisfaction.
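For readers who have not seen the calculation, here is a minimal sketch of the standard NPS arithmetic. It assumes the usual 0 to 10 recommendation scale, where ratings of 9 or 10 count as promoters and ratings of 0 to 6 count as detractors. The function name `net_promoter_score` is chosen for this example only.

```python
def net_promoter_score(ratings):
    """Return the NPS: percentage of promoters minus percentage of detractors.

    Assumes each rating is an integer on the usual 0-10 scale.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 0 or r > 10 for r in ratings):
        raise ValueError("ratings must be between 0 and 10")

    promoters = sum(1 for r in ratings if r >= 9)   # ratings of 9 or 10
    detractors = sum(1 for r in ratings if r <= 6)  # ratings of 0 through 6
    return 100.0 * (promoters - detractors) / len(ratings)


if __name__ == "__main__":
    print(net_promoter_score([10, 9, 8, 7, 6, 3]))
```

Note that the ratings of 7 and 8 (the passives) only affect the score through the total count, which illustrates how much of a respondent's actual opinion a single number discards.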
The NPS is usually measured as part of a larger survey consisting of many questions with a limited set of possible answers. Surveys are good for getting answers to questions with a finite number of options but are flawed in almost all other scenarios. After all, this form of questioning forces a respondent to form an opinion, and one that can only consist of the answers provided. It doesn’t take an expert to realize that this leads to misperceptions about respondents’ actual opinions.
Instead of feeding your customers tedious surveys that limit their responses, why not ask just one question: ‘What do you think of us?’ And why not allow a customer to answer at any point in time instead of once a year? Simply provide a free-form text field and let your customers decide for themselves what their opinion is. Of course, this is just an example question, but you get the picture.
Using text analysis methods, all sorts of information can be derived from the answer to this one question. Don’t be fooled: even if the answer is very short (think one or two sentences), we can still extract information such as the sentiment, who the respondent is and which aspects are discussed, and, in the case of a lengthy monologue, a compact summary. Stop limiting your customers to what you think matters to them and let them decide for themselves.
Stylometry – A Textual Fingerprint
Knowing the sentiment of your customers gives you insight into what people think, but it doesn’t tell you who those people are. Of course, there are obvious demographics such as gender, age and location that we often know about our own customers, though not about an anonymous web user who still communicates with us. But what about more extensive and latent traits? What does your customer really want, and what do they really need?
From a psychological point of view, people value different aspects of services or products based on their own personality and preferences. Many models try to explain, based on who people are, why one person goes for the cheapest pick whereas another always sticks to a single brand. While it may seem far-fetched, it is actually possible to learn such traits from what a person writes alone.
The field of stylometry assumes that what and how we write as individuals defines us, maybe even uniquely, like a fingerprint. From a single piece of written text, we can learn about the person behind it: their education level, and whether they are proactive or reserved. Demographics such as gender, age and location can be estimated as well. More importantly, we can also create a psychological model that indicates what this person intrinsically values and what their current state of mind is, all from a simple piece of text.
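As an illustration of what a textual fingerprint can be built from, here is a minimal sketch that extracts a few features commonly used in stylometry: average sentence length, average word length, vocabulary richness (type-token ratio) and the relative frequency of some function words. The feature set, the short `FUNCTION_WORDS` list and the `stylometric_features` name are assumptions for this example, and it is not a description of UnderstandLing's methods. In practice such features would be fed into a trained classifier.

```python
import re

# A tiny, hypothetical selection of English function words; real
# stylometric studies typically use dozens or hundreds.
FUNCTION_WORDS = ("the", "of", "and", "to", "i", "but", "not")


def stylometric_features(text: str) -> dict:
    """Compute a small, illustrative feature vector describing writing style."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        raise ValueError("text contains no words")

    features = {
        "avg_sentence_length": len(words) / len(sentences),
        "avg_word_length": sum(len(w) for w in words) / len(words),
        "type_token_ratio": len(set(words)) / len(words),
    }
    # Relative frequency of each function word in the text.
    for word in FUNCTION_WORDS:
        features[f"freq_{word}"] = words.count(word) / len(words)
    return features


if __name__ == "__main__":
    for name, value in stylometric_features("The cat sat. The dog ran to the cat!").items():
        print(f"{name}: {value:.3f}")
```

Function words are a popular choice because authors use them largely unconsciously, so their frequencies tend to reflect habit rather than topic.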
English Is Not the Only Language
The text analysis applications we talked about are just some examples of the potential of analyzing text. A great deal of research and work has been done in this area lately, but most of it focuses on English only. Even when other languages are considered, practical application is often tough because many methods rely on linguistic resources that are readily available for only a handful of languages.
We at UnderstandLing think that focusing mainly on non-English languages is where most ground can be gained. Our unique text analysis methods allow us to perform any text analysis task (think of the ones in this post) in any language, at a performance level equal to that of state-of-the-art methods focusing on English, without requiring the resources that are available for English to exist in the other languages. Think of languages such as Spanish, German, French, Dutch, Norwegian or Swedish, to name just a few, and of course English too.