Stratifyd offers a suite of data science models used to analyze different types of data. Below is a list of core models and their intent:
- NLU Model – gain an understanding of unknown topics in an unstructured data set
- Neural Sentiment Model – gain an understanding of the sentiment of an unstructured data set
- Taxonomy Model – categorize data based on a known understanding of topics that appear in an unstructured data set
- Geo Tag Model – map geographical data to a variety of geo categories (City, Country, Long/Lat, etc.)
- AutoLearn Custom Models – predict the result of an important business outcome (revenue, NPS, CSAT, churn, etc.)
Several models can be used on one data set, but not all of them were designed for the same type of analysis. This article discusses how to use an NLU model for short-term analyses, and how the findings from those analyses can build up a taxonomy for historical tracking.
An NLU (Natural Language Understanding) analysis uses an unsupervised learning method which produces results without any bias from the user. This analysis is most appropriate when the user does not have an understanding of what topics should appear from an unstructured data set.
Most of the time, the end user already knows which topics appear over a long period of time. For example, a banking CX analyst would expect to see topics like “credit card”, “interest rate”, etc. in their verbatims consistently. These topics will dominate the analysis and the dashboard visualizations. If the analyst shrinks the window of time, they can gain an understanding of new topics they do not expect. For example, one day’s worth of topics might expose problems with a log-in page for certain mobile device users. For this day we might see “android log-in”, “can’t login”, “android bug”, etc. displayed in the NLU topic visualizations. If we look over a longer window of time (month, quarter, year), the standard topics (“credit card”, “interest rate”, etc.) will displace the “log-in” topics in visualizations.
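The effect of the time window can be illustrated outside the product with a minimal Python sketch. It does not use any Stratifyd API. The sample records, the topic labels, and the `top_topics` helper are hypothetical, and simple frequency counting stands in for the topic extraction an NLU model performs.

```python
from collections import Counter
from datetime import date

# Hypothetical (date, topic) pairs standing in for NLU topic assignments.
# The two standard topics appear every day; the log-in topics appear on one day only.
records = (
    [(date(2023, 1, d), "credit card") for d in range(1, 29)] * 3
    + [(date(2023, 1, d), "interest rate") for d in range(1, 29)] * 2
    + [(date(2023, 1, 15), "android log-in")] * 5
    + [(date(2023, 1, 15), "can't login")] * 4
)

def top_topics(rows, start, end, n=2):
    """Return the n most frequent topics with start <= date <= end."""
    counts = Counter(topic for day, topic in rows if start <= day <= end)
    return [topic for topic, _ in counts.most_common(n)]

# Month-long window: the standard topics dominate.
month_view = top_topics(records, date(2023, 1, 1), date(2023, 1, 31))

# One-day window: the unexpected log-in topics rise to the top.
day_view = top_topics(records, date(2023, 1, 15), date(2023, 1, 15))
```

In this made-up data the month-long view returns only the standard banking topics, while the single-day view returns the two log-in topics, which is the same displacement the paragraph above describes.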
The images below demonstrate how NLU model results can change based on the size of the data. Each screenshot was created using public mobile app reviews for the Amazon Prime Video app.
Topic Wheel Comparison
One week vs One Year – Unique Value
In this dashboard the widgets on the left are made from a stream of just one week’s worth of data. The widgets on the right are made from one year’s worth of data. In the highlighted red boxes, you can clearly see unique topics appear in the weekly stream. Specifically, during one week in February an “Xray feature” was released and customers communicated positive feedback. Having a dedicated weekly stream makes this type of topic discovery easier.
One week vs One Year – Common Topics
In this dashboard the widgets on the left are made from a stream of just one week’s worth of data. The widgets on the right are made from one year’s worth of data. Notice that the highlighted blue boxes in the weekly visuals contain the same topics captured over the course of the year. The smaller stream allows the user to discover new topics faster while still keeping track of recurring topics. This is best demonstrated in the next set of visualizations.
One week vs One week – Delta of Topics
In this dashboard the widgets on the left are made from a stream of just one week’s worth of data. The widgets on the right are made from the following week’s worth of data. These streams allow the user to track how common topics are trending over a certain period of time. In this example it is easy to see that “chromecast support” was needed more in the second week than in the first week. Having a side-by-side display of NLU models over a smaller time period allows for rich insights and clear delta tracking. Pre-defined filters can be set up to make data analysis quicker, both for delta reporting and for discovering new topics that should be added to models that use pre-defined logic, like a taxonomy.
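The week-over-week comparison can be sketched in a few lines of Python. This is an illustration of the delta idea only, not how the dashboard computes it. The `topic_delta` helper and the topic counts are hypothetical.

```python
from collections import Counter

def topic_delta(week_one, week_two):
    """Return {topic: count in week_two minus count in week_one}."""
    first, second = Counter(week_one), Counter(week_two)
    return {t: second[t] - first[t] for t in sorted(set(first) | set(second))}

# Hypothetical topic mentions, one list entry per mention.
week_one = ["chromecast support"] * 4 + ["xray feature"] * 6
week_two = ["chromecast support"] * 11 + ["xray feature"] * 5 + ["buffering"] * 2

delta = topic_delta(week_one, week_two)

# Topics seen in the second week that never appeared in the first week
# are candidates for a new taxonomy label.
seen_in_week_one = set(week_one)
new_topics = [t for t in delta if t not in seen_in_week_one]
```

A positive value in `delta` means the topic grew in the second week, a negative value means it shrank, and `new_topics` lists topics that only exist in the second week.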
A Taxonomy analysis uses pre-defined logic from the user to produce tracking on known topics. This analysis is most appropriate when the user has a good understanding of what topics should appear from an unstructured data set. The model is designed to be iterated on as users discover more topics.
For longer-term analysis it is best to use the taxonomy method to see how the most important topics are trending. This approach allows the user to focus on labels that are meaningful to their business without any noise.
One method of long-term topic trend analysis is to look at taxonomy label volume over time. See the widgets below for an example of how to capture this information.
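As a rough illustration of label volume over time, the sketch below applies a keyword-based taxonomy to dated verbatims and counts labelled records per month. It is a simplified stand-in, not Stratifyd's taxonomy logic. The `taxonomy` dictionary, the helper names, and the sample rows are hypothetical, and plain substring matching is assumed for brevity (so a keyword like "apr" would also match "april").

```python
from collections import defaultdict

# Hypothetical taxonomy: label -> keywords that trigger it.
taxonomy = {
    "Cards": ["credit card", "debit card"],
    "Rates": ["interest rate", "apr"],
}

def label_record(text, rules):
    """Return every label whose keywords appear in the text (naive substring match)."""
    lowered = text.lower()
    return [label for label, keywords in rules.items()
            if any(k in lowered for k in keywords)]

def label_volume_by_month(rows, rules):
    """Count labelled records per (YYYY-MM, label) pair."""
    volume = defaultdict(int)
    for iso_date, text in rows:
        for label in label_record(text, rules):
            volume[(iso_date[:7], label)] += 1
    return dict(volume)

# Hypothetical (ISO date, verbatim) rows.
rows = [
    ("2023-01-03", "My credit card was declined"),
    ("2023-01-19", "The interest rate on my credit card went up"),
    ("2023-02-02", "Why did the APR change?"),
]
volume = label_volume_by_month(rows, taxonomy)
```

Plotting `volume` by month for each label gives the kind of trend line that a label-volume widget shows.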
To achieve the best taxonomy results, a user should leverage intra-day or intra-week tracking of new topics with an NLU model. Once new topics are discovered, they can be added to the taxonomy logic under a new label. After enough new labels are added to a taxonomy, the user can reprocess the analysis to display the new label results.
One method of using NLU results to improve taxonomy results is the Stratifyd drag-and-drop feature. See the example below, where the user is pulling a newly discovered topic from the NLU topic wheel and adding it to the taxonomy on the right.
1. Step One
Find topics that need to be added to a taxonomy.
2. Step Two
Drag the topic over to the desired taxonomy label.
3. Step Three
Place the topic into the desired label logic.
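The three steps above are performed in the Stratifyd interface. Conceptually, they amount to adding a discovered topic to a label's logic and then reprocessing, which the following sketch imitates. The data structures and the `add_topic` and `apply_taxonomy` helpers are hypothetical and are not part of any Stratifyd API. Keyword substring matching is assumed as the label logic.

```python
def add_topic(rules, label, topic):
    """Return a copy of the rules with the topic added under the label."""
    updated = {name: list(keywords) for name, keywords in rules.items()}
    keywords = updated.setdefault(label, [])
    if topic not in keywords:
        keywords.append(topic)
    return updated

def apply_taxonomy(texts, rules):
    """Map each text to the labels whose keywords it contains."""
    results = {}
    for text in texts:
        lowered = text.lower()
        results[text] = [label for label, keywords in rules.items()
                         if any(k in lowered for k in keywords)]
    return results

# Hypothetical starting taxonomy and verbatims.
taxonomy = {"Cards": ["credit card"], "Login Issues": ["password reset"]}
verbatims = ["Android log-in keeps failing", "Lost my credit card"]

before = apply_taxonomy(verbatims, taxonomy)

# Steps one to three: a topic found in the NLU results is placed under a label.
taxonomy_v2 = add_topic(taxonomy, "Login Issues", "android log-in")

# Reprocess so the updated label logic is reflected in the results.
after = apply_taxonomy(verbatims, taxonomy_v2)
```

Before the update the first verbatim receives no label. After the topic is added and the taxonomy is reapplied, it is tagged "Login Issues", while the unrelated verbatim keeps its original label.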
NOTE: Taxonomies are less computationally expensive than NLU models, but deploying a taxonomy on millions of records can take several hours. A taxonomy with more labels and levels will take longer to process. When reprocessing a taxonomy on a data stream with more than 1 million records, it is best to run it over a weekend so as not to impact any current work.