The Neural Topic model is an unsupervised natural language processing model used for analyzing unstructured textual data. It produces field and widget outcomes similar to those of Auto-Topic Predictive, but it uses a different methodology and can therefore produce different results. See below for a summary of when to use this model, its advantages, and best practices.
When to use this model
Like our Auto-Topic predictive model, the Neural Topic model (NTM) is best used when the user does not yet know what topics are present in the unstructured data. Run this model on data sets with a limited scope where fresh topics can emerge; hourly, daily, or weekly time frames work well. NTM produces outputs similar to Auto-Topic and supports topic and sentiment widget creation. The following NTM data outputs are available:
- Topic wheel
- Sentiment scores per verbatim
- Sentiment aggregations (sentiment average gauge)
The following additional outputs will eventually be available as well:
- Summary & Sentence Highlight = “which sentence(s) highlight or summarize the verbatim?”
- Expanded topic chunks = “which phrases (2-5 words) highlight the data set?”
- Summary of important docs = “which verbatims provide the most impactful feedback?”
Advantages of NTM
The NTM method uses upgraded neural network architectures that provide the following improvements:
- Higher quality topic discovery
- Rich topic outputs
- Semi-supervised customization available
Trade-offs:
- Slower processing than Auto-Topic
- Only available in English
Best Practices
Choosing the right minimum frequency can greatly affect your topic discovery results. The min_count setting in the advanced setup is the minimum frequency a word must reach to be included in the vocabulary used for topics.
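- Trade-off of diversity vs. quality: a lower min_count generally admits rarer words into the vocabulary, which allows more diverse topics but can add noise; a higher min_count keeps only more frequent words, which tends to give cleaner but fewer distinct topics.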
- Default min_count=5 is suitable for most datasets.
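To make the effect of min_count concrete, the following is a minimal Python sketch of a frequency threshold applied to a vocabulary. It is an illustration only, not the product's implementation: the function name build_vocabulary, the whitespace tokenization, the sample verbatims, and the choice to count total occurrences (rather than the number of verbatims containing a word) are all assumptions.

```python
from collections import Counter


def build_vocabulary(documents, min_count=5):
    """Hypothetical helper: keep words that occur at least `min_count` times.

    Assumption: frequency means total occurrences across all documents,
    and tokens are produced by lowercasing and splitting on whitespace.
    """
    counts = Counter(
        word
        for doc in documents
        for word in doc.lower().split()
    )
    return {word for word, n in counts.items() if n >= min_count}


# Made-up sample verbatims, for illustration only.
docs = [
    "battery life is great",
    "battery drains fast",
    "great screen and great battery",
    "screen cracked after a week",
]

# Compare vocabulary size at several thresholds.
for threshold in (1, 2, 3):
    vocab = build_vocabulary(docs, min_count=threshold)
    print(threshold, len(vocab), sorted(vocab))
```

On this toy data, raising the threshold from 1 to 3 shrinks the vocabulary from 12 words to 2 ("battery" and "great"). This is the diversity vs. quality trade-off in miniature: fewer candidate words for topics, but each one is better supported by the data.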