Solved: Text Analytics - Topic Detection

KhatiwadaR · ‎05-23-2024

Hello community members,

I just got a Microsoft Fabric subscription and am learning about the different approaches to text analysis. I liked the Sentiment Analysis part of Azure Cognitive Services in Power BI, but I am more interested in topic detection on textual data. I have seen Key Phrase Extraction and the Word Cloud visualization, but they are not enough for me. Is there a way to cluster text data/comments into a set number of groups or categories (let's say 5), so that comments within a group discuss a similar topic or theme and the groups differ from one another (just like cluster analysis on structured data)?

Any ideas are highly appreciated!

Thank you,

Ram

Anonymous · ‎05-23-2024

Hi @KhatiwadaR ,

For clustering text data into groups based on similar topics or themes, you are essentially looking at a form of unsupervised machine learning, usually called topic modeling or document clustering. Power BI’s Text Analytics does not currently support cluster analysis. However, I think you can achieve this by combining Azure Cognitive Services with other tools and services, such as Python.
For example:
1. Use the Key Phrase Extraction feature of Azure Cognitive Services to identify the main points in your text data.
   What is key phrase extraction in Azure AI Language? - Azure AI services | Microsoft Learn
2. Convert your text data into numerical vectors using techniques such as TF-IDF (Term Frequency-Inverse Document Frequency) or word embeddings. Azure Machine Learning or Python libraries such as scikit-learn can be used for this.
   Azure Machine Learning documentation | Microsoft Learn
3. With the data in numerical form, apply a clustering algorithm such as K-Means to group similar texts together. Set the number of clusters (K) to match your requirement (e.g., 5 groups). Azure Machine Learning supports various clustering algorithms.
4. Finally, visualize the clusters in a tool such as Power BI to interpret the common themes or topics within each group.

Best Regards,
Dino Tao
If this post helps, please consider accepting it as the solution so that other members can find it more quickly.
