chawong
New Member

Cluster algorithm

For the "Automatically find clusters" capability in Power BI Desktop, what type of algorithm or logic is used to arrive at the clusters? I'm trying to understand how the tool derives the clusters from the parameters I input. Thanks.

 

1 ACCEPTED SOLUTION
v-sihou-msft
Microsoft Employee

@chawong

 

For details about the Microsoft Clustering Algorithm, please refer to the article below:

 

https://docs.microsoft.com/en-us/sql/analysis-services/data-mining/microsoft-clustering-algorithm

 

Regards,


6 REPLIES 6
@v-sihou-msft - From the technical reference linked in the article you provided:

The Microsoft Clustering algorithm provides two methods for creating clusters and assigning data points to the clusters. The first, the K-means algorithm, is a hard clustering method. This means that a data point can belong to only one cluster, and that a single probability is calculated for the membership of each data point in that cluster. The second method, the Expectation Maximization (EM) method, is a soft clustering method. This means that a data point always belongs to multiple clusters, and that a probability is calculated for each combination of data point and cluster.

You can choose which algorithm to use by setting the CLUSTERING_METHOD parameter. The default method for clustering is scalable EM.
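
To make the hard versus soft distinction in the quoted text concrete, here is a minimal Python sketch. It is not how Power BI or Analysis Services implements clustering; the function names, the 1-D data, the fixed centers, and the shared standard deviation `sigma` are all assumptions chosen for illustration.

```python
import math

def hard_assign(x, centers):
    """K-means style: return the index of the single nearest center."""
    return min(range(len(centers)), key=lambda k: (x - centers[k]) ** 2)

def soft_assign(x, centers, sigma=1.0):
    """EM style: return one membership probability per cluster.

    Assumes equally weighted Gaussian clusters that share one standard
    deviation (an illustrative simplification, not the documented model).
    """
    scaled = [((x - c) ** 2) / (2 * sigma ** 2) for c in centers]
    smallest = min(scaled)  # shift exponents so the largest weight is exp(0) = 1
    weights = [math.exp(-(d - smallest)) for d in scaled]
    total = sum(weights)
    return [w / total for w in weights]

centers = [0.0, 10.0]  # hypothetical cluster centers
point = 4.0            # nearer the first center, but not by much

hard = hard_assign(point, centers)
soft = soft_assign(point, centers, sigma=5.0)

print("hard:", hard)                          # a single cluster index
print("soft:", [round(p, 3) for p in soft])   # one probability per cluster
```

The hard assignment commits the point to one cluster, while the soft assignment spreads it across both clusters with probabilities that sum to 1.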

 

https://docs.microsoft.com/en-us/sql/analysis-services/data-mining/microsoft-clustering-algorithm-te...

 

So, does Power BI use K-Means or EM? Sounds like it is likely EM if that is the default.




@Greg_Deckler did you ever get an answer to this? It's still not obvious which clustering method is used in 'Automatically Find Clusters' in Power BI, or how to actually change it.

 

Any ideas, @v-sihou-msft?

 

I'm not sure this question is solved yet. 

webportal
Impactful Individual

@B_Real I believe it is K-means clustering, because it doesn't work when you try to cluster using categorical variables.
K-means uses averages to determine cluster centroids, so only numerical values are accepted.

Hope this helps, I also couldn't find any specific documentation about this.
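
As a rough illustration of the reasoning above, the sketch below runs a plain k-means loop on 1-D numbers and then shows the averaging step failing on text values. It is a generic textbook version with made-up data and hypothetical names (`kmeans_1d`, `margins`), not a confirmed description of what Power BI does internally.

```python
def average(values):
    """Arithmetic mean; only defined for numeric values."""
    return sum(values) / len(values)

def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on 1-D numeric data with fixed starting centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            groups[nearest].append(v)
        # Update step: each center moves to the average of its members
        # (an empty group keeps its previous center).
        centers = [average(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

margins = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]  # made-up numeric values
centers, groups = kmeans_1d(margins, centers=[0.0, 5.0])
print(centers, groups)

# A categorical column has no arithmetic mean, so the update step cannot run.
try:
    average(["Retail", "Wholesale"])
    categorical_ok = True
except TypeError:
    categorical_ok = False
print("categorical values averaged:", categorical_ok)
```

Other methods (for example, ones that encode categories first) can handle categorical inputs, so the failure on text columns is consistent with K-means but does not prove it.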

Hi Guys,

I have customers clustered in Power BI based on margin. When I did the clustering, I had the date filter set to this month. Now, when I change the date filter to this year, it does not recluster. The clustering for this month grouped the data into 5 groups with a total of 461 customers. This year still shows 461, which I know is incorrect.

See the images below. Is there anything I can do to ensure it reclusters once the filter is changed to any other date?

[Images: this month.jpg, this year.jpg]

Greg_Deckler
Community Champion

My guess would be K-Means.


