Description
Clustering finds groups of similar items in your data. For example, the clustering algorithm and visual can automatically find customer segments, which you can then target in your marketing campaigns.
Prerequisites (the sample .pbix files will not work unless these prerequisites are completed)
1. Install R Engine
Power BI Desktop does not include, deploy or install the R engine. To run R scripts in Power BI Desktop, you must separately install R on your local computer. You can download and install R for free from many locations, including the Revolution Open download page and the CRAN Repository.
2. Install the required R packages.
Download the R script attached to this message and run it to install all required packages on your local machine.
Required R packages:
cluster, car, scales, fpc, mclust, apcluster, vegan
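If the attached install script is not at hand, the same step can be approximated by hand. The following is only a minimal sketch, not the attached script: the helper names `missing_packages` and `install_missing` are hypothetical, and the CRAN mirror URL is an assumption you may want to change.

```r
# Packages listed as required for this visual
required <- c("cluster", "car", "scales", "fpc", "mclust", "apcluster", "vegan")

# Hypothetical helper: return the packages from `pkgs` that are not installed yet
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install only what is missing.
# The default mirror is an assumption; any CRAN mirror should work.
install_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install, repos = repos)
  }
  invisible(to_install)  # return the names that were requested for install
}
```

Calling `install_missing(required)` from the R or RStudio console installs whatever is absent; running `missing_packages(required)` afterwards shows any package that still failed to install.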
Tested on:
CRAN 3.3.1, MRO 3.3.0, powerbi.com
Legal Disclaimers:
Terms of Service and Third Party Programs.
I received an error when trying to download the 'vegan: Community Ecology Package' on my desktop. Is anyone else having this problem? I have reported the error.
Hi @Cking,
Try installing it from the RStudio console with
>install.packages("vegan")
Still failing?
You may comment out
#libraryRequireInstall("vegan")
Just don't use the "long" method for the number-of-clusters search.
Hi @boefraty,
Thanks for the reply. I used the RStudio console and downloaded all packages without further trouble.
Chris
Good day. I installed all the prerequisites on my machine, but the option is still not enabled. Am I missing something else?
As a Power BI author with only the most basic knowledge of R, I think the R Library is a great concept. However, actually making use of what's been posted on the site is extremely hard: so much code has been written into the examples that I have no idea which lines of code are actually driving the visual. When I pasted the code into a Word document, it was 18 pages in length, with 1851 words.
Is there any way to simplify what's in the library so that it's more comprehensible and usable?
Thanks,
Jeff
Hi,
The clustering code is complicated, mostly because it contains the implementation of the automatic mode and many parameters that make the visual flexible. In addition, we do a lot of data correctness testing...
The simplest code to start with is "corrplot". We recommend looking at the R code in a dedicated R IDE, like RStudio.
R is its own language, and it takes most of us some time to learn how to do the basics in it, not to mention the more advanced transformations and graphics. This code is very well commented with #Comments Here, and that should give you a sense of what each step is doing.
You might want to load the script into R and then watch the graphics pane as you parse the code line by line. The function plot(...) is where some of the work is done, and then the subsequent statements add to it according to the parameters defined at the beginning of the script.
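To make that pattern concrete, here is a stripped-down sketch of a clustering plot. It is not the code from the visual: it assumes plain k-means with a fixed k of 3, uses the built-in iris data as a stand-in for the `dataset` data frame that Power BI passes to an R script, and skips all the automatic mode and data checks.

```r
# Built-in sample data stands in for the Power BI `dataset` data frame (assumption)
dataset <- iris[, c("Petal.Length", "Petal.Width")]

set.seed(42)  # make the cluster assignment reproducible
fit <- kmeans(scale(dataset), centers = 3, nstart = 10)

# plot(...) draws the base scatter plot, coloured by cluster number
plot(dataset$Petal.Length, dataset$Petal.Width,
     col = fit$cluster, pch = 19,
     xlab = "Petal.Length", ylab = "Petal.Width",
     main = "k-means clusters (k = 3)")

# Subsequent statements add layers on top of the existing plot:
# cluster means in the original units, then a legend
centers <- aggregate(dataset, by = list(cluster = fit$cluster), FUN = mean)
points(centers$Petal.Length, centers$Petal.Width, pch = 4, cex = 2, lwd = 2)
legend("topleft", legend = paste("Cluster", centers$cluster),
       col = centers$cluster, pch = 19)
```

Running it line by line in RStudio shows the scatter plot appear at the `plot(...)` call, with the cluster centers and legend added by the statements after it.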