08-16-2023 20:12 PM - last edited 08-16-2023 20:48 PM
Description
With decision trees, you can visualize the probability of something you want to estimate, based on decision criteria from the historic data.
The decision tree classifier automatically finds the important decision criteria to consider.
Prerequisites (the sample .pbix files will not work unless these prerequisites are completed)
1. Install R Engine
Power BI Desktop does not include, deploy, or install the R engine. To run R scripts in Power BI Desktop, you must install R separately on your local computer. You can download and install R for free from many locations, including the Revolution Open download page and the CRAN Repository.
2. Install the required R packages.
Download the R script attached to this message and run it to install all required packages on your local machine.
Required R packages:
rpart, rpart.plot, RColorBrewer
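If the attached script is not at hand, the installation step can be approximated with a few lines of R. This is only a sketch, not the attached script itself: it assumes a CRAN mirror is configured, and the install call is left commented out so that nothing is downloaded by accident.

```r
# Packages listed above as required by the visual
required <- c("rpart", "rpart.plot", "RColorBrewer")

# Which of them are not yet installed in the current R library?
missing_pkgs <- setdiff(required, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All required packages are already installed.")
}
```

Run it in the same R installation that Power BI Desktop is configured to use, otherwise the packages may end up in a different library.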
Tested on:
CRAN R 3.3.1, MRO 3.3.0, powerbi.com
Legal Disclaimers:
Terms of Service and Third Party Programs.
I am working with the decision tree chart (v1.0.1.0) in the latest version of Power BI (2.119.986.0 64-bit / July 2023). For some reason, the text at the bottom indicating Rel error, CVal error, Root error, and cp does not appear (despite having Additional parameters > show info toggled on). Any suggestions on how to get this information to appear in the chart? Thank you!
After continued investigation into the problem, I discovered that the "issue" is actually a design choice.
The decision tree visual in Power BI was designed for classification trees, not regression trees—although the visual certainly works for both (because the underlying R code is based on rpart, which supports both). How do I know this? When you add the visual to a Power BI report, you are prompted to enable "script visuals." If instead you click the "Review" button, you can actually see the underlying R code.
Not too far down, you'll come across these comments:
# WARNINGS:
# This visual intended to be used for classification tasks. It was not tested for regression trees.
... And if you scroll down much further, you can find the code that prints out the error text at the bottom:
#info for classifier
if( showInfo && !is.null(dtree) && dtree$method == 'class')
  pbiInfo <- paste("Rel error = ", d2form(opt$relErr * rooNodeErr),
                   "; CVal error = ", d2form(opt$xerror * rooNodeErr),
                   "; Root error = ", d2form(rooNodeErr),
                   ";cp = ", d2form(opt$CP, 3), sep = "")
Notice the first line of the "if" statement: "... dtree$method == 'class'". Because of this conditional, the visual shows the error text at the bottom only if the tree is a classification tree, not a regression tree.
(By the way: you can also review the full text of the R code on GitHub: https://github.com/microsoft/PowerBI-visuals-decision-tree/blob/master/script.r)
If you need to see the error values, there are at least a few solutions to this problem. The first would be to change your tree from a regression tree to a classification tree by converting your target variable from a continuous variable to a categorical variable. Once you do this, the error values will show up at the bottom.
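As a rough illustration of that first option, here is a minimal sketch using the built-in mtcars data. The 20 mpg cut-off and the "low"/"high" labels are arbitrary assumptions made for the example; in Power BI you would do the equivalent conversion on your own target column before passing it to the visual.

```r
library(rpart)

cars <- mtcars

# Continuous target: rpart picks method "anova" (a regression tree)
fit_reg <- rpart(mpg ~ wt + hp + cyl, data = cars)

# Bin the target into categories: rpart now picks method "class"
cars$mpg_band <- cut(cars$mpg, breaks = c(-Inf, 20, Inf),
                     labels = c("low", "high"))
fit_class <- rpart(mpg_band ~ wt + hp + cyl, data = cars)

print(fit_reg$method)    # "anova"
print(fit_class$method)  # "class"
```

Since the visual's condition checks dtree$method == 'class', only the second kind of tree passes it.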
A second solution, of course, would be to build your decision tree directly in R, or create your own custom R visualization in Power BI. In both cases, however, the text at the bottom still will not appear by default; you would need to manually generate that text by referencing the error values (as is done in the published decision tree chart visual). If you are interested in going down this path, here are a few links that will help you:
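For the second route, a minimal sketch of building the error text yourself for a regression tree could look like the following. It is not the visual's own code: the choice of the cptable row with the lowest cross-validated error, the root error computed as deviance divided by the number of rows (the same quantity printcp reports), and the use of signif for rounding are all assumptions made for illustration.

```r
library(rpart)

set.seed(1)  # cross-validation in rpart is random
fit <- rpart(mpg ~ wt + hp + cyl, data = mtcars, method = "anova")

cpt  <- fit$cptable                      # columns: CP, nsplit, rel error, xerror, xstd
best <- which.min(cpt[, "xerror"])       # row with the lowest cross-validated error

# Root node error for a regression tree: deviance of the root / number of rows
root_err <- fit$frame$dev[1] / fit$frame$n[1]

info <- paste0(
  "Rel error = ",    signif(cpt[best, "rel error"] * root_err, 3),
  "; CVal error = ", signif(cpt[best, "xerror"] * root_err, 3),
  "; Root error = ", signif(root_err, 3),
  "; cp = ",         signif(cpt[best, "CP"], 3)
)

# Draw the tree and put the text underneath it
plot(fit, margin = 0.1)
text(fit, use.n = TRUE)
mtext(info, side = 1, line = 2)
```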
Hello,
I downloaded the Decision Tree Chart not long ago, and it was very useful for our purposes at PSU; however, now my boss cannot download it. I've done some looking around and can't seem to find it anywhere. Has it been discontinued? Thank you
I'm very interested in using this visual. However, I keep getting this error... any help would be appreciated. I don't have much R experience but did use SPSS for several years, so the opportunity to bring similar tools was exciting.
Feedback Type:
Frown (Error)
Timestamp:
2021-09-03T15:01:58.9672630Z
Local Time:
2021-09-03T11:01:58.9672630-04:00
Session ID:
0e75c987-d3a5-4196-94e5-3beee65e93a5
Release:
August 2021
Product Version:
2.96.1061.0 (21.08) (x64)
Error Message:
R script error.
Loading required package: rpart
Loading required package: rpart.plot
Warning message:
package 'rpart.plot' was built under R version 4.0.5
Loading required package: RColorBrewer
Error: package or namespace load failed for 'RColorBrewer':
package 'RColorBrewer' was installed before R 4.0.0: please re-install it
Warning message:
In libraryRequireInstall("RColorBrewer") :
*** The package: 'RColorBrewer' was not installed ***
Error: package 'RColorBrewer' was installed before R 4.0.0: please re-install it
Execution halted
Stack Trace:
Microsoft.PowerBI.ExploreServiceCommon.ScriptHandlerException: R script error.
Loading required package: rpart
Loading required package: rpart.plot
Warning message:
package 'rpart.plot' was built under R version 4.0.5
Loading required package: RColorBrewer
Error: package or namespace load failed for 'RColorBrewer':
package 'RColorBrewer' was installed before R 4.0.0: please re-install it
Warning message:
In libraryRequireInstall("RColorBrewer") :
*** The package: 'RColorBrewer' was not installed ***
Error: package 'RColorBrewer' was installed before R 4.0.0: please re-install it
Execution halted
---> Microsoft.PowerBI.Scripting.R.Exceptions.RScriptRuntimeException: R script error.
Loading required package: rpart
Loading required package: rpart.plot
Warning message:
package 'rpart.plot' was built under R version 4.0.5
Loading required package: RColorBrewer
Error: package or namespace load failed for 'RColorBrewer':
package 'RColorBrewer' was installed before R 4.0.0: please re-install it
Warning message:
In libraryRequireInstall("RColorBrewer") :
*** The package: 'RColorBrewer' was not installed ***
Error: package 'RColorBrewer' was installed before R 4.0.0: please re-install it
Execution halted
at Microsoft.PowerBI.Scripting.R.RScriptWrapper.RunScript(String originalScript, Int32 timeoutMs)
at Microsoft.PowerBI.Client.Windows.R.RScriptHandler.GenerateVisual(ScriptHandlerOptions options)
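The error text itself asks for RColorBrewer to be re-installed under the current R version. A minimal sketch of how one might list packages that were built under an older R is shown below. The 4.0.0 cut-off is taken from the message above; everything else is an assumption about a typical setup, and the install call is commented out so nothing is downloaded unintentionally.

```r
# All packages visible to this R installation, with the R version they were built under
ip    <- installed.packages()
built <- package_version(ip[, "Built"])

# Packages built before R 4.0.0 need to be re-installed after a major upgrade
stale <- unique(rownames(ip)[built < package_version("4.0.0")])
print(stale)

# install.packages(stale)            # re-install everything that is stale
# install.packages("RColorBrewer")   # or only the package named in the error
```

This has to be run in the R installation that Power BI Desktop points to, since a different installation may have its own package library.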
Hi,
I would like to know whether this decision tree uses the CHAID algorithm or CRT.
Hello Sharon,
Thank you for sharing this example. I have a question, I just tried the following short code for plotting a decision tree in Power BI:
-----------------------------
library(rpart.plot)
library(rpart)
set.seed(1)
fit <- rpart(Kyphosis~Age+Number+Start,method="class",data=kyphosis)
rpart.plot(fit)
-----------------------------
That worked very well and the plot appeared to be OK in PBI Desktop, but once I published the file on Power BI service the plot changed a lot (attached file).
Then I tried with your template and the plot is perfect on desktop and on PBI service. I noted some differences in the code like the function replaceFancyRpartPlot(). But I don't know exactly why the visual changed so much between desktop and the version published on the web.
Can you give me a hand with this?
Thanks in advance.
J.
Hi @realexander,
The differences between PBI Desktop and the service are due to different versions of the R engine (and packages).
Unfortunately, we still have R 3.2.2 in the service; it is about to be upgraded next month.
Hello Sharon,
I am having some trouble building a decision tree in Power BI. I am quite new here, so I think it should be something easy...
The error says : "The tree depth is zero"
Thanks in advance for your help,
Hi @Elizabeth24,
Sometimes the algorithm "decides" that adding branches to the root of the tree is not useful.
For example, if your data has 90% negatives and 10% positives, the accuracy of the root node is already 0.9, and adding branches may not improve it much.
You may try to disable cross validation.
You may try to make changes in data.
You may try to change some parameters, for example:
complexity = 1e-05     # change to 1e-10
minBucket = 2          # change to 1
minRows = 10           # change to 5
maxNumAttempts = 10    # change to 50
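To see why loosening such parameters can help, here is a small stand-alone sketch using rpart directly. The mapping of the visual's complexity, minBucket and minRows to rpart.control's cp, minbucket and minsplit is an assumption made for illustration, and the eight-row data frame is made up. Cross-validation is switched off (xval = 0) in both fits.

```r
library(rpart)

# Tiny, imbalanced, made-up data: 6 negatives and 2 positives
d <- data.frame(
  x = 1:8,
  y = factor(c(rep("no", 6), rep("yes", 2)))
)

# rpart defaults (minsplit = 20): the root has only 8 rows, so it is never split
strict <- rpart(y ~ x, data = d, method = "class",
                control = rpart.control(xval = 0))

# Loosened settings, in the spirit of the suggestions above
loose <- rpart(y ~ x, data = d, method = "class",
               control = rpart.control(cp = 1e-10, minsplit = 5,
                                       minbucket = 1, xval = 0))

print(nrow(strict$frame))  # number of nodes with defaults: root only
print(nrow(loose$frame))   # number of nodes after loosening: root plus leaves
```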
This visual is excellent.
Most of my code is from your sample PBIX file.
Here is my screenshot.
Hi ironryan77,
1) Format it via the parameters of the prp function; you may need to reload the "format" function.
2) Option 1: use characters instead of "numbers" for the 1st column. Option 2: specify it explicitly as rpart(..., method = "class", ...).
3) This visual is an image, so tooltips are not possible. (We support such functionality in HTML-based R-powered Custom Visuals.)
Dear Sharon
This is really excellent - my only question is whether there is a way to format (i) the colors, to make the output better fit in with the existing report, and (ii) the numbers in the nodes? I am working on a loan portfolio to estimate the likelihood of default - but the loan amounts are presented in ln (XeY) format - it would be really helpful to amend these.
Regardless, this is an excellent contribution!
Gratefully,
Tim
Hi TimKroemer ,
Sorry for the delay in replying:
(i) The colors.
Answer: look for "default.palettes" in the R script and change it as you like.
(ii) The formatting of numbers in nodes?
Answer: the prp {rpart.plot} function has plenty of parameters.
https://www.rdocumentation.org/packages/rpart.plot/versions/2.1.0/topics/prp
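As a rough sketch of both points outside the visual, the example below passes a custom palette and a digits setting to prp. The loan data is entirely made up, the "Blues" palette is just one choice, and the negative digits value relies on the documented behaviour of prp falling back to the standard format function; check the prp help page for the version you have installed before relying on it.

```r
library(rpart)

# Made-up loan data with large amounts that tend to print in exponent form
set.seed(1)
loans <- data.frame(
  amount = round(runif(200, 1e4, 5e6)),
  term   = sample(c(12, 24, 36, 60), 200, replace = TRUE)
)
loans$default <- factor(ifelse(loans$amount > 2e6, "yes", "no"))

fit <- rpart(default ~ amount + term, data = loans, method = "class")

# Plot only if the plotting packages are available
if (requireNamespace("rpart.plot", quietly = TRUE) &&
    requireNamespace("RColorBrewer", quietly = TRUE)) {
  pal <- RColorBrewer::brewer.pal(3, "Blues")
  rpart.plot::prp(
    fit,
    digits  = -6,                   # negative: use format() with 6 digits
    box.col = pal[fit$frame$yval],  # one colour per predicted class
    extra   = 2                     # show classification rate at each node
  )
}
```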
Hello, where can I find the default.palettes option you are talking about? There is no such thing in the visual's format pane. Thank you.