Help in calculating Bayes Probability
Greetings,
I need to calculate the probability of each event type, given the events that occurred each month and the global probability of each event type. Let's say we classify an event as safe, near miss, incident, or accident. Each of these has its own probability, stored in a 4x1 table called event probability by type.
Then, we have a log of events that took place on a monthly basis. Based on those, we can calculate the likelihood of occurrence and later the probability.
To calculate the likelihood,
Likelihood = COUNT('Log'[Event_Type]) / CALCULATE(COUNT('Log'[Event_Type]), ALLEXCEPT('Log', 'Log'[Month]))
I intentionally kept the Month filter in place (via ALLEXCEPT) because I want to track the change in probability month-on-month.
Now, to calculate the probability, I use Bayes' theorem, where the probability of an event type (for a given month) is:
Probability = DIVIDE(Numerator, Denominator)
Where:
Numerator =
[Likelihood] * SUM(Reference[Global_Event_Probability]), for a given event type
Denominator = ??? This should be the SUMX of the numerator over all event types.
I am struggling to define the denominator and would greatly appreciate your help.
Thanks,
I know this is an old post and most likely outdated, but in case it helps: I recently wrote a blog post about Bayes' theorem in Power BI, and you can also download a sample report there to look at my calculations.
This report is, however, just a calculator with manual inputs from slicers, but you could of course replace the slicer values with your measures.
You can do that with a measure that uses VALUES, SELECTCOLUMNS and SUMX. But it will be expensive, as you will have to recalculate the SUMX for each individual "row". How big is your dataset?
For Bayes you need knowledge of a prior. What is that here: the probability for the previous month?
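To pin down the target before writing the measure, the probability described in the question can be written out as below. The symbols are my own shorthand, not names from the model: L_m(t) stands for the [Likelihood] measure for event type t in month m, p_t for Reference[Global_Event_Probability] of that type, and the sum runs over all event types.

```latex
\[
P_m(t) \;=\; \frac{L_m(t)\, p_t}{\sum_{t'} L_m(t')\, p_{t'}}
\]
```

The denominator does not depend on t, so within a month it is the same for every event type, and the resulting probabilities sum to 1.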
Thanks ibendlin for your time.
Size of dataset is roughly 5000 rows and growing.
For the second question, as you suggested, the previous month's probability is available.
Give it a try - worst case you can use R or Python Script instead.
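If the DAX route turns out to be too slow, the same calculation in a Python script might look roughly like this. It is only a sketch in plain Python: the function name monthly_posteriors, the (month, event_type) tuple layout, and the sample numbers are all made up for illustration, and you would have to build the inputs from your own Log and Reference tables.

```python
from collections import Counter, defaultdict


def monthly_posteriors(log, prior):
    """Normalised probability per event type for each month.

    log   -- iterable of (month, event_type) pairs (hypothetical layout)
    prior -- dict mapping event_type to its global probability
    """
    # Count events per type within each month.
    counts = defaultdict(Counter)
    for month, event_type in log:
        counts[month][event_type] += 1

    result = {}
    for month, by_type in counts.items():
        total = sum(by_type.values())
        # Numerator per type: monthly share (the "likelihood" from the
        # question) times the global probability of that type.
        numerators = {t: (by_type[t] / total) * p for t, p in prior.items()}
        # Denominator: sum of the numerators over all event types.
        denominator = sum(numerators.values())
        result[month] = {
            t: (n / denominator if denominator else 0.0)
            for t, n in numerators.items()
        }
    return result


# Made-up sample data, for illustration only.
log = [
    ("2023-01", "safe"), ("2023-01", "safe"),
    ("2023-01", "near miss"), ("2023-01", "incident"),
    ("2023-02", "safe"), ("2023-02", "accident"),
]
prior = {"safe": 0.70, "near miss": 0.15, "incident": 0.10, "accident": 0.05}

posteriors = monthly_posteriors(log, prior)
for month in sorted(posteriors):
    print(month, posteriors[month])
```

The denominator is computed once per month and reused for every event type, which avoids the repeated SUMX evaluation mentioned above.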