The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
I am performing an analysis on referral patterns.
I have a single table of data named SDP_OUTPUT that contains locations named [PSA_NAME] for which there are 13 distinct values.
The table contains a column of IDs named [STOP_CODE]. For each [PSA_NAME] + [STOP_CODE] the row in the table is unique.
Other fields are [REFERRAL_LOCATION], [REFERRAL_PERC] and [STOP_TOTAL_DEMAND].
The Referral Location represents where the demand for the STOP code is being sent. Each PSA+STOP code only sends its demand to one location, but a location can receive demand from multiple locations. The Referral_Perc is the percentage being referred on to the referral location (between 0% and 100%). And finally, the STOP_total_demand is the demand generated at the PSA location before any referral.
My objective is to calculate all of the demand being referred out, referred in, and the final adjusted demand after referrals.
I have to be careful to stage out the calculation or end up in a mathematical loop.
I first created a measure to calculate all of the workload being referred out, but set a condition for the calculation to first check that for the given PSA+STOP code, no workload is being referred in.
Next, I created a 2nd measure named Adjusted Direct Demand.
This measure is returning blank results.
PSA_NAME | STOP_CODE | REFERRAL_LOCATION | REFERRAL_PERC | STOP_TOTAL_DEMAND | REFERRED_OUT | ADJUSTED_DIRECT_DEMAND | REFERRED_IN |
Antelope Valley | 1002 | Sepulveda | 1 | 118575.7 | 118575.7 | 0 | |
Antelope Valley | 104 | Sepulveda | 1 | 233.7255 | 233.7255 | 0 | |
Bakersfield | 1002 | Sepulveda | 1 | 133481.7 | 133481.7 | 0 | |
Bakersfield | 104 | Sepulveda | 1 | 190.7225 | 190.7225 | 0 | |
East Los Angeles | 1002 | Los Angeles | 1 | 34798.76 | 34798.76 | 0 | |
East Los Angeles | 104 | West Los Angeles | 1 | 73.47664 | 73.47664 | 0 | |
Lake Isabella | 1002 | Sepulveda | 1 | 5988.846 | 5988.846 | 0 | |
Lake Isabella | 104 | Sepulveda | 1 | 8.557033 | 8.557033 | 0 | |
Los Angeles | 1002 | West Los Angeles | 0.5 | 149614 | 74807.02 | 74807.02 | |
Los Angeles | 104 | West Los Angeles | 1 | 315.906 | 315.906 | 0 | |
Ridgecrest | 1002 | Sepulveda | 1 | 20467.5 | 20467.5 | 0 | |
Ridgecrest | 104 | Sepulveda | 1 | 29.24454 | 29.24454 | 0 | |
San Gabriel | 1002 | Sepulveda | 1 | 190845.9 | 190845.9 | 0 | |
San Gabriel | 104 | Sepulveda | 1 | 402.966 | 402.966 | 0 | |
San Luis Obispo | 1002 | Sepulveda | 1 | 51814.83 | 51814.83 | 0 | |
San Luis Obispo | 104 | Sepulveda | 1 | 70.79026 | 70.79026 | 0 | |
Santa Maria | 1002 | Sepulveda | 1 | 84130.06 | 84130.06 | 0 | |
Santa Maria | 104 | Sepulveda | 1 | 129.6575 | 129.6575 | 0 | |
Sepulveda | 1002 | Sepulveda | 0 | 232407.9 | 0 | 232407.9 | |
Sepulveda | 104 | West Los Angeles | 0.542 | 480.2137 | 260.2758 | 219.9379 | |
South Los Angeles | 1002 | West Los Angeles | 1 | 130810.3 | 130810.3 | 0 | |
South Los Angeles | 104 | West Los Angeles | 1 | 276.2024 | 276.2024 | 0 | |
Ventura | 1002 | Sepulveda | 1 | 142020.2 | 142020.2 | 0 | |
Ventura | 104 | Sepulveda | 1 | 240.2623 | 240.2623 | 0 | |
West Los Angeles | 1002 | West Los Angeles | 0 | 73635.01 | 0 | 73635.01 | |
West Los Angeles | 104 | West Los Angeles | 0 | 155.4783 | 0 | 155.4783 |
For [PSA_NAME] Los Angeles, I expect [REFERRED_IN] to equal 347,983.76.
For [PSA_NAME] Sepulveda, I expected [REFERRED_IN] to equal 747,324.80.
For [PSA_NAME] West Los Angeles, I expected [REFERRED_IN] to equal 205,617.33.
For all other PSAs I expect [REFERRED_IN] to return zero because no Referral Location matches that PSA.
I need to move on from this calculation to create a 4th measure named 2ndADJUSTED_DIRECT_DEMAND by adding the REFERRED_IN_DEMAND to the ADJUSTED_DIRECT_DEMAND.
I say 2nd here because there could be a 3rd or 4th adjustment needed, each time check to see if the demand continues to be referred. For example, in the data I provided, East Los Angels sends 100% of its workload on to Los Angeles, but then Los Angeles sends 50% its workload on to West Los Angeles. On the first calculation, what ends up at West Los Angeles is just 50% of Los Angles, but it should really be (the total demand of Los Angles + the 100% of the total demand from East Los Angles) * 50%.
If anyone can think of a way to do that more efficiently, without multiple measures and without running into circular logic, you win the grand prize. 😄
For now though, I'll be happy to know why my REFERRED_IN measures isn't returning the expected result.
Thank you!
a way to do that more efficiently, without multiple measures and without running into circular logic
- not really
- no
- heck no.
This is a closed loop control system that will require multiple runs to calm down and level in. There is no chance at all that a deterministic single loop system like Power BI (which doesn't even have the concept of global variables) can handle this very complex process.
Can you add a timing component? Datetime stamps on when the referrals happened? So you can build a sequence?
I guess you could build a solution out of hydraulics, or with mycelium.
@lbendlin, I made some progress yesterday.
I didn't know DAX had functions to measure parent/child relationships. I can leverage these to detect exactly how many PSAs away a given PSA may be on the chain of referral. PSAs at the beginning of the chain with no children will have the largest path length. PSAs at the end of the chain with the most children and/or are only referring to themselves have the shortest path length.
So the larger the path length, the earlier its demand should be calculated. The PSA(s) with the smallest path length are calculated last.
Next, I need to write a looping statement... which I guess isn't possible in DAX but thankfully @Greg_Deckler provided us with a write-up on how to work around that. I don't have the code written out yet but I think I have a map for how to get there.
PATH only works if each element appears in only one place of the hierarchy. From your original description it sounded more like you have circular references between PSAs?
The FOR and WHILE constructs you reference will only work for independent computations in each of the loops. As soon as one loop iteration depends on the result of another iteration this will break down. There is a Power Query function List.Accumulate that is a bit more powerful but even that cannot mask the fact that Power BI does not have a concept of global variables.
I think your best chance is still when you provide additional sequencing information (ideally timestamps, or index columns).
I don't believe it is as hopeless a situation as you describe.
I was able to make some progress - though I admit I still don't have the full solution.
After calculating the demand referred out, and calculating the demand remaining at the PSA, I have a measure for the demand referred in for each PSA as follows:
This resolves the first issue I originally posted about. I then took it a step further, by simply writing another measure that adds the Referred In demand and the "Adjusted Demand" - which probably isn't the best name for the field but it represents the demand originated at the PSA minus the portion that was referred out.
I think of this as the first cycle of referral. For any PSA that receives no Referred In demand, this is the end of the calculation but the problem is that some PSAs continue to refer their new adjusted demand onto another location. And the important part to remember is that it may not always be sending on 100% of it's demand.
To calculate this, I wrote a measure that subtracts the demand that the PSA already referred out from its Adjusted Demand, and then multiplie by the referral percentage. I subtract the referred out demand from the first cycle first so that I don't double dip on the referral percentage. Also, some PSAs will have already referred out 100% of their demand, so the subtraction will result in a negative number. I handle that with a conditional statement that just returns zero.
The measure looks like this:
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.
User | Count |
---|---|
21 | |
21 | |
18 | |
17 | |
13 |
User | Count |
---|---|
41 | |
38 | |
24 | |
20 | |
20 |