Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hi
Really wish someone could enlighten me on this - to multiplt columns in two fact tables that are in a star schema. I tried two apparently almost identical DAX solutions but the preferred one (more compact) yields totally wrong result.
Simplified data for example:
1. Facts tables:
table_1:
X | Value_1 |
X1 | 0.0625 |
X2 | 0.125 |
X3 | 0.1875 |
X4 | 0.3125 |
X5 | 0.3125 |
table_2:
X | Y | Value_2 |
X1 | Y1 | 0.142857 |
X1 | Y2 | 0.571429 |
X1 | Y3 | 0.285714 |
X2 | Y1 | 0.7 |
X2 | Y2 | 0.1 |
X2 | Y3 | 0.2 |
X3 | Y1 | 0.3125 |
X3 | Y2 | 0.5 |
X3 | Y3 | 0.1875 |
X4 | Y1 | 0.076923 |
X4 | Y2 | 0.538462 |
X4 | Y3 | 0.384615 |
X5 | Y1 | 0.272727 |
X5 | Y2 | 0.181818 |
X5 | Y3 | 0.545455 |
2. two dimention tables
table_X:
X |
X1 |
X2 |
X3 |
X4 |
X5 |
table_Y:
Y |
Y1 |
Y2 |
Y3 |
Obviously the schema is like this:
Problem:
To mutiply value_1 in table_1 and value_2 in table_2.
My DAX solutions:
Solution 1 - gives correct result = 1:
Measure_1 = SUM(table_1[value_1]) * SUM(table_2[value_2])
Measure_2 = SUMX(table_1, [Measure_1])
Soluiton 2 - try to use one combined measure but got wrong answer = 25:
Measure_2 =
VAR
Measure_1 = SUM(table_1[value_1]) * SUM(table_2[value_2])
RETURN
SUMX(table_1, Measure_1)
Could anyone please advise why Solution_2 is wrong and how I should correct it?
Many thanks in advance.
charlie77
Solved! Go to Solution.
Hi @charlie77
Please try
Measre1 =
SUMX (
CROSSJOIN ( VALUES ( table_X[X] ), VALUES ( table_Y[Y] ) ),
CALCULATE ( SUM ( table_1[Value_1] ) * SUM ( table_2[Value_2] ) )
)
Hi @charlie77
Please try
Measre1 =
SUMX (
CROSSJOIN ( VALUES ( table_X[X] ), VALUES ( table_Y[Y] ) ),
CALCULATE ( SUM ( table_1[Value_1] ) * SUM ( table_2[Value_2] ) )
)
Hi tamerj
Thank you so much for the prompt assistance. It works with one single measure!
May I know whether CROSSJOIN would dramatically increase memory uptake (as it expends into a big joined table) and thus slow down the process if my data is really big?
Also wish to knwo why my 2nd soluiton fails.
Best regards
charlie77
This is a crossjoin over the unique values of two dim. columns. This won't have a big impact on the performance.
having more than one fact table is a little tricky and requires proper handling of the filter context. The two fact tables are not directly related this why your both solutions didn't work.
Just to clarify my first solution does work. I'm still need to figure out why solution 2 doesn't work. I'd think calculations based on a data model (relationships) would be more efficient than merging tables together. Isn't it?
I seems to have figured it out. In solution 2 Measure_2 is actually a variable instead of a dynamic measure. The variable is always evaluated as SUM(table_1[value_1]) * SUM(table_2[value_2]) = 5. Then in the SUMX(table_1, Measure_1) part this constant 5 is repetitively added 5 times (as X1 to X5 in table 1) to make the result of 25. This can be further validated if in SUMX I use table_2 (15 rows) in place of table_1 (5 rows), I got a result of 75 (15 * 5).
charlie77
And further, I got a much more simplified solution without using CROSSJOIN, purely based on schema relationships:
Measure = SUMX(table_1, CALCULATE(SUM(table_1[Value_1]) * SUM(table_2[Value_2]))
Perfect!
The total matches exactly that from your equation.