The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
As a follow-up to this post, I've created a simpler example that demonstrates the core of the behavior I still don't quite understand.
Consider the following table, 'Data':
Data = DATATABLE ( "group", STRING, "amount", INTEGER, { {"A", 5}, {"A", 10}, {"B", 12}, {"C", 15} } )
I'd like to understand why the following do not give the same result:
SUMMARIZECOLUMNS (
Data[group],
"Rank", RANKX ( ALL ( Data[group] ), CALCULATE ( SUM ( Data[amount] ) ) )
)
versus
SUMMARIZECOLUMNS (
Data[group],
Data,
"Rank", RANKX ( ALL ( Data[group] ), CALCULATE ( SUM ( Data[amount] ) ) )
)
The first gives A,B,C with Rank 2,3,1 respectively, whereas the second gives Rank 1,1,1.
Similarly, the following also return different results (which match the results above):
ADDCOLUMNS (
VALUES ( Data[group] ),
"Rank",
CALCULATE (
RANKX ( ALL ( Data[group] ), CALCULATE ( SUM ( Data[amount] ) ) )
)
)
versus
ADDCOLUMNS (
VALUES ( Data[group] ),
"Rank",
CALCULATE (
RANKX ( ALL ( Data[group] ), CALCULATE ( SUM ( Data[amount] ) ) ),
Data
)
)
Curiously, if I replace RANKX with SUMX in the ADDCOLUMNS pair of expressions, the results match each other (both give A,B,C with Rank 47,47,47) whereas the SUMMARIZECOLUMNS pair gives A,B,C with Rank 47,47,47 for the first expression but Rank 15,12,20 for the second one.
These examples are a bit contrived but I'm looking to understand exactly why these behave the way they do and this was the most simplified form I could come up with. They may look slightly less odd if you replace CALCULATE ( SUM ( Data[amount] ) ) with a measure [SumAmount] := SUM ( Data[amount] ).
What's going on here? How is the table argument getting applied in these cases?
Solved! Go to Solution.
I believe this is Auto-Exist kicking in. SUMMARIZECOLUMNS combines all filters from the same table into one filter, so in your second evaluation, you have DATA, and DATA[Group] - and they get combined since it is the same table.
Take a look at the detailed explanation of SUMMARIZECOLUMNS() here and then the Auto-Exist issue you may need to contend with here.
Auto-exist is a pain to recon with.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI Reportingcan this help you?
think you can get more from here
https://www.sqlbi.com/articles/all-the-secrets-of-summarize/
I believe this is Auto-Exist kicking in. SUMMARIZECOLUMNS combines all filters from the same table into one filter, so in your second evaluation, you have DATA, and DATA[Group] - and they get combined since it is the same table.
Take a look at the detailed explanation of SUMMARIZECOLUMNS() here and then the Auto-Exist issue you may need to contend with here.
Auto-exist is a pain to recon with.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingThanks. Auto-Exists does seem like the likely explanation, at least for the SUMMARIZECOLUMNS cases since using a dimension table for group resolves the issue.
For the ADDCOLUMNS examples, using a dimension table doesn't seem to help so maybe there's something else happening. Auto-Exists also doesn't explain why RANKX and SUMX behave differently for these.
I'm certianly not an expert in Auto-Exists, but I believe it returns a table with specific info, therefore SUMX and RANKX are operating on a different result set than the SUMMARIZECOLUMNS without the filter table. That is why you are getting different results. SQLBI does a deep dive into Auto Exist in this video.
If you could mark one or more of these as the solution so this thread can be marked solved, we'd appreciate it.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingWhy would SUMX operate on a different result set than RANKX in the ADDCOLUMNS examples (where the results match each other using SUMX but don't match using RANKX)?
My only guess would be that there might be some kind of internal optimization such that SUMX isn't actually behaving as an iterator whereas RANKX still must.
SUMX is always an iterator. To really trouble shoot this you'd need to analyze the query plan in DAX Studio to see what the engine is doing. That will show you the exact pseudo-SQL being generated for each query and how they differ.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingJoin the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
18 | |
18 | |
16 | |
14 | |
12 |
User | Count |
---|---|
36 | |
35 | |
20 | |
18 | |
18 |