On seperate post https://community.powerbi.com/t5/Desktop/DAX-equivalent-to-complex-SQL-query/m-p/151869#M65682 Phil gave me 3 DAX code segments which I have used to create a 100% DAX solution in place of SQL Query based one. That's wonderful. But, I have one lingering problem of high RAC.ReportA_AttributeCount counts returning from the dataset which in SQL I resolve using DISTINCT.
I have applied DISTINCT here (line 4) and it has helped reduce the inflated counts of attributeNames
#ReportAttributeCount = SELECTCOLUMNS( SUMMARIZECOLUMNS( 'Report-AttributeMap'[Report Name], "rptA - Total Attributes", COUNTROWS(DISTINCT('Report-AttributeMap')) ), "ReportA", 'Report-AttributeMap'[Report Name], "rptA - Total Attributes", [rptA - Total Attributes] )
but still some higher than true counts are returning. the two other code segments are below.
CommonAttributesCount = SUMMARIZE( FILTER( GENERATE( SELECTCOLUMNS( Table2, "ReportName",'Table2'[ReportName], "AttributeName",'Table2'[AttributeName] ) , SELECTCOLUMNS( Table2, "ReportName2",'Table2'[ReportName], "AttributeName2",'Table2'[AttributeName] ) ),[AttributeName]=[AttributeName2] && [ReportName]< [ReportName2] ), [ReportName], [ReportName2], [AttributeName] )
ReportTable = VAR CommonAttributesCount2 = SELECTCOLUMNS( SUMMARIZE( 'CommonAttributesCount', [ReportName], [ReportName2], "CommonAttributeCount",DISTINCTCOUNT('CommonAttributesCount'[AttributeName])),"CACA.ReportA",[ReportName],"CACA.ReportB",[ReportName2],"CACA.DC",[CommonAttributeCount]) VAR FINAL = ADDCOLUMNS( FILTER( CROSSJOIN( FILTER( CROSSJOIN( SELECTCOLUMNS( ReportAttributeCount, "RAC.ReportA",[ReportA], "RAC.ReportA_AttributeCount", [ReportA_AttributeCount] ), CommonAttributesCount2), [RAC.ReportA]=[CACA.ReportA] ), SELECTCOLUMNS( 'ReportAttributeCount', "RACB.ReportA",[ReportA], "RACB.ReportA_AttributeCount",[ReportA_AttributeCount] ) ),[CACA.ReportB]=[RACB.ReportA]) , "reportApercentage",DIVIDE([CACA.DC],[RAC.ReportA_AttributeCount]), "reportBpercentage",DIVIDE([CACA.DC],[RACB.ReportA_AttributeCount])) RETURN FINAL
In addition to verified distinct SQL counts I have verified manually that the counts are higher than truly present in the underlying data (by doing export to excel for verification)
Can someone see anywhere in any of the code segments where DISTINCT can be added to ensure unique ReportName/AttributeName pairs are returned? Or have I applied the DISTINCT function incorrectly?
Solved! Go to Solution.
I think I have it.
Please update the code for the 'ReportAttributeCount' table (changes highlight)
ReportAttributeCount = SELECTCOLUMNS( SUMMARIZE( 'Report-AttributeMap', 'Report-AttributeMap'[ReportName], "Distinct Count", CALCULATE(DISTINCTCOUNT('Report-AttributeMap'[AttributeName]) ) ), "ReportA",[ReportName], "ReportA_AttributeCount",[Distinct Count] )
Hi @hxkresl
Just a couple of questions that will help me. So I understand the code is ok for the small sample dataset, yes? 🙂
Any chance you can give me a slightly bigger dataset including some duplication that you want to ignore. It would be great if you can do this and also provide the expected result for the bigger dataset to help me pick the best place to apply the de-dup.
Cheers,
Phil
Yes, no issues with duplicates (when using DISTINCT) on sample dataset.
I generated schema and data script for the actual dataset, saved as file called schemaanddatafor_ReportTable, and placed it here
https://www.dropbox.com/sh/6kgldb1fhfichy4/AAAbXQhUr8VvyD65qTEOvIPFa?dl=0
Can you access?
There are 6996 rows having blank report name/attribute name (which i would also like to handle with some kind of WHERE ReportName <> ''), and 17081 total rows including duplicate reportName/attributeName pairs.
Expected result eg. Area Dashboard Details (ReportA) has 172 total distinct attributes (this is correct):
SQL based result
but with DAX, same report am getting 186 count:
DAX results
Yep, got that and have imported into a local SQL DB and have subsequently imported into my PBI table.
I named the table [Report-AttributeMap]
I imported your data and for that scenario I got 172 (using my original queries). I think we might be getting close 🙂
Here is the PBIX file
previously I shared with you the 'actual' table minus 3 metadata rows....The source table actually has additional columns that aren't germane to report but that I now realize are skewing outcome of DAX. If you please go back to drop box and try this with the file named schemaanddatafor_ReportTable2, the issue with offcounts will surface. <sorry>
https://www.dropbox.com/s/hbdiwz2cgtxylqq/schemaanddatafor_ReportTable2.sql?dl=0
I was trying to clean it up for ease, but in fact the other columns are causing issue and I don't know how to adapt DAX.
Cool, that will hopefully help. I just have to do a few hours of outside chores (it's Saturday noon here) and will look at it when I get back.
yes, the measure column adds further distinctness. so Distinct needs to exclude from consideration the Measure and color columns.
I think I have it.
Please update the code for the 'ReportAttributeCount' table (changes highlight)
ReportAttributeCount = SELECTCOLUMNS( SUMMARIZE( 'Report-AttributeMap', 'Report-AttributeMap'[ReportName], "Distinct Count", CALCULATE(DISTINCTCOUNT('Report-AttributeMap'[AttributeName]) ) ), "ReportA",[ReportName], "ReportA_AttributeCount",[Distinct Count] )
one last thing. How to exclude any rows where ReportName is blank ?
You could try using the Query Editor to remove the blanks.
Thank you Phil for taking your time out of your weekend to help me.
I am guessing (since I don't have access to source to refresh) that the person who does, can add a step to remove blanks, which will then replay each time table is refreshed.
Know that this exchange is very powerful way of learning for me and I appreciate it very much. I must now sort through the nested FILTER, CROSSJOIN, FILTER, CROSSJOIN, SELECTCOLUMNS logic. It is very tricky.
Hi @hxkresl
Do you have access to the Query Editor in Power BI? If so have a look at the technique I suggested to remove the blanks. It should work perfectly 🙂
Yes. Happy to see correct result 172!
and one last thing...
Yay!
Nothing about DAX is especially complex. All the separate parts are pretty simple. It's just when you put it all together, sometimes it can become tricky.
If you can break it down into mini parts, it can sometimes help alot
Downloaded the file and now getting 186 for the RAC.ReportA_AttributeCount