Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, the Microsoft Fabric Community team will be offering free DP-600 exam vouchers. Prepare now

Reply
karun_r
Microsoft Employee
Microsoft Employee

DISTINCT over SUMMARIZE

I have a measure that follows the pattern as mentioned below

 

Test Adds =
CALCULATE (
    SUMX (
        DISTINCT ( SUMMARIZE ( Sales, Sales[Order Date], Sales[Order Number] ) ),
        IF ( Sales[Sales Amount] > 0, 1, 0 )
    )
)

 

I was wondering if I really need the DISTINCT over a SUMMARIZE call as the latter would already return unique combinations of order date and order number. Also, is there a better way to calculate the total number of orders for each day for which the sales amount > 0 ?

1 ACCEPTED SOLUTION
Greg_Deckler
Super User
Super User

I do not see the need for DISTINCT, SUMMARIZE already makes things distinct.

Maybe, but don't know your data to be sure. Should just be able to use COUNTROWS and FILTER the table returned from SUMMARIZE but not sure that is more efficient or not.


Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Power BI Cookbook Third Edition (Color)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

12 REPLIES 12
Ashish_Mathur
Super User
Super User

Hi,

Here's what i would do:

  1. Create a Calendar Table and build a relationship from the Order date column of your Sales Table to the Date column of your Calendar Table.  Write calculated column formlas to extract various time dimensions such as Year, Month, Quarter
  2. Drag time dimensions from the Calendar Table.
  3. Write these measures

Total sales = SUM(Sales[Sales amount])

Orders which recorded sales = SUMX(FILTER(SUMMARIZE(Sales,Sales[Order Number],Calendar[Date],"Sale",[Total Sales]),[Sale]>0),[Sale])

Hope this helps.


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

@Ashish_MathurWhy would I want to do a sum of [Sales] measure ? I am just trying to get a count of number of orders for each day where sales > 0. I shouldn't have to do a sum on Sales , right ?

Hi,

Does this work?

=COUNTROWS(FILTER(SUMMARIZE(Sales,Sales[Order Number],Calendar[Date],"Sale",[Total Sales]),[Sale]>0))


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

@Ashish_Mathur  Yes, that is matching with my current metric results, I have two other versions of the same metric, I am going to test the performance of each metric and get back

amitchandak
Super User
Super User

@karun_r 

I doubt you need to summarize there. Also, the formula should one of the two as per need

 

Test Adds =
CALCULATE (
SUMX (
SUMMARIZE ( Sales, Sales[Order Date], Sales[Order Number],"_1",sumx ( Sales,if(Sales[Sales Amount] > 0, 1, 0 ) )),
[_1]
)
)


Test Adds =
CALCULATE (
SUMX (
SUMMARIZE ( Sales, Sales[Order Date], Sales[Order Number],"_1",sum ( Sales[Sales Amount]) ),
if([_1] > 0, 1, 0 )
)
)

Join us as experts from around the world come together to shape the future of data and AI!
At the Microsoft Analytics Community Conference, global leaders and influential voices are stepping up to share their knowledge and help you master the latest in Microsoft Fabric, Copilot, and Purview.
️ November 12th-14th, 2024
 Online Event
Register Here

@amitchandak  Would that have an performance improvement over my current metric definition ? If so, can you please help me understand how it would improve things ?

Greg_Deckler
Super User
Super User

I do not see the need for DISTINCT, SUMMARIZE already makes things distinct.

Maybe, but don't know your data to be sure. Should just be able to use COUNTROWS and FILTER the table returned from SUMMARIZE but not sure that is more efficient or not.


Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Power BI Cookbook Third Edition (Color)

DAX is easy, CALCULATE makes DAX hard...

@Greg_Deckler  Yes, that's what I thought. I removed DISTINCT from the measure definition and the numbers seem to match. I am just trying to think of a case where I would need a DISTINCT over SUMMARIZE. It's impossible for me to test all possible combinations considering the dimensions I have in my dataset. In your experience, was there ever a case where you had to apply DISTINCT over a SUMMARIZE statement ? I am yet to do a performance comparison between both the measures.

@karun_r

 

No on DISTINCT with SUMMARIZE, because when you SUMMARIZE, whatever your are grouping on will be distinct. DISTINCT will make sure that entire rows are distinct in a table versus just a particular column, but the fact that you are using SUMMARIZE ensures that the table rows are distinct, it's built into the grouping.

 

@Ashish_Mathur put into a formula what I was discussing with COUNTROWS and FILTER. Now, whether that is an improvment? You'd have to test it. See my 4 part series on Performance Tuning DAX: https://community.powerbi.com/t5/Community-Blog/Performance-Tuning-DAX-Part-1/ba-p/976275



Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Power BI Cookbook Third Edition (Color)

DAX is easy, CALCULATE makes DAX hard...

@Ashish_Mathur  and @Greg_Deckler  So I had three copies of the measures for which I tested the performance on my Power BI desktop. SUMX() was consistently better performing than COUNTROWS(FILTER() >0) pattern

 

And removing DISTINCT from SUMMARIZE also had some minor improvements on the run time. Maybe it would change in a huge dataset and the time difference would be considerable.

Hi @karun_r ,

Thanks for sharing the results. And it seems that you have got the answer of the thread.  Can you please accept the helpful answers as solutions?  More people will benefit here.

 

Best Regards,
Xue Ding
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

Thanks for the information! That does not surprise me, the "X" functions seem to be very performant overall, as I note in my To *Bleep* with RANKX Quick Measure, https://community.powerbi.com/t5/Quick-Measures-Gallery/To-Bleep-with-RANKX/td-p/1042520


Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
Power BI Cookbook Third Edition (Color)

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
OCT PBI Update Carousel

Power BI Monthly Update - October 2024

Check out the October 2024 Power BI update to learn about new features.

September Hackathon Carousel

Microsoft Fabric & AI Learning Hackathon

Learn from experts, get hands-on experience, and win awesome prizes.

October NL Carousel

Fabric Community Update - October 2024

Find out what's new and trending in the Fabric Community.