Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
ricsolbra
Frequent Visitor

Sum Distinct not working in a matrix

Hi guys,

 

I'm really stuck in a SUM DISTINCT problem.

 

In the dataset I have various ID, which are 'businesses created', and these businessess have installments, in many cases more than 1 installment. Afterwards, I joined it with another table with the paid installments. So, I have now a final dataset where the ID will duplicate if it has more than one installment paid, once that the installment paid has its own paid date. It is something like this:

 

ricsolbra_1-1657656105174.png

I'mtryong to finish this matrix below (cohort analysis with the installments created sum in rows and the sum of the payments of these installments in columns):

 

ricsolbra_0-1657654727984.png

 

My big problem is that I need to have the sum of installments for the DISTINCT ID per created_date and paid_date in the cohort analysis. As you can see, when I sum the months individually it doesn't match the subtotal nor the total. Ex.: Total in JANEIRO 2022 =  406.344. Businesses that have no payment = 258.109. Businesses that have, at least, 1 payment = sum the months). Summing the businesses with no payment and the businesses with some payment = 647.828. So, it is still giving me the duplicates.

 

The DAX formula I used to have this SUM of the installments for the distinct ID: 

 

Sumx(DISTINCT('Table'[ID]),[_Max_Installment])

 

I would appreciate your help.

4 REPLIES 4
ricsolbra
Frequent Visitor

I created in a .pbix file what I've been trying to do. Hope it helps:

 

https://drive.google.com/drive/folders/1I0fEwjtScFxYTH8ygCgudr1uyh7VsvZK?usp=sharing 

 

Tks in advance.

 

ricsolbra
Frequent Visitor

Hi @v-chenwuz-msft. Thanks for your reply. I'm going to try to explain in plain text first, if it is not enough, I'll have to prepare and share the data you requested.

 

In JANUARY 2022 I have 406.344 distinct installments ID (real dataset) that where created in JANUARY 2022. Installments come from DEALS. The ROW total in the cohort matrix is correct (= the ROW total is related to installment sum from the distinct DEAL ID):

ricsolbra_0-1657917247475.png

When I relate PAYMENTS to these INSTALLMENTS, duplicates show up, because deals created will have, along time, installments paid, and more than one payment in many cases (just correcting myself, Installment_ID is DEAL_ID and DEALS have installments, as below):

ricsolbra_1-1657917543227.png

What I did was a matrix with CREATED_DATE in ROWS (realated to the deals created each month, where each deal has installments) and PAID_DATE in COLUMNS (related to the installments paids each month). Many installments, off course, weren't paid along time. That's why we have 258.109 distinct installments ID created in JANUARY 2022 with no payments:

ricsolbra_2-1657917791889.png

The problem is when the installments paid are distributed each month, i.e. COLUMNS. I expect to have distinct installments ID paid, with no duplicates. In my cohort we can see that the duplicates still show up. Summing each month, i.e. each column, is more than what is exhibitted in ROW totals. Look at the JANUARY ROW and all months in 2022 (columns): 105.064 (logically, out of 406.344 installments created in JANUARY 22, 105.064 were paid in JANUARY 22) + 79.231 + 67.271 + 49.393 + 42.325 + 34.336 + 4.929 + 380 + 258 + 250 + 265 + 240. Adding it all up we have, in fact, 279.005 (with duplicates) and not what is in the cohort analysis, 148.216 (which is the correct sum, with no duplicates).

 

The dax formula I used in "values" was:      Sumx(DISTINCT('Table'[ID]),[_Max_Installments])    -> Where ID is the DEAL ID (and DEALS duplicate if more than one installment were paid along time) and Max_Installments are the installments for each deal created.

 
Hope I was more fortunate to clear it all out.

 

 

 

v-chenwuz-msft
Community Support
Community Support

Hi @ricsolbra ,

 

Can you explain how the 657.828 is calculated? What the output you want for the example data you provided in picture 1?

 

Or you could share the pbix file without sensitive data and expect result if possible.

How to Get Your Question Answered Quickly - Microsoft Power BI Community

 

Best Regards

Community Support Team _ chenwu zhu

I created in a .pbix file what I've been trying to do. Hope it helps:

 

https://drive.google.com/drive/folders/1I0fEwjtScFxYTH8ygCgudr1uyh7VsvZK?usp=sharing 

 

Tks in advance.

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.