Supplies are limited. Contact info@espc.tech right away to save your spot before the conference sells out.
Get your discountScore big with last-minute savings on the final tickets to FabCon Vienna. Secure your discount
i have a fact table at invoice level and data looks like this(for one billing account)
billing_account_key | customer_key | due_days | due_amount |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 34 | 110 |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 64 | 110 |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 95 | 110 |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 125 | 110 |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 155 | 110 |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 156 | 110 |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 186 | 110 |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 187 | 110 |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 216 | 110 |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 217 | 110 |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 247 | 110 |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 248 | 110 |
c9ca215d1a18caf55e1df0effc1d986b | efcd14ad702663dfebe27b60a8c4bb95 | 278 | 110 |
We have two dimentions which are called billing account dimention and customer dimention
i need to pull the report with the below columns.
billing_account_number-----from billing dimention
customer_name-------from customer dimention
1_30_DPD-----if fact.due_days between 1 and 30 then fact.due_amount end
31-60_DPD----if fact.due_days between 31 and 60 then fact.due_amount end
and so on till
180+DPD------if fact.due_days >180 then fact.due_amount end
oldest_aged_bucket: it is clculated like the below way.
1. If the amount in the specific bucket is greater than 50 aud
&&
2. Based on the due date ( it will be Payment Due Date +1 ) calculate number of days a customer in collection. Based on this value fetch the relevant aged debt bucket and update this field with the same aged debt bucket
for eg: if a customer have 100 aud in 1-29_DPD, 80 aud in 30-59_DPD and 20 aud in 60-89_DPD the the oldest aged debt bucket will be 30-59_DPD.
to calculate 1-30_DPD i have created a calculated column in the fact and logic is as below.
BILLING_ACCOUNT_NUMBER | Customer_Name | 1_30_DPD | 30_60_DPD | 60_90_DPD | 90_120_DPD | 120_150_DPD | 150_180_DPD | 180+ DPD |
7E+11 | SCHMITTEN LISA | 0 | 110 | 110 | 220 | 330 | 330 | 880 |
BILLING_ACCOUNT_NUMBER | Customer_Name | 1_30_DPD | 30_60_DPD | 60_90_DPD | 90_120_DPD | 120_150_DPD | 150_180_DPD | 180+ DPD | customer_aged_bucket |
7E+11 | SCHMITTEN LISA | 0 | 110 | 110 | 220 | 330 | 330 | 880 | 180+ DPD |
7E+11 | NA | ||||||||
7E+11 | "SEKISUI HOUSE CONSTRUCTION AUSTRALIA PTY LIMITED, | NA | |||||||
7E+11 | "WEIR MINERALS AUSTRALIA,LTD" | NA | |||||||
7E+11 | (CEO) CATHOLIC TRUST CORP DIOC | NA | |||||||
7E+11 | (CEO) CATHOLIC TRUST CORP DIOCESE OF BRISBAN | NA | |||||||
7E+11 | (CEO) CATHOLIC TRUST CORP DIOCESE OF BRISBANE | NA | |||||||
7E+11 | (CEO) CATHOLIC TRUST CORP DIOCESE OF BRISBANE.. | NA | |||||||
7E+11 | (CEO) CATHOLIC TRUST CORP DIOCESE OF CAIRNS NON | NA | |||||||
7E+11 | (CEO) CATHOLIC TRUST CORP DIOCESE OF ROCKHAMPTON | NA | |||||||
7E+11 | (CEO) CATHOLIC TRUST CORP DIOCESE OF TOOWOOMBA | NA | |||||||
7E+11 | (CEO) CATHOLIC TRUST CORP DIOCESE OF TOWNSVILLE | NA | |||||||
7E+11 | (CNA) CATHOLIC TRUST CORP DIOCESE OF BRISBANE | NA | |||||||
7E+11 | (CNA) CATHOLIC TRUST CORP DIOCESE OF CAIRNS | NA
|
Solved! Go to Solution.
There does appear to be a memory issue when you have multiple comparisons to measures in a SWITCH statement. This looks like a bug to me as the following variation works almost instantly on a model with the same table sizes as yours. You can see that the logic is identical, I'm just caching the measure values in variables (which should not necessary)
Max Age Bucket2a =
var _180 = [180+_DPD]
var _150 = [150-179_DPD]
var _120 = [120-149_DPD]
var _90 = [90-119_DPD]
var _60 = [60-89_DPD]
var _30 = [30-59_DPD]
var _1 = [1-29_DPD]
RETURN SWITCH (TRUE()
,_180 >50,"180+ Days"
,_150 >50,"150-179_DPD"
,_120 >50,"120-149_DPD"
,_90 >50,"90-119_DPD"
,_60 >50,"60-89_DPD"
,_30 >50,"30-59_DPD"
,_1 >50,"1-29_DPD"
)
So for 2 distinct dimensions that are only connected via a fact table in the following type of relationship:
Billing (1) --->--- (*) Fact (*) ----<---- (1) Customer
When you drag a column from Billing and one from Customer the theoretical space that the calculation engine has to consider is the cross product of the two columns. Normally you do not see this as the engine will optimize away non-existant combinations using the fact table, but if you write a measure that returns a non-blank value you will see them.
So the simple fix for this is to replace the "NA" value in the final else condition of your nested IF with a call to BLANK()
eg.
customer_aged_bucket =
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[180+ DPD]
) > 0,
"180+ DPD",
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[150_180_DPD]
) > 0,
"151-180_DPD",
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[120_150_DPD]
) > 0,
"121-150_DPD",
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[90_120_DPD]
) > 0,
"91-120_DPD",
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[60_90_DPD]
) > 0,
"61-90_DPD",
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[30_60_DPD]
) > 0,
"31-60_DPD",
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[1_30_DPD]
) > 0,
"1-30_DPD",
BLANK()
)
)
)
)
)
)
)
PS. I formatted the code above using daxformatter.com as it is difficult to understand the structure of the expression when it is one long line of text.
Hi @d_gosbell ,
Even if i keep blank it is doing the cross product join only and when i select all the billing accounts, it is giving out of memory exception. Is there any way i can avoid this cross product join
@Anonymous
so, whats wrong? for customer names that have no value in your calculated *DPD columns it show "NA" exactly as you wish in customer_aged_bucket statement
if you replace "NA" to BLANK() it will give you an empty cell and you will able to hide it with deactivate "Show item with no data" visual parameter
It s not the best solution for your data model, but maybe the fastest with all your work done
Hi @Anonymous ,
I can hide the value using blank, For one billing account it is fine.When i select all the billing accounts it is thrwoing out of memory exception as it is doing cross product join.Is there any way we can achieve the same thing without cross product join.
Thanks
Aswani
I don't know how you are still seeing the cross join. Below is a screenshot from a file I built based on your description of your model. I added some made up data so that I had more than one customer and one account. You can see in the top table I am using the measure that returns "NA", in the bottom table is a copy of this measure that returns BLANK() instead which does not suffer from the crossjoin behaviour.
I've attached the sample file I built so that you can have a look at it yourself
@Hi @d_gosbell ,
For lesser rows it is giving, but for more number of billing accounts it is giving out of memory exception.We have nearly 1k billing accounts
only if we add that customer aged bucket it is giving out of memory exception (for all 1k billing account selected.)
The other issue here could be doing a SUMX using the entire table as the first parameter, potentially changing the pattern from SUMX( <table> , <column> ) to SUM( <column> ) will give you some increase in performance and possibly a drop in memory usage. But I'd suggest changing your approach entirely. I think the following measure which just does a single scan over the due_days will perform much better.
Max Age Bucket =
VAR maxDue = max('U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[due_days])
RETURN SWITCH (TRUE(),
maxDue >=1 && maxDue <= 30 , "1-30 Days",
maxDue >=31 && maxDue <= 60 , "31-60 Days",
maxDue >=61 && maxDue <= 90 , "61-90 Days",
maxDue >=91 && maxDue <= 120 , "91-120 Days",
maxDue >=121 && maxDue <= 150 , "121-150 Days",
maxDue >=151 && maxDue <= 180 , "151-180 Days",
maxDue >180 , "180+ Days"
)
And I don't think you need all of those calculated columns. I think a better approach would be to replace that set of calculated columns with measures using a format like the following:
31-60 DPD =
CALCULATE(
SUM('U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[due_amount])
, 'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[due_days] >=31
, 'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[due_days] <=60
)
This should save you memory by not storing the repeated copies of due_amount in calculated columns in memory and the simple scans like this should probably not be noticably slower than the calc column based approach (although this could be affected by the overall design of your model so it's worth comparing this approach against your existing columns using the performance analyzer)
Hi @d_gosbell ,
ideally my requirment for customer aged bucket is below
1. If the amount in the specific bucket is greater than 50 aud && 2. . Based on due days fetch the relevant aged debt bucket and update this field with the same aged debt bucket for eg: if a customer have 100 aud in 1-29_DPD, 80 aud in 30-59_DPD and 20 aud in 60-89_DPD the the oldest aged debt bucket will be 30-59_DPD. |
i should write as below
customer_aged_bucket =
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[180+ DPD]
) > 50,
"180+ DPD",
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[150_180_DPD]
) > 50,
"151-180_DPD",
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[120_150_DPD]
) > 50,
"121-150_DPD",
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[90_120_DPD]
) > 50,
"91-120_DPD",
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[60_90_DPD]
) > 50,
"61-90_DPD",
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[30_60_DPD]
) > 50,
"31-60_DPD",
IF (
SUMX (
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM',
'U2C_ARIA_ODS_SCHEMA FACT_AGED_DEBT_EOM'[1_30_DPD]
) > 50,
"1-30_DPD",
BLANK()
)
)
)
)
)
)
)
After seeing your solution, now i am wring for customer aged buckt as below:
That requirement for > 50 aud is something new, your original post only had > 0. So are you saying my last suggestion works for the >0 condition, but you are still getting out of memory issues trying to do >50?
If you need to check if it's greater than a specific amount you will need to retain the check on the actual bucket amounts. That will require extra scans, but I can't see anything obvious that would require large materializations that would cause out of memory issues. There might be something else in your model design that we can't see.
Can you post a screenshot of the relationships between the fact and billing and customer dimensions?
That all looks OK. How many records do you have in the Customer and Billing dimensions and in the fact table? And how much RAM do you have on your pc? Have you run task manager to see if the report actually is consuming a large amount of RAM?
Hi @d_gosbell ,
It is using 412 MB memry and 1.3% cpu usage.
select count(*) from [U2C_ARIA_ODS_SCHEMA].FACT_AGED_DEBT_EOM
42746
select count(*) from [U2C_ARIA_ODS_SCHEMA].[DIM_BILLING_ACCOUNT]
9841
select count(*) from [P2O_SFDC_ODS_SCHEMA].[DIM_CUSTOMER]
71078
There does appear to be a memory issue when you have multiple comparisons to measures in a SWITCH statement. This looks like a bug to me as the following variation works almost instantly on a model with the same table sizes as yours. You can see that the logic is identical, I'm just caching the measure values in variables (which should not necessary)
Max Age Bucket2a =
var _180 = [180+_DPD]
var _150 = [150-179_DPD]
var _120 = [120-149_DPD]
var _90 = [90-119_DPD]
var _60 = [60-89_DPD]
var _30 = [30-59_DPD]
var _1 = [1-29_DPD]
RETURN SWITCH (TRUE()
,_180 >50,"180+ Days"
,_150 >50,"150-179_DPD"
,_120 >50,"120-149_DPD"
,_90 >50,"90-119_DPD"
,_60 >50,"60-89_DPD"
,_30 >50,"30-59_DPD"
,_1 >50,"1-29_DPD"
)
Hi @Anonymous ,
I might have missed some of the requirements here. So please explain if you need the bucketing in static way (through a column) or should it depend on filters/react to slicing and dicing, so as a measure?
Thanks.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi @Anonymous
if "customer_aged_bucket" is a calculated column, then the filter context for the SUMX will be empty. Therefore you see the values for all rows. Transform the row-context into a filter context by wrapping SUMX into a CALCULATE-statement.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi @ImkeF ,
Can you please suggest me how i can write the same formula using calculate instead of sumx