Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
[I apologize as I believe I posted this first under the wrong forum - deleted that post and reposting here]
Hi everyone, I'm working with a model that has a large table of contracts (100s of thousands) with several date columns such as (creation_date, application_date, effective_date, expiration_date, cancellation_date, etc.) The table also has columns for contract_id and customer_id.
The model uses a disconnected calendar table to produce the report visuals. This calendar table has columns for the report type (e.g. Monthly, Quarterly, or Annual), and corresponding calendar periods and period end dates. It has a little more columns and optiosn than this, but to give you an idea of how it works, this is what it generally looks like.
report_type | calendar_period | start_date | end_date |
Monthly | Jan-2017 | 01/01/2017 | 31/01/2017 |
Monthly | ... | ... | ... |
Monthly | Aug-2022 | 01/08/2022 | 31/08/2022 |
Quarterly | Q1-2017 | 01/01/2017 | 31/03/2017 |
Quarterly | ... | ... | ... |
Quarterly | Q3-2022 | 01/07/2022 | 30/09/2022 |
Annual | 2017 | 01/01/2017 | 12/31/2017 |
Annual | ... | ... | ... |
Annual | 2022 | 01/01/2022 | 12/31/2022 |
The result is that a lot of the measures in the model end up following the same structure:
MEASURE 'Some Measures Table'[Measure 1] =
VAR period_start = MIN ( 'ReportingCalendar'[start_date] )
VAR period_end = MAX ( 'ReportingCalendar'[end_date] )
RETURN CALCULATE (
[Some Base Measure],
// Some additional filters
KEEPFILTERS ( contracts_table[contract_status] <> "QUOTE" ),
KEEPFILTERS ( contracts_table[_is_original] ),
// Some date based filters
KEEPFILTERS ( contracts_table[application_date] <= period_end )
)
The above works really well in terms of making the DAX code modular and reducing the duplicaiton of code, but I found that the date filters are increasing the time it takes to calculate measures by 4x or more, and this impact is being propagated throughout my report as the business logic is such that date filters get used everywhere and in increasingly complex ways. For example, here's an example of a complex higher level measure:
MEASURE 'Some Measures Table'[Complex Measure] =
VAR period_start = MIN ( 'ReportingCalendar'[start_date] )
VAR period_end = MAX ( 'ReportingCalendar'[end_date] )
RETURN CALCULATE (
[Base Measure],
KEEPFILTERS(product_dimension[type] = "x"),
KEEPFILTERS(
contracts_table[application_date] >= period_start &&
contracts_table[application_date] <= period_end
),
KEEPFILTERS(contracts_table[expiry_date] > period_start || ISBLANK([expiry_date])),
KEEPFILTERS(
contracts_table[contract_status] <> "CANCEL" ||
contracts_table[cancellation_date] > period_end
)
)
As you can see, I'm having to use three separate date columns to calculate this measure. In this case I'm looking to calculate the base measure for contracts that were applied for in the calendar period and were still in effect by the end of the period (i.e. were not cancelled or did not expire before the end of the period). The base measure here can be a sum of contract values, or a count of contracts, or a unique count of customers. All of these are use cases in the report.
The net result of this is that some visuals in the report are taking anywhere between 3-4 minutes to load!!!
I thought of creating a static aggregation table that pre-calculates these measures for all sorts of dimensions I need to filter by in the report (e.g. geography, calendar periods, products, etc.), but the challenge is that some of the measures are for customers and require doing count of unique customer IDs, which always requires going back to the actual contracts table.
I'm wondering if there are best practices that I'm missing to deal with situations like this that could result in significant performance improvement.
If anyone has any ideas, that would greatly be appreciated.
Solved! Go to Solution.
@Hamzehn , You can try like
new measure =
var _max = maxx(allselected(Date),Date[Date])
var _min = minx(allselected(Date),Date[Date])
return
calculate( sum(contracts_table[Value]), filter(contracts_table, contracts_table[expiry_date] >= _max || ISBLANK([expiry_date])) )
new measure =
var _max = maxx(allselected(Date),Date[Date])
var _min = minx(allselected(Date),Date[Date])
return
calculate( sum(contracts_table[Value]), filter(contracts_table,contracts_table[contract_status] <> "CANCEL" || contracts_table[cancellation_date] > _max
) )
@Hamzehn , with many dates you have approach with one active join and no active join. File attached for both after signature
blog to refer
Thank you for taking the time to respond to my post. I had contemplated creating a calendar table with actual dates and using it that way, but I ran across a couple of questions/challenges that I couldn't answer:
contracts_table[expiry_date] >= period_start || ISBLANK([expiry_date])
or,
contracts_table[contract_status] <> "CANCEL" || contracts_table[cancellation_date] > period_end
both of which are being done in the same measure too
Would really appreciate your input on these.
Thanks again.
@Hamzehn , You can try like
new measure =
var _max = maxx(allselected(Date),Date[Date])
var _min = minx(allselected(Date),Date[Date])
return
calculate( sum(contracts_table[Value]), filter(contracts_table, contracts_table[expiry_date] >= _max || ISBLANK([expiry_date])) )
new measure =
var _max = maxx(allselected(Date),Date[Date])
var _min = minx(allselected(Date),Date[Date])
return
calculate( sum(contracts_table[Value]), filter(contracts_table,contracts_table[contract_status] <> "CANCEL" || contracts_table[cancellation_date] > _max
) )
User | Count |
---|---|
47 | |
26 | |
22 | |
17 | |
15 |
User | Count |
---|---|
55 | |
34 | |
18 | |
17 | |
15 |