Check your eligibility for this 50% exam voucher offer and join us for free live learning sessions to get prepared for Exam DP-700.
Get StartedDon't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.
Hi community,
I have a performance issue with the following simple running total measure:
BLCost =
VAR _MaxDate = MAX ( TasksBLTP[TimeByDay] )
VAR _Result =
CALCULATE (
SUM ( TasksBLTP[TaskBaselineCost] ),
ALL ( 'Date' ),
'Date'[Date] <= _MaxDate
)
RETURN
_Result
This is my data model:
The TasksBLTP table has in total around 200k rows, in the filter context I'm running the measure currently the table has only 73 rows!
The Date table currently contains dates from 1990 to 2030, so in total 14.975 rows. I understand that this might is quite large for a date table, but due to insights required on other tables i currently need such a wide date table.
When I'm running the [BLCost] measure, in a line chart with 'Date'[Date] on the the X-Axis and [BLCost] on the Y-Axis, the query takes around 10sec. to complete.
This to me seems extremely slow considering that there are only 73 rows in TasksBLTP for the current context.
After testing a bit, I discovered that the large Date table is responsible for the poor performance.
The current selection of 73 rows on TasksBLTP , would only require dates from 2024. If I limit the Date table to contain only dates for 2024, the query is actually very fast:
The same query with 10 years in the date table, takes around 700ms.
The time requried to run the query seems to grow exponantially with the number of rows in the Date table.
How can I solve the issue? Should I use a separate Date table for the tables in the model for which I only need limited number of years?
I tried also to limit the the calculation for a certain time period, but the "ALL ( 'Date')" seems to be the driver for the poor performance.
Thanks in advance and Best regards
Solved! Go to Solution.
@ITManuel This is the alternate running total method: Better Running Total - Microsoft Fabric Community
You also might be interested in this post where it is a similar situation with a large date dimension that wreaks utter havoc on CALCULATE. CALCULATE Challenge - Round 1 - Microsoft Fabric Community
Hi @Greg_Deckler & @lbendlin ,
thank you, I'm surprised and impressed about this content in relation "CALCUHATE" 🤣😂🤣.
I was not aware that there is an alternative to CALCULATE for running totals and so many other things.
I tested CALCULATE vs. SUMX for running totals against different sizes of Date and Fact table. SUMX is faster with a very large Date table. However it becomes slower than the solution with CALCULATE with smaller sized Data tables depending on the size of the fact table.
I tested the following:
RT CALC =
CALCULATE (
SUM ( TasksBLTP[TaskBaselineCost] ),
ALL ( 'Date' ),
'Date'[Date] <= MAX ( TasksBLTP[TimeByDay] )
)
RT SUMX =
VAR _MaxDate =
MAX ( TasksBLTP[TimeByDay] )
VAR _T1 =
FILTER ( ALLSELECTED ( TasksBLTP ), TasksBLTP[TimeByDay] <= _MaxDate )
VAR _Result =
SUMX ( _T1, TasksBLTP[TaskBaselineCost] )
RETURN
_Result
Date table with 41 Years: SUMX is faster 1,8x to 46x faster depending on rows in Fact table
Date table with 11 Years: CALCULATE or SUMX is faster depending on rows in Fact table
Date table with 4 Years: CALCULATE or SUMX is faster depending on rows in Fact table
The pbix file can be downloaded under https://we.tl/t-3GUEmfy0fE
Since even SUMX takes 10+sec. for most selections in my real model, I'm thinking of implementing a separate Date table only for the TaskBLTP which only contains the number of rows required for the current selection.
Any other ideas?
Best regards
Keep measuring and then use whatever is appropriate in your scenario. Follow the best practices around how to filter, when to filter etc. The key aim is cardinality reduction without impacting the final result. Cartesian turtles all the way down.
Hi,
Is this any better?
BLCost = CALCULATE(SUM(TasksBLTP[TaskBaselineCost]),datesbetween('Date'[Date],MINX(ALL('Date'),'Date'[Date]),MAX('Date'[Date]))
Hi @Ashish_Mathur ,
I tried the following:
VAR _MinDate =
MIN ( TasksBLTP[TimeByDay] )
VAR _MaxDate =
MAX ( TasksBLTP[TimeByDay] )
VAR _Result =
CALCULATE (
SUM ( TasksBLTP[TaskBaselineCost] ),
DATESBETWEEN ( 'Date'[Date], MINX ( ALL ( 'Date' ), 'Date'[Date] ), _MaxDate )
)
RETURN
_Result
But this version results in "Visual has exceeded the available resources".
instead of ALL(Date) consider using a variable definition like CALENDAR(dt"2024-01-01",TODAY())
BLCost =
VAR Dates = CALENDAR(dt"2024-01-01",MAX ( TasksBLTP[TimeByDay] ))
RETURN
CALCULATE (
SUM ( TasksBLTP[TaskBaselineCost] ),
'Date'[Date] IN Dates
)
Or follow Greg's advice and use an iterator function instead (like SUMX)
Hi @lbendlin ,
I tried with
VAR _Dates = CALENDAR ( MIN ( TasksBLTP[TimeByDay] ), MAX ( TasksBLTP[TimeByDay] ) )
VAR _BLCostNew =
CALCULATE (
SUM ( TasksBLTP[TaskBaselineCost] ),
'Date'[Date] IN _Dates,
'Date'[Date] <= MAX ( TasksBLTP[TimeByDay] )
)
RETURN
_BLCostNew
...but this is not returning a running total anymore.
Could you please guide me to Greg's SUMX Solution?
@ITManuel This is the alternate running total method: Better Running Total - Microsoft Fabric Community
You also might be interested in this post where it is a similar situation with a large date dimension that wreaks utter havoc on CALCULATE. CALCULATE Challenge - Round 1 - Microsoft Fabric Community
User | Count |
---|---|
115 | |
73 | |
60 | |
48 | |
47 |
User | Count |
---|---|
173 | |
123 | |
60 | |
59 | |
57 |