Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get inspired! Check out the entries from the Power BI DataViz World Championships preliminary rounds and give kudos to your favorites. View the vizzies.

Reply
mb769
Helper I
Helper I

DAX query needs optimization

The basic measure I use in my report, is the number of staff members (in a given acadmic year, selected with a slicer in my DAX date table 'AcademicDate'). This is simple, but I can only count the employees which have a Full Time Equivalent above 0; which means, they must have worked in that year.

This gives following measure:

NumberOfEmployees = 
CALCULATE(DISTINCTCOUNT(TASK[EMPLOYEE_ID]),
    FILTER(TASK, [FTE] > 0)
)

For context: The employees' jobs are divided in different "tasks"; each with a starting and ending date, and a 'volume' (determined by a numerator and denominator. Moreover, an employee can take different kinds of "leave" (like parental leave, sickness, temporary detachment to another institution, ...) for a part or the whole of their tasks (based on another numerator; but not all kinds of "leave" may be included in the calculation of the measure FTE.

My measures are working, but the calculation of FTE is rather heavy. Since NumberOfEmployees is the basic measure I use for tons of other measure calculations, optimization of this measure is important. Some visuals are not loading because of this (Not enough resources).

 

Does anyone have an idea how to improve?

FTE = 

VAR _dates = FILTER (AcademicDate, AcademicDate[Date] <= MAX(AcademicDate[Date]) && AcademicDate[Date] >= MIN(AcademicDate[Date]))

VAR _merge = FILTER(CROSSJOIN(_dates, TASK), [Date] >= TASK[StartingDate] && [Date] <= TASK[EndingDate])

VAR _withActualWorkedHours =
ADDCOLUMNS (
    _merge, "ActualWorkedHours", TASK[Numerator] -
    SUMX (
        FILTER (
            LEAVE,
            LEAVE[StartingDate] <= AcademicDate[Date] && LEAVE[EndingDate] >= AcademicDate[Date] && LEAVE[TASK_ID] = TASK[TASK_ID]),
    LEAVE[Numerator])
)

VAR _includeOrNot =
ADDCOLUMNS (
    _withActualWorkedHours, "Include", CALCULATE(MAXX(LEAVE, LEAVE[Include]), FILTER(LEAVE,LEAVE[StartingDate]<=AcademicDate[Date] && LEAVE[EndingDate] >= AcademicDate[Date] && TASK[TASK_ID] = LEAVE[TASK_ID]))
)

RETURN
CALCULATE (
    SUMX(_includeOrNot, IF(OR([Include]=1,ISBLANK([Include])),[ActualWorkedHours],0) / TASK[Denominator]) / (DATEDIFF(MIN(AcademicDate[Date]), MAX(AcademicDate[Date]),DAY)+1)
)

 

NB: This calculates the exact value of FTE, which I need in other calculations. The only thing I need now is to know whether or not FTE > 0

1 ACCEPTED SOLUTION
Bibiano_Geraldo
Super User
Super User

Hi @mb769 ,
To optimize the code, I avoided using CROSSJOIN, which was generating unnecessary combinations of dates and tasks. Instead, I applied direct filters on the TASK and LEAVE tables, reducing the computational load. I also used variables to calculate values only once, preventing repeated calculations and improving efficiency. Finally, I simplified the use of SUMX and ADDCOLUMNS, reorganizing the functions to reduce processing time and make the code more straightforward and faster.

FTE = 
VAR _dateRange = FILTER(
    AcademicDate, 
    AcademicDate[Date] <= MAX(AcademicDate[Date]) && AcademicDate[Date] >= MIN(AcademicDate[Date])
)
VAR _tasksInDateRange = 
    FILTER(
        TASK,
        TASK[StartingDate] <= MAX(AcademicDate[Date]) && TASK[EndingDate] >= MIN(AcademicDate[Date])
    )
VAR _leavePeriods = 
    FILTER(
        LEAVE, 
        LEAVE[StartingDate] <= MAX(AcademicDate[Date]) && LEAVE[EndingDate] >= MIN(AcademicDate[Date])
    )
VAR _actualWorkedHours = 
    ADDCOLUMNS(
        _tasksInDateRange,
        "ActualWorkedHours",
        TASK[Numerator] - SUMX(
            FILTER(
                _leavePeriods,
                LEAVE[TASK_ID] = TASK[TASK_ID]
            ),
            LEAVE[Numerator]
        )
    )
VAR _includeLeave = 
    ADDCOLUMNS(
        _actualWorkedHours,
        "Include",
        CALCULATE(
            MAXX(
                _leavePeriods, 
                LEAVE[Include]
            )
        )
    )
RETURN 
    SUMX(
        _includeLeave,
        IF(
            OR([Include] = 1, ISBLANK([Include])), 
            [ActualWorkedHours], 
            0
        ) / TASK[Denominator]
    ) / (DATEDIFF(MIN(AcademicDate[Date]), MAX(AcademicDate[Date]), DAY) + 1)

View solution in original post

6 REPLIES 6
Bibiano_Geraldo
Super User
Super User

Hi @mb769 ,
To optimize the code, I avoided using CROSSJOIN, which was generating unnecessary combinations of dates and tasks. Instead, I applied direct filters on the TASK and LEAVE tables, reducing the computational load. I also used variables to calculate values only once, preventing repeated calculations and improving efficiency. Finally, I simplified the use of SUMX and ADDCOLUMNS, reorganizing the functions to reduce processing time and make the code more straightforward and faster.

FTE = 
VAR _dateRange = FILTER(
    AcademicDate, 
    AcademicDate[Date] <= MAX(AcademicDate[Date]) && AcademicDate[Date] >= MIN(AcademicDate[Date])
)
VAR _tasksInDateRange = 
    FILTER(
        TASK,
        TASK[StartingDate] <= MAX(AcademicDate[Date]) && TASK[EndingDate] >= MIN(AcademicDate[Date])
    )
VAR _leavePeriods = 
    FILTER(
        LEAVE, 
        LEAVE[StartingDate] <= MAX(AcademicDate[Date]) && LEAVE[EndingDate] >= MIN(AcademicDate[Date])
    )
VAR _actualWorkedHours = 
    ADDCOLUMNS(
        _tasksInDateRange,
        "ActualWorkedHours",
        TASK[Numerator] - SUMX(
            FILTER(
                _leavePeriods,
                LEAVE[TASK_ID] = TASK[TASK_ID]
            ),
            LEAVE[Numerator]
        )
    )
VAR _includeLeave = 
    ADDCOLUMNS(
        _actualWorkedHours,
        "Include",
        CALCULATE(
            MAXX(
                _leavePeriods, 
                LEAVE[Include]
            )
        )
    )
RETURN 
    SUMX(
        _includeLeave,
        IF(
            OR([Include] = 1, ISBLANK([Include])), 
            [ActualWorkedHours], 
            0
        ) / TASK[Denominator]
    ) / (DATEDIFF(MIN(AcademicDate[Date]), MAX(AcademicDate[Date]), DAY) + 1)

Wow, this certainly improves a lot!

I never realised that I could perfectly omit the CROSSJOIN

Well, I spoke too soon. I cannot omit this CROSSJOIN.
I need the combiniation between DATE and TAKS, because a LEAVE can be for only a small period of a TASK. The CROSSJOIN function allowed me to check on day-by-day basis whether I had to take a LEAVE into account.

Without the CROSSJOIN, the LEAVE is substracted for the whole of the TASK.

Hi @mb769 ,

I made some changes here to include the CROSSJOIN, please let me know if was optimized:

FTE = 
VAR _dateRange = 
    FILTER(
        AcademicDate, 
        AcademicDate[Date] >= MIN(AcademicDate[Date]) &&
        AcademicDate[Date] <= MAX(AcademicDate[Date])
    )

VAR _taskRange =
    FILTER(
        TASK,
        TASK[StartingDate] <= MAX(AcademicDate[Date]) &&
        TASK[EndingDate] >= MIN(AcademicDate[Date])
    )

VAR _expandedTasks =
    FILTER(
        CROSSJOIN(_dateRange, _taskRange),
        AcademicDate[Date] >= TASK[StartingDate] &&
        AcademicDate[Date] <= TASK[EndingDate]
    )

VAR _leaveFiltered =
    FILTER(
        LEAVE, 
        LEAVE[StartingDate] <= MAX(AcademicDate[Date]) &&
        LEAVE[EndingDate] >= MIN(AcademicDate[Date])
    )

VAR _actualWorkedHours =
    ADDCOLUMNS(
        _expandedTasks,
        "ActualWorkedHours",
        TASK[Numerator] - SUMX(
            FILTER(
                _leaveFiltered,
                LEAVE[StartingDate] <= AcademicDate[Date] &&
                LEAVE[EndingDate] >= AcademicDate[Date] &&
                LEAVE[TASK_ID] = TASK[TASK_ID]
            ),
            LEAVE[Numerator]
        )
    )

VAR _includeLeave =
    ADDCOLUMNS(
        _actualWorkedHours,
        "Include",
        CALCULATE(
            MAXX(
                _leaveFiltered, 
                LEAVE[Include]
            ),
            FILTER(
                _leaveFiltered, 
                LEAVE[StartingDate] <= AcademicDate[Date] &&
                LEAVE[EndingDate] >= AcademicDate[Date] &&
                LEAVE[TASK_ID] = TASK[TASK_ID]
            )
        )
    )

RETURN 
    SUMX(
        _includeLeave,
        IF(
            OR([Include] = 1, ISBLANK([Include])), 
            [ActualWorkedHours], 
            0
        ) / TASK[Denominator]
    ) / (DATEDIFF(MIN(AcademicDate[Date]), MAX(AcademicDate[Date]), DAY) + 1)

Happy it was helpful.

 

Thank you for your feedback.

danextian
Super User
Super User

I dont have your model so I am only going to mention some of the best practices.

 

VAR _dates = FILTER (AcademicDate, AcademicDate[Date] <= MAX(AcademicDate[Date]) && AcademicDate[Date] >= MIN(AcademicDate[Date]))
--do you need all the columns in AcademicDate, if not just filter the relevant columns and not the whole table. You can use KEEPFILTERS('table[column] and the condition) or FILTER(VALUES('table[column]))...


VAR _merge = FILTER(CROSSJOIN(_dates, TASK), [Date] >= TASK[StartingDate] && [Date] <= TASK[EndingDate])
--reduce the number of rows first by filtering a table FILTER(TASK, TASK[Date] >= TASK[StartingDate] && TASK[Date] >= TASK[EndingDate])
--again, do you need all columns from these two tables
--pick just the needed columns SUMMARIZE(FILTER(TASK, TASK[Date] >= TASK[StartingDate] && TASK[Date] >= TASK[EndingDate]), [column])

 

The bottomline is pick only the relevant columns and not the whole table itself.





Dane Belarmino | Microsoft MVP | Proud to be a Super User!

Did I answer your question? Mark my post as a solution!


"Tell me and I’ll forget; show me and I may remember; involve me and I’ll understand."
Need Power BI consultation, get in touch with me on LinkedIn or hire me on UpWork.
Learn with me on YouTube @DAXJutsu or follow my page on Facebook @DAXJutsuPBI.

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code FABINSIDER for a $400 discount!

FebPBI_Carousel

Power BI Monthly Update - February 2025

Check out the February 2025 Power BI update to learn about new features.

March2025 Carousel

Fabric Community Update - March 2025

Find out what's new and trending in the Fabric community.