Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
Anonymous
Not applicable

Really slow DAX expression to claculate student absence marks

I'm trying to write a DAX measure to SUM the number of absence marks a student has had since the last time they were marked present. To do this i've split the expression into two parts. The first part calculates the date they were last present, and the second part sums absent marks after that date. The measure works in that it returns the correct values but it is extremely slow (can take around 45 seconds). Im sure ive not written it in a very effective way but have no idea where to start to optimize it. Please can someone help?? Here is the measure;

 

AbsentMarksSincePresent = var _LastDate =
                CALCULATE(
                    MAX(t_DimDate[DateValue]),
                    FILTER(
                            t_FactStudentAttendance,
                            t_FactStudentAttendance[IsPossible]=1 && t_FactStudentAttendance[IsPresent]=1
                    )
                )
var _returnVal =
CALCULATE(
    SUM(t_FactStudentAttendance[IsAbsent]),
    FILTER(t_FactStudentAttendance,t_FactStudentAttendance[IsPossible]=1),
    FILTER(t_DimDate,t_DimDate[DateValue] > _LastDate)
)

RETURN
_returnVal
1 ACCEPTED SOLUTION
Anonymous
Not applicable

// Assuming that t_DimDate has a relationship
// for [DateValue] to some date column in
// the fact table.

[Last Presence Date] =
// When only one student is visible
// it'll return their last date of presence.
// If there are many students visible,
// it'll return the last date of presence
// of any of the students. The calculation
// happens in the time period that's currently
// visible (in the current context). So, this
// measure could potentially return BLANK if
// within the period of time no presence was
// recorded.
var Result =
    MAXX(
        CALCULATETABLE(
            t_FactStudentAttendance,
            KEEPFILTERS(
                t_FactStudentAttendance[IsPossible] = 1 
            ),
            KEEPFILTERS(
                t_FactStudentAttendance[IsPresent] = 1
            )
        ),
        t_FactStudentAttendance[Date]
    )
RETURN
    Result
    

[# Absent Marks Since Present] =
VAR LastPresenceDate = [Last Presence Date]
VAR Result =
// If we can't find the last presence date
// within the curent period of time, we can't
// calculate the number of absences from the
// non-existing date. Hence the IF.
    if( NOT ISBLANK( LastPresenceDate ),
        CALCULATE(
            // Is one field IsPresent not enough?
            // You have to also have IsAbsent? You're
            // proliferating columns and hence make
            // the model bigger. Apart from memory,
            // it may also impact the speed of the code.
            SUM( t_FactStudentAttendance[IsAbsent] ),
            KEEPFILTERS(
                t_FactStudentAttendance[IsPossible] = 1
            ),
            KEEPFILTERS(
                LastPresenceDate < t_DimDate[DateValue]
            )
        )
    )
RETURN
    Result

Your measure(s) is/are slow because you're not aware of the golden rule of DAX programming: You should never filter a table if you can filter a column. Especially filtering a full fact table is detrimental to the speed of execution.

View solution in original post

3 REPLIES 3
Anonymous
Not applicable

// Assuming that t_DimDate has a relationship
// for [DateValue] to some date column in
// the fact table.

[Last Presence Date] =
// When only one student is visible
// it'll return their last date of presence.
// If there are many students visible,
// it'll return the last date of presence
// of any of the students. The calculation
// happens in the time period that's currently
// visible (in the current context). So, this
// measure could potentially return BLANK if
// within the period of time no presence was
// recorded.
var Result =
    MAXX(
        CALCULATETABLE(
            t_FactStudentAttendance,
            KEEPFILTERS(
                t_FactStudentAttendance[IsPossible] = 1 
            ),
            KEEPFILTERS(
                t_FactStudentAttendance[IsPresent] = 1
            )
        ),
        t_FactStudentAttendance[Date]
    )
RETURN
    Result
    

[# Absent Marks Since Present] =
VAR LastPresenceDate = [Last Presence Date]
VAR Result =
// If we can't find the last presence date
// within the curent period of time, we can't
// calculate the number of absences from the
// non-existing date. Hence the IF.
    if( NOT ISBLANK( LastPresenceDate ),
        CALCULATE(
            // Is one field IsPresent not enough?
            // You have to also have IsAbsent? You're
            // proliferating columns and hence make
            // the model bigger. Apart from memory,
            // it may also impact the speed of the code.
            SUM( t_FactStudentAttendance[IsAbsent] ),
            KEEPFILTERS(
                t_FactStudentAttendance[IsPossible] = 1
            ),
            KEEPFILTERS(
                LastPresenceDate < t_DimDate[DateValue]
            )
        )
    )
RETURN
    Result

Your measure(s) is/are slow because you're not aware of the golden rule of DAX programming: You should never filter a table if you can filter a column. Especially filtering a full fact table is detrimental to the speed of execution.

Anonymous
Not applicable

@Anonymous 

Thank you so much for taking the time to respond to this, really appeciate it. That works perfectly, thank you. Are you able to possibly explain the use of KEEPFILTERS (where i had used FILTER)? Its not sometihng i've used before. Ive looked up the function but didnt really understand the explanation. 

 

Also, would the best course of action be to set up IsPossible and IsAbsent Yes/No flags in a dimension (I have register mark dimension) and use those fields to filter instead? They are currently in the fact as 0 or 1 because they are additive and they allow me to SUM possible register marks by student, course, faculty etc..

Anonymous
Not applicable

HERE is an explanation of KEEPFILTERS.

 

The best course of action is:

1. If you have an attribute in a fact table, it should not be used in slicing directly (exception: degenerate dimensions). It should be hidden and only used internally by DAX code. In such a case you don't need to bother with KEEPFILTERS.

2. If you have an attribute in a fact table and insist on letting end users use it, you most likely have to use KEEPFILTERS. This depends on WHAT you want to calculate.

3. If you want to expose a boolean flag to the end user, it's better to use descriptive text values instead of true/false or 0/1. If you don't expose the flag and only use internally, it should be a true boolean value (true/false).

4. If there's a boolean field like IsPresent, a complementary field IsAbsent is a waste of time and memory since you obtain IsAbsent by saying NOT( IsPresent ). If you want to turn such a field to an int value (to be able to sum over it), you can use INT( IsPresent ). If you want to turn false() into 1, you can use 1 - false().

 

So, all in all, having 2 fields which are logical complements of each other... is a waste of time and memory.

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors