Solved: call of cuthuluh (cumulative consecutive values wi...

ncraft · ‎01-21-2022

on another thread ( "Gaps&Islands" consecutive days in student absences ) , I got help from @Greg_Deckler when he supplied me with a link to his forumula approriatly named "cuthulhu".

Here is the full formula used as a solution in a calculated column:

consecutive_absences =

VAR __index = CALCULATE(MAX('attendance_data'[date]))

VAR __group = CALCULATE(MAX('attendance_data'[student_id]))

VAR __tmpTable1 = FILTER(ALL('attendance_data'),'attendance_data'[student_id]=__group && 'attendance_data'[date]<__index)
VAR __tmpTable2 = ADDCOLUMNS(__tmpTable1,"__diff",'attendance_data'[date] - MAXX(FILTER(ALL('attendance_data'),'attendance_data'[date]<EARLIER('attendance_data'[date]) && 'attendance_data'[student_id]=EARLIER('attendance_data'[student_id])),'attendance_data'[date]))

VAR __max = MAXX(__tmpTable2,'attendance_data'[date])

VAR __maxStart = MAXX(FILTER(__tmpTable2,[__diff]>1),'attendance_data'[date])

VAR __tmpTable3 = FILTER(__tmpTable2,'attendance_data'[date]>=__maxStart)

VAR cuthulhu = IF(ISBLANK(__max),1,IF(__max=__index-1,COUNTROWS(__tmpTable3)+1,1))

RETURN

cuthulhu

Now, business has added another layer to the problem. I have an additional column named [attendance_value] that we want to filter by. [attendance_value] is always going to equal 1.0 or 0.5, and they dont want 0.5 to factor in as consecutive. I tried adding a simple filter in two ways but was not successful.

VAR __index = CALCULATE(MAX('attendance_data'[date]), 'attendance_data'[attendance_value]=1)

or

VAR __tmpTable1 = FILTER(ALL('attendance_data'),'attendance_data'[student_id]=__group && 'attendance_data'[date]<__index && 'attendance_data'[attendance_value]=1)

These changes did not yeild the right result.

Here is sample data where I manually created a column called [desired_cuthulhu] and entered each cell value to equal the result I am trying to get from [cuthuhlu]

student_id	date	attendance_value	cuthuluh	desired_cuthulhu
269182	2/1/2021	1	1	1
269182	2/2/2021	1	2	2
269182	2/3/2021	1	3	3
269182	2/4/2021	1	4	4
269182	2/5/2021	1	5	5
269182	2/8/2021	0.5	1	1
269182	2/9/2021	1	2	1
269182	2/10/2021	1	3	2
269182	2/11/2021	1	4	3
269182	2/12/2021	1	5	4
269182	2/16/2021	0.5	1	1
269182	2/17/2021	1	2	1
269182	2/18/2021	0.5	3	1
269182	2/19/2021	1	4	1
269182	2/20/2021	1	5	2

v-angzheng-msft · ‎01-24-2022

Hi, @ncraft

Here is my solution.

Remove the statement at the comment you can get unconditional consecutive counts

Try to create a measure like this:

_cuthuluh_WithConditions = 
VAR _FirstToLast_Table =
    FILTER(
        ALL( 'Table' ),
        'Table'[student_id] = MAX( 'Table'[student_id] )
            && 'Table'[date] <= MAX( 'Table'[date] )
            &&'Table'[attendance_value]=1//Conditions:value=1
    )
VAR _Last =
    CALCULATE( MAX( 'Table'[date] ), _FirstToLast_Table )
VAR _First =
    CALCULATE( MIN( 'Table'[date] ), _FirstToLast_Table )
VAR _AllValue_List =
    GENERATESERIES( _First, _Last, 1 )
VAR _currentList =
    SUMMARIZE( _FirstToLast_Table, [date] )
VAR _MisssList =
    EXCEPT( _AllValue_List, _currentList )
VAR _countRows =
    COUNTROWS( _MisssList )
VAR _MaxOfMisssList =
    IF(
        _countRows = 0,
        _First - 1,
        MAXX( EXCEPT( _AllValue_List, _currentList ), [Value] )
    )
VAR _diff =IF(
    MAX('Table'[attendance_value])=1,//Conditions:value=1
    DATEDIFF( _MaxOfMisssList, _Last, DAY ),1)
RETURN
    _diff

Result:

Please refer to the attachment below for details.

Hope this helps.

Best Regards,
Community Support Team _ Zeon Zheng

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

How to get your questions answered quickly -- How to provide sample data

View solution in original post

v-angzheng-msft · ‎01-24-2022

Hi, @ncraft

Here is my solution.

Remove the statement at the comment you can get unconditional consecutive counts

Try to create a measure like this:

_cuthuluh_WithConditions = 
VAR _FirstToLast_Table =
    FILTER(
        ALL( 'Table' ),
        'Table'[student_id] = MAX( 'Table'[student_id] )
            && 'Table'[date] <= MAX( 'Table'[date] )
            &&'Table'[attendance_value]=1//Conditions:value=1
    )
VAR _Last =
    CALCULATE( MAX( 'Table'[date] ), _FirstToLast_Table )
VAR _First =
    CALCULATE( MIN( 'Table'[date] ), _FirstToLast_Table )
VAR _AllValue_List =
    GENERATESERIES( _First, _Last, 1 )
VAR _currentList =
    SUMMARIZE( _FirstToLast_Table, [date] )
VAR _MisssList =
    EXCEPT( _AllValue_List, _currentList )
VAR _countRows =
    COUNTROWS( _MisssList )
VAR _MaxOfMisssList =
    IF(
        _countRows = 0,
        _First - 1,
        MAXX( EXCEPT( _AllValue_List, _currentList ), [Value] )
    )
VAR _diff =IF(
    MAX('Table'[attendance_value])=1,//Conditions:value=1
    DATEDIFF( _MaxOfMisssList, _Last, DAY ),1)
RETURN
    _diff

Result:

Please refer to the attachment below for details.

Hope this helps.

Best Regards,
Community Support Team _ Zeon Zheng

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

How to get your questions answered quickly -- How to provide sample data

Greg_Deckler · ‎01-21-2022

@ncraft I *think* the only change you need to your formula is:

VAR __tmpTable2 = ADDCOLUMNS(__tmpTable1,"__diff",'attendance_data'[date] - MAXX(FILTER(ALL('attendance_data'),'attendance_data'[date]<EARLIER('attendance_data'[date]) && 'attendance_data'[student_id]=EARLIER('attendance_data'[student_id]) && 'attendance_data'[attendance_value]=1),'attendance_data'[date]))

Follow on LinkedIn
@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!: Power BI Cookbook Third Edition (Color)

DAX is easy, CALCULATE makes DAX hard...

ncraft · ‎01-24-2022

@Greg_Deckler No dice,
Making the change almost worked, but in many cases ist still doesnt, and I cant understand why. Consider the sample data below. After applying the change you suggested, 0.5 values are still counted

student_id	date	attendance_value	cuthulhu
299740	11/5/2020	1	1
299740	11/6/2020	1	2
299740	11/13/2020	0.5	1
299740	12/2/2020	1	1
299740	3/9/2021	1	2
299740	3/10/2021	1	3
299740	3/11/2021	1	4
299740	3/12/2021	0.5	5
299740	3/26/2021	1	1
299740	3/30/2021	1	1
299740	3/31/2021	0.5	2
299740	4/1/2021	1	3
299740	4/2/2021	0.5	2
299740	4/5/2021	1	1
299740	4/6/2021	0.5	2
299740	4/7/2021	0.5	3
299740	4/8/2021	1	2
299740	4/12/2021	1	1
299740	4/13/2021	0.5	2
299740	4/14/2021	1	3
299740	4/15/2021	0.5	2
299740	4/19/2021	0.5	1
299740	4/20/2021	1	2
299740	4/21/2021	0.5	2
299740	4/22/2021	0.5	3
299740	4/23/2021	1	2
299740	5/7/2021	0.5	1
299740	5/10/2021	1	1
299740	5/11/2021	1	2

I am not sure what the solution is, but each of these attendance entries also has its own unique ID, I think what I will do as a workaround for my problem is to use Power M quary to make a new table referecing the origional table, filter out the 0.5 values, apply the cuthulhu formula, and then perform a merge quaries to put cuthuluh back on my origional table.

call of cuthuluh (cumulative consecutive values with conditions)

Helpful resources

Join our Fabric User Panel

Power BI Monthly Update - June 2025

Fabric Community Update - June 2025

How to Get Your Question Answered Quickly

Join the #PBI10 DataViz contest