The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
on another thread ( "Gaps&Islands" consecutive days in student absences ) , I got help from @Greg_Deckler when he supplied me with a link to his forumula approriatly named "cuthulhu".
Here is the full formula used as a solution in a calculated column:
Now, business has added another layer to the problem. I have an additional column named [attendance_value] that we want to filter by. [attendance_value] is always going to equal 1.0 or 0.5, and they dont want 0.5 to factor in as consecutive. I tried adding a simple filter in two ways but was not successful.
VAR __index = CALCULATE(MAX('attendance_data'[date]), 'attendance_data'[attendance_value]=1)
or
VAR __tmpTable1 = FILTER(ALL('attendance_data'),'attendance_data'[student_id]=__group && 'attendance_data'[date]<__index && 'attendance_data'[attendance_value]=1)
These changes did not yeild the right result.
Here is sample data where I manually created a column called [desired_cuthulhu] and entered each cell value to equal the result I am trying to get from [cuthuhlu]
student_id | date | attendance_value | cuthuluh | desired_cuthulhu |
269182 | 2/1/2021 | 1 | 1 | 1 |
269182 | 2/2/2021 | 1 | 2 | 2 |
269182 | 2/3/2021 | 1 | 3 | 3 |
269182 | 2/4/2021 | 1 | 4 | 4 |
269182 | 2/5/2021 | 1 | 5 | 5 |
269182 | 2/8/2021 | 0.5 | 1 | 1 |
269182 | 2/9/2021 | 1 | 2 | 1 |
269182 | 2/10/2021 | 1 | 3 | 2 |
269182 | 2/11/2021 | 1 | 4 | 3 |
269182 | 2/12/2021 | 1 | 5 | 4 |
269182 | 2/16/2021 | 0.5 | 1 | 1 |
269182 | 2/17/2021 | 1 | 2 | 1 |
269182 | 2/18/2021 | 0.5 | 3 | 1 |
269182 | 2/19/2021 | 1 | 4 | 1 |
269182 | 2/20/2021 | 1 | 5 | 2 |
Solved! Go to Solution.
Hi, @ncraft
Here is my solution.
Remove the statement at the comment you can get unconditional consecutive counts
Try to create a measure like this:
_cuthuluh_WithConditions =
VAR _FirstToLast_Table =
FILTER(
ALL( 'Table' ),
'Table'[student_id] = MAX( 'Table'[student_id] )
&& 'Table'[date] <= MAX( 'Table'[date] )
&&'Table'[attendance_value]=1//Conditions:value=1
)
VAR _Last =
CALCULATE( MAX( 'Table'[date] ), _FirstToLast_Table )
VAR _First =
CALCULATE( MIN( 'Table'[date] ), _FirstToLast_Table )
VAR _AllValue_List =
GENERATESERIES( _First, _Last, 1 )
VAR _currentList =
SUMMARIZE( _FirstToLast_Table, [date] )
VAR _MisssList =
EXCEPT( _AllValue_List, _currentList )
VAR _countRows =
COUNTROWS( _MisssList )
VAR _MaxOfMisssList =
IF(
_countRows = 0,
_First - 1,
MAXX( EXCEPT( _AllValue_List, _currentList ), [Value] )
)
VAR _diff =IF(
MAX('Table'[attendance_value])=1,//Conditions:value=1
DATEDIFF( _MaxOfMisssList, _Last, DAY ),1)
RETURN
_diff
Result:
Please refer to the attachment below for details.
Hope this helps.
Best Regards,
Community Support Team _ Zeon Zheng
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi, @ncraft
Here is my solution.
Remove the statement at the comment you can get unconditional consecutive counts
Try to create a measure like this:
_cuthuluh_WithConditions =
VAR _FirstToLast_Table =
FILTER(
ALL( 'Table' ),
'Table'[student_id] = MAX( 'Table'[student_id] )
&& 'Table'[date] <= MAX( 'Table'[date] )
&&'Table'[attendance_value]=1//Conditions:value=1
)
VAR _Last =
CALCULATE( MAX( 'Table'[date] ), _FirstToLast_Table )
VAR _First =
CALCULATE( MIN( 'Table'[date] ), _FirstToLast_Table )
VAR _AllValue_List =
GENERATESERIES( _First, _Last, 1 )
VAR _currentList =
SUMMARIZE( _FirstToLast_Table, [date] )
VAR _MisssList =
EXCEPT( _AllValue_List, _currentList )
VAR _countRows =
COUNTROWS( _MisssList )
VAR _MaxOfMisssList =
IF(
_countRows = 0,
_First - 1,
MAXX( EXCEPT( _AllValue_List, _currentList ), [Value] )
)
VAR _diff =IF(
MAX('Table'[attendance_value])=1,//Conditions:value=1
DATEDIFF( _MaxOfMisssList, _Last, DAY ),1)
RETURN
_diff
Result:
Please refer to the attachment below for details.
Hope this helps.
Best Regards,
Community Support Team _ Zeon Zheng
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
@ncraft I *think* the only change you need to your formula is:
VAR __tmpTable2 = ADDCOLUMNS(__tmpTable1,"__diff",'attendance_data'[date] - MAXX(FILTER(ALL('attendance_data'),'attendance_data'[date]<EARLIER('attendance_data'[date]) && 'attendance_data'[student_id]=EARLIER('attendance_data'[student_id]) && 'attendance_data'[attendance_value]=1),'attendance_data'[date]))
@Greg_Deckler No dice,
Making the change almost worked, but in many cases ist still doesnt, and I cant understand why. Consider the sample data below. After applying the change you suggested, 0.5 values are still counted
student_id | date | attendance_value | cuthulhu |
299740 | 11/5/2020 | 1 | 1 |
299740 | 11/6/2020 | 1 | 2 |
299740 | 11/13/2020 | 0.5 | 1 |
299740 | 12/2/2020 | 1 | 1 |
299740 | 3/9/2021 | 1 | 2 |
299740 | 3/10/2021 | 1 | 3 |
299740 | 3/11/2021 | 1 | 4 |
299740 | 3/12/2021 | 0.5 | 5 |
299740 | 3/26/2021 | 1 | 1 |
299740 | 3/30/2021 | 1 | 1 |
299740 | 3/31/2021 | 0.5 | 2 |
299740 | 4/1/2021 | 1 | 3 |
299740 | 4/2/2021 | 0.5 | 2 |
299740 | 4/5/2021 | 1 | 1 |
299740 | 4/6/2021 | 0.5 | 2 |
299740 | 4/7/2021 | 0.5 | 3 |
299740 | 4/8/2021 | 1 | 2 |
299740 | 4/12/2021 | 1 | 1 |
299740 | 4/13/2021 | 0.5 | 2 |
299740 | 4/14/2021 | 1 | 3 |
299740 | 4/15/2021 | 0.5 | 2 |
299740 | 4/19/2021 | 0.5 | 1 |
299740 | 4/20/2021 | 1 | 2 |
299740 | 4/21/2021 | 0.5 | 2 |
299740 | 4/22/2021 | 0.5 | 3 |
299740 | 4/23/2021 | 1 | 2 |
299740 | 5/7/2021 | 0.5 | 1 |
299740 | 5/10/2021 | 1 | 1 |
299740 | 5/11/2021 | 1 | 2 |
I am not sure what the solution is, but each of these attendance entries also has its own unique ID, I think what I will do as a workaround for my problem is to use Power M quary to make a new table referecing the origional table, filter out the 0.5 values, apply the cuthulhu formula, and then perform a merge quaries to put cuthuluh back on my origional table.