Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
WorkHard
Helper V
Helper V

Count total days while excluding the same date if it was in the interval twice based on ID

I have a table with transactions tied to a project ID.

When counting project duration, I need to count the total days without counting the same days twice if there are multiple transactions occurring on the same dates.

In the data example below, Project ID 1 entire duration should be 8 and Project ID 2 should also be 8.

I do have a calendar table.

 

IDStartEndDuration
11/1/20201/2/20202
11/10/20201/15/20206
11/10/20201/15/20206
21/7/20201/10/20204
21/12/20201/15/20204
21/14/20201/15/20202

 

 

 

1 ACCEPTED SOLUTION

Hi @WorkHard 

 

The DAX to follow the M code logic would be something like this:

 

_projDuration = 
COUNTROWS(
	SUMMARIZE(
		ADDCOLUMNS(
			GENERATEALL(
				zTable2,
				FILTER(
					DISTINCT(cal[date]),
					cal[date] >= zTable2[Start]
					&& cal[date] <= zTable2[End]
				)
			),
			"idDate", [ID] & "-" & [date]
		),
		[idDate]
	)
)

 

 

Or this:

 

_projDuration2 = 
VAR listTable =
	GENERATEALL(
		zTable2,
		FILTER(
			DISTINCT(cal[date]),
			cal[date] >= zTable2[Start]
			&& cal[date] <= zTable2[End]
		)
	)
RETURN
    COUNTROWS(
        DISTINCT(
            SELECTCOLUMNS(
                listTable,
                "idDate", [ID] & "-" & [date]
            )
        )
    )

 

 

With the following outputs:

BA_Pete_0-1603631173967.png

 

I'm unsure if either of these would be very performant over very large datasets, but there's a couple of options there for you to test/compare. There's probably some scope for memory optimisation within the GENERATEALL function by not bringing in the whole of zTable2 in the first argument, but it's the combination of functions that I think might cause the most performance issues.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




View solution in original post

4 REPLIES 4
BA_Pete
Super User
Super User

Hi @WorkHard ,

 

In Power Query, go to New Source>Blank Query then in Advanced Editor paste my code over the default code. You can then follow the steps I took to complete this:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUTLUN9Q3MjAyADONIMxYHbicAULS0JRYWSOwiDmSpAG6pKERPq2GJlhkYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Start = _t, End = _t]),
    chgDateTypes = Table.TransformColumnTypes(Source, {{"Start", type date}, {"End", type date}}, "en-US"),
    addDtList = Table.AddColumn(chgDateTypes, "dtList", each List.Transform({Number.From([Start])..Number.From([End])}, each Date.From(_))),
    expandDtList = Table.ExpandListColumn(addDtList, "dtList"),
    addIdDate = Table.AddColumn(expandDtList, "idDate", each Text.Combine({[ID], Text.From([dtList], "en-GB")}, "-"), type text),
    remDupeIdDate = Table.Distinct(addIdDate, {"idDate"}),
    groupRows = Table.Group(remDupeIdDate, {"ID"}, {{"firstStart", each List.Min([Start]), type nullable date}, {"lastEnd", each List.Max([End]), type nullable date}, {"duration", each Table.RowCount(_), Int64.Type}})
in
    groupRows

 

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Hi Pete,

 

Trying to follow your logic in DAX as I'm unable to change the code in Power Query due to my setup.
Sorry for not mentioning that.

 

I would generate a table series with start and end dates first.

DateSeries = 
GENERATE(
    var _DateStart = 'projdata'[Start date]
    var _DateEnd = 'projdata'[End date]
    return
    GENERATESERIES(_DateStart , _DateEnd , 1)
)

 

And then from here, I can just count the number of days from this generated series grouped by project ID.

How would I group it from here by Project ID? WIth a  measure or a calculated column?

 

Hi @WorkHard 

 

The DAX to follow the M code logic would be something like this:

 

_projDuration = 
COUNTROWS(
	SUMMARIZE(
		ADDCOLUMNS(
			GENERATEALL(
				zTable2,
				FILTER(
					DISTINCT(cal[date]),
					cal[date] >= zTable2[Start]
					&& cal[date] <= zTable2[End]
				)
			),
			"idDate", [ID] & "-" & [date]
		),
		[idDate]
	)
)

 

 

Or this:

 

_projDuration2 = 
VAR listTable =
	GENERATEALL(
		zTable2,
		FILTER(
			DISTINCT(cal[date]),
			cal[date] >= zTable2[Start]
			&& cal[date] <= zTable2[End]
		)
	)
RETURN
    COUNTROWS(
        DISTINCT(
            SELECTCOLUMNS(
                listTable,
                "idDate", [ID] & "-" & [date]
            )
        )
    )

 

 

With the following outputs:

BA_Pete_0-1603631173967.png

 

I'm unsure if either of these would be very performant over very large datasets, but there's a couple of options there for you to test/compare. There's probably some scope for memory optimisation within the GENERATEALL function by not bringing in the whole of zTable2 in the first argument, but it's the combination of functions that I think might cause the most performance issues.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Hi @WorkHard ,

 

Not sure if following the M code logic is the most efficient course to take with DAX.

I'll have a think about it now and see if I can come up with a DAX solution that fits. If not, I'll hand you over to one of the DAX specialists on the team.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.