cancel
Showing results for
Did you mean:

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Frequent Visitor

## Holiday Calculation: From Start and End Date to "Aggregation Graph"

Hello everyone, Hello Dax-Experts,

I'm new to Power BI, and I have only a limited Knowledge of DAX.

The following Problem keeps driving me nuts the whole last week:

I have an holiday table with a start and an end Date. There is a Coloumn with the qualification and a lot of other Coloumns with other details like the Name.

I would like to show a graph, which shows the sum of all people who are on holiday on the same day for every day in the year. So the X-Axis should display the year from January the first to 31. december. The Y-Axis should show me how many people are on holiday at the same time. I would like to be able to add a slicer to filter the qualification, name and so on.

To do this, i thought it should be possible to split up the Start and End Date to a table in which every row is displaying one day of the holiday.

2 ACCEPTED SOLUTIONS
Community Champion

Hi @Chemiefrik !

It seems we can achieve this by just slightly modifying @OwenAuger's solution here

http://community.powerbi.com/t5/Desktop/Generate-a-schedule-table/m-p/38834#M14436

```Holiday Schedule =
SUMMARIZE (
GENERATE (
HolidayTable,
CALCULATETABLE (
VALUES ( 'Calendar'[Date] ),
DATESBETWEEN (
'Calendar'[Date],
HolidayTable[Start Date],
HolidayTable[End Date]
)
)
),
HolidayTable[Name],
'Calendar'[Date],
HolidayTable[Qualification]
)```

Here's the result...

and the slicers you mentioned you'd like to use seem to work as well

Hope this helps and works in your case!

EDIT: The Qualification Slicer was not included in the previous pictures

Community Champion

Alternatively, here's my previous method. This was done without creating a new transformed table. Just a measure. Well, a measure and a date table...

I had a table where each record had a PersonID, a start date, and an end date. I had a date table with no active relationship to the other table. Having no relationship is important. Then I created a measure to count based on start and end dates. Put on a visual with some aggregation of a period dates (i.e. week, month, etc), this would give a count of active records during that time period.

Active Count = CALCULATE(

DISTINCTCOUNT(Table[PersonID]),

FILTER(

Table,

Table[Start Date] <= LASTDATE(DateTable[Date]) &&

Table[EndDate] >= FIRSTDATE(DateTable[Date])

)

)

The FIRSTDATE/LASTDATE thing always feels a little counterintuitive to me. The start and end look backward, but think of it this way: if they started anytime before the end of the period, then they were active during the period. Also if they ended anytime after the beginning of the period (whether that's during the period or after the end of the period), they were still active during the period. Make sense?

Also, @Chemiefrik Dax-Experts: Daxperts?

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

13 REPLIES 13
Frequent Visitor

Awesome, i am beginning to Understand how this formula works. It's like putting on the windshield wipers in heavy Rain 🙂

At the Moment i'm still trying to solve the same Problem with a time format instead of a date format.

The best solution so far was to group every Time in Hour segments. (0:00 to 0:59 and so on)

And to make 24 single Diagramms and given every one two filters which do the same as the dax code.

These Diagramms can be displayed without showing the Y-Axis. So that there is only a simple coloum in each diagramm.

When placed all in Order it looks just like a normal Chart.

Thank you, for taking your time to explain to me, what the formula ist doing.

Community Champion

@Chemiefrik fantastic, glad to hear it works. For the time version, did you see my suggested solution way back on page 1 of this thread? There's probably a good halfway point between that solution and yours. Create that time table that I described and add an Hour column to it. Then you'd just use that same suggested formula and use TimeTable[Hour] as the axis on your chart.

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Community Champion

Hi @Chemiefrik !

It seems we can achieve this by just slightly modifying @OwenAuger's solution here

http://community.powerbi.com/t5/Desktop/Generate-a-schedule-table/m-p/38834#M14436

```Holiday Schedule =
SUMMARIZE (
GENERATE (
HolidayTable,
CALCULATETABLE (
VALUES ( 'Calendar'[Date] ),
DATESBETWEEN (
'Calendar'[Date],
HolidayTable[Start Date],
HolidayTable[End Date]
)
)
),
HolidayTable[Name],
'Calendar'[Date],
HolidayTable[Qualification]
)```

Here's the result...

and the slicers you mentioned you'd like to use seem to work as well

Hope this helps and works in your case!

EDIT: The Qualification Slicer was not included in the previous pictures

Community Champion

Alternatively, here's my previous method. This was done without creating a new transformed table. Just a measure. Well, a measure and a date table...

I had a table where each record had a PersonID, a start date, and an end date. I had a date table with no active relationship to the other table. Having no relationship is important. Then I created a measure to count based on start and end dates. Put on a visual with some aggregation of a period dates (i.e. week, month, etc), this would give a count of active records during that time period.

Active Count = CALCULATE(

DISTINCTCOUNT(Table[PersonID]),

FILTER(

Table,

Table[Start Date] <= LASTDATE(DateTable[Date]) &&

Table[EndDate] >= FIRSTDATE(DateTable[Date])

)

)

The FIRSTDATE/LASTDATE thing always feels a little counterintuitive to me. The start and end look backward, but think of it this way: if they started anytime before the end of the period, then they were active during the period. Also if they ended anytime after the beginning of the period (whether that's during the period or after the end of the period), they were still active during the period. Make sense?

Also, @Chemiefrik Dax-Experts: Daxperts?

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Regular Visitor

this solution only works if you deactivate the relationship

Community Champion

@Chemiefrik go with @KHorseman's solution - seems more efficient - no need to create an extra table!

Community Champion

Since the other thread was my question I guess I should clarify why I asked for that table when I already had the Active Count formula in hand. It might help readers understand which use case calls for which solution, because both solutions technically work. Basically it's because Active Count retains that two-date structure and can't be related to a date table. If all you want is the count and maybe a few simple derivitives of it (average Active Count per week maybe) it's no big deal. But the more time intelligence stuff you want to do alongside it, the harder it becomes (compare average Active Count per week in the last six months with the same period last year. When you can write that formula, Grasshopper, you may leave the temple). My specific case was essentially predicted future payroll. I could do it without the schedule table, but with the table I can write very simple sum and count measures for things like total hours per week, instead of accomplishing the same with giant mutant versions of Active Count.

TL;DR: if you just want that count based on start and end dates, use the Active Count formula pattern. If you want anything too much more complicated than that, you may want to consider the schedule table.

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Frequent Visitor

Awesome Job Daxperts, Thank you!

This works just ultra smooth 🙂

I tried both suggested solutions and they work both fine. In my current version I'm using the "New Table" Solution, because the other "FirstDate/LastDate" solution ist causing me a headache to figure out how it really works 🙂

I like to have a "deep" understanding what the Programm ist doing to my data to be able to answer to any questions.

These solutions still work smooth with 10.000 rows of Data! That's a lot of Holiday 🙂

Thanks a lot.

I am just wondering ist in possible to adjust this solution for use with a Time Format like 06:44 to 18:52. (HH:MM)

And sort these on an X-Axis from 0 to 24 Hours.

Greetings from Rainy-Hamburg,

Chemiefrik

Community Champion

Yeah it is a little hard to picture what's really happening in that Firstdate/Lastdate solution because half of the magic that makes it work isn't even in the formula itself. Let me see if I can walk you through it.

You have a visual, say a line chart for example. On the X-axis you have a column from that DateTAble that denotes a range. Let's go with Month. Each tick on the X-axis is one month, which means that each tick contains an invisible reference to every date in that month on the DateTable.

Here's the code for that formula again:

Active Count = CALCULATE(

DISTINCTCOUNT(Table[PersonID]),

FILTER(

Table,

Table[Start Date] <= LASTDATE(DateTable[Date]) &&

Table[EndDate] >= FIRSTDATE(DateTable[Date])

)

)

Remember that we didn't create a relationship between DateTable and your data table (which I've just called Table. I've named too many things with the same word. Oops). This is because a relationship would pass its own invisible filter context that would only affect one column from Table, and we don't want that. We want to tell the formula what to do with two of its columns, StartDate and EndDate.

Active Count in plain English is "a count of unique PersonIDs that were active at any time during a given period." They could have A) already been active before the period, B) they could have started and ended during the period, or C) they could have started during the period but not ended until later. All of those count. Logically that means that "active" = had a start date any time before the period ended and have a start date any time after it started.

Now let's put all the pieces together.

Each tick on the X-axis passes a range of approximately 30 days to the Active Count formula. Active Count then looks back at Table and filters it down to only those rows with both 1) a StartDate on or before the last date in that range, and 2) an End Date on or after the first date in that range. All other rows are ignored because of that FILTER() statement. Then it takes those remaining rows and does a count of unique PersonIDs from them.

Make sense?

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Community Champion

Active Count in plain English is "a count of unique PersonIDs that were active at any time during a given period."

They could have:

A) already been active before the period,

B) they could have started and ended during the period, or

C) they could have started during the period but not ended until later.

All of those count.

Logically "active" = had a start date any time before the period ended AND have an end date any time after it started

@KHorseman - I think the above summarizes perfectly why / how your solution works! (and no relationship between the tables!)

If however anyone needs more - here's the link

http://www.powerpivotpro.com/2013/04/counting-active-rows-in-a-time-period-guest-post-from-chris-cam...

Next: Test with blank end dates...

Community Champion

Anyone with a blank end date will be ignored because it's an AND condition. I was about to suggest that you could switch it to an AND NOT condition, i.e. AND have an end date that is not before the period started.

NOT(Table[End Date] < FIRSTDATE(DateTable[Date]))

...but then I remembered that the reason the current formula will ignore them is that a less than/greater than/equal to comparison will treat blanks as zeros so no matter how you phrase the comparison it will always see blank end dates as having ended before any possible time period.

I can think of ways to force it to count anyone with a blank end date, but I probably wouldn't recommend it. In most business cases I've seen a missing end date means it's a bad entry that you wouldn't want to count. You could do fake end dates. Call the existing end date column "Proto-End Date" or something, then add a new column like

End Date = IF(ISBLANK(Table[Proto-End Date]), Table[Start Date] + 90, Table[Proto-End Date])

...or whatever value you want instead of 90 days.

Without filling in the blanks with default data, the only other options I can think of are to ignore them as we already are, or force count them. Any measure that just forces a count on blank end dates will effectively treat anyone with a blank end date as immortal. They'll start but never end. If you want that it's a simple change:

Active Count = CALCULATE(
DISTINCTCOUNT(Table[PersonID]),
FILTER(
Table,
Table[Start Date] <= LASTDATE(DateTable[Date]) &&
(Table[EndDate] >= FIRSTDATE(DateTable[Date]) || ISBLANK(Table[EndDate]))
)
)

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Community Champion

And to address the two other points:

If you think about how few rows and columns ActiveCount is ever working with at once you'll probably get why it runs so smoothly even with 10,000 rows. The Schedule Table solution is efficient for a very similar reason. In a way they both do exactly the same thing. The difference is mostly about what form the output takes. ActiveCount is a measure that's calculated live, whereas Schedule Table predigests the data and gives you a table that looks pretty similar to what's happening behind the scenes in Active Count.

As far as I can tell both of these solutions should work equally well for time ranges, but with a few complications: first you would need a time table similar to DateTable. If you want to drill down to seconds you'll need a row for each second in a day, all 86,400 of them. Second you couldn't use FIRSTDATE or LASTDATE. I believe the new formula would be something like:

Active Count = CALCULATE(

DISTINCTCOUNT(Table[PersonID]),

FILTER(

Table,

Table[StartTime] <= LASTNONBLANK(TimeTable[Time], 1) &&

Table[EndTime] >= FIRSTNONBLANK(TimeTable[Time], 1)

)

)

The syntax for FIRSTNONBLANK and LASTNONBLANK looks a little weird. Unlike FIRSTDATE, those formulas require both a column and an expression, but in this case there is no special expression we want it to evaluate for that column. 1 is synonymous for true, so it's sort of like saying "give me the first non-blank row of the Time column, yes."

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

Helper I

You could create table using CALENDER(BeginDate; EndDate) function.

This returns a table with with said dates and you can then add columns (ADDCOLUMNS) that include qualification and employee name.

Announcements

#### Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.