The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
Im working on amodell regarding working ours.
I have a fact table where each row represent a person, working on one activity between two timestamp. So the fact table has the columns User, Activity, FromTimestamp, ToTimestamp.
I want to be able to do an anlysis where I can group the number of working ours within each hour. So for example if I have a fact row looking like this:
UserX;cleaning;2024-01-01 13:05:00;2024-01-01 15:10
So if I have a measure called "number of minutes worked" and I add the hour of the time dimension I wold like that row to look like:
Hour;number of minutes worked
13;55
14;60
15;10
How can I modell this?
I have access to the backend so I can change the data structure so one way is to multiple all the fact rows so it exists one row for each hour but that will encrease the data volumes.
Does enybody has a better model solution for this?
Solved! Go to Solution.
I would also use your approach and multiply the rows in the fact table.
You might not even increase the overall size of the semantic model. Cardinality (the number of distinct values) is the most important factor for model size. [FromTimestamp] and [ToTimestamp] have a high cardinality. [StartDate], [Hour] and [Minutes Worked] have a much lower cardinality, so you might even decrease the overall size of your semantic model.
I would also use your approach and multiply the rows in the fact table.
You might not even increase the overall size of the semantic model. Cardinality (the number of distinct values) is the most important factor for model size. [FromTimestamp] and [ToTimestamp] have a high cardinality. [StartDate], [Hour] and [Minutes Worked] have a much lower cardinality, so you might even decrease the overall size of your semantic model.
Thanks for the reply
Thanks for the reply
I would do as you say and split the table into multiple rows, one for each hour covered. You could do this in SQL, Power Query or DAX. Below is a sample in DAX
SELECTCOLUMNS(
GENERATE(
'Table',
VAR StartYear= YEAR( 'Table'[Start] )
VAR StartMonth = MONTH( 'Table'[Start] )
VAR StartDay = DAY( 'Table'[Start] )
VAR StartHour = HOUR( 'Table'[Start] )
VAR StartMinute = MINUTE( 'Table'[Start] )
VAR NumHours = DATEDIFF( 'Table'[Start], 'Table'[End], HOUR )
VAR EndMinute = MINUTE( 'Table'[End] )
VAR Result = SELECTCOLUMNS(
GENERATESERIES( StartHour, StartHour + NumHours ),
"Hour number", [Value],
"Start date", DATE( StartYear, StartMonth, StartDay ),
"Minutes worked",
VAR StartOfHour = DATE( StartYear, StartMonth, StartDay ) + TIME( [Value], 0, 0)
VAR EndOfHour = DATE( StartYear, StartMonth, StartDay ) + TIME( [Value] + 1, 0, 0)
VAR MinsWorked = DATEDIFF( MAX( StartOfHour, 'Table'[Start] ), MIN( EndOfHour, 'Table'[End] ), MINUTE )
RETURN MinsWorked
)
RETURN Result
),
"User", 'Table'[User],
"Task", 'Table'[Task],
"Start date", [Start date],
"Hour number", [Hour number],
"Minutes worked", [Minutes worked]
)
In this I have stripped out the time part of the start time and left just the date, so that it can be linked to a normal date table.
If you expect the time periods to cross midnight into another day then you will need to implement additional logic to correctly calculate the StartOfHour and EndOfHour variables, but everything else should work the same.