Get certified in Microsoft Fabric—for free! For a limited time, the Microsoft Fabric Community team will be offering free DP-600 exam vouchers. Prepare now
Dear all,
I have a date table (with date times, years, months, days, hours, minutes). I then have an events table which defines the start and end dates of events. I need to be able to say at which rows in the date table the events are occuring. To do this in SQL or other BI environments I would do a join along the lines of
WHERE datetable[datetime] >= events[start date] AND datetable[datetime] <= events[end date]
This does not seem to be an option in power query. It is a table of 1.8 million rows of events so I need to be able to do this computation pretty efficiently.
The reason I need to do this is to map room booking data against data in a datetime table that measure occupancy in the room.
How should I approach this problem in Power BI?
Kind regards,
Ben
Solved! Go to Solution.
That sounds great & should be doable.
So you only need to create 15-minute-intervalls, which will significantly speed up load and reduce table length. Just replace the following steps with this code:
DurationMinutes = Table.AddColumn(#"Removed Columns1", "Minutes", each if [Custom]=[start date] then {Number.RoundDown(Duration.Minutes([Duration])/15,0)..3} else if [Custom]=[end date] then {0..Number.RoundDown(Duration.Minutes([Duration])/15,0)} else {0..3}), #"Expanded Minutes" = Table.ExpandListColumn(DurationMinutes, "Minutes"), Minutes = Table.AddColumn(#"Expanded Minutes", "DateTime", each [DayHour]+#duration(0,0,[Minutes]*15,0)),
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi,
Thanks for your help! That code you have given me will be useful for some other problems. After playing around with this I found out I was thinking about things the wrong way. Instead what I did was reverse things.
Instead I calculated it in DAX in a calculated column in the timetabling table. Basically I looked up the max value of room occupancy in for a given event.
Max Occupancy (APC) = CALCULATE( MAX('APC Data'[People Inside]), FILTER( FILTER('APC Data', 'APC Data'[RoomName] = 'CELCAT Timetabling'[RoomName]), 'APC Data'[DateTime] >= 'CELCAT Timetabling'[DateTime Start]
&& 'APC Data'[DateTime] <= 'CELCAT Timetabling'[DateTime End] ) )
Actually, it's not much different in M, just add a column to your events-table and expand the result:
DateAllocation= Table.AddColumn( Events, "Custom", (FilterTable) => Table.SelectRows( Table.Buffer(Dates), (ValuesTable)=> ValuesTable[datetime]>= FilterTable[start date] and ValuesTable[datetime]<=FilterTable[end date] ) )
The Table.Buffer is essential for speed.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
@ImkeF Thanks for your code:
I am tring to do something similar but struggling to adjust your code.
In my case I have one HearRate table with HR_DateTime column
and another Workouts table with WO_StartDateTime and WO_EndDateTime columns
I would like to bring a WorkoutType to my HR table when HR_DateTime>=WO_StartDateTime and HR_DateTime<= WO_EndDateTime.
I feel like your code is supposed to help, but you are adding the expandable column to the Events table (Workouts in my case), but I need to bring it to my HR table...
Please help
Thanks
Michael
DateAllocation= Table.AddColumn( Events, "Custom", (FilterTable) => Table.SelectRows( Table.Buffer(Dates), (ValuesTable)=> ValuesTable[datetime]>= FilterTable[start date] and ValuesTable[datetime]<=FilterTable[end date] ) )
"
Hi @michaelshparber ,
if my understanding is correct, it should work like so:
DateAllocation= Table.AddColumn( Events, "Custom", (HR) => Table.SelectRows( Table.Buffer(YourWorksoutsTable), (WO)=> HR[HR_DateTime] >= WO[WO_StartDateTime] and HR[HR_DateTime] <= WO[WO_EndDateTime] ) )
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi @ImkeF ,
I am struggling a bit to translate this to my situation.
I have an item table that I would like combine with a date table.
There are 180 dates in the table so I would like to create 180 item-date combinations per item in a query.
How should I write this in M? In sql I would normally write something like join dates is not null.
These are starting points for function definitions: https://docs.microsoft.com/en-us/powerquery-m/understanding-power-query-m-functions
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Thanks! I will try that out and let you know how I get along. Away for the next week so will ahve a go in a weeks time!
I am running this now, however it was going really slowly and got about 60 rows in and then ran out of memory. Would this be excepted. The events table is 1.8 million rows long and the dates table 47,000 rows..
OK, that's a pretty long dates table. So you really need it on minute level? Are your events rather short / how long will the resulting table become? You're running 64bit?
Anyway - this is a different approach: No merge, just stepwise calculation of the needed time invervalls:
let Source = SourceEvents, Duration = Table.AddColumn(Source, "Duration", each [end date]-[start date]), DurationDays = Table.AddColumn(Duration, "Days", each {0..Duration.Days([Duration])}), #"Expanded Days" = Table.ExpandListColumn(DurationDays, "Days"), Days = Table.AddColumn(#"Expanded Days", "Custom", each [start date]+#duration([Days],0,0,0)), #"Removed Columns" = Table.RemoveColumns(Days,{"Days"}), DurationHours = Table.AddColumn(#"Removed Columns", "Hours", each if [Custom]=[start date] then {Duration.Hours([Duration])..23} else if [Custom]=[end date] then {0..Duration.Hours([Duration])} else {0..23}), #"Expanded Hours" = Table.ExpandListColumn(DurationHours, "Hours"), Hours = Table.AddColumn(#"Expanded Hours", "DayHour", each [Custom]+#duration(0,[Hours],0,0)), #"Removed Columns1" = Table.RemoveColumns(Hours,{"Hours"}), DurationMinutes = Table.AddColumn(#"Removed Columns1", "Minutes", each if [Custom]=[start date] then {Duration.Minutes([Duration])..59} else if [Custom]=[end date] then {0..Duration.Minutes([Duration])} else {0..59}), #"Expanded Minutes" = Table.ExpandListColumn(DurationMinutes, "Minutes"), Minutes = Table.AddColumn(#"Expanded Minutes", "DateTime", each [DayHour]+#duration(0,0,[Minutes],0)), #"Removed Columns2" = Table.RemoveColumns(Minutes,{"Duration", "Custom", "DayHour", "Minutes"}) in #"Removed Columns2"
Otherwise have a look a the DAX-approaches to "Events-in-Progress" like here: http://blog.gbrueckl.at/2014/12/events-in-progress-for-time-periods-in-dax/
Although you might just shift the performance-drag from load to query...
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Thankyou very much for your help. It has been useful in working out what to do. Also it is a good problem to learn more about M - which I am liking the more I understand it.
What I am doing is conducting an analysis of the timetabling data over a two year period for a large university timetabling system. It is an analysis to examine efficiency of space useage and so on.
The timetabling data is in the form event, room name, start time, end time ...
Most of the analyses are fairly straight forward without needing to do any huge processing in power query - so no problem.
The one analysis that is the problem which this query relates to is:
I am comparing the timetabling data against data that is sampled in 15 minute intervals in a different system that measures the occupancy data for all the rooms. I present a histogram of room occupancy. ie, how often are rooms at 85% or greater occupancy. This is no problem, and is quite nice.
So I can then click on the part of the histogram that will filter a table of the room names and times when the rooms are occupied at this given level. This is a nice feature for the user to see what and what time is causing the high occupancy events. I would like them to be able to also see who is the booking department, what type of event it was - all data from the room occupancy table.
What I want to do is to then get the event (ie. which department booked it, what type of event it was) from the timetable table. However I can't just have a relationship between the times in the occupancy data and the timetabling events as the timetabling table doesn't specify for a given room at a given time what is happening. So I think I need to create a new table that specifies this, then I can just relate the timetabling (event information) table to the room occupancy table.
I might just have to brute force this overnight by going throught the timetabling table and inserting new rows for each event to cover all the times they are occuring and then do a fill down.
That sounds great & should be doable.
So you only need to create 15-minute-intervalls, which will significantly speed up load and reduce table length. Just replace the following steps with this code:
DurationMinutes = Table.AddColumn(#"Removed Columns1", "Minutes", each if [Custom]=[start date] then {Number.RoundDown(Duration.Minutes([Duration])/15,0)..3} else if [Custom]=[end date] then {0..Number.RoundDown(Duration.Minutes([Duration])/15,0)} else {0..3}), #"Expanded Minutes" = Table.ExpandListColumn(DurationMinutes, "Minutes"), Minutes = Table.AddColumn(#"Expanded Minutes", "DateTime", each [DayHour]+#duration(0,0,[Minutes]*15,0)),
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Hi,
Thanks for your help! That code you have given me will be useful for some other problems. After playing around with this I found out I was thinking about things the wrong way. Instead what I did was reverse things.
Instead I calculated it in DAX in a calculated column in the timetabling table. Basically I looked up the max value of room occupancy in for a given event.
Max Occupancy (APC) = CALCULATE( MAX('APC Data'[People Inside]), FILTER( FILTER('APC Data', 'APC Data'[RoomName] = 'CELCAT Timetabling'[RoomName]), 'APC Data'[DateTime] >= 'CELCAT Timetabling'[DateTime Start]
&& 'APC Data'[DateTime] <= 'CELCAT Timetabling'[DateTime End] ) )
Check out the October 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
109 | |
104 | |
103 | |
87 | |
61 |
User | Count |
---|---|
162 | |
132 | |
131 | |
95 | |
86 |