Power BI is turning 10, and we’re marking the occasion with a special community challenge. Use your creativity to tell a story, uncover trends, or highlight something unexpected.
Get startedJoin us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.
Hi there,
I have a dataset that looks something like this (note dates are in UK format):
Employee | Reason code | Date from | Date to | |
#1 | Bill | 1 | 05/04/2022 | 08/04/2022 |
#2 | Bill | 2 | 09/06/2022 | 09/06/2022 |
#3 | James | 2 | 01/02/2022 | 02/03/2022 |
#4 | Sandy | 1 | 05/01/2022 | 06/01/2022 |
#5 | Bill | 1 | 09/04/2022 | 15/05/2022 |
#6 | Sandy | 1 | 07/01/2022 | 07/01/2022 |
#7 | Andrew | 2 | 05/01/2022 | 06/01/2022 |
#8 | Jill | 2 | 03/04/2022 | 09/04/2022 |
#9 | Emma | 1 | 03/03/2022 | 04/03/2022 |
Note that line #1 and #5 are for the same person with the same reason code and the date range is continuous (ie 05/04/2022 - 08/04/2022 and 09/04/2022 - 15/05/2022 is actually just 05/04/2022 - 15/05/2022 if taken as a continuous date). Likewise, the same is true of lines 4 and 6.
How can I use Power Query to turn the data into this (where continuous dates are combined into one record):
Employee | Reason code | Date from | Date to |
Bill | 1 | 05/04/2022 | 15/05/2022 |
Bill | 2 | 09/06/2022 | 09/06/2022 |
James | 2 | 01/02/2022 | 02/03/2022 |
Sandy | 1 | 05/01/2022 | 07/01/2022 |
Andrew | 2 | 05/01/2022 | 06/01/2022 |
Jill | 2 | 03/04/2022 | 09/04/2022 |
Emma | 1 | 03/03/2022 | 04/03/2022 |
Thanks in advance.
Solved! Go to Solution.
Hi,
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type with Locale" = Table.TransformColumnTypes(Source, {{"Date from", type date}, {"Date to", type date}}, "en-GB"),
#"Sorted Rows" = Table.Sort(#"Changed Type with Locale",{{"Employee", Order.Ascending}, {"Date from", Order.Ascending}}),
Index = Table.Buffer(Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1, Int64.Type)),
#"Added Custom" = Table.AddColumn(Index, "ModIndex", each try if Index[Employee]{[Index]-1}=[Employee] and Index[#"Reason code"]{[Index]-1}=[#"Reason code"] and Index[#"Date to"]{[Index]-1}=Date.AddDays([#"Date from"],-1) then [Index]-1 else [Index] otherwise [Index]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Employee", "Reason code", "ModIndex"}, {{"Date from", each List.Min([Date from]), type nullable date}, {"Date to", each List.Max([Date to]), type nullable date}})
in
#"Grouped Rows"
You can sort the table by name and start date. Then add index column, add custom column which cheks if previous row has the same name, reason and date to is one day smaller then date from. If yes get previous row idex else current row index. Then gropu by modyfied index column and get min and max from date field respectively.
Hope this will help.
Artur
Hi,
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type with Locale" = Table.TransformColumnTypes(Source, {{"Date from", type date}, {"Date to", type date}}, "en-GB"),
#"Sorted Rows" = Table.Sort(#"Changed Type with Locale",{{"Employee", Order.Ascending}, {"Date from", Order.Ascending}}),
Index = Table.Buffer(Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1, Int64.Type)),
#"Added Custom" = Table.AddColumn(Index, "ModIndex", each try if Index[Employee]{[Index]-1}=[Employee] and Index[#"Reason code"]{[Index]-1}=[#"Reason code"] and Index[#"Date to"]{[Index]-1}=Date.AddDays([#"Date from"],-1) then [Index]-1 else [Index] otherwise [Index]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index"}),
#"Grouped Rows" = Table.Group(#"Removed Columns", {"Employee", "Reason code", "ModIndex"}, {{"Date from", each List.Min([Date from]), type nullable date}, {"Date to", each List.Max([Date to]), type nullable date}})
in
#"Grouped Rows"
You can sort the table by name and start date. Then add index column, add custom column which cheks if previous row has the same name, reason and date to is one day smaller then date from. If yes get previous row idex else current row index. Then gropu by modyfied index column and get min and max from date field respectively.
Hope this will help.
Artur
Thanks very much Artur. This seems to be what i'm looking for. I get an idea of what the code is doing, but really struggling to understand it so that I can use it in practice.
This bit of the code i am really struggling to understand. Is there a name for this particular function or technique that I can google in order to find out more. Just not clear on what this part is doing.
Index[Employee]{[Index]-1}=[Employee] and Index[#"Reason code"]{[Index]-1}=[#"Reason code"] and Index[#"Date to"]{[Index]-1}=Date.AddDays([#"Date from"],-1)
Thanks
-Jocky
Index is the name of previous step, [Employee] is the name of the column in previous step between curly brackets you can choose record number (zero based) from which you want to get the data. In this case {[Index]-1} in courrent row value in column Index has value ie. 2 so PQ calculates that you want data from row number 1 from column [Employee] of the previous step. If you just call column name without giving table and row number you get values from current row.
I hope it's more clear now.
Artur
Thanks very much.
Thanks Daryl. Looking up the group by and min/max date from/to features now (going to try to learn this step by step as I don't really understand the code). One thing that has come to mind when i'm thinking about this though is how it would deal with a scenario where the dataset included multiple date ranges for the same person and reason code that weren't consecutive? Just thinking the min/max might take the min from the earliest one and the max from the latest one even if there was a gap between the two ranges.
Hi @Jocky - so in your example if Bill second record with reason type 1 was on the 10/04/2022, you want to show two rows? Hmm - tricky one? Again Group By might be able to help but you will need to develop function to process the table created in the group by function.
Is there a scenario we the date overlap e.g. the instead of 09/04/2022 or 10/04/2022 could the value be 07/04/2022 or earlier?
Thanks Daryl. Yeah, that's right. In the scenario described there would be two rows for Bill.
Hi @Jocky - I think you could achieve the desire result using the Group By functionality. You will group by "Employee" and "Reason", and use the Min Date From and Max Date to. Try the following:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcsrMyVHSUTIEYgNTfQMTfSMDIyOQAJBjCuHE6sCVgWQMLPUNzGDKkDggZV6JuanFMHWG+gZGcHVG+gbGCHXBiXkplUjWGsLVmcM5IHWOeSlFqeUwA1EUmqEo9EJynzGSN0DuM0Eoc83NTYRZawx3EZBjgnBeLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Employee = _t, #"Reason code" = _t, #"Date from" = _t, #"Date to" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Reason code", Int64.Type}}),
#"Parsed Date" = Table.TransformColumns(#"Changed Type",{{"Date from", each Date.From(DateTimeZone.From(_)), type date}, {"Date to", each Date.From(DateTimeZone.From(_)), type date}}),
#"Grouped Rows" = Table.Group(#"Parsed Date", {"Employee", "Reason code"}, {{"Date from", each List.Min([Date from]), type date}, {"Date to", each List.Max([Date to]), type date}})
in
#"Grouped Rows"
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.
Check out the June 2025 Power BI update to learn about new features.
User | Count |
---|---|
15 | |
12 | |
8 | |
8 | |
7 |
User | Count |
---|---|
15 | |
13 | |
8 | |
7 | |
6 |