Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Join us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.

Reply
Jocky
Frequent Visitor

Merging records that have continuous dates

Hi there,

 

I have a dataset that looks something like this (note dates are in UK format):

 

 EmployeeReason codeDate fromDate to
#1Bill105/04/202208/04/2022
#2Bill209/06/202209/06/2022
#3James201/02/202202/03/2022
#4Sandy105/01/202206/01/2022
#5Bill109/04/202215/05/2022
#6Sandy107/01/202207/01/2022
#7Andrew205/01/202206/01/2022
#8Jill203/04/202209/04/2022
#9Emma103/03/202204/03/2022

 

Note that line #1 and #5 are for the same person with the same reason code and the date range is continuous (ie 05/04/2022 - 08/04/2022 and 09/04/2022 - 15/05/2022 is actually just 05/04/2022 - 15/05/2022 if taken as a continuous date).  Likewise, the same is true of lines 4 and 6.

 

How can I use Power Query to turn the data into this (where continuous dates are combined into one record):

 

EmployeeReason codeDate fromDate to
Bill105/04/202215/05/2022
Bill209/06/202209/06/2022
James201/02/202202/03/2022
Sandy105/01/202207/01/2022
Andrew205/01/202206/01/2022
Jill203/04/202209/04/2022
Emma103/03/202204/03/2022

 

Thanks in advance.

1 ACCEPTED SOLUTION
artpil
Resolver II
Resolver II

Hi,

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type with Locale" = Table.TransformColumnTypes(Source, {{"Date from", type date}, {"Date to", type date}}, "en-GB"),
    #"Sorted Rows" = Table.Sort(#"Changed Type with Locale",{{"Employee", Order.Ascending}, {"Date from", Order.Ascending}}),
    Index = Table.Buffer(Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1, Int64.Type)),
    #"Added Custom" = Table.AddColumn(Index, "ModIndex", each try if Index[Employee]{[Index]-1}=[Employee] and Index[#"Reason code"]{[Index]-1}=[#"Reason code"] and Index[#"Date to"]{[Index]-1}=Date.AddDays([#"Date from"],-1) then [Index]-1 else [Index] otherwise [Index]),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index"}),
    #"Grouped Rows" = Table.Group(#"Removed Columns", {"Employee", "Reason code", "ModIndex"}, {{"Date from", each List.Min([Date from]), type nullable date}, {"Date to", each List.Max([Date to]), type nullable date}})
in
    #"Grouped Rows"

You can sort the table by name and start date. Then add index column, add custom column which cheks if previous row has the same name, reason and date to is one day smaller then  date from. If yes get previous row idex else current row index. Then gropu by modyfied index column and get min and max from date field respectively.

Hope this will help.

Artur

View solution in original post

8 REPLIES 8
artpil
Resolver II
Resolver II

Hi,

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    #"Changed Type with Locale" = Table.TransformColumnTypes(Source, {{"Date from", type date}, {"Date to", type date}}, "en-GB"),
    #"Sorted Rows" = Table.Sort(#"Changed Type with Locale",{{"Employee", Order.Ascending}, {"Date from", Order.Ascending}}),
    Index = Table.Buffer(Table.AddIndexColumn(#"Sorted Rows", "Index", 0, 1, Int64.Type)),
    #"Added Custom" = Table.AddColumn(Index, "ModIndex", each try if Index[Employee]{[Index]-1}=[Employee] and Index[#"Reason code"]{[Index]-1}=[#"Reason code"] and Index[#"Date to"]{[Index]-1}=Date.AddDays([#"Date from"],-1) then [Index]-1 else [Index] otherwise [Index]),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Index"}),
    #"Grouped Rows" = Table.Group(#"Removed Columns", {"Employee", "Reason code", "ModIndex"}, {{"Date from", each List.Min([Date from]), type nullable date}, {"Date to", each List.Max([Date to]), type nullable date}})
in
    #"Grouped Rows"

You can sort the table by name and start date. Then add index column, add custom column which cheks if previous row has the same name, reason and date to is one day smaller then  date from. If yes get previous row idex else current row index. Then gropu by modyfied index column and get min and max from date field respectively.

Hope this will help.

Artur

Jocky
Frequent Visitor

Thanks very much Artur.  This seems to be what i'm looking for.  I get an idea of what the code is doing, but really struggling to understand it so that I can use it in practice.

 

This bit of the code i am really struggling to understand.  Is there a name for this particular function or technique that I can google in order to find out more.  Just not clear on what this part is doing.

 

Index[Employee]{[Index]-1}=[Employee] and Index[#"Reason code"]{[Index]-1}=[#"Reason code"] and Index[#"Date to"]{[Index]-1}=Date.AddDays([#"Date from"],-1)

 

Thanks

-Jocky

Index is the name of previous step, [Employee] is the name of the column in previous step between curly brackets you can choose record number (zero based) from which you want to get the data. In this case {[Index]-1} in courrent row value in column Index has value ie. 2 so PQ calculates that you want data from row number 1 from column [Employee] of the previous step. If you just call column name without giving table and row number you get values from current row.

I hope it's more clear now.

 

Artur

Jocky
Frequent Visitor

Thanks very much.

Jocky
Frequent Visitor

Thanks Daryl.  Looking up the group by and min/max date from/to features now (going to try to learn this step by step as I don't really understand the code).  One thing that has come to mind when i'm thinking about this though is how it would deal with a scenario where the dataset included multiple date ranges for the same person and reason code that weren't consecutive?  Just thinking the min/max might take the min from the earliest one and the max from the latest one even if there was a gap between the two ranges.

Hi @Jocky - so in your example if Bill second record with reason type 1 was on the 10/04/2022, you want to show two rows?  Hmm - tricky one?  Again Group By might be able to help but you will need to develop function to process the table created in the group by function.  

 

Is there a scenario we the date overlap e.g. the instead of 09/04/2022 or 10/04/2022 could the value be 07/04/2022 or earlier?

Thanks Daryl.  Yeah, that's right.  In the scenario described there would be two rows for Bill.

Daryl-Lynch-Bzy
Resident Rockstar
Resident Rockstar

Hi @Jocky - I think you could achieve the desire result using the Group By functionality.  You will group by "Employee" and "Reason", and use the Min Date From and Max Date to.  Try the following:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WcsrMyVHSUTIEYgNTfQMTfSMDIyOQAJBjCuHE6sCVgWQMLPUNzGDKkDggZV6JuanFMHWG+gZGcHVG+gbGCHXBiXkplUjWGsLVmcM5IHWOeSlFqeUwA1EUmqEo9EJynzGSN0DuM0Eoc83NTYRZawx3EZBjgnBeLAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Employee = _t, #"Reason code" = _t, #"Date from" = _t, #"Date to" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Reason code", Int64.Type}}),
    #"Parsed Date" = Table.TransformColumns(#"Changed Type",{{"Date from", each Date.From(DateTimeZone.From(_)), type date}, {"Date to", each Date.From(DateTimeZone.From(_)), type date}}),
    #"Grouped Rows" = Table.Group(#"Parsed Date", {"Employee", "Reason code"}, {{"Date from", each List.Min([Date from]), type date}, {"Date to", each List.Max([Date to]), type date}})
in
    #"Grouped Rows"

 

Helpful resources

Announcements
Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 Power BI Update Carousel

Power BI Monthly Update - June 2025

Check out the June 2025 Power BI update to learn about new features.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.