Microsoft is giving away 50,000 FREE Microsoft Certification exam vouchers. Get Fabric certified for FREE! Learn more
Hello,
I have a table with two columns: ID and Date.
The ID column is a 20-digit whole number
The Date column is in the format MM/DD/YYYY HH:MM:SS.
There are some rows with matching IDs. Of the rows with matching IDs, some have dates within X days of each other, say 3 days. I am trying to consolidate the rows with matching IDs AND dates within 3 days of each other. Is there any easy way to do this?
Ideally, the finished product would be a table with the following three columns:
Thanks in advance,
Kyle
Solved! Go to Solution.
Ok @kblommer - see if this helps.
Original Data I used as a test (the comment is to show you what I will recognize as a repeat repair after the query. I didn't cheat and put that in my source queries 😉 )
ID | Date | Comment |
1 | 1/1/2020 | |
1 | 1/3/2020 | Repeat |
2 | 2/1/2020 | |
3 | 1/13/2020 | |
4 | 2/20/2020 | |
5 | 1/30/2020 | |
5 | 1/31/2020 | Repeat |
6 | 1/21/2020 | |
6 | 2/21/2020 | |
6 | 3/4/2020 | |
7 | 1/18/2020 | |
7 | 3/3/2020 | |
8 | 1/1/2020 | |
9 | 1/28/2020 | |
10 | 3/31/2020 | |
10 | 4/1/2020 | Repeat |
10 | 5/1/2020 |
In the file below, I did three queries:
Matching Repair Date Formula
if ([ID] = [Repairs.ID]) and (Number.Abs(Number.From([Repairs.Date]) - Number.From([Date])) < 3)
then [Repairs.Date]
else null
So my end data looks like this:
I did all of this in Excel, but it is the exact same logic in Power BI's Power Query tool. Here is my Excel file.
Let me know if that works for you.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingOk @kblommer - see if this helps.
Original Data I used as a test (the comment is to show you what I will recognize as a repeat repair after the query. I didn't cheat and put that in my source queries 😉 )
ID | Date | Comment |
1 | 1/1/2020 | |
1 | 1/3/2020 | Repeat |
2 | 2/1/2020 | |
3 | 1/13/2020 | |
4 | 2/20/2020 | |
5 | 1/30/2020 | |
5 | 1/31/2020 | Repeat |
6 | 1/21/2020 | |
6 | 2/21/2020 | |
6 | 3/4/2020 | |
7 | 1/18/2020 | |
7 | 3/3/2020 | |
8 | 1/1/2020 | |
9 | 1/28/2020 | |
10 | 3/31/2020 | |
10 | 4/1/2020 | Repeat |
10 | 5/1/2020 |
In the file below, I did three queries:
Matching Repair Date Formula
if ([ID] = [Repairs.ID]) and (Number.Abs(Number.From([Repairs.Date]) - Number.From([Date])) < 3)
then [Repairs.Date]
else null
So my end data looks like this:
I did all of this in Excel, but it is the exact same logic in Power BI's Power Query tool. Here is my Excel file.
Let me know if that works for you.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI Reporting@edhans thank you! I'll probably end up doing this in Power Query as we want it to be a live table updating as the excel sheet gets updated.
I really appreciate the help, this is exactly what I was looking for!
Great @kblommer . Hope the rest of your project is successful.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingHello @kblommer
what do you think of such an approach?
let
Source = #table
(
{"ID","Date"},
{
{"1","Jan 1, 2020"}, {"1","Jan 2, 2020"}, {"1","Jan 3, 2020"}, {"1","Jan 4, 2020"}, {"1","Jan 5, 2020"}, {"2","Jan 1, 2020"}, {"2","Jan 2, 2020"}, {"2","Jan 3, 2020"},
{"2","Jan 4, 2020"}, {"2","Jan 5, 2020"}
}
),
ToDate = Table.TransformColumns
(
Source,
{
{
"Date",
each Date.FromText(_,"en-US"),
type date
}
}
),
Sort =Table.Buffer( Table.Sort
(
ToDate,
{{"ID", Order.Ascending}, {"Date", Order.Ascending}}
)),
Group = Table.Group
(
Sort,
{"ID", "Date"},
{{"AllRows", each _, type table [ID=text, Date=date]}},
GroupKind.Local,
(group,current) => if Value.Compare(current[ID],group[ID])=0 and current[Date]-group[Date]<#duration(3,0,0,0) then 0 else 1
),
DeleteColumn = Table.RemoveColumns(Group,{"Date"}),
Group2 = Table.Group(DeleteColumn, {"ID"}, {{"AllRows", each try Table.AddIndexColumn(Table.FromList({_[AllRows]{0}, Table.Combine(List.Range(_[AllRows],1))},Splitter.SplitByNothing()),"Index",1) otherwise Table.AddIndexColumn(Table.FromList( {_[AllRows]},Splitter.SplitByNothing()),"Index", 1)}}),
ExpandAllRows = Table.ExpandTableColumn(Group2, "AllRows", {"Column1","Index"}, {"Dates","Index"}),
PivotColumn = Table.Pivot(Table.TransformColumnTypes(ExpandAllRows, {{"Index", type text}}, "de-DE"), List.Distinct(Table.TransformColumnTypes(ExpandAllRows, {{"Index", type text}}, "de-DE")[Index]), "Index", "Dates"),
RenameColumn = Table.RenameColumns(PivotColumn,{{"1", "Within date Range"}, {"2", "Out of date Range"}})
in
RenameColumn
Copy paste this code to the advanced editor in a new blank query to see how the solution works. If this solution fits your need, copy and past a part of it and implement it in your query.
If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too
Have fun
Jimmy
Thanks Jimmy,
So it looks like with your approach we'd end up with embedded tables with all the dates. Do you know if there's a way I could pull out the dates from the embedded tables and add them into the main table adjacent to their respective IDs?
Thanks,
Kyle
Not sure I understand your logic requirements. For example, how would you match the following:
ID | Date |
1 | Jan 1, 2020 |
1 | Jan 2, 2020 |
1 | Jan 3, 2020 |
1 | Jan 4, 2020 |
1 | Jan 5, 2020 |
would those get matched to one record? The Jan 1 and Jan 5 dates are more than 3 days apart, but Jan 5 is 1 day from Jan 4, which is 3 days from Jan 1.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingThere won't be a case where there is that many dates that closely spaced.
The table that I have is a repair log. The table gets updated by those who do the repairs and they fill in the time of the repair. Some repairs require multiple people or a long time (~1 day). Thus, sometimes multiple entries are done into the table either because more information needed to be logged as the repair progressed or another person came in and made an addional entry. No repairs are going to take longer than 3 days, so there won't be a log like your example of 5 successive days. If there are two entries separated by more than 3 days, they are separate repairs.
Does that help clarify?
Yes, but if there are more than 3 days separating and are separate repairs, won't the IDs be unique? Or are the IDs like a customer number, not a repair number?
I need to work up some fake data here to play with.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI Reporting@edhans Sorry, didn't make that clear. The IDs are fiber numbers and we are repairing fibers. They are not Repair numbers.
The data we have covers a 3 year period, so there are multiple repairs for each ID over that time frame.
Ok. Maybe someone else will jump in first. Otherwise, I need to think about this. I think the key is to join the table to itself and do some comparisons.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingCheck out the April 2025 Power BI update to learn about new features.
Explore and share Fabric Notebooks to boost Power BI insights in the new community notebooks gallery.
User | Count |
---|---|
23 | |
11 | |
10 | |
10 | |
9 |