The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi everyone, I'm trying to remove some duplicates in a large list, which is formed from appending 2 lists of staff IDs.
Some of the Staff IDs have duplicate entries, I'd like to set up some conditions to remove duplicates:
1) When there is a duplicate in [Staff ID], if the [End Date] is non blank, then keep the non-blank row, delete the blank row;
2) If there is no duplicate in [Staff ID], then keep the row.
Thanks!
The original table:
Staff ID | End Date |
123 | 19/11/2024 |
123 | |
456 | 10/10/2024 |
456 | |
789 | |
1011 | |
1011 | 9/02/2024 |
1112 | |
1213 |
The desired outcome would be:
Staff ID | End Date |
123 | 19/11/2024 |
456 | 10/10/2024 |
789 | |
1011 | 9/02/2024 |
1112 | |
1213 |
Solved! Go to Solution.
Hi,
One of ways is using Table.Group function in Power Query Editor.
let
Source = Data_Source,
#"Grouped Rows" = Table.Group(Source, {"Staff ID"}, {{"End Date", each List.Max([End Date]), type nullable date}})
in
#"Grouped Rows"
Power Query Code:
let
Source = YourTableName,
SortTable = Table.Sort(Source, {{"Staff ID", Order.Ascending}, {"End Date", Order.Descending}}),
RemoveDuplicates = Table.Distinct(SortTable, {"Staff ID"})
in
RemoveDuplicates
💌 If this helped, a Kudos 👍 or Solution mark would be great! 🎉
Cheers,
Kedar
Connect on LinkedIn
Hi,
Please try something like below whether it suits your requirement.
Please check the below picture and the attached pbix file.
It is for creating a new table.
expected result table =
VAR _list =
DISTINCT ( Data[Staff ID] )
VAR _t =
ADDCOLUMNS ( _list, "End Date", CALCULATE ( MAX ( Data[End Date] ) ) )
RETURN
_t
Sorry I'm using Power Query in Excel.
Hi,
One of ways is using Table.Group function in Power Query Editor.
let
Source = Data_Source,
#"Grouped Rows" = Table.Group(Source, {"Staff ID"}, {{"End Date", each List.Max([End Date]), type nullable date}})
in
#"Grouped Rows"