I am looking to remove duplicates, not just by date but also by vendor ID, so that the most recent occurrence of each barcode from each vendor is loaded. In the example below, I should be left with the two rows at the bottom, showing the latest occurrence of each barcode from each vendor. In my data there are many vendors with the same barcode, so I need to make sure it is not simply the latest occurrence of each barcode, but the latest occurrence of each barcode for each vendor.
| Vendor ID | Barcode | Date |
|---|---|---|
| 4456 | 77889909 | 01/03/2026 |
| 1212 | 66565565 | 05/03/2026 |
| 4456 | 77889909 | 05/03/2026 |
| 1212 | 66565565 | 02/03/2026 |
| 1212 | 66565565 | 01/03/2026 |
| 1212 | 66565565 | 02/03/2026 |
| 4456 | 77889909 | 02/03/2026 |
End result:
| Vendor ID | Barcode | Date |
|---|---|---|
| 4456 | 77889909 | 05/03/2026 |
| 1212 | 66565565 | 05/03/2026 |
I have tried copying my query, keeping only these three columns, and grouping them:
= Table.Group(#"Expanded Items", {"Vendor ID", "Barcode"}, {{"Latest_Date", each List.Max([Price_Date]), type nullable date}})
Then I went back to my original query and joined the two, using all three columns, with an inner join (only matching rows). This hasn't worked well. It looks like it may have worked as intended for some vendors, but definitely not all.
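For reference, the group-then-join approach described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the inline sample table reuses the rows from the question, the date column is called Price_Date as in the grouping step, and the step names are made up for the example. It does not diagnose why the join misbehaved on the real data.

```powerquery
let
    // Sample rows copied from the question; dates are day/month/year.
    // The column name Price_Date is taken from the grouping step above.
    Source = #table(
        type table [#"Vendor ID" = Int64.Type, Barcode = Int64.Type, Price_Date = date],
        {
            {4456, 77889909, #date(2026, 3, 1)},
            {1212, 66565565, #date(2026, 3, 5)},
            {4456, 77889909, #date(2026, 3, 5)},
            {1212, 66565565, #date(2026, 3, 2)},
            {1212, 66565565, #date(2026, 3, 1)},
            {1212, 66565565, #date(2026, 3, 2)},
            {4456, 77889909, #date(2026, 3, 2)}
        }
    ),
    // Latest date per Vendor ID + Barcode pair
    Latest = Table.Group(
        Source,
        {"Vendor ID", "Barcode"},
        {{"Latest_Date", each List.Max([Price_Date]), type nullable date}}
    ),
    // Inner join back to the original rows on all three columns
    Joined = Table.NestedJoin(
        Source, {"Vendor ID", "Barcode", "Price_Date"},
        Latest, {"Vendor ID", "Barcode", "Latest_Date"},
        "Latest", JoinKind.Inner
    ),
    // Drop the nested join column, then collapse rows that tie on the latest date
    Result = Table.Distinct(Table.RemoveColumns(Joined, {"Latest"}))
in
    Result
```

On the sample rows this should leave one row per vendor and barcode pair, matching the two rows under "End result". The final Table.Distinct is there because two rows sharing the same latest date would otherwise both survive the join.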
Hi,
This M code works
let
    // Load the table named "Data" from the current workbook
    Source = Excel.CurrentWorkbook(){[Name="Data"]}[Content],
    // Group on both keys so each vendor keeps its latest row for every barcode
    #"Grouped Rows" = Table.Group(Source, {"Vendor ID", "Barcode"}, {{"Latest", each Table.Max(_, "Date")}}),
    // Barcode is already a grouping column, so only Date is expanded from the record
    #"Expanded Latest" = Table.ExpandRecordColumn(#"Grouped Rows", "Latest", {"Date"}, {"Date"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Latest",{{"Vendor ID", type text}, {"Barcode", type text}, {"Date", type date}})
in
    #"Changed Type"
Hope this helps.
Hi @RossS ,
If the responses shared by @danextian, @Natarajan_M, and @Kedar_Pande align with your expectations, please take a moment to review them and let us know if you need any additional details or clarifications.
Thank you all for your valuable support.
Regards,
Yugandhar.
In Power Query:
1. Sort by Date descending (newest first).
2. Add an Index column.
3. Buffer the table so the sort order is kept: = Table.Buffer(#"Added Index")
4. Select Vendor ID + Barcode (not Index, which is unique on every row) → Remove Duplicates.
5. Remove the Index column; the other columns are kept as they are.
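The steps above can be sketched in M roughly as follows. This is a minimal illustration, not a tested solution for the real data: the inline sample table reuses the rows from the question, the step names are made up, and the duplicate removal is assumed to run on Vendor ID and Barcode only, because an Index value is unique per row and including it would prevent any row from being removed.

```powerquery
let
    // Sample rows copied from the question; dates are day/month/year
    Source = #table(
        type table [#"Vendor ID" = Int64.Type, Barcode = Int64.Type, Date = date],
        {
            {4456, 77889909, #date(2026, 3, 1)},
            {1212, 66565565, #date(2026, 3, 5)},
            {4456, 77889909, #date(2026, 3, 5)},
            {1212, 66565565, #date(2026, 3, 2)},
            {1212, 66565565, #date(2026, 3, 1)},
            {1212, 66565565, #date(2026, 3, 2)},
            {4456, 77889909, #date(2026, 3, 2)}
        }
    ),
    // Newest rows first
    Sorted = Table.Sort(Source, {{"Date", Order.Descending}}),
    // Index records the sorted position of each row
    Indexed = Table.AddIndexColumn(Sorted, "Index", 0, 1),
    // Hold the sorted table in memory so later steps see this row order
    Buffered = Table.Buffer(Indexed),
    // Keep the first (newest) row for each Vendor ID + Barcode pair
    Deduplicated = Table.Distinct(Buffered, {"Vendor ID", "Barcode"}),
    Result = Table.RemoveColumns(Deduplicated, {"Index"})
in
    Result
```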
Hi @RossS ,
I'm able to mimic your scenario.
If your data volume is low, you can use the Table.Buffer option. However, if your volume is high, using Table.Buffer will increase the query processing overhead. Here are the reasons why:
1. It loads all the data into memory before processing.
2. It breaks query folding.
3. It is ideal for small lookup tables but not for fact tables (which I assume is the case with your transaction fact).
A solution I would prefer is to use the following approach: sort, then group, and finally take the first record.
Query:
let
    Source = Excel.Workbook(File.Contents("C:\Users\natar\Downloads\Community\Dedup_VendorBarcode.xlsx"), null, true),
    RawData_Sheet = Source{[Item="RawData",Kind="Sheet"]}[Data],
    #"Promoted Headers" = Table.PromoteHeaders(RawData_Sheet, [PromoteAllScalars=true]),
    #"Typed" = Table.TransformColumnTypes(#"Promoted Headers", {
        {"VendorID", Int64.Type}, {"Barcode", Int64.Type},
        {"Date", type date}, {"Price", type number}, {"Description", type text}
    }),
    #"Sorted" = Table.Sort(#"Typed",{{"Date", Order.Descending}}),
    #"Grouped" =
        Table.Group(
            #"Sorted",
            {"VendorID","Barcode"},
            {{"Latest", each Table.First(_), type record}}
        ),
    #"Expanded" =
        Table.ExpandRecordColumn(
            #"Grouped",
            "Latest",
            {"Date","Price","Description"}
        )
in
    #"Expanded"
PBIX : Latest data.pbix
Thanks
If this response was helpful in any way, I’d gladly accept a kudo.
Please mark it as the correct solution. It helps other community members find their way faster
Thanks for the response.
I struggled to replicate this because the steps take so long to load in Power Query. I managed to solve my issue, as there was a column in my table that flags the latest price, true or false, for each product.
Thanks for the update. Good to hear you figured it out using the latest price true/false column.
Feel free to reach out if you need any further help.
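For readers with a similar flag in their data, the fix described in this exchange might look like the sketch below. The thread does not give the real column name or data, so the column "Latest Price" and the sample rows here are hypothetical stand-ins.

```powerquery
let
    // Hypothetical sample; "Latest Price" stands in for the true/false flag column
    Source = #table(
        type table [#"Vendor ID" = Int64.Type, Barcode = Int64.Type, Date = date, #"Latest Price" = logical],
        {
            {4456, 77889909, #date(2026, 3, 1), false},
            {4456, 77889909, #date(2026, 3, 5), true},
            {1212, 66565565, #date(2026, 3, 2), false},
            {1212, 66565565, #date(2026, 3, 5), true}
        }
    ),
    // Keep only the rows already flagged as the latest price
    LatestOnly = Table.SelectRows(Source, each [Latest Price] = true)
in
    LatestOnly
```

A plain row filter like this avoids sorting, buffering, and grouping altogether, which may be why it loaded faster than the other approaches.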
Hi @RossS
Sort your data by date in descending order. Add a custom applied step with Table.Buffer to store the sorted data in memory. Select the Vendor ID and Barcode columns, then remove duplicates.
let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjExNVPSUTI3t7CwtDSwBDINDPUNjPWNDIzMlGJ1opUMjQyNgKJmZqZmpkAEUmCKogCLCaaETDAipICgG4wIuQFJQSwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Vendor ID" = _t, Barcode = _t, Date = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Vendor ID", Int64.Type}, {"Barcode", Int64.Type}, {"Date", type date}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Date", Order.Descending}}),
    // Buffer the sorted table so Table.Distinct keeps the first (newest) row per key
    #"Buffered Rows" = Table.Buffer(#"Sorted Rows"),
    // Deduplicate on both keys so each vendor keeps its latest row per barcode
    #"Removed Duplicates" = Table.Distinct(#"Buffered Rows", {"Vendor ID", "Barcode"})
in
    #"Removed Duplicates"