Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes! Register now.
Good morning,
I have a car history with the model, registration and status of each car. But, some people access to this file and edit it.
When I try to cross-reference this file with another file that lists cars with th revisions, I find that there are duplicate cars.
In the status column, we have two types: garage and service.
My idea is to discard duplicate cars based on the following:
-If the car/registration doesn't have a duplicate, import it directly (the condition doesn't matter).
-If the car/registration does have a duplicate, import the one with the service status and discard the one with the garage status.
How can I make this differentiation?
Example of the file:
Model | Registration | Status |
Terraco | 3749VQI | service |
Golf | 8371PLO | garage |
GT500 | 2036EMU | service |
Multipla | 3581KFC | garage |
Pajero | 1531RMF | garage |
Cona | 6219TFJ | garage |
RAV4 | 6325ALV | service |
ASX | 4071SJZ | service |
Golf | 8371PLO | service |
Cona | 6219TFJ | service |
Greetings and thank you very much
Solved! Go to Solution.
Hello @alejandroezp ,
It depends on why they are still there. In my opinion, it is because the data must follow a different scenario than the one initially mentioned.
Have a nice day,
Vivien
You can use Table.Buffer for this problem.
Copy and paste the following code in Advanced Editor to see how.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCkktKkpMzlfSUTI2N7EMC/QEsopTi8oyk1OVYnWildzzc9KAQhbG5oYBPv5AVnpiUWI6VC7E1MAAKGRkYGzm6huKptO3NKcksyAnEWS0qYWht5szqu6AxKzUIpDFhqbGhkG+bqiyzvl5IJ1mRoaWIW5eqHJBjmEmIDljI1NHnzA0ax2DI4AiJgbmhsFeUYQ8gyyJYSNcMhYA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Model = _t, #"Registration " = _t, Status = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Model", type text}, {"Registration ", type text}, {"Status", type text}}),
#"Sorted Rows" = Table.Buffer(Table.Sort(#"Changed Type",{{"Status", Order.Descending}})),
#"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"Model", "Registration "})
in
#"Removed Duplicates"
Hi @alejandroezp ,
You can address this in Power Query by ensuring that, for duplicate Registrations, the record with Status = "service" is retained and the one with "garage" is removed.
Option 1 – Using Power Query UI Steps
1. Load your table into Power Query.
2. Select the Registration column.
3. (Optional) Use Home - Keep Duplicates to identify duplicate entries.
4. Add a Conditional Column: if [Status] = "service" then 1 else 2.
5. Sort the table by Registration (ascending) and then by the new column (ascending).
6. Go to Home - Remove Duplicates on the Registration column.
7. Power Query will retain the first instance, prioritizing the "service" row if present.
Result: Each Registration will have only one record, always keeping "service" when available.
Option 2 – Using Automated M Code
If you prefer to use code, you can apply the following M script:
let
Source = YourTableName, // Replace with your table name
AddPriority = Table.AddColumn(Source, "Priority", each if [Status] = "service" then 1 else 2, Int64.Type),
Sorted = Table.Sort(AddPriority, {{"Registration", Order.Ascending}, {"Priority", Order.Ascending}}),
Result = Table.Distinct(Sorted, {"Registration"})
in
Result
This approach ensures your data is organized as follows:
Cars without duplicates remain unchanged.
For cars with duplicates, only the "service" record is kept.
Add this step to your query...
= Table.Group(previousStepName, {"Model", "Registration "}, {{"Status", each if Table.RowCount(_) = 1 then _[Status]{0} else List.Select(_[Status], each _ = "service"){0}, type text}})
Change 'previousStepName' to the name of your previous step.
Start->
Finish ->
Proud to be a Super User! | |
Good morning,
I tried your suggestion and duplicate cars appeared, and the rest of the process failed.
Best regards, and thank you very much for all the help.
Hello @alejandroezp ,
My advice is not to do this during import, but once you have joined your two files in PowerQuery (after merging the queries).
You then have the option to remove duplicates according to the columns you want. Power Query will keep the first occurrence in the table, so you need to sort the results so that Status Service appears first, and it will remove the row with the status garage.
Feel free to give it a kudo and mark it as the solution if it suits you.
Have a nice day,
Vivien
Good morning @vivien57,
I have just trying to do the process like you explain and i have not the results. Can you explain the process?
Greetings and thank you very much
Feel free to give it a kudo and mark it as the solution if it suits you.
Have a nice day,
Vivien
Good morning,
I've tried removing duplicates, and it's worked in most cases. But there's a small group that hasn't been affected by this and is still showing up incorrectly. What could be happening in these cases?
Regards, and thank you very much for all the help.
Hello @alejandroezp ,
It depends on why they are still there. In my opinion, it is because the data must follow a different scenario than the one initially mentioned.
Have a nice day,
Vivien