Advance your Data & AI career with 50 days of live learning, dataviz contests, hands-on challenges, study groups & certifications and more!
Get registeredGet Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now
I'm trying to figure out how to remove duplicate rows based on certain criteria. If one of the rows matches the criteria, I keep it and delete the duplicates. I can do this in VBA, but I really want to learn how to accomplish the same in M. Goal is to feed my spreadsheet into PowerBI and let it do all the work. But I'm new to M and having trouble finding examples I can learn from.
Here's a snippet of the excel sheet with the 3 columns the script checks.
Column tc_and_lineitem identifies duplicates. The sheet is sorted so the duplicate rows naturally end up next to each other. af2 and af3 are the criteria.
Here's the VBA code that loops through and deletes duplicate rows (please no comments
)
Essentially loops through the sheet, when duplicates are found it checks af2 (fund), and in some circumstances af3 (orgn), then deletes the unwanted row.
rowNum = 2
Do While rowNum < lastRowNum
startTCLineItem = thisSheet.Range("O" & rowNum)
startFund = thisSheet.Range("CT" & rowNum)
startOrgn = thisSheet.Range("CU" & rowNum)
rowNum = rowNum + 1
thisTCLineItem = thisSheet.Range("O" & rowNum)
' // check if TCLineItems match. When they stop matching while loop returns to outer loop and thisTCLineItem becomes the new startTCLineItem.
Do While startTCLineItem = thisTCLineItem
thisFund = thisSheet.Range("CT" & rowNum)
thisOrgn = thisSheet.Range("CU" & rowNum)
' // Keep fund 16700 and remove the duplicate row,
' // or if fund is 30330 then keep orgn 5770 and remove the duplicate row.
If startFund = "16700" Then
thisSheet.Rows(rowNum).Delete
rowNum = rowNum - 1
lastRowNum = lastRowNum - 1
ElseIf thisFund = "16700" Then
z = rowNum - 1
thisSheet.Rows(z).Delete
rowNum = rowNum - 1
lastRowNum = lastRowNum - 1
ElseIf startFund = "30330" Then
If startOrgn = "5770" Then
thisSheet.Rows(rowNum).Delete
rowNum = rowNum - 1
lastRowNum = lastRowNum - 1
ElseIf thisOrgn = "5770" Then
z = rowNum - 1
thisSheet.Rows(z).Delete
rowNum = rowNum - 1
lastRowNum = lastRowNum - 1
End If
End If
rowNum = rowNum + 1
thisTCLineItem = thisSheet.Range("O" & rowNum)
Loop
LoopIf somebody could point me in the right direction regarding how to accomplish the same in M, that would be great! I would love to do this better and with less human interaction. Power Query and M seem the way to go.
Thanks ![]()
James
Hi @dudeyates ,
You could refer to Remove Duplicates function in query editor:
Result:
You could also download the pbix file to have a view.
Regards,
Daniel He
@v-danhe-msft Thanks for taking the time took look at this. I think that works because coincidentally the sort puts af2 16700 above the duplicate to be removed. But what if the first duplicate had af2 of 12345, and the af2 of 16700 I want to keep ends up being the 2nd duplicate, does the af2 16700 get removed?
For the vast majority of my data, the straight forward Remove Duplicates works. But there are instances when the line I want to keep is the 2nd of the duplicates, and I actually want to remove the first duplicate. And maybe I'm just missing something in your example (I have downloaded and viewed)
Thanks,
James
In query editor; create the criteria column, sort, and then Remove Rows/Delete Duplicates. Like I've done here:
Thanks for the response. I'm not sure of the steps to take to get to the Asset Key you created. How does Removing Duplicates know that I want to keep tc_and_lineitem: "1083 1" with a af2 = "16700", but remove tc_and_lineitem: "1083 1" with a different af2?
Thanks,
James
My column was a simple concatenate basically = [Column1]&[Column3]&[Column5]
yours will likely be more complex, don't really know with the data your showing. Instead of Remove Duplicate you may need to create some sort of key that you could then filter duplicate rows out that way.
Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!
Check out the October 2025 Power BI update to learn about new features.