Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Join us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.

Reply
Anonymous
Not applicable

Deleting duplicate rows but choosing which row to delete

Hi All

 

In my dataset I have data that is duplicated in several rows but the only difference is that in one column there is a value for one of the rows and in the other row there is no value. 

I want to be able to remove the duplicate row that has the blank value in the one column or alternatively, add the same resource number to the blank field as then I can just remove duplicates and it won't matter which one is deleted. 

There are many of these I need to do, not just one or two. 

Below is an example of what I mean.

 

resource issue.PNG

 

 

 

I am doing this via Power Query in Power BI. 

If I choose all columns except the Resource column, then  delete duplicates, the row with the resource number in it, gets deleted. I need this one to stay. 

 

Any assistance anyone can provide me with will be appreciated. 

Thank you. 

 

Karen

1 ACCEPTED SOLUTION
BA_Pete
Super User
Super User

Hi @Anonymous ,

 

Select all columns except [Resource], do Group By, then add an aggregate column that is the MAX of [Resource].

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




View solution in original post

9 REPLIES 9
serpiva64
Solution Sage
Solution Sage

Hi,

you can do a fill down like the one in the exemple:

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjRQ0lEyVIrVgTKNwEwjhKgRmiiYZYyQN0bIg5nGCCaCZaoUGwsA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [SN = _t, Step = _t]),
#"Replaced Value" = Table.ReplaceValue(Source,"",null,Replacer.ReplaceValue,{"Step"}),
#"Grouped Rows" = Table.Group(#"Replaced Value", {"SN"}, {{"AllRows", each _, type table [SN=nullable text, Step=nullable text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.FillDown([AllRows],{"Step"})),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Custom"}),
#"Expanded Custom" = Table.ExpandTableColumn(#"Removed Other Columns", "Custom", {"SN", "Step"}, {"SN.1", "Step"})
in
#"Expanded Custom"

 

then, as you thought, you can remove duplicates.

 

If this post is useful to help you to solve your issue consider giving the post a thumbs up 

 and accepting it as a solution !

BA_Pete
Super User
Super User

Hi @Anonymous ,

 

Select all columns except [Resource], do Group By, then add an aggregate column that is the MAX of [Resource].

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Anonymous
Not applicable

Hi BA_Pete,

 

It looks like this may have worked. Can I just confirm with you what this does? From my understanding, your solution is saying:

 

If all columns that I have highlighted are the same, group them under the one resource that is listed. If there is no resource, listed for dulplicated data in the columns, it will just return the one row with a blank cell in the resource column.  

 

Is that correct?

 

Karen

Hi @Anonymous ,

 

I don't think that's quite rght, but might just be that I've misunderstood your explanation.

I would describe it as the following, but also may not be a perfect explanation 🙂 :

 

For all the columns you selected to group, if they each have the same values as other rows in the same column, Power Query will turn them into just one row (as they are all the same value in each respective column anyway). Where you have whole rows that otherwise match, but the [Resource] column is different, we choose one value from the [Resource] column (the MAX value) so it can also be turned into one row with the rest. Given the choice between null/blank and any text or numerical value, the non-null/non-blank will always be evaluated as the 'MAX' value. If there is only one row but it has a null/blank [Resource] value, then the MAX of the choices PQ has is still null/blank, so you keep this row.

For other purposes, you could also choose to turn your columns that don't match with the rest of the row into SUM, MIN, AVG etc. so it can be resolved to a single row with the rest of the columns.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Anonymous
Not applicable

Great. Thanks Pete. Appreciate your time explaining that a bit more in depth. 

Karen

PhilipTreacy
Super User
Super User

Hi @Anonymous 

If the blank cells are actually empty

f1.png

 

Then filter out blanks

 

f6.png

f5.png

 

Regards

Phil



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


Anonymous
Not applicable

Thanks Phil. I thought this sounded like an easy fix but unfortunately I do have other blank cells in that column that I need to keep as they aren't duplicates.  

Hi @Anonymous 

This is why it's important to provide a representative sample of your data so we can actually work with the data you are.

So if you need to keep some blanks and remove others, how do we distinguish between these blanks?

Please provide a good size sample of your data that includes all instances of the data you are working with.

regards

Phil



Did I answer your question? Then please mark my post as the solution.
If I helped you, click on the Thumbs Up to give Kudos.


Blog :: YouTube Channel :: Connect on Linkedin


Proud to be a Super User!


Anonymous
Not applicable

Thanks Phil.

I will provide that for future queries I may have.

BA_Pete may have found me a solution. It appears to be doing what I need it to do so I will use that solution in this instance. Thank you for your prompt reply to my query though. It is appreciated.

My apologies for not providing you with more data to begin with. 

 

Karen

Helpful resources

Announcements
Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.

June 2025 Power BI Update Carousel

Power BI Monthly Update - June 2025

Check out the June 2025 Power BI update to learn about new features.

June 2025 community update carousel

Fabric Community Update - June 2025

Find out what's new and trending in the Fabric community.