Solved: Re: Remove rows in Power Query based on multiple C...

AmitSaini · ‎01-15-2025

Hi,

I have a below table: -

As you seen in the above table, we have some duplicate id's with different status and auto_status.

if we have any duplicate id then we consider only id who have the status= Yes and Auto_Status = Candidate for Automation only.

Taking example for 1st id - 1047452, need to keep only second row because who have the status= Yes and Auto_Status = Candidate for Automation, first row should remove. And keep unique rows as well.

Need output like below : -

Please help me to achieve this in Power query.

BA_Pete · ‎01-15-2025

Hi @AmitSaini ,

Paste this example into Advanced Editor to see one way to achieve this:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("lY/NCsIwEIRfpeTcw6ZJmnosXgSxeJXSw0IihuoGbHx/W1H7g2g8LOzCNzszdc02SG3HUsZBaqmyfqvK4exnf0Yia1iTfsIOtntyayTjDAabHP01KW/BXzA4T3MhhzzT+uf/Nxb5f4d0whCQHuKigFxNPSofXoqlk+p9QP4LR8baeuOwHUjB5cJm2nvkJKwEiO9xRppryPSsaWQcXoBQIk7Y3AE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Project = _t, ID = _t, Status = _t, Count = _t, Auto_Status = _t]),
    groupProjectID = Table.Group(Source, {"Project", "ID"}, {{"ProjLineCount", each Table.RowCount(_), Int64.Type}, {"data", each _, type table [Project=nullable text, ID=nullable text, Status=nullable text, Count=nullable text, Auto_Status=nullable text]}}),
    expandData = Table.ExpandTableColumn(groupProjectID, "data", {"Status", "Auto_Status"}, {"Status", "Auto_Status"}),
    filterAuto_Status = Table.SelectRows(expandData, each ([ProjLineCount] > 1 and [Auto_Status] = "Candidate for Automation") or ([ProjLineCount] = 1))
in
    filterAuto_Status

Summary:

groupProjectID = Group table on [Project] and [ID] adding Count and All Rows aggregated columns.

expandData = Reinstate our original table columns.

filterAuto_Status = Apply our conditional filter using the new [ProjLineCount] column to identify where multiple [ID]s exist per project.

Output:

Pete

Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!

View solution in original post

G_Kiranmayi · ‎01-16-2025

Hi

Try following the steps below:

1. Group your rows by ID column, like below:

2. Your output should look something, like this:

3. Now expand the table by selecting the columns as seen below:

4. Add a custom column to flag each row based on your condition and based on the newly added Count_Rows column, like this:

Here is the code: if [Count_Rows] = 2 and [Status] = "Yes" and [Auto_Status] = "Candidate for Automation" then 1 else if [Count_Rows] = 1 then 1 else 0

5. Filter the Flag column for value 1 to get the required rows.

6. Finally remove the uneccesary columns which are Flag and Count_Rows and you should be left with your desired output, like below:

Hope this helps 🙂