Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM. Register now.

Reply
FabvE
Helper I
Helper I

Remove rows based on same AND other column

Hi,

I'm fairly new to PowerBI and have a task to create a dashboard about surveys.

 

The data sets include the following columns:

  • uniqueId: as the name says an unique id for the entry
  • operation: either CreateForm (when a new survey is created) or CreateResponse (when an answer to a survey is sent)
  • formId: id of the form; when operation = CreateResponse then this response is for the corresponding operation = CreateForm 
  • formName: name of the survey
  • timestamp: when the entry was created
  • user: can be an email or plain text "anonymous"

 

Sample Table:

 

uniqueId

operationformIdformNametimestampuser

1

CreateResponseAWhat about lunch?2024-08-10user1
2CreateFormXWhich design is better?2024-08-12user2
3CreateResponseXWhich design is better?2024-08-13user3
4CreateResponseBWhen will we meet?2024-08-15user4
5CreateFormYFish or pork?2024-08-19user3
6CreateResponseXWhich design is better?2024-08-22user5
7CreateResponseYFish or pork?2024-08-28user3

 

I already sorted the list, filtered double uniqueIds, created additional columns but one thing is missing before I can generate reports from the dataset.

My problem is: the list also contains responses to surveys which are NOT listed/marked as CreateForm or are from surveys which were created before the first logs were delivered. So I need to remove all entries from the list which match:

  • operation = CreateResponse
    and
  • formID = formId where operation = CreateForm

In my sample above

  • rows 1, 4 should be removed
    • because there is no corresponding formID with CreateForm
  • rows 2, 5 must be kept
    • because operation = CreateForm
  • rows 3, 6, 7 must be kept
    • because operation = CreateResponse
      and
    • existence of coresponding formIds + operation = CreateForm (rows 2 and 5)

I hope the sample makes it much clearer.

I already tried multiple filters and whatever and also consulted the web but I think I searched wrong.

 

Thanks in advance

 

Edit: added sample table and additional description

3 ACCEPTED SOLUTIONS

The Source line, where you see the Base64 conversion, is what you get when you paste the data into a table in PowerBI/PowerQuery. You should replace the Source line with your own source. You can read the M code documentation for the Binary.FromText function, but that Base64 argument merely determines the type of Binary encoding to be used.

 

And either you didn't run the code, or I misunderstood your before and after.

 

Source Step:

ronrsnfld_0-1728947332723.png

And you wrote: remove rows 1,4 and keep 2,3,5,6,7

 

After running code:

ronrsnfld_1-1728947421260.png

Did I misunderstand what you wanted?  Or did you not run the code to see what it did?

 

If I misunderstood, what rows did you wish to keep? As the code I provided clearly keeps 2,3,5,6 and 7

 

View solution in original post

Anonymous
Not applicable

Hi @FabvE ,
Thanks for ronrsnfld reply.
You can try the following code

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ndBRC4IwEAfwr3LsOVCnlj1FBX6AICrEB7WjjXSTbeLXbwwJs0HQw8F2cL/9b0VBIrIiR4WVwRPqXgqNtrG3dWGVgaqWg4F2EA3b2V4WRGFAQ5rY81mjiki5Kgh9E7lUnb1c3ThvGNxR84cArqFGY1BNCJ0j1CGxL8dPKJ5DsYMSH3RwEAoYedvCiNAhmslI50bijHS50c1WzjUDqaCX6jmNbr+fX/+zB/34kNRBGx/kz0GzRY7yBQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Unique ID" = _t, Operation = _t, #"From ID" = _t, #"Form Name" = _t, Timestamp = _t, User = _t]),
    AddFormID = Table.AddColumn(Source, "FormID", each if [Operation] = "CreateForm" then [From ID] else null),
    FillDownFormID = Table.FillDown(AddFormID,{"FormID"}),
    FilterCreateResponse = Table.SelectRows(FillDownFormID, each [Operation] = "CreateResponse"),
    AddCustom = Table.AddColumn(FilterCreateResponse, "HasCorrespondingForm", each List.Contains(FillDownFormID[FormID], [From ID])),
    RemoveInvalidResponses = Table.SelectRows(AddCustom, each [HasCorrespondingForm] = true),
    CombineTables = Table.Combine({Table.SelectRows(FillDownFormID, each [Operation] = "CreateForm"), RemoveInvalidResponses}),
    SortByUniqueID = Table.Sort(CombineTables,{{"Unique ID", Order.Ascending}}),
    RemoveHelperColumns = Table.RemoveColumns(SortByUniqueID,{"FormID", "HasCorrespondingForm"})
in
    RemoveHelperColumns

Final output

vheqmsft_0-1728955963296.png

Best regards,
Albert He


If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

View solution in original post

FabvE
Helper I
Helper I

Ok, after many tries and more fails I went with another approach which solved my problem.

I created an addinional query which contained only the created records:

let
Quelle = Table.SelectRows(Auswertung_raw, each [Operation] = "CreateForm")
in
Quelle

I then used Merge queries to do a join between the initial table Auswertung_raw and the CreateForm-Table:

let
Quelle = Table.NestedJoin(Auswertung_raw, {"FormId"}, tblErstellteUmfragen, {"FormId"}, "tblErstellteUmfragen", JoinKind.LeftSemi),
#"Entfernte Duplikate" = Table.Distinct(Quelle, {"RecordId"})
in
#"Entfernte Duplikate"

This gives me a table with all responses which have a corresponding CreateForm + the initial CreateForm-entries.

 

Nevertheless, thank you all - especially @ronrsnfld- for your help and time! 👍 

View solution in original post

17 REPLIES 17
FabvE
Helper I
Helper I

Ok, after many tries and more fails I went with another approach which solved my problem.

I created an addinional query which contained only the created records:

let
Quelle = Table.SelectRows(Auswertung_raw, each [Operation] = "CreateForm")
in
Quelle

I then used Merge queries to do a join between the initial table Auswertung_raw and the CreateForm-Table:

let
Quelle = Table.NestedJoin(Auswertung_raw, {"FormId"}, tblErstellteUmfragen, {"FormId"}, "tblErstellteUmfragen", JoinKind.LeftSemi),
#"Entfernte Duplikate" = Table.Distinct(Quelle, {"RecordId"})
in
#"Entfernte Duplikate"

This gives me a table with all responses which have a corresponding CreateForm + the initial CreateForm-entries.

 

Nevertheless, thank you all - especially @ronrsnfld- for your help and time! 👍 

wdx223_Daniel
Super User
Super User

=let a=Table.SelectRows(YourTable,each [operation]="CreateForm")[formId] in Table.SelectRows(YourTable,each [operation]="CreateForm" or List.Contains(a,[formId]))

I tried this but ended up dismissing your solution. The refreshing took many minutes for just 4 rows so I canceled it... 😞

Anonymous
Not applicable

Hi @FabvE ,
Thanks for ronrsnfld reply.
You can try the following code

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ndBRC4IwEAfwr3LsOVCnlj1FBX6AICrEB7WjjXSTbeLXbwwJs0HQw8F2cL/9b0VBIrIiR4WVwRPqXgqNtrG3dWGVgaqWg4F2EA3b2V4WRGFAQ5rY81mjiki5Kgh9E7lUnb1c3ThvGNxR84cArqFGY1BNCJ0j1CGxL8dPKJ5DsYMSH3RwEAoYedvCiNAhmslI50bijHS50c1WzjUDqaCX6jmNbr+fX/+zB/34kNRBGx/kz0GzRY7yBQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Unique ID" = _t, Operation = _t, #"From ID" = _t, #"Form Name" = _t, Timestamp = _t, User = _t]),
    AddFormID = Table.AddColumn(Source, "FormID", each if [Operation] = "CreateForm" then [From ID] else null),
    FillDownFormID = Table.FillDown(AddFormID,{"FormID"}),
    FilterCreateResponse = Table.SelectRows(FillDownFormID, each [Operation] = "CreateResponse"),
    AddCustom = Table.AddColumn(FilterCreateResponse, "HasCorrespondingForm", each List.Contains(FillDownFormID[FormID], [From ID])),
    RemoveInvalidResponses = Table.SelectRows(AddCustom, each [HasCorrespondingForm] = true),
    CombineTables = Table.Combine({Table.SelectRows(FillDownFormID, each [Operation] = "CreateForm"), RemoveInvalidResponses}),
    SortByUniqueID = Table.Sort(CombineTables,{{"Unique ID", Order.Ascending}}),
    RemoveHelperColumns = Table.RemoveColumns(SortByUniqueID,{"FormID", "HasCorrespondingForm"})
in
    RemoveHelperColumns

Final output

vheqmsft_0-1728955963296.png

Best regards,
Albert He


If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

Interesting algorithm. Kindly explain the advantage over what I supplied since yours seems to execute quite a bit more slowly using the Power Query diagnostic tools.

quantumudit
Super User
Super User

Hello @FabvE 

Your query is a bit unclear to me. It would be helpful if you could include a sample dataset with your question, as well as an example of the expected outcome. This information would greatly assist in clarifying the issue.

 

Thanks,
Udit

Hi, I added a sample table. Hope this clarifies my question. Thx

I reversed your logic so we instead select rows where, for the same formId 

  • keep rows where operation = "CreateForm"
  • if operation = "CreateResponse" then there must also be another entry with "CreateForm"
  • (I did not know how to interpret your requirement surveys which were created before the first logs were delivered)

Here's one way putting your logic into a Table.SelectRows function:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ndBRC4IwEAfwr3LsWVCnlj1FBX6AXirEB7WjjXSTbeLXbwwJs0HQw8F2cL/9b2VJYhKQk8La4Bn1IIVG2zjYurDaQN3I0UA3ipbtbS8P4yikEU3tedSoYlIFJaFvopCqt5erG+ctgztq/hDANTRoDKoZoUuEOiTx5fgJJUsocVDqg44OQgET7zqYEHpEMxvZ0kidka03utkquGYgFQxSPefR3ffzm3/2oB8fkjlo64P8OWi+ylG9AA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [uniqueId = _t, operation = _t, formId = _t, formName = _t, timestamp = _t, user = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{
        {"uniqueId", Int64.Type}, {"operation", type text}, {"formId", type text}, 
        {"formName", type text}, {"timestamp", type date}, {"user", type text}}),

    #"Remove Rows" = Table.SelectRows(#"Changed Type",(r)=> 
        r[operation] = "CreateForm" 
        or (r[operation] = "CreateResponse" 
                and List.Contains(Table.SelectRows(#"Changed Type", each [formId]=r[formId])[operation],"CreateForm")))
in
    #"Remove Rows"

 

Ok, I did my best but still couldn't implement the code nor get it somehow to run.

 

My current query is this:

let
Quelle = Table.Combine({#"2023-10_12", #"2024-01_05", #"2024-06"}),
#"Remove Duplicates" = Table.Distinct(Quelle, {"RecordId"}),
#"Get JSON" = Table.AddColumn(#"Remove Duplicates", "JSON Block", each Json.Document([AuditData])),
#"Extend JSON Block" = Table.ExpandRecordColumn(#"Get JSON", "JSON Block", {"FormId", "FormName", "SourceApp", "FormsUserType", "OrganizationId"}, {"JSON Block.FormId", "JSON Block.FormName", "JSON Block.SourceApp", "JSON Block.FormsUserType", "JSON Block.OrganizationId"}),
#"Remove Columns" = Table.RemoveColumns(#"Extend JSON Block",{"AuditData"}),
#"Column isAnonymous" = Table.AddColumn(#"Remove Columns", "isAnonymous", each if Text.StartsWith([UserId], "urn:forms:anonymous") then true else false, type logical),
#"Sort Rows" = Table.Sort(#"Column isAnonymous",{{"CreationDate", Order.Descending}}),
#"SourceApp Mapping" = fBulkReplace(#"Sort Rows", tblSourceAppMapping, {"JSON Block.SourceApp"}),
#"FormsUserType Mapping" = fBulkReplace(#"SourceApp Mapping", tblFormsUserType, {"JSON Block.FormsUserType"})
in
#"FormsUserType Mapping"

 

  • The JSON part splits a JSON part from the CSV source into separate columns for PBI, then removes the source column to keep the query simple and clean.
  • An extra column isAnonymous is created to help identify anonymou replies.
  • The last two actions replace values with mapped values from tables via a custom function fBulkReplace

 

Is there a way to put your nice code into this one?

I'm stuck.... 😐

 

You write, in part, that you couldn't get my code to run. What does that mean? When you pasted it into a blank query, did you not even get an error message? What happened?

When I paste it 1:1 in a new query I just get the test data.

When I use the part after your source-row it runs forever and nothing happens.

 

s posted above I already hav a pre-filtered query which just needs this last step to get rid of the unwanted CreateResponse parts.

 

Atm my table looks like this:

FabvE_2-1729782953250.png

 

Sorry for so many questions and postings. 😐 

 

 

What you need to do is run your own query until you have a table which looks like your data table that you posted in your original question. Then, in the query I provided, you should be able to reference your own query in the Source line.

 

eg:  Source = #"Your Query Name"

 

 

Well I combined your code with mine and it's like this now:

let
Quelle = Table.Combine({#"2023-10_12", #"2024-01_05", #"2024-06"}),
#"Remove Duplicates" = Table.Distinct(Quelle, {"RecordId"}),
#"Get JSON" = Table.AddColumn(#"Remove Duplicates", "JSON Block", each Json.Document([AuditData])),
#"Extend JSON Block" = Table.ExpandRecordColumn(#"Get JSON", "JSON Block", {"FormId", "FormName", "SourceApp", "FormsUserType", "OrganizationId"}, {"JSON Block.FormId", "JSON Block.FormName", "JSON Block.SourceApp", "JSON Block.FormsUserType", "JSON Block.OrganizationId"}),
#"Remove Columns" = Table.RemoveColumns(#"Extend JSON Block",{"AuditData"}),
#"Column isAnonymous" = Table.AddColumn(#"Remove Columns", "isAnonymous", each if Text.StartsWith([UserId], "urn:forms:anonymous") then true else false, type logical),
#"Sort Rows" = Table.Sort(#"Column isAnonymous",{{"CreationDate", Order.Descending}}),
#"SourceApp Mapping" = fBulkReplace(#"Sort Rows", tblSourceAppMapping, {"JSON Block.SourceApp"}),
#"FormsUserType Mapping" = fBulkReplace(#"SourceApp Mapping", tblFormsUserType, {"JSON Block.FormsUserType"}),
#"Remove Rows" = Table.SelectRows(#"FormsUserType Mapping",(r)=>
r[Operation] = "CreateForm"
or (r[Operation] = "CreateResponse"
and List.Contains(Table.SelectRows(#"FormsUserType Mapping", each [JSON Block.FormId]=r[JSON Block.FormId])[Operation],"CreateForm")))
in
#"Remove Rows"

 

It's running over 30mins now and doesn't seem to end...

 

When I select the step before (FormsUserType Mapping) it's quiet fast but the last step doesn't come to an end.

 

 

But that step "FormsUserType Mapping" is not part of the code I supplied. It's something you've developed; it refers to a function you've developed, so I don't know how to help.

 

As I suggested above, it would be better (at least for troubleshooting) to split the queries and have mine just refer to yours. Then, if mine is still running slowly, you could supply a more realistic example so we can see where the hangup is.

Ok, what does that base64 conversion do?

 

And you said:


@ronrsnfld wrote:

I reversed your logic so we instead select rows where, for the same formId 

  • keep rows where operation = "CreateForm"
  • if operation = "CreateResponse" then there must also be another entry with "CreateForm"

 


It must be the other way round. 🙂

If there's an CreateForm there should be a CreateResponse (as a freshly created survey can have zero answers). Therefore keep all CreateForm.

But there can be formIds with CreateResponse where there's no CreateForm - formId. These should be removed.

The Source line, where you see the Base64 conversion, is what you get when you paste the data into a table in PowerBI/PowerQuery. You should replace the Source line with your own source. You can read the M code documentation for the Binary.FromText function, but that Base64 argument merely determines the type of Binary encoding to be used.

 

And either you didn't run the code, or I misunderstood your before and after.

 

Source Step:

ronrsnfld_0-1728947332723.png

And you wrote: remove rows 1,4 and keep 2,3,5,6,7

 

After running code:

ronrsnfld_1-1728947421260.png

Did I misunderstand what you wanted?  Or did you not run the code to see what it did?

 

If I misunderstood, what rows did you wish to keep? As the code I provided clearly keeps 2,3,5,6 and 7

 

Yes, your 2nd screenshot is correct. I'll give it a try, I'm on holidays atm and don't have time to test on my business device. 😄

Helpful resources

Announcements
October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Kudoed Authors