Herbie
Regular Visitor

Fastest way of checking if column value appears anywhere in another column

Hello,

 

For each row, I want to check if the id appears anywhere in the previous_id column.

 

id    previous_id    is_parent
1     null           True
2     null           True
3     1              True
4     3              True
5     2              False
6     4              False

 

I have done this using List.Contains and List.Distinct, but this step seems slower than it should be for such a simple operation.

 

= Table.AddColumn(#"foo_step", "is_parent", each List.Contains(List.Distinct(#"this_step"[previous_id]), [id]), type logical)

 

Is there a better way to do this in Power Query?

 

Thanks,

Herbie

1 ACCEPTED SOLUTION
Vijay_A_Verma
Super User

1. If you use List.Buffer, it will speed up the operation:

List.Contains(List.Buffer(List.Distinct(#"this_step"[previous_id])), [id])

In my view, List.Distinct is meaningless here; you can omit it.
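For example (a sketch only, using the step names #"foo_step" and #"this_step" from the question; BufferedPrev and #"Added is_parent" are illustrative names), buffering the list in its own step means it is built once rather than potentially once per row:

// Buffer previous_id once in its own step, then reuse the buffered list per row
BufferedPrev = List.Buffer(#"this_step"[previous_id]),
#"Added is_parent" = Table.AddColumn(#"foo_step", "is_parent", each List.Contains(BufferedPrev, [id]), type logical)

Both steps would go inside the existing let expression.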

2. Another way is a self merge, which is also quite fast. Test the code below:

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUVKK1YlWMoIxjIEMQzDLBMgyBrNMgSwjMMsMyDJRio0FAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [id = _t, previous_id = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"id", Int64.Type}, {"previous_id", Int64.Type}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"id"}, #"Changed Type", {"previous_id"}, "Changed Type", JoinKind.LeftOuter),
    #"Expanded Changed Type" = Table.ExpandTableColumn(#"Merged Queries", "Changed Type", {"previous_id"}, {"Found"})
in
    #"Expanded Changed Type"

But ultimately everything depends on your dataset and the amount of data; which approach works better is something you will have to test on your own data.

3. Otherwise, you can use the approach below to make it very fast:

let
    // Sample data embedded in the query; replace Source with your own table
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUVKK1YlWMoIxjIEMQzDLBMgyBrNMgSwjMMsMyDJRio0FAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [id = _t, previous_id = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"id", Int64.Type}, {"previous_id", Int64.Type}}),
    // Buffer both columns in memory so each list is materialized only once
    Listid = List.Buffer(#"Changed Type"[id]),
    Listpid = List.Buffer(#"Changed Type"[previous_id]),
    Cnt = List.Count(Listid),
    // For every id, generate True/False depending on whether it appears in previous_id
    Output = List.Generate(()=>[x=Listid{0},i=0], each [i]<Cnt, each [i=[i]+1, x=Listid{i}], each List.Contains(Listpid,[x])),
    // Append the generated list to the original table as a new "Result" column
    Result = Table.FromColumns(Table.ToColumns(#"Changed Type") & {Output},Table.ColumnNames(#"Changed Type") & {"Result"})
in
    Result
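For comparison only (this is not part of the original reply), the same True/False list could also be built with List.Transform over the buffered ids; whether this or List.Generate is faster would have to be measured on your own data:

// Equivalent per-row membership test expressed with List.Transform
Output = List.Transform(Listid, each List.Contains(Listpid, _))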


3 REPLIES
Why do you consider solution 3 faster than 1? I imagine* that 1 does algorithmically the same as 3.
Did you simply test it, or do you have some algorithmic reasoning?

*The documentation is neither detailed, verbose, nor comprehensive.

Thanks Vijay_A_Verma, all 3 options were faster, and you were right: List.Distinct was meaningless.
