Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, the Microsoft Fabric Community team will be offering free DP-600 exam vouchers. Prepare now

Reply
dhruvgulati
Helper I
Helper I

filtering and saving into new column

I have 2 column 1st. has strings 2nd column is unique set of words.
col1 has 2lac entries
col2 has 10000 words
3rd col is blank

i want if (col1 row1)contains any word from col2 -----concate that word in col3 with space delimiter (result should be in saw row as col1)
similarly for col1 row2,col1 row3...till end
I am a java devloper how do we nest loops in power query?
i am new to power query and have been restricted only to excel
your help would be really appreciated 
thanks

1 ACCEPTED SOLUTION

You will need to remove complete Source line in my code and replace that with your source line. _t are generated when you use PQ's Enter Data feature to generate data rather than pulling from a source.

Use the below new code for your task

let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
    BuffList1 = List.Buffer(Source[Column1]),
    BuffList2 = List.Buffer(List.RemoveNulls(List.ReplaceValue(Source[Column2],"",null,Replacer.ReplaceValue))),
    BuffList1Count = List.Count(BuffList1),
    GenList = List.Generate(()=>[x=Text.Combine(List.Select(BuffList2,(a)=>List.Contains(Text.Split(BuffList1{0}," "),a, Comparer.OrdinalIgnoreCase)),", "),i=0], each [i]<BuffList1Count, each [i=[i]+1, x=Text.Combine(List.Select(BuffList2,(a)=>List.Contains(Text.Split(BuffList1{i}," "),a, Comparer.OrdinalIgnoreCase)),", ")], each [x]),
    Result = Table.FromColumns(Table.ToColumns(Source)&{GenList},Table.ColumnNames(Source)&{"Result"})
in
    Result

View solution in original post

18 REPLIES 18
dhruvgulati
Helper I
Helper I

over 51 views no help pls contribute

Some sample data is needed to provide the required solution.

13m ago

expected output
col1   col2   col3

Hi I am having issue with this product       product     product issue
Error productissueproduct
defective product product
issue report  issue
I am having triouble  

See the working here - Open a blank query - Home - Advanced Editor - Remove everything from there and paste the below code to test

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("Tc1RCoAgDAbgqwyfu0ZQZxAfTFcOKmXOun6kCL6N/9v+aa0WghXsBcE+dB9AOReElySABMqQOPriRE2qT2bSamaOPFi9quJxRyf04KDCsWxn89bPmCL/VLPxf9/9yXw=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
    #"Replaced Value" = Table.ReplaceValue(Source,"",null,Replacer.ReplaceValue,{"Column2"}),
    BuffList = List.Buffer(List.RemoveNulls(#"Replaced Value"[Column2])),
    #"Added Custom1" = Table.AddColumn(#"Replaced Value", "Result", each Text.Combine(List.Intersect({BuffList,Text.Split([Column1]," ")},Comparer.OrdinalIgnoreCase),", "))
in
    #"Added Custom1"

 

@Vijay_A_Verma  hi i want to use it in my current workbook it is giving error
token identifier expected 
if i remove line-------let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),-------
then it works and start generating endless errors 


let
Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),

let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t]),
#"Replaced Value" = Table.ReplaceValue(#"Changed Type","",null,Replacer.ReplaceValue,{"Column2"}),
BuffList = List.Buffer(List.RemoveNulls(#"Replaced Value"[Column2])),
#"Added Custom1" = Table.AddColumn(#"Replaced Value", "Result", each Text.Combine(List.Intersect({BuffList,Text.Split([Column1]," ")},Comparer.OrdinalIgnoreCase)," "))
in
#"Added Custom1"

Use this

let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type","",null,Replacer.ReplaceValue,{"Column2"}),
    BuffList = List.Buffer(List.RemoveNulls(#"Replaced Value"[Column2])),
    #"Added Custom1" = Table.AddColumn(#"Replaced Value", "Result", each Text.Combine(List.Intersect({BuffList,Text.Split([Column1]," ")},Comparer.OrdinalIgnoreCase)," "))
in
    #"Added Custom1"

this is running endless col1 has 18k rows and col2 has 3k,

it loaded like 2.9L with over 2.5Lac errors

 

Since you have this in Excel, is it possible for you to share your workbook alongwith data and query?

i am sorry working on company confidential data
i can complete this in java easy peasy but they have restricted me to query have to learn from scratch.
In second solution you removed let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t, Column2 = _t])
what does this line do?

after giving the desired output it is running endless loop
for big data take any string dump in col

take half dictonary in col 2 and try


i just copy pasted this

let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}, {"Column2", type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type","",null,Replacer.ReplaceValue,{"Column2"}),
    BuffList = List.Buffer(List.RemoveNulls(#"Replaced Value"[Column2])),
    #"Added Custom1" = Table.AddColumn(#"Replaced Value", "Result", each Text.Combine(List.Intersect({BuffList,Text.Split([Column1]," ")},Comparer.OrdinalIgnoreCase)," "))
in
    #"Added Custom1"

 

You will need to remove complete Source line in my code and replace that with your source line. _t are generated when you use PQ's Enter Data feature to generate data rather than pulling from a source.

Use the below new code for your task

let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
    BuffList1 = List.Buffer(Source[Column1]),
    BuffList2 = List.Buffer(List.RemoveNulls(List.ReplaceValue(Source[Column2],"",null,Replacer.ReplaceValue))),
    BuffList1Count = List.Count(BuffList1),
    GenList = List.Generate(()=>[x=Text.Combine(List.Select(BuffList2,(a)=>List.Contains(Text.Split(BuffList1{0}," "),a, Comparer.OrdinalIgnoreCase)),", "),i=0], each [i]<BuffList1Count, each [i=[i]+1, x=Text.Combine(List.Select(BuffList2,(a)=>List.Contains(Text.Split(BuffList1{i}," "),a, Comparer.OrdinalIgnoreCase)),", ")], each [x]),
    Result = Table.FromColumns(Table.ToColumns(Source)&{GenList},Table.ColumnNames(Source)&{"Result"})
in
    Result

Screenshot (150).png
col 1 has 18438 col2 has 3567 
while selcting table i have selected these only, they dont have blanks or nulls.

but after showing correct output in query editior when i close and load it doesnt stop running
i just copy pasted the entire code changed table number

HAPPY 75th India Independence Day to all


It worked it was my mistake i was referencing data on which i already performed another query. 

Another issue

it is skipping words like which are enclosed with doublequotes 

example

"Demo

Insert/demo

 

if i try to remove all special character from the column and replace them with space

then words like can't converts into can t then t gets skip

I need to see few samples for column1 and column2 containing these kind of words...

Screenshot (152).png

Use this

let
    Source = Excel.CurrentWorkbook(){[Name="Table5"]}[Content],
    CharsList = {"A".."Z","a".."z"," "},
    BuffList1 = List.Buffer(Source[Column1]),
    BuffList2 = List.Buffer(List.RemoveNulls(List.ReplaceValue(Source[Column2],"",null,Replacer.ReplaceValue))),
    BuffList1Count = List.Count(BuffList1),
    GenList = List.Generate(()=>[x=Text.Combine(List.Select(BuffList2,(a)=>List.Contains(Text.Split(Text.Select(BuffList1{0},CharsList)," "),a, Comparer.OrdinalIgnoreCase)),", "),i=0], each [i]<BuffList1Count, each [i=[i]+1, x=Text.Combine(List.Select(BuffList2,(a)=>List.Contains(Text.Split(Text.Select(BuffList1{i},CharsList)," "),a, Comparer.OrdinalIgnoreCase)),", ")], each [x]),
    Result = Table.FromColumns(Table.ToColumns(Source)&{GenList},Table.ColumnNames(Source)&{"Result"})
in
    Result

Thank you so much You saved my day if you have free time can you make a youtube video or a comment thread explaing the working 

Helpful resources

Announcements
OCT PBI Update Carousel

Power BI Monthly Update - October 2024

Check out the October 2024 Power BI update to learn about new features.

September Hackathon Carousel

Microsoft Fabric & AI Learning Hackathon

Learn from experts, get hands-on experience, and win awesome prizes.

October NL Carousel

Fabric Community Update - October 2024

Find out what's new and trending in the Fabric Community.