March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
I'm trying to remove all the substrings from the "substring" list, in column [column1] by adding the following custom column:
List.Accumulate(substring, [Column1],(state, current) => Text.Replace(state, substring{current}, ""))
but i get the follow error:
Expression.Error: We cannot apply indexing to the type List.
note that I'm not using Text.remove because it only works for single characters, not strings.
In the same sense, I also want a function that keeps me ONLY the substrings of the list, which could be done with text.select if instead of strings they were unique characters.
So once I have the desired function with list.accumulate (or whatever if there is a better one), I plan to conceal it twice as shown here to keep only those strings. Even so, it seems to me that there has to be a more efficient way, if you come up with something better, please let me know.
In short, I need an efficient way to remove the strings from a list from one column and another to keep only those strings.
Solved! Go to Solution.
Hello @carlpaul153
check out this solution. this code replaces all substrings defined in the step SubstringList transforms the column1 into a list with two items. first is new value, second is the removed part. Then duplicated the column and transform them again. Extracting in the column1 the first item, in the changes-column the 2nd item
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WckxKVorViVZKSU0D0+kZmWC6AgTArCoQUIqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
SubstringList = {"bc", "xx", "zzz"},
TranformColumn1 = Table.TransformColumns
(
#"Changed Type",
{
{
"Column1",
(cell)=> List.Accumulate(SubstringList, {cell, ""}, (old, current)=>
let
CreateNew = Text.Replace(old{0},current,""),
CreateSubstringsRemoved = try Text.Combine(List.Difference(Text.ToList(old{0}), Text.ToList(CreateNew))) otherwise ""
in
{CreateNew, if CreateSubstringsRemoved <> "" and old{1}<> "" then old{1} & ", " & CreateSubstringsRemoved else if old{1} <>"" and CreateSubstringsRemoved = "" then old{1} else CreateSubstringsRemoved }
)
}
}
),
Duplicated = Table.DuplicateColumn(TranformColumn1, "Column1", "changes"),
Final = Table.TransformColumns
(
Duplicated,
{
{
"Column1",
each _{0}
},
{
"changes",
each _{1}
}
}
)
in
Final
Copy paste this code to the advanced editor in a new blank query to see how the solution works.
If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too
Have fun
Jimmy
Thanks @carlpaul153 for the question, @Anonymous for comment and a solution, and @Jimmy801 for the code above.
I do something very similar. I am processing e-mails to count topics defined by the subject line. I want to remove duplicates - e-mails that contain either "FW: " or "RE:" ("Fw: " and "Re:"), place the trimmed subject in a new column, and then use 'Table.Distinct' on the new column to remove duplicates.
I have adapted @Jimmy801 code to a 'Table.Addcolumn'. Here is the relevant code - [Subject] is the column with the subject line from the e-mails. I found that 'List.Accumulate' returned a list, the first element was the processed code, the second was from the substring list. Consequence of using @Jimmy801's code as his does a few extra things.
SubstringList = {"FW: ", "Fw: ", "RE: ", "Re: "},
AddTitle = Table.AddColumn(SetTypeCompactTable, "Title", each
List.Accumulate( SubstringList, {[Subject], ""}, (old, current) =>
let
CreateNew = Text.Replace(old{0}, current,""),
CreateSubstringsRemoved =
try Text.Combine (List.Difference(Text.ToList (old{0}), Text.ToList(CreateNew) ) )
otherwise ""
in
{CreateNew, if CreateSubstringsRemoved <> "" and old{1} <> "" then
old{1} & ", " & CreateSubstringsRemoved
else
if old{1} <> "" and CreateSubstringsRemoved = "" then
old{1}
else
CreateSubstringsRemoved
}
) {0}, Text.Type ),
This works.
@CNENFRNL I appreciate your effort. The problem with your solution is that for each row the function returns a list, when the expected result is a text.
@Jimmy801 thank you very much. Honestly, I've found a way to deal with the problem in another way that works for me, so I haven't been able to test your algorithm. Either way I mark it as a solution to serve others. If anyone finds an error in it, please report it.
@Anonymous I appreciate your effort to understand the problem with List.Accumulate. As I mentioned to Jimmy I already managed to solve the problem, but answering your question, substring is not a list of lists, just a list of texts.
hi @carlpaul153
about your answer:
"@Rocco_sprmnt21 I appreciate your effort to understand the problem with List.Accumulate. As I mentioned to Jimmy I already managed to solve the problem, but answering your question, substring is not a list of lists, just a list of texts."
let me have doubts about what you say.
While waiting for you to document that type of error that you mentioned in correspondence with the code you posted in the first message, I explain the reasons for my doubts, hoping that they will be useful.
the variable substring=list of string as you claim:
the error message you got with your code and string list is different from the one you mentioned.
If you had modified your code as suggested by me (it was enough to delete only a small part of the text), you would have obtained this result:
if instead the variable substring contains lists inside it:
you get exactly the error you mentioned:
"List.Accumulate(substring, [Column1],(state, current) => Text.Replace(state, substring{current}, ""))
but i get the follow error:
Expression.Error: We cannot apply indexing to the type List."
PS
Apart from all these details, the solution with list.accumulate is very beautiful and elegant.
Hi @carlpaul153
the error you got with the following expression
List.Accumulate(substring, [Column1],(state, current) => Text.Replace(state, substring{current}, ""))
is most likely due to the following fact.
Mine, for lack of complete information, is an attempt to reconstruct what you got based on what I know to be how the List.Accumulate function works.
the third parameter of the function, which is itself a function, takes the value of the avriable "current" from the list which is the first parameter of the function, in our case "substring".
If, as it likely is but we are waiting for your confirmation, the substring list is a list of lists, then you get the error you reported.
The syntactically correct form would be the following:
List.Accumulate(substring, [Column1],(state, current) => Text.Replace(state, current, ""))
in this way, in each step, you replace the i-th substring with "" inside the string contained in the current row of column1.
But this, if my supposition is correct, is not enough.
You probably need to change the structure of the list as well:
substring, which must be a list of string, like this:
{"ad", "b","hg"}
not a list of lists,
like this
{{"ad"}, {"b"},"hg"}
But all this is my guess.
Hello @carlpaul153
check out this solution. this code replaces all substrings defined in the step SubstringList transforms the column1 into a list with two items. first is new value, second is the removed part. Then duplicated the column and transform them again. Extracting in the column1 the first item, in the changes-column the 2nd item
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WckxKVorViVZKSU0D0+kZmWC6AgTArCoQUIqNBQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
SubstringList = {"bc", "xx", "zzz"},
TranformColumn1 = Table.TransformColumns
(
#"Changed Type",
{
{
"Column1",
(cell)=> List.Accumulate(SubstringList, {cell, ""}, (old, current)=>
let
CreateNew = Text.Replace(old{0},current,""),
CreateSubstringsRemoved = try Text.Combine(List.Difference(Text.ToList(old{0}), Text.ToList(CreateNew))) otherwise ""
in
{CreateNew, if CreateSubstringsRemoved <> "" and old{1}<> "" then old{1} & ", " & CreateSubstringsRemoved else if old{1} <>"" and CreateSubstringsRemoved = "" then old{1} else CreateSubstringsRemoved }
)
}
}
),
Duplicated = Table.DuplicateColumn(TranformColumn1, "Column1", "changes"),
Final = Table.TransformColumns
(
Duplicated,
{
{
"Column1",
each _{0}
},
{
"changes",
each _{1}
}
}
)
in
Final
Copy paste this code to the advanced editor in a new blank query to see how the solution works.
If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too
Have fun
Jimmy
@carlpaul153 Consider using Text.Replace and replace with an empty space.
Text.Replace([Column1], "100", "")
That will turn "This Is The Number 100." to "This Is the Number ."
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingThe procedure to replace each substring in a list is to be defined as a function. Pls refer to the code as follows,
let
fn = (Str as text, SubstrList as list) =>
let
Replacing = List.Generate(
() => [str = Str, counter = 0],
each [counter] <= List.Count(SubstrList),
each [str = Text.Replace([str], SubstrList{[counter]}, ""), counter = [counter] + 1],
each Text.Trim([str])
)
in
Replacing,
SubstrList = {"Is", "This", "The"},
Str = "This Is The Number 100.",
Invocation = fn(Str, SubstrList)
in
Invocation
result reads like this,
List |
This Is The Number 100. |
This The Number 100. |
The Number 100. |
Number 100. |
Thanks to the great efforts by MS engineers to simplify syntax of DAX! Most beginners are SUCCESSFULLY MISLED to think that they could easily master DAX; but it turns out that the intricacy of the most frequently used RANKX() is still way beyond their comprehension! |
DAX is simple, but NOT EASY! |
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.