Get certified in Microsoft Fabric—for free! For a limited time, the Microsoft Fabric Community team will be offering free DP-600 exam vouchers. Prepare now
I find plenty of posts on how to replace things in a certain way but not what I need to do. I have a column that is an extract from devops which, when I bring it into Excel, it has all the html commands in it, all of which start and end in '<' and '>' like <div> or </div><div><div style="box-sizing:border-box;"> etc.. all splattered throughout the cell. In excel it is easy, find <*> and replace with a space. How can I do this in power query. I am not an m coder so would need it dumbed down.
Solved! Go to Solution.
this
let
fx = (txt) => Text.Combine(
List.Alternate(
List.Select(Text.SplitAny(txt, "<>"), (x) => x <> ""),
1,
1,
Number.From(not Text.StartsWith(txt, "<"))
),
" "
),
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
result = Table.TransformColumns(Source, {"Column1", fx})
in
result
replaces <, > and everything in between with space. If you don't need space - simply remove " " in Text.Combine call.
Hi,
Thanks for the solutions everyone offered, and i want to offer some more information for user to refer to.
hello @BZimber , you can create a blank query, and put the following code to advanced editor in power query.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wiik1MDBOTsksAzNSE5OSISL6cKGU1DSlWB2YSoQ4mk5DI2MTUzO4oEJxSWVOqm2MUlJ+hW5xZlVmXrpVUn5RSmqRLlDEOgZiXKpSbCwA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each let a=Text.Split([Column1],">"),
b=List.Transform(a,each if Text.Contains(_,"<") then Text.BeforeDelimiter(_,"<") else _),
c=Text.Combine(b," ")
in c),
#"Trimmed Text" = Table.TransformColumns(#"Added Custom",{{"Custom", Text.Trim, type text}})
in
#"Trimmed Text"
Output
Best Regards!
Yolo Zhu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
The Text.BetweenDelimiters might be helpfull
It might very well be helpful. However I am a new user and I don't have the first clue on what to do or how to use that function as I am not to the level of coding. I am learning via videos using the built in function buttons that create code. I am not to the level yet to speak 'm' code.
imagine your data are on a coluumn namely X, then add a new custom column namely X2 by the next formula to extract the text between <>.
=Text.BetweenDelimiter([X],"<",">")
then replace it by your desired text by using Text.Replace
You could use regular expressions.
Create a blank query
Paste the code below into the Advanced Editor.
Then create your custom column using "invoke custom function"
In the Advanced Editor, the code would look like:
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each fnRegexReplace([Column1],"<[^>]*>"," "))
I made several attempts but failed each time. I am new to power query so when providing concepts like that I fall flat on my face becuase I don't have the experience to intuitively fill int he blanks. I went to the query editor then New Source>Other Sources>Blank Query. Then to Advanced Editor which gave me:
let
Source = ""
in
Source
Then I pasted in your string after 'Source = ""' and had to put a comma at the end of 'Source = ""' to resolve a token error thing. It then looked like:
let
Source = "",
#"Added Custom" = Table.AddColumn(#"Changed Type", "Custom", each fnRegexReplace([Column1],"<[^>]*>"," "))
in
Source
Then I went to Add Column but all the buttons were greyed out.
So by now you probably can tell that I like the idea, but I am in over my head. But the question becomes, even if I go the function to work in the blank query, how do I get it to work with my data in the original query?
Not your fault at all.
I neglected to provide the code you needed to paste into the blank query.
🤦
//Rename Query: fnRegexReplace
(text as nullable text,pattern as nullable text,replace as nullable text, optional flags as nullable text) =>
let
f=if flags = null or flags ="" then "" else flags,
l1 = List.Transform({text, pattern, replace}, each Text.Replace(_, "\", "\")),
l2 = List.Transform(l1, each Text.Replace(_, "'", "\'")),
t = Text.Format("<script>var txt='#{0}';document.write(txt.replace(new RegExp('#{1}','#{3}'),'#{2}'));</script>", List.Combine({l2,{f}})),
r=Web.Page(t)[Data]{0}[Children]{0}[Children],
Output=if List.Count(r)>1 then r{1}[Text]{0} else ""
in Output
After you do this, you should be able to add that column in your main query.
Sorry about that.
This solution mostly worked. It seems like so long as the text in the field started with '<' then it worked. But if there was text before the first occurance, it just wiped out all text.
Hi @BZimber ,
Have you already looked through this post by @Rickmaurinus, yet?
https://gorilla.bi/power-query/wildcards/
Also, the following approach might be helping you out here as well:
https://www.tackytech.blog/how-to-replace-several-characters-in-a-column-in-just-one-power-query-ste...
You would still need to specify each of those funky html tags, though...
Hope this helps 🙂
/Tom
https://www.tackytech.blog/
https://www.instagram.com/tackytechtom/
Did I answer your question❓➡️ Please, mark my post as a solution ✔️ |
Also happily accepting Kudos 🙂 |
Feel free to connect with me on LinkedIn! |
#proudtobeasuperuser |
Yes I saw the first one and couldn't figure out how to adapt it. I am not trying to clean 'pre-determined' values. I am trying to turn everything between '<' and '>' to a blank. For example <what ever html code here> to a space. There will be multiple of those html instructions in each cell. I am a very new to power query and everyone giving ideas is operating from a much higher experience level than myself which is why I cannot just apply general concepts to my case.
this
let
fx = (txt) => Text.Combine(
List.Alternate(
List.Select(Text.SplitAny(txt, "<>"), (x) => x <> ""),
1,
1,
Number.From(not Text.StartsWith(txt, "<"))
),
" "
),
Source = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
result = Table.TransformColumns(Source, {"Column1", fx})
in
result
replaces <, > and everything in between with space. If you don't need space - simply remove " " in Text.Combine call.
Check out the October 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
63 | |
56 | |
27 | |
17 | |
13 |