Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
Anonymous
Not applicable

extract every occurrence of the text between two delimeters

I have an array column that Power BI reads as a string. An example of a record in this column is below:

 

"[{""cladding_type"":""Cement Fibre"",""ACM_ind"":false,""cladding_manufacturer"":""Celotex"",""insulation_type"":""Other"",""insulation_manufacturer"":""Other""},{""cladding_type"":""Metal composite panel - Unknown"",""ACM_ind"":false,""cladding_manufacturer"":""Valcan (aluminium / fibre cement)"",""insulation_type"":""Rigid (thermoset) foam - Polyisocyanurate (PIR)"",""insulation_manufacturer"":""Celotex""},{""cladding_type"":""Other"",""ACM_ind"":false,""cladding_manufacturer"":""Celotex"",""insulation_type"":""Rigid (thermoset) foam - Polyisocyanurate (PIR)"",""insulation_manufacturer"":""Recticel""}]"

 

I would like to extract every cladding type here. Specifically the text between every occurence of the following delimiters: """cladding_type"":""", """ and ""ACM_ind"":"

 

The output from the above should be something like:

Cement Fibre, Metal composite panel - Unknown, Other

1 ACCEPTED SOLUTION

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table6"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"col1", type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type","""[","[",Replacer.ReplaceText,{"col1"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","]""","]",Replacer.ReplaceText,{"col1"}),
    #"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1","""""","""",Replacer.ReplaceText,{"col1"}),
    #"Added Custom" = Table.AddColumn(#"Replaced Value2", "Custom", each Json.Document([col1])),
    #"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Custom", each Text.Combine(List.Transform(_, (x) => Text.From(x[cladding_type])), ","), type text})
in
    #"Extracted Values"

 

View solution in original post

9 REPLIES 9
ThxAlot
Super User
Super User

Obviously, it's a JSON string,

let
    Source = "[{""cladding_type"":""Cement Fibre"",""ACM_ind"":false,""cladding_manufacturer"":""Celotex"",""insulation_type"":""Other"",""insulation_manufacturer"":""Other""},{""cladding_type"":""Metal composite panel - Unknown"",""ACM_ind"":false,""cladding_manufacturer"":""Valcan (aluminium / fibre cement)"",""insulation_type"":""Rigid (thermoset) foam - Polyisocyanurate (PIR)"",""insulation_manufacturer"":""Celotex""},{""cladding_type"":""Other"",""ACM_ind"":false,""cladding_manufacturer"":""Celotex"",""insulation_type"":""Rigid (thermoset) foam - Polyisocyanurate (PIR)"",""insulation_manufacturer"":""Recticel""}]",
    #"Parsed JSON" = Json.Document(Source),
    #"Extracted cladding_type" = Table.FromRecords(#"Parsed JSON")[cladding_type]
in
    #"Extracted cladding_type"

ThxAlot_0-1700518951236.png



Expertise = List.Accumulate(


        {Days as from Today},


        {Skills and Knowledge},


        (Current, Everyday) => Current & Day.LeanAndPractise(Everyday)


)



spinfuzer
Super User
Super User

Reformat your text so it is in JSON format and then you can use Json.Document([column name]), expand, and filter.

 

Replace Value "[ with [

Replace Value "] with ]

Replace value "" with "

 

Then add custom column with formula =Json.Document([Column]).

 

spinfuzer_0-1700517754731.png

 

Anonymous
Not applicable

Hi spinfuzer, thank you so much for replying promptly. Looks like this works mostly, but I would instead like each cladding type to appear in one row. in other words where there are three rows this should only be 1 row with each cladding type shown either in seperate columns or a merged column. 

Extract to values instead of new rows.  However, you will get an error and you need to edit the formula to extract the cladding_type column of each record

 

 

= Table.TransformColumns(#"Added Custom", {"Custom", each Text.Combine(List.Transform(_, (x) => Text.From(x[cladding_type])), ","), type text})

 

 

 

 

 

spinfuzer_1-1700518874686.png

 

 

Anonymous
Not applicable

Hi Spinfuzer, this looks like the exact result I need. If you don't mind would you be able to share the m code in the screenshot please? Will make my life a lot easier. I will obviously change the source. Thanks in advance.

 

let
    Source = Excel.CurrentWorkbook(){[Name="Table6"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"col1", type text}}),
    #"Replaced Value" = Table.ReplaceValue(#"Changed Type","""[","[",Replacer.ReplaceText,{"col1"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","]""","]",Replacer.ReplaceText,{"col1"}),
    #"Replaced Value2" = Table.ReplaceValue(#"Replaced Value1","""""","""",Replacer.ReplaceText,{"col1"}),
    #"Added Custom" = Table.AddColumn(#"Replaced Value2", "Custom", each Json.Document([col1])),
    #"Extracted Values" = Table.TransformColumns(#"Added Custom", {"Custom", each Text.Combine(List.Transform(_, (x) => Text.From(x[cladding_type])), ","), type text})
in
    #"Extracted Values"

 

Anonymous
Not applicable

Thank you so much. So nearly there, but I get the error message below (FYI I renamed the column in my table to col1 to match your code). The data source is a databricks table.

 

burnzy007_0-1700521290198.png

 

Your #"Changed Type" step is referencing Source instead of #"Renamed Columns" (the step before).

Anonymous
Not applicable

this is perfect, so many thanks!!! Lifesaver

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.