Solved: Tricky text extraction in Power Query

masplin · ‎07-04-2024

Hi

I have a data string [Latest Financials] that looks like this

{'vat': ['20.00', '85.26'], 'share': {'price': '426.31', 'num_parties': 4}, 'total': '511.57', 'subtotal': '1705.24', 'drug_test_fees': ['646.00', [['Hair strand testing x4 sections (3 scalp, 1 body)', ['Cannabis', 'Cocaine', 'Opiates', 'Amphetamine', 'MDMA (Ecstasy)'], '646.00']]], 'collection_info': [[None, '145.00']], 'expert_drug_fee': '131.25', 'expert_drug_rate': '105.00', 'alcohol_test_fees': ['670.99', [['Blood testing', ['CDT'], '95.00'], ['Hair strand testing x2 sections (1 scalp, 1 body)', ['FAEE'], '250.99'], ['Hair strand testing x2 sections (1 scalp, 1 body)', ['EtG'], '325.00']]], 'expert_alcohol_fee': '112.00', 'expert_alcohol_rate': '105.00', 'expert_drug_minutes': 75, 'subtotal_expert_fee': '243.25', 'expert_alcohol_minutes': 64, 'collection_info_subtotal': '145.00', 'expert_tests_drugs_subtotal': '777.25', 'expert_tests_alcohol_subtotal': '782.99'}

I need to extract the 646 that appears after the drug_test_fees bit and several other similar numbers further along. I know how to do this in DAX but not in power Query . Some rows have no entry for drug test fees and it can be in various positions in the string so has to be done by text matching.

I got as far as this which sort of works, but the 646 could be 4,5 or 6 digitts long so woudl ike ot make it variable. Essentailly I need to calcuate how many charachters are between [ and , surrounding that number. In DAX you would search for the first comma after the drug_test_fees, but can't work out how to to Text.PositionOf starting from drug_test_fees?

= Table.AddColumn(#"Removed Other Columns", "Drug Test Fee", each if Text.PositionOf([Latest Financials],"drug_test_fees")>0 then Text.Range([Latest Financials],Text.PositionOf([Latest Financials],"drug_test_fees")+18,6) else null)

Also this text is in a CSV file andwhen I pull it into Power Query it looks slightly different on that the ['646' is now [646.

{'vat': [20.0, 74.51], 'share': {'price': 372.56, 'num_parties': 4}, 'total': 447.07, 'subtotal': 1490.25, 'drug_test_fees': [646.0,

Maybe there is an easier way to do this.

Appreciate any advice

Mike

bhanu_gautam · ‎07-04-2024

@masplin < Try using this m-code in custom column

= Table.AddColumn(#"PreviousStepName", "Drug Test Fee", each
let
sourceText = [Latest Financials],
startPos = Text.PositionOf(sourceText, "drug_test_fees"),
afterKey = Text.Middle(sourceText, startPos + Text.Length("drug_test_fees") + 4),
feeText = Text.BeforeDelimiter(afterKey, ",")
in
if startPos > 0 then Text.Trim(Text.Select(feeText, {"0".."9", "."})) else null
)

Did I answer your question? Mark my post as a solution! And Kudos are appreciated

Proud to be a Super User!

View solution in original post

bhanu_gautam · ‎07-04-2024

@masplin < Try using this m-code in custom column

= Table.AddColumn(#"PreviousStepName", "Drug Test Fee", each
let
sourceText = [Latest Financials],
startPos = Text.PositionOf(sourceText, "drug_test_fees"),
afterKey = Text.Middle(sourceText, startPos + Text.Length("drug_test_fees") + 4),
feeText = Text.BeforeDelimiter(afterKey, ",")
in
if startPos > 0 then Text.Trim(Text.Select(feeText, {"0".."9", "."})) else null
)

Did I answer your question? Mark my post as a solution! And Kudos are appreciated

Proud to be a Super User!

masplin · ‎07-04-2024

Thats great but just to understand

Start pos find where the "d" is

afterkey is text everythnig to right of drug_test_fees + 4 charachters? so captures the numerical value

feeText This is now the bit 646.00', [['Hair strand testing x4 sections (3 scalp, 1 body)', ['Cannabis', 'Cocaine', 'Opiates', 'Amphetamine', 'MDMA (Ecstasy)'], '646.00']]], 'coll etc so looks left of the first comma and trims it

Text.select only picks out numbers and "."

Genius thanks so much

Tricky text extraction in Power Query

Helpful resources

Power BI Monthly Update - November 2025

Fabric Data Days

FabCon Atlanta 2026

How to Get Your Question Answered Quickly

FabCon is coming to Atlanta