Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Compete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.

Reply
masplin
Impactful Individual
Impactful Individual

Tricky text extraction in Power Query

Hi

 

I have a data string [Latest Financials] that looks like this

 

{'vat': ['20.00', '85.26'], 'share': {'price': '426.31', 'num_parties': 4}, 'total': '511.57', 'subtotal': '1705.24', 'drug_test_fees': ['646.00', [['Hair strand testing x4 sections (3 scalp, 1 body)', ['Cannabis', 'Cocaine', 'Opiates', 'Amphetamine', 'MDMA (Ecstasy)'], '646.00']]], 'collection_info': [[None, '145.00']], 'expert_drug_fee': '131.25', 'expert_drug_rate': '105.00', 'alcohol_test_fees': ['670.99', [['Blood testing', ['CDT'], '95.00'], ['Hair strand testing x2 sections (1 scalp, 1 body)', ['FAEE'], '250.99'], ['Hair strand testing x2 sections (1 scalp, 1 body)', ['EtG'], '325.00']]], 'expert_alcohol_fee': '112.00', 'expert_alcohol_rate': '105.00', 'expert_drug_minutes': 75, 'subtotal_expert_fee': '243.25', 'expert_alcohol_minutes': 64, 'collection_info_subtotal': '145.00', 'expert_tests_drugs_subtotal': '777.25', 'expert_tests_alcohol_subtotal': '782.99'}

 

I need to extract the 646 that appears after the drug_test_fees bit and several other similar numbers further along. I know how to do this in DAX but not in power Query . Some rows have no entry for drug test fees and it can be in various positions in the string so has to be done by text matching. 

 

I got as far as this which sort of works, but the 646 could be 4,5 or 6 digitts long so woudl ike ot make it variable. Essentailly I need to calcuate how many charachters are between [ and , surrounding that number. In DAX you would search for the first comma after the drug_test_fees, but can't work out how to to Text.PositionOf starting from drug_test_fees?

 

= Table.AddColumn(#"Removed Other Columns", "Drug Test Fee", each if Text.PositionOf([Latest Financials],"drug_test_fees")>0 then Text.Range([Latest Financials],Text.PositionOf([Latest Financials],"drug_test_fees")+18,6) else null)

 

 

Also this text is in a CSV file andwhen I pull it into Power Query it looks slightly different on that the ['646' is now  [646.

 

{'vat': [20.0, 74.51], 'share': {'price': 372.56, 'num_parties': 4}, 'total': 447.07, 'subtotal': 1490.25, 'drug_test_fees': [646.0,

 

Maybe there is an easier way to do this. 

 

Appreciate any advice

Mike

1 ACCEPTED SOLUTION
bhanu_gautam
Super User
Super User

@masplin < Try using this m-code in custom column

 

= Table.AddColumn(#"PreviousStepName", "Drug Test Fee", each
let
sourceText = [Latest Financials],
startPos = Text.PositionOf(sourceText, "drug_test_fees"),
afterKey = Text.Middle(sourceText, startPos + Text.Length("drug_test_fees") + 4),
feeText = Text.BeforeDelimiter(afterKey, ",")
in
if startPos > 0 then Text.Trim(Text.Select(feeText, {"0".."9", "."})) else null
)




Did I answer your question? Mark my post as a solution! And Kudos are appreciated

Proud to be a Super User!




LinkedIn






View solution in original post

2 REPLIES 2
bhanu_gautam
Super User
Super User

@masplin < Try using this m-code in custom column

 

= Table.AddColumn(#"PreviousStepName", "Drug Test Fee", each
let
sourceText = [Latest Financials],
startPos = Text.PositionOf(sourceText, "drug_test_fees"),
afterKey = Text.Middle(sourceText, startPos + Text.Length("drug_test_fees") + 4),
feeText = Text.BeforeDelimiter(afterKey, ",")
in
if startPos > 0 then Text.Trim(Text.Select(feeText, {"0".."9", "."})) else null
)




Did I answer your question? Mark my post as a solution! And Kudos are appreciated

Proud to be a Super User!




LinkedIn






Thats great but just to understand

 

Start pos find where the "d" is

afterkey is text everythnig to right of drug_test_fees + 4 charachters? so captures the numerical value

feeText  This is now the bit 646.00', [['Hair strand testing x4 sections (3 scalp, 1 body)', ['Cannabis', 'Cocaine', 'Opiates', 'Amphetamine', 'MDMA (Ecstasy)'], '646.00']]], 'coll etc   so looks left of the first comma and trims it

 

Text.select only picks out numbers and "." 

 

Genius thanks so much

Helpful resources

Announcements
July 2025 community update carousel

Fabric Community Update - July 2025

Find out what's new and trending in the Fabric community.

July PBI25 Carousel

Power BI Monthly Update - July 2025

Check out the July 2025 Power BI update to learn about new features.

Top Solution Authors