Advance your Data & AI career with 50 days of live learning, dataviz contests, hands-on challenges, study groups & certifications and more!
Get registeredGet Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now
Hi,
I'm new to PowerBI and need some help to extract the data I need from cells.
The content mostly started with a text, followed by one or more space and than a number follows, either with a leading or with no leading letter. Sometimes another text follows the number. I only need to extract the number.
I tried to copy an example but every time, my posts are being categorized as spam so they are not published, yet.
Any help is highly appreciated.
Thanks
Solved! Go to Solution.
Hi @oldrunner,
These should not be simple to extract operations.
After I check the sample data, I find it is not regular and include lots of type samples that required different rules to extract the results. (some of the records has similar text but used different rules)
Here is the custom function 'ReplaceFx' that include some condition processings, you can use it if help: (I haven't added the 'Text' condition due to it having too many scenarios)
(rawText as text) as text =>
let
conditionList = {
//condition GMDN
{
"GMDN",
Function.Invoke(
(sample) =>
if
//check if inlcude number
List.IsEmpty(
List.Select(
Text.ToList(sample),
each Value.Is(Value.FromText(_), type number)
)
)
= false
then
//check if inlcude "/"
if Text.Contains(sample, "/") then
List.First(Text.Split(sample, "/"))
else
sample
else
"",
//get specific item and trim "[]"
{
Text.Trim(
Text.Trim(
List.Last(Text.Split(rawText, " ")),
"["
),
"]"
)
}
)
},
//condition EMDN
{
"EMDN",
List.Last(Text.Split(rawText, " "))
},
//condition UMDNS(ECRI)
{
"UMDNS(ECRI)",
Function.Invoke(
(sample) =>
Text.Start(sample, 2)
& "-"
& Text.End(sample, 3),
{List.Last(Text.Split(rawText, " "))}
)
},
//default return
{
rawText,
rawText
}
},
//get calculated reuslt from the first match item
Result =
List.Select(
conditionList,
each Text.Contains(rawText, _{0})
){0}{1}
in
Result
Sample table and function usage:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZDBbsIwDIZfJcppk6hqJ2naHhFUwGEgdR2gVT1wqIQEpBIw7fVnp2nXA7EU/+n/2Uld13L1sdwKWlplWsmZ7HMzq+UXOZ9vxaLcvDOAWQR5RkQQrxCIUBtGesHI0L+2xlhsyPTidTlo21ezYEKnxiT8Kp/Hfj/u4rpfR4YYP46HItx4AKSwvJM3PU3fpVNQGOs0BeP3xO/W30mOZ6viWDEbCfGNCBqSjEf1rxmab3fVuijFAO9zyHOi+jxtQ2tOhQZ4VIMaAbT0FLG6nh4Pselcd2vvYtm65+kqFu2NBE+IGV+ye57JD5fGFPyf2oehEyo0qCAuwwSCI5vmDw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Raw = _t, Expected = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Raw", type text}, {"Expected", type text}}),
#"Replaced whitespace" = Table.ReplaceValue(#"Changed Type",each [Raw], each if Text.Contains([Raw]," ") then Text.Combine(List.Select(Text.Split([Raw]," "),each _<>"")," ") else [Raw], Replacer.ReplaceText,{"Raw"}),
#"Added Custom" = Table.AddColumn(#"Replaced whitespace", "Extract Value", each ReplaceFx([Raw]))
in
#"Added Custom"
Regards,
Xiaoxin Sheng
Hi @oldrunner,
These should not be simple to extract operations.
After I check the sample data, I find it is not regular and include lots of type samples that required different rules to extract the results. (some of the records has similar text but used different rules)
Here is the custom function 'ReplaceFx' that include some condition processings, you can use it if help: (I haven't added the 'Text' condition due to it having too many scenarios)
(rawText as text) as text =>
let
conditionList = {
//condition GMDN
{
"GMDN",
Function.Invoke(
(sample) =>
if
//check if inlcude number
List.IsEmpty(
List.Select(
Text.ToList(sample),
each Value.Is(Value.FromText(_), type number)
)
)
= false
then
//check if inlcude "/"
if Text.Contains(sample, "/") then
List.First(Text.Split(sample, "/"))
else
sample
else
"",
//get specific item and trim "[]"
{
Text.Trim(
Text.Trim(
List.Last(Text.Split(rawText, " ")),
"["
),
"]"
)
}
)
},
//condition EMDN
{
"EMDN",
List.Last(Text.Split(rawText, " "))
},
//condition UMDNS(ECRI)
{
"UMDNS(ECRI)",
Function.Invoke(
(sample) =>
Text.Start(sample, 2)
& "-"
& Text.End(sample, 3),
{List.Last(Text.Split(rawText, " "))}
)
},
//default return
{
rawText,
rawText
}
},
//get calculated reuslt from the first match item
Result =
List.Select(
conditionList,
each Text.Contains(rawText, _{0})
){0}{1}
in
Result
Sample table and function usage:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("bZDBbsIwDIZfJcppk6hqJ2naHhFUwGEgdR2gVT1wqIQEpBIw7fVnp2nXA7EU/+n/2Uld13L1sdwKWlplWsmZ7HMzq+UXOZ9vxaLcvDOAWQR5RkQQrxCIUBtGesHI0L+2xlhsyPTidTlo21ezYEKnxiT8Kp/Hfj/u4rpfR4YYP46HItx4AKSwvJM3PU3fpVNQGOs0BeP3xO/W30mOZ6viWDEbCfGNCBqSjEf1rxmab3fVuijFAO9zyHOi+jxtQ2tOhQZ4VIMaAbT0FLG6nh4Pselcd2vvYtm65+kqFu2NBE+IGV+ye57JD5fGFPyf2oehEyo0qCAuwwSCI5vmDw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Raw = _t, Expected = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Raw", type text}, {"Expected", type text}}),
#"Replaced whitespace" = Table.ReplaceValue(#"Changed Type",each [Raw], each if Text.Contains([Raw]," ") then Text.Combine(List.Select(Text.Split([Raw]," "),each _<>"")," ") else [Raw], Replacer.ReplaceText,{"Raw"}),
#"Added Custom" = Table.AddColumn(#"Replaced whitespace", "Extract Value", each ReplaceFx([Raw]))
in
#"Added Custom"
Regards,
Xiaoxin Sheng
Dear Xiaoxin Sheng,
Thanks for all the effort. I thought that this might be not too easy.
Unfortunately, when I insert your code, I got a failure message. Still not sure what to use as the DAX formula to put in.
Sorry and thanks
Alex
HI @oldrunner,
They are M query functions and codes, you can't use them in the DAX formula bar. Please enter to query editor to create custom function and query table with those codes to do extract operations.
Notice: I attached the sample pbix below.
Regards,
Xiaoxin Sheng
@oldrunner , In power query transform tab, You have split column, You option to split by delimiter, digit, non digit etc
You can Use that, You can try Text.Split -https://docs.microsoft.com/en-us/powerquery-m/text-split
You can upload data on onedrive or share point and share link
Thanks,
unfortunately, I cannot share outside, so I give it another try.
The data we got are pretty messy:
Here is an example of what I try to accomplish:
| Imported Data | Output required |
| GMDN 32832 | 32832 |
| UMDNS(ECRI) 18-098 | 18-098 |
| UMDNS(ECRI) 10-134 | 10-134 |
| GMDN [64461] | 64461 |
| UMDNS(ECRI) 10036 | 10-036 |
| 37445 | 37445 |
| GMDN unknown | |
| GMDN | |
| EMDN W0101060101 | W0101060101 |
| GMDN 37021/37704/37705/37706 | 37021 |
| TEXT - Z11030582 | Z11030582 |
| ANOTHER TEXT V9099 | V9099 |
| TEXT A030404 | A030404 |
| TEXT 16704 Glass Ionomer Dental Cement | 16704 |
| Other TEXT / / W0103030304 / 1214120/R | W0103030304 |
Thanks
Alex
Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!
Check out the October 2025 Power BI update to learn about new features.