Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hi All,
I have done extensive searching and I don't believe this is a repeat, but is definitely and extension of previous questions. I am attempting to extract numbers from a text string within a Power BI function. I have successfully extracted the numbers from the string into a value using the below:
Text.Combine( List.RemoveNulls( List.Transform( Text.ToList([string_col]), each if Value.Is(Value.FromText(_), type number) then _ else null) ) )
Using this code works great when the number I am interested in is the only number in the string, for example:
"Bring on the 1234567 comments" results in 1234567
However, I can't resolve extracting my number when multiple different numbers occur in the string, for example:
"Bring on on the 1234567 comments with 50 telling me this is a repeat" results in 123456750
What I need to do is one pull the number within the string that meets conditions (one in my case). For my particular issue, the number I need to extract will always be the only 7 digit number in the string, so I feel like this should be a more straight forward answer?
Is there a way to extract only the 7 digit number using my provided function or something similar? If I am way off base, can someone please set me on the proper path?
As always, the communities help is greatly appreciated.
Solved! Go to Solution.
Here's one method:
(InputText as text) => let RequiredLength = 7, Digits = {"0".."9"}, CharacterList = Text.ToList(InputText), FirstNumber = List.Accumulate( CharacterList, "", (String,CurrentChar)=> if Text.Length(String) = RequiredLength then String else if List.Contains(Digits,CurrentChar) then String & CurrentChar else "" ) , ReturnValue = if Text.Length(FirstNumber) = RequiredLength then FirstNumber else null in ReturnValue
The function works by taking the characters of InputText from left to right, and building up a string of numbers (stopping when 7 numeric characters are accumulated), otherwise resetting to an empty string when it encounters a non-numeric character.
Here's another idea using table grouping to group consecutive digits together (also a function):
(InputText as text) =>
CharacterList = Text.ToList(InputText), CharacterTable = Table.FromList(CharacterList, Splitter.SplitByNothing(), type table[Character = text], null, ExtraValues.Error), AddedIndex = Table.AddIndexColumn(CharacterTable, "Index", 1, 1), AddedDigitFlag = Table.AddColumn(AddedIndex, "Digit", each List.Contains({"0".."9"},[Character]), type logical), DigitGroups = Table.Group(AddedDigitFlag, {"Digit"}, {{"Number", each Text.Combine(Table.Sort(_,{"Index"})[Character]), type text}}, GroupKind.Local), FilterNumbersLength7 = Table.SelectRows(DigitGroups, each [Digit] = true and Text.Length([Number])=7), FirstNumber = try FilterNumbersLength7{0}[Number] otherwise null in FirstNumber
Another option might be using some R code to find text matching an appropriate regular expression.
Regards,
Owen 🙂
Here's one method:
(InputText as text) => let RequiredLength = 7, Digits = {"0".."9"}, CharacterList = Text.ToList(InputText), FirstNumber = List.Accumulate( CharacterList, "", (String,CurrentChar)=> if Text.Length(String) = RequiredLength then String else if List.Contains(Digits,CurrentChar) then String & CurrentChar else "" ) , ReturnValue = if Text.Length(FirstNumber) = RequiredLength then FirstNumber else null in ReturnValue
The function works by taking the characters of InputText from left to right, and building up a string of numbers (stopping when 7 numeric characters are accumulated), otherwise resetting to an empty string when it encounters a non-numeric character.
Here's another idea using table grouping to group consecutive digits together (also a function):
(InputText as text) =>
CharacterList = Text.ToList(InputText), CharacterTable = Table.FromList(CharacterList, Splitter.SplitByNothing(), type table[Character = text], null, ExtraValues.Error), AddedIndex = Table.AddIndexColumn(CharacterTable, "Index", 1, 1), AddedDigitFlag = Table.AddColumn(AddedIndex, "Digit", each List.Contains({"0".."9"},[Character]), type logical), DigitGroups = Table.Group(AddedDigitFlag, {"Digit"}, {{"Number", each Text.Combine(Table.Sort(_,{"Index"})[Character]), type text}}, GroupKind.Local), FilterNumbersLength7 = Table.SelectRows(DigitGroups, each [Digit] = true and Text.Length([Number])=7), FirstNumber = try FilterNumbersLength7{0}[Number] otherwise null in FirstNumber
Another option might be using some R code to find text matching an appropriate regular expression.
Regards,
Owen 🙂
Hi Owen,
This is great,
I was wondering if you had a solution to extract 2 sets of numbers within the same row;
column x = 1 x 654321 and 2 x 123456
result = 2 columns, column 1= 654321 & column 2 = 123456
so by using your function set to 6 digits extraction, is it possible to extract two sets of (6) occurance digits from the same cell? column x extraction would create two seperate columns one with the first 6 digit number found in column x and the second column with the second occurance of 6 digit number... amended to capture if there were 3 occurance and so on?
cheers,
Hello Owen, is there any way to get this into a Flow action? I tried getting the file, copying your text into a Compose action, and replacing InputText with the Filename from Dynamic Content, but the result in Body looks like it displays your text, not the evaluated function... Thanks In Advance.
The M code I posted earlier can't be directly translated to Flow.
I'm sure there is a way of performing the same actions using Flow functions but I don't know enough to tell you what that would be. Perhaps try posting on the Flow forums.
You can use Power Query actions with a SQL Server data source using the Premium "Transform data using Power Query" action, but I suspect that may not be what you are wanting.
Regards,
Owen
@OwenAuger Thank you, no worries some bright spark in the Flow forum translated my requirement into 2 split functions: split at ' - ' and then split at '-'. For completion (and anyone else stuck on a similar problem), this is the link.
Thank you for that!
If I want to extract only a string that would start with 1 only and have a lenght of 8, what would I need to add? I've been trying a few things but nothing seems to be working.
Thank you!
Hi,
Could you expand a bit more on the possibility of getting this done using R code? I have tried this solution for a similar problem but haven't managed to get it to work. Not sure what I am doing wrong but I am not very familiar with using functions so I'm sure it's something silly.
My problem is slightly different though, I cannot use number of characters as there will be other numbers in there which are the same length as the number I am trying to extract. However, I do know my number will always start with 45313 and always be 10 characters long. Any help is appreciated.
Cheers
Just in case anyone is struggling with a problem similar to mine, I have managed to solve it using a two step solution.
I added a custom column using the following code:
First Column =
if Text.Contains([YourDataColumn], "45313") then Text.PositionOf([YourDataColumn], "45313") else null
I then added a second column which references the first one using this code:
Second Column =
if [FirstColumn]<>null then Text.Range([YourDataColumn], [FirstColumn], 10) else null
The first column calculates the character position of the sequence I know my number will always start with. The second column uses Text.Range which will select a set number of characters (in my case 10) starting from a particular number of characters onwards in the overall text. Since I'm trying to extract information from an e-mail inbox it is impossible to predict which position the information needed will be in, this is overcome by using the first column in the second one.