Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hi All,
I have done extensive searching and I don't believe this is a repeat, but is definitely and extension of previous questions. I am attempting to extract numbers from a text string within a Power BI function. I have successfully extracted the numbers from the string into a value using the below:
Text.Combine( List.RemoveNulls( List.Transform( Text.ToList([string_col]), each if Value.Is(Value.FromText(_), type number) then _ else null) ) )
Using this code works great when the number I am interested in is the only number in the string, for example:
"Bring on the 1234567 comments" results in 1234567
However, I can't resolve extracting my number when multiple different numbers occur in the string, for example:
"Bring on on the 1234567 comments with 50 telling me this is a repeat" results in 123456750
What I need to do is one pull the number within the string that meets conditions (one in my case). For my particular issue, the number I need to extract will always be the only 7 digit number in the string, so I feel like this should be a more straight forward answer?
Is there a way to extract only the 7 digit number using my provided function or something similar? If I am way off base, can someone please set me on the proper path?
As always, the communities help is greatly appreciated.
Solved! Go to Solution.
Here's one method:
(InputText as text) => let RequiredLength = 7, Digits = {"0".."9"}, CharacterList = Text.ToList(InputText), FirstNumber = List.Accumulate( CharacterList, "", (String,CurrentChar)=> if Text.Length(String) = RequiredLength then String else if List.Contains(Digits,CurrentChar) then String & CurrentChar else "" ) , ReturnValue = if Text.Length(FirstNumber) = RequiredLength then FirstNumber else null in ReturnValue
The function works by taking the characters of InputText from left to right, and building up a string of numbers (stopping when 7 numeric characters are accumulated), otherwise resetting to an empty string when it encounters a non-numeric character.
Here's another idea using table grouping to group consecutive digits together (also a function):
(InputText as text) =>
CharacterList = Text.ToList(InputText), CharacterTable = Table.FromList(CharacterList, Splitter.SplitByNothing(), type table[Character = text], null, ExtraValues.Error), AddedIndex = Table.AddIndexColumn(CharacterTable, "Index", 1, 1), AddedDigitFlag = Table.AddColumn(AddedIndex, "Digit", each List.Contains({"0".."9"},[Character]), type logical), DigitGroups = Table.Group(AddedDigitFlag, {"Digit"}, {{"Number", each Text.Combine(Table.Sort(_,{"Index"})[Character]), type text}}, GroupKind.Local), FilterNumbersLength7 = Table.SelectRows(DigitGroups, each [Digit] = true and Text.Length([Number])=7), FirstNumber = try FilterNumbersLength7{0}[Number] otherwise null in FirstNumber
Another option might be using some R code to find text matching an appropriate regular expression.
Regards,
Owen 🙂
Here's one method:
(InputText as text) => let RequiredLength = 7, Digits = {"0".."9"}, CharacterList = Text.ToList(InputText), FirstNumber = List.Accumulate( CharacterList, "", (String,CurrentChar)=> if Text.Length(String) = RequiredLength then String else if List.Contains(Digits,CurrentChar) then String & CurrentChar else "" ) , ReturnValue = if Text.Length(FirstNumber) = RequiredLength then FirstNumber else null in ReturnValue
The function works by taking the characters of InputText from left to right, and building up a string of numbers (stopping when 7 numeric characters are accumulated), otherwise resetting to an empty string when it encounters a non-numeric character.
Here's another idea using table grouping to group consecutive digits together (also a function):
(InputText as text) =>
CharacterList = Text.ToList(InputText), CharacterTable = Table.FromList(CharacterList, Splitter.SplitByNothing(), type table[Character = text], null, ExtraValues.Error), AddedIndex = Table.AddIndexColumn(CharacterTable, "Index", 1, 1), AddedDigitFlag = Table.AddColumn(AddedIndex, "Digit", each List.Contains({"0".."9"},[Character]), type logical), DigitGroups = Table.Group(AddedDigitFlag, {"Digit"}, {{"Number", each Text.Combine(Table.Sort(_,{"Index"})[Character]), type text}}, GroupKind.Local), FilterNumbersLength7 = Table.SelectRows(DigitGroups, each [Digit] = true and Text.Length([Number])=7), FirstNumber = try FilterNumbersLength7{0}[Number] otherwise null in FirstNumber
Another option might be using some R code to find text matching an appropriate regular expression.
Regards,
Owen 🙂
Hi Owen,
This is great,
I was wondering if you had a solution to extract 2 sets of numbers within the same row;
column x = 1 x 654321 and 2 x 123456
result = 2 columns, column 1= 654321 & column 2 = 123456
so by using your function set to 6 digits extraction, is it possible to extract two sets of (6) occurance digits from the same cell? column x extraction would create two seperate columns one with the first 6 digit number found in column x and the second column with the second occurance of 6 digit number... amended to capture if there were 3 occurance and so on?
cheers,
Hello Owen, is there any way to get this into a Flow action? I tried getting the file, copying your text into a Compose action, and replacing InputText with the Filename from Dynamic Content, but the result in Body looks like it displays your text, not the evaluated function... Thanks In Advance.
The M code I posted earlier can't be directly translated to Flow.
I'm sure there is a way of performing the same actions using Flow functions but I don't know enough to tell you what that would be. Perhaps try posting on the Flow forums.
You can use Power Query actions with a SQL Server data source using the Premium "Transform data using Power Query" action, but I suspect that may not be what you are wanting.
Regards,
Owen
@OwenAuger Thank you, no worries some bright spark in the Flow forum translated my requirement into 2 split functions: split at ' - ' and then split at '-'. For completion (and anyone else stuck on a similar problem), this is the link.
Thank you for that!
If I want to extract only a string that would start with 1 only and have a lenght of 8, what would I need to add? I've been trying a few things but nothing seems to be working.
Thank you!
Hi,
Could you expand a bit more on the possibility of getting this done using R code? I have tried this solution for a similar problem but haven't managed to get it to work. Not sure what I am doing wrong but I am not very familiar with using functions so I'm sure it's something silly.
My problem is slightly different though, I cannot use number of characters as there will be other numbers in there which are the same length as the number I am trying to extract. However, I do know my number will always start with 45313 and always be 10 characters long. Any help is appreciated.
Cheers
Just in case anyone is struggling with a problem similar to mine, I have managed to solve it using a two step solution.
I added a custom column using the following code:
First Column =
if Text.Contains([YourDataColumn], "45313") then Text.PositionOf([YourDataColumn], "45313") else null
I then added a second column which references the first one using this code:
Second Column =
if [FirstColumn]<>null then Text.Range([YourDataColumn], [FirstColumn], 10) else null
The first column calculates the character position of the sequence I know my number will always start with. The second column uses Text.Range which will select a set number of characters (in my case 10) starting from a particular number of characters onwards in the overall text. Since I'm trying to extract information from an e-mail inbox it is impossible to predict which position the information needed will be in, this is overcome by using the first column in the second one.
Check out the September 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.