Starting December 3, join live sessions with database experts and the Microsoft product team to learn just how easy it is to get started
Learn moreGet certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now
Scenario:
Often people have problems with spaces in the abbreviations when importing data in Power BI Desktop. The sample data and the expected result are first provided below for the reader's reference.
Sample Data and expected result:
Sample data:
Expected result:
Method 1:
It was found that many beginners will use the following method to solve this problem. They may replace the abbreviations one by one.
If the amount of data is small, we can do this, then in case we have thousands of abbreviations, the workload is quite voluminous.
Method 2:
Now I will introduce a dynamic way to remove spaces from abbreviations. The operations are generated by adding custom columns.
1. Separate the text of each line with spaces to get the lists.
2. Get the length of each line of text and convert them to lists.
3. Take a list of lists, lists, and returns a list of lists combining items at the same position.
To show this function in detail here, I expand it to a new row to show you.
4. The following is to determine whether the lists without the first character in the Zipped column is equal to {1,1}. Because {1,1} is part of the English abbreviation with spaces. For example, if the {B,1,1} list does not contain the first character is {1,1}, which is equal to {1, 1}, then "" & "B" are returned. If it is {"Power",5,1}, then return "Power" & " ".
5. At the end of the above steps, it looks like the result has been obtained. But there are still some differences, for example "Power B I Desktop", you will get "Power BI Desktop " with one more space. This is where we have to add another step to remove the trailing space.
6. Then remove the unnecessary columns to get the final result.
The entire M language in the advanced editor as follow:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCsgvTy1ScFLwVIrVQeIpuKQWZ5fkF4BFg1KLSxQcFQKgioIUfBSCFYJTk0uLMksqFTLzgDJgA2IB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t]),
#"Added Custom" = Table.AddColumn(Source, "TextToList", each Text.Split([Value], " ")),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "TextLengths", each List.Transform([TextToList], each Text.Length(_))),
#"Added Custom2" = Table.AddColumn(#"Added Custom1", "Zipped", each List.Zip({[TextToList], [TextLengths], List.Skip([TextLengths])})),
#"Added Custom3" = Table.AddColumn(#"Added Custom2", "Accu", each List.Accumulate([Zipped], "",
(state, current) =>
if List.Skip(current)= {1,1}
then state & List.First(current)
else state & List.First(current) & " ")),
#"Added Custom4" = Table.AddColumn(#"Added Custom3", "Results", each Text.Trim([Accu])),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom4",{"Value", "TextToList", "TextLengths", "Zipped", "Accu"})
in
#"Removed Columns"
Hope this article helps everyone with similar questions here.
Author: Stephen Tao
Reviewer: Ula Huang, Kerry Wang
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.