Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

v-henryk-mstf

How to remove spaces from abbreviations in values in Power Query

Scenario:

Often people have problems with spaces in the abbreviations when importing data in Power BI Desktop. The sample data and the expected result are first provided below for the reader's reference.

 

Sample Data and expected result:
Sample data:

vhenrykmstf_0-1646807507143.png

Expected result:

vhenrykmstf_1-1646807524098.png

 

Method 1:

It was found that many beginners will use the following method to solve this problem. They may replace the abbreviations one by one.

vhenrykmstf_4-1646807612352.pngvhenrykmstf_5-1646807615662.pngvhenrykmstf_6-1646807627100.pngvhenrykmstf_7-1646807632031.png

 

If the amount of data is small, we can do this, then in case we have thousands of abbreviations, the workload is quite voluminous.

 

Method 2:

Now I will introduce a dynamic way to remove spaces from abbreviations. The operations are generated by adding custom columns.
1. Separate the text of each line with spaces to get the lists.

Text.Split

vhenrykmstf_8-1646807732732.pngvhenrykmstf_9-1646807738705.png

 

2. Get the length of each line of text and convert them to lists.

List.Transform

vhenrykmstf_10-1646807808911.pngvhenrykmstf_11-1646807818965.png

 

3. Take a list of lists, lists, and returns a list of lists combining items at the same position.

List.Zip

vhenrykmstf_12-1646807883585.pngvhenrykmstf_13-1646807887612.png

 

To show this function in detail here, I expand it to a new row to show you.

vhenrykmstf_14-1646807904882.png

vhenrykmstf_15-1646807909326.png

vhenrykmstf_16-1646807920212.png

 

4. The following is to determine whether the lists without the first character in the Zipped column is equal to {1,1}. Because {1,1} is part of the English abbreviation with spaces. For example, if the {B,1,1} list does not contain the first character is {1,1}, which is equal to {1, 1}, then "" & "B" are returned. If it is {"Power",5,1}, then return "Power" & " ".

List.Accumulate

vhenrykmstf_17-1646807988023.png

vhenrykmstf_19-1646808005452.png

 

5. At the end of the above steps, it looks like the result has been obtained. But there are still some differences, for example "Power B I Desktop", you will get "Power BI Desktop " with one more space. This is where we have to add another step to remove the trailing space.

vhenrykmstf_20-1646808039039.png

vhenrykmstf_21-1646808047345.png

6. Then remove the unnecessary columns to get the final result.

vhenrykmstf_22-1646808066432.png

 

The entire M language in the advanced editor as follow:

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCsgvTy1ScFLwVIrVQeIpuKQWZ5fkF4BFg1KLSxQcFQKgioIUfBSCFYJTk0uLMksqFTLzgDJgA2IB", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Value = _t]),
    #"Added Custom" = Table.AddColumn(Source, "TextToList", each Text.Split([Value], " ")),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "TextLengths", each List.Transform([TextToList], each Text.Length(_))),
    #"Added Custom2" = Table.AddColumn(#"Added Custom1", "Zipped", each List.Zip({[TextToList], [TextLengths], List.Skip([TextLengths])})),
    #"Added Custom3" = Table.AddColumn(#"Added Custom2", "Accu", each List.Accumulate([Zipped], "",
(state, current) => 
if List.Skip(current)= {1,1}
then state & List.First(current)
else state & List.First(current) & " ")),
    #"Added Custom4" = Table.AddColumn(#"Added Custom3", "Results", each Text.Trim([Accu])),
    #"Removed Columns" = Table.RemoveColumns(#"Added Custom4",{"Value", "TextToList", "TextLengths", "Zipped", "Accu"})
in
    #"Removed Columns"

 

 

Hope this article helps everyone with similar questions here.

 

Author: Stephen Tao

Reviewer: Ula Huang, Kerry Wang