Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Be one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now

v-jingzhang

Make column numbering not consecutive over the whole dataset but just for equal column names

Scenario: 

In many cases, when we import data into Power BI, Power Query will automatically add suffixes if there are identical column names. However, even if there are multiple duplicate column names in the table, it will add them in numerical order without recognizing the fields and adding them based on the fields. If there isn't much data in the table, it is not a trouble for us to adjust the following names manually, but when there are many fields in the table, it will still be easy for us to improve them in the following ways.

 

Table Used: 

The following is base table:

vjingzhang_0-1658126542753.png

 

Expect output:

vjingzhang_1-1658126542755.png

 

Current output:

vjingzhang_2-1658126542756.png

 

Solution:

The problem is mainly solved by the flexible use of pivot and unpivot. However, in order to make them be matched correspondingly, we first create an index.

vjingzhang_3-1658126542757.png

 

Step 2,

Then we unpivot the table (the data in the first column is the number of rows they were originally in):

vjingzhang_4-1658126542764.png

 

Step 3,

Since many columns have the same name, we need to create another index to sort the columns with the same name correctly.

vjingzhang_5-1658126542768.png

 

Step 4,

Then group the table by values:

vjingzhang_6-1658126542770.png

 

Step 5,

Then add custom column to all with index:

vjingzhang_7-1658126542772.png

 

Expand table ,we get:

vjingzhang_8-1658126542776.png

 

Step 6,

Delete the column we don't need:

vjingzhang_9-1658126542779.png

 

Step 7 ,

Then add a suffix to the initial column name.

vjingzhang_10-1658126542782.png

 

Step 8 ,

Remove the unwanted rows,then pivot table,and remove the columns we don't need,then promot headers:

vjingzhang_11-1658126542783.png

vjingzhang_12-1658126542785.png

 

The full code could refer the Advanced editor:

 

let

    Source = Excel.Workbook(File.Contents("\\flielink\import.xlsx"), null, true),

    Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],

    #"Added Index1" = Table.AddIndexColumn(Sheet1_Sheet, "IndexALL", 1, 1, Int64.Type),

    #"Unpivoted Only Selected Columns" = Table.Unpivot(#"Added Index1", {"Column1", "Column2", "Column3", "Column4", "Column5", "Column6"}, "Attribute", "Value"),

    #"Added Index" = Table.AddIndexColumn(#"Unpivoted Only Selected Columns", "Index", 1, 1, Int64.Type),

    #"Grouped Rows" = Table.Group(#"Added Index", {"Value"}, {{"all", each _, type table [Attribute=text, Value=any, Index=number]}}),

    #"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.AddIndexColumn([all],"Index1",1)),

    #"Expanded Custom1" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"IndexALL", "Attribute", "Value", "Index", "Index1"}, {"Custom.IndexALL", "Custom.Attribute", "Custom.Value", "Custom.Index", "Custom.Index1"}),

    #"Removed Columns" = Table.RemoveColumns(#"Expanded Custom1",{"all", "Custom.Value", "Custom.Index"}),

    #"Added Custom1" = Table.AddColumn(#"Removed Columns", "newvalue", each if Value.Is([Value],type text)=true then [Value]&"_"&Number.ToText([Custom.Index1]) else [Value]),

    #"Removed Columns1" = Table.RemoveColumns(#"Added Custom1",{"Value", "Custom.Index1"}),

    #"Filtered Rows" = Table.SelectRows(#"Removed Columns1", each true),

    #"Pivoted Column" = Table.Pivot(#"Filtered Rows", List.Distinct(#"Filtered Rows"[Custom.Attribute]), "Custom.Attribute", "newvalue"),

    #"Removed Columns2" = Table.RemoveColumns(#"Pivoted Column",{"Custom.IndexALL"}),

    #"Promoted Headers" = Table.PromoteHeaders(#"Removed Columns2", [PromoteAllScalars=true])

in

    #"Promoted Headers"

 

 

Result:

If we have more columns in our table, how can we get the results quickly and easily based on the above steps? Referring to the image below, I have added several new columns containing the same column names as the previous columns as well as different column names. Using the above Power Query steps, I only need to complete column names when unpivoting to get the wanted data.

vjingzhang_13-1658126542786.png

 

vjingzhang_14-1658126542788.png

 

vjingzhang_15-1658126542792.png

 

Hope this article helps everyone with similar questions here. 

  

Author: Lucien Wang

Reviewer: Kerry Wang & Ula Huang