Solved: Re: Sorting multi header table in power query

Justas4478 · ‎09-20-2024

Morning, I have folder that reiceves multiple files in it and I will be reading all the files from that folder and combining them.
Good thing is that tables in the csv files are all some layout.
The problem is the layout looks like this:

I highlited what should be each independent column in same colors.
Exception to red: red parts of the table are not required in final sorted table.
As you see pick storage column spans acros all other columns(Orders, line qty and Pending(IM)).
This is how final result should look like:

As you see final result would be easier to work with and it does not introduce that many new rows and removes large amount of additional columns.

I did try to unpivot or transpose table in order to sort it out but was not successfull at the moment.
I could really use some help.

I attached sample data to this link.
https://we.tl/t-eaLcw9LvSC
Thanks

lbendlin · ‎09-20-2024

Create a function that ingests a single file

(f)=>
let
    Source = Csv.Document(f,[Delimiter=",",  Encoding=65001, QuoteStyle=QuoteStyle.None]),
    LZ = List.Skip(List.Transform(List.Zip({Record.ToList(Source{0}),Record.ToList(Source{1})}),each Text.Combine(_,"|")),2),
    #"Removed Bottom Rows" = Table.RemoveLastN(Source,1),
    #"Removed Top Rows" = Table.Skip(#"Removed Bottom Rows",2),
    #"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
    #"Replaced Headers" = Table.RenameColumns(#"Promoted Headers",List.Zip({List.Skip(Table.ColumnNames(#"Promoted Headers"),2),LZ})),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Replaced Headers", {"Status", "Created Date"}, "Attribute", "Value"),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByEachDelimiter({"|"}, QuoteStyle.Csv, false), {"Pick Storage", "Attribute"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Value", Int64.Type}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Value] <> null) and ([Pick Storage] <> "Totals")),
    #"Pivoted Column" = Table.Pivot(#"Filtered Rows", List.Distinct(#"Filtered Rows"[Attribute]), "Attribute", "Value"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Pivoted Column",{{"Created Date", type date}},"de")
in
    #"Changed Type1"

Then apply that to the list of files

And finally expand the table.

View solution in original post

Justas4478 · ‎09-23-2024

Thank you for all the proposed solutions.
I wish I could understand them in detail to learn better but it might take too much time to explain.

lbendlin · ‎09-23-2024

The function I proposed explains the process step by step. There is no magic to this - you need to transform the source format step by step into something usable.

The only "fancy" part is the harvesting of the dangling headers

LZ = List.Skip(List.Transform(List.Zip({Record.ToList(Source{0}),Record.ToList(Source{1})}),each Text.Combine(_,"|")),2)

You can use PowerQueryFormatter to make the code look easier to understand.

LZ
  = List.Skip(
    List.Transform(
      List.Zip({Record.ToList(Source{0}), Record.ToList(Source{1})}), 
      each Text.Combine(_, "|")
    ), 
    2
  )

We are treating the first two rows as if they were lists. We combine them via List.Zip, and then convert the result to pipe delimited strings. Since the first two columns are ok we skip them.

dufoq3 · ‎09-21-2024

Hi @Justas4478, another approach:

Replace address to your folder with csv files in Source step.

Output

let
    fn_Transform = 
      (tbl as table)=> 
        [
          // _Detail = ContentToTable{0}[Content],
          _Detail = tbl,

          _Helper = [ colNames = Table.ColumnNames(_Detail),
          totalsPos = List.PositionOf(Record.ToList(Table.First(_Detail)), "Totals", Occurrence.All),
          colNamesToRemove = List.Transform(totalsPos, (x)=> colNames{x}) ],
          _RemovedTotalsColumns = Table.RemoveColumns(_Detail, _Helper[colNamesToRemove]),
          _RemovedTotalsRow = Table.SelectRows(_RemovedTotalsColumns, each ([Column1] <> "Totals")),
          _Transposed = Table.FromColumns(Table.ToRows(_RemovedTotalsRow)),
          _MergedHeaders = Table.CombineColumns(_Transposed,{"Column1", "Column2"},Combiner.CombineTextByDelimiter("||", QuoteStyle.None),"Merged"),
          _TransposedBack = Table.PromoteHeaders(Table.FromColumns(Table.ToRows(_MergedHeaders))),
          _Unpivoted = Table.UnpivotOtherColumns(_TransposedBack, {"||", "Pick Storage||Measures"}, "Attribute", "Value"),
          _Splitted = Table.SplitColumn(_Unpivoted, "Attribute", Splitter.SplitTextByDelimiter("||", QuoteStyle.Csv), {"Pick Storage", "Headers"}),
          _FilteredRows = Table.SelectRows(_Splitted, each ([#"||"] = "IM Delivery")),
          _Pivoted = Table.Pivot(_FilteredRows, List.Distinct(_FilteredRows[Headers]), "Headers", "Value"),
          _Renamed = Table.RenameColumns(_Pivoted,{{"||", "Status"}, {"Pick Storage||Measures", "Created Date"}}),
          _ChangedType = Table.TransformColumnTypes(_Renamed,{{"Status", type text}, {"Orders", type number}, {"Line Qty", type number}, {"Pending (IM)", type number}, {"Created Date", type date}}, "sk-SK"),
          _FilteredRows1 = Table.SelectRows(_ChangedType, each not List.ContainsAll({[Orders], [Line Qty], [#"Pending (IM)"]}, {null}) ),
          _SortedRows = Table.Sort(_FilteredRows1,{{"Created Date", Order.Ascending}})
        ][_SortedRows],

    Source = Folder.Files("c:\Downloads\PowerQueryForum\Justas4478\"),
    FilteredCSV = Table.SelectRows(Source, each [Extension] = ".csv"),
    ContentToTable = Table.TransformColumns(FilteredCSV, {{"Content", each Csv.Document(_,[Delimiter=",", Columns=14, Encoding=65001, QuoteStyle=QuoteStyle.None]), type table}}),
    RemovedOtherColumns = Table.SelectColumns(ContentToTable,{"Content", "Name"}),
    Ad_Transformed = Table.AddColumn(RemovedOtherColumns, "Transformed", each Table.AddColumn(fn_Transform([Content]), "SourceName", (x)=> [Name], type text) , type table),
    CombinedFiles = Table.Combine(Ad_Transformed[Transformed])

    
in
    CombinedFiles

Note: Check this link to learn how to use my query.
Check this link if you don't know how to provide sample data.

AlienSx · ‎09-20-2024

replace path_to_folder string with yours

let
    transform_csv = (csv as binary) => 
        [data = Csv.Document(csv,[Delimiter=",", Encoding=65001, QuoteStyle=QuoteStyle.None]),
        storage = List.Buffer(List.RemoveLastN(List.Alternate(List.Skip(Record.ToList(data{0}), 2), 2, 1), 1)), 
        tx = List.TransformMany(
            Table.ToRows(Table.RemoveLastN(Table.Skip(data, 3), 1)),
            (x) => ((w) => 
                List.Zip(
                    {
                        storage, 
                        List.Alternate(w, 2, 1 , 1),
                        List.Skip(List.Alternate(w, 2, 1, 2)),
                        List.Alternate(w, 2, 1)
                    }
                )
            )(List.RemoveLastN(List.Skip(x, 2), 3)),
        (x, y) => List.FirstN(x, 2) & y
        ),
        z = Table.FromRows(
            List.Select(tx, (x) => x{3} <> ""), 
            {"Status", "Created Date", "Pick Storage", "Orders", "Line Qty", "Pending"})]
        [z],
    result = Table.Combine(List.Transform(Folder.Files("PATH_TO_FOLDER_WITH_FILES")[Content], transform_csv))
in
    result

lbendlin · ‎09-20-2024

For each header column where the contents does not sort naturally you need to provide a dedicated sort column and then use the "sort one column by another column" feature.

Justas4478 · ‎09-20-2024

@lbendlin I am not sure if that is going to help in my case since my data looks like this:

When I import it in to power bi thats why I need to sort it in power query before I use it in report.

Justas4478 · ‎09-20-2024

In my data 'Pick Storage' which is in column two, row one.
Rest in row one G2P1, G2P5 and G5P0 are values of 'Pick Storage' column that ideally should be across one column and not multiple rows.
Something like this:

I hope it makes it more clear.

lbendlin · ‎09-20-2024

Create a function that ingests a single file

(f)=>
let
    Source = Csv.Document(f,[Delimiter=",",  Encoding=65001, QuoteStyle=QuoteStyle.None]),
    LZ = List.Skip(List.Transform(List.Zip({Record.ToList(Source{0}),Record.ToList(Source{1})}),each Text.Combine(_,"|")),2),
    #"Removed Bottom Rows" = Table.RemoveLastN(Source,1),
    #"Removed Top Rows" = Table.Skip(#"Removed Bottom Rows",2),
    #"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
    #"Replaced Headers" = Table.RenameColumns(#"Promoted Headers",List.Zip({List.Skip(Table.ColumnNames(#"Promoted Headers"),2),LZ})),
    #"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Replaced Headers", {"Status", "Created Date"}, "Attribute", "Value"),
    #"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByEachDelimiter({"|"}, QuoteStyle.Csv, false), {"Pick Storage", "Attribute"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Value", Int64.Type}}),
    #"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Value] <> null) and ([Pick Storage] <> "Totals")),
    #"Pivoted Column" = Table.Pivot(#"Filtered Rows", List.Distinct(#"Filtered Rows"[Attribute]), "Attribute", "Value"),
    #"Changed Type1" = Table.TransformColumnTypes(#"Pivoted Column",{{"Created Date", type date}},"de")
in
    #"Changed Type1"

Then apply that to the list of files

And finally expand the table.

Sorting multi header table in power query

Helpful resources

Fabric Community Update - July 2025

Power BI Monthly Update - July 2025

Join our Fabric User Panel

Party with Power BI’s own Guy in a Cube

Sorting multi header table in power query

Helpful resources

Fabric Community Update - July 2025

Power BI Monthly Update - July 2025

Join our Fabric User Panel