Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Morning, I have folder that reiceves multiple files in it and I will be reading all the files from that folder and combining them.
Good thing is that tables in the csv files are all some layout.
The problem is the layout looks like this:
I highlited what should be each independent column in same colors.
Exception to red: red parts of the table are not required in final sorted table.
As you see pick storage column spans acros all other columns(Orders, line qty and Pending(IM)).
This is how final result should look like:
As you see final result would be easier to work with and it does not introduce that many new rows and removes large amount of additional columns.
I did try to unpivot or transpose table in order to sort it out but was not successfull at the moment.
I could really use some help.
I attached sample data to this link.
https://we.tl/t-eaLcw9LvSC
Thanks
Solved! Go to Solution.
Create a function that ingests a single file
(f)=>
let
Source = Csv.Document(f,[Delimiter=",", Encoding=65001, QuoteStyle=QuoteStyle.None]),
LZ = List.Skip(List.Transform(List.Zip({Record.ToList(Source{0}),Record.ToList(Source{1})}),each Text.Combine(_,"|")),2),
#"Removed Bottom Rows" = Table.RemoveLastN(Source,1),
#"Removed Top Rows" = Table.Skip(#"Removed Bottom Rows",2),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
#"Replaced Headers" = Table.RenameColumns(#"Promoted Headers",List.Zip({List.Skip(Table.ColumnNames(#"Promoted Headers"),2),LZ})),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Replaced Headers", {"Status", "Created Date"}, "Attribute", "Value"),
#"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByEachDelimiter({"|"}, QuoteStyle.Csv, false), {"Pick Storage", "Attribute"}),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Value", Int64.Type}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Value] <> null) and ([Pick Storage] <> "Totals")),
#"Pivoted Column" = Table.Pivot(#"Filtered Rows", List.Distinct(#"Filtered Rows"[Attribute]), "Attribute", "Value"),
#"Changed Type1" = Table.TransformColumnTypes(#"Pivoted Column",{{"Created Date", type date}},"de")
in
#"Changed Type1"
Then apply that to the list of files
And finally expand the table.
Thank you for all the proposed solutions.
I wish I could understand them in detail to learn better but it might take too much time to explain.
The function I proposed explains the process step by step. There is no magic to this - you need to transform the source format step by step into something usable.
The only "fancy" part is the harvesting of the dangling headers
LZ = List.Skip(List.Transform(List.Zip({Record.ToList(Source{0}),Record.ToList(Source{1})}),each Text.Combine(_,"|")),2)
You can use PowerQueryFormatter to make the code look easier to understand.
LZ
= List.Skip(
List.Transform(
List.Zip({Record.ToList(Source{0}), Record.ToList(Source{1})}),
each Text.Combine(_, "|")
),
2
)
We are treating the first two rows as if they were lists. We combine them via List.Zip, and then convert the result to pipe delimited strings. Since the first two columns are ok we skip them.
Hi @Justas4478, another approach:
Replace address to your folder with csv files in Source step.
Output
let
fn_Transform =
(tbl as table)=>
[
// _Detail = ContentToTable{0}[Content],
_Detail = tbl,
_Helper = [ colNames = Table.ColumnNames(_Detail),
totalsPos = List.PositionOf(Record.ToList(Table.First(_Detail)), "Totals", Occurrence.All),
colNamesToRemove = List.Transform(totalsPos, (x)=> colNames{x}) ],
_RemovedTotalsColumns = Table.RemoveColumns(_Detail, _Helper[colNamesToRemove]),
_RemovedTotalsRow = Table.SelectRows(_RemovedTotalsColumns, each ([Column1] <> "Totals")),
_Transposed = Table.FromColumns(Table.ToRows(_RemovedTotalsRow)),
_MergedHeaders = Table.CombineColumns(_Transposed,{"Column1", "Column2"},Combiner.CombineTextByDelimiter("||", QuoteStyle.None),"Merged"),
_TransposedBack = Table.PromoteHeaders(Table.FromColumns(Table.ToRows(_MergedHeaders))),
_Unpivoted = Table.UnpivotOtherColumns(_TransposedBack, {"||", "Pick Storage||Measures"}, "Attribute", "Value"),
_Splitted = Table.SplitColumn(_Unpivoted, "Attribute", Splitter.SplitTextByDelimiter("||", QuoteStyle.Csv), {"Pick Storage", "Headers"}),
_FilteredRows = Table.SelectRows(_Splitted, each ([#"||"] = "IM Delivery")),
_Pivoted = Table.Pivot(_FilteredRows, List.Distinct(_FilteredRows[Headers]), "Headers", "Value"),
_Renamed = Table.RenameColumns(_Pivoted,{{"||", "Status"}, {"Pick Storage||Measures", "Created Date"}}),
_ChangedType = Table.TransformColumnTypes(_Renamed,{{"Status", type text}, {"Orders", type number}, {"Line Qty", type number}, {"Pending (IM)", type number}, {"Created Date", type date}}, "sk-SK"),
_FilteredRows1 = Table.SelectRows(_ChangedType, each not List.ContainsAll({[Orders], [Line Qty], [#"Pending (IM)"]}, {null}) ),
_SortedRows = Table.Sort(_FilteredRows1,{{"Created Date", Order.Ascending}})
][_SortedRows],
Source = Folder.Files("c:\Downloads\PowerQueryForum\Justas4478\"),
FilteredCSV = Table.SelectRows(Source, each [Extension] = ".csv"),
ContentToTable = Table.TransformColumns(FilteredCSV, {{"Content", each Csv.Document(_,[Delimiter=",", Columns=14, Encoding=65001, QuoteStyle=QuoteStyle.None]), type table}}),
RemovedOtherColumns = Table.SelectColumns(ContentToTable,{"Content", "Name"}),
Ad_Transformed = Table.AddColumn(RemovedOtherColumns, "Transformed", each Table.AddColumn(fn_Transform([Content]), "SourceName", (x)=> [Name], type text) , type table),
CombinedFiles = Table.Combine(Ad_Transformed[Transformed])
in
CombinedFiles
replace path_to_folder string with yours
let
transform_csv = (csv as binary) =>
[data = Csv.Document(csv,[Delimiter=",", Encoding=65001, QuoteStyle=QuoteStyle.None]),
storage = List.Buffer(List.RemoveLastN(List.Alternate(List.Skip(Record.ToList(data{0}), 2), 2, 1), 1)),
tx = List.TransformMany(
Table.ToRows(Table.RemoveLastN(Table.Skip(data, 3), 1)),
(x) => ((w) =>
List.Zip(
{
storage,
List.Alternate(w, 2, 1 , 1),
List.Skip(List.Alternate(w, 2, 1, 2)),
List.Alternate(w, 2, 1)
}
)
)(List.RemoveLastN(List.Skip(x, 2), 3)),
(x, y) => List.FirstN(x, 2) & y
),
z = Table.FromRows(
List.Select(tx, (x) => x{3} <> ""),
{"Status", "Created Date", "Pick Storage", "Orders", "Line Qty", "Pending"})]
[z],
result = Table.Combine(List.Transform(Folder.Files("PATH_TO_FOLDER_WITH_FILES")[Content], transform_csv))
in
result
For each header column where the contents does not sort naturally you need to provide a dedicated sort column and then use the "sort one column by another column" feature.
@lbendlin I am not sure if that is going to help in my case since my data looks like this:
When I import it in to power bi thats why I need to sort it in power query before I use it in report.
In my data 'Pick Storage' which is in column two, row one.
Rest in row one G2P1, G2P5 and G5P0 are values of 'Pick Storage' column that ideally should be across one column and not multiple rows.
Something like this:
I hope it makes it more clear.
Create a function that ingests a single file
(f)=>
let
Source = Csv.Document(f,[Delimiter=",", Encoding=65001, QuoteStyle=QuoteStyle.None]),
LZ = List.Skip(List.Transform(List.Zip({Record.ToList(Source{0}),Record.ToList(Source{1})}),each Text.Combine(_,"|")),2),
#"Removed Bottom Rows" = Table.RemoveLastN(Source,1),
#"Removed Top Rows" = Table.Skip(#"Removed Bottom Rows",2),
#"Promoted Headers" = Table.PromoteHeaders(#"Removed Top Rows", [PromoteAllScalars=true]),
#"Replaced Headers" = Table.RenameColumns(#"Promoted Headers",List.Zip({List.Skip(Table.ColumnNames(#"Promoted Headers"),2),LZ})),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Replaced Headers", {"Status", "Created Date"}, "Attribute", "Value"),
#"Split Column by Delimiter" = Table.SplitColumn(#"Unpivoted Other Columns", "Attribute", Splitter.SplitTextByEachDelimiter({"|"}, QuoteStyle.Csv, false), {"Pick Storage", "Attribute"}),
#"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"Value", Int64.Type}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each ([Value] <> null) and ([Pick Storage] <> "Totals")),
#"Pivoted Column" = Table.Pivot(#"Filtered Rows", List.Distinct(#"Filtered Rows"[Attribute]), "Attribute", "Value"),
#"Changed Type1" = Table.TransformColumnTypes(#"Pivoted Column",{{"Created Date", type date}},"de")
in
#"Changed Type1"
Then apply that to the list of files
And finally expand the table.
Check out the July 2025 Power BI update to learn about new features.