March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
I have 4 different .csv files that sit in a folder. I want to combine all of these.
But there is a catch, the first 2 files have 4 top junk rows that need to be removed and 5 columns.
And the last 2 files have 5 top junks rows and 6 columns.
I am looking for a dynamic way of combining all my CSV files from a folder.
Expecting Result
Date | Settlement ID | closing fees | promo rebates | TDS | total |
01-01-2022 | 111222333 | 1 | 2.5 | null | 3.5 |
02-01-2022 | 111222333 | 1 | 2.5 | null | 3.5 |
03-01-2022 | 111222333 | 1 | 2.5 | null | 3.5 |
04-01-2022 | 111222333 | 1 | 2.5 | null | 3.5 |
05-01-2022 | 111222333 | 1 | 2.5 | null | 3.5 |
01-02-2022 | 222333444 | 2 | 3.5 | null | 4.5 |
02-02-2022 | 222333444 | 2 | 3.5 | null | 4.5 |
03-02-2022 | 222333444 | 2 | 3.5 | null | 4.5 |
04-02-2022 | 222333444 | 2 | 3.5 | null | 4.5 |
01-03-2022 | 444555666 | 3 | 5.5 | 1 | 9.5 |
02-03-2022 | 444555666 | 3 | 5.5 | 1 | 9.5 |
03-03-2022 | 444555666 | 3 | 5.5 | 1 | 9.5 |
01-04-2022 | 333777888 | 7.5 | 1.4 | 0.75 | 9.65 |
02-04-2022 | 333777888 | 7.5 | 1.4 | 0.75 | 9.65 |
03-04-2022 | 333777888 | 7.5 | 1.4 | 0.75 | 9.65 |
Solved! Go to Solution.
Hello, @asif999
let
Source = Folder.Files(<path_to_your_folder>),
file_names = List.Buffer(Table.AddColumn(Source, "fp", each [Folder Path] & [Name])[fp]),
fx_csv = (path as text) as table =>
let
csv = Csv.Document(File.Contents(path),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),
remove_lines = Table.Skip( csv, List.PositionOf( csv[Column2], "Settlement ID")),
promote = Table.PromoteHeaders(remove_lines)
in
promote,
files = Table.Combine(List.Transform(file_names, fx_csv)),
reorder = Table.ReorderColumns(files,{"Date", "Settlement ID", "closing fees", "promo rebates", "TDS", "total"})
in
reorder
Hello, @asif999
let
Source = Folder.Files(<path_to_your_folder>),
file_names = List.Buffer(Table.AddColumn(Source, "fp", each [Folder Path] & [Name])[fp]),
fx_csv = (path as text) as table =>
let
csv = Csv.Document(File.Contents(path),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None]),
remove_lines = Table.Skip( csv, List.PositionOf( csv[Column2], "Settlement ID")),
promote = Table.PromoteHeaders(remove_lines)
in
promote,
files = Table.Combine(List.Transform(file_names, fx_csv)),
reorder = Table.ReorderColumns(files,{"Date", "Settlement ID", "closing fees", "promo rebates", "TDS", "total"})
in
reorder
Thank @AlienSx for your answer.
I have done it this way.
let
Source = Folder.Files("B:\Excel Learning\Power Query Learning\1. Combining CSV Files with Uneven Columns & Junk Rows\CSV Files"),
Table_Cleaned = Table.AddColumn(Source, "Custom", each
let
MyTable = Csv.Document([Content]),
MyList = MyTable[Column1],
Position_Of_Date = List.PositionOf(MyList,"Date"),
My_Table_Cleaned = Table.Skip(MyTable,Position_Of_Date),
Promote_Headers = Table.PromoteHeaders(My_Table_Cleaned)
in
Promote_Headers),
Col_Names = Table.AddColumn(Table_Cleaned, "Custom.1", each Table.ColumnNames([Custom])),
Distinct_Col_Names = List.Distinct(List.Combine(Col_Names[Custom.1])),
Expaned_Table = Table.ExpandTableColumn(Table.SelectColumns(Col_Names,"Custom"),"Custom",Distinct_Col_Names)
in
Expaned_Table
I wouldn't be so confident about "Date" column in Position_Of_Date step because you have some junk data above. What if List.PositionOf finds "Date" text in there?
It won't ever find the date in junk rows. But I will consider your suggestion to remain 100% safe. I will use the second-column header instead.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
User | Count |
---|---|
20 | |
12 | |
10 | |
9 | |
7 |
User | Count |
---|---|
40 | |
26 | |
16 | |
16 | |
10 |