Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
justlogmein
Helper III
Helper III

How to combine csv files with inconsistent columns from within a folder?

I want to use the From Folder feature to combine csv files that have different some different columns. The columns are not just different names, but also contain different data.

 

If I manually add each csv file to Power Query then just append those queries, I get what I want. All of the columns from each file are there and those rows without columns in each file show up as null as expected. But the problem is that these files will change and I need to use the From Folder feature so it picks up the files within the folder dynamically, whatever their names are.

 

So my question is, how do I do a normal Append that appends multiple csv files from within a folder so that it captures all of the columns?

 

https://jmservicescomau-my.sharepoint.com/:f:/g/personal/jacob_jmservices_com_au/Esc1sh-YWMBCgoAaeuC...

 

 

1 ACCEPTED SOLUTION
KT_Bsmart2gethe
Impactful Individual
Impactful Individual

Hi @justlogmein ,

 

I have downloaded your files and create the code below. you can copy and paste into your dashboard file sample data query in advance editor.

 

See below modified code (Import from Folder):

 

let

 

//You can replace the file with your drill down path to the blue text below
Source = Folder.Files("C:\Users\cktan\Documents\Solutions\justlogmein\Sample Data"),

 

// Add new column to get the files and promote headers before expand the columns
GetTables = Table.AddColumn(Source, "GetTbl", each Table.PromoteHeaders(Csv.Document([Content],

[Delimiter=",", Columns=5, Encoding=1252, QuoteStyle=QuoteStyle.None]))),

 

//Below code is to get the column names from each csv file and combine them as a list
GetColumnNames = Table.AddColumn(GetTables, "GetColName", each Table.ColumnNames([GetTbl])),
ColNameList = Table.Distinct(Table.ExpandListColumn(Table.SelectColumns(GetColumnNames,{"GetColName"}), "GetColName"))[GetColName],

 

//Keep the name and added column before we expand
Filtered_Columns = Table.SelectColumns(GetTables,{"Name", "GetTbl"}),

 

//the above column name list is use at below code in blue text to allow dynamic expand
#"Expanded GetTbl" = Table.ExpandTableColumn(Filtered_Columns, "GetTbl", ColNameList)


in
#"Expanded GetTbl"

 

I hope this helps

View solution in original post

7 REPLIES 7
CR21
New Member

Hello All, I am trying to resolve the same problem (Combining multiple CSV file with inconsistent column names and numbers). But I do face different issues. 

After Googling, YouTubing for a solution, I did find more than videos which explains the solution for this challenge. However, I am faced with following error:

CR21_0-1660037617959.png

So, what I have done so far? 

 

let
Source = SharePoint.Files("<OneDriveLocation where files are stored>", [ApiVersion = 15]),
#"Filtered Rows" = Table.SelectRows(Source, each ([Folder Path] = "OneDrive location where Files are stored")),
#"Filtered Hidden Files" = Table.SelectRows(#"Filtered Rows", each [Attributes]?[Hidden]? <> true),
#"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files", "Transform File (3)", each #"Transform File (3)"([Content])),
#"Renamed Columns1" = Table.RenameColumns(#"Invoke Custom Function1", {"Name", "Source.Name"}),
PreExpand = Table.SelectColumns(#"Renamed Columns1", {"Source.Name", "Transform File (3)"}),
HEADINGS = List.Union( List.Transform(PreExpand[#"Transform File (3)"], each Table.ColumnNames(_))),
ReadyToExpand = PreExpand,
#"Expanded Transform File (3)" = Table.ExpandTableColumn(ReadyToExpand, "Transform File (3)", HEADINGS)
in
#"Expanded Transform File (3)"

 

The bold statement is a changed line so as to accomodate all the known columns from all the files in a folder. But it results in an error. When I change the bold line with the following (which was the original line), there is no error and all data is loaded.

#"Expanded Transform File (3)" = Table.ExpandTableColumn(ReadyToExpand, "Transform File (3)", Table.ColumnNames(#"Transform File (3)"(#"Sample File (3)"))) 

 

Please advise.

 

Anonymous
Not applicable

prova questa query

 

let
    Origine = Folder.Files("C:\Users\sprmn\OneDrive\Documents\Power BI Desktop\CSVsCombine\Sample Data"),
    #"aggiungi colonna" = Table.AddColumn(Origine, "tables", each   Table.PromoteHeaders(Csv.Document([Content],[Delimiter=",",Encoding=1252, QuoteStyle=QuoteStyle.None]))),
    #"Rimosse altre colonne" = Table.SelectColumns(#"aggiungi colonna", {"Name", "tables"}),
    allCSV=Table.Combine(#"Rimosse altre colonne"[tables])
in
    allCSV

 

 

KT_Bsmart2gethe
Impactful Individual
Impactful Individual

Hi @justlogmein ,

 

I have downloaded your files and create the code below. you can copy and paste into your dashboard file sample data query in advance editor.

 

See below modified code (Import from Folder):

 

let

 

//You can replace the file with your drill down path to the blue text below
Source = Folder.Files("C:\Users\cktan\Documents\Solutions\justlogmein\Sample Data"),

 

// Add new column to get the files and promote headers before expand the columns
GetTables = Table.AddColumn(Source, "GetTbl", each Table.PromoteHeaders(Csv.Document([Content],

[Delimiter=",", Columns=5, Encoding=1252, QuoteStyle=QuoteStyle.None]))),

 

//Below code is to get the column names from each csv file and combine them as a list
GetColumnNames = Table.AddColumn(GetTables, "GetColName", each Table.ColumnNames([GetTbl])),
ColNameList = Table.Distinct(Table.ExpandListColumn(Table.SelectColumns(GetColumnNames,{"GetColName"}), "GetColName"))[GetColName],

 

//Keep the name and added column before we expand
Filtered_Columns = Table.SelectColumns(GetTables,{"Name", "GetTbl"}),

 

//the above column name list is use at below code in blue text to allow dynamic expand
#"Expanded GetTbl" = Table.ExpandTableColumn(Filtered_Columns, "GetTbl", ColNameList)


in
#"Expanded GetTbl"

 

I hope this helps

Thank you so much for this.

I managed to combine several Excel files with different structures using your approach (with some tweaks).

Your solution on csv encouraged me to push the limits and try to resolve it!

Thank you, this solution worked and the comments helped a lot.

Migasuke
Super User
Super User

Hi @justlogmein ,

There are several approaches you can do, one of them is to have a sample file, which would cover all columns. another one is to adjust function, which is automatically created in PQ. In my case I easily combined the files without problem, only thing you need to change in my approach is to rename the column names. Please check the pbix I have done, if its not enough, let me know.



If my answer was helpful please give me a Kudos or even accept as a Solution.

Let's connect on LinkedIn!

Hi Migasuke. Thank you for your assistance with this, but I am not able to manually change column names (this is a requirement of the client). I am trying to make it dynamic so it simply combines/appends the csv files on refresh.

Helpful resources

Announcements
Sept PBI Carousel

Power BI Monthly Update - September 2024

Check out the September 2024 Power BI update to learn about new features.

September Hackathon Carousel

Microsoft Fabric & AI Learning Hackathon

Learn from experts, get hands-on experience, and win awesome prizes.

Sept NL Carousel

Fabric Community Update - September 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors