Check your eligibility for this 50% exam voucher offer and join us for free live learning sessions to get prepared for Exam DP-700.
Get StartedDon't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.
How do you make sure if you import files from a folder, that all columns are kept? Say, first 10 files have column A, B, and C. Then after come 25 with A, B, C, and D, and eventually 5 more with A, B, D, and E.
If my expectation is to have 1 query where the results is 5 columns that dependable on the file's type, the other columns it doesn't have are left blank. Say, outcome is 5 columns, A, B, C, D, and E where with the first template that has A, B, and C, in the end result the column D, and E are left blank. Second template only E is left blank and third template has C left blank.
It also needs to continue to work, if there's a fourth template coming up in the future with column F, the outcome should automatically get 6 columns. If one of the files has the column F misspelled, there are 7 column. If you correct that misspelled column, it returns to 6.
How do you get that done?
Solved! Go to Solution.
When we import a folder, several things actually happen:
The problem you mentioned occurs when "expanding the worksheet". When expanding the worksheet (using Table.ExpandTableColumn), you need to provide the column name, whether the user writes code or selects through the UI.
Solution:
The sample code is as follows:
let
源 = Folder.Files("C:\Users\Black\Desktop\新建文件夹 (3)"),
筛选的隐藏文件1 = Table.SelectRows(源, each [Attributes]?[Hidden]? <> true),
调用自定义函数1 = Table.AddColumn(筛选的隐藏文件1, "转换文件", each 转换文件([Content])),
重命名的列1 = Table.RenameColumns(调用自定义函数1, {"Name", "Source.Name"}),
删除的其他列1 = Table.SelectColumns(重命名的列1, {"Source.Name", "转换文件"}),
columnNameList = List.Distinct(List.Combine(List.Transform(删除的其他列1[转换文件], Table.ColumnNames))),
expandColumn = Table.ExpandTableColumn(删除的其他列1, "转换文件", columnNameList, columnNameList)
in
expandColumn
data:
1.xlsx
A | B | C | D |
21 | 22 | 23 | 24 |
2.xlsx
A | B | C |
11 | 12 | 13 |
3.xlsx
A | B | D | E |
31 | 32 | 33 | 34 |
result:
Source.Name | A | B | C | D | E |
1.xlsx | 21 | 22 | 23 | 24 | |
2.xlsx | 11 | 12 | 13 | ||
3.xlsx | 31 | 32 | 33 | 34 |
try to use the second argument of Table.Combine
check and amend your step of combining the tables.
=Table.Combine(YourTableList,{"A","B","C","D","E"})
Perhaps you misunderstand? Or I misunderstand what you propose.
I'm not seeing the step you propose:
Like, I'm referring to this thingy:
When we import a folder, several things actually happen:
The problem you mentioned occurs when "expanding the worksheet". When expanding the worksheet (using Table.ExpandTableColumn), you need to provide the column name, whether the user writes code or selects through the UI.
Solution:
The sample code is as follows:
let
源 = Folder.Files("C:\Users\Black\Desktop\新建文件夹 (3)"),
筛选的隐藏文件1 = Table.SelectRows(源, each [Attributes]?[Hidden]? <> true),
调用自定义函数1 = Table.AddColumn(筛选的隐藏文件1, "转换文件", each 转换文件([Content])),
重命名的列1 = Table.RenameColumns(调用自定义函数1, {"Name", "Source.Name"}),
删除的其他列1 = Table.SelectColumns(重命名的列1, {"Source.Name", "转换文件"}),
columnNameList = List.Distinct(List.Combine(List.Transform(删除的其他列1[转换文件], Table.ColumnNames))),
expandColumn = Table.ExpandTableColumn(删除的其他列1, "转换文件", columnNameList, columnNameList)
in
expandColumn
data:
1.xlsx
A | B | C | D |
21 | 22 | 23 | 24 |
2.xlsx
A | B | C |
11 | 12 | 13 |
3.xlsx
A | B | D | E |
31 | 32 | 33 | 34 |
result:
Source.Name | A | B | C | D | E |
1.xlsx | 21 | 22 | 23 | 24 | |
2.xlsx | 11 | 12 | 13 | ||
3.xlsx | 31 | 32 | 33 | 34 |
Tried to replicate your Chinese onto my English, but I tested it by having an extra column in one of the files with a newly added column, it doesn't pop-up :(.
Am surprised by what it keeps in English, and what it translates to Chinese :).
Edit: It does work! The modification I pushed onto the source file wasn't saved to the server apparently. Silly synchronization issue.
Hi @DouweMeer ,
To achieve this in Power BI using Power Query, where the final output dynamically adjusts to changing columns across multiple Excel files in a folder, follow these steps:
Place Files in a Folder:
Store all Excel files in a single folder. Ensure they have consistent data structures (e.g., headers in the first row). This folder will serve as the data source.
Connect Power BI to the Folder:
Combine Files:
Promote Headers and Handle Missing Columns:
Add a Step to Handle Future Columns:
Table.PromoteHeaders(#"Previous Step", [PromoteAllScalars=true])
Rename or Standardize Column Names (Optional):
If column names may be misspelled, consider applying transformations to standardize names:
Table.TransformColumnNames(#"Previous Step", Text.Proper)
Close and Load:
This setup ensures a flexible and scalable solution for combining Excel files with changing column structures, adapting to both new columns and corrected column names automatically.
Please mark this as solution if it helps. Appreciate Kudos.
Believe yours doesn't work, I tried to work off your suggestion.
I believe the reason it would fail is this part:
If you choose the template from the first file, future files that do not contain the headers from the sample file will be automatically removed. Sure, the files will be processed, unless it is missing columns that are in the sample file.
If you'd select the last file, if the first file misses out of a column, say F, it will hit an error in the process.
I accepted another reply as a solution. I'm happy to review your proposal and accept it as well as a solution (if it allows me), but right now I don't see it solving my challenge.