DouweMeer
Post Prodigy

Import Excel file with changing columns

How do you make sure that all columns are kept when you import files from a folder? Say the first 10 files have columns A, B, and C. Then come 25 files with A, B, C, and D, and eventually 5 more with A, B, D, and E.

 

My expectation is to have one query whose result has 5 columns; depending on the file's template, the columns a file doesn't have are left blank. Say the outcome is 5 columns, A, B, C, D, and E: for the first template, which has A, B, and C, columns D and E are left blank in the end result. For the second template only E is left blank, and for the third template C is left blank.

 

It also needs to keep working: if a fourth template with column F comes up in the future, the outcome should automatically get 6 columns. If one of the files has column F misspelled, there are 7 columns. If you correct the misspelled column, it returns to 6.

 

How do you get that done? 

1 ACCEPTED SOLUTION
ZhangKun
Resolver V

When we import a folder, several things actually happen:

  • Find all files
  • Generate a calculation template based on the user's operation, and only use this template to read the worksheet
  • Execute the calculation template in the previous step for all files
  • Expand the worksheet
  • Set the type and other steps

The problem you mentioned occurs at the "expand the worksheet" step. Expanding (with Table.ExpandTableColumn) requires an explicit list of column names, whether you write the code yourself or select the columns through the UI, so the list is fixed at the moment the step is created.
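
To illustrate the point about the expand step, here is a minimal sketch that uses in-memory tables instead of real workbooks. The names File1, File2, Files, and Data are hypothetical stand-ins, not steps that Power Query generates.

```powerquery
let
    // Hypothetical stand-ins for two workbooks that were already read into tables
    File1 = #table({"A", "B", "C"}, {{11, 12, 13}}),
    File2 = #table({"A", "B", "C", "D"}, {{21, 22, 23, 24}}),
    // One row per file, with the file's table nested in the "Data" column
    Files = #table(
        {"Source.Name", "Data"},
        {{"1.xlsx", File1}, {"2.xlsx", File2}}
    ),
    // Column names fixed at design time, the way a generated expand step records them
    HardcodedExpand = Table.ExpandTableColumn(Files, "Data", {"A", "B", "C"})
in
    // The result has Source.Name, A, B, C; column D from 2.xlsx is not included
    HardcodedExpand
```

Because the name list is a literal, adding a new column to a file does not change the output until the list itself is edited.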

Solution:

  • Check the steps automatically generated by Power Query and delete the steps after expanding the table column
  • Write code to collect all column names, then deduplicate them (and sort them, if needed)
  • Use this list of stored column names to expand the table

The sample code is as follows:

 

 

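let
    // 源 ("Source"): list every file in the folder
    源 = Folder.Files("C:\Users\Black\Desktop\新建文件夹 (3)"),
    // 筛选的隐藏文件1 ("Filtered Hidden Files1"): skip hidden files
    筛选的隐藏文件1 = Table.SelectRows(源, each [Attributes]?[Hidden]? <> true),
    // 调用自定义函数1 ("Invoke Custom Function1"): read each file with the generated 转换文件 ("Transform File") function
    调用自定义函数1 = Table.AddColumn(筛选的隐藏文件1, "转换文件", each 转换文件([Content])),
    // 重命名的列1 ("Renamed Columns1")
    重命名的列1 = Table.RenameColumns(调用自定义函数1, {"Name", "Source.Name"}),
    // 删除的其他列1 ("Removed Other Columns1"): keep only the file name and its table
    删除的其他列1 = Table.SelectColumns(重命名的列1, {"Source.Name", "转换文件"}),
    // Collect the column names of every file's table and deduplicate them
    columnNameList = List.Distinct(List.Combine(List.Transform(删除的其他列1[转换文件], Table.ColumnNames))),
    // Expand with the collected list instead of hardcoded names
    expandColumn = Table.ExpandTableColumn(删除的其他列1, "转换文件", columnNameList, columnNameList)
in
    expandColumn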

 

 

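data:

1.xlsx

A  | B  | C  | D
21 | 22 | 23 | 24

2.xlsx

A  | B  | C
11 | 12 | 13

3.xlsx

A  | B  | D  | E
31 | 32 | 33 | 34

result:

Source.Name | A  | B  | C  | D  | E
1.xlsx      | 21 | 22 | 23 | 24 |
2.xlsx      | 11 | 12 | 13 |    |
3.xlsx      | 31 | 32 |    | 33 | 34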
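
For readers who want to try the technique without a folder of workbooks, the following sketch rebuilds the sample data above as in-memory tables. The step names (File1, Files, AllNames, Expanded) and the "Data" column are assumptions made for illustration; only the List.Distinct / List.Combine / Table.ExpandTableColumn pattern comes from the code above.

```powerquery
let
    // Hypothetical in-memory stand-ins for 1.xlsx, 2.xlsx and 3.xlsx
    File1 = #table({"A", "B", "C", "D"}, {{21, 22, 23, 24}}),
    File2 = #table({"A", "B", "C"}, {{11, 12, 13}}),
    File3 = #table({"A", "B", "D", "E"}, {{31, 32, 33, 34}}),
    // One row per file, with the file's table nested in the "Data" column
    Files = #table(
        {"Source.Name", "Data"},
        {{"1.xlsx", File1}, {"2.xlsx", File2}, {"3.xlsx", File3}}
    ),
    // Every column name that occurs in any file, deduplicated, in first-seen order
    AllNames = List.Distinct(List.Combine(List.Transform(Files[Data], Table.ColumnNames))),
    // Expand with the collected names; columns a file lacks come back as null
    Expanded = Table.ExpandTableColumn(Files, "Data", AllNames)
in
    Expanded
```

Wrapping AllNames in List.Sort would give an alphabetical column order instead of first-seen order, if that matters for the report.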

 


6 REPLIES
wdx223_Daniel
Super User

Try using the second argument of Table.Combine.

[Screenshot: wdx223_Daniel_0-1732664989510.png]

check and amend your step of combining the tables.

=Table.Combine(YourTableList,{"A","B","C","D","E"})
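
As a rough sketch of this suggestion, the example below applies Table.Combine to a list of in-memory tables. File1, File2, File3, and Tables are hypothetical names, and the data mirrors the A to E example from the question rather than any real file.

```powerquery
let
    // Hypothetical in-memory stand-ins for three workbooks
    File1 = #table({"A", "B", "C", "D"}, {{21, 22, 23, 24}}),
    File2 = #table({"A", "B", "C"}, {{11, 12, 13}}),
    File3 = #table({"A", "B", "D", "E"}, {{31, 32, 33, 34}}),
    Tables = {File1, File2, File3},
    // With an explicit second argument the result is limited to the listed columns
    CombinedFixed = Table.Combine(Tables, {"A", "B", "C", "D", "E"}),
    // Without it, the result uses the union of the input columns, filling gaps with null
    CombinedAll = Table.Combine(Tables)
in
    CombinedAll
```

Note that this sketch combines bare tables, so the file name is not carried along; keeping it would need an extra column added to each table before combining.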

Perhaps you misunderstand? Or I misunderstand what you propose. 

I'm not seeing the step you propose:

[Screenshot: DouweMeer_0-1732719421249.png]

Like, I'm referring to this thingy:

[Screenshot: DouweMeer_1-1732719454440.png]

 

ZhangKun
Resolver V

(Same post as the accepted solution above.)

I tried to replicate your Chinese version in my English setup, but when I tested it by adding a new column to one of the files, the column doesn't show up :(.

 

[Screenshot: DouweMeer_2-1732720698657.png]

 

Am surprised by what it keeps in English, and what it translates to Chinese :). 

 

Edit: It does work! The modification I pushed onto the source file wasn't saved to the server apparently. Silly synchronization issue. 

 

FarhanJeelani
Solution Sage

Hi @DouweMeer ,

To achieve this in Power BI using Power Query, where the final output dynamically adjusts to changing columns across multiple Excel files in a folder, follow these steps:

Step-by-Step Process

  1. Place Files in a Folder:
    Store all Excel files in a single folder. Ensure they have consistent data structures (e.g., headers in the first row). This folder will serve as the data source.

  2. Connect Power BI to the Folder:

    • Open Power BI Desktop.
    • Go to Home > Get Data > Folder.
    • Navigate to the folder containing your files and click OK.
    • Power BI will create a query listing the files in the folder.
  3. Combine Files:

    • In the query editor, click Combine Files. Power BI will use the first file as a sample and combine all files in the folder.
    • Choose the sheet or table from the first file when prompted.
  4. Promote Headers and Handle Missing Columns:

    • In the Power Query editor, Power BI will attempt to standardize the columns across all files.
    • Any columns missing in some files will be filled with null values automatically.
    • To ensure consistency, click on Transform > Use First Row as Headers to promote the first row as headers.
  5. Add a Step to Handle Future Columns:

    • Go to the Advanced Editor in Power Query (on the Home tab).
    • Locate the step where column names are hardcoded (e.g., Table.SelectColumns or Table.TransformColumnTypes).
    • Modify or remove this step to ensure Power Query dynamically adapts to new columns. Specifically:
      • Remove any Table.SelectColumns step, which restricts the output to specific columns.
      • Replace it with a step that promotes all available columns dynamically:
        Table.PromoteHeaders(#"Previous Step", [PromoteAllScalars=true])
  6. Rename or Standardize Column Names (Optional):
    If column names may be misspelled, consider applying transformations to standardize names:

    • Use Transform > Replace Values to correct common misspellings.
    • Add a custom step to clean column names dynamically:
      Table.TransformColumnNames(#"Previous Step", Text.Proper)
  7. Close and Load:

    • Click Close & Apply to load the transformed data into Power BI.
    • Power BI will now combine all files, with columns that are missing in some files left blank (null), and dynamically adapt to new columns in future files.
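
The name clean-up from step 6 can be sketched on in-memory tables as follows. This is only an illustration under stated assumptions: File1, File2, Files, Normalize, and the "Amount" column are hypothetical, and trimming plus Text.Proper only reconciles differences in whitespace and letter case, not genuine misspellings.

```powerquery
let
    // Hypothetical stand-ins: the second file's header differs in case and has a leading space
    File1 = #table({"A", "B", "Amount"}, {{1, 2, 3}}),
    File2 = #table({"A", "B", " amount"}, {{4, 5, 6}}),
    Files = #table(
        {"Source.Name", "Data"},
        {{"1.xlsx", File1}, {"2.xlsx", File2}}
    ),
    // Trim and proper-case every column name of a nested table
    Normalize = (t as table) as table =>
        Table.TransformColumnNames(t, each Text.Proper(Text.Trim(_))),
    Cleaned = Table.TransformColumns(Files, {{"Data", Normalize}}),
    // Collect the cleaned names across all files and expand with that list
    AllNames = List.Distinct(List.Combine(List.Transform(Cleaned[Data], Table.ColumnNames))),
    Expanded = Table.ExpandTableColumn(Cleaned, "Data", AllNames)
in
    Expanded
```

Normalizing the nested tables before collecting the names is what lets "Amount" and " amount" land in a single output column here.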

Dynamic Handling of Future Changes

  • When new files with additional columns (e.g., F) are added to the folder, Power Query will automatically include them in the output as long as they exist in the header row of at least one file.
  • If a column is misspelled in a file, it will appear as a new column in the output. Correcting the spelling in the source file will merge the data back into the appropriate column upon refresh.

Testing and Validation

  1. Add files with varying column structures to the folder and refresh the query in Power BI.
  2. Verify that all columns (existing and new) appear correctly, with missing values represented as null.

This setup ensures a flexible and scalable solution for combining Excel files with changing column structures, adapting to both new columns and corrected column names automatically.

 

Please mark this as solution if it helps. Appreciate Kudos.

I believe yours doesn't work; I tried to work from your suggestion.

I believe the reason it would fail is this part:

[Screenshot: DouweMeer_0-1732721811526.png]

If you take the first file as the sample, columns in later files that are not among the sample file's headers will be removed automatically. Sure, the files will be processed, unless they are missing columns that are in the sample file.

If you select the last file as the sample instead, and the first file is missing a column, say F, the process will hit an error.

 

I accepted another reply as a solution. I'm happy to review your proposal and accept it as well as a solution (if it allows me), but right now I don't see it solving my challenge. 

 
