Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Be one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now

Reply
txpham16
Frequent Visitor

Power Query not recognizing header row in some of the multiple excel file from SharePoint folder

Sample Files 

Hi there,

I have multiple users uploading excel files, that follow the same template, onto a Sharepoint folder. Each of the uploaded file is following the exact same template with the header on the third row and starting on column 3. 

 

I am trying to combine the data within all the files in that sharepoint folder but am having some issue with power query not recognizing the third row as the header.

 

Within PowerBI Power Query, I have a column of the list of tables within each excel. As I am clicking throught the tables, I noticed that the query recognize the headers in only some of the files but not all of the files. In other words, when looking at each file table data, some of the files have "Column 1, column 2, etc" as the header whereas most of the files have the actual header name (3rd row) already as the header in power query tables. 

 

data table.PNG

Column header.PNG

wrong column header.PNG

wrong header 3.PNG

 

Can someone tell me how I can have power query recognize the header for all the files? 

 

I can't insert a step in the query to promote the 3rd row as the header because most of the files already have the 3rd row as the header. So if I were to insert a step to manually have the 3rd row get promoted to header, this will impact the files that already has the 3rd row as the header.

 

The next step in my query was to expand the data column, consisted of the table, to grab all the data with the header "Date, Count, Header, Value". But with some of the tables missing these names in the header, the expansion will exclude any files missing these headers. 

 

Code.PNG

Please help!

 

1 ACCEPTED SOLUTION

Here's a brute force version

let
    Source = Folder.Files("C:\Users\Lutz\Downloads"),
    #"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".xlsx" and Text.StartsWith([Name], "Output"))),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Data", each Table.SelectColumns(Table.PromoteHeaders(Table.Skip(Excel.Workbook([Content]){0}[Data],
          if Excel.Workbook([Content]){0}[Data]{2}[Column3]="Date" then 2 else 
          if Excel.Workbook([Content]){0}[Data]{1}[Column1]="Date" then 1 else 
          0
          ), [PromoteAllScalars=true]),{"Date", "Count", "Header", "Value"})),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Name", "Data"}),
    #"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", {"Date", "Count", "Header", "Value"}, {"Date", "Count", "Header", "Value"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Data",{{"Date", type date}, {"Count", Int64.Type}, {"Header", type text}, {"Value", Currency.Type}}),
    #"Filtered Rows1" = Table.SelectRows(#"Changed Type", each ([Date] <> null)),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Filtered Rows1", {{"Count", 0},{"Value", null}})
in
    #"Replaced Errors"

but this could be refined if you have more than three scenarios.

View solution in original post

7 REPLIES 7
lbendlin
Super User
Super User

Create a custom function that calls Table.ColumnNames and then adds the processing steps if the first returned value is "Column1" 

Thank you for your suggestion.

Will Custom Functions be a workable solution if the headers starts on the 3rd column and 3rd row of the excel file?

 

yes.  You could even have code that hunts for the first filled row or column, but ideally your source format (how quirky it may be) should be reasonably stable.

 

if you want you can post some sample files. If you are unsure how to do that please refer to https://community.fabric.microsoft.com/t5/Community-Blog/How-to-provide-sample-data-in-the-Power-BI-...

I included the sample files in hyperlinked within the text. Thank you so much for your assistance.

Here would be the general approach - read the files through a folder connector and then interpret each file content with the template code. Then combine the results and clean up (your sample data is rather bumpy...)

 

let
    Source = Folder.Files("C:\Users\xxx\Downloads"),
    #"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".xlsx" and Text.StartsWith([Name], "Output"))),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Data", each Table.SelectColumns(Table.PromoteHeaders(Table.Skip(Excel.Workbook([Content]){0}[Data],2), [PromoteAllScalars=true]),{"Date", "Count", "Header", "Value"})),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Name", "Data"}),
    #"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", {"Date", "Count", "Header", "Value"}, {"Date", "Count", "Header", "Value"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Data",{{"Date", type date}, {"Count", Int64.Type}, {"Header", type text}, {"Value", Currency.Type}}),
    #"Filtered Rows1" = Table.SelectRows(#"Changed Type", each ([Date] <> null)),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Filtered Rows1", {{"Count", 0},{"Value", null}})
in
    #"Replaced Errors"

Apologies for my sample data being unclear.

 #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Data", each Table.SelectColumns(Table.PromoteHeaders(Table.Skip(Excel.Workbook([Content]){0}[Data],2), [PromoteAllScalars=true]),{"Date", "Count", "Header", "Value"})),

 This portion of the code is assuming the data is in the 3rd row 3rd column at all time. The problem is Power Query reads some of the file as if the headers are starting in cell A1 and some in cell D3 and some in Cell A3. 

I updated my sample files accordingly. Could you assist with how the Mcode should be given the sample files now?

Thank you so much!

Here's a brute force version

let
    Source = Folder.Files("C:\Users\Lutz\Downloads"),
    #"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".xlsx" and Text.StartsWith([Name], "Output"))),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows", "Data", each Table.SelectColumns(Table.PromoteHeaders(Table.Skip(Excel.Workbook([Content]){0}[Data],
          if Excel.Workbook([Content]){0}[Data]{2}[Column3]="Date" then 2 else 
          if Excel.Workbook([Content]){0}[Data]{1}[Column1]="Date" then 1 else 
          0
          ), [PromoteAllScalars=true]),{"Date", "Count", "Header", "Value"})),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"Name", "Data"}),
    #"Expanded Data" = Table.ExpandTableColumn(#"Removed Other Columns", "Data", {"Date", "Count", "Header", "Value"}, {"Date", "Count", "Header", "Value"}),
    #"Changed Type" = Table.TransformColumnTypes(#"Expanded Data",{{"Date", type date}, {"Count", Int64.Type}, {"Header", type text}, {"Value", Currency.Type}}),
    #"Filtered Rows1" = Table.SelectRows(#"Changed Type", each ([Date] <> null)),
    #"Replaced Errors" = Table.ReplaceErrorValues(#"Filtered Rows1", {{"Count", 0},{"Value", null}})
in
    #"Replaced Errors"

but this could be refined if you have more than three scenarios.

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

Dec Fabric Community Survey

We want your feedback!

Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.

ArunFabCon

Microsoft Fabric Community Conference 2025

Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.

December 2024

A Year in Review - December 2024

Find out what content was popular in the Fabric community during 2024.

Top Solution Authors