Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Reply
Mederic
Post Patron
Post Patron

Tables combine

Hello everyone,
I would like to combine several tables from the same file and keep only certain columns.
The problem is that the column headings are not always in the same place.
I have attached a sample file with the desired result.
Can you please help me?

Thank you in advance.

Best regards

15 REPLIES 15
v-nmadadi-msft
Community Support
Community Support

Hi @Mederic 

I wanted to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions.


Thank you.

ronrsnfld
Super User
Super User

Here is a little different approach.

 

First, create a custom function to extract the relevant data from your table:

This extracts the relevant column, starting at the location of the header

We will apply the special handling for the "Document" column in the main query.

 

//rename "fnExtractCol"

(rawTable as table, colHeader as text)=>                
    
    [a=Table.ToColumns(rawTable),
     b=List.FindText(a,colHeader){0},
     col=List.Skip(b,List.PositionOf(b,colHeader)+1)][col]

 

And the Main Query

  Use List.Accumulate and the custom function to process each table and combine them

let

//Changew next lines to create a list of all your tables
//to be used in the List.Accumulate function

    Source1 = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    Source2 = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    Source3 = Excel.CurrentWorkbook(){[Name="Table3"]}[Content],
    
    Tbls = List.Accumulate(
                {Source1,Source2,Source3},
                #table({},{}),
                (s,cur)=> Table.Combine({s,
                    [a=fnExtractCol(cur,"Document No"),
                     b={List.First(a,2)},
                     c=fnExtractCol(cur,"Order"),
                     d=fnExtractCol(cur,"Amount"),
                     e=Table.FromColumns({b,c,d},
                            type table[Document No=Int64.Type, Order=text,Amount=Currency.Type]),
                     f=Table.FillDown(e,{"Document No"})][f]}))
                    

in
    Tbls

From your data in your most recent sample file:

ronrsnfld_0-1753667574143.png

 

 

Good evening everyone,
I have carefully reviewed the various solutions.
It seems to me that we are not using the same data model (start of Data).
And it is not easy to explain when the source is a .pdf file. So please accept my apologies.
I have recreated a new file here that is representative of my personal file.
I hope this will be much clearer.
Thank you in advance.

Best regards,

The data that I was using is the data that YOU supplied (your Excel file that you linked to).

 

Your latest Excel sheet is slightly different from what you have previously supplied. And only minor changes are required in the code I supplied for it to produce your desired result (on your Result tab) given your six tables on your Template Invoice 536 tab.

 

So I don't really understand your issue.

 

The relevant differences between the data YOU supplied this time and the data YOU supplied previously has to do with the extra empty cells in the Order and Amount Columns, as well as the different capitalization of Document No vs Document no; both issues easily taken care of with minor code alterations.

Also, one of your tables has no data for Order and Amount. so we need to check for the error and return an empty table if that is the case.

 

let

//Changew next lines to create a list of all your tables
//to be used in the List.Accumulate function

    Source1 = Excel.CurrentWorkbook(){[Name="Tableau1"]}[Content],
    Source2 = Excel.CurrentWorkbook(){[Name="Tableau2"]}[Content],
    Source3 = Excel.CurrentWorkbook(){[Name="Tableau3"]}[Content],
    Source4 = Excel.CurrentWorkbook(){[Name="Tableau4"]}[Content],
    Source5 = Excel.CurrentWorkbook(){[Name="Tableau5"]}[Content],
    Source6 = Excel.CurrentWorkbook(){[Name="Tableau6"]}[Content],
    
    Tbls = List.Accumulate(
                {Source1,Source2,Source3,Source4,Source5, Source6},
                #table({},{}),
                (s,cur)=> Table.Combine({s,
                    [a=fnExtractCol(cur,"Document no"),
                     b={List.First(a,2)},
                     c=fnExtractCol(cur,"Order"),
                     d=fnExtractCol(cur,"Amount"),
                     e=Table.FromColumns({b,c,d},
                            type table[Document No=Int64.Type, Order=text,Amount=Currency.Type]),
                     f=try Table.FillDown(e,{"Document No"}) otherwise #table({},{})  ][f]}))                   

in
    Tbls

  

Of course, I also wonder about the process you use to get the PDF files into the six tables that you show, and wonder if there might be some improvement at that level.

Thanks again,
I put the tables in an Excel file because I couldn't add the PDF invoice.
Indeed the import process is different.
I don't want to charge any pages from the PDF (tables) into the tabs, but rather the result directly.
I was inspired by this video for the process.
I don't understand the language, but it allowed me to get the result that I put in my previous example files.

Best regards,

Hello @ronrsnfld,

Thank you very much for your reply,

This solution with a function is a good idea. 

I'll test it out this evening and get back to you

Have a nice day

Best regards

Mederic
Post Patron
Post Patron

Hello @MasonMA ,

Thank you again for your help.
I'm not sure I can apply some of your steps to my object of Tables without expanding them.

I'will using a Text parameter and adding it directly to my Source.
I've tested it on several invoices and "Document No" always appears in column 2 and on the 3rd line.
I'll use this method even if it's not ideal.

Have a good day.
Best regards,

Hello @Mederic 

 

Could you share your M code? 

I got confused again. In your message yesterday, it was mentioned Document No. - 'This information is sometimes in column 2 or 3.'

However in your last message, '"Document No" always appears in column 2 and on the 3rd line.'


If you could review my initial message, the logic is cleaning the tables until your have no Null or columns with unrelated info Before you use Table.Comnine and then transform your Document No. column from there. 

 

However , I wouldn't be able to validate them unless I see your PDF files. Anyways, It's great you have another solution. 


Happy to chat on platform. 

 

 

 

 

@MasonMA 

Here is the file with the code I use, which works on my personal PDF files.
Sorry, I created an example that did not correspond 100% to the PDF extraction.
The "Order" and "Amount" columns are always in the same column number.
Only the Document number changes.

Best regards,

lbendlin
Super User
Super User

The problem is that the column headings are not always in the same place.

That is not a problem.  Use Table.Combine or the & symbol, it will match the columns from all tables no matter the column order.

MasonMA
Solution Sage
Solution Sage

Hi @Mederic 

 

I would use below strategies if in your situation. 

1. Create one reusable Function for data cleaning as below and apply this function on each table

(table) =>
let
    Source = table,
    HeaderRow = List.First(List.Select(Table.ToRows(Source), each List.Contains(_, "Document No"))),
    HeaderRowIndex = List.PositionOf(Table.ToRows(Source), HeaderRow),
    PromoteHeaders = Table.PromoteHeaders(Table.Skip(Source, HeaderRowIndex)),
    RemoveNulls = Table.SelectRows(PromoteHeaders, each [Document No] <> null and [Document No] <> ""),
    SelectColumns = Table.SelectColumns(RemoveNulls, {"Document No", "Order", "Amount"}, MissingField.UseNull)
in
    SelectColumns

 

MasonMA_0-1753567503784.png

2. Use Table.Combine or 'Append Queries as New' from UI to combine all your tables.

MasonMA_1-1753567596734.png

3. Add a custom column that returns the Document No only for rows where it's a number, and blanks for name rows

MasonMA_2-1753567672285.png

4. Fill down the Document No so each order row gets matched with the correct document.

MasonMA_3-1753567805111.png

5. Remove rows where both Order and Amount are blank. 

MasonMA_5-1753567921966.png

6. Move the No column to the left as your new Document No column. Optionally, you can also remove the original Document No column if it’s no longer needed.

 

You can also wrap Steps 2–5 into another function if you're repeating this logic across multiple datasets.

MasonMA_6-1753568836902.pngMasonMA_7-1753568854685.png

In case you would like to refer to this file, please use this link

 

Hope this helps:)

 

Hello @MasonMA , @lbendlin ,

Thank you for your replies,

Masson, thank you for your clear explanations and this solution.
But Ibendlin's comment made me realise that my file was not exactly the same as the one in my first message.

And sorry for the mistake.
Here is a new file and screenshots of what I would like to do in a specific step, namely adding the column number to the location shown in the screenshot.

Thanks in advance

Best regards

 

Invoice No Finale.jpg

Invoice.jpg

If you need one solution ready for use on your file, please share some real sample data from each table. 

 

Also on your picture, what does it mean by making 'Invoice No.' dynamic? I'm not sure i understand your logic.

Thank @MasonMA you for your reply.
With "make the column dynamic", I meant "automatic" . As we can see, I added a hard-coded formula.
Regarding a real file:
- the source comes from a .pdf file with tables and pages.
- Each table in my example file corresponds to the pages of the .pdf file.
- Each page contains 10 to 11 columns and approxi. 70 rows.
- I have between 5 and 10 pages per invoice.
I want to combine the pages of a single invoice at a time, not of multiple invoices
I have managed to achieve what I wanted, but I am stuck on the "Invoice No" ("Document No") column because, This information is sometimes in column 2 or 3.
In my fictitious file, I have placed "Invoice No" in column 2 or 4 for illustration purposes.

I hope that was clear.
Thank you in advance.

Best regards

Hello @Mederic 

 

I was having the same challenge handling your Document No. when i was playing with your mock file, that's why i created one additional column for these Document No. and use this new Column as a replacement of your old Document No. column (If you see my Step 3 and 4), then filling them down so that right Document No. can be attached to the right Orders. 

 

In my proposal, if you identified that Document No. only appear on Column 2 and 3, you may just need to update AddColumn by creating an inner block (of course you can further tweak the code and make it even more dynamic if you are not sure which Columns have Document No.)

AddInvoiceNoColumn = Table.AddColumn(Source, "Document No", each 
    let
        colsToCheck = { [Column2], [Column3]},
        foundDocNo = 
            List.First(
                List.Select(colsToCheck, each Value.Is(Value.FromText(Text.Trim(_)), Int64.Type)), 
                null
            )
    in
        foundDocNo
)

 

Hope this gives you some ideas:)

Helpful resources

Announcements
July 2025 community update carousel

Fabric Community Update - July 2025

Find out what's new and trending in the Fabric community.

July PBI25 Carousel

Power BI Monthly Update - July 2025

Check out the July 2025 Power BI update to learn about new features.