TameshDoobay
Frequent Visitor

Extract Table from multiple tables from multiple pdfs

Hi,

 

Right now I have some PDFs which have multiple tables within each PDF.

 

I need to extract a single table from each of the PDFs. 

 

The way I'm attempting or suggesting to do it is by:

1. Import the PDF data

2. In Power Query, I expand the data. At this point, I have 5 columns: Source.Name, Id, Name, Kind and Data. The separate tables are within the Data column.
3. This is where I'm confused. My goal is to extract the tables that have these headers: ITEM NO., PART NUMBER, DESCRIPTION, QTY.

So, should I create a column that somehow checks the tables and gives a true/false value based on if the table contains those headers? How would I do that?

Thanks

 

1 ACCEPTED SOLUTION
NaveenGandhi
Super User

@TameshDoobay 

The attached file has a solution that should work. In my case the tables came in without headers, so I needed to promote headers first. I am not sure about yours, so I built one version with Promote Headers and another without.

Basically, I created a filter column that flags which tables contain the column names in a ColNames list that I defined. Then I filter on that column and expand the tables.
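Since the attachment is not reproduced here, the following is only a minimal sketch of that idea in M, not the contents of the attached file. The inline `Source` table, the file names, the sample rows and the `IsPartsTable` column name are hypothetical stand-ins for the expanded PDF query; the exact header text (for example "QTY" with or without a trailing period) is also an assumption and must match what the PDF connector actually returns.

```powerquery
let
    // Hypothetical sample data standing in for the expanded PDF query:
    // one row per extracted table, with the nested table in the Data column.
    Source = #table(
        {"Source.Name", "Id", "Kind", "Data"},
        {
            // Headers already present as column names
            {"a.pdf", "Table001", "Table",
                #table({"ITEM NO.", "PART NUMBER", "DESCRIPTION", "QTY"}, {{1, "P-100", "Bracket", 2}})},
            // Unrelated table that should be dropped
            {"a.pdf", "Table002", "Table",
                #table({"Column1", "Column2"}, {{"REV", "DATE"}, {"A", "2024-07-01"}})},
            // Headers sitting in the first data row, so they need promoting
            {"b.pdf", "Table001", "Table",
                #table({"Column1", "Column2", "Column3", "Column4"},
                    {{"ITEM NO.", "PART NUMBER", "DESCRIPTION", "QTY"}, {1, "P-200", "Washer", 8}})}
        }
    ),

    // The headers that identify the table to keep
    ColNames = {"ITEM NO.", "PART NUMBER", "DESCRIPTION", "QTY"},

    // Promote the first row to headers only when the expected names
    // are not already the column names
    Normalize = (t as table) as table =>
        if List.ContainsAll(Table.ColumnNames(t), ColNames)
        then t
        else Table.PromoteHeaders(t, [PromoteAllScalars = true]),
    Normalized = Table.TransformColumns(Source, {{"Data", Normalize}}),

    // Filter column: true when the nested table contains every expected header
    Flagged = Table.AddColumn(
        Normalized,
        "IsPartsTable",
        each List.ContainsAll(Table.ColumnNames([Data]), ColNames),
        type logical
    ),
    Kept = Table.SelectRows(Flagged, each [IsPartsTable]),

    // Expand only the expected columns, keeping the file name for traceability
    Result = Table.ExpandTableColumn(
        Table.SelectColumns(Kept, {"Source.Name", "Data"}),
        "Data",
        ColNames
    )
in
    Result
```

`List.ContainsAll` compares text exactly, so stray spaces or different casing in the PDF headers would make a table fail the check. If that happens, cleaning the names first (for instance with `Text.Trim` and `Text.Upper`) before comparing is one option.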

NaveenGandhi_0-1721244004945.png

 


Let me know if you have any questions.
If this post helps, please consider accepting it as the solution so others can find it more quickly. I appreciate your kudos!


3 REPLIES
NaveenGandhi
Super User

Hi @TameshDoobay 

Can you share a sample file and screenshots so I can understand the problem better?

Regards,
NG

Unfortunately, I can't share a file due to business restrictions. However, I've created a sample file and will share screenshots.

It looks like this when I import the data:

TameshDoobay_1-1721240527803.png

Then, I expand the Content Column, and it looks like this:

TameshDoobay_2-1721240600870.png

From here, I know that I only want to keep Table001. It's the only table that contains the headers ITEM NO., PART NUMBER, DESCRIPTION and QTY.

TameshDoobay_3-1721240738868.png

Basically, I want some kind of advanced filtering capability to find and "keep" these tables.

Keep in mind, I need to do this en masse, i.e., with multiple PDFs loaded in.

Thanks,

Tamesh
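For the multi-PDF case described above, one possible way to get from a folder of files to the Source.Name / Id / Kind / Data layout is sketched below. The folder path is a hypothetical placeholder, and this is an assumption about the setup rather than the poster's actual query; the header check itself would then run against the `Data` column of the result.

```powerquery
let
    // Hypothetical folder path; replace with the folder that holds the PDFs
    Files = Folder.Files("C:\PDFs"),
    PdfFiles = Table.SelectRows(Files, each Text.Lower([Extension]) = ".pdf"),

    // Pdf.Tables returns one row per detected table or page,
    // with the columns Id, Name, Kind and Data
    WithTables = Table.AddColumn(PdfFiles, "Tables", each Pdf.Tables([Content])),

    // Keep the file name and the nested result, and rename to match the thread
    Slim = Table.SelectColumns(WithTables, {"Name", "Tables"}),
    Renamed = Table.RenameColumns(Slim, {{"Name", "Source.Name"}}),

    // One row per extracted item across all PDFs
    Expanded = Table.ExpandTableColumn(Renamed, "Tables", {"Id", "Kind", "Data"}),

    // Drop the whole-page entries so only detected tables remain
    OnlyTables = Table.SelectRows(Expanded, each [Kind] = "Table")
in
    OnlyTables
```

Because every file goes through the same steps, new PDFs dropped into the folder are picked up on the next refresh without changes to the query.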
