TH_2342
New Member

Extracting tables from multiple PDFs

Hello,

I am trying to extract the same table from multiple PDFs within Power BI. These PDFs are stored in a SharePoint folder, and whilst the tables are in the same format in each PDF, they are not on the same page or under the same table number.

I have been able to connect Power BI to the appropriate SharePoint folder and load in the PDF files I need, but I'm struggling to figure out a way to automatically extract the relevant tables (other than going through each file and filtering manually).

Is there a way I can extract specific tables based on certain criteria within the tables? For example, every data table I want to extract has the first column titled 'example'.

Thank you in advance for any help.

3 REPLIES
christinepayton
Super User

You might want to look into Power Automate Desktop (PAD). It lets you run actions on PDFs, including regular expressions, and extract tables to Excel. Regular expressions are what you use for rules like "get this thing wherever you see the word xyz" or "get this wherever the text fits this pattern".

It's usually used for files on your PC, but you could probably sync the SharePoint library and run it locally (check the sync settings and tell it to download everything to your device rather than keeping the files cloud-only).
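To make the "wherever the text fits this pattern" idea concrete, here is a rough Python sketch. It is not Power Automate Desktop itself, only an illustration of the same regular-expression logic. It assumes the PDF text has already been extracted page by page, and that table cells are separated by `|`. The sample `pages` data, the delimiter, and the `find_tables` helper are all made up for the example.

```python
import re

# Hypothetical text, standing in for the output of a PDF text-extraction step.
pages = [
    "Quarterly report\nIntroduction text\n",
    "Notes\nexample | qty | price\nA | 1 | 2.50\nB | 3 | 4.00\n\nFooter text\n",
]

# A header line that starts with the word 'example', followed by every
# non-blank line after it. The block ends at the first blank line.
TABLE_PATTERN = re.compile(r"^example\b.*(?:\n.+)*", re.MULTILINE)


def find_tables(pages):
    """Return (page_number, rows) for each block whose header starts with 'example'."""
    found = []
    for page_number, text in enumerate(pages, start=1):
        for match in TABLE_PATTERN.finditer(text):
            # Split each matched line into cells on the assumed '|' delimiter.
            rows = [
                [cell.strip() for cell in line.split("|")]
                for line in match.group(0).splitlines()
            ]
            found.append((page_number, rows))
    return found


for page_number, rows in find_tables(pages):
    print(page_number, rows)
```

Because the pattern keys on the header text rather than on a page or table number, the table is picked up wherever it appears in the file. Real PDF text is usually messier than this sample, so the pattern would need adjusting to the actual layout.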

Hey Christine,

Thanks for the advice, sounds like a great idea. Something new to learn too!!

Sure! There are all kinds of AI solutions in the cloud too, such as AI Builder and Syntex; PAD is kind of the budget option. I only mention it because it has a low barrier to entry: there's no need to request a license for it, and it's usually already installed with Windows.
