Extracting tables from multiple PDFs
Hello,
I am trying to extract the same table from multiple PDFs within Power BI. The PDFs are stored in a SharePoint folder, and whilst the tables are in the same format in each PDF, they are not on the same page and do not have the same table number.
I have been able to connect Power BI to the appropriate SharePoint folder and load in the PDF files I need, but I'm struggling to figure out a way to automatically extract the relevant tables (other than going through each file and filtering manually).
Is there a way to extract specific tables based on criteria within the tables themselves? For example, every table I want to extract has its first column titled 'example'.
Thank you in advance for any help.
You might want to look into Power Automate Desktop. It will let you run actions with regular expressions on PDFs and extract tables to Excel. The regular expressions are what you use for things like "get this thing wherever you see the word xyz or where the text fits this pattern".
It's usually used for files on your PC, but you could probably sync the SharePoint library to your machine and run it locally (check the sync settings and make sure it downloads every file rather than keeping them cloud-only).
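To make the "pick the table wherever the text fits this pattern" idea concrete, here is a rough sketch in Python. It is not Power Automate Desktop or Power Query code; it only illustrates the selection logic, assuming the tables have already been pulled out of each PDF as lists of rows with the headers in the first row. The file names, sample data, and the helper names `first_header_matches` and `select_tables` are all made up for the example.

```python
import re

# Hypothetical stand-in for tables already extracted from PDFs:
# each table is a list of rows, and the first row holds the column headers.
Table = list[list[str]]


def first_header_matches(table: Table, pattern: str) -> bool:
    """Return True if the table's first column header matches the regex."""
    if not table or not table[0]:
        return False  # empty table or empty header row
    header = table[0][0].strip()
    return re.fullmatch(pattern, header, flags=re.IGNORECASE) is not None


def select_tables(
    tables_by_file: dict[str, list[Table]], pattern: str
) -> dict[str, list[Table]]:
    """Keep, per file, only the tables whose first column header matches."""
    selected = {}
    for name, tables in tables_by_file.items():
        matches = [t for t in tables if first_header_matches(t, pattern)]
        if matches:
            selected[name] = matches
    return selected


# Made-up sample data: the wanted table sits at a different position per file.
extracted = {
    "report_a.pdf": [
        [["Summary", "Value"], ["Total", "10"]],
        [["example", "Qty"], ["widget", "3"]],
    ],
    "report_b.pdf": [
        [[" Example ", "Qty"], ["gadget", "7"]],
        [["Notes"], ["n/a"]],
    ],
}

wanted = select_tables(extracted, r"example")
for name, tables in wanted.items():
    # Print the data rows (everything after the header row) of the first match.
    print(name, tables[0][1:])
```

The match ignores case and surrounding spaces, since headers extracted from PDFs often come through with stray whitespace. The same test (does the first header fit the pattern?) is what you would set up in whichever tool does the actual extraction.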
Hey Christine,
Thanks for the advice, sounds like a great idea. Something new to learn too!!
Sure! There are all kinds of AI solutions in the cloud too, such as AI Builder and Syntex; PAD (Power Automate Desktop) is kind of the budget option. I only mention it because it has a low barrier to entry: there's no need to request a license for it, and it usually comes already installed with Windows.
