The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
Hi,
I have a folder with multiple PDFs. The page lengths are all different. I'm trying to power query the last table in each PDF but each table number is different due to the page lengths/content (columns are the same). So far, the best I can do is navigate into a PDF, sort the tables by descending, remove all but the top table (ie. the last table). This only works for PDFs that are the same length/table number. The other PDFs return an error as its not grabbing the correct table (table 19 on this PDF but the last table on other PDFs could be table 16 or something). I've manually done a few files and then appended the data but this isnt a solution for lots of files. Any ideas how to solve this?
The code so far is here:
Thanks
Solved! Go to Solution.
Update: I used Group By to group the max table names by each source name as I needed. But the table numbers are just text and i need to get the data from each now. Any ideas?
Edit: I went to the Transform Sample File and sorted the table IDs from highest to lowest and then removed all but the top row. This applied the transform to the other sheets so it seems to be working!
Thanks
Hi @asdf123 ,
Suggest method:
Regards
KT
I got an error on Id=page and unsure on try otherwise/skip error rows. What custom column formula could i use to filter each source.name to their max table names? I think Table.Max might work but not sure how to incorporate.
So i used List.Max but its showing the max table from all sheets. Is there a way to get it to show 0 if not the max and 1 if the max for each source name? From there i can filter by 1 and then open each data table i believe
Update: I used Group By to group the max table names by each source name as I needed. But the table numbers are just text and i need to get the data from each now. Any ideas?
Edit: I went to the Transform Sample File and sorted the table IDs from highest to lowest and then removed all but the top row. This applied the transform to the other sheets so it seems to be working!
Thanks
Hi @asdf123 ,
The provided information is not sufficient to provide much guidance as every PDF files are different.
One suggestion if you're trying to transform the tables from PDF which doesn't come in equal rows and columns. You can try "page" instead of "table". Pick a small sample and transform it, then turn it into a custom function and apply it to all pages. You might need to refine the transformation steps repeatedly until you reach your desired outcome.
Regards
KT
So if I try page, I get an error for PDFs that arent the same page lengths. As I need the last page in each PDF. If my sample is 6 pages and another is 4, it gives an error as it can't pull page 6 from 4 pages. What function would you suggest that could do this? Heres the current function:
Ok so I think I'm getting closer here.. I tried transforming all the sheets and now need to somehow filter the last table on each page (likely by adding a conditional column to do this). Any ideas?