marcodea
Helper II

PDF Connector: table on multiple pages

I'm trying to import data from a PDF file, but a table that spans more than one page is imported into Power BI as multiple tables.

Is there any workaround?

 

 

1 ACCEPTED SOLUTION
v-juanli-msft
Community Support

Hi @marcodea

I tested your scenario with a PDF that contains one large table spanning 23 pages (page 1 to page 23).

1. Open Edit Queries and create a new blank query.

2. Open its Advanced Editor and enter the following code:

let
    Source = Pdf.Tables(File.Contents("C:\desktop\case\5\5.10\date.pdf"), [StartPage=1, EndPage=23])
in
    Source

 

3. Expand the "Data" column.

4. Filter the "Kind" column to keep only "Table".

5. All the data is now in one table.

I could then remove the unneeded columns, keeping only the last two, and apply "Use First Row as Headers".

 

Best Regards
Maggie

 

Community Support Team _ Maggie Li
If this post helps, please consider accepting it as the solution to help other members find it more quickly.
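
The steps above can also be written as a single query. This is only a minimal sketch: it reuses the file path and page range from the post, assumes every per-page fragment has the same column layout, and uses Table.Combine on the "Data" column in place of expanding it through the UI. The step names (TablesOnly, Combined, Promoted) are illustrative.

```
let
    // Read the items found on pages 1 to 23 (path and range taken from the post above)
    Source = Pdf.Tables(File.Contents("C:\desktop\case\5\5.10\date.pdf"), [StartPage = 1, EndPage = 23]),
    // Keep only the rows whose Kind is "Table"
    TablesOnly = Table.SelectRows(Source, each [Kind] = "Table"),
    // Append the per-page fragments stored in the Data column into one table
    Combined = Table.Combine(TablesOnly[Data]),
    // Use the first row as column headers
    Promoted = Table.PromoteHeaders(Combined, [PromoteAllScalars = true])
in
    Promoted
```

If the PDF repeats the header row on every page, those repeated rows would probably remain in the result as ordinary data rows and need to be filtered out afterwards.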


7 REPLIES
Luis_Caston
Helper III

Dear all!

I have the same problem.
Is there a way for this code to pick up the last page automatically?
One file might end on page 23, while another ends on page 32.

let
    Source = Pdf.Tables(File.Contents("C:\desktop\case\5\5.10\date.pdf"), [StartPage=1, EndPage=23])
in
    Source
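
One possible approach, assuming the documented defaults of Pdf.Tables apply to your version: when EndPage is left out of the options record, the function should examine pages through the end of the document, so the page count does not have to be hard-coded. A minimal sketch with the same file path:

```
let
    // No EndPage given: Pdf.Tables is documented to default to the last page of the document
    Source = Pdf.Tables(File.Contents("C:\desktop\case\5\5.10\date.pdf"), [StartPage = 1])
in
    Source
```

The remaining steps (filtering "Kind" and expanding "Data") would stay the same as with a fixed page range.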

 


The solution worked. Thank you so much!! It solved a problem I had been racking my brain over for the last 4 hours 🙂

You are phenomenal! I just spent hours and hours trying to find the solution that you so eloquently presented. Not even ChatGPT could help, but you did. Thank you!

You are a lifesaver! This just saved me hours of cleanup on multiple pdfs. Thank you!

Anonymous
Not applicable

Awesome, Maggie! Gold tip!

 

I had been trying to work around a problem like this without success until I saw this post. I still have one issue, though: in my PDF file the table contains a date, but it is split across two rows.

Example:

2019-07

23

It should be 2019-07-23 on a single row. Any expert advice on how to solve this?

Thanks!


TS
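
A rough sketch of one way to rejoin such values, under a strong assumption that is not confirmed anywhere in the thread: every date arrives as exactly two consecutive rows in a single column. The sample table, the column name Column1, and the step names are hypothetical.

```
let
    // Hypothetical sample data: each date is split across two consecutive rows
    Raw = #table({"Column1"}, {{"2019-07"}, {"23"}, {"2019-08"}, {"05"}}),
    // Group the column's values into consecutive pairs
    Pairs = List.Split(Raw[Column1], 2),
    // Join each pair with a hyphen, e.g. "2019-07" and "23" become "2019-07-23"
    Joined = List.Transform(Pairs, each Text.Combine(_, "-")),
    // Put the joined values back into a single-column table
    Result = Table.FromColumns({Joined}, {"Date"})
in
    Result
```

If the split is irregular (some dates on one row, some on two), fixed-size pairing like this would misalign the rows, so the real data needs checking first. The resulting text column can then be changed to the Date type.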

 

 

@v-juanli-msft,

Thanks for your answer, I've learned something new.

If I have understood correctly, this code doesn't solve my case, because I have more than one table in the same PDF and I don't know in advance which pages each table is on.

 

Regards
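
One way this case might be approached, as a sketch only: read the whole document without a page range, then select the fragments by shape instead of by page. The criterion used here (a table of interest with exactly 2 columns) is a hypothetical example, and a real PDF may need a different test, such as checking the text of the first row.

```
let
    // No page range: examine the whole document (path reused from the accepted solution)
    Source = Pdf.Tables(File.Contents("C:\desktop\case\5\5.10\date.pdf")),
    // Keep only the rows whose Kind is "Table"
    TablesOnly = Table.SelectRows(Source, each [Kind] = "Table"),
    // Hypothetical criterion: the wanted table's fragments all have exactly 2 columns
    Matching = Table.SelectRows(TablesOnly, each Table.ColumnCount([Data]) = 2),
    // Append the matching fragments into one table
    Combined = Table.Combine(Matching[Data])
in
    Combined
```

Repeating the query with a different criterion would give one combined table per logical table in the PDF.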

 
