I'm trying to import data from a PDF file, but a table that spans more than one page is imported into Power BI as multiple tables.
Is there any workaround?
Hi @marcodea
I ran a test for your scenario:
I have a PDF with a large table spanning 23 pages (page 1 to page 23).
1. Open Edit Queries and create a new blank query.
2. Open its Advanced Editor and enter this code:
let
Source = Pdf.Tables(File.Contents("C:\desktop\case\5\5.10\date.pdf"), [StartPage=1, EndPage=23])
in
Source
3. Expand the "Data" column.
4. Filter the "Kind" column to keep only "Table".
5. All the data now appears in one table.
From there I could remove the columns I didn't need, keep only the last two, and then apply "Use First Row as Headers".
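For anyone who prefers to do everything in the Advanced Editor, the steps above can be sketched as a single query. This is only a sketch, not exactly what the UI generates: it assumes the output of Pdf.Tables has the usual "Kind" and "Data" columns, and it uses Table.Combine to stack the per-page tables instead of the Expand button. The step names (TablesOnly, Combined, LastTwo, Promoted) are made up for illustration.

```powerquery
let
    // Path and page range are the ones from the post; adjust them for your own file
    Source = Pdf.Tables(File.Contents("C:\desktop\case\5\5.10\date.pdf"), [StartPage=1, EndPage=23]),
    // Step 4: keep only the entries detected as tables (drop the "Page" entries)
    TablesOnly = Table.SelectRows(Source, each [Kind] = "Table"),
    // Steps 3 and 5: stack the tables held in the "Data" column into one table
    Combined = Table.Combine(TablesOnly[Data]),
    // Keep only the last two columns, whatever they happen to be named
    LastTwo = Table.SelectColumns(Combined, List.LastN(Table.ColumnNames(Combined), 2)),
    // Equivalent of "Use First Row as Headers"
    Promoted = Table.PromoteHeaders(LastTwo, [PromoteAllScalars=true])
in
    Promoted
```

If the PDF repeats the header row on every page, those repeated rows will probably still be in the result and need to be filtered out afterwards.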
Best Regards
Maggie
Community Support Team _ Maggie Li
If this post helps, please consider accepting it as the solution so that other members can find it more quickly.
You are a lifesaver! This just saved me hours of cleanup on multiple pdfs. Thank you!
Awesome, Maggie! Gold tip!
I had been trying to work around a problem like this without success until I saw this. I still have one issue, though: in my PDF file the table has a date in it, but the date is split across two different rows:
Example:
2019-07
23
when it should be 2019-07-23 on the same row. Any expert advice on how to solve this?
Thanks!
TS
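One possible way to stitch such a date back together is sketched below. It is untested against the real PDF and rests on several assumptions: the column is called "Date" (a hypothetical name), the year-month part is always exactly 7 characters like "2019-07", and the day is always on the row directly below it. The sample table is made up to mimic the example above.

```powerquery
let
    // Made-up sample mimicking the split: year-month on one row, day on the next
    Raw = #table({"Date"}, {{"2019-07"}, {"23"}, {"2019-08"}, {"05"}}),
    Indexed = Table.AddIndexColumn(Raw, "Index", 0, 1),
    // Buffer the column as a list so the next row can be looked up by position
    Values = List.Buffer(Indexed[Date]),
    // Assumption: a year-month fragment is 7 characters long with "-" as the 5th character
    IsYearMonth = (t as nullable text) as logical =>
        t <> null and Text.Length(t) = 7 and Text.Middle(t, 4, 1) = "-",
    // Keep only the year-month rows...
    Heads = Table.SelectRows(Indexed, each IsYearMonth([Date])),
    // ...and append the day found on the row immediately below each of them
    Merged = Table.AddColumn(Heads, "FullDate",
        each [Date] & "-" & Text.PadStart(Values{[Index] + 1}, 2, "0"), type text),
    // Convert the rebuilt text to a proper date
    Result = Table.TransformColumns(
        Table.SelectColumns(Merged, {"FullDate"}),
        {{"FullDate", Date.FromText, type date}})
in
    Result
```

If a year-month fragment ever sits on the very last row, the lookup of the following row will raise an error, so a real query would need a guard for that case.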
Hi @v-juanli-msft,
thanks for your answer, I've learned something new.
If I have understood correctly, this code doesn't solve my case, since I have more than one table in the same PDF and I don't know in advance which pages each table is on.
Regards
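A possible starting point for that situation, offered only as a sketch: leave out StartPage and EndPage so the whole document is scanned, then group the detected tables by how many columns they have and combine each group. The big assumption here is that pieces of the same logical table share a column count and that different tables do not, which may not hold for every PDF. The path is the one from the earlier reply, and the column names "ColumnCount" and "Combined" are made up.

```powerquery
let
    // No StartPage/EndPage: scan the whole document
    Source = Pdf.Tables(File.Contents("C:\desktop\case\5\5.10\date.pdf")),
    TablesOnly = Table.SelectRows(Source, each [Kind] = "Table"),
    // Assumption: fragments of the same logical table have the same number of columns
    WithWidth = Table.AddColumn(TablesOnly, "ColumnCount",
        each Table.ColumnCount([Data]), Int64.Type),
    // One output row per distinct column count, with its fragments stacked together
    Grouped = Table.Group(WithWidth, {"ColumnCount"},
        {{"Combined", each Table.Combine([Data]), type table}})
in
    Grouped
```

Each row of the result then holds one combined table in the "Combined" column, which can be drilled into or referenced from a separate query.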