Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateJoin the OneLake & Platform Admin teams for an ask US anything on July 16th. Join now.
I have a collection of PDFs from various sources and need to extract data into OneLake using Fabric.
How can I do that? Are there any documentation or steps I should follow?
Solved! Go to Solution.
Assuming the sources are supported by either Dataflow Gen 2 or datapipelines, you can extract data from PDF via dataflow Gen 2 .
Below blog explains the same:
https://datasharkx.wordpress.com/2023/12/03/read-and-import-data-from-pdf-file-using-msft-fabric/
Assuming the sources are supported by either Dataflow Gen 2 or datapipelines, you can extract data from PDF via dataflow Gen 2 .
Below blog explains the same:
https://datasharkx.wordpress.com/2023/12/03/read-and-import-data-from-pdf-file-using-msft-fabric/
Hi @esraaE
First extract the data, you might consider using a tool or library to extract the data from the PDF.
After extracting the data, you may need to clean the data and convert it into a suitable format (for example, CSV, JSON) to load into OneLake.
Here are some information about connecting to onelake:
How do I connect to OneLake? - Microsoft Fabric | Microsoft Learn
Options to get data into the Lakehouse - Microsoft Fabric | Microsoft Learn
Access OneLake with Python - Microsoft Fabric | Microsoft Learn
Regards,
Nono Chen
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.
Check out the June 2025 Fabric update to learn about new features.
User | Count |
---|---|
2 | |
1 | |
1 | |
1 | |
1 |