Solved: Re: extract data into OneLake using Fabric

esraaE · ‎07-14-2024

I have a collection of PDFs from various sources and need to extract their data into OneLake using Fabric.
How can I do that? Is there any documentation or a set of steps I should follow?

NandanHegde · ‎07-14-2024

Assuming the sources are supported by either Dataflow Gen2 or data pipelines, you can extract data from the PDFs with Dataflow Gen2.

The blog post below explains how:

https://datasharkx.wordpress.com/2023/12/03/read-and-import-data-from-pdf-file-using-msft-fabric/

----------------------------------------------------------------------------------------------
Nandan Hegde (MSFT Data MVP)
LinkedIn Profile : www.linkedin.com/in/nandan-hegde-4a195a66
GitHub Profile : https://github.com/NandanHegde15
Twitter Profile : @nandan_hegde15
MSFT MVP Profile : https://mvp.microsoft.com/en-US/MVP/profile/8977819f-95fb-ed11-8f6d-000d3a560942
Topmate : https://topmate.io/nandan_hegde
Blog : https://datasharkx.wordpress.com

esraaE · ‎08-14-2024

How can I write to my Fabric Lakehouse from an external app or system?

We are currently integrating multiple systems and need to push files directly into the Fabric Lakehouse. However, we would like to avoid using notebooks or pipelines for this process. Additionally, we anticipate adding more systems in the future and want to ensure the solution can scale accordingly.

Could you please advise on the best approach to achieve this?
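
One possible direction, offered as a sketch and not as a confirmed answer from this thread: OneLake exposes an ADLS Gen2-compatible endpoint at onelake.dfs.fabric.microsoft.com, so an external system can in principle push files over that API without a notebook or pipeline. The Python below only builds the target URL and the usual create/append/flush request sequence; it sends nothing. The helper names (onelake_file_url, upload_requests) and the workspace and lakehouse names are hypothetical, and each real request would also need a Microsoft Entra ID bearer token, which is assumed here and not shown.

```python
from urllib.parse import quote

# OneLake's ADLS Gen2-compatible (DFS) endpoint.
ONELAKE_DFS = "https://onelake.dfs.fabric.microsoft.com"


def onelake_file_url(workspace: str, lakehouse: str, relative_path: str) -> str:
    """Build the DFS URL of a file under a lakehouse's Files area.

    Hypothetical helper: path layout is <workspace>/<item>.<itemtype>/Files/<path>.
    """
    segments = [
        workspace,
        f"{lakehouse}.Lakehouse",
        "Files",
        *relative_path.strip("/").split("/"),
    ]
    # Percent-encode each segment separately so "/" stays a path separator.
    return ONELAKE_DFS + "/" + "/".join(quote(s, safe="") for s in segments)


def upload_requests(url: str, size: int) -> list[tuple[str, str]]:
    """Return the (method, URL) sequence of an ADLS Gen2 style upload.

    Hypothetical helper: it describes the calls, it does not perform them.
    """
    return [
        ("PUT", f"{url}?resource=file"),                  # create the empty file
        ("PATCH", f"{url}?action=append&position=0"),     # send the bytes
        ("PATCH", f"{url}?action=flush&position={size}"), # commit `size` bytes
    ]


if __name__ == "__main__":
    # Example values are made up for illustration.
    target = onelake_file_url("SalesWS", "Landing", "invoices/2024/a b.pdf")
    for method, request_url in upload_requests(target, 1024):
        print(method, request_url)
```

In practice the same three steps are usually wrapped by an SDK such as azure-storage-file-datalake, pointed at the OneLake endpoint. Because every source system writes to its own folder under Files, adding another system later would mostly mean granting it access and agreeing on a path.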