SuperFiets_
Helper I

Ingesting .xml files from SharePoint library into Lakehouse

Hi everyone,


I'm looking for guidance and thoughts on best practices for using Fabric, specifically Lakehouses/Warehouses, and for ingesting data with Dataflow Gen2 or any other options you can recommend.

I want to ingest .xml files stored in a SharePoint document library into the Lakehouse. The only way I have found to do this is with a Dataflow Gen2: connect to the SharePoint library, open each .xml file's binary content with Xml.Tables, expand all the needed columns, and load the tables that come out of the .xml file into the Lakehouse.
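
The steps described above (open the XML, expand the needed columns, end up with a table) can be pictured with a small sketch. This is plain Python using only the standard library, not Power Query and not a Fabric API; the sample document, the `order` tag and the `xml_to_rows` helper are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; real files in the library will have their own schema.
SAMPLE_XML = """<orders>
  <order id="1"><customer>Ann</customer><total>10.50</total></order>
  <order id="2"><customer>Bob</customer><total>7.25</total></order>
</orders>"""


def xml_to_rows(xml_text, record_tag):
    """Return one dict per <record_tag> element: its attributes plus child-element text."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iter(record_tag):
        row = dict(record.attrib)          # attributes become columns
        for child in record:               # direct children become columns too
            row[child.tag] = (child.text or "").strip()
        rows.append(row)
    return rows


rows = xml_to_rows(SAMPLE_XML, "order")
for row in rows:
    print(row)
```

All values come back as strings, so any type conversion would be a separate step, much like setting column types after expanding in a dataflow.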

 

Does anyone know of another way to parse these .xml files (a lot) faster than Dataflow Gen2? What are my options? Would using OneLake help me get/load the files faster? Should I use PySpark to get the tables from the .xml, or maybe even SQL? We are talking about hundreds of .xml files, ranging from 1 MB to 100 MB and up to 5 GB.
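
For files at the multi-gigabyte end of that range, one general technique is to stream the XML instead of loading it whole. The sketch below uses `iterparse` from Python's standard library; it assumes the file is already reachable as a local path or file object (getting it out of SharePoint is not covered here), and the `stream_records` helper and the `order` tag are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def stream_records(source, record_tag):
    """Yield one dict per completed <record_tag> element without building the full tree first."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            row = dict(elem.attrib)
            for child in elem:
                row[child.tag] = (child.text or "").strip()
            yield row
            # Drop the record's children and text once it has been emitted;
            # the emptied element itself stays attached to the root.
            elem.clear()


# Hypothetical in-memory stand-in for a large file on disk.
sample = io.BytesIO(
    b"<orders>"
    b"<order id='1'><customer>Ann</customer></order>"
    b"<order id='2'><customer>Bob</customer></order>"
    b"</orders>"
)
streamed = list(stream_records(sample, "order"))
print(streamed)
```

Whether this ends up faster than a dataflow depends on where the files live and how they are loaded afterwards, so it is a starting point for testing rather than a recommendation.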

 

Would love to get some extra eyes and experiences on this 🙂

 

Thanks!

1 ACCEPTED SOLUTION

Hi @SuperFiets_ 
You cannot use the pipeline approach (from the article linked in the replies below) for a SharePoint folder. The best option would be to use Dataflow Gen2.
Thanks

4 REPLIES
v-nikhilan-msft
Community Support

Hi @SuperFiets_ 
Thanks for using Fabric Community.
You can refer to this article on using pipelines:
Microsoft Fabric - Ingest XML into Lakehouse - Dan Ambler

Hope this helps. Please let me know if you have any further questions.

Hi,

 

Thanks for your response!

 

How could I adapt the use case you described to a pipeline that gets the data from SharePoint instead of an API? And, more importantly, would this work for multiple .xml files?

Hi @SuperFiets_ 
You cannot use the above scenario for a SharePoint folder. The best option would be to use Dataflow Gen2.
Thanks

Thank you, in that case I will be using dataflows for now.
