SuperFiets_
Helper I

Ingesting .xml files from SharePoint library into Lakehouse

Hi everyone,


I am looking for guidance and thoughts on best practices for using Fabric, specifically Lakehouses/Warehouses, and for ingesting data with Dataflow Gen2 or any other option you can recommend.

I want to ingest .xml files stored in a SharePoint document library into the Lakehouse. The only way I have found so far is a Dataflow Gen2: connect to the SharePoint library, open each .xml file's binary content with Xml.Tables, expand the needed columns, and load the resulting tables into the Lakehouse.
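To illustrate the flattening step I mean (binary content in, table rows out), here is a minimal sketch in plain Python using only the standard library. It is not what Dataflow Gen2 or Xml.Tables does internally, just the same idea under my own assumptions: the function name `xml_to_rows` and the sample tags `orders`, `order`, `customer` and `total` are made up for the example.

```python
import xml.etree.ElementTree as ET


def xml_to_rows(xml_bytes, record_tag):
    """Flatten every <record_tag> element into one dict per row.

    Hypothetical helper: attributes and direct child elements become columns.
    """
    root = ET.fromstring(xml_bytes)  # parse the file's binary content
    rows = []
    for record in root.iter(record_tag):
        row = dict(record.attrib)  # attributes become columns
        for child in record:
            # the text of each direct child becomes a column value
            row[child.tag] = (child.text or "").strip()
        rows.append(row)
    return rows


# Made-up sample standing in for one file's binary content.
sample = (
    b"<orders>"
    b"<order id='1'><customer>Ann</customer><total>9.50</total></order>"
    b"<order id='2'><customer>Bob</customer><total>12.00</total></order>"
    b"</orders>"
)
rows = xml_to_rows(sample, "order")
```

Each dict in `rows` corresponds to one row of the table that would be loaded; deeper nesting would need its own expansion step, much like expanding nested columns in the dataflow.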

 

Does anyone know of another way to parse these .xml files (a lot) faster than Dataflow Gen2? What are my options? Would OneLake help me get or load the files faster? Should I use PySpark to extract the tables from the .xml, or maybe even SQL? We are talking about hundreds of .xml files ranging from 1 MB to 100 MB to 5 GB.
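For the largest files, what I have in mind is something that streams the XML rather than loading it whole. A rough sketch of that idea, again plain Python with the standard library and not PySpark or anything Fabric-specific: `iter_rows` and the sample tags are hypothetical names, and I am assuming the file is already reachable as a local path or file object.

```python
import io
import xml.etree.ElementTree as ET


def iter_rows(source, record_tag):
    """Yield one dict per <record_tag> element while reading incrementally.

    `source` may be a file path or a binary file object.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            # the text of each direct child becomes a column value
            yield {child.tag: (child.text or "").strip() for child in elem}
            elem.clear()  # drop the processed record's children to limit memory


# Made-up sample; a real call would pass a path to a large .xml file instead.
sample = io.BytesIO(
    b"<orders>"
    b"<order><customer>Ann</customer><total>9.50</total></order>"
    b"<order><customer>Bob</customer><total>12.00</total></order>"
    b"</orders>"
)
streamed = list(iter_rows(sample, "order"))
```

Because the generator hands out one record at a time, rows could be written out in batches instead of being collected into a list as in this small example.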

 

Would love to get some extra eyes and experiences on this 🙂

 

Thanks!

4 REPLIES
v-nikhilan-msft
Community Support

Hi @SuperFiets_ 
Thanks for using Fabric Community.
You can refer to this link for an approach that uses pipelines:
Microsoft Fabric - Ingest XML into Lakehouse - Dan Ambler

Hope this helps. Please let me know if you have any further questions.

Hi,

 

Thanks for your response!

 

How would I adapt the use case you linked into a pipeline that gets the data from SharePoint instead of an API? And, more importantly, would it work for multiple .xml files?

Hi @SuperFiets_ 
You cannot use the above scenario for a SharePoint folder. The best option would be to use Dataflow Gen2.
Thanks

Thank you, in that case I will be using dataflows for now.
