-_
Regular Visitor

Reading multiple files from SharePoint (with different schemas) to Lakehouse

In a Dataflow Gen2 I have connected to a SharePoint folder and filtered on the Path column to get just the files I want.
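
For reference, the query looks roughly like this (the site URL and folder name are anonymized placeholders):

let
    // Connect to the SharePoint document library
    Source = SharePoint.Files("https://contoso.sharepoint.com/sites/MySite", [ApiVersion = 15]),
    // Keep only the CSV files I want, based on their path
    csv_list = Table.SelectRows(Source, each Text.Contains([Folder Path], "/MyFolder/") and [Extension] = ".csv")
in
    csv_list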

 

[Screenshot: the csv_list query with files filtered on the Path column]

 

Now I want to load the content of each of those csv files and save it to a table on the Lakehouse.

 

By right-clicking the csv_list query and selecting "Reference", I made three other queries.

 

I clicked '[Binary]' on one of the files, and it opened the file up and loaded it as a table.
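
The generated steps for one of the referenced queries look something like this (the file name comes from the screenshots; the delimiter and encoding options are what the UI typically guesses):

let
    Source = csv_list,
    // Drill into the Content cell of one file
    FileBinary = Source{[Name = "VBAP.csv"]}[Content],
    // Parse the binary as CSV and promote the first row to headers
    Imported = Csv.Document(FileBinary, [Delimiter = ",", Encoding = 65001, QuoteStyle = QuoteStyle.Csv]),
    #"Promoted Headers" = Table.PromoteHeaders(Imported, [PromoteAllScalars = true])
in
    #"Promoted Headers"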

 

[Screenshot: the CSV content loaded as a table]

 

I set the data destination for these three extra queries to the Lakehouse and thought I was done.

 

I tried saving the dataflow and at first got a notification that it failed to save. A couple of minutes later it said it had saved successfully. But when I tried to run the flow, it failed because of errors in the three queries that reference the one pulling from SharePoint. When I opened the dataflow again, it had somehow added a "Remove Columns" step that got rid of the Content column with the binaries.

 

 

[Screenshot: the auto-added "Remove Columns" step in the query]
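
As far as I can tell from the applied steps pane, the injected step is equivalent to something like this (my reconstruction, not copied from the dataflow):

#"Removed Columns" = Table.RemoveColumns(csv_list, {"Content"})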

 

I removed this Remove Columns step and saved again, but the same thing happened...

 

Am I approaching this completely wrong? How can I get it to work?

1 ACCEPTED SOLUTION
lbendlin
Super User

Since you need to reference manually (Power Query does not support dynamic query addition), you can just as well create three separate queries without the reference.

The "Remove Columns" step is "helpfully" making sure that only basic column types are returned; "Binary" is unsupported.


5 REPLIES
v-karpurapud
Community Support

Hi @-_ 

Thank you for reaching out to the Microsoft Fabric Community Forum.

Thank you @lbendlin for your response.

Regarding your issue with reading multiple files from SharePoint into a Lakehouse, please follow the steps below:

  1. Connect to the SharePoint Online Folder: Filter the files to include only VBAP.csv, VBAK.csv, and VBRP.csv, making sure to keep the Content (binary) column intact.
  2. Create Referenced Queries: For each file (VBAP, VBAK, VBRP), right-click csv_list and select "Reference."
  3. Expand the Content Column: In each referenced query, click on [Binary], select Text/CSV format, and promote headers if necessary.
  4. Transform Each Query Independently: Handle schema differences by setting correct data types and renaming columns as needed (see the sketch after this list).
  5. Set Lakehouse Destination: Create a new table for each query (e.g., table_VBAP, table_VBAK, table_VBRP) and choose the appropriate update method (Replace or Append).
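
For step 4, a sketch of setting explicit types in one of the queries (the column names and types here are hypothetical placeholders; use the ones in your files):

let
    Source = csv_list,
    Imported = Csv.Document(Source{[Name = "VBRP.csv"]}[Content], [Delimiter = ",", Encoding = 65001, QuoteStyle = QuoteStyle.Csv]),
    Promoted = Table.PromoteHeaders(Imported, [PromoteAllScalars = true]),
    // Give every column an explicit, destination-supported type
    Typed = Table.TransformColumnTypes(Promoted, {{"DocumentId", type text}, {"Amount", type number}, {"PostingDate", type date}})
in
    Typed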

If this response resolves your query, please mark it as the Accepted Solution to assist other community members. Additionally, a Kudos is appreciated if you found the response helpful.

Thank you!

-_
Regular Visitor

This is exactly what I was already doing, but I was getting the issue I described. I think I will have to do what lbendlin said and create separate flows for each CSV, unless there's another way?

Hi @-_ 

Thanks for the update. I hope your issue gets resolved soon. When it does, please share the insights here and mark this or any other helpful answer as 'Accept as Solution', which will help others with similar queries.

Thank you.

 

 


-_
Regular Visitor

Thank you. I was trying to avoid that, because there will be many more CSVs than just these, but I think it will be the simplest solution for now. Would it work to put lots of these individual dataflows into one data pipeline, or will they each need their own pipeline? (The CSVs are updated daily, so I will need to schedule a regular refresh.)
