VIvMouret
Frequent Visitor

Download Lakehouse Files

Hi everyone!

From a notebook in the default Spark environment, I save a table to the same Lakehouse as a file.
I need to build an application that retrieves the existing files from the Lakehouse Files section, so that they can be extracted, downloaded, etc.

To do this, I created a GraphQL API and a single-page application using React and Node.js.

[screenshot: VIvMouret_0-1737969481422.png]

Am I missing something to achieve my goal here, or do I have what I need?
While digging through the forum topics, I came across these two potential solutions:

  • Azure Data Lake Storage Gen2
  • Virtual network or on-premises data gateways

What do you think I should use? And if I'm on the wrong track, what should I do instead?
Thank you in advance!

1 ACCEPTED SOLUTION
VIvMouret
Frequent Visitor

Update on my problem:
All the technologies mentioned above (unless I'm mistaken) only serve to update and push new data into a Lakehouse, and that's not what we're looking for here.

To answer my actual question, which is:
I need to make the files I've created available to colleagues.

To do this, I changed direction and used OneLake directly.

I assigned specific, restricted roles and installed the OneLake client on all the machines so that each PC then had access to the desired folder.
By pushing the files into a new Lakehouse in a new workspace and managing the roles, I was able to export the data.

However, I'm keeping all this for development only; I wouldn't recommend it for production.

I have other ideas for putting the export of Lakehouse files into production, such as Power Pages, but since I can't find any way of extracting the files at the moment, that idea is on hold.

If anyone comes up with an alternative method, don't hesitate to share!


9 REPLIES

nilendraFabric
Community Champion

@VIvMouret 

Try this approach instead of GraphQL:
Use the Microsoft Fabric REST API. This allows you to programmatically access and download files by authenticating your app via Microsoft Entra (Azure AD). The process involves:

  1. App Registration: Register your application in Microsoft Entra ID, assign permissions (e.g., Lakehouse.Read.All), and obtain credentials (Client ID, Client Secret, Tenant ID).
  2. Authentication: Use these credentials to obtain an access token for API requests.
  3. File Access: Use the Fabric REST API to list files in the Lakehouse and construct URLs for downloading them.

https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-api
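Step 2 (authentication) can be sketched with the standard client-credentials flow against the Entra v2.0 token endpoint, using only the Python standard library. Note the scope value `https://storage.azure.com/.default` is an assumption for OneLake's storage endpoint; verify it for your tenant:

```python
import json
import urllib.parse
import urllib.request

def build_token_request(tenant_id, client_id, client_secret,
                        scope="https://storage.azure.com/.default"):
    """Build the (url, form-encoded body) pair for a client-credentials
    token request to Microsoft Entra ID."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": scope,
    }).encode()
    return url, body

def get_token(tenant_id, client_id, client_secret):
    """POST the token request and extract the bearer token from the JSON reply."""
    url, body = build_token_request(tenant_id, client_id, client_secret)
    with urllib.request.urlopen(urllib.request.Request(url, data=body)) as resp:
        return json.load(resp)["access_token"]
```

The returned token is then used as the `Bearer` value in the `Authorization` header of every API call.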



Something like this:

import requests

# Assumes `token` already holds a valid Microsoft Entra access token
file_url = "https://onelake.dfs.fabric.microsoft.com/<workspace>/<lakehouse>.lakehouse/path/to/file"
headers = {"Authorization": f"Bearer {token}"}

response = requests.get(file_url, headers=headers)
if response.status_code == 200:
    # Write the raw bytes to a local file
    with open("downloaded_file", "wb") as file:
        file.write(response.content)
    print("File downloaded successfully!")
else:
    print(f"Failed to download file: {response.status_code}")
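The snippet above downloads one known file; to discover which files exist first, OneLake's ADLS Gen2 compatibility means the standard "Path - List" operation should work, with the workspace acting as the filesystem. A stdlib-only sketch, assuming the `<lakehouse>.Lakehouse/Files` path layout (verify both the layout and casing against your tenant):

```python
import json
import urllib.parse
import urllib.request

def build_list_url(workspace, lakehouse, subdir=""):
    """Build an ADLS Gen2 'Path - List' URL for a Lakehouse's Files section.
    OneLake exposes the workspace as the filesystem; the
    <lakehouse>.Lakehouse/Files layout is an assumption to verify."""
    directory = f"{lakehouse}.Lakehouse/Files"
    if subdir:
        directory += f"/{subdir}"
    query = urllib.parse.urlencode({
        "resource": "filesystem",
        "directory": directory,
        "recursive": "true",
    })
    return f"https://onelake.dfs.fabric.microsoft.com/{workspace}?{query}"

def list_files(workspace, lakehouse, token, subdir=""):
    """Return the names of all paths under the Lakehouse Files section."""
    req = urllib.request.Request(
        build_list_url(workspace, lakehouse, subdir),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return [p["name"] for p in json.load(resp)["paths"]]
```

Each returned name can then be appended to the workspace URL to build the download URL used in the snippet above.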



See if this works.

Thanks

@nilendraFabric

I've already followed your first paragraph as a POC.

To give you an idea of my code, I followed this documentation at the beginning:

https://learn.microsoft.com/en-us/fabric/data-engineering/connect-apps-api-graphql
Then I moved on to this one, as I was prompted to along the way:
https://learn.microsoft.com/en-us/entra/identity-platform/tutorial-single-page-app-react-prepare-spa...

 

The documentation link you sent me doesn't include a query for downloading a file from a Lakehouse,
and the example is in Python anyway; I'm in NodeJS...

nilendraFabric
Community Champion

@VIvMouret Please accept the solution if this resolves your query, as it will help the community find the answer quickly.

nilendraFabric
Community Champion

Hi @VIvMouret 

In Microsoft Fabric, data in your Lakehouse is automatically stored in OneLake (backed by Azure Data Lake Storage Gen2). Since you have already created a GraphQL API layer, you should be able to query and download files through that endpoint. Your single-page React application can call the GraphQL API to list, extract, and download Lakehouse files. If your application and the API run within Microsoft Fabric, or in an environment that has direct access to Fabric resources, you do not need additional services.

https://learn.microsoft.com/en-us/fabric/data-engineering/connect-apps-api-graphql

https://community.fabric.microsoft.com/t5/Data-Science/How-to-get-lakehouse-files-into-Azure-Functio...

 

 

OneLake provides open access to Fabric items through ADLS Gen2-compatible APIs:

Assign the appropriate permissions to your application in Azure (e.g., the "Storage Blob Data Reader" role for ADLS Gen2)
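Because OneLake speaks the ADLS Gen2 protocol, the `abfss://` paths shown in a Fabric notebook map directly onto HTTPS URLs on the DFS endpoint. A small illustrative converter, assuming the `abfss://<workspace>@<host>/<item path>` layout used in notebooks:

```python
from urllib.parse import urlparse

def abfss_to_https(abfss_path):
    """Map an abfss:// OneLake path (as shown in Fabric notebooks) to the
    equivalent HTTPS URL on the DFS endpoint, suitable for a GET request
    with a bearer token. Assumes abfss://<workspace>@<host>/<item path>."""
    parsed = urlparse(abfss_path)
    workspace = parsed.username  # the part before the "@"
    return f"https://{parsed.hostname}/{workspace}{parsed.path}"
```

For example, `abfss://myws@onelake.dfs.fabric.microsoft.com/mylh.Lakehouse/Files/report.csv` becomes `https://onelake.dfs.fabric.microsoft.com/myws/mylh.Lakehouse/Files/report.csv`.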

Thanks 

Thank you for your quick reply!
I'm going to test with ADLS Gen2, since I already have the GraphQL API ready.

Thanks @VIvMouret, please keep me posted. It's interesting to learn about this use case.

I've just tested your code, but I can't access the "files" properties.

[screenshot: VIvMouret_1-1737976415079.png]

When I search in "Get data", I don't have direct access to the Lakehouse files.
Should I normally be able to view the files in the Lakehouse?

[screenshot: VIvMouret_0-1737976842713.png]

Because I can't see them...

I still need to check that I have read-only permissions for the API

Hi @VIvMouret 

I have tried the GraphQL query too; it's not working. So you are correct that it is not supported. I am trying a few other things and will share soon.

And it seems you only have access to tables from the GraphQL API, not to files. So we have to figure out a different approach to query files.

Thanks

