Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Don't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.

Reply
jasonnett
Regular Visitor

How do I save a file generated in a notebook to a lakehouse?

I've just started working with Microsoft Fabric for the first time coming from a long career in software, architecture, and data science at a much lower level--I'm definitely more comfortable with terminal than UIs for technical work.  Here's one example why:

 

I have a scenario in which I have loaded CSV data into a lakehouse, then load it into a Delta-Parquet table.  Assume I perform some transformation on that into another table, then I want to generate an Excel file (NOT a CSV file) from a python notebook using a package like `xlsxwriter`, which includes formatting and formulas in addtion to data.

 

How do I save that output `*.xlsx` file to a retrievable location, such as the same lakehouse I loaded the original CSV file to?

 

More specifically, suppose I have a notebook open in a Fabric workspace and execute:

expenses = (
    ['Rent', 1000],
    ['Gas',   100],
    ['Food',  300],
    ['Gym',    50],
)

!pip install xlsxwriter
import xlsxwriter

# Create a workbook and add a worksheet.
workbook = xlsxwriter.Workbook('Expenses01.xlsx')
worksheet = workbook.add_worksheet()

# Some data we want to write to the worksheet.
expenses = (
    ['Rent', 1000],
    ['Gas',   100],
    ['Food',  300],
    ['Gym',    50],
)

# Start from the first cell. Rows and columns are zero indexed.
row = 0
col = 0

# Iterate over the data and write it out row by row.
for item, cost in (expenses):
    worksheet.write(row, col,     item)
    worksheet.write(row, col + 1, cost)
    row += 1

# Write a total using a formula.
worksheet.write(row, 0, 'Total')
worksheet.write(row, 1, '=SUM(B1:B4)')
workbook.close()


I know from this that the file is created:

# We created an Excel file, but where is it?

import os
files = [f for f in os.listdir('.') if os.path.isfile(f)]
files

because the print statement shows it:

['container_tokens',
 '.launch_container.sh.crc',
 'default_container_executor.sh',
 'default_container_executor_session.sh',
 'launch_container.sh',
 '.container_tokens.crc',
 '.default_container_executor_session.sh.crc',
 '.default_container_executor.sh.crc',
 'Expenses01.xlsx']

 But I don't know how to retrieve that file or direct it to a retrievable location.

1 ACCEPTED SOLUTION
jasonnett
Regular Visitor

Okay, I found a solution, but it doesn't strike me as a *Fabric-intended* solution.  

 

First, executing 

import os
os.getcwd()
from the notebook shows me a `/mnt/...` location, so it look like a Linux/Unix mount point.  That gave me hope that the lakehouse location also has an accessible mount.  So, if I explore
import os
os.listdir('/')
 I do indeed see a `lakehouse/` folder and a quick exploration of that reveals that it is the lakehouse location that I want to copy my generated Excel file to.  Now I can use python's `shutil` to do a file copy
 
import shutil
generated_file = 'Expenses.xlsx'
target_abfss = '/lakehouse/default/Files/Sales/Expenses.xlsx'
shutil.copyfile(generated_file, target_abfss)
 That does it.  The file I generated in the workbook is now in the lakehouse files.
 
My suggestion for any MS developers out there working on this is to make a more accessible Fabric-y way to do this kind of file transfer (`notebookutils` turned out to be a red-herring as far as I can tell).
 

View solution in original post

5 REPLIES 5
tharunkumarRTK
Super User
Super User

@jasonnett 

 

To save a file you can use this code 

import os, json
with open('/lakehouse/default/Files/<location>/file1.json','w') as f:
    json.dump(reponse.json(),f)

 

Need a Power BI Consultation? Hire me on Upwork

 

 

 

Connect on LinkedIn

 

 

 








Did I answer your question? Mark my post as a solution!
If I helped you, click on the Thumbs Up to give Kudos.

Proud to be a Super User!


PBI_SuperUser_Rank@2x.png
jasonnett
Regular Visitor

Okay, I found a solution, but it doesn't strike me as a *Fabric-intended* solution.  

 

First, executing 

import os
os.getcwd()
from the notebook shows me a `/mnt/...` location, so it look like a Linux/Unix mount point.  That gave me hope that the lakehouse location also has an accessible mount.  So, if I explore
import os
os.listdir('/')
 I do indeed see a `lakehouse/` folder and a quick exploration of that reveals that it is the lakehouse location that I want to copy my generated Excel file to.  Now I can use python's `shutil` to do a file copy
 
import shutil
generated_file = 'Expenses.xlsx'
target_abfss = '/lakehouse/default/Files/Sales/Expenses.xlsx'
shutil.copyfile(generated_file, target_abfss)
 That does it.  The file I generated in the workbook is now in the lakehouse files.
 
My suggestion for any MS developers out there working on this is to make a more accessible Fabric-y way to do this kind of file transfer (`notebookutils` turned out to be a red-herring as far as I can tell).
 

Awesome workaround @jasonnett ,

an interesting observation I had was that on the line below I couldn't put the absolute ADFSS filepath and I had to use the relative path with the default lakehouse .

target_abfss = '/lakehouse/default/Files/Sales/Expenses.xlsx'

  Therefore, I needed to set the default lakehouse first using the mechanism described in the post below:
https://community.fabric.microsoft.com/t5/General-Discussion/How-to-set-default-Lakehouse-in-the-not...

 

I was wondering if anyone had a better way of doing this.

jasonnett
Regular Visitor

Okay, I found

NotebookUtils (former MSSparkUtils) for Fabric - Microsoft Fabric | Microsoft Learn

but I can't get that to work yet either.

 

The file I'm generating and want to move to a lakehouse starts its life on a Spark cluster, so there's no ABFS path.

jasonnett
Regular Visitor

I realize that compute and storage are treated separately in a platform like Fabric.  So, I guess the question could also be reframed as: if you generate any generic blob file in a notebook environment, how do you post it to the storage location corresponding to the lakehouse files? 

I tried using python's standard `requests` package, but I was denied because of a missing bearer token that I don't have.

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

Jan NL Carousel

Fabric Community Update - January 2025

Find out what's new and trending in the Fabric community.