Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Be one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now

Reply
jasonnett
Regular Visitor

How do I save a file generated in a notebook to a lakehouse?

I've just started working with Microsoft Fabric for the first time coming from a long career in software, architecture, and data science at a much lower level--I'm definitely more comfortable with terminal than UIs for technical work.  Here's one example why:

 

I have a scenario in which I have loaded CSV data into a lakehouse, then load it into a Delta-Parquet table.  Assume I perform some transformation on that into another table, then I want to generate an Excel file (NOT a CSV file) from a python notebook using a package like `xlsxwriter`, which includes formatting and formulas in addtion to data.

 

How do I save that output `*.xlsx` file to a retrievable location, such as the same lakehouse I loaded the original CSV file to?

 

More specifically, suppose I have a notebook open in a Fabric workspace and execute:

expenses = (
    ['Rent', 1000],
    ['Gas',   100],
    ['Food',  300],
    ['Gym',    50],
)

!pip install xlsxwriter
import xlsxwriter

# Create a workbook and add a worksheet.
workbook = xlsxwriter.Workbook('Expenses01.xlsx')
worksheet = workbook.add_worksheet()

# Some data we want to write to the worksheet.
expenses = (
    ['Rent', 1000],
    ['Gas',   100],
    ['Food',  300],
    ['Gym',    50],
)

# Start from the first cell. Rows and columns are zero indexed.
row = 0
col = 0

# Iterate over the data and write it out row by row.
for item, cost in (expenses):
    worksheet.write(row, col,     item)
    worksheet.write(row, col + 1, cost)
    row += 1

# Write a total using a formula.
worksheet.write(row, 0, 'Total')
worksheet.write(row, 1, '=SUM(B1:B4)')
workbook.close()


I know from this that the file is created:

# We created an Excel file, but where is it?

import os
files = [f for f in os.listdir('.') if os.path.isfile(f)]
files

because the print statement shows it:

['container_tokens',
 '.launch_container.sh.crc',
 'default_container_executor.sh',
 'default_container_executor_session.sh',
 'launch_container.sh',
 '.container_tokens.crc',
 '.default_container_executor_session.sh.crc',
 '.default_container_executor.sh.crc',
 'Expenses01.xlsx']

 But I don't know how to retrieve that file or direct it to a retrievable location.

1 ACCEPTED SOLUTION
jasonnett
Regular Visitor

Okay, I found a solution, but it doesn't strike me as a *Fabric-intended* solution.  

 

First, executing 

import os
os.getcwd()
from the notebook shows me a `/mnt/...` location, so it look like a Linux/Unix mount point.  That gave me hope that the lakehouse location also has an accessible mount.  So, if I explore
import os
os.listdir('/')
 I do indeed see a `lakehouse/` folder and a quick exploration of that reveals that it is the lakehouse location that I want to copy my generated Excel file to.  Now I can use python's `shutil` to do a file copy
 
import shutil
generated_file = 'Expenses.xlsx'
target_abfss = '/lakehouse/default/Files/Sales/Expenses.xlsx'
shutil.copyfile(generated_file, target_abfss)
 That does it.  The file I generated in the workbook is now in the lakehouse files.
 
My suggestion for any MS developers out there working on this is to make a more accessible Fabric-y way to do this kind of file transfer (`notebookutils` turned out to be a red-herring as far as I can tell).
 

View solution in original post

5 REPLIES 5
tharunkumarRTK
Super User
Super User

@jasonnett 

 

To save a file you can use this code 

import os, json
with open('/lakehouse/default/Files/<location>/file1.json','w') as f:
    json.dump(reponse.json(),f)

 

Need a Power BI Consultation? Hire me on Upwork

 

 

 

Connect on LinkedIn

 

 

 








Did I answer your question? Mark my post as a solution!
If I helped you, click on the Thumbs Up to give Kudos.

Proud to be a Super User!


PBI_SuperUser_Rank@2x.png
jasonnett
Regular Visitor

Okay, I found a solution, but it doesn't strike me as a *Fabric-intended* solution.  

 

First, executing 

import os
os.getcwd()
from the notebook shows me a `/mnt/...` location, so it look like a Linux/Unix mount point.  That gave me hope that the lakehouse location also has an accessible mount.  So, if I explore
import os
os.listdir('/')
 I do indeed see a `lakehouse/` folder and a quick exploration of that reveals that it is the lakehouse location that I want to copy my generated Excel file to.  Now I can use python's `shutil` to do a file copy
 
import shutil
generated_file = 'Expenses.xlsx'
target_abfss = '/lakehouse/default/Files/Sales/Expenses.xlsx'
shutil.copyfile(generated_file, target_abfss)
 That does it.  The file I generated in the workbook is now in the lakehouse files.
 
My suggestion for any MS developers out there working on this is to make a more accessible Fabric-y way to do this kind of file transfer (`notebookutils` turned out to be a red-herring as far as I can tell).
 

Awesome workaround @jasonnett ,

an interesting observation I had was that on the line below I couldn't put the absolute ADFSS filepath and I had to use the relative path with the default lakehouse .

target_abfss = '/lakehouse/default/Files/Sales/Expenses.xlsx'

  Therefore, I needed to set the default lakehouse first using the mechanism described in the post below:
https://community.fabric.microsoft.com/t5/General-Discussion/How-to-set-default-Lakehouse-in-the-not...

 

I was wondering if anyone had a better way of doing this.

jasonnett
Regular Visitor

Okay, I found

NotebookUtils (former MSSparkUtils) for Fabric - Microsoft Fabric | Microsoft Learn

but I can't get that to work yet either.

 

The file I'm generating and want to move to a lakehouse starts its life on a Spark cluster, so there's no ABFS path.

jasonnett
Regular Visitor

I realize that compute and storage are treated separately in a platform like Fabric.  So, I guess the question could also be reframed as: if you generate any generic blob file in a notebook environment, how do you post it to the storage location corresponding to the lakehouse files? 

I tried using python's standard `requests` package, but I was denied because of a missing bearer token that I don't have.

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

Dec Fabric Community Survey

We want your feedback!

Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.

ArunFabCon

Microsoft Fabric Community Conference 2025

Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.

December 2024

A Year in Review - December 2024

Find out what content was popular in the Fabric community during 2024.