VenDaFabricator
Frequent Visitor

Writing Base64 string as file into Onelake /Files/<folder>

I have an existing solution in a Synapse notebook where I get a Base64 string from a database table and write it as a file to Blob Storage (ADLS Gen2). The existing notebook code is below:

 

*I was using a SAS token for the connection string*

 

import base64
from azure.storage.blob import BlobServiceClient
from pyspark.sql.functions import udf
from pyspark.sql.types import BinaryType

# Initialize Azure Blob Service Client
connection_string = "DefaultEndpointsProtocol=https;AccountName=xxxxx;AccountKey=xxxxxxx;EndpointSuffix=core.windows.net"  # Replace with your connection string
container_name = "sandpit/Attachments"  # Replace with your container name (here: container plus folder prefix)
blob_service_client = BlobServiceClient.from_connection_string(connection_string)

def write_file_to_blob(data, filename):
    blob_client = blob_service_client.get_blob_client(container=container_name, blob=filename)
    blob_client.upload_blob(data, overwrite=True)

# UDF to decode Base64
def decode_base64(base64_str):
    return base64.b64decode(base64_str)

# Register UDF
decode_udf = udf(decode_base64, BinaryType())

 

and the above functions are consumed as:

 

collected_data = df_with_decoded_data.collect()

# Write each file to blob storage
for row in collected_data:
    write_file_to_blob(row['DecodedData'], row['FinalFileName'])
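For reference, the decode-and-write flow above can be exercised locally without Spark or Azure. A minimal sketch in plain Python, where the hypothetical `rows` list stands in for the collected DataFrame rows and a temp directory stands in for blob storage:

```python
import base64
import os
import tempfile

def decode_base64(base64_str):
    # Same decode logic as the notebook UDF
    return base64.b64decode(base64_str)

def write_file(data, filename, out_dir):
    # Local stand-in for write_file_to_blob: write raw bytes to a file
    path = os.path.join(out_dir, filename)
    with open(path, "wb") as f:
        f.write(data)
    return path

# Hypothetical rows mimicking collect() output (column names are assumptions)
rows = [{"Body": base64.b64encode(b"hello world").decode(), "FinalFileName": "a1.txt"}]

out_dir = tempfile.mkdtemp()
for row in rows:
    p = write_file(decode_base64(row["Body"]), row["FinalFileName"], out_dir)
```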

 

This all works fine.
Now I want to update this solution to write to Fabric OneLake: workspace/Files/<customFolder>.

What sort of credentials does OneLake expect to be passed? Any code example would be great.



1 ACCEPTED SOLUTION
VenDaFabricator
Frequent Visitor

I managed to get it functioning because the user account running the notebook had complete access to the /Files directory. Since this was a one-time task, I didn't proceed to integrate it with a Service Principal or Managed Identity.

 

The following code worked well for me:

 

import os

collected_data = df_with_decoded_data.collect()

base_dir = "/lakehouse/default/Files/attachments"  # File API path to the default lakehouse
# Ensure the base directory exists
os.makedirs(base_dir, exist_ok=True)

# Write the decoded bytes to files
for item in collected_data:
    # Construct the filename using AttachmentId and the extension of FileName
    filename = item["AttachmentId"] + item["FileName"][-4:]
    # Full path for the file
    file_path = os.path.join(base_dir, filename)

    # Ensure the directory for the file exists (in case the filename includes subdirectories)
    os.makedirs(os.path.dirname(file_path), exist_ok=True)

    # Write the decoded content to the file
    with open(file_path, "wb") as file:
        file.write(item["DecodedData"])
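One caveat with the slicing above: `item["FileName"][-4:]` assumes a three-character extension such as `.pdf`, so a `.xlsx` or `.jpeg` attachment would end up with a truncated extension. A small alternative using `os.path.splitext`, which handles extensions of any length (the `build_filename` helper is hypothetical, named here for illustration):

```python
import os

def build_filename(attachment_id, original_name):
    # splitext splits off the final extension regardless of its length
    _, ext = os.path.splitext(original_name)
    return attachment_id + ext

# build_filename("42", "report.xlsx")  -> "42.xlsx"
# build_filename("7", "scan.pdf")      -> "7.pdf"
```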

 


4 REPLIES

Hi @VenDaFabricator ,

Glad to know you got a resolution for your query.
Please continue using the Fabric Community for your further queries.

v-gchenna-msft
Community Support

Hi @VenDaFabricator ,

Thanks for using Fabric Community.
As I understand, you were using Azure Synapse Analytics and would like to write data to OneLake from Azure Synapse notebooks. In order to achieve this, we need to use the Fabric REST API.

You can refer to this document for implementation details -
On-Premise Python Code to Load Data to Microsoft Fabric Lakehouse | by Amit Chandak | Medium

Additional References:
Access OneLake with Python - Microsoft Fabric | Microsoft Learn

We can also integrate Fabric OneLake with Azure Synapse Analytics - Integrate OneLake with Azure Synapse Analytics - Microsoft Fabric | Microsoft Learn
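To make the OneLake-from-outside-Fabric route concrete: OneLake exposes an ADLS Gen2-compatible endpoint, so the upload can be sketched with `azure-identity` plus `azure-storage-file-datalake`. This is a sketch under assumptions: the workspace, lakehouse, and folder names are placeholders, the signed-in identity must have access to the workspace, and the SDK imports are kept inside the function so the path helper works even without those packages installed.

```python
def onelake_file_path(lakehouse, folder, filename):
    # Path within a workspace's file system: <lakehouse>.Lakehouse/Files/<folder>/<filename>
    return "{0}.Lakehouse/Files/{1}/{2}".format(lakehouse, folder, filename)

def upload_to_onelake(workspace, lakehouse, folder, filename, data):
    # pip install azure-identity azure-storage-file-datalake
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    # OneLake's ADLS Gen2-compatible endpoint; the workspace acts as the file system
    service = DataLakeServiceClient(
        account_url="https://onelake.dfs.fabric.microsoft.com",
        credential=DefaultAzureCredential(),
    )
    fs = service.get_file_system_client(workspace)
    file_client = fs.get_file_client(onelake_file_path(lakehouse, folder, filename))
    file_client.upload_data(data, overwrite=True)
```

`DefaultAzureCredential` will pick up a Service Principal (via environment variables), a Managed Identity, or an interactive/CLI login, so the same code covers the credential options mentioned in this thread.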

Hope this is helpful. Please let me know in case of further queries.

Hi @VenDaFabricator ,

We haven't heard from you since the last response and were just checking back to see if you have a resolution yet.
If you have found a resolution, please share it with the community, as it can be helpful to others.
Otherwise, we will respond with more details and try to help.
