myon
Frequent Visitor

Iceberg, hudi for Fabric?

As a business user I love the seamlessness of Fabric, and I realize its potential. User experience is a big deal, and it seems really polished in Fabric. I started digging and asking our architects about our current mesh journey.

To my disappointment, it seems that Fabric is locked into the Delta Lake extension of Parquet. Is this going to be extended to support Iceberg/Hudi and be cloud-store agnostic?

This is the answer I got from our architect:

At least from the data mesh perspective, we are trying to be technology agnostic. Every year there will be something new and trend-setting. Data mesh needs to adopt or incorporate it and continue as an operating model that is not stuck on the tech.

1 ACCEPTED SOLUTION
cmaneu
Microsoft

The Delta Lake format also supports ACID transactions, and time travel is possible as well.

I haven't tested it myself, but I'm pretty sure you can read Iceberg/Parquet files stored in your S3 bucket through OneLake with a Fabric notebook. What you won't be able to do is mount them as a table in the Fabric world. But you can imagine having a bronze layer stored in S3 as Iceberg/Parquet, and your upper layers stored in Fabric/OneLake as Delta.
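
To make the bronze/upper-layer idea a bit more concrete, here is a minimal PySpark sketch, untested like the suggestion itself. It assumes a Fabric notebook with a default Lakehouse attached and a shortcut created under the Files section; the shortcut name `s3_bronze`, the folder `events/data`, and the table name `events_silver` are all hypothetical. Keep in mind that reading the Parquet data files of an Iceberg table directly bypasses the Iceberg metadata, so files that belong to old snapshots or to deleted data may be picked up as well.

```python
def shortcut_files_path(shortcut_name: str, subfolder: str = "") -> str:
    """Build a path relative to the default Lakehouse's Files section."""
    parts = ["Files", shortcut_name.strip("/")]
    if subfolder.strip("/"):
        parts.append(subfolder.strip("/"))
    return "/".join(parts)


def copy_bronze_to_delta(spark, shortcut_name: str, subfolder: str, table_name: str) -> None:
    """Read raw Parquet through the shortcut and save it as a Delta table."""
    source = shortcut_files_path(shortcut_name, subfolder)
    # Plain Parquet read: no Iceberg metadata is consulted here.
    df = spark.read.parquet(source)
    # Upper layer stored as Delta in the Lakehouse.
    df.write.format("delta").mode("overwrite").saveAsTable(table_name)


# In a notebook session where `spark` is already defined (names are placeholders):
# copy_bronze_to_delta(spark, "s3_bronze", "events/data", "events_silver")
```

The path helper is kept separate so the shortcut location can be checked without a Spark session.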

 

About your question regarding shortcuts: you can make a shortcut to an S3 bucket either at the root of the bucket or to a specific folder. The only limitation, if you don't have Delta/Parquet files in your S3 bucket, is that you won't be able to create that shortcut in the Tables folder of your Fabric Lakehouse.
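
As a rough illustration of that rule, the sketch below guesses the table format of a folder from its object keys and picks the Lakehouse section a shortcut could go in. It relies on the usual on-disk layouts (a Delta table keeps its transaction log under `_delta_log/`, an Iceberg table keeps `*.metadata.json` files under `metadata/`). The function names and the sample keys are made up for this example, and the check is a heuristic, not something Fabric itself exposes.

```python
from typing import Iterable


def detect_table_format(object_keys: Iterable[str]) -> str:
    """Guess the table format from object keys relative to the folder root."""
    keys = [k.lstrip("/") for k in object_keys]
    if any(k.startswith("_delta_log/") for k in keys):
        return "delta"
    if any(k.startswith("metadata/") and k.endswith(".metadata.json") for k in keys):
        return "iceberg"
    if any(k.endswith(".parquet") for k in keys):
        return "parquet"
    return "unknown"


def shortcut_section(table_format: str) -> str:
    """Only Delta folders can be shortcut under Tables; everything else goes under Files."""
    return "Tables" if table_format == "delta" else "Files"


# Hypothetical listing of an Iceberg table folder in S3.
example = ["metadata/v1.metadata.json", "data/part-0000.parquet"]
fmt = detect_table_format(example)
print(fmt, "->", shortcut_section(fmt))  # prints "iceberg -> Files"
```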

 

Here is a poster about OneLake shortcuts (from https://aka.ms/fabric-notes)

cmaneu_0-1685353994023.png

 


4 REPLIES 4
myon
Frequent Visitor

Thanks, this is good information.

cmaneu
Microsoft

Hello @myon,

Thanks for starting the discussion.
First, when you say "locked", we need to clarify a few things:
- The Parquet format is open, and the Delta format is open too (see the Delta Lake home page).

- You can read and write other file formats within Fabric. But yes, all the storage for Fabric engines is written in the Delta/Parquet format.
- You can integrate S3 buckets, and soon Google Cloud Storage, into OneLake (Fabric storage) through shortcuts.
- You can implement a data mesh architecture with Fabric, based on an open lake (accessible APIs and open file formats).

- From what I understand from Iceberg

That being said, I would like to understand more about your interest in the Iceberg format, and why you would choose it over the Parquet format (both are open formats from Apache, and they are not tailored for the same usage). Also, you can submit that ask to the Fabric Ideas section of this site and get people to vote on your idea :).

myon
Frequent Visitor

Iceberg is a direct competitor to Delta Lake, in my understanding. They both sit on top of Parquet files (or, in the case of Iceberg, other columnar store files such as ORC) and provide ACID transactions, time travel, and so on. I see the potential in Fabric, but I am wondering whether it is compatible with our S3 + Iceberg + Parquet setup.

 

A genuine question I have right now is: does OneLake need only the Parquet files in S3 to make shortcuts, or does it explicitly need Delta Lake to be present in the S3 instance?
