aws s3api --endpoint http://localhost:9878 list-objects --bucket icebergdata
In this instance, the command lists the DIM_Geographies table that was copied to the icebergdata bucket in Ozone's default S3 volume. Both the data and metadata directories for this table are accessible, which is essential for our Fabric OneLake Shortcut Table.
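The check described above can be scripted. The following is a minimal sketch, not part of the original walkthrough: `has_iceberg_layout` is a hypothetical helper that reads object keys from standard input (one per line) and succeeds only if the given table prefix contains keys under both `data/` and `metadata/`. The prefix `DIM_Geographies` is an assumption; the real key prefix depends on where the table was copied inside the bucket.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the AWS CLI or Ozone).
# Reads object keys on stdin, one per line, and returns 0 only when the
# table prefix given as $1 has at least one key under data/ and one
# under metadata/.
has_iceberg_layout() {
  local prefix="${1%/}"          # table directory, trailing slash removed
  local has_data=0 has_meta=0 key
  # The "|| [ -n ... ]" part also handles a last line without a newline.
  while IFS= read -r key || [ -n "$key" ]; do
    case "$key" in
      "$prefix"/data/*)     has_data=1 ;;
      "$prefix"/metadata/*) has_meta=1 ;;
    esac
  done
  [ "$has_data" -eq 1 ] && [ "$has_meta" -eq 1 ]
}
```

To feed it the listing from the command above, one option is to ask the AWS CLI for keys only with `--query 'Contents[].Key' --output text`, split the tab-separated result with `tr '\t' '\n'`, and pipe that into `has_iceberg_layout DIM_Geographies`. A non-zero exit status suggests one of the two subdirectories is missing under that prefix.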
In this example, we navigate to the virtual machine where the gateway was installed and configured. You can see my Fabric-Gateway-Ozone gateway is online and ready to communicate with my Microsoft Fabric environment.
If we navigate back to Microsoft Fabric and explore the OneLake Catalog, you will see my Cloudera-OnPrem-Data Workspace and OzoneToFabricLH Lakehouse.
Next, we will select the OzoneToFabricLH Lakehouse to open it. Then click the '…' next to Tables on the left side of the screen to create a new shortcut to Apache Ozone. This will enable the virtualization of our DIM_Geographies Iceberg table without any data duplication.
Next, we will create a new connection to the On-Premises Data Gateway using the Apache Ozone credentials we created earlier: AWS access key ID and AWS secret key.
In this example, Fabric automatically detects the On-Premises Data Gateway. The user needs to provide the URL for the Ozone s3api endpoint in the Cloudera environment, the AWS access key ID, and the AWS secret key, then select Next.
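Before entering these three values in Fabric, it can help to confirm them from the machine that hosts the gateway. The sketch below is an illustration under stated assumptions, not a step from the original walkthrough: `is_http_url` and `check_ozone_s3` are hypothetical helper names, the AWS CLI is assumed to be installed on that machine, and the Ozone S3 Gateway is assumed to answer a bucket listing request. The credentials are read from the standard `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a pre-flight check of the connection values.

# Succeeds when $1 looks like an http:// or https:// URL.
is_http_url() {
  case "$1" in
    http://?*|https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Validates the inputs, then lists bucket names through the endpoint.
# Returns 2 on bad input; otherwise returns the AWS CLI exit status.
check_ozone_s3() {
  local endpoint="$1"
  if ! is_http_url "$endpoint"; then
    echo "endpoint must start with http:// or https://" >&2
    return 2
  fi
  if [ -z "${AWS_ACCESS_KEY_ID:-}" ] || [ -z "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY first" >&2
    return 2
  fi
  # Assumption: the gateway supports the S3 ListBuckets call.
  aws s3api --endpoint-url "$endpoint" list-buckets \
    --query 'Buckets[].Name' --output text
}
```

For the environment shown earlier, this would be called as `check_ozone_s3 http://localhost:9878` after exporting the two credential variables. If the icebergdata bucket appears in the output, the same endpoint URL and keys are likely to work in the Fabric connection dialog, provided the gateway machine can reach that address.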
In this example, the connection to Ozone has succeeded. Proceed by selecting your Iceberg table directory and clicking Next. Make sure that the data and metadata subdirectories are present, as these are required for Fabric to recognize and translate this path as an Iceberg table.
To rename the Shortcut, click on the pencil icon to make the desired changes. For this example, we will maintain all default settings and proceed by clicking Create.
Once the Shortcut has been created, a virtualized version of the table appears in the Lakehouse within Fabric. Note the link icon next to the table name, which indicates that it is a Shortcut rather than a natively managed table in Fabric.
Next, select the table name. Fabric will execute an initial read query on the table in Ozone. To reduce network communication, Fabric caches recently queried data. As the cache reaches its limit, it replaces older result sets with those from newer queries.
With our data now integrated into Fabric, we can utilize Power BI to visualize it and develop advanced dashboards using existing data from Azure or other cloud platforms.
Cloudera Ozone Doc: Introduction to Ozone
Cloudera Apache Iceberg: Apache Iceberg features
Configure Cloudera Ozone Filesystem: Working with Ozone File System (ofs)
Configure Cloudera Ozone S3 Gateway: Using Ozone S3 Gateway to work with storage elements
Create Ozone Credentials: Configure S3 credentials for working with Ozone
Migrate HDFS data to Ozone: Process of migrating the HDFS data to Ozone
Get Started with: Fabric Trial
Create Fabric Lakehouse: Bring your data to OneLake with Lakehouse
Download and Install On-premises Data Gateway