March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Hi Community,
I'm seeking suggestions on how to best organize workspaces and lakehouses in Microsoft Fabric for a Medallion Architecture data engineering workflow. I have multiple data sources, including SharePoint Lists, SQL Server, Parquet, AWS RDS (Oracle), and SAP.
Here are the three approaches I'm considering:
Approach 1:
- One workspace per Medallion layer (Raw, Silver, Gold), with separate lakehouses for each data source.
- For DEV, QA, and PROD environments, this results in 9 workspaces with multiple lakehouses (e.g., SAP, RDS, SharePoint).
Approach 2:
- One workspace per data source (e.g., SAP, RDS), with separate lakehouses for each medallion layer within each workspace.
- For DEV, QA, and PROD environments, this results in a total of 3 * (number of data sources) workspaces.
Approach 3:
- One workspace per Medallion layer, with a single lakehouse for all data sources.
- Data is organized using schemas within the lakehouse for different sources.
- Similar to Approach 1, this results in 9 workspaces but with only one lakehouse per workspace.
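To make the counts above concrete, here is a rough sketch that tallies workspaces and lakehouses for each approach. It assumes the three environments, three layers, and five data sources listed in this post. The `Footprint` class and `footprint` function are hypothetical helpers for illustration only and are not part of any Fabric API.

```python
from dataclasses import dataclass

# Values taken from the post above.
ENVIRONMENTS = ["DEV", "QA", "PROD"]
LAYERS = ["Raw", "Silver", "Gold"]
SOURCES = ["SharePoint Lists", "SQL Server", "Parquet", "AWS RDS (Oracle)", "SAP"]


@dataclass(frozen=True)
class Footprint:
    """Hypothetical summary of how many items an approach creates."""
    workspaces: int
    lakehouses: int
    source_schemas_per_lakehouse: int


def footprint(approach: int,
              n_env: int = len(ENVIRONMENTS),
              n_layers: int = len(LAYERS),
              n_sources: int = len(SOURCES)) -> Footprint:
    if approach == 1:
        # One workspace per layer per environment, one lakehouse per source.
        return Footprint(n_env * n_layers, n_env * n_layers * n_sources, 0)
    if approach == 2:
        # One workspace per source per environment, one lakehouse per layer.
        return Footprint(n_env * n_sources, n_env * n_sources * n_layers, 0)
    if approach == 3:
        # One workspace per layer per environment, one lakehouse each,
        # with one schema per source inside that lakehouse.
        return Footprint(n_env * n_layers, n_env * n_layers, n_sources)
    raise ValueError(f"unknown approach: {approach}")


for a in (1, 2, 3):
    print(a, footprint(a))
```

With five sources, Approaches 1 and 2 end up with the same number of lakehouses and differ only in how they are grouped into workspaces, while Approach 3 keeps the workspace count of Approach 1 with far fewer lakehouses.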
Which approach would be the most effective, or is there a better structure you would recommend?
Thanks!
Madhusudan
Among the three approaches, Approach 1 seems to offer the best balance between clarity, manageability, and scalability. It allows for clear separation of processing stages and easier management of permissions and access controls. However, if administrative overhead is a significant concern, Approach 3 could be a viable alternative.
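As a rough illustration of how Approach 1 and the Approach 3 fallback would look side by side, the sketch below generates a workspace-to-lakehouse mapping for each. The naming pattern (`DEV_Raw`, `lh_sap`, and so on) and both function names are assumptions made up for this example, not a Fabric convention, and the source labels are shortened versions of the sources from the question.

```python
ENVIRONMENTS = ["DEV", "QA", "PROD"]
LAYERS = ["Raw", "Silver", "Gold"]
# Shortened, illustrative labels for the sources named in the question.
SOURCES = ["sharepoint", "sqlserver", "parquet", "rds", "sap"]


def approach1_layout(envs, layers, sources):
    """Approach 1: workspace per layer, one lakehouse per data source."""
    return {
        f"{env}_{layer}": [f"lh_{src}" for src in sources]
        for env in envs
        for layer in layers
    }


def approach3_layout(envs, layers, sources):
    """Approach 3: workspace per layer, one lakehouse, one schema per source."""
    return {
        f"{env}_{layer}": {f"lh_{layer.lower()}": list(sources)}
        for env in envs
        for layer in layers
    }


a1 = approach1_layout(ENVIRONMENTS, LAYERS, SOURCES)
a3 = approach3_layout(ENVIRONMENTS, LAYERS, SOURCES)

print(a1["DEV_Raw"])   # lakehouses inside the DEV Raw workspace
print(a3["DEV_Raw"])   # single lakehouse mapped to its source schemas
```

Both layouts use the same nine workspaces, so permissions set at the workspace level per layer would apply the same way in either one. The difference is whether each source gets its own lakehouse or its own schema.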
In addition, I have some other suggestions.
First of all, I recommend one workspace per Medallion layer. Within each layer, however, the data can be structured differently; you do not have to organize every layer by data source.
I've borrowed some descriptions from the documentation and blog posts below. I hope you find them helpful:
Exploring the Medallion Architecture in Microsoft Fabric | by Mariusz Kujawski | Medium
Describe medallion architecture - Training | Microsoft Learn
What is the medallion lakehouse architecture? - Azure Databricks | Microsoft Learn
Best Regards,
Jing
If this post helps, please Accept it as Solution to help other members find it. Appreciate your Kudos!
Thank you for your detailed answer, @v-jingzhan-msft. I thought I would have to organise data in all three layers in a similar fashion, i.e. by data source, but it seems that is not the case and I can adopt a different approach.
Kind Regards,
Madhusudan