I’m hoping this community can help. My organization is working on setting up our medallion architecture. We have created landing zones for our bronze and silver layers. The initial bronze lakehouse has all raw data, and the silver layer is broken out into several lakehouses, each based on a specific data source. No concerns from the team up to this point.
For our gold layer, we wanted to have the gold lakehouses set up in our workspace groups that tie to different areas of the organization. This is how we had things set up with Power BI. We would like each workspace to have a gold lakehouse containing the data that belongs to the group that owns the workspace. Again, no concerns there.
Our problem is with gold lakehouse(s) that house data that is universal or could belong to multiple departments. Do we have to choose a workspace? Do we set up some kind of landing zone? Is there another option? We realize we can use shortcuts, but where should that kind of data live? We don't want to add a fourth layer if we don't have to, but it almost seems like a necessity.
We created another Gold workspace to hold common data items for universally useful data - in our case often reference data. In terms of data ownership, it's owned more or less by the core data team who look after the platform rather than the individual Product Teams.
We have common Bronze and Silver workspaces as well, because our data is very segmented from a security standpoint, and the common data is often shortcutted into the Product Team-specific lakehouses.
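To make the placement rule concrete, here is a minimal Python sketch of the pattern described above. It is only an illustration: the workspace names (`Gold-Common`, `Gold-<department>`), the `Dataset` class, and `plan_placement` are all hypothetical and not part of Fabric. Data used by a single department lives in that department's gold workspace, while data used by several departments lives in the common gold workspace and is shortcutted into each consumer.

```python
from dataclasses import dataclass

# Hypothetical name for the shared gold workspace owned by the core data team.
COMMON_GOLD = "Gold-Common"


@dataclass(frozen=True)
class Dataset:
    name: str
    consumers: tuple[str, ...]  # departments that use this data


def plan_placement(ds: Dataset) -> dict:
    """Return the home workspace and the workspaces that need a shortcut."""
    if len(ds.consumers) == 1:
        # Single owner: the data lives in that department's gold workspace.
        return {"home": f"Gold-{ds.consumers[0]}", "shortcuts": []}
    # Shared (or not yet claimed) data: keep one copy in the common workspace
    # and expose it to each consuming department through a shortcut.
    return {
        "home": COMMON_GOLD,
        "shortcuts": [f"Gold-{dept}" for dept in ds.consumers],
    }
```

For example, a date dimension used by Finance and Sales would be planned into `Gold-Common` with shortcuts in `Gold-Finance` and `Gold-Sales`, while a Finance-only budget table stays in `Gold-Finance`.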
Hi @Anonymous
My suggestion is to have a central location to manage the data that is universal or shared across multiple departments. The setup of this central location can vary based on the complexity of your medallion architecture. Here are a few approaches you can consider:
Centralized Gold Lakehouse: You can create a centralized gold lakehouse that houses all the universal data. This lakehouse can be accessed by multiple departments, and you can use access controls to manage permissions. If the silver layer has only one workspace, you can set up the centralized gold lakehouse in the same silver layer workspace. This approach avoids the need for a fourth layer workspace.
Shared Workspace: Another option is to create a shared workspace specifically for the gold layer that contains universal data. This workspace can be managed by a central team, and shortcuts can be used to link data from this workspace to other departmental workspaces as needed. This approach works better when the silver layer spans multiple workspaces and the universal data has to be drawn from several of them. Managing universal data within a single workspace can make the architecture clearer and simplify permission management.
Using Shortcuts: As you mentioned, shortcuts can be a useful tool. You can create shortcuts in the gold lakehouses of different workspaces that point to the universal data stored in a central location. This way, each department can access the data without duplicating it.
Data Virtualization: Consider using data virtualization techniques to create a virtual layer that combines data from multiple sources without physically moving it. For example, you can create virtual views that represent the data from different sources. These views can be queried just like regular database tables.
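As a rough illustration of the shortcut approach, the sketch below builds the request for creating a OneLake shortcut in a departmental gold lakehouse that points at a table in the central lakehouse. It follows the request shape of the Fabric REST API "Create Shortcut" operation as I understand it, so please verify it against the current API reference before relying on it. The helper name `build_shortcut_request` and all IDs are placeholders, and authentication and the actual HTTP call are deliberately left out.

```python
# Base URL of the Fabric REST API (verify against the current documentation).
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_shortcut_request(
    dest_workspace_id: str,
    dest_lakehouse_id: str,
    source_workspace_id: str,
    source_lakehouse_id: str,
    table_name: str,
) -> tuple[str, dict]:
    """Build the URL and JSON body for a OneLake shortcut (hypothetical helper).

    The shortcut is created under Tables/ in the destination (departmental)
    lakehouse and points at the same table in the source (central) lakehouse.
    """
    url = (
        f"{FABRIC_API}/workspaces/{dest_workspace_id}"
        f"/items/{dest_lakehouse_id}/shortcuts"
    )
    body = {
        "path": "Tables",    # folder in the destination lakehouse
        "name": table_name,  # name the shortcut appears under
        "target": {
            "oneLake": {
                "workspaceId": source_workspace_id,
                "itemId": source_lakehouse_id,
                "path": f"Tables/{table_name}",
            }
        },
    }
    return url, body
```

The returned URL and body would then be sent as a POST with a bearer token using whatever HTTP client you prefer. Because only a shortcut is created, the data stays in the central lakehouse and is not copied.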
In summary, setting up a fourth layer for the universal data may be unavoidable. However, this fourth layer might not require a separate workspace and could exist within the workspaces of other layers in a different manner.
Best Regards,
Jing