Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi all,
I'm still fairly new to Microsoft Fabric and have been researching best practices for my use case, but haven’t found anything specific so far. I’d really appreciate hearing the community’s thoughts on this.
We currently have a highly unstructured and decentralized data estate spread across multiple systems. We're planning to use Fabric to implement a centralized, structured solution based on the medallion architecture, starting with our sales data. We've done most of the work with this data, but are stuck discussing what's the optimal way to join the different sources.
Context:
We have two systems that process our sales data, both involve large datasets with different schemas.
We’re now trying to determine the most efficient way to combine or union the two datasets in the silver layer, but merely duplicating all the data into another lakehouse sounds very ineficcient. Both datasets now follow the same curated schema in the silver layer, but we’re unsure of the best approach to merge them.
Any advice or best practices would be greatly appreciated.
Thanks in advance for your help
Solved! Go to Solution.
I would start by asking if you have to merge them? Instead of physically merging the two silver datasets into a new table in gold, define views in the gold lakehouse that logically union the two silver datasets > semantic model > power bi.
As a side note, I avoid Dataflow Gen 2s as much as possible; they eat up CUs. Notebook everything you can.
Please mark this post as a solution if it helps you. Appreciate Kudos.
Hi @Adriansm
Just checking in – did the steps I shared help resolve the issue?
✅ If it’s working now, feel free to mark the response as the Accepted Solution. This helps others who face the same issue find the fix faster.
✨ And of course, a little Kudos would be much appreciated!
If you're still running into trouble, let me know what you've tried so far and I’ll help you dig deeper. We’ll get it sorted!
Hi @Adriansm ,
we had a similar setup – multiple sources like SQL, SharePoint, SAP, etc. here’s what worked best for us when building a medallion architecture in Fabric:
1. use one workspace per medallion layer
2. inside each workspace, use one lakehouse per data source
3. combine sources in silver layer
4. gold layer = business-ready
bonus tip:
use schemas inside lakehouses to separate domains (e.g. sales, finance) if you want fewer lakehouses.
If my response resolved your query, kindly mark it as the Accepted Solution to assist others. Additionally, I would be grateful for a 'Kudos' if you found my response helpful.
Thank you, but I have a couple of questions about combining data. You're using "notebooks or pipelines", which one of the two and why? Do you join your data onto a new lakehouse in silver? and is that efficient?
Thank you for your reponses
You're absolutely right — the best practice in a Medallion Architecture is to:
This keeps your pipeline modular and makes debugging or reprocessing much easier. Also, if one source changes or fails, it won’t break the whole flow.
In the Gold layer, you can then build business-specific views or aggregates based on that unified Silver model.
Let me know if you want a sample structure or naming convention — happy to share!
Hi, I appreciate the AI-generated reply but unfortunately I don't think this answers the question. As I stated in my original message, we already have a bronze and silver layer set up, my question was more specifically aimed at best practices for merging/combining this data.
I'll continue exploring further and will update the post with what I find.
Thank you
I would start by asking if you have to merge them? Instead of physically merging the two silver datasets into a new table in gold, define views in the gold lakehouse that logically union the two silver datasets > semantic model > power bi.
As a side note, I avoid Dataflow Gen 2s as much as possible; they eat up CUs. Notebook everything you can.
Please mark this post as a solution if it helps you. Appreciate Kudos.
That’s a very good point. We don’t necessarily need to physically merge the datasets, but there are use cases where we need to feed or compare sales data from both sources, so having them accessible in one place is important.
If I understand correctly, you're suggesting we create a SQL view in the gold layer that uses UNION ALL
to combine the two silver tables into a single logical view. Is that right?
I suppose it all comes down to performance. Our datasets are fairly large, but from what I’ve been reading, views with UNION ALL can be quite efficient, so I’ll definitely give that a try
Thanks also for the advice regarding DFG2. A few others have flagged that as well, so we’re planning to rebuild most of it using notebooks before going live.
Hey, that’s exactly right , you got it!
Creating a logical view in the Gold layer using UNION ALL is a clean and scalable way to bring those Silver datasets together without physically merging them. It keeps things flexible and easier to maintain, especially when working with large volumes.
And yep, performance with views is usually solid — just keep an eye on indexing and partitioning if needed.
Glad to hear you're moving away from DFG2s too — notebooks give you way more control and efficiency.
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.
Check out the June 2025 Fabric update to learn about new features.
User | Count |
---|---|
36 | |
18 | |
7 | |
6 | |
3 |
User | Count |
---|---|
48 | |
44 | |
14 | |
8 | |
6 |