This time we’re going bigger than ever. Fabric, Power BI, SQL, AI and more. We're covering it all. You won't want to miss it.
Learn moreLevel up your Power BI skills this month - build one visual each week and tell better stories with data! Get started
Hello Everyone,
I’m looking for advice from experts who have worked with large-scale datasets in Microsoft Fabric + Power BI. We are entering Phase 2 of our reporting solution, and I want to ensure the right architecture is in place from the beginning.
Our data originally sat in Azure Synapse → Power BI refreshes used to take ~6 hours.
The data has now been migrated to Microsoft OneLake (Fabric) → refresh time has improved to ~1 hour.
In Phase 1, the approach taken was:
Import Mode in Power BI.
The entire raw table (billions of rows) was imported into Power BI.
Data modelling and transformations were handled inside Power BI instead of upstream in Fabric.
This worked for the proof of concept, but it’s clearly not scalable or efficient for long-term needs.
For the next phase, here are the key requirements we want to achieve:
Performance:
We are working with billions of rows.
Need refresh/query performance in minutes, not hours.
Separation of concerns:
Power BI should not do heavy lifting.
All transformations, joins, and aggregations must be handled in Fabric.
Power BI should only consume curated, clean, star-schema-ready data.
Technology stack:
We are fully on Microsoft Fabric (OneLake, Lakehouse, Warehouse, Dataflows Gen2).
Refresh strategy:
We’re open to Incremental Refresh or Full Refresh depending on best practice.
The goal is to minimize refresh times and ensure scalability.
Power BI connects in Direct Lake mode to Fabric tables.
Eliminates traditional refresh bottlenecks – near real-time with import-like performance.
Scales well with billions of rows.
Reduces maintenance overhead (no refresh partitions to manage).
Import curated fact + dimension tables into Power BI.
Use Incremental Refresh to only load new/changed data.
Suitable for periodic loads but still requires refresh management.
For billions of rows in Fabric, is Direct Lake Mode the clear best option, or should we still consider Incremental Refresh in Import Mode for stability?
What are the best practices for modelling in Fabric (Lakehouse/Warehouse) before exposing data to Power BI?
Example: Should we push all transformations into Fabric pipelines and only expose star schema tables?
Any recommended architecture patterns to balance:
Speed (minutes, not hours)
Scalability (billions of rows, additional columns)
Governance (no raw tables inside Power BI, maintain a clean semantic layer)
Dataset size: Billions of rows (raw table + additional columns in Phase 2).
Current Phase 1 report: Import Mode, raw tables loaded → not sustainable.
Tools available: Microsoft Fabric (OneLake, Lakehouse, Warehouse, Dataflows Gen2).
Goal: Curated, clean data in Power BI with minimal refresh times.
Any insights, architectural suggestions, or real-world experiences would be hugely valuable.
Thanks,
Arjun
Solved! Go to Solution.
Hi @maverickf17 ,
Direct Lake Mode Is the Game-Changer.
Your instincts about Direct Lake Mode are spot-on. It’s not just a theoretical improvement it’s a real-world performance enabler. In fact, your own accepted solution in the Microsoft Fabric Community outlines why Direct Lake is ideal for both dimension tables and massive fact tables:
This aligns with broader community consensus: Direct Lake offers import-like speed without the refresh overhead, making it the preferred choice for scale and simplicity.
Modelling Best Practices in Fabric
Fabric is built for heavy lifting. Experts strongly advise:
This separation of concerns ensures Power BI remains lean and focused on visualization not data prep.
You can even mix both depending on your use case. For example, use Lakehouse for raw ingestion and Warehouse for curated semantic layers. [Performanc...soft Learn]
Architecture Patterns That Work
Star Schema + Direct Lake
Partitioning & Pre-Aggregation
Use Fabric Pipelines + Dataflows Gen2
Performance & Scalability Tips
Governance & Maintainability
Final Recommendation,
For your Phase 2 goals speed, scalability, and governance Direct Lake Mode is the clear winner. It simplifies architecture, reduces refresh complexity, and scales effortlessly. But it only shines when paired with strong upstream modelling in Fabric.
If you’re still evaluating, try a hybrid approach.
Regards,
Akhil.
Hi @maverickf17 ,
I hope the response provided helped in resolving the issue. If you still have any questions, please let us know we are happy to address.
Thanks,
Akhil.
Hi @maverickf17 ,
Just following up again were you able to test the Direct Lake setup in your environment? If you ran into any roadblocks while implementing it, feel free to share the details here and the community can help troubleshoot step by step.
Thanks,
Akhil.
Hi @maverickf17 ,
Thanks for sharing your detailed solution. Your points around Direct Lake Mode and upstream modeling in Fabric are spot on. This approach clearly addresses performance, scalability, and maintainability concerns, especially for large fact tables and centralized dimension tables.
Just checking in did you get a chance to try this setup in your environment yet? If not, happy to guide you through a step-by-step implementation or clarify any part of the recommendation to make it easier to adopt.
Thanks,
Akhil.
Hi @maverickf17 ,
Direct Lake Mode Is the Game-Changer.
Your instincts about Direct Lake Mode are spot-on. It’s not just a theoretical improvement it’s a real-world performance enabler. In fact, your own accepted solution in the Microsoft Fabric Community outlines why Direct Lake is ideal for both dimension tables and massive fact tables:
This aligns with broader community consensus: Direct Lake offers import-like speed without the refresh overhead, making it the preferred choice for scale and simplicity.
Modelling Best Practices in Fabric
Fabric is built for heavy lifting. Experts strongly advise:
This separation of concerns ensures Power BI remains lean and focused on visualization not data prep.
You can even mix both depending on your use case. For example, use Lakehouse for raw ingestion and Warehouse for curated semantic layers. [Performanc...soft Learn]
Architecture Patterns That Work
Star Schema + Direct Lake
Partitioning & Pre-Aggregation
Use Fabric Pipelines + Dataflows Gen2
Performance & Scalability Tips
Governance & Maintainability
Final Recommendation,
For your Phase 2 goals speed, scalability, and governance Direct Lake Mode is the clear winner. It simplifies architecture, reduces refresh complexity, and scales effortlessly. But it only shines when paired with strong upstream modelling in Fabric.
If you’re still evaluating, try a hybrid approach.
Regards,
Akhil.
Check out the April 2026 Power BI update to learn about new features.
Sign up to receive a private message when registration opens and key events begin.
If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.
| User | Count |
|---|---|
| 30 | |
| 24 | |
| 23 | |
| 17 | |
| 15 |
| User | Count |
|---|---|
| 63 | |
| 36 | |
| 30 | |
| 23 | |
| 22 |