Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

A new Data Days event is coming soon! This time we’re going bigger than ever. Fabric, Power BI, SQL, AI and more. Don't miss out.

Reply
rbriga
Impactful Individual
Impactful Individual

What Would be Fabric's Best-Practice Storage for These Data Table Types?

Hi all,

 

We have tens of active datasets, and they query Snowflake for data (mostly import mode).

Some use the same dimension or fact tables repeatedly, and I want the datasets to extract them from a Fabric storage item rather than repeated Snowflake queries. 

 

However, the different Fabric data capabilities leave me uncertain- dataflow (which I currently use), lakehouse, direct lake, mirrored database, warehouse.

 

Scenario 1: Dimension tables

  • Calendar table, customers, products, etc.
  • Up to hundred of thousands of records
  • Simple queries

Scenario 2: Aggregated fact tables

  • Transactions, events, aggregated by month, customer, etc.
  • Heavy Snowflake queries running over hundreds of millions of records, but, after transformations,
  • Returns small, simple result tables of tens of thousands of rows

Scenario 3: Large, non-aggregated fact tables

  • Transactions, events, etc.
  • Heavy Snowflake queries pulling tens\ hundreds of millions of records with very little transformation.

My organization is on Fabric P1, so we have all options available.

I'll greatly value your input!

-------------------------
Data analyst by day, hockey goalie by night.
Did I help? Then please hit that "kudos" or "accept as a solution" button!
1 ACCEPTED SOLUTION

Hi @rbriga 

Great question. I've recently gone through a similar evaluation while optimizing data architecture in Fabric, and here’s what I found that might help you too.

When comparing Direct Lake and Import mode in Microsoft Fabric, it’s easy to assume Import is faster because it loads data into memory. But Fabric’s Direct Lake introduces a new approach that offers many of the same performance benefits plus added flexibility and reduced redundancy.

Scenario 1: Dimension Tables (e.g., Customers, Products, Calendar)

These are often reused across multiple datasets. With Import mode, you'd need to load the same dimension tables into each dataset separately. Which leads to, Redundant storage, Increased refresh complexity and Higher risk of refresh failures.

Why Direct Lake is better for dimensions, it will Centralized storage, Zero duplication, Reusable across models, Near-import-level performance and Fewer refresh issues.

Scenario 2: Large Fact Tables (hundreds of millions of rows)

Import mode can struggle here for long refresh times, High memory usage and Risk of refresh failures that can break the model.

Direct Lake avoids these issues by querying data directly from the Lakehouse or Warehouse using the Delta Parquet format. Power BI is optimized to scan this data efficiently using vectorized execution, so performance remains strong even at scale.

Why Direct Lake is better for large fact tables, It dont have No data movement, No refresh delays, Scales to billions of rows, Supports near real-time updates, Reduces memory pressure.

 Direct Lake gives you the best of both worlds:

  • The speed of Import
  • The flexibility of DirectQuery
  • Simplified data operations

If your data is already curated in a Lakehouse or Warehouse, Direct Lake can significantly reduce duplication, improve maintainability, and streamline your architecture especially for shared dimensions and massive fact tables.

--------------------------------------------------------------------------------------------------------------
If this solution works for you, please consider marking it as accepted so others facing a similar issue can benefit too.

Regards,
Akhil.


View solution in original post

3 REPLIES 3
v-agajavelly
Community Support
Community Support

Hi @rbriga 

The challenge where many Power BI datasets were repeatedly querying the same Snowflake tables as Calendar, Customer, and Transaction facts. To reduce cost and improve performance, we moved to Microsoft Fabric (P1) and implemented the following storage strategy.

Scenario 1: Shared Dimension Tables
Stored in Lakehouse (Delta format) or Warehouse, depending on team preference.
Connected to Power BI using Direct Lake for fast, no-import access.
Easily reusable across datasets.

Scenario 2: Aggregated Fact Tables
Heavy queries run once in Snowflake, results stored in Lakehouse or Warehouse as small gold-layer tables.
These are also connected to Power BI via Direct Lake, avoiding repeat processing.

Scenario 3: Large Raw Fact Tables
Used Snowflake mirroring into Fabric Warehouse.
Keeps large tables synced without reloading or ETL.
Again, used Direct Lake in Power BI for performance and scalability.

We used Dataflow Gen2 and Pipelines to automate updates and keep things clean. This setup significantly reduced Snowflake costs and made Power BI reports faster.

Hope this helps others designing a shared storage strategy in Fabric. If this solution works for you, please consider marking it as accepted so others facing a similar issue can benefit too.

Regards,
Akhil.

rbriga
Impactful Individual
Impactful Individual

Thanks for the reply.

Can you please elaborate on the choice of Direct Lake?

 

  1. Why would Direct Lake work better than import both for the dimensions (scenario 1) and huge fact tables (scenario 3)?
  2. Wouldn't it cause long loading times as the report queries Direct Lake over and over rather than wotk in-memory in import mode?
-------------------------
Data analyst by day, hockey goalie by night.
Did I help? Then please hit that "kudos" or "accept as a solution" button!

Hi @rbriga 

Great question. I've recently gone through a similar evaluation while optimizing data architecture in Fabric, and here’s what I found that might help you too.

When comparing Direct Lake and Import mode in Microsoft Fabric, it’s easy to assume Import is faster because it loads data into memory. But Fabric’s Direct Lake introduces a new approach that offers many of the same performance benefits plus added flexibility and reduced redundancy.

Scenario 1: Dimension Tables (e.g., Customers, Products, Calendar)

These are often reused across multiple datasets. With Import mode, you'd need to load the same dimension tables into each dataset separately. Which leads to, Redundant storage, Increased refresh complexity and Higher risk of refresh failures.

Why Direct Lake is better for dimensions, it will Centralized storage, Zero duplication, Reusable across models, Near-import-level performance and Fewer refresh issues.

Scenario 2: Large Fact Tables (hundreds of millions of rows)

Import mode can struggle here for long refresh times, High memory usage and Risk of refresh failures that can break the model.

Direct Lake avoids these issues by querying data directly from the Lakehouse or Warehouse using the Delta Parquet format. Power BI is optimized to scan this data efficiently using vectorized execution, so performance remains strong even at scale.

Why Direct Lake is better for large fact tables, It dont have No data movement, No refresh delays, Scales to billions of rows, Supports near real-time updates, Reduces memory pressure.

 Direct Lake gives you the best of both worlds:

  • The speed of Import
  • The flexibility of DirectQuery
  • Simplified data operations

If your data is already curated in a Lakehouse or Warehouse, Direct Lake can significantly reduce duplication, improve maintainability, and streamline your architecture especially for shared dimensions and massive fact tables.

--------------------------------------------------------------------------------------------------------------
If this solution works for you, please consider marking it as accepted so others facing a similar issue can benefit too.

Regards,
Akhil.


Helpful resources

Announcements
May Power BI Update Carousel

Power BI Monthly Update - May 2026

Check out the May 2026 Power BI update to learn about new features.

Fabric SQL PBI Data Days

Data Days 2026 coming soon!

Sign up to receive a private message when registration opens and key events begin.

New to Fabric survey Carousel

New to Fabric Survey

If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.

Power BI DataViz World Championships carousel

Power BI DataViz World Championships - June 2026

A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.