morz3d
Advocate I

Storage format for workspace with Fabric direct lake connection

Hi,
I have been testing Fabric for the last month and I really like the Direct Lake connection.
It made me wonder: if we use Direct Lake and no data is actually stored in the semantic model, how does that relate to the storage format chosen for the workspace?
At the workspace level I can choose between "Small semantic model storage format" and "Large semantic model storage format". If I am only going to use Direct Lake connections, are there benefits or constraints to choosing one over the other?

1 ACCEPTED SOLUTION
v-jialongy-msft
Community Support

Hi @morz3d 

 

The choice between "Small semantic model storage format" and "Large semantic model storage format" at the workspace level primarily affects how data is stored and managed within Fabric's semantic models themselves, rather than directly influencing the performance or capabilities of Direct Lake queries.

Small Semantic Model Storage Format

  • Optimized for Efficiency: Designed for scenarios where the semantic model is smaller or where storage efficiency is a priority. If you're primarily using direct lake connections and only occasionally leveraging semantic models for aggregated views or simplified access patterns, this format might offer cost savings.
  • Potential Limitations: While this format is efficient, it might not be as performant for complex queries or when working with larger datasets within the semantic model itself. However, these limitations are less relevant if direct lake queries are your primary data access method.

Large Semantic Model Storage Format

  • Optimized for Performance: Tailored for larger semantic models and complex query requirements. This format is beneficial if you anticipate the need to occasionally run complex analytics or AI/ML workloads directly on data within semantic models.
  • Increased Resource Utilization: Choosing this format can lead to higher storage and possibly computation costs due to the optimized performance characteristics. It's a trade-off between speed and cost.

Since a Direct Lake semantic model reads data directly from the Delta tables in OneLake rather than importing it, the choice of storage format doesn't directly affect the performance of these queries. Your decision should therefore be guided by how you plan to use semantic models alongside Direct Lake. If your strategy leans heavily toward Direct Lake queries with minimal reliance on import-mode semantic models, the "Small semantic model storage format" may be more cost-effective. Conversely, if you anticipate performing complex operations within semantic models for a subset of your analytics needs, the "Large semantic model storage format" may offer the necessary performance benefits.
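If you later need to inspect or switch a model's format programmatically rather than through the workspace setting, the Power BI REST API exposes a `targetStorageMode` property on datasets ("Abf" for small, "PremiumFiles" for large). A minimal sketch of building that PATCH request, assuming placeholder values for the dataset ID and access token (the request is constructed but not sent here):

```python
import json
import urllib.request

API = "https://api.powerbi.com/v1.0/myorg"

def build_storage_mode_request(dataset_id: str, token: str, large: bool) -> urllib.request.Request:
    """Build a PATCH request that switches a semantic model (dataset)
    between small ("Abf") and large ("PremiumFiles") storage format."""
    body = {"targetStorageMode": "PremiumFiles" if large else "Abf"}
    return urllib.request.Request(
        url=f"{API}/datasets/{dataset_id}",
        data=json.dumps(body).encode(),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )

# Placeholder IDs for illustration only; send with urllib.request.urlopen(req)
req = build_storage_mode_request("<dataset-id>", "<access-token>", large=True)
print(req.get_method(), req.full_url)
```

Note that converting a model from large back to small format is more restricted than the other direction, so it is worth checking the current Microsoft documentation before switching formats on a production model.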

Best Regards,

Jayleny

 

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

