I have read about the semantic model SKU limitations here.
I am concerned about the maximum number of rows per table in a lakehouse and warehouse, as well as the limitations on semantic models.
Is it correct that the Direct Lake rows-per-table limitation means that I can exceed the limit, but that it will then use DirectQuery? Is it the same for lakehouse and warehouse?
What is the Direct Lake mode size?
Are there any limitations on semantic model size?
What is the suggested method to list the number of rows and size per table in a lakehouse/warehouse?
I'd be happy if someone could clarify this for me or recommend any reading.
300M is the per-table row limit on an F8 SKU for which Direct Lake is supported. Any tables above 300M rows when running on an F8 capacity will fall back to DirectQuery.
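If you want to check this up front, a minimal PySpark sketch along these lines can flag the affected tables (assuming a Fabric notebook with a default lakehouse attached; 300_000_000 is the F8 limit mentioned above and should be replaced with your SKU's limit):

```python
# Sketch: flag lakehouse tables whose row count exceeds the Direct Lake
# row limit for the capacity SKU (300M rows on an F8, per the answer above).
DIRECT_LAKE_ROW_LIMIT = 300_000_000  # F8; replace with your SKU's limit

for t in spark.catalog.listTables():           # tables in the default lakehouse
    rows = spark.table(t.name).count()         # full scan; may be slow on large tables
    status = "will fall back to DirectQuery" if rows > DIRECT_LAKE_ROW_LIMIT else "Direct Lake OK"
    print(f"{t.name}: {rows:,} rows -> {status}")
```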
We ran some tests and compared a table of 300M rows with one of 300M+ rows.
Direct Lake mode could not be used for the second one.
Some of the descriptions below are in Swedish, but you should be able to understand what the tables show.
300M (for F8 capacity in our case) is not a hard limit for storage, but for consumption it is a significant limitation.
What's the error message? 300M is small, even for an F8.
There is no error message, but you can't force Direct Lake mode in a semantic model, for example.
I guess this is related to the fact that Direct Lake loads the entire column into the Power BI Vertipaq cache.
Even if the Power BI end user only queries a subset of the rows, there is no WHERE-filter pushdown to the Delta Tables, only column pruning. So the entire column (all rows) will be loaded into memory.
I guess the row limit is a hard limit in the semantic model memory (Direct Lake Vertipaq cache), with the size of the row limit decided by the SKU size.
Filters in the DAX queries are pushed down to the in-memory vertipaq cache, but filters are not further pushed down to the delta tables. Only column pruning is pushed down to the delta tables.
I guess also the delta table version timestamp is pushed down as part of the query to the delta tables. This is relevant if automatic refresh (automatic reframing) is disabled. In this case, the vertipaq cache will load the version from the delta tables as determined by the time of the last reframing operation of the semantic model.
It would be cool if it was possible to do time travel in a direct lake semantic model. I.e. let the owner manually pick which versions of the delta tables to use in the direct lake semantic model.
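Delta tables themselves do support time travel from a notebook, so you can at least inspect which versions a (re)framing operation could pick up, even though the semantic model is always framed to a single version per table. A rough sketch, where the table name "sales" is just an example:

```python
# Sketch: inspect the Delta table versions that a (re)framing operation could pick up.
# "sales" is an example table name in the attached lakehouse.
history = spark.sql("DESCRIBE HISTORY sales")              # one row per table version
history.select("version", "timestamp", "operation").show(truncate=False)

# Delta time travel works in Spark queries, even though a Direct Lake semantic
# model itself is always framed to one version of each table:
old_snapshot = spark.sql("SELECT COUNT(*) AS row_count FROM sales VERSION AS OF 3")
old_snapshot.show()
```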
Nope, no filter predicate pushdown into the Delta tables via Direct Lake; it's all or nothing. The data is then filtered in the VertiPaq engine by whatever DAX/report filters are in place.
If I understand correctly, it shows that Direct Lake outperforms DirectQuery significantly.
So the row limit for Direct Lake needs to be taken into account when considering which SKU to buy.
Perhaps it's possible to trick the system by splitting the fact table into 2 tables and then using a measure to "append" them? 🤔😄
Yes, I can see different workarounds (splitting into different tables and combining them with common measures, for example), but no solution that is satisfying.
I did split fact tables 15 years ago before partitioning came around 🙂
You can use mssparkutils or PowerShell modules, for example:
Getting the size of OneLake data items or folders | Microsoft Fabric Blog | Microsoft Fabric
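A minimal sketch of that idea, assuming a Fabric notebook with a default lakehouse attached (the blog post above has the full approach):

```python
from notebookutils import mssparkutils  # available in Fabric notebooks

def folder_size_bytes(path: str) -> int:
    """Recursively sum the file sizes under a OneLake folder."""
    total = 0
    for item in mssparkutils.fs.ls(path):   # items expose name, path, size, isDir
        total += folder_size_bytes(item.path) if item.isDir else item.size
    return total

# Example: approximate on-disk size of each table folder in the default lakehouse
for table in mssparkutils.fs.ls("Tables"):
    print(f"{table.name}: {folder_size_bytes(table.path) / 1024**3:.2f} GB")
```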
"Is it correct that direct lake rows per table limitation means that I can exceed the limit but that it then will use DirectQuery? Is it the same for lakehouse and warehouse?"
"Are there any limitations on semantic model size?"
Yes, see the answer from @AndyDDC
"what is the suggested method to list numer of rows and size per table in lakehouse/warehouse?"
"What is Direct Lake mode size?"
Many tools can count the rows in a table. The most common approach is probably to use a notebook or the SQL endpoint to do this.
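For example, in a PySpark notebook with a default lakehouse attached, something like this lists rows and on-disk size per table (DESCRIBE DETAIL reports sizeInBytes for Delta tables):

```python
# Sketch: row count and on-disk size for every table in the default lakehouse.
for t in spark.catalog.listTables():
    rows = spark.table(t.name).count()                         # full scan
    detail = spark.sql(f"DESCRIBE DETAIL {t.name}").collect()[0]
    print(f"{t.name}: {rows:,} rows, {detail['sizeInBytes'] / 1024**2:.1f} MB on disk")
```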
Regarding how to calculate model size, check out the "Memory footprint" section of this blog post:
Also check out this blog post; I think you can calculate the Direct Lake semantic model size in a notebook:
https://fabric.guru/calculating-and-reducing-power-bi-dataset-size-using-semantic-link
I think the size of a Direct Lake semantic model depends on how many columns are cached in Power BI memory, i.e. which columns are being used ("warm": loaded into the VertiPaq engine's memory in the Power BI service/Fabric).
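If you want to estimate this from a notebook, here is a rough Semantic Link sketch. It assumes the DAX INFO.STORAGETABLECOLUMNSEGMENTS function is available for the model and that summing its USED_SIZE column approximates the size of the columns that are currently warm; the dataset name and column handling are illustrative, so treat the blog posts above as the authoritative approach:

```python
import sempy.fabric as fabric

# Rough sketch: approximate the warm (in-memory) size of a Direct Lake semantic
# model by summing column-segment sizes. The dataset name is an example.
dax = "EVALUATE INFO.STORAGETABLECOLUMNSEGMENTS()"   # one row per column segment
segments = fabric.evaluate_dax("My Direct Lake model", dax)

# Column names may come back bracketed (e.g. "[USED_SIZE]"), so match loosely.
used_size_col = next(c for c in segments.columns if "USED_SIZE" in c)
print(f"Approx. warm model size: {segments[used_size_col].sum() / 1024**2:.1f} MB")
```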
Hi @joakimfenno ,
Did the above suggestions help with your scenario? If so, you can consider giving Kudos or accepting the helpful suggestions as a solution to help others who face similar requirements.
Regards,
Xiaoxin Sheng
Hi @joakimfenno, the Direct Lake documentation has the row limits and model sizes per Fabric SKU (about halfway down the document, in a table).
E.g. an F64 allows 1.5 billion rows in a table while staying in Direct Lake; exceeding this row limit will cause a fallback to DirectQuery.
Table limits will be the same for the lakehouse SQL endpoint and the warehouse.
There are no row limits anywhere. Fabric is all consumption-based: the more you use, the more you pay.