Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now

Possibility to configure table partitions in mirrored (SQL) databases

We mirrored one of our Azure SQL databases to Fabric using the Mirrored Azure SQL database item. One of the table has about 500M records. In another workspace than the mirrored database, we have a Lakehouse in which we created shortcuts to the mirrored tables.

We query the large table using pyspark with a date filter, to get all records of one or more dates into a dataframe.

What we are noticing is that when filtering the table, a table scan occures over all 500M records. Although this takes a couple of minutes, we would like to avoid this table scan. Partitioning mirrored the table should help prevent this and should also decrease the amount of capacity CU's consumed to do this task.

 

I would love to have the possibility to configure partitions on a mirrored table. This should be possible in the initial phase when creating the mirrored database and also afterwards. When initializing the partitioned table, the partitions are immediately applied. When applying a partition afterwards, when a sync is already active, the table would probably have to resync into the new partition.

Status: Needs Votes
Comments
ClementBR
Regular Visitor
We would like to have this too. We operate a multi tenant app. Almost all tables have a tenant_id column and queries typically fetch data from a single (or a few) tenant at a time. Partitioning would drastically improve perf by avoiding full data scans.
SimplyWilson
Microsoft Employee
Status changed to: Needs Votes
 
ArmandoLacerda
Most Valuable Professional
This is a great performance enhancement idea. Voted. For another idea to unblock large source tables based on CCI, here is another idea to vote on: https://community.fabric.microsoft.com/t5/Fabric-Ideas/Support-Clustered-Columnstore-Index-CCI-table...