Don't miss your chance to take the Fabric Data Engineer (DP-600) exam for FREE! Find out how by attending the DP-600 session on April 23rd (pacific time), live or on-demand.
Learn moreNext up in the FabCon + SQLCon recap series: The roadmap for Microsoft SQL and Maximizing Developer experiences in Fabric. All sessions are available on-demand after the live show. Register now
I am trying to move data from a partitioned delta table (Partitioned on Date and Hour columns) from a fabric lakehouse to an SQL Databases.
The aim of the activity is to check if querying the sql database from power bi is better when connecting to the sql database compared to the sql endpoint in lakehouse.
The size of data that i am going to test on is about 60 billion rows.
I am leaning on using Fabric Pipelines to complete the activity, but i am not sure if a simple copy activity would preserve the partitinoning or whether i should create the partitioning in the sql DB (although not sure how to do partitioning on two columns) prior to copy activity.
Any suggestion is highly appreciated.
Thanks
Hi @mkj1213 - For something like that I'd recommend creating the table as a clustered columnstore and loading it in order of your partitioning keys.
There are two schools of thought on how to load the data: 1) Load it in order and slow, forcing the pipeline to be single threaded. 2) Load the data as quickly as possible by using max settings for Intelligent throughput optimization and Degree of Parallelism on the settings tab of the pipeline.
I'm thinking let's get the data loaded fast and see how it performs. Once the data is in the database it is easy enough to rebuild it into an ordered columnstore index.
One thing to note is that tables with columnstore indexes are not currently mirrored to OneLake. For your use case, that may actually be a selling point. We are looking to mirror columnstore tables in the future but there is significant engineering effort involved.
Please let us know how your testing goes!
Dave
Is Direct Lake not an option for you?
Experience the highlights from FabCon & SQLCon, available live and on-demand starting April 14th.
If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.
Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.