
Srisakthi
Resolver II

Best practice for loading data into Snowflake

Hello Everyone,

 

Can you please suggest best practices for loading data from SQL Server/Oracle to Snowflake using a Microsoft Fabric data pipeline?

Case 1: Multiple tables at a time, with a huge volume of data in each table

Case 2: Multiple tables at a time, with a small volume of data in each table

 

What is the recommended approach for both performance and capacity unit (CU) consumption?

 

Regards,

Srisakthi

1 ACCEPTED SOLUTION
v-shamiliv
Community Support

Hi @Srisakthi 


Thank you for reaching out to the Microsoft Fabric Community Forum.

  • For high-volume data loads: focus on parallel loading with bulk-loading techniques (COPY INTO), use larger virtual warehouses for performance, and batch the processing to minimize compute and storage costs (see the sketch after this list).
  • For low-volume data loads: reduce compute overhead with smaller virtual warehouses, sequential loads, and incremental loading (Streams and Tasks). Also consider Snowpipe for real-time or near-real-time loading to optimize cost and compute resource usage.
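
To make both bullets concrete, here is a minimal Snowflake SQL sketch: a COPY INTO bulk load for the high-volume case and a Snowpipe pipe for the low-volume case. All object names (my_parquet_format, my_stage, TARGET_TABLE, load_wh, my_pipe) and the storage URL are placeholders, and it assumes the Fabric pipeline first lands the extracted tables as Parquet files in the staged location:

    -- Placeholder file format and external stage; credentials are omitted and
    -- would come from a SAS token or storage integration in practice.
    CREATE FILE FORMAT IF NOT EXISTS my_parquet_format TYPE = PARQUET;

    CREATE STAGE IF NOT EXISTS my_stage
      URL = 'azure://<storage-account>.blob.core.windows.net/landing/'
      FILE_FORMAT = my_parquet_format;

    -- Size the warehouse up for the bulk load, then back down to control cost.
    ALTER WAREHOUSE load_wh SET WAREHOUSE_SIZE = 'LARGE';

    -- COPY INTO parallelizes across files, so many moderate-sized Parquet
    -- files load faster than one very large file.
    COPY INTO TARGET_TABLE
      FROM @my_stage/orders/
      FILE_FORMAT = (FORMAT_NAME = 'my_parquet_format')
      MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;

    ALTER WAREHOUSE load_wh SET WAREHOUSE_SIZE = 'XSMALL';

    -- Low-volume case: a pipe loads new files as they arrive, instead of
    -- keeping a warehouse running for small batches.
    CREATE PIPE IF NOT EXISTS my_pipe AUTO_INGEST = TRUE AS
      COPY INTO TARGET_TABLE
      FROM @my_stage/orders/
      FILE_FORMAT = (FORMAT_NAME = 'my_parquet_format');

Snowflake's general guidance is to aim for staged files of roughly 100-250 MB compressed, so COPY INTO's per-file parallelism is actually exploited.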

    If this solution helps, please consider giving us Kudos and accepting it as the solution so that it may assist other members in the community.
    Thank you.

 

 


4 REPLIES
Ray_Minds
Responsive Resident

Hi @Srisakthi 

a) First, create a config table that stores the start and end of the date range for each table and indicates whether to run a full or an incremental load, based on your datasets (see the sketch after this list).

b) Land the data as Parquet with Snappy compression, which helps optimize performance; save the data as small chunk files.

c) If you are comfortable with PySpark, that is the better option; otherwise, use Copy activities in the pipelines.
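
To illustrate the config table from step a), here is one possible shape in Snowflake SQL. The table and column names are invented for the example; in practice a Fabric Lookup activity would read these rows and a ForEach activity would drive one copy per row:

    -- Hypothetical metadata table driving the pipeline; one row per source table.
    CREATE TABLE IF NOT EXISTS ETL_CONFIG (
        source_table     STRING,        -- table to extract from SQL Server/Oracle
        load_type        STRING,        -- 'FULL' or 'INCREMENTAL'
        watermark_column STRING,        -- column that bounds the incremental range
        range_start      TIMESTAMP_NTZ, -- start of the date range to load
        range_end        TIMESTAMP_NTZ  -- end of the date range to load
    );

    INSERT INTO ETL_CONFIG VALUES
        ('SALES.ORDERS',    'INCREMENTAL', 'MODIFIED_AT', '2025-01-01', '2025-02-01'),
        ('SALES.CUSTOMERS', 'FULL',        NULL,          NULL,         NULL);

After each successful run, the pipeline would advance range_start/range_end to the new watermark, so only new or changed rows are copied on the next run.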

 


If this post helps, please consider accepting it as the solution to help other members find it more quickly.

v-shamiliv
Community Support

Hi @Srisakthi 
We haven’t heard back since our last response and wanted to check if your query has been resolved. If not, please feel free to reach out for further assistance. If it has been resolved, kindly mark the helpful reply as the solution to make it easier for others to find. A kudos would also be greatly appreciated!
Thank you.

v-shamiliv
Community Support

Hi @Srisakthi 
Thank you for your patience. Please let us know if anything was helpful to you, so that we can convert it into a formal answer. If so, we would appreciate it if you could accept it as a solution and drop a 'Kudos' so other members can find it more easily.
Thank you.
