Best practice for loading data into Snowflake
Hello Everyone,
Can you please suggest best practices for loading data from SQL/Oracle into Snowflake using a Microsoft Fabric data pipeline?
Case 1: multiple tables at a time, with a huge volume of data in each table.
Case 2: multiple tables at a time, with a small volume of data in each table.
What is the recommended approach for both performance and CU consumption?
Regards,
Srisakthi
Solved! Go to Solution.
Hi @Srisakthi
Thank you for reaching out to the Microsoft Fabric Community Forum.
- For high-volume data loads: focus on parallel loading with bulk-loading techniques (COPY INTO), use larger compute warehouses for performance, and batch the processing to minimize compute and storage costs.
- For low-volume data loads: reduce compute overhead with smaller virtual warehouses, sequential loads, and incremental loading (Streams and Tasks). Also consider Snowpipe for real-time or near-real-time loading to optimize cost and compute resource usage.
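As a rough illustration of the split above, high-volume tables can be loaded in parallel while low-volume tables run sequentially. This is only a Python sketch: the table names, row counts, threshold, and `load_table` placeholder are all hypothetical, and in a real pipeline `load_table` would trigger a Copy activity or a Snowflake COPY INTO.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table list: (name, approximate row count)
TABLES = [("orders", 50_000_000), ("events", 80_000_000),
          ("country_codes", 250), ("currencies", 180)]

HIGH_VOLUME_THRESHOLD = 1_000_000  # illustrative cutoff; tune per workload

def load_table(name):
    # Placeholder for the real load (Copy activity, COPY INTO, etc.)
    return f"loaded {name}"

high = [t for t, rows in TABLES if rows >= HIGH_VOLUME_THRESHOLD]
low = [t for t, rows in TABLES if rows < HIGH_VOLUME_THRESHOLD]

# Case 1: big tables in parallel to use warehouse capacity efficiently
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(load_table, high))

# Case 2: small tables sequentially to keep compute overhead low
results += [load_table(t) for t in low]
print(results)
```

The thread-pool size here stands in for the degree of parallelism you would configure in the pipeline (e.g. the Copy activity's parallel copies), not a Snowflake setting.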
If this solution helps, please consider giving us Kudos and accepting it as the solution so that it may assist other members in the community.
Thank you.
Hi @Srisakthi
a) First, create a config table that records the start and end of the date range for each dataset, and use it to choose between a full and an incremental load.
b) Store the data as Parquet with Snappy compression, which helps optimize load performance; save the data as small chunk files.
c) If you are comfortable with PySpark, that is the better option; otherwise, use Copy activities in the pipelines.
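The config-table idea in steps a) and b) can be sketched in plain Python. Everything here is hypothetical (the dataset names, the config layout, and the chunking helper); a real implementation would read the config from a table and write actual Parquet files, e.g. with PySpark or pyarrow.

```python
from datetime import date

# Hypothetical config table: per-dataset load mode and date range
CONFIG = {
    "orders":    {"mode": "incremental",
                  "start": date(2024, 6, 1), "end": date(2024, 6, 30)},
    "customers": {"mode": "full", "start": None, "end": None},
}

def rows_to_load(dataset, rows):
    """Select rows per the config: a full load takes everything,
    an incremental load keeps only rows inside the configured range."""
    cfg = CONFIG[dataset]
    if cfg["mode"] == "full":
        return rows
    return [r for r in rows if cfg["start"] <= r["event_date"] <= cfg["end"]]

def chunk_file_names(dataset, rows, chunk_size=2):
    """Plan small Parquet chunk files (Snappy is the default Parquet
    codec in most writers); here we only compute the target names."""
    n_chunks = -(-len(rows) // chunk_size)  # ceiling division
    return [f"{dataset}/part-{i:04d}.snappy.parquet"
            for i in range(n_chunks)]

rows = [{"event_date": date(2024, 5, 20)}, {"event_date": date(2024, 6, 5)},
        {"event_date": date(2024, 6, 15)}, {"event_date": date(2024, 6, 25)}]
selected = rows_to_load("orders", rows)  # only the June rows survive
print(chunk_file_names("orders", selected))
```

Keeping the date ranges in a config table means the same pipeline can switch a dataset between full and incremental loads without code changes.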
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Hi @Srisakthi
We haven’t heard back since our last response and wanted to check if your query has been resolved. If not, please feel free to reach out for further assistance. If it has been resolved, kindly mark the helpful reply as the solution to make it easier for others to find. A kudos would also be greatly appreciated!
Thank you.
Hi @Srisakthi
Thank you for your patience. Please let us know if anything was helpful to you, so that we can convert it into a formal answer. If so, we would appreciate it if you could Accept it as a solution and drop a 'Kudos' so other members can find it more easily.
Thank you.
