Advance your Data & AI career with 50 days of live learning, dataviz contests, hands-on challenges, study groups & certifications and more!
Get registeredGet Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Learn more
Hello - I'm running a copy task to pull in the last year of data from one table that exists in a DB2 on premise database. The connection is on ODBC. It's been processing now for about 6 hours without completing. Is this normal? Are there things I can do to enhance this performace? There are a lot of records, but this data in SQL is much faster. We have just started using pipelines to get data into our Lakehouse and are following the Medallion model. Are there things I can do to enhance this performace? Thanks!
Any ideas from anyone? The settings I noted didn't work. This is new to me, so any advice is helpful. Thanks!
Hello @nilendraFabric
Thanks for providing that information and the link to the document. It was very helpful. I tried what I thought I needed yesterday, but it took even longer to run and I had to finally cancel the process. Today, I'm trying again with these settings in the copy task of my pipeline. On the Destination tab, I have checked to Enable partitions and added a Partition column (Serial) which is the key field in this table. On the Settings tab, I checked Use custom value for Intelligent throughput optimization and entered 12 for the Degree of copy parallelism. Is this correct?
Using “Serial” (unique key) as partition column creates excessive small partitions. This leads to:
• High partition management overhead
• Wasted I/O from opening/closing numerous files
• Poor parallelization effectiveness
-- Use temporal partitioning instead of serial key
SELECT * FROM source_table
WHERE business_date BETWEEN '2024-03-01' AND '2025-03-04'
You are correct and that process didn't work. I'm been using the Select * with a date from the beginning to get the last year of data. Everyday I try something different in hopes something cuts down the process time. Today I'm trying the ITO setting at Maximum and Degree of copy parallelism at Auto. So far it's been processing for running for 6h 2m. I wish I knew to majic settings!
Hello @alondon
Use dynamic range partitioning with degree of parallelism adjusted to your DB2 capacity (start with 8-16 parallel copies).
https://learn.microsoft.com/en-us/fabric/data-factory/copy-performance-sql-databases
Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!
Check out the October 2025 Fabric update to learn about new features.