I am looking for any suggestions to shorten the import time of one large table, given that in the near term, I'll be using a single node gateway and importing a binary file is not an option.
The table has about 53 million rows and 97 columns of various data types (datetime, decimal, varchar, boolean, smallint, tinyint, int). The destination is a lakehouse. Currently, I am using 5 degrees of copy parallelism, the Balanced intelligent throughput optimization setting, and the dynamic range partition option (the primary key is the partition column, with specified lower and upper bounds).
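For context on what the dynamic range partition option is doing, here is a rough sketch of the idea: the key range between the specified lower and upper bounds is split into roughly equal chunks, one per parallel copy. The actual split logic is internal to the Fabric copy activity, so this is only an illustration, not the real implementation:

```python
# Hypothetical sketch of dynamic range partitioning: split a
# primary-key range [lower, upper] into N roughly equal chunks so
# N parallel copies can each read one slice. The real split logic
# is internal to the copy activity; this only illustrates the idea.

def partition_ranges(lower, upper, parallelism):
    """Return (start, end) key ranges covering [lower, upper]."""
    span = upper - lower + 1
    base, extra = divmod(span, parallelism)
    ranges, start = [], lower
    for i in range(parallelism):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges

# Example: 53 million keys split across 5 degrees of parallelism.
for lo, hi in partition_ranges(1, 53_000_000, 5):
    print(f"WHERE pk BETWEEN {lo} AND {hi}")
```

With this mental model, gaps or heavy skew in the primary key values can leave some parallel copies doing most of the work, which is one reason a run can be slower than the degree of parallelism suggests.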
Are there settings, properties, etc. that I should change? Suggestions? Thanks in advance.
Hi @ebjim
If you are importing this into a semantic model, you could look at incremental refresh, creating partitions, reviewing data types, and including only the columns you need (those used for relationships, analysis, or measures). Also verify that auto date/time is turned off if you are not using it, as it can create a very large hidden date table for each datetime column you import.
Is this a historical load, an incremental load, or a truncate-and-load every time? If it is incremental, note that the Lakehouse Files sink for the Copy activity does not support a merge operation today. A Copy activity followed by a notebook or a Dataflow Gen2 is an option.
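Since the sink cannot merge, the follow-up notebook (or Dataflow Gen2) has to do the upsert itself. In a real Fabric notebook this would typically be a Delta Lake MERGE; the sketch below uses plain Python dicts purely to show the merge-by-key semantics (update rows whose key already exists, insert the rest):

```python
# Minimal upsert sketch: merge a staged increment into an existing
# table, matching on the primary key. Plain dicts stand in for the
# Delta table and the staged data; in a notebook this would be a
# Delta MERGE statement instead.

def upsert(target, increment, key="pk"):
    """Merge increment rows into target, matching on `key`."""
    by_key = {row[key]: row for row in target}
    for row in increment:
        by_key[row[key]] = row  # update if present, insert if new
    return sorted(by_key.values(), key=lambda r: r[key])

existing = [{"pk": 1, "val": "a"}, {"pk": 2, "val": "b"}]
staged   = [{"pk": 2, "val": "B"}, {"pk": 3, "val": "c"}]
print(upsert(existing, staged))
# [{'pk': 1, 'val': 'a'}, {'pk': 2, 'val': 'B'}, {'pk': 3, 'val': 'c'}]
```

The key design point is that the increment wins on conflict, so re-running the same increment is idempotent.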
Hope this helps. Please let me know if you have any further queries.
It's impossible to suggest anything without more information. For example: what are the gateway VM specs? What are the CPU, disk, and network utilization percentages? What is the table DDL? Is there a bottleneck on SQL Server?
We're using the latest Power BI Gateway version, but I don't have the resource stats available. There are no bottlenecks on the SQL Server as far as I know. I am more concerned with getting the parallelism and throughput optimization settings correct. As for the DDL, I can send it to you separately if you are open to that.
Hi @ebjim
Thanks for using Fabric Community.
At this time, we are reaching out to the internal team to get some help on this. We will update you once we hear back from them.
Thanks