Hey guys, a bit puzzled as to why this might be happening and was wondering if any of y'all could help.
I have a table, ~80 columns x ~2 million rows, with historical data. I am trying to run it via Dataflow Gen2 to a Lakehouse. Every other table I run completes perfectly fine (F64 capacity), but this one hits the 4-hour run time limit no matter what. I've tried scheduling during off-peak hours, using incremental refresh, and filtering on a date column to pull only a handful of months at a time - still to no avail.
What I find odd is that this table loads via Import in Power BI in ~5 minutes - the entire table! I'm not sure what's so different that Fabric cannot load it, but we operate on a shared business capacity, and every time I run something it affects the entire business's refreshes, so I am wary of testing this repeatedly. I've linked photos below showing the incremental refresh (1 month's worth of data) working and loading in a minute, while 4 months' worth of data does not complete even after 2 hours. Thanks y'all!
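For reference, the date filter I mentioned is roughly this shape (the table and column names below are placeholders, not my real schema):

    let
        // Placeholder names: "SourceTable" is the table navigated to through the
        // Databricks connector, "EventDate" stands in for my date column.
        FilteredRows = Table.SelectRows(
            SourceTable,
            each [EventDate] >= #date(2025, 1, 1) and [EventDate] < #date(2025, 5, 1)  // ~4 months
        )
    in
        FilteredRows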
Hi @pacman,
Thank you for posting your query in the Microsoft Fabric Community Forum.
Since Microsoft Support has already recommended Fast Copy for Dataflow Gen2, that is the right next step to validate. Fast Copy can improve throughput by routing ingestion through the same backend as the pipeline Copy activity instead of the standard mashup engine, which may help in scenarios where larger batches struggle to complete.
If Fast Copy does not resolve the issue for larger date ranges, the next action would be to continue working with Microsoft Support so they can review backend execution logs and advise on supported ingestion approaches for this workload.
Please feel free to update the thread with the outcome of the Fast Copy test.
Best regards,
Ganesh Singamshetty.
Hi @v-ssriganesh,
The Fast Copy attempt did not work - it failed almost instantly. It appears Fast Copy is not supported via the Databricks connector. I have been working hand in hand with Microsoft's team to try to get this resolved; currently it seems to be a bit bigger than initially suggested.
Hello @pacman,
Thank you for the update. If you have found a solution or workaround, please consider sharing it here, as it could help others facing the same issue.
Thank you for your time and contribution.
After much trial and error, a Microsoft employee and I were able to fix the issue. It is a bug in the 2.0 connector, which uses the ADBC driver. Switching to the 1.0 connector, which uses the ODBC driver, worked. Apparently this will be fixed in an upcoming update, but for now I have swapped all Databricks connections to 1.0 and they work fine.
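For anyone hitting the same thing: in our case the query itself didn't change, only the connection. A rough sketch of the source step (host and HTTP path are placeholders); the 1.0 vs. 2.0 choice was made on the Databricks connection rather than in the M code:

    let
        // Placeholder workspace values - not our real host or warehouse path.
        // This step looked the same before and after the fix; only the
        // connection it binds to was switched from the 2.0 (ADBC) connector
        // to the 1.0 (ODBC) connector.
        Source = Databricks.Catalogs(
            "adb-0000000000000000.0.azuredatabricks.net",
            "/sql/1.0/warehouses/xxxxxxxxxxxxxxxx",
            null
        )
    in
        Source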
Hello @pacman,
Thank you for the update. Please continue to utilize the Microsoft Fabric Community Forum for any further discussions or support.
Hi @pacman
There are a couple of reasons why Power BI Import can take significantly less time than Dataflow Gen2. The engines behind these two processes are different, and so are their write paths.
Power BI Import uses the VertiPaq engine to build the semantic model, whereas Dataflow Gen2 writes many smaller Parquet files and may additionally do schema drift checks, data type inference, null handling, and automatically applied transformations. This increases the metadata and transaction overhead and may slow down the copy.
Can you check a few things?
- Is Optimized Write enabled?
- Are you partitioning?
- Can you reduce the number of columns?
- Is the query folding? Use “View Native Query” in Power Query Online - see the sketch below.
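For example, a folding-friendly query shape would look something like this (all names are placeholders); if you open View Native Query on the last step and see a single SELECT with your column list and the date predicate, the pruning and filtering are being pushed down to Databricks instead of running in the mashup engine:

    let
        // Placeholder names - substitute your own table and columns.
        // Both steps fold for most relational-style sources, so the column
        // pruning and the WHERE clause run on the Databricks side.
        KeepColumns = Table.SelectColumns(SourceTable, {"Id", "EventDate", "Amount"}),
        KeepRecent  = Table.SelectRows(KeepColumns, each [EventDate] >= #date(2025, 1, 1))
    in
        KeepRecent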
Hope this helps - if it does, a Kudos or accepting this as the Solution would be appreciated!
Hi @deborshi_nag,
1. Optimized Write is enabled by default, I believe. For the tables I have loaded successfully (this one table being the exception), I checked their most recent JSON and VORDER is enabled.
2. I am partitioning on my date column. The main table contains about 2.5 years' worth of data; I am loading it in 6-month frames, starting initially with only a ~4-month timeframe, much smaller than the table's entirety.
3. All the columns are quite important, and although dropping some is an option, I would like to avoid it. Are you asking because, as one of my options, I could later rejoin the columns via a union in the Lakehouse?
4. It is folding to a single SQL query, which checks out.
Also, a bit more backstory on this table: it is pulled from Databricks (the table itself comes from an upstream silver environment). Not sure if that affects anything, but it is of note. In addition, could the shared capacity play any part in why this table won't load? I've had 1 month's worth of data load in under a minute, yet the 4 months won't load at all.
Are you saying this table is produced by Databricks? If so, is it an external table or a managed Databricks table?
Hi, this table is in our Databricks environment and is being pulled through the Dataflow via the Databricks connector.
Hi @pacman,
Are there any really large text or binary columns in your table?
I've seen dataflows (and pipelines) struggle with really large BLOB or text columns.
Hello @tayloramy,
Surprisingly, no large text or binary columns. It mostly consists of brief descriptive text (age, gender, etc.) and identifying keys. Microsoft support suggested using Fast Copy for Dataflow Gen2, so that is on my list of things to try when I run again overnight.