Hi guys,
I have a problem with incremental refresh when the data source is Azure Databricks. I've set incremental refresh to cover the last 3 days, but the reload takes nearly an hour to finish, even though Databricks has successfully returned all the data (and some of the queries return 0 rows). Has anyone faced this problem before? I'd really appreciate any suggestions!
I have the same issue. When there are multiple partitions and the table has a high-cardinality column, PBI processes all the data in the refresh and then spends a long time compressing that column across the partitions. I have seen the data load in an incremental refresh take 2 minutes and the compression then take 45 minutes. That means there is no way I can get my incremental refresh below 45 minutes. Very frustrating!
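If you want to check whether a high-cardinality column is likely involved, a rough sketch like the one below can help. It is plain Python over a small sample of rows exported from the source table; the function name, the column names, and the sample values are all made up for illustration, and it only counts distinct values. It does not reproduce how PBI actually compresses a column.

```python
from collections import defaultdict


def cardinality_ratio(rows):
    """Return {column: distinct_values / row_count} for a list of dict rows.

    A ratio close to 1.0 means almost every row has its own value in that
    column (for example an ID or a timestamp with seconds).
    """
    total = len(rows)
    if total == 0:
        return {}
    distinct = defaultdict(set)
    for row in rows:
        for column, value in row.items():
            distinct[column].add(value)
    return {column: len(values) / total for column, values in distinct.items()}


# Hypothetical sample rows, not data from this thread.
sample = [
    {"event_id": "a1", "country": "VN", "event_time": "2023-03-06 10:00:01"},
    {"event_id": "a2", "country": "VN", "event_time": "2023-03-06 10:00:02"},
    {"event_id": "a3", "country": "DE", "event_time": "2023-03-06 10:00:02"},
    {"event_id": "a4", "country": "VN", "event_time": "2023-03-06 10:00:05"},
]

# List the columns from highest to lowest ratio.
for column, ratio in sorted(cardinality_ratio(sample).items(), key=lambda kv: -kv[1]):
    print(f"{column}: {ratio:.2f}")
```

Columns that come out near 1.0 are the ones worth looking at first, for instance to see whether they are needed in the model at all.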
Wondering if you ever got to the bottom of this? I've recently had a similar issue with Databricks as the source.
A full refresh takes about 7 or 8 minutes, whilst an incremental refresh takes 17-20 minutes. When I look at the queries on the database, they all finish within a few minutes, but the Power BI job does not complete for some time. I thought it might be related to partition management within Power BI, but it seems way too long for that.
Hi @Bobby3467 ,
According to your description, here is my suggestion.
A long incremental refresh time may be related to query folding and to the complexity of the model. If possible, consider optimizing the model to reduce the amount of data. You can also refer to the following document, which may be helpful.
Troubleshoot incremental refresh and real-time data in Power BI - Power BI | Microsoft Learn
Alternatively, you can open "View" -> "Performance analyzer" to identify which visuals are affecting the performance of your reports and the reason for the impact.
Please refer to the document below.
What is Performance Analyzer in PowerBI? | by Susheel Aakulu | Medium
Best Regards
Community Support Team _ Polly
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
Thank you, Community Support Team,
I have checked your references, but they didn't help. There is something strange, though: when I reduced the stored data to about 2-3 months, the refresh finished very quickly. However, my users require at least 3 years of data for analysis. The problem remains that the incremental load for the latest 3 days completes (and takes only about 1-2 minutes), but in the PBI Service UI it takes another 40 minutes or more for the process to end. Anyway, I have contacted the PBI Service support team and they are looking into it.
Once again, I really appreciate your response to my problem.
Best Regards.
Hi,
I'm facing the same issue when trying to refresh a dataset connected to Databricks. The incremental refresh takes almost as long as a full refresh. I've tried both connectors, Databricks.Catalogs and Databricks.Query, but the results are the same. It seems like there is a performance issue with the Databricks connector, because the query history in Databricks shows that the queries ran fast.
Has anyone investigated the issue with MS support? Did you find any resolution?
Thank you!
Hi @Bobby3467 ,
Thank you very much for your reply. If you find the reason, I hope you can share it and mark your answer as the solution so that more people can find the answer.
Best Regards
Community Support Team _ Polly
To add more about my problem: I checked the refresh through the XMLA endpoint and saw that the incremental refresh itself ran very fast, but somehow it got stuck somewhere and did not finish, as shown in the attached picture:
The partitions for dates before 2023-03-05 were already processed and did not need to be refreshed again, since I've set the incremental refresh policy to store up to 3 years of data. From 2023-03-06 onward, the partitions refreshed very quickly, but the actual end time for that refresh in the PBI Service shows that it took up to 43 minutes. How is that possible?
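For anyone who wants to put a number on that gap, here is a minimal sketch that splits a refresh into the time spent loading partitions and the time left over afterwards. The function name and all timestamps are made up for illustration; in practice you would copy in the partition end times seen through the XMLA endpoint and the start and end times from the refresh history in the PBI Service.

```python
from datetime import datetime


def refresh_breakdown(refresh_start, refresh_end, partition_end_times):
    """Split a refresh into (load_minutes, tail_minutes).

    load_minutes: from refresh start until the last partition finished.
    tail_minutes: from the last partition finishing until the service
                  reported the refresh as complete.
    """
    last_partition_end = max(partition_end_times)
    load_minutes = (last_partition_end - refresh_start).total_seconds() / 60
    tail_minutes = (refresh_end - last_partition_end).total_seconds() / 60
    return load_minutes, tail_minutes


# Hypothetical timestamps, not taken from the screenshot in this thread.
start = datetime(2023, 3, 9, 8, 0, 0)
end = datetime(2023, 3, 9, 8, 43, 0)
partition_ends = [
    datetime(2023, 3, 9, 8, 1, 30),
    datetime(2023, 3, 9, 8, 2, 0),
]

load, tail = refresh_breakdown(start, end, partition_ends)
print(f"partition load: {load:.1f} min, time after last partition: {tail:.1f} min")
```

Tracking the second number across several refreshes, for example while comparing 2-3 months of stored data against 3 years, gives support something concrete to look at.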