Rfingerhut
Frequent Visitor

Dataflow intermittent long-running refresh. Not enough detail in logs to thoroughly troubleshoot.

We have been experiencing intermittent issues with one dataflow in the service that occasionally runs much longer than usual for no apparent reason.  This dataflow uses an IBM DB2 mainframe as its data source, and the query is "select * from table", which returns ~50k rows at any given time.  We had our database team run a trace on the query to make sure the delay wasn't on their side; they monitored it and confirmed that it runs sub-second and shouldn't be causing any delays.  We also confirmed that the end time of the Power BI refresh matched the start time of the query against the database, so there is a gap between when Power BI starts the refresh and when the database actually receives the query, but we're unsure what is happening during that time or why.
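To put a number on that gap for each run, the refresh start times from the refresh history can be lined up against the query start times from the database trace. The following is only a rough sketch: the function name and the timestamps are made up for illustration and are not taken from the actual refresh history or DB2 trace.

```python
from datetime import datetime

def startup_gaps(runs):
    """For each (refresh_start, db_query_start) pair, return the seconds
    that passed before the query actually reached the database."""
    gaps = []
    for refresh_start, db_query_start in runs:
        started = datetime.fromisoformat(refresh_start)
        reached_db = datetime.fromisoformat(db_query_start)
        gaps.append((refresh_start, (reached_db - started).total_seconds()))
    return gaps

# Hypothetical sample data: (refresh start in Power BI, query start in the DB trace)
sample_runs = [
    ("2024-02-01T06:00:00", "2024-02-01T06:00:40"),
    ("2024-02-01T07:00:00", "2024-02-01T07:18:05"),
]

for refresh_start, gap in startup_gaps(sample_runs):
    print(f"{refresh_start}: {gap:.0f}s before the query reached the database")
```

A table of these gaps over a few weeks would at least show whether the delay clusters around particular times of day.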

From the Power BI side, we have enabled extra logging on the gateways to try and see what might be causing the issue on our side, but there aren't enough details in the logs to give any insight.  A bit of research about concurrent dataflows in the service made us look into potential issues with too many dataflows refreshing at the same time, but we confirmed we were nowhere near the concurrent limit during the refresh times, or at any time for that matter.  
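For anyone wanting to repeat the concurrency check, one way is to take the start and end times of every refresh on the capacity and compute the peak number running at once. This is a minimal sketch that assumes the refresh history has been exported by hand into (start, end) pairs; the function name and sample times are hypothetical.

```python
from datetime import datetime

def peak_concurrency(intervals):
    """Return the largest number of refreshes running at the same moment."""
    events = []
    for start, end in intervals:
        events.append((datetime.fromisoformat(start), 1))   # a refresh starts
        events.append((datetime.fromisoformat(end), -1))    # a refresh ends
    # Sort by time; at identical times, ends (-1) sort before starts (+1),
    # so back-to-back refreshes are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak

# Hypothetical refresh windows for dataflows on one capacity
sample_refreshes = [
    ("2024-02-01T06:00:00", "2024-02-01T06:05:00"),
    ("2024-02-01T06:02:00", "2024-02-01T06:30:00"),
    ("2024-02-01T06:04:00", "2024-02-01T06:06:00"),
    ("2024-02-01T06:30:00", "2024-02-01T06:35:00"),
]

print(peak_concurrency(sample_refreshes))
```

The peak can then be compared against whatever concurrent limit applies to the capacity.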

 

What are we missing?  Is there a setting or additional logging we can look into to find the root cause and fix the intermittent long-running refreshes?

[Image: SC Capacity Dataflow Refresh Times.png]

6 REPLIES
cassidy
Impactful Individual

I'd bet it's related to other Dataflows (and Datasets) running. I know you mentioned checking that, but in my experience Dataflows and Datasets running in the same Workspace impact each other.  Given that 2 of your long runs happened at the same time, something must be competing.

 

However, and I've never got an explanation for this, I see far more success when running Dataflows/Datasets in their own Workspace if they are high priority...even though they are on the same Capacity.  It seems to prevent a sort of "blocking". May be worth giving a test.

For what it's worth, we have dedicated workspaces for dataflows depending on the subject matter. We don't have any datasets in the same workspace. We only pull the dataflows into our models as needed, and the models are then published to separate workspaces.

@cassidy the results are in...

Here is the refresh history for the exact same dataflow, set up as the only dataflow or object in a new workspace.
[Image: New test dataflow in its own workspace]

And here is the exact same refresh schedule for the existing dataflow, in a workspace with multiple other dataflows.
[Image: Existing dataflow]

cassidy
Impactful Individual

Yes, that is good management of Workspaces in my opinion, but I'd still run a copy of that Dataflow independently in its own Workspace for a few days and see what happens.  Scheduling it to run concurrently with the original could give an interesting result. If the test Dataflow refreshes quickly at times when the original lags, it may point to blocking within the original Workspace.
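Once both copies have run on the same schedule for a few days, the comparison can be as simple as pairing up runs by scheduled time and flagging the slots where the original took far longer than the isolated copy. This is just a sketch: the function name, the 3x threshold, and the durations are all assumptions for illustration, not real refresh data.

```python
def lagging_runs(original, test, ratio=3.0):
    """Return (slot, original_seconds, test_seconds) for every scheduled slot
    where the original took at least `ratio` times as long as the test copy."""
    flagged = []
    # Only compare slots that both dataflows actually ran in.
    for slot in sorted(original.keys() & test.keys()):
        if original[slot] >= ratio * test[slot]:
            flagged.append((slot, original[slot], test[slot]))
    return flagged

# Hypothetical refresh durations in seconds, keyed by scheduled start time
original_durations = {"06:00": 45, "07:00": 1100, "08:00": 50}
test_durations = {"06:00": 40, "07:00": 42, "08:00": 48}

for slot, orig_s, test_s in lagging_runs(original_durations, test_durations):
    print(f"{slot}: original {orig_s}s vs test {test_s}s")
```

If the flagged slots line up with other refreshes in the original Workspace, that would support the blocking theory; if both copies lag together, the cause is more likely upstream of the Workspace.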

You'd think it's either your Source prioritizing inbound queries or the BI Service prioritizing Dataflows, and I believe you've confirmed the first is not happening.

I will try that.  Thank you for the suggestion.

The primary complaint is that the Service and logs fail to provide sufficient information to easily determine the root cause.  If it's a prioritization issue, it would be nice to see that detailed out somewhere.  Running these tests may not lead to a definitive answer, which puts me right back at square one.  😑

cassidy
Impactful Individual

Oh I hear you. A lot of the BI Service is "magic optimization", which allows low-experience folks (me) to do great things, but at the same time will not reveal what's really happening (especially around refreshing).

 

For example, Dataset refreshes are also like a black box, except at least one helpful person figured out how to analyze the progress and display bottlenecks: https://dax.tips/2021/02/15/visualise-your-power-bi-refresh/
