
Dataflow refresh didn't load newest data

Hi there, 

This is the 10th time our users have reported that data was not refreshed. When we checked the status of all dataflows, every refresh had completed successfully as scheduled.

After debugging to find the missing item (using Item ID), we found that:

Dataflow A loaded correctly and has Item 1, for example.

Dataflow B, in another workspace, refers to Dataflow A, yet does not have Item 1.
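The Item ID comparison described above can be sketched as a set difference. This is only a minimal illustration, assuming the Item ID column of each dataflow has been exported to a plain list; the function name `missing_item_ids` and the sample IDs are hypothetical and do not come from the thread.

```python
def missing_item_ids(upstream_ids, downstream_ids):
    """Return the IDs present upstream but absent downstream, sorted."""
    return sorted(set(upstream_ids) - set(downstream_ids))


# Hypothetical exports of the Item ID column from each dataflow.
dataflow_a_ids = [1, 2, 3, 4]
dataflow_b_ids = [2, 3, 4]

missing = missing_item_ids(dataflow_a_ids, dataflow_b_ids)
if missing:
    print(f"Missing in Dataflow B: {missing}")
```

An empty result means every upstream ID reached the downstream dataflow; it says nothing about whether the other columns match.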

 

We temporarily fixed the issue by applying an arbitrary change in Dataflow A, saving and refreshing it, then undoing the change, since no modification is actually needed. After that, refreshing Dataflow B brings the data in (Item 1 is now present in Dataflow B).

 

Is there a way to detect the missing data? Or to identify the root cause so we can prevent it? Our users find this very inconvenient, and missing data is making them distrust the Power BI reports.

Status: Investigating

Hi @rdnguyen ,

 

Is your problem the same as "Data mismatch in Desktop / Service while connecting through dataflow"?

We have submitted an ICM for that issue and just need to make sure whether you have the same problem.

 

Best regards.
Community Support Team_ Caitlyn

Comments
v-xiaoyan-msft
Community Support
Status changed to: Investigating


rdnguyen
Helper V

@v-xiaoyan-msft No, this problem occurs entirely in the Service; it is not a Desktop / Service mismatch.

rdnguyen
Helper V

@v-xiaoyan-msft It just happened again, and once more it was our users who found the issue and reported it to us.

In this case, Dataflow B loads data from Dataflow A via a direct reference (same workspace, connected immediately in series). A loaded the market data for December; B did not pull that data in and was stuck at the end of November. It is now January: no dataflow in the series has reported a failure, yet no data was passed on, so the report for our Finance users is missing market detail.
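A lag like the one described here (upstream at December, downstream stuck at November) could in principle be caught by comparing the latest date on each side. The sketch below assumes a date column exported from each dataflow; the function name `months_behind` and the sample dates are hypothetical.

```python
from datetime import date


def months_behind(upstream_dates, downstream_dates):
    """Whole calendar months by which the downstream's latest date lags the upstream's."""
    up, down = max(upstream_dates), max(downstream_dates)
    return (up.year - down.year) * 12 + (up.month - down.month)


# Hypothetical sample dates: A has December data, B stops at the end of November.
dataflow_a_dates = [date(2022, 11, 30), date(2022, 12, 31)]
dataflow_b_dates = [date(2022, 11, 30)]

lag = months_behind(dataflow_a_dates, dataflow_b_dates)
if lag > 0:
    print(f"Dataflow B is {lag} month(s) behind Dataflow A")
```

Comparing only the maximum date is cheap, but it will not notice rows missing from the middle of a period.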

 

Users notice the issue only when they pull the report for routine work; otherwise, we have no way of knowing. Since this data is reference data, we have changed all table joins to left joins to limit the impact of intermittent failures like this one. An inner join would return a blank report for the month, which would help us detect the issue but would leave a bad impression on our Board of Directors.
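The trade-off between the two join types can be illustrated outside Power Query. This is a plain-Python sketch, not the actual dataflow logic; `join_reports`, the column names, and the sample values are all hypothetical.

```python
def join_reports(report_rows, market_by_month, how="left"):
    """Join report rows to market reference data on the 'month' key."""
    joined = []
    for row in report_rows:
        market = market_by_month.get(row["month"])
        if market is None and how == "inner":
            continue  # an inner join drops rows that have no match
        joined.append({**row, "market": market})
    return joined


report_rows = [
    {"month": "2022-11", "job": "J1"},
    {"month": "2022-12", "job": "J2"},
]
stale_market = {"2022-11": 101.5}  # December never arrived downstream

left = join_reports(report_rows, stale_market, how="left")
inner = join_reports(report_rows, stale_market, how="inner")

# With a left join the gap is silent unless unmatched rows are counted explicitly.
unmatched = [row for row in left if row["market"] is None]
print(len(left), len(inner), len(unmatched))
```

Counting the unmatched rows keeps the report populated while still giving a signal that reference data is missing.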

Kindly help with this, as the distrust is holding back our effort to bring Power BI into the next phase: modeling our BI on the Power Platform.

 

Thanks and best regards,

R.

rdnguyen
Helper V

rdnguyen_0-1673641474373.png

@v-xiaoyan-msft Hi Caitlyn, this is the newest screenshot I collected of the issue. All dataflows refresh successfully multiple times a day to feed data to the Power BI report. The screenshot shows two connected dataflows: the one on the left is the upstream, which shows the record; the one on the right is the downstream, which shows nothing when filtered for the same job number.

Again, applying a filter in the upstream dataflow, saving it, and then going back to remove the filter solved the problem. But what I would like your team's advice on is:

1- What causes this? What trigger ensures that data is sent from upstream to downstream? Is there any configuration I could remove to prevent the data blockage?

2- I would like to confirm that this happens intermittently, at different points in different dataflow series. We have no way of knowing, because there is no error message until the blockage causes a large loss of data in downstream reports, which is making users distrust the service. It is very inconvenient, to be honest. So, could you please suggest a way to monitor for the issue if it happens again?

 

rdnguyen
Helper V

rdnguyen_0-1673642140298.png

So, by using the workaround I mentioned above, the data is now loaded.

 

rdnguyen
Helper V

rdnguyen_0-1674565988914.png

So, I added an additional step to audit the data load. The panel on the left is Dataflow A's data; the panel on the right is Dataflow B, which gets its data from Dataflow A exactly as is. You can see that the total counts differ.

Still no refresh error at all.
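A count audit of this kind could be automated once the row counts are available outside the dataflow editor. The sketch below assumes the per-table counts have already been collected into dictionaries; `count_mismatches`, the table names, and the numbers are hypothetical.

```python
def count_mismatches(upstream_counts, downstream_counts):
    """Map table name -> (upstream, downstream) for tables whose row counts differ."""
    return {
        table: (count, downstream_counts.get(table, 0))
        for table, count in upstream_counts.items()
        if downstream_counts.get(table, 0) != count
    }


# Hypothetical row counts taken after both dataflows reported a successful refresh.
dataflow_a_counts = {"Jobs": 1250, "Market": 480}
dataflow_b_counts = {"Jobs": 1248, "Market": 480}

mismatches = count_mismatches(dataflow_a_counts, dataflow_b_counts)
for table, (up, down) in mismatches.items():
    print(f"{table}: upstream={up}, downstream={down}")
```

A table missing from the downstream side is treated as having zero rows, so it is reported as a mismatch rather than skipped.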

@v-xiaoyan-msft , any update on this problem investigation?

rdnguyen
Helper V

@v-xiaoyan-msft Any update on this? I still have issues with data loading despite staggering the refresh schedule.

rdnguyen
Helper V

@v-xiaoyan-msft Hey, I am curious: has no one else run into the same issue we are experiencing?

rdnguyen
Helper V

Adding a finding from today: the data can become disconnected between upstream and downstream dataflows even within the same workspace.

 

I tested this by referencing the upstream dataflow from a new dataflow, in either the same or a different workspace. The result: updated data is visible in the upstream dataflow but not in any downstream dataflow that refers to it.

 

I even added another entity to the upstream dataflow and used it as a linked entity in the downstream dataflow, but the data still does not pass through.

 

I wonder what else I could do.

rdnguyen
Helper V

So, my current architecture was like this:

 

X pulls data from SQL Server, and B referred to X and to another dataflow, A. When C collected data from B, the blockage occurred at B: the data showed in B but not in C.

 

Then,

The change I made: a new dataflow Y refers solely to X, and B now refers to Y and A.

 

I read about the enhanced compute engine. In both of my cases, the dataflows had remained in the default mode, which is Optimized. As I understand it, in this mode the engine is turned off for a dataflow that connects to a data source, but turned on for dataflows that have linked entities from another dataflow. Maybe the issue is right there: if B refers to X (the staging dataflow that connects directly to SQL Server) and also has linked entities from dataflow A, this mixed setup may put B into a conflicting state under the Optimized setting.
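To check an existing set of dataflows for the pattern suspected above, the references can be modeled as a small graph. This is only a sketch of that hypothesis, not a confirmed diagnosis: `mixed_reference_dataflows` is a hypothetical helper, and it assumes that X is the only dataflow connected directly to SQL Server (the thread does not say where A gets its data).

```python
def mixed_reference_dataflows(upstreams, reads_source):
    """Dataflows that combine a source-connected upstream with at least one other upstream."""
    flagged = []
    for dataflow, parents in upstreams.items():
        has_source_parent = any(reads_source.get(p, False) for p in parents)
        if has_source_parent and len(parents) > 1:
            flagged.append(dataflow)
    return sorted(flagged)


# Assumption: only X connects directly to SQL Server.
reads_source = {"X": True}

# Each dataflow mapped to the dataflows it references, before and after the change.
before = {"B": ["X", "A"], "C": ["B"]}
after = {"Y": ["X"], "B": ["Y", "A"], "C": ["B"]}

print(mixed_reference_dataflows(before, reads_source))
print(mixed_reference_dataflows(after, reads_source))
```

In the original layout only B is flagged; after Y is inserted between X and B, no dataflow mixes the two kinds of reference.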

 

My case can be closed now. The problem appears to be solved since I made that change.