
Dataflow refresh didn't load newest data

Hi there, 

This is the tenth time our users have reported that data was not refreshed. When we checked the status of all dataflows, everything showed as refreshed successfully on schedule.

After debugging to find the missing item (using its Item ID), we found that:

Dataflow A loaded correctly and contains Item 1, for example.

Dataflow B, in another workspace, references Dataflow A, yet does not contain Item 1.

 

We temporarily fixed the issue by applying any change in Dataflow A, saving and refreshing it, then undoing the change (since no actual modification was needed). After that, refreshing Dataflow B brought the data in (Item 1 is now present in Dataflow B).
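A hedged sketch of the same workaround in Power Query M: if the root cause is a stale published dataflow definition, an edit that changes the definition without changing the data may have the same effect as the apply/undo trick. The step name below is a placeholder, not from the original post; this is an assumption about why the trick works, not a confirmed fix.

```
let
    Source = UpstreamTable,   // placeholder for whatever the query's last real step produces
    // No-op filter: keeps every row unchanged, but alters the dataflow
    // definition, which appears to force the linked entity to be re-published
    ForceRepublish = Table.SelectRows(Source, each true)
in
    ForceRepublish
```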

 

Is there a way to detect the missing data, or to identify the root cause so we can prevent it? Our users find this very inconvenient, and the missing data makes them distrust the Power BI reports.

Status: Investigating

Hi @rdnguyen ,

 

Is your problem the same as "Data mismatch in Desktop / Service while connecting through dataflow"?

We have submitted an ICM for that issue and just need to confirm whether you have the same problem.

 

Best regards.
Community Support Team_ Caitlyn

Comments
DaveWalton
New Member

We have the same issue. Data in the dataflow is current, but the report online is not showing the up-to-date values. Has this been resolved?

MondayMorning
Frequent Visitor

Hi all,

 

I believe we might have a similar problem, where certain data from an upstream dataflow is not being passed into the downstream dataflow, even though the dataflows refresh without any errors.

The following description of our setup is not meant to invite workarounds for our process sequence, but rather to give a knowledgeable user, Microsoft employee, or consultant enough insight to tell us why data is (or might) not be transferred from one dataflow to another.

 

Setup:

We have 10 dataflows in the same workspace on PPU (Premium Per User): 9 dataflows of type A (each extracting quantity data for an individual warehouse) and 1 dataflow of type B (consolidating the data from all type A dataflows).

Dataflow A: has three tables.

Date-Table: contains all dates of several years.

Table 1: retrieves quantity data (product ID and number of units) from SharePoint CSV files (one additional file per day). Unfortunately, each warehouse generates a differently structured CSV file, so an individual dataflow is needed per warehouse. Data is inconsistent across warehouses in terms of the dates for which data is provided (due to different holidays and weekend days). From time to time, new product IDs appear in the data.

Table 2: references Table 1 and pivots the extracted data so that each column contains unit data for an individual product ID, with one date per row. The pivoted data is then merged with the Date-Table, and a fill-down is applied so that each product has consistent units for every day. Over time the number of columns increases as new product IDs appear in the warehouse data.

Dataflow B:

Tables 1-9: each Table 2 from a type A dataflow is loaded individually into Dataflow B.

Table 10: merges tables 1-9.
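For context, the Table 2 transformation described in the setup above could look roughly like the following M sketch. The column names (`Date`, `ProductID`, `Units`) and query references are assumptions, since the original queries are not shown in the thread.

```
let
    Source = #"Table 1",
    // One column per product ID, with units as values
    Pivoted = Table.Pivot(Source, List.Distinct(Source[ProductID]),
        "ProductID", "Units", List.Sum),
    // Left-join onto the full calendar so every date appears,
    // even for days a warehouse reported nothing
    Joined = Table.NestedJoin(#"Date-Table", {"Date"}, Pivoted, {"Date"},
        "Pivoted", JoinKind.LeftOuter),
    Expanded = Table.ExpandTableColumn(Joined, "Pivoted",
        List.RemoveItems(Table.ColumnNames(Pivoted), {"Date"})),
    // Carry the last known units forward over dates with no data
    FilledDown = Table.FillDown(Expanded,
        List.RemoveItems(Table.ColumnNames(Expanded), {"Date"}))
in
    FilledDown
```

Note that in this pattern the column set is data-driven (it comes from `List.Distinct` over the product IDs), which is exactly the part the downstream dataflow is apparently not picking up.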

 

 

Results:

Dataflows of type A: contain new columns for new product IDs.

Dataflow B: in tables 1-9, no new columns for new product IDs are visible.

 

We have tried different methods of refreshing the data:

  • starting the refresh from Dataflow A, which automatically triggers a refresh of Dataflow B;
  • individually refreshing Dataflow B;
  • letting time (hours and days) pass between refreshes;
  • including Table.Buffer() at the end of each table.
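For reference, "Table.Buffer() at the end of each table" presumably means wrapping the query's final step, roughly as sketched below (step names are placeholders). Table.Buffer materializes the table in memory during a single evaluation, so it is plausible, as reported here, that it has no effect on which columns the downstream dataflow sees.

```
let
    Source = #"Previous Step",   // placeholder for the query's last real step
    // Materialize the table in memory at the end of the query
    Buffered = Table.Buffer(Source)
in
    Buffered
```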

 

All without any refresh faults. Each refresh produces new data in terms of rows per day, but never a new column for any new product ID.

 

Why is pivoted column data not updated in downstream dataflows?

 

However, if we modify a type A dataflow in any way (filter, sort, change a column type, or even a project-specific option such as "Detect column types and headers from unstructured sources"), the "new" columns appear in the type B dataflow.

 

By the way, if you unselect "Detect column types and headers from unstructured sources" in the project-specific options, the setting is not saved: the box is checked again the next time you enter the dataflow to modify the options.

DaveWalton
New Member

I believe I found the solution. Go into the Power BI report that is using the dataflow, edit the query/table in question, and select "Properties" from the Home menu ribbon. Then make sure the box "Include in Report Refresh" is checked.

[Screenshot: DaveWalton_0-1708462931209.png]

 

MondayMorning
Frequent Visitor

Hi @DaveWalton ,

 

Thank you for your suggestion.

 

However, the consuming entity is not a report but another dataflow (our dataset sits only on top of the mentioned type B dataflow), which does not offer the options your screenshot shows (it only has a checkbox for "Enable load to report", which has already been activated).

 

My assumption is that changes in table structure caused by data import (in our case, new product IDs imported from CSV files, resulting in new columns) are somehow not "registered" by the underlying service infrastructure, and hence this data is not passed on.

I would appreciate it if someone could make sense of this behavior, confirm or refute it, and provide a solution to our problem of data-driven table structure changes not being passed on from dataflow to dataflow.

 

@rdnguyen have you found a solution in the meantime?

AMarks
Advocate I

I am having the same issue.

 

Dataflow A in Workspace 1

Dataflow B in Workspace 2 that is linked to Dataflow A

Both workspaces are in premium capacity

 

Both dataflows refresh fine, but for some reason Dataflow B does not seem to pick up the new data from Dataflow A. It skips the refresh, so I end up with stale data in Dataflow B.

 

This is really frustrating.