chris__1
Helper III

Dataflow Gen2: inconsistent behavior/failures and CU consumption

I am having some trouble understanding why a Dataflow Gen2 is failing sometimes.

 

As you can see from the screenshot of the refresh history, the dataflow works fine for some time (running every day on a schedule) and then, for no apparent reason, fails to refresh.

[Screenshot: dataflow refresh history]

 

After manually clicking the refresh button once or twice (without changing the dataflow itself OR the data that is being processed!), the flow eventually refreshes successfully, although it takes much longer (25 min compared to the usual 8 min or so) and also consumes much more CU than usual!

 

It is always a "WriteTODataDestination" activity that fails. The exact error message is:

 

104100 Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: DataSource.Error: Web.Contents failed to get contents from 'https://wabi-west-europe-d-primary-redirect.analysis.windows.net/v1/workspaces' (429):  Details: Reason = DataSource.Error;DataSourceKind = Warehouse;DataSourcePath = Warehouse;Url = https://wabi-west-europe-d-primary-redirect.analysis.windows.net/v1/workspaces;Microsoft.Data.Mashup.Error.Context = User
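
For anyone triaging many failed refreshes, here is a minimal Python sketch that pulls the HTTP status and the `Details` fields out of a message shaped like the one above. The function name `parse_mashup_error` is hypothetical, and the assumption that the details are `;`-separated `key = value` pairs is inferred only from this one message; other errors may be laid out differently.

```python
import re

# Tail of the error message quoted above, used as sample input.
SAMPLE = (
    "Web.Contents failed to get contents from "
    "'https://wabi-west-europe-d-primary-redirect.analysis.windows.net/v1/workspaces' (429):  "
    "Details: Reason = DataSource.Error;DataSourceKind = Warehouse;"
    "DataSourcePath = Warehouse;"
    "Url = https://wabi-west-europe-d-primary-redirect.analysis.windows.net/v1/workspaces;"
    "Microsoft.Data.Mashup.Error.Context = User"
)


def parse_mashup_error(message: str) -> dict:
    """Extract the HTTP status, URL and detail fields from a refresh error.

    Assumption: the status appears as a 3-digit number in parentheses and
    the details follow "Details:" as ';'-separated "key = value" pairs.
    """
    result = {"status": None, "url": None, "details": {}}

    status = re.search(r"\((\d{3})\)", message)
    if status:
        result["status"] = int(status.group(1))

    # Everything after "Details:" is treated as key/value pairs.
    _, _, tail = message.partition("Details:")
    for part in tail.split(";"):
        key, sep, value = part.partition("=")
        if sep:
            result["details"][key.strip()] = value.strip()

    result["url"] = result["details"].get("Url")
    return result


info = parse_mashup_error(SAMPLE)
is_throttled = info["status"] == 429  # 429 is HTTP "Too Many Requests"
```

A status of 429 points at throttling on the service side rather than at the dataflow definition or the data, which matches the observation that an unchanged flow later succeeds.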

 

This dataflow is not using a gateway. The single datasource is a Fabric Lakehouse and the destination is a Fabric Warehouse!

 

Two things that I find rather troublesome:

 

1. As the dataflow is unchanged and the data processed does not change between a failed and a successful refresh, I have a hard time understanding why the dataflow fails and what I can do to prevent it from failing!

 

2. Those failures (which do not seem to be caused by a user error) still consume quite a lot of CU. This is a screenshot from the metrics app filtered to the dataflow in question. You can see that today's consumption is much higher, as it also was 14 days ago. Between those dates the dataflow worked as expected and consumed significantly less CU.

[Screenshot: metrics app, CU consumption filtered to this dataflow]

 

I would like to understand why a dataflow sometimes fails, especially when it later runs successfully without any change, and how this affects the CU consumption!

Thank You!

1 ACCEPTED SOLUTION
pqian_MSFT
Employee

Hi @chris__1 , @Joshrodgers123 

 

This particular failure (calling /workspaces and receiving 429) was caused by an internal throttling issue that was resolved at the beginning of this week.

 

Please monitor these refreshes for a bit and see whether you continue to observe failures due to the "429" error. If so, let me know and share the session, request, and dataflow IDs from the refresh history screen:

[Screenshot: refresh history screen showing the session, request, and dataflow IDs]
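
For readers who script the manual re-run described in the question instead of clicking refresh by hand, the usual pattern for a 429 is a bounded retry with exponential backoff. The sketch below is generic Python, not a Fabric API: `ThrottledError`, `retry_with_backoff`, and the `action` callable (whatever you use to trigger and await a refresh) are all hypothetical names, and the delay values are arbitrary assumptions.

```python
import random
import time


class ThrottledError(Exception):
    """Hypothetical error: the refresh attempt was rejected with HTTP 429."""


def retry_with_backoff(action, max_attempts=4, base_delay=30.0,
                       max_delay=600.0, sleep=time.sleep):
    """Call `action` until it succeeds or `max_attempts` is reached.

    Waits base_delay, 2*base_delay, 4*base_delay, ... (capped at max_delay)
    plus up to 10% random jitter between attempts. Only ThrottledError is
    retried; any other exception propagates immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except ThrottledError:
            if attempt == max_attempts:
                raise  # give up: failed attempts are not free
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
```

Keep `max_attempts` small: as reported in this thread, failed refreshes still consume CU, so unbounded retries could make the capacity problem worse.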

 

 


13 REPLIES

Thanks @pqian_MSFT. In fact, I did not encounter this error this week. I also could not replicate it when working with support. We closed the ticket, and I'll come back if it reoccurs...

Joshrodgers123
Advocate IV

We've been seeing this error on all of our dataflows causing them to fail (and consume our CU). We've had a support ticket open since November for DFG2 issues and no one has been able to fix it. 

Thank you @Joshrodgers123 - it's good to know that I am not alone in seeing this behavior.

 

As our trial is running out soon, I think it's going to be difficult to convince the right people to put this product into production and to spend money on it. Not being able to understand why a workload fails is one thing - but having to pay for it as your CU go up... that's not good.

Thank you for stating what everyone faced with 104100 errors and other DFg2 shenanigans should be hollering from the rooftops. I sure am. But get this: support just got back to me yesterday and told me that the workaround to handle error 104100 (when ingesting from an on-prem DB) is by design (according to the internal team, whoever they are)! Meanwhile, if you try to do your ingest using only one DFg2, you will be blocked by a 104100 error! How is that by design and not a bug, for crying out loud!?

@Element115 thanks for the feedback - that's certainly interesting to hear!

 

However, in my particular case, as explained in my initial post, nothing involved is on-prem! Everything is located in Fabric.

And even when the flow does not fail, I still find the quite extreme fluctuations in CU consumption astonishing, considering that it should always consume the same amount if neither the flow nor the data changed...

I'm not sure I understand your reply. What did support say was the workaround and what did they say was by design?

Support said that the section 'Workaround: Split dataflow in a separate ingest and load dataflow' on this page of the Microsoft documentation, On-premises data gateway considerations for data destinations in Dataflow Gen2 - Microsoft Fabric | ..., is the workaround, and that it's supposedly by design that we can't use just one DFg2 when ingesting from an on-prem DB. Just when you think you've heard everything... 😁

Hi @chris__1 
We haven’t heard back from you since the last response and are just checking whether you got a chance to create a support ticket. If so, please share the details here.

Otherwise, please reply with more details and we will try to help.
Thanks

Since it happens quite a lot now (the flow fails with error 104100, I restart the flow, and eventually it finishes successfully), I contacted support. Ticket nr: 2402050050001313

 

Has support been able to help you at all with this? All of our dataflows are still failing ~75% of the time. We're not doing any transformations at all and the volume is really small. Data loads to staging in seconds, but then times out writing to the warehouse.

 

Sometimes they run for 5+ hours and fail - consuming almost 100% of our capacity...

Hi @chris__1 
Thanks for using Fabric Community. Apologies for the issue you have been facing. 
To handle such scenarios we need more information, such as your workspace details and session ID, so that our engineering team can better understand the issue and help you. Hence I request you to create a support ticket here:
Microsoft Fabric Support and Status | Microsoft Fabric
Please provide the ticket number here so that I can closely follow up and provide you with a solution.

Thanks.
