dbeavon3
Skilled Sharer

Enterprise gateway has a mashup timeout that causes containers to be killed.

Every few minutes the enterprise gateway is killing my mashup containers. (Microsoft.Mashup.Container.NetFX45.exe)

 

This appears to be a relatively new problem.  I was able to run WEB-connector requests in the past, and they lasted a half hour or more.  Now the enterprise gateway seems to kill mashup containers after just 15 or 20 mins.   

 

Is anyone familiar with the timeout that is causing this?  Why won't my PQ timeouts be respected?  They specify that web requests should be allowed to take longer than the gateway is allowing.

 

 

Here is the error

 

[Screenshot: dbeavon3_0-1716521519128.png]

 

 

DM.EnterpriseGateway Error: 0 : 2024-05-24T02:40:38.2659380Z DM.EnterpriseGateway
4b2eab87-e66b-40b1-b29c-dd2d28ee149e 84dd25e2-0aea-4b84-b5fa-6d5cb2f052bb MGEA 4f4a95f6-346e-4756-9588-5cd1d141d866
0f4d8de9-6b6f-4494-b3e2-8fe4877e0034 0f4d8de9-6b6f-4494-b3e2-8fe4877e0034 A1272C73 [DM.Pipeline.GatewayPipelineTelemetry]
Non-gateway exception encountered in activity scope: Microsoft.Data.Mashup.MashupHostingException (0x80004005):
Timeout expired. The timeout period elapsed prior to completion of the operation.

at Microsoft.Data.Mashup.ProviderCommon.MashupResource.<>c__DisplayClass48_2`1.<StartEvaluationAndGetResultSource>b__0()
at Microsoft.Mashup.Security.Impersonation.RunAsProcessUser[T](Func`1 func)
at Microsoft.Data.Mashup.MashupCommand.EvaluateAndGetSource[T](String commandText, CommandType commandType, Int32 commandTimeout, MashupParameterCollection parameters, String resultTransform, Boolean forColumnInfo, Boolean executeAction)
at Microsoft.Data.Mashup.MashupCommand.ExecuteReader(CommandBehavior commandBehavior, MashupCommandBehavior mashupCommandBehavior)
at Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.GatewayProcessorUtils.MashupExecuteReaderAsync(DbCommand command, ExecuteQueryRequest queryRequest, Guid activityId)
at Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.GatewayProcessorUtils.ExecuteReaderAsync(DbCommand command, ExecuteQueryRequest queryRequest, Guid activityId)
at Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.GatewayProcessor.<>c__DisplayClass4_3.<<ExecuteAdoQueryAsync>b__3>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.PowerBI.DataMovement.Pipeline.GatewayPipelineTelemetry.PipelineTelemetryService.<ExecuteInActivityAsync>d__7`1.MoveNext()

 

 

 

 

Resulting in even more noise like so:

[1]Microsoft.PowerBI.DataMovement.Pipeline.Diagnostics.GatewayPipelineWrapperException: Substituted: MashupHostingException:<ccon>Timeout expired. The timeout period elapsed prior to completion of the operation.</ccon>


 

 

 

[Screenshot: dbeavon3_1-1716521595849.png]

 

 

Any help would be appreciated

 

 

7 REPLIES
SaiTejaTalasila
Post Prodigy

Hi @dbeavon3 ,

 

Have you tried increasing the timeout on the connections? It may be worth a try.

 

The WEB-connector request in the PQ script has a large timeout (1 hour).

 

The only config value I'm changing as of now in the gateway config is "MashupDSRTestConnectionTimeout".  I'm increasing it to 2 mins.  I have no idea yet whether this does anything.  I will know soon....  Everything is pretty poorly documented.  I wish there were some annotations about which config values are changed most frequently, and how they impact the reliability of the mashup containers.

lbendlin
Super User

Have you checked the usual suspect?  The config file?

Hi @lbendlin 

Yes.  I saw nothing in there with this timeout.  I saw that you had replied to a question about this from another user some time ago, but I didn't find a direct answer.  I'm pretty sure this is an old topic, yet not well documented.

 

The error explains that the mashup problem was "detected" as a timeout.  The component doing this nanny work appears to be called a DSR.  There is a DSR time setting but it is just a few seconds, and says it can't be changed.  That configuration setting can't be the lifespan of a mashup process.

 

It is a really weird issue, not only that it is relatively new, but that I'm not finding a lot of noise about it in the community.  You would think that most PBI customers would object to their stuff being killed in just 15 mins.  I think the overall max for a PQ dataflow on premium is about five hours.

 

It is possible I'm doing something wrong.  But the mashup logs are hard to read, and the errors are hard to understand.  From my perspective it appears obvious that the timeout itself is the source of my problem, and I'm certainly not the source of that timeout.  Even if I am the one doing something wrong, it would have to be a very, very big offense to warrant the sudden killing of my mashup process.  When I'm watching it in taskmgr, it appears well behaved, and memory is low.  I'm stumped.  I opened a ticket and am waiting to hear back.  I'm about ready to decompile some of this code as well, to see if I can find some clues; I noticed there is a timeout parameter in the call stack when sending work from the gateway process to the mashup process.

 

 

 

 

"I think the overall max for a PQ dataflow on premium is about five hours."

 

One would think that. The truth is much, much worse. There is no limitation for dataflows.  They can rack up gazillions of CUs if you are not extremely careful.  We have already had a couple of capacity lockups, for example where a dataflow failed after 17 hours.

 

We have now implemented our own process to forcibly cancel dataflows after 5 hours  (in line with the semantic model timeout), including an angry email to the dataflow owner.
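The selection step of such a watchdog could look roughly like the sketch below. It is only an illustration: the post does not say how the process is built, so the record shape (`name`, `owner`, `started_utc`) and the function name `overdue_refreshes` are hypothetical, and the actual cancel call and email are left out.

```python
from datetime import datetime, timedelta, timezone

# Five hours, matching the limit mentioned in the post.
MAX_RUNTIME = timedelta(hours=5)


def overdue_refreshes(in_progress, now, limit=MAX_RUNTIME):
    """Return the in-progress refresh records that have run longer than `limit`.

    `in_progress` is a list of dicts with a timezone-aware `started_utc`
    datetime. This record shape is an assumption made for illustration.
    """
    return [r for r in in_progress if now - r["started_utc"] > limit]


# Example with made-up records.
now = datetime(2024, 5, 24, 12, 0, tzinfo=timezone.utc)
running = [
    {"name": "df_sales", "owner": "a@example.com",
     "started_utc": datetime(2024, 5, 24, 6, 30, tzinfo=timezone.utc)},
    {"name": "df_small", "owner": "b@example.com",
     "started_utc": datetime(2024, 5, 24, 11, 0, tzinfo=timezone.utc)},
]
to_cancel = overdue_refreshes(running, now)
```

Each record in `to_cancel` would then be handed to whatever mechanism performs the cancel and notifies the owner.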

 

Anyhow - for your issue, as painful as you think it is, raising a ticket is the only correct next step.

@lbendlin Thanks for the reply.  I have a meeting scheduled Monday.  They aren't really giving me anything helpful by email.

I'm beginning to think that this timeout error isn't related to the 15 min itself, but to some particular step happening within that span - once the size of the data grows, as is likely to happen after 15 min.  For example, maybe the size of the data that is spooled out to PBI is growing too big, and it can't be serialized/spooled out of the mashup process in 100 ms, or 1 sec, or 1 min, or whatever the constraint is.  So maybe I get a timeout error on that small piece of work, but it is a subtle and misleading sort of timeout that leads us to think it is related to the 15 min of PQ execution.

 

I'm a bit embarrassed to say it, but I think this particular mashup results in about 1 GB of JSON data (roughly 1 million rows of bite-sized items in JSON format).  It is probably not very common for that amount of raw text to come out of a PQ mashup, so perhaps Microsoft's gateway doesn't really know how to serialize/spool it efficiently.

 

In any case, I'm starting to focus on the DSR timeout again:
MashupDSRTestConnectionTimeout
It is only 50 sec by default and there is a 2 min max according to the config docs.  I guess it doesn't hurt to try.  After that I'll be looking to see if there are memory constraints in the config as well.  Of all the constraints that might be impacting me, the least likely seems to be the enforcement of a 15 min limit on the execution of PQ!  From my limited experience working with PQ, fifteen minutes seems pretty reasonable.
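To survey the candidates in one pass, a small script can list every setting in the gateway config whose name mentions a timeout or memory. This is a minimal sketch, assuming the config file uses the usual .NET `<setting name="..."><value>` layout. The sample XML is made up for illustration: only the name MashupDSRTestConnectionTimeout comes from this thread, and the section name, the other two settings, and the `00:00:50` value format are assumptions.

```python
import xml.etree.ElementTree as ET

# Made-up sample. Only the first setting name appears in this thread;
# the section name, the other settings and all value formats are assumed.
SAMPLE_CONFIG = """<configuration>
  <applicationSettings>
    <Hypothetical.Section>
      <setting name="MashupDSRTestConnectionTimeout" serializeAs="String">
        <value>00:00:50</value>
      </setting>
      <setting name="HypotheticalMemoryLimitMB" serializeAs="String">
        <value>1024</value>
      </setting>
      <setting name="HypotheticalOtherFlag" serializeAs="String">
        <value>True</value>
      </setting>
    </Hypothetical.Section>
  </applicationSettings>
</configuration>"""


def find_settings(xml_text, keywords=("timeout", "memory")):
    """Return {name: value} for settings whose name contains any keyword."""
    root = ET.fromstring(xml_text)
    found = {}
    for setting in root.iter("setting"):
        name = setting.get("name", "")
        if any(k in name.lower() for k in keywords):
            found[name] = (setting.findtext("value") or "").strip()
    return found


candidates = find_settings(SAMPLE_CONFIG)
for name, value in candidates.items():
    print(f"{name} = {value}")
```

Against a real gateway, the text of the actual config file would be passed in place of `SAMPLE_CONFIG`.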

 

Are you able to share any of your own customizations you made to your gateway configs that I should look at?  If we need to change something in the mashup config, it will be our very first time.  Maybe it is just a matter of  checking on the usual suspects. 

 

dbeavon3
Skilled Sharer

I'm still trying to make sense of the logs.  I think the gateway is causing a self-inflicted problem, when things are otherwise behaving fine.  There appears to be a component called MDSR which is causing the mashup engine to be killed by the gateway:

 

 

 


DM.EnterpriseGateway Error: 0 : 2024-05-24T03:10:41.5884199Z DM.EnterpriseGateway 25cb00cd-4bbf-440b-9389-6b5984ada942 bfdd836f-fafe-489d-ac47-e98982bf11fc MDSR 4f4a95f6-346e-4756-9588-5cd1d141d866 7d4a2603-2834-455c-8246-2634e25f0899 7d4a2603-2834-455c-8246-2634e25f0899 4C1EF73E

[DM.GatewayCore] Error processing request: [0]Microsoft.PowerBI.DataMovement.Pipeline.Diagnostics.MashupDataAccessHostingException: The mashup query execution timed out
TemplateMessage: A problem hosting the mashup engine was detected. Reason: Timeout. Error code: -2147467259.
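To check whether the kills line up with a fixed interval, it can help to pull the timestamp and the four-letter component tag (MGEA, MDSR) out of each DM.EnterpriseGateway error header. The sketch below does that for the two headers quoted in this thread. The regular expressions are inferred from those two excerpts only, not from any documented log format, and the function name `parse_error_header` is hypothetical.

```python
import re
from datetime import datetime

# Header pattern inferred from the two excerpts in this thread (an assumption).
HEADER = re.compile(
    r"DM\.EnterpriseGateway Error: \d+ : "
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})\.(?P<frac>\d+)Z"
)
# A four-letter uppercase tag sitting between two GUIDs.
TAG = re.compile(
    r"[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}\s+(?P<tag>[A-Z]{4})\s+[0-9a-f]{8}-"
)


def parse_error_header(text):
    """Return (timestamp, tag) for one error entry, or None if no header is found."""
    m = HEADER.search(text)
    if m is None:
        return None
    # The log has 7 fractional digits; datetime keeps 6, so truncate.
    micro = int(m.group("frac")[:6].ljust(6, "0"))
    when = datetime.strptime(m.group("ts"), "%Y-%m-%dT%H:%M:%S").replace(microsecond=micro)
    tag = TAG.search(text, m.end())
    return when, (tag.group("tag") if tag else None)


first = parse_error_header(
    "DM.EnterpriseGateway Error: 0 : 2024-05-24T02:40:38.2659380Z DM.EnterpriseGateway "
    "4b2eab87-e66b-40b1-b29c-dd2d28ee149e 84dd25e2-0aea-4b84-b5fa-6d5cb2f052bb MGEA "
    "4f4a95f6-346e-4756-9588-5cd1d141d866"
)
second = parse_error_header(
    "DM.EnterpriseGateway Error: 0 : 2024-05-24T03:10:41.5884199Z DM.EnterpriseGateway "
    "25cb00cd-4bbf-440b-9389-6b5984ada942 bfdd836f-fafe-489d-ac47-e98982bf11fc MDSR "
    "4f4a95f6-346e-4756-9588-5cd1d141d866"
)
gap = second[0] - first[0]
print(first[1], second[1], gap)
```

The gap between two logged errors is not necessarily the timeout value itself, since the two entries may belong to different attempts. Collecting these pairs across several failed refreshes would show whether the interval is actually constant.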

 

 

 

[Screenshot: dbeavon3_0-1716522596027.png]

 

 

 

Perhaps the MDSR needs to just chillax and not look for problems that aren't actually there.

 

Does anyone know how to suppress the killing of the mashups?  As near as I can tell it happens somewhere around 15 or 20 mins.

 

 
