dbeavon3
Memorable Member

StreamBeforeRequestCompletes - Guidance for Mashup Container

I have a data source that is very fast.  It contains a fact table.  I retrieve data from the source in a loop, and the source can return data for every iteration of the loop in ~500 ms or less.  There are about 1000 iterations, i.e. ~500 seconds in total is spent at the data source itself, replying to queries from PQ.

 

But the mashup container process on the gateway server is *VERY* slow.  While those 1000 iterations are running, there is a single mashup container that spends about 50% of its time burying one of my CPU cores, and the remaining 50% of the time is split writing to the "cache" directory and the "spooler" directory.  This goes on and on and on for almost two hours!   Finally after that, the spool is cleaned out and about 1 GB is transmitted up to the service.

 

This seems dysfunctional.  The bulk of the work on the CPU appears to be spent serializing and deserializing to "cache" and "spooler" directories (this is probably happening since the "mashup container" doesn't want to use more than 200MB of RAM or whatever.) Ideally the data from the data source (which is available in a total of ~500 seconds) would be transmitted straight to the PBI service without all the unnecessary hoopla. 
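
To put numbers on that gap, here is a rough back-of-the-envelope sketch using the figures above. It assumes "almost two hours" is rounded to a full 2 hours and "about 1 GB" is treated as 1000 MB, so the results are approximate, not measurements.

```python
# Figures from the post; the two rounded values below are assumptions.
ITERATIONS = 1000
SECONDS_PER_ITERATION = 0.5      # ~500 ms or less per query at the source
GATEWAY_SECONDS = 2 * 60 * 60    # assumption: "almost two hours" rounded to 2 h
PAYLOAD_MB = 1000                # assumption: "about 1 GB" treated as 1000 MB


def source_seconds(iterations, seconds_each):
    """Total time the data source spends answering queries."""
    return iterations * seconds_each


def overhead_factor(gateway_seconds, src_seconds):
    """How many times longer the gateway takes than the source itself."""
    return gateway_seconds / src_seconds


def throughput_mb_per_s(payload_mb, elapsed_seconds):
    """Effective end-to-end transfer rate for the whole refresh."""
    return payload_mb / elapsed_seconds


src = source_seconds(ITERATIONS, SECONDS_PER_ITERATION)
print(f"time at source: {src:.0f} s")
print(f"gateway/source: {overhead_factor(GATEWAY_SECONDS, src):.1f}x")
print(f"effective rate: {throughput_mb_per_s(PAYLOAD_MB, GATEWAY_SECONDS):.2f} MB/s")
```

Under those assumptions the refresh takes more than ten times as long as the source needs to produce the data.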

 

I recently discovered some places on the internet where folks are suggesting a change in a config: "StreamBeforeRequestCompletes".  I suspect this may be the answer (or part of the answer).  However my version of the gateway is 3000.122.8 April 2022 and the config (Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config) says the config setting is "internal":

Internal settings - do not change:
* StreamBeforeRequestCompletes
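
Before asking anyone to change anything, it may help to check what a gateway currently has for this setting without editing the file. The following is a minimal read-only sketch. It assumes the file uses the usual .NET `applicationSettings` layout (`<setting name="..."><value>...</value></setting>`); the sample XML and the section name `ExampleSettingsSection` are hypothetical placeholders, not copied from a real gateway install.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the usual .NET applicationSettings shape.
# The real section name and default value in the gateway config may differ.
SAMPLE_CONFIG = """
<configuration>
  <applicationSettings>
    <ExampleSettingsSection>
      <setting name="StreamBeforeRequestCompletes" serializeAs="String">
        <value>False</value>
      </setting>
    </ExampleSettingsSection>
  </applicationSettings>
</configuration>
"""


def read_setting(config_xml, name):
    """Return the text of the named <setting>'s <value>, or None if absent."""
    root = ET.fromstring(config_xml.strip())
    for setting in root.iter("setting"):
        if setting.get("name") == name:
            value = setting.find("value")
            if value is not None and value.text:
                return value.text.strip()
            return None
    return None


print(read_setting(SAMPLE_CONFIG, "StreamBeforeRequestCompletes"))
```

To use it against a real install, read the contents of Microsoft.PowerBI.DataMovement.Pipeline.GatewayCore.dll.config into a string and pass that in place of `SAMPLE_CONFIG`. The sketch only reads the file; it does not modify it.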

In other words, I'm getting mixed messages.  The most recent message is from Microsoft documentation that seems to encourage people to change it:
https://docs.microsoft.com/en-us/data-integration/gateway/service-gateway-performance#optimize-perfo...

 

So what is the authoritative word on this?  Should we use it or not use it?  Also how does it work?

 

A final thought - wouldn't it be cool to have a property on a PQ where the developer gets to decide which tables are streamed and which ones aren't?  I am nervous about making a server-wide setting change, given that all I care about is a single table in a single dataset.  I don't want to be responsible for negative impacts on all the other tables in all the other PBI reports.

 

I'm not sure how this can be "internal" and yet be advertised publicly in various places.  I'd love to hear first-hand accounts from someone who has made the change ... or from someone at Microsoft who has the official/authoritative information about whether the property is internal or not.

3 REPLIES
nphadro07
Frequent Visitor

Did we ever get a final answer? I just want to confirm that this streaming setting will NOT impact anything if I cancel the refresh halfway or if the dataset refresh fails at any point. It's not as if I would end up with half the data, so that end users start seeing inaccurate data? Can you confirm?

Anonymous
Not applicable

You can change the container size per query in the PBI Desktop settings.

 

The "Stream" setting does seem to speed things up by not spooling everything. But all data that goes to the Power BI gateway goes through the gateway. I don't know the machine on which you have the gateway installed, but just because source to local Power BI is fast does not mean that the gateway is going to be just as fast. It depends on the machine: is it working on other stuff, does it have enough RAM, processor power, etc. You said that you are worried about affecting all of the other reports, which suggests that yours isn't the only report potentially refreshing through the gateway.

 

--Nate

@Anonymous Thanks for the reply.

 

I wasn't clear in my question, but I was discussing the on-prem enterprise gateway, not the behavior of the PBI desktop.

>> The "Stream" setting does seem to speed things up by not spooling everything
Are there negative consequences?  Do you use it by default on all your gateways?  If I ask our server team to make this config change, then how will I explain the comment in there that says "do not change"?

 

>>You can change the container size per query in the PBI Desktop settings

Yes, it affects the desktop but I don't think this affects the behavior in the enterprise gateways.   That said it might be nice if there was a way to influence the container sizes on the gateways. (on a report-by-report basis)

 

>> does not mean that the Gateway is going to be just as fast. 

Right, the gateway "mashup container" seems to do nothing but slow me down.  It runs as a .Net single-threaded CLR host.  It restricts itself on RAM, and spends a massive amount of time on unnecessary I/O.  For this particular case (a 1 GB fact table) I just wish the mashup container would "get out of the way" and let the data straight thru.  I suspect I would save a little over an hour if it wasn't for the unnecessary caching and spooling to disk.

 
