I have been waiting for about three years for the Azure managed VNET gateway to become GA.
See more about that here:
About a month ago it finally went out of preview! (Congrats to that team.)
However there are some underlying problems that are still present. For example, the so-called "private endpoint" technology in Azure is extremely buggy. As a result, the network traffic between the managed gateway and the SQL Server will encounter socket exceptions on a regular basis - many times a day, depending on the workload. When the gateway fails to retrieve data from the SQL Server, it is the customer that will pay the price (both figuratively and literally). It can be a high cost, since customers are paying for *both* the failures in the Azure SQL resource and the failures in the gateway.
NOTE: There are similar private endpoint bugs that are found in Azure ADF, Synapse Analytics Workspaces, and everywhere else that "private endpoints" are used. These bugs are most severe in "big data" scenarios where large numbers of rows are being transmitted from one resource to another (within the same East US region). Connectivity problems have a higher likelihood in these scenarios, since connections are open for extended periods. Despite the resiliency of the TCP/IP protocol stack, the private endpoint bugs are not avoidable and there will be many socket exceptions ("connection reset by peer").
About three years ago I expressed my concerns about the high rate of failure in the private networking, but even after three years the network bugs in the private endpoints have not been fixed. Instead of making the network team fix their bugs, the engineers on the PBI gateway team are forced to "mask" the problem using a pre-defined number of retries. However that solution is not perfect. Given that the network problems are caused by an unpredictable software bug, the retries aren't guaranteed to avoid a dataset refresh failure (i.e. no matter whether there are ten or twenty retries, there is still some uncertainty about whether the networking bug will persist for the entire span of time).
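To put rough numbers on that uncertainty, here is a minimal Python sketch. It assumes every attempt fails independently with the same probability, which is optimistic: if the networking bug persists across attempts, the real odds are worse. The function names and the 20% figure are illustrative assumptions, not measurements from the gateway.

```python
def refresh_failure_probability(p_attempt_fails: float, retries: int) -> float:
    """Chance that the first attempt AND every retry all fail.

    Assumes attempts fail independently, which understates the risk
    when the network problem lasts longer than a single attempt.
    """
    if not 0.0 <= p_attempt_fails <= 1.0:
        raise ValueError("p_attempt_fails must be between 0 and 1")
    if retries < 0:
        raise ValueError("retries must be zero or more")
    return p_attempt_fails ** (retries + 1)


def expected_attempts(p_attempt_fails: float, retries: int) -> float:
    """Average number of query executions per refresh.

    Attempt k+1 only runs if the first k attempts all failed, so the
    expectation is 1 + p + p^2 + ... + p^retries.
    """
    if not 0.0 <= p_attempt_fails <= 1.0:
        raise ValueError("p_attempt_fails must be between 0 and 1")
    if retries < 0:
        raise ValueError("retries must be zero or more")
    return sum(p_attempt_fails ** k for k in range(retries + 1))


if __name__ == "__main__":
    # Hypothetical: 20% of long-running attempts hit a socket exception.
    print(refresh_failure_probability(0.2, 3))
    print(expected_attempts(0.2, 3))
```

Under those assumed numbers, three retries push the refresh failure rate down to about 0.16%, but each refresh runs roughly 1.25 query attempts on average, and every extra attempt is billed on both the SQL side and the gateway side.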
Whereas Azure ADF allows customers to customize the number of retries, the PBI gateway has a hard-coded number of retries. Per the engineer (let's call him P.C.) the number of retries is three:
“PBI does 3 retries for retriable errors. It's not something special to vnet data gateway. It might increase the cost using vnet data gateway since the gateway might be up an running longer because of this”
Does anyone else have concerns about the impact of this solution on their long-running dataset queries (PQ import)? Has anyone noticed that these retries are happening at the customer's expense? They can be observed by (1) downloading the gateway logs from the vnet gateway, (2) reviewing how long dataset refreshes take to complete, or (3) auditing your P1 capacity statement at the end of each month. It is important to monitor because, unless you are actually looking for it, the underlying network bugs will continue to live happily under the covers of Power BI, and will slowly feed off of your monthly Azure budget.
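For option (2), a rough Python sketch of how refresh durations could be screened for hidden retries. It assumes you have already exported refresh history as a list of records with `startTime`, `endTime` and `status` fields (the shape I would expect from a refresh history export, but treat the field names as an assumption). The function name and the "twice the median" threshold are hypothetical choices, not anything built into Power BI.

```python
from datetime import datetime
from statistics import median


def _parse(timestamp: str) -> datetime:
    # Older Python versions reject a trailing "Z", so normalise it first.
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def flag_slow_refreshes(records, factor=2.0):
    """Return (startTime, seconds) for completed refreshes that ran longer
    than `factor` times the median duration.

    A refresh that silently retried a long query tends to show up as an
    outlier here, even though its final status is still "Completed".
    """
    durations = []
    for record in records:
        if record.get("status") != "Completed" or not record.get("endTime"):
            continue  # skip failed or still-running refreshes
        elapsed = _parse(record["endTime"]) - _parse(record["startTime"])
        durations.append((record["startTime"], elapsed.total_seconds()))
    if not durations:
        return []
    typical = median(seconds for _, seconds in durations)
    return [(start, seconds) for start, seconds in durations
            if seconds > factor * typical]
```

A slow refresh is only a hint, not proof of a retry; it would still need to be matched against the gateway logs from option (1).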
If I had my preference, these implicit retries would be totally disabled (or they would be implemented manually using the refresh schedule). I would much rather be aware of the problems than pretend they don't exist. IMHO it is not acceptable for software developers (even a Power BI developer) to pass along these unwanted and unexpected costs to a down-stream customer or employer.
Please let me know if others are using the Azure VNET gateway, and have observed these repeated socket exceptions and subsequent retries.
Hi @dbeavon3 ,
If you would like to suggest feature improvements, you may vote for the idea and comment here to improve this feature. It is a place for customers to provide feedback about Microsoft Office products. What's more, if a piece of feedback is highly voted there by other customers, it is more likely that the Microsoft Product Team will take it into consideration when designing the next version in the future.
Best Regards,
Community Support Team_ Scott Chang
Hi @Anonymous
Thanks for the comment.
I was not looking for a link to the ideas portal. I think most of the community is probably aware of that (and avoid it). Microsoft is always pre-occupied with their own strategic path (and on monetization). They rarely seem to work on ideas from the community. I've waited many years for simple issues to be fixed (eg. in dataflows) and I am not likely to create any more noise in the ideas portal until Microsoft starts to work on the stuff that is already on that list.
The purpose of the post is to confirm that the socket exceptions are commonly encountered by other users of this "managed gateway". I'm hoping I'm not the only customer who is experiencing these chronic network problems. In my opinion, the (expensive) workaround (three implicit retries) is done for Microsoft's own sake. It is not in the interest of customers to simply mask over the underlying networking bugs, since there is a cost that will be paid whether the bug is seen or unseen.
Your comment speaks about "improving this feature" but we will not agree on that language. If every customer of the gateway is impacted by these chronic socket exceptions to this degree, then the right word to use for these retries is "workaround" (rather than "feature").
>> I was not looking for a link to the ideas portal. I think most of the community is probably aware of that (and avoid it).
I understand the sentiment, and I agree. Nevertheless - "If you didn't vote you don't have the right to complain".
I found that raising a Pro ticket will get you answers (either way) in a reasonable amount of time.
If you have a Pro license you can open a Pro ticket at https://admin.powerplatform.microsoft.com/newsupportticket/powerbi
Otherwise you can raise an issue at https://community.fabric.microsoft.com/t5/Issues/idb-p/Issues .
@lbendlin
>> raising a Pro ticket will get you answers
I already have a pro ticket open and it has been open for the past three years (since shortly after the preview started). As I understand, the underlying socket exceptions are one of the main reasons why it took three years for this "managed" gateway to become GA.
My hope was that the network bugs themselves would be fixed by GA, but I don't think the Power BI team has control over bugs in the Azure platform itself. So the bugs remain, and customers will pay the price for this particular workaround.
While I understand that Microsoft is a big company, and that the Power BI team doesn't control what happens in other teams, I think they could have at least given us a way to disable retries instead of obscuring the problem. No problem is ever solved by hiding it from sight. You just replace one problem with another. This retry functionality should be dialed down to zero by default, and customers should be the ones to increase it to 3 or 30 depending on how bad the network bugs are, and how motivated they are to avoid them at the price of their Azure bills. Retries should be customizable, if needed. That is how things are working in all of the other platforms that are impacted by the same VNET bugs (eg. ADF and Synapse). I think Microsoft often uses "kid gloves" when building software for Power BI users, and it is not always in our best interest. Some of us want to know when our queries are executed again and again and again, before the Azure bills and Power BI bills are sent out.
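To illustrate what I mean by "dialed down to zero by default", here is a minimal Python sketch of a retry wrapper. This is not how the gateway is implemented; `run_with_retries`, its parameters and the log message are all hypothetical. The point is only that the retry count is an explicit customer setting and that every retry leaves a visible trace.

```python
import logging
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")
log = logging.getLogger("refresh_retry")


def run_with_retries(
    query: Callable[[], T],
    max_retries: int = 0,  # default: no hidden retries
    retriable: Tuple[Type[BaseException], ...] = (ConnectionError,),
) -> T:
    """Run `query`, retrying only as often as the caller asked for.

    Each retry is logged as a warning so the extra executions (and the
    extra cost) can be seen instead of being masked.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be zero or more")
    failures = 0
    while True:
        try:
            return query()
        except retriable as exc:
            failures += 1
            if failures > max_retries:
                raise  # out of retries: surface the real error
            log.warning("attempt %d failed (%s); starting retry %d of %d",
                        failures, exc, failures, max_retries)
```

With the default of zero the first socket exception ("connection reset by peer" is a `ConnectionResetError`, which is a `ConnectionError`) fails the refresh immediately. A customer who prefers to pay for retries would pass `max_retries=3` or `max_retries=30` and would still see each one in the log.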
Has your Microsoft rep ever mentioned that you can file a DCR?
I'm assuming you are talking about the CSS engineers? No they have not. I almost always give up on CSS cases and move them to "unified" if I'm not getting anywhere.
In this case the CSS team has been very engaged and sympathetic, only they blame the problems on a different organization at Microsoft. They haven't yet put me in touch with that other organization (private VNET team that manages private endpoints). It is sort of a catch-22 where the PBI team won't help, and neither will they refer me to someone that can help.
I might move this case to unified too, after three years. I feel a bit bad about doing that but it seems necessary, now that the GA is here and Microsoft seems content with their workaround.