c_lovestrom
Frequent Visitor

What if a gateway update causes errors?

My manager is concerned that one day we'll update the on-premises data gateway that most of our Power BI service connections use, and the update will contain a bug that causes all our semantic model refreshes to fail. If that happens, there's no way to reverse the update, and we'd be forced to wait for a fix from Microsoft.

 

He'd like to have a way to test the updates before rolling them out onto our production gateway (or have a way to reverse any updates installed, but I'm doubtful that's possible).

 

Is there a viable way to do this?


9 REPLIES
Sukan1719
New Member

@c_lovestrom  Your manager is right. If an update introduces a bug (which has happened in the past), there is no simple "Undo" button in the interface to revert to the previous state.

However, there is a very viable, standard industry approach to solving this. You don't need to wait for Microsoft to fix it if you implement a Staging Strategy and a Disaster Recovery Plan.

Here is the breakdown of how to architect this safely.


1. The "Canary" Method: A Dev/Test Gateway

The most effective way to test updates before they hit production is to establish a separate, non-production Gateway.

  • How it works: You set up a Gateway on a separate server (or VM) specifically for development or testing.

  • The Process:

    1. When a new Gateway version is released, install it on the Dev Gateway first.

    2. Have a specific Power BI workspace configured to use this Dev Gateway with a few critical Semantic Models (datasets) that represent your typical data sources (SQL, Oracle, etc.).

    3. Run refreshes on these test models.

    4. If the refreshes succeed and the data looks correct, wait 24–48 hours to ensure stability.

    5. Only then apply the update to the Production Gateway.
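
The promotion check in steps 3 to 5 can be sketched as a small gate. This is a minimal illustration, not an official tool: the function name `canary_passed` is hypothetical, and it assumes the test models' refresh history is already available as a list of records with `status` and `startTime` fields (the shape I would expect from the Power BI REST API's refresh-history endpoint). Fetching that history is left out.

```python
from datetime import datetime, timedelta, timezone


def canary_passed(refreshes, updated_at, now, soak_hours=24):
    """Return True when the test gateway looks safe to promote.

    refreshes:  list of dicts with "status" and "startTime" keys (assumed shape).
    updated_at: timezone-aware datetime when the test gateway was updated.
    now:        timezone-aware current time.
    soak_hours: how long to wait after the update before promoting.
    """
    # Only refreshes that started after the update count as evidence.
    after_update = [
        r for r in refreshes
        if datetime.fromisoformat(r["startTime"].replace("Z", "+00:00")) >= updated_at
    ]
    if not after_update:
        return False  # nothing has exercised the new version yet
    if any(r["status"] != "Completed" for r in after_update):
        return False  # a failed or unfinished refresh blocks promotion
    # Refreshes are clean; still require the stability window to elapse.
    return now - updated_at >= timedelta(hours=soak_hours)
```

A successful refresh only shows that the gateway ran; checking that the data "looks correct" still needs a manual or model-specific comparison.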

2. The Rollback Plan (The "Time Machine")

Your manager is doubtful about reversing updates. He is technically correct that there is no "Rollback" button, but you can manually revert if you prepare correctly.

The Golden Rule: Always keep the installation file (GatewayInstall.exe) of your current working version before you update.

How to revert if an update fails:

  1. Uninstall: You must completely uninstall the broken, new version of the Gateway from the server via Windows "Add or remove programs."

  2. Reinstall: Run the installer for the previous version (which you saved).

  3. Restore: During installation, it will ask if you want to register a new gateway or Restore / Takeover an existing gateway. Choose Restore.

  4. Key Entry: You will be prompted for your Gateway Recovery Key. Note: If you do not have this key saved, you cannot restore the gateway, and you will have to create a new one from scratch and re-map all data sources in Power BI Service.

Crucial Requirement: Ensure your Recovery Key is saved in a secure password manager. Without it, a rollback is impossible.
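
The two rollback prerequisites above (a saved installer and a known recovery key) can be enforced with a short pre-update script. This is a sketch under stated assumptions: the helper names `archive_installer` and `ready_to_update`, the archive naming scheme, and the version label are all hypothetical, and the recovery key itself is never handled by the code, only a yes/no confirmation that someone has verified it.

```python
import hashlib
import shutil
from pathlib import Path


def archive_installer(installer, archive_dir, version_label):
    """Copy the current working installer into the archive under a versioned name.

    Returns the archived path and its SHA-256 digest so the copy can be
    verified later, before it is used for a reinstall.
    """
    installer = Path(installer)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"GatewayInstall_{version_label}{installer.suffix}"
    if target.exists():
        # Never overwrite a known-good installer silently.
        raise FileExistsError(f"{target} is already archived")
    shutil.copy2(installer, target)
    digest = hashlib.sha256(target.read_bytes()).hexdigest()
    return target, digest


def ready_to_update(archived_installer, recovery_key_confirmed):
    """Allow the update only when both rollback prerequisites are in place."""
    return Path(archived_installer).is_file() and bool(recovery_key_confirmed)
```

Recording the digest alongside the file is a design choice, not a gateway requirement: it makes it easy to confirm months later that the archived installer has not been corrupted.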

3. High Availability (HA) Clusters

To prevent downtime during the update process itself, you should use Gateway Clustering.

  • Concept: You install the Gateway on two or more servers and join them to the same Gateway "Cluster."

  • The Update Workflow:

    1. Take Node A out of rotation, or simply stop its gateway service and let the cluster route requests to the remaining member.

    2. Update Node A.

    3. If Node A comes back online and works, update Node B.

    4. If Node A fails, Node B is still running the old version and handling the traffic.

Note: HA protects against the server going down, but it is less effective against "bad updates" than the Dev/Test Gateway method. If you update Node A and it reports "Healthy" to the service but creates data errors due to a bug, the Cluster might still route queries to it. Therefore, Strategy #1 (Dev/Test Gateway) is superior for detecting bugs.
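
The one-node-at-a-time workflow can be sketched as a loop that stops at the first member that fails validation. The function `rolling_update` and its `update` and `validate` callbacks are hypothetical placeholders for whatever your team actually does to install the new version and check a node. In line with the note above, `validate` should run a real test refresh and inspect the result rather than trust the node's reported health.

```python
def rolling_update(nodes, update, validate):
    """Update cluster members one at a time and stop at the first bad node.

    nodes:    ordered list of node names.
    update:   callable that installs the new version on one node.
    validate: callable returning True if the node works after the update.

    Returns (updated, failed), where failed is None if every node passed.
    """
    updated = []
    for node in nodes:
        update(node)
        if not validate(node):
            # Nodes not yet touched stay on the old version and keep
            # serving requests while the failed node is rolled back.
            return updated, node
        updated.append(node)
    return updated, None
```

For example, `rolling_update(["A", "B"], install, check)` leaves Node B untouched if Node A fails its check.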


Summary Recommendation for Your Manager

To satisfy your manager's request for safety, propose this standard operating procedure (SOP):

The "Safe-Update" Protocol

  1. Maintain an Archive: Create a folder on your network drive containing the installer files for the last 3 months of Gateway releases (e.g., Gateway_Install_Sept2024.exe, Gateway_Install_Oct2024.exe).

  2. Verify Recovery Keys: Confirm that the Gateway Recovery Key is known and accessible.

  3. Implement the 3-Day Rule:

    • Day 1: Microsoft releases update. Do not install. Download it and install it on the Test Gateway.

    • Day 2: Check Test Gateway refreshes. Check the Power BI Community forums to see if other users are reporting massive failures.

    • Day 3: If tests pass and forums are quiet, install the update on the Production Gateway.
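
The 3-Day Rule can be written down as a tiny decision function, which is handy if you want a scheduled job or a checklist to state the current step. The function name `next_step` and the returned strings are illustrative assumptions. Whether tests passed and whether the forums are quiet remain human judgments passed in as flags.

```python
from datetime import date


def next_step(release_day, today, test_refreshes_ok=False, forums_quiet=False):
    """Map the 3-Day Rule onto calendar dates.

    Day 1 is the release day, Day 2 the day after, Day 3 anything later.
    """
    days_since_release = (today - release_day).days
    if days_since_release <= 0:
        return "install on test gateway only"
    if days_since_release == 1:
        return "review test refreshes and forum reports"
    if test_refreshes_ok and forums_quiet:
        return "install on production gateway"
    # From Day 3 on, keep holding until both checks are satisfied.
    return "hold production update"
```

The three-day window is the poster's suggestion. A longer wait only changes the thresholds, not the shape of the check.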


v-hashadapu
Community Support

Hi @c_lovestrom , Hope you are doing well. Kindly let us know if the issue has been resolved or if further assistance is needed. Your input could be helpful to others in the community.

R1k91
Super User

As @lbendlin perfectly stated, you cannot revert, so we use the same technique for all our customers, whether or not they have a cluster. We store the installer for each monthly version of the gateway, and we back up all the configuration files before upgrading. In an emergency, we uninstall the gateway and install the older version with the correct configuration files.

 

Another approach we have used for some other clients is to snapshot the VM before upgrading and keep the snapshot until we have the green light.

Having a test environment, using a cluster, not installing a version that was released only minutes ago, and so on are obviously good suggestions too.


--
Riccardo Perico
BI Architect @ Lucient Italia | Microsoft MVP


If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
v-hashadapu
Community Support

Hi @c_lovestrom , Thank you for reaching out to the Microsoft Community Forum.

 

We find the answer shared by @lbendlin appropriate. Could you please confirm whether the solution worked for you? It will help others with similar issues find the answer easily.

 

Thank you @lbendlin  for your valuable response.

lbendlin
Super User

Your gateway should always have more than one cluster member.  Then you can update the cluster members one by one and still have a working gateway if an upgrade fails. NOTE:  YOU CANNOT REVERT !!! You will have to uninstall and reinstall and reconfigure.

 

Pro tip:  DO NOT update immediately when a new version comes out.  Wait a month or five and monitor the forums here to see if there are any glaringly obvious issues.

 

[Screenshot: lbendlin_0-1763159647816.png]

 

I've got multiple VMs in our gateway cluster, so all good there. I'm curious about your update strategy, though. From what I've seen, new updates come out monthly, so if I waited 5 months to install the June update, wouldn't I be downloading the October or November update instead when I went to get it in November?

As you can see from my screenshot there can be many, many corrections issued between regular releases.  September 2025 had five versions, for example, and the latest August version was sneakily inserted AFTER the first September version was released. 

 

Of course you can draw your own conclusions from that, but we have been badly burned by this in the past and are now extra, extra cautious: we update a canary first and then wait to see how it behaves. There are tons of feature changes between versions, nearly none of them documented (the latest 0.20 Spark drama, for example).

 

There be dragons.

Thank you! Do you know where I could find some feedback on various versions? My google attempts haven't pulled up anything substantial.

Zanqueta
Solution Sage

Hi @c_lovestrom,

Yes, your manager's concern is valid, as there is no simple "rollback" button for a bad gateway update. The correct solution is to never update your production gateway first.

You can mitigate this risk in two ways:

  1. Create a Test Environment:

    • Install a separate "Test" gateway on a different server (a VM is fine).

    • Connect this gateway to a test workspace in Power BI containing copies of your critical semantic models.

    • When a new update is released, install it on this "Test" gateway first.

    • Validate that all refreshes succeed and the data is correct. Only after confirming it's safe do you then update your main production gateway.

  2. Use a High-Availability (HA) Cluster (Best Practice):

    • Instead of one production gateway (a single point of failure), install the gateway on at least two servers and configure them as a "cluster".

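    • Power BI can balance the refresh load between them when the cluster's request-distribution option is enabled; otherwise requests go to the primary member and fail over to the other member when the primary is offline.

    • During an update, you update one server (node) at a time. While one node is updating, refreshes are routed to the other, so the gateway stays available throughout the update.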

 

If this response solved your problem, please mark it as correct to help other community members.
