Vishnu_G
Regular Visitor

Power BI On-Premises Data Gateway Service Keeps Stopping on AWS EC2 with Custom Connector

I'm running into an issue with the Power BI On-Premises Data Gateway installed on an AWS EC2 instance (Windows Server). I've developed a custom connector and configured it correctly using the standard steps, placing the .mez file in the appropriate Custom Connectors folder and updating the gateway settings.

Everything works fine initially. The connector is detected and I can refresh the dataset from Power BI (Fabric). However, the problem appears with scheduled refreshes: after a few of them, the refresh fails and the Power BI Service shows that the gateway is offline or can't connect.

I logged into the EC2 instance and noticed that the gateway service (PBIEgwService) has stopped running. Restarting the service manually and re-adding the connector path resolves it temporarily, but the issue keeps recurring.

1 ACCEPTED SOLUTION
v-kathullac
Community Support

Hi @Vishnu_G ,

Thank you for reaching out to Microsoft Fabric Community Forum.

 

  • Avoid unnecessary use of Table.Buffer in your custom connector as it loads entire datasets into memory, increasing RAM consumption.
  • Use Table.Buffer only after applying all filters and transformations to reduce memory pressure.
  • Replace the current t3.large EC2 instance (burstable) with a more suitable one like m5.2xlarge (8 vCPU, 32 GB RAM) or r6a.xlarge (4 vCPU, 32 GB RAM) for better performance.
  • Use a PowerShell script to monitor and restart the PBIEgwService if it crashes or stops:

# Check the gateway service and start it again if it is not running
$service = Get-Service -Name "PBIEgwService"
if ($service.Status -ne "Running") {
    Restart-Service -Name "PBIEgwService" -Force
}

 

  • Schedule the PowerShell script using Windows Task Scheduler to run every 5 or 10 minutes.
  • Optimize queries to push data transformations to the source database instead of performing them in Power Query.

  • Enable gateway diagnostic logging to track CPU/memory spikes and refresh bottlenecks.

  • Allow outbound traffic on the ports the gateway needs (e.g., TCP 443, 5671, 5672, 9350–9354) to avoid TCP connection errors like 10061.

  • Set up staggered refresh schedules in Power BI Service to avoid resource spikes during concurrent refreshes.

  • Configure a daily scheduled task to stop and start the gateway service during off-hours to clear memory and refresh service state.

  • Keep the On-Premises Gateway updated to the latest version for performance and stability improvements.

  • Consider configuring a gateway cluster for load balancing and high availability if refresh loads remain high.
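
The scheduling step above can be sketched as follows. This is only an illustration, not an official procedure: it assumes the watchdog script has been saved to the hypothetical path C:\Scripts\Restart-Gateway.ps1, the task name PBIGatewayWatchdog is made up, and the commands need an elevated PowerShell session. On older Windows Server versions you may also need to pass -RepetitionDuration to the trigger.

```powershell
# Assumption: the watchdog script from this answer is saved at this hypothetical path.
$scriptPath = 'C:\Scripts\Restart-Gateway.ps1'

# Run the script with Windows PowerShell, without loading a user profile.
$action = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument "-NoProfile -ExecutionPolicy Bypass -File `"$scriptPath`""

# Start now, then repeat every 5 minutes.
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) `
    -RepetitionInterval (New-TimeSpan -Minutes 5)

# Run as SYSTEM so the task can control services when nobody is logged in.
$principal = New-ScheduledTaskPrincipal -UserId 'NT AUTHORITY\SYSTEM' `
    -LogonType ServiceAccount -RunLevel Highest

# 'PBIGatewayWatchdog' is a hypothetical task name.
Register-ScheduledTask -TaskName 'PBIGatewayWatchdog' `
    -Action $action -Trigger $trigger -Principal $principal
```

A watchdog like this only masks the crash. It is still worth finding out why the service stops, using the logging and sizing points above.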

Regards,

Chaithanya.

 


6 REPLIES 6
v-kathullac
Community Support

Hi @Vishnu_G 

I hope this information is helpful. Please let me know if you have any further questions or if you'd like to discuss this further. If this answers your question, please Accept it as a solution and give it a 'Kudos' so others can find it easily.

Thank you.


SupriyaChaugule
Regular Visitor

Hi @v-kathullac,

Thank you for the detailed recommendations.

We are currently running the On-Premises Data Gateway on a t3.large EC2 instance with 50 GB of RAM. However, during scheduled Power BI dashboard refreshes, we consistently observe CPU utilization reaching 100% and memory usage exceeding 90%. This often causes the gateway service to crash or become unresponsive.

In our custom connector, we are using:

bufferedTable = Table.Buffer(finalTable)

We're investigating whether the use of Table.Buffer may be causing excessive memory pressure or inefficient caching, especially when processing large datasets. We'd appreciate any guidance on whether Table.Buffer might be contributing to these issues and if there are best practices for managing memory more efficiently in such scenarios.

Additionally, we are encountering the following error in the gateway logs:

'Microsoft.PowerBI.DataMovement.Pipeline.GatewayClient.GatewayConfigurationClientException: Error retrieving gateway configuration. Please ensure your gateway is running on the local computer and it is updated to the latest version.
Exception message: Could not connect to net.tcp://{port}/powerbi/gatewayconfiguration/service. The connection attempt lasted for a time span of 00:00:02.0224416. TCP error code 10061: No connection could be made because the target machine actively refused it.
at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
at System.Net.Sockets.Socket.Connect(EndPoint remoteEP)
at System.ServiceModel.Channels.SocketConnectionInitiator.Connect(Uri uri, TimeSpan timeout) '

We have already configured the service recovery options in services.msc to restart the gateway on first, second, and subsequent failures, and we have also configured it to reset the failure count after 1 day. Despite these settings, the gateway service fails to restart automatically when it crashes—requiring manual intervention each time.

This further indicates a potentially critical issue either with how memory and resources are being handled in the custom connector, or with the overall system resource limits under load.

Could this behavior be tied to how our connector is managing in-memory data (e.g., use of Table.Buffer), or are there any other diagnostics or mitigation strategies you would recommend?

v-kathullac
Community Support

Hi @Vishnu_G ,

Thank you for reaching out to Microsoft Fabric Community Forum.

 

This is a common issue when using custom connectors with the Power BI On-Premises Data Gateway in cloud-hosted environments like AWS EC2. Below are a few points that can help fix the recurring service crash:

  1. Restarting the PBIEgwService manually temporarily resolves the issue, indicating the connector may be crashing the service.
  2. Custom connectors can cause the gateway to crash if they trigger unhandled exceptions or use unsupported libraries.
  3. Make sure TestConnection is implemented correctly in the custom connector.
  4. Avoid blocking or synchronous calls inside asynchronous methods.
  5. Ensure proper cleanup of memory and resources in the connector logic.
  6. Increase logging level to Verbose in GatewayCore.dll.config for detailed diagnostics.
  7. Review logs in C:\Users\<GatewayServiceAccount>\AppData\Local\Microsoft\On-premises data gateway\ for crash reasons.
  8. Set gateway service recovery options to auto-restart on failure using services.msc.
  9. Use ConnectorIsolationLevel = IsolationLevel.Process in extension.pqx to isolate the connector in its own process.
  10. Monitor CPU and memory usage on the EC2 instance during scheduled refreshes.
  11. Space out scheduled refreshes to avoid simultaneous executions if multiple datasets use the connector.
  12. Test connector logic in Power BI Desktop with tools like Fiddler or ProcMon for better debugging.
  13. Consider deploying complex logic to an external API (e.g., Azure Function) and use the standard Web API connector to call it.
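
Points 8 and 10 above can also be done from an elevated PowerShell prompt. The sketch below is a starting point under stated assumptions rather than a verified fix: the 60-second restart delay, the 15-second sampling interval, and the CSV file name are arbitrary choices, and the counter paths assume an English-language Windows installation. Note that Windows applies service recovery actions only when the service process terminates unexpectedly, unless the failure flag is also set. That may be relevant when the recovery options appear to be ignored.

```powershell
# Restart the service 60 s after each failure; reset the failure count after 1 day (86400 s).
sc.exe failure PBIEgwService reset= 86400 actions= restart/60000/restart/60000/restart/60000

# Also apply the recovery actions when the service stops itself with an error exit code.
sc.exe failureflag PBIEgwService 1

# Show the resulting recovery configuration.
sc.exe qfailure PBIEgwService

# Sample CPU and free memory every 15 s for one hour (240 samples) around a scheduled refresh.
$counters = '\Processor(_Total)\% Processor Time', '\Memory\Available MBytes'
Get-Counter -Counter $counters -SampleInterval 15 -MaxSamples 240 |
    ForEach-Object {
        # Flatten each sample set into one row per counter.
        $_.CounterSamples | Select-Object Timestamp, Path, CookedValue
    } |
    Export-Csv -Path "$env:TEMP\gateway-perf.csv" -NoTypeInformation
```

Comparing the timestamps in the CSV with the time the service stopped shows whether the crash coincides with CPU or memory exhaustion.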

Regards,

Chaithanya.
