I'm trying to run a very simple Gen2 Dataflow from an on-prem SQL Server.
Everything sets up fine, but when I ask it to refresh I get:
"Error: Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: We don't support creating directories in Azure Storage unless they are empty."
In addition to what's covered in the article (authorizing *.datawarehouse.pbidedicated.windows.net), port 1433 needs to be opened. We will be addressing the documentation gap shortly.
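A quick way to check the port 1433 requirement from the gateway machine is a plain TCP connection test. The sketch below is a minimal example, not an official diagnostic: `can_reach` is a hypothetical helper name, and the host you pass in has to be the actual endpoint for your environment, because the wildcard *.datawarehouse.pbidedicated.windows.net cannot be resolved as written.

```python
import socket


def can_reach(host: str, port: int = 1433, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration. It only tests the TCP handshake,
    not TLS or the SQL pre-login exchange that follows it.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False
```

Run it on the gateway server with your own endpoint host name. A result of False suggests a firewall or DNS problem between the gateway and the endpoint. A result of True only shows that the port is open, so a pre-login handshake error can still occur afterwards.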
We are getting an error while writing to the lakehouse. The source is an on-prem Oracle database. Without writing to the lakehouse, the Gen2 Dataflow refreshes successfully.
Mashup Exception Error Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: Microsoft SQL: A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.) Details: DataSourceKind = Lakehouse;DataSourcePath = Lakehouse;Message = A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.);ErrorCode = -2146232060;Number = 121;Class = 20 GatewayObjectId: cd3ddd2d-fd2e-43c2-b2eb-d5f50a748335
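For anyone triaging errors like the one above, the "Details:" part can be split into fields so the Number value is easy to spot. This is a minimal sketch that assumes the fields are separated by ";" and that no field value itself contains ";" or "=" (true for the messages in this thread, not guaranteed in general). `parse_details` and `WIN32_HINTS` are hypothetical names. The two hints are the standard Windows system error codes 53 and 121, and treating Number as a Windows error code is an assumption based on the message text matching.

```python
# Standard Windows system error codes that match the Number field in this thread.
WIN32_HINTS = {
    53: "ERROR_BAD_NETPATH: the network path was not found",
    121: "ERROR_SEM_TIMEOUT: the semaphore timeout period has expired",
}


def parse_details(error_text: str) -> dict:
    """Split the 'Details:' section of a mashup error into a dict of strings."""
    _, _, tail = error_text.partition("Details:")
    # Drop the trailing gateway identifier, which is not part of the field list.
    tail = tail.split("GatewayObjectId:")[0]
    fields = {}
    for part in tail.split(";"):
        key, sep, value = part.partition("=")
        if sep:  # skip fragments that are not key = value pairs
            fields[key.strip()] = value.strip()
    return fields


def hint_for(fields: dict) -> str:
    """Return a short hint for the Number field, or a fallback string."""
    try:
        number = int(fields.get("Number", ""))
    except ValueError:
        return "no numeric error code found"
    return WIN32_HINTS.get(number, "unknown code %d" % number)
```

Both codes point at the network path between the gateway and the destination endpoint rather than at the on-prem source, which is consistent with the firewall suggestions in the replies.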
This could be a firewall rule in your environment that is preventing communication between the gateway and one of the services required to run the dataflow. Please check the article below:
If this doesn't solve the scenario, please create a new topic / thread in the forum so that we can better understand your scenario.
There was a new version of the Gateway released today (May 30th). The version number is 3000.174.12. That version addresses the "We don't support creating directories in Azure Storage unless they are empty" error.
I've updated to 3000.174.12.
This has improved things in that the Gen2 Dataflow will now refresh when a destination is not specified. However, if I ask it to write to the lakehouse I now get:
“null Error: Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: Microsoft SQL: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server) Details: DataSourceKind = Lakehouse;DataSourcePath = Lakehouse;Message = A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server);ErrorCode = -2146232060;Number = 53;Class = 20 GatewayObjectId: 2bcf3e12-740c-47e3-86f3-b94626567200. .”
I wondered if I needed to configure the lakehouse connection to use the gateway but it won't let me do that.
When you're editing your dataflow, are you able to see the data from your SQL Server in there? Does your server require IPs to be in an allow list? Could it be the firewall at your data source level preventing the connection?
It's not the on-prem end causing the issue now. In Gen2 Dataflows, if I don't select a destination it pulls the data fine into its staging lakehouse. The problem occurs when I select a destination (lakehouse).
In this case, since the Gateway is involved, the Gateway is writing to the destination endpoint directly. This means that the Gateway (and the associated Mashup containers executed by the Gateway) needs visibility to the endpoints. This article discusses the necessary configurations: Configure proxy settings for the on-premises data gateway | Microsoft Learn
Hi @bcdobbs. Is this scenario working reliably for you now, after the extra (undocumented) firewall change? If not, would you mind sharing the Microsoft ticket #?
We are running into a very similar issue with the latest gateway. In our case the error disappears with very small data volumes.
Please check the documentation that we made available a couple of days ago:
@miguel Yes, I saw that document, thanks.
I do think it needs to be integrated into the main data gateway networking document (Adjust communication settings for the on-premises data gateway | Microsoft Learn), otherwise folks will keep misconfiguring their firewall.
I actually asked Microsoft about this topic on these forums just a few weeks ago, when we had another firewall issue. The rep claimed that the official data gateway documentation is complete and that no new rules are needed for Fabric 😀
Now, the key thing in our scenario is that this error only appears for larger data volumes. Data volumes below around 150k rows work fine. What can be causing this? I would've expected the firewall rules to either work or not, regardless of the data volumes (unless there is a fancy feature in the firewall that's based on data transferred (DLP?), connection times, etc.)
Also interestingly, everything works fine when staging is disabled. Even larger data volumes of around 1M rows transfer fine.
I do understand from Data Factory Spotlight: Dataflow Gen2 | Microsoft Fabric Blog | Microsoft Fabric that staging is inherently more complicated architecturally. Does Fabric automatically disable staging at runtime for small data volumes, even if staging is enabled in the dataflow?
That would explain the behaviour we are seeing, although I was unable to locate any reference to such a possible Fabric feature.
How do you disable staging in the Gen2 Dataflow? When I click on the enable staging button, it grays out PUBLISH and won't let me save.
We currently have a restriction where at least one query needs to be staged, but that restriction is being lifted. Depending on your region you may already have the change. We expect the change to be in all regions by next week.
If your region doesn't have the change yet, a temporary workaround would be to add one dummy query with staging enabled.
@Arentir Not yet. However, having put a ticket in with Microsoft, I got a phone call this afternoon and have a Teams call with an engineer tomorrow afternoon (3pm UK time).
I'm fairly certain it's an issue with the mashup engine inside the on-prem gateway. (I ran SQL Server Profiler and can see it grab the data fine.)
Will update when I have new info!
You can click on the get help option inside of Microsoft Fabric. For your convenience, I've also left the direct link below:
It's the "create support request" option for Data Factory.
What version of the gateway are you using? We recently released a new one (on Friday) that should solve cases like this one.
If the latest version is still giving you issues, please reach out to our support team so they can further assist.
Hi @miguel ,
Really odd. Having checked, I last updated my gateways on Thursday, which resulted in the 3000.178.3 version shown in the screenshot. I've now downloaded the latest version from Microsoft, which it initially wouldn't let me install because it was a lower number. After uninstalling everything and rebuilding the cluster, I'm now running 3000.174.11, which is the latest "correct" version I can see.
Unfortunately the error still persists. How do I contact your support team?
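On the installer refusing the "lower number": the version strings mentioned in this thread can be compared part by part to see which build counts as older. This is only a sketch, assuming the installer compares dotted versions numerically segment by segment, which nothing in this thread confirms. `parse_version` and `is_downgrade` are hypothetical helper names.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '3000.174.11' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_downgrade(installed: str, candidate: str) -> bool:
    """Return True if `candidate` is an older build than `installed`.

    Assumes plain numeric, segment-by-segment ordering.
    """
    return parse_version(candidate) < parse_version(installed)


# Versions quoted in this thread, ordered oldest to newest under that assumption.
versions = ["3000.178.3", "3000.174.12", "3000.174.11"]
ordered = sorted(versions, key=parse_version)
```

Under that ordering, going from 3000.178.3 to 3000.174.11 is a downgrade, which would explain why a full uninstall was needed before the installer accepted it.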