bcdobbs
Super User

Dataflow Gen2 Issue

Hi,

 

I'm trying to run a very simple Gen2 Dataflow against an on-prem SQL Server.

 

It all sets up fine, but when I ask it to refresh I get:
"Error: Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: We don't support creating directories in Azure Storage unless they are empty."

 

Thanks


Ben



Ben Dobbs

LinkedIn | Twitter | Blog

Did I answer your question? Mark my post as a solution! This will help others on the forum!
Appreciate your Kudos!!
20 REPLIES
SidJay
Employee

In addition to what's covered in the article (authorizing *.datawarehouse.pbidedicated.windows.net), port 1433 needs to be opened. We will be addressing the documentation gap shortly.
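
One quick way to verify the firewall rule from the gateway machine is a plain TCP connection test on port 1433. This is only a sketch: `can_reach` is an illustrative helper rather than part of any gateway tooling, and the host name in the usage comment is a hypothetical placeholder, since the wildcard `*.datawarehouse.pbidedicated.windows.net` has to be replaced with the actual SQL endpoint host of your own destination.

```python
import socket


def can_reach(host, port=1433, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened from this machine."""
    try:
        # create_connection resolves the name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example (hypothetical host name, replace with your own endpoint):
# print(can_reach("your-endpoint.datawarehouse.pbidedicated.windows.net"))
```

A successful connect only shows that the port is open; it does not prove the later login handshake will succeed. A `False` result, though, points at a firewall, proxy, or DNS problem rather than at the dataflow itself.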

 

Thanks

shhemanth
Frequent Visitor

We are getting an error while writing to the lakehouse. The source is an on-prem Oracle database. Without writing to the lakehouse, the Gen2 dataflow refreshes successfully.

 

Mashup Exception Error Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: Microsoft SQL: A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.) Details: DataSourceKind = Lakehouse;DataSourcePath = Lakehouse;Message = A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.);ErrorCode = -2146232060;Number = 121;Class = 20 GatewayObjectId: cd3ddd2d-fd2e-43c2-b2eb-d5f50a748335

This could potentially be a firewall rule in your environment that is preventing communication between the gateway and one of the services required to run the dataflow. Please check the article below:

Configure proxy settings for the on-premises data gateway | Microsoft Learn

 

If this doesn't solve the scenario, please create a new topic / thread in the forum so that we can better understand your scenario.

SidJay
Employee

There was a new version of the Gateway released today (May 30th). The version number is 3000.174.12. That version addresses the "We don't support creating directories in Azure Storage unless they are empty" error.

I've updated to 3000.174.12.

 

This has improved things in that the Gen2 Dataflow will now refresh when a destination is not specified. However, if I ask it to write to the lakehouse I now get:

 

“null Error: Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: Microsoft SQL: A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server) Details: DataSourceKind = Lakehouse;DataSourcePath = Lakehouse;Message = A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: Named Pipes Provider, error: 40 - Could not open a connection to SQL Server);ErrorCode = -2146232060;Number = 53;Class = 20 GatewayObjectId: 2bcf3e12-740c-47e3-86f3-b94626567200. .”
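
The `Details` portion of these errors is a semicolon-separated `Key = Value` list, and the `Number` field helps tell the failures apart. Below is a rough sketch for pulling the fields out, assuming the format stays as quoted in this thread. The helper name and the descriptions in `KNOWN_NUMBERS` are illustrative; 53 and 121 are the standard Windows error codes for "network path not found" and "semaphore timeout", and the sample message is shortened for readability.

```python
import re

# Windows error codes seen in the "Number" field of the errors quoted in this thread.
KNOWN_NUMBERS = {
    53: "network path not found (endpoint unreachable or name not resolved)",
    121: "semaphore timeout (connection opened, then stalled)",
}

# A field is "Key = value", running up to the next ";Key = " or the end of the string.
FIELD = re.compile(r"(\w+) = (.*?)(?=;\w+ = |$)")


def parse_details(details):
    """Split the 'Details:' part of a mashup error into a dict of fields."""
    # Drop the trailing gateway id, which is not part of the key/value list.
    details = details.split(" GatewayObjectId:")[0]
    return {key: value.strip() for key, value in FIELD.findall(details)}


sample = (
    "DataSourceKind = Lakehouse;DataSourcePath = Lakehouse;"
    "Message = Could not open a connection to SQL Server;"
    "ErrorCode = -2146232060;Number = 53;Class = 20 "
    "GatewayObjectId: 2bcf3e12-740c-47e3-86f3-b94626567200"
)
fields = parse_details(sample)
print(fields["Number"], "->", KNOWN_NUMBERS.get(int(fields["Number"]), "unknown"))
```

Both numbers describe network-level failures between the gateway and the destination endpoint, which is consistent with the source query itself working.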

 

I wondered if I needed to configure the lakehouse connection to use the gateway but it won't let me do that.

 

Ben




When you're editing your dataflow, are you able to see the data from your SQL Server in there? Does your server require IPs to be in an allow list? Could it be the firewall at your data source level preventing the connection?

It's not the on-prem end causing the issue now. In Gen2 Dataflows, if I don't select a destination, it pulls the data fine into its staging lakehouse. The problem occurs when a destination (lakehouse) is selected.




In this case, since the Gateway is involved, the Gateway is writing to the destination endpoint directly. This means that the Gateway (and the associated Mashup containers executed by the Gateway) need visibility to the endpoints. This article discusses the necessary configurations: Configure proxy settings for the on-premises data gateway | Microsoft Learn

bcdobbs
Super User

Microsoft have confirmed it's a bug and they're working on it. 




Hi @bcdobbs . Is this scenario working reliably for you now, after the extra (undocumented) firewall change? If not, would you mind please sharing the Microsoft ticket #?

 

We are running into a very similar issue with the latest gateway. In our case the error disappears with very small data volumes.

 

Thank you!

Hey!

Please check the documentation that we made available a couple of days ago:

https://learn.microsoft.com/en-us/fabric/data-factory/gateway-considerations-output-destinations

@miguel Yes, I saw that document, thanks.

 

I do think it needs to be integrated into the main data gateway networking document (Adjust communication settings for the on-premises data gateway | Microsoft Learn), otherwise folks will keep misconfiguring their firewall.

 

I actually asked Microsoft on these forums about this topic just a few weeks ago, when we had another firewall issue. The rep claimed that the official data gateway documentation is complete and that no new rules are needed for Fabric  😀

 

Now, the key thing in our scenario is that this error only appears for larger data volumes. Data volumes below around 150k rows work fine. What can be causing this? I would've expected the firewall rules to either work or not, regardless of the data volumes (unless there is a fancy feature in the firewall that's based on data transferred (DLP?), connection times, etc.)

 

Also interestingly, everything works fine when staging is disabled. Even larger data volumes of around 1M rows transfer fine.

 

I do understand from Data Factory Spotlight: Dataflow Gen2 | Microsoft Fabric Blog | Microsoft Fabric that staging is inherently more complicated architecturally. Does Fabric automatically disable staging at runtime for small data volumes, even if staging is enabled in the data flow?

 

That would explain the behaviour we are seeing, although I was unable to locate any reference to such a possible Fabric feature.

How do you disable staging in the Gen2 dataflow? When I click on the enable staging button, it grays out PUBLISH and won't let me save.

Hi Anthony,

 

We currently have a restriction where at least one query needs to be staged, but that restriction is being lifted. Depending on your region you may already have the change. We expect the change to be in all regions by next week.

 

If your region doesn't have the change yet, a temporary workaround would be to add one dummy query with staging enabled.

 

Thanks

Arentir
Resolver III

Hi @bcdobbs 

Did you get the issue resolved? I am facing the same error in a similar scenario (on-prem source, gateway with latest update)

@Arentir not yet. However, having put a ticket in with Microsoft, I got a phone call this afternoon and have a Teams call with an engineer tomorrow afternoon (3pm UK time).

 

I'm fairly certain it's an issue with the mashup engine inside the on-prem gateway. (I ran SQL Server Profiler and can see it grab the data fine.)

 

Will update when I have new info!



miguel
Community Admin

Hey!

You can click on the get help option inside of Microsoft Fabric. For your convenience, I've also left the direct link below:

https://support.fabric.microsoft.com/en-US/support

It's the "create support request" option for Data Factory.

miguel
Community Admin

Hey Ben!

 

What version of the gateway are you using? We recently released a new one (on Friday) that should solve cases like this one.

 

If the last version is still giving you issues, please reach out to our support team so they can further assist.

Hi @miguel ,

Really odd. Having checked, I last updated my gateways on Thursday, which resulted in the 3000.178.3 version shown in the screenshot. I've now downloaded the latest version from Microsoft, which it initially wouldn't let me install because it was a lower number. After uninstalling everything and rebuilding the cluster, I'm now running 3000.174.11, which is the latest "correct" version I can see.
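
The installer's refusal makes sense if version numbers are compared component by component as integers. Here is a small sketch of that comparison, using the three version numbers mentioned in this thread. `parse_version` is an illustrative helper, and how the installer actually compares versions is an assumption.

```python
def parse_version(text):
    """Turn '3000.178.3' into (3000, 178, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


installed = parse_version("3000.178.3")    # version in place after Thursday's update
downloaded = parse_version("3000.174.11")  # version offered by the download
fixed = parse_version("3000.174.12")       # version reported to contain the fix

# Tuples compare element by element, so 178 > 174 decides both comparisons.
print(downloaded < installed)
print(fixed < installed)
print(downloaded < fixed)
```

Under that assumption, 3000.178.3 ranks above both 3000.174.x builds, and 3000.174.11 sits one build below the 3000.174.12 release said to address the original error.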

 

Unfortunately the error still persists. How do I contact your support team?


Thanks


Ben




I think I might be running the latest version (the number is higher than what is listed in the blog):

[Screenshot: installed gateway version]

 



