madendever
New Member

ADF Pipeline: Snowflake in AWS Performing Direct PutBlob Operations Instead of Using SHIR

Hello,

 

I'm hoping someone may be able to provide some assistance with a question I have regarding the scenario below.

 

I've created an ADF pipeline that copies data from Snowflake (in AWS) to SQL Server (on an Azure VM), using Azure Blob Storage as the staging location. For this, I'm using a Self-Hosted Integration Runtime hosted on an Azure VM on the same network as the SQL Server. However, the result is that AWS performs PutBlob operations directly against the Azure Blob staging account instead of routing the data through the Self-Hosted Integration Runtime. Is this to be expected?

If so, this creates a security concern, as the blob container would need to be left open to the public internet for AWS to access it (or open to all AWS IPs in the region).

The AWS IP addresses hitting the blob don't even correspond to the DNS of the AWS Snowflake instance.

 

If anyone can answer this question or provide assistance I'd greatly appreciate it. 

 

1 ACCEPTED SOLUTION
nilendraFabric
Super User

Hello @madendever 

 

The behavior you are observing is expected when using Azure Data Factory (ADF) to copy data from Snowflake (hosted on AWS) to SQL Server (hosted on Azure) with Azure Blob Storage as the staging location. ADF optimizes data movement by letting Snowflake interact directly with Azure Blob Storage over the public internet, bypassing the Self-Hosted Integration Runtime (SHIR).

 

When using ADF’s Copy Activity, if both the source (Snowflake) and the staging location (Azure Blob Storage) are cloud-based and accessible via public endpoints, ADF uses direct cloud-to-cloud data transfer instead of routing through SHIR.

 

• “If your data store is a managed cloud data service, you can use the Azure Integration Runtime. If the access is restricted to IPs that are approved in the firewall rules, you can add Azure Integration Runtime IPs to the allowed list.”

Referenced from: https://learn.microsoft.com/en-us/azure/data-factory/connector-snowflake?tabs=data-factory

 

According to Microsoft’s Copy Activity documentation, when copying data between two publicly accessible cloud data stores, ADF uses the Azure Integration Runtime (Azure IR) by default. This integration runtime facilitates direct communication between the source and sink over public endpoints without involving SHIR.

The Copy Activity documentation notes that ADF determines which IR to use based on the connectivity of the source and sink:
• “An integration runtime needs to be associated with each source and sink data store.”

 

If both are accessible via public endpoints, ADF uses Azure IR for direct interaction between the services.
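As a rough sketch of that selection rule (a simplified model for illustration only, not ADF's actual implementation):

```python
def select_integration_runtime(source_public: bool, sink_public: bool) -> str:
    """Simplified model of ADF's runtime selection for one leg of a copy.

    If both endpoints are publicly reachable, ADF can use the Azure
    Integration Runtime and move data cloud-to-cloud directly; if either
    side is only privately reachable, that leg must run through a
    Self-Hosted Integration Runtime (SHIR).
    """
    if source_public and sink_public:
        return "AzureIR"  # direct cloud-to-cloud transfer
    return "SHIR"         # a private endpoint forces the self-hosted runtime

# Staged copy, leg 1: Snowflake (AWS, public) -> Blob staging (public).
# This leg bypasses SHIR entirely, which is the PutBlob traffic observed.
print(select_integration_runtime(True, True))   # AzureIR
# Leg 2: Blob staging -> SQL Server on a private VNet runs through SHIR.
print(select_integration_runtime(True, False))  # SHIR
```

This is why the SHIR on the Azure VM only participates in the staging-to-SQL-Server leg, not in Snowflake's writes into the staging account.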

 

To address security concerns:

 

1. Enable Private Connectivity:
• Use Azure Private Link for Azure Blob Storage or AWS PrivateLink for Snowflake to ensure traffic flows through private networks rather than public endpoints.
2. Restrict Access via Firewall Rules:
• Configure your storage account firewall to allow only the specific IP ranges used by Snowflake in your AWS region.
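For option 2, a minimal sketch (Python standard library only; the CIDR blocks and IPs below are placeholders, not real Snowflake ranges) of checking whether the client IPs seen in your storage logs fall inside an allow-listed range:

```python
import ipaddress

def ips_outside_allowlist(observed_ips, allowed_cidrs):
    """Return the observed client IPs not covered by any allowed CIDR block."""
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    return [
        ip for ip in observed_ips
        if not any(ipaddress.ip_address(ip) in net for net in networks)
    ]

# Placeholder values for illustration only. Obtain the real egress ranges
# for your Snowflake deployment from Snowflake itself (for example, on AWS,
# SYSTEM$GET_SNOWFLAKE_PLATFORM_INFO() reports the Snowflake VPC info)
# before configuring the storage firewall.
allowed = ["52.0.0.0/11", "34.192.0.0/12"]
observed = ["52.20.10.5", "203.0.113.7"]
print(ips_outside_allowlist(observed, allowed))  # ['203.0.113.7']
```

Any IP the check flags is traffic the firewall rules would block, which also helps explain why the IPs in the logs don't match the Snowflake account's DNS: the writes come from Snowflake's cloud-provider egress pool, not the account endpoint.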

 

Please see if this is helpful and, if so, accept the solution.

 

Thanks

 

 


4 REPLIES
madendever
New Member

Thanks for the reply. That was very informative. Unfortunately, the Snowflake instance is hosted by a third party, so there is no way to secure it.

@madendever Could you please accept the solution if it helped.

Hello @nilendraFabric, thanks for sharing this. We have a Snowflake instance running in an AWS VPC with a private link, and we are trying to access it from ADF by creating a linked service to it. When I try to create a managed private endpoint in ADF, the target resource IDs offered are only Azure ones. Since our data source is in an AWS VPC, do you know the best and most reliable way to connect? I know one option is to install a SHIR in the AWS VPC and use that SHIR. Is there any other way? Thanks.
