Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
ScottPowellECU
Advocate II
Advocate II

Copy data for on-prem SQL server

Hi, does anyone have an idea of when the "Copy data" activity might work for on-prem data sources (in our case SQL Server)? I know we can use Gen2 dataflows for this - but they are very slow. Really hoping copy data is significantly faster.

 

Thanks!

Scott

21 REPLIES 21
Scott_Powell
Advocate III
Advocate III

Hi all, quick updates on this topic.

 

1. Per the Fabric roadmap ( https://learn.microsoft.com/en-us/fabric/release-plan/data-factory#opdg ), the ability to "fast" copy from on prem to the Fabric service is estimated for Q1 2024. Note however this talks about requiring that the gateway servers be "behind a vnet". No clue exactly what this means or how difficult this is to set up

 

2. During Ignite, Microsoft announced database mirroring ( https://blog.fabric.microsoft.com/en-us/blog/introducing-mirroring-in-microsoft-fabric?ft=All ), a kind of next-gen change data capture. SQL server on-prem is not ready yet but is one of the next databases they'll be targetting.

 

Hope this helps,

Scott

Hi just for clarity I think the above "Note however this talks about requiring that the gateway servers be "behind a vnet" " is referenced in the Fabric link above: 

"This feature will enable data pipelines to use Fabric data gateways to access data that is on-premises and behind a VNet." I have bolded the and because I think this should be an OR ie this is two different scenarios where gateways can be used.  

@Dhicks that would be awesome if true. I'm a simple BI guy, don't really understand VNets, but just glancing through them I could feel heartburn at having to get this approved by security.

 

Appreciate the insights!

Scott

BryanCarmichael
Advocate I
Advocate I

Datamarts are you friend here  - it sounds odd but they are a great tool in the Fabric toolbox

Create a Datamart and load you data from your on prem SQL server to the Datamart

Then in a pipeline use a copy activity to move the data from the datamart ( which appears as an Azure SQL Server) to the target Warehouse.

It works really well and we use it estensively as Gen2 Dataflows do not have the reliability or performance that we need.

你好,请问数据仓库从datamarts复制数据的具体配置是什么样的呢

@BryanCarmichael - curious how do you get your data into the datamart performantly? I thought those used dataflows, which is still a huge performance issue for us. I need to move just over a Terabyte nightly. Understand once it's in a datamart it might be fast, but not sure how to get the data in there quickly to start with.

 

Thanks!

Scott

 

Thanks,

Scott

Hi - We use dataflows to load to datamarts  - they are reasonably fast - do have a limit of 100 GB though.

 

Do you need to load all of that terrabyte of data each day ? - could you not just do a delta load to the datamart ?

rwolburg
New Member

Hi,

 

Can anyone can share the steps to make a "copy" of a table to the lakehose using dataflow while the feature in datapipe is available?

 

thanks in advanced

makromer
Microsoft Employee
Microsoft Employee

We are working on enabling OPDG for Copy Activity to provide the same capabilities in Fabric data pipelines as already existing today in ADF with SHIR. The plan is to bring this capability to Fabric very soon during public preview.

Hey @makromer, any updates on when this might be available?

 

Thanks.

hi @makromer , just wanted to touch base and see if there are any updates on when the fast copy capability to move data from on-prem SQL server to Fabric is. Coming soon? 

 

Thanks!

Scott

bcdobbs
Super User
Super User

What sort of transfer rates ar you getting in gen2 dataflows? Mine just took an hour to load a 2gb table from On Prem SQL. Trying to work out if it's our environment or just how it is.



Ben Dobbs

LinkedIn | Twitter | Blog

Did I answer your question? Mark my post as a solution! This will help others on the forum!
Appreciate your Kudos!!

FYI @bcdobbs  - the 2GB / hr seems roughly in line with what we get with stuff going through the gateway server. That's with a 3 gateway cluster but I haven't spent any time getting fancy with things like trying to partition the data (via incremental refresh) to get multiple loads running in parallel.

 

Hope this helps,

Scott

I'll send you some info once we get the firewalls opened. What I can say is this - if I look at Gen1 dataflow performance, it's nowhere even close to what the "COPY" utility is supposed to do. Based on this documentation ( https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-performance ) I should be able to move our entire DW over in under an hour. That's orders of magnitude faster than what Gen1 dataflows feels like.

 

Hope this helps!

Scott

AlexPowers
Microsoft Employee
Microsoft Employee

Hey Scott! Saw your comment in another thread with respect to being able to support on-prem connectivity via pipelines, this is a top priority to bring to Fabric as part of the team's path to General Availability. There will be some improvements brought to Dataflows to help with ingestion (leveraging the same underlying copy activity / fast copy infrastructure of pipelines) there are still few items to resolve to get us there but if you can help the team out by voting over on Fabric ideas so they can best prioritize the necessary  work: https://aka.ms/fabricideas 

Hi @AlexPowers

 

Fabric has now gone General Availability, do you have any idea of the timeframe to support Copy Data activity for pipelines, through a On Premise Data gateway?


Thanks

 

Hi @AlexPowers ,

Thank you for this statement as it makes me feel much better. "this is a top priority to bring to Fabric as part of the team's path to General Availability."

 

As-of today, is-there any news on when on-prem connectivity via pipelines would become available?

Hey @AlexPowers, any updates on when this might become available?

 

Thanks

hi @AlexPowers , just wanted to follow up and see if there's been any progress on using pipelines and the copy activity to bring on-prem SQL server data into Fabric. Been a couple of months, hoping we're getting close. Gen2 dataflow performance is just too slow for the amounts of data we're hoping to move to Fabric.

 

Thanks!

Scott

Hi Alex

Do you know if the copy data activity in pipelines to on prem systems is available yet?  And if not are there any timescales as to when it will be?

 

Thanks

 

Neil

Helpful resources

Announcements
Sept Fabric Carousel

Fabric Monthly Update - September 2024

Check out the September 2024 Fabric update to learn about new features.

Expanding the Data Factory Forums

New forum boards available in Data Factory

Ask questions in Apache Airflow Job and Mirroring.

September Hackathon Carousel

Microsoft Fabric & AI Learning Hackathon

Learn from experts, get hands-on experience, and win awesome prizes.

Sept NL Carousel

Fabric Community Update - September 2024

Find out what's new and trending in the Fabric Community.