Hi, does anyone have an idea of when the "Copy data" activity might work for on-prem data sources (in our case SQL Server)? I know we can use Gen2 dataflows for this - but they are very slow. Really hoping copy data is significantly faster.
Hi all, quick updates on this topic.
1. Per the Fabric roadmap ( https://learn.microsoft.com/en-us/fabric/release-plan/data-factory#opdg ), the ability to "fast" copy from on-prem to the Fabric service is estimated for Q1 2024. Note, however, that this talks about requiring the gateway servers to be "behind a VNet". No clue exactly what this means or how difficult it is to set up.
2. During Ignite, Microsoft announced database mirroring ( https://blog.fabric.microsoft.com/en-us/blog/introducing-mirroring-in-microsoft-fabric?ft=All ), a kind of next-gen change data capture. On-prem SQL Server is not supported yet but is one of the next databases they'll be targeting.
Hope this helps,
Hi, just for clarity: I think the above note about requiring that the gateway servers be "behind a VNet" is referenced in the Fabric link above:
"This feature will enable data pipelines to use Fabric data gateways to access data that is on-premises and behind a VNet." I have emphasized the "and" because I think this should be an OR, i.e. these are two different scenarios where gateways can be used.
@Dhicks that would be awesome if true. I'm a simple BI guy and don't really understand VNets, but just glancing through the docs I could feel the heartburn of having to get this approved by security.
Appreciate the insights!
Datamarts are your friend here - it sounds odd, but they are a great tool in the Fabric toolbox.
Create a Datamart and load your data from your on-prem SQL Server into the Datamart.
Then, in a pipeline, use a copy activity to move the data from the Datamart (which appears as an Azure SQL Server) to the target Warehouse.
It works really well, and we use it extensively as Gen2 Dataflows do not have the reliability or performance that we need.
@BryanCarmichael - curious how do you get your data into the datamart performantly? I thought those used dataflows, which is still a huge performance issue for us. I need to move just over a Terabyte nightly. Understand once it's in a datamart it might be fast, but not sure how to get the data in there quickly to start with.
Hi - We use dataflows to load to datamarts - they are reasonably fast, though they do have a 100 GB limit.
Do you need to load all of that terabyte of data each day? Could you not just do a delta load to the datamart?
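To illustrate the delta-load idea above: a common pattern is a watermark query, where each nightly run only pulls rows changed since the last successful load. A minimal sketch in Python (the table and column names here are hypothetical, not from the thread):

```python
def build_delta_query(table: str, watermark_col: str, last_watermark: str) -> str:
    """Build an incremental extract query that only pulls rows changed
    since the last successful load (watermark pattern)."""
    return (
        f"SELECT * FROM {table} "
        f"WHERE {watermark_col} > '{last_watermark}'"
    )

# Hypothetical example: only rows modified since the last nightly run
query = build_delta_query("dbo.Sales", "ModifiedDate", "2023-11-20 01:00:00")
print(query)
# → SELECT * FROM dbo.Sales WHERE ModifiedDate > '2023-11-20 01:00:00'
```

In practice you would persist the watermark (e.g. the max `ModifiedDate` seen) after each successful run and feed it into the next one; this usually shrinks a terabyte-scale full load down to whatever actually changed that day.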
We are working on enabling the on-premises data gateway (OPDG) for the Copy activity, to provide the same capabilities in Fabric data pipelines as exist today in ADF with the self-hosted integration runtime (SHIR). The plan is to bring this capability to Fabric very soon, during public preview.
hi @makromer , just wanted to touch base and see if there are any updates on when the fast copy capability to move data from on-prem SQL Server to Fabric might arrive. Coming soon?
What sort of transfer rates are you getting in Gen2 dataflows? Mine just took an hour to load a 2 GB table from on-prem SQL. Trying to work out if it's our environment or just how it is.
FYI @bcdobbs - the 2 GB/hr seems roughly in line with what we get for stuff going through the gateway server. That's with a 3-gateway cluster, but I haven't spent any time getting fancy with things like trying to partition the data (via incremental refresh) to get multiple loads running in parallel.
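The partitioning idea mentioned above (splitting a table so multiple loads run in parallel) can be sketched as simple range-chunking on a numeric key. This is a generic illustration, not anything Fabric-specific; the key range is hypothetical:

```python
def partition_ranges(min_id: int, max_id: int, partitions: int):
    """Split a numeric key range into roughly equal chunks so each
    partition can be loaded by a separate refresh/copy in parallel."""
    total = max_id - min_id + 1
    size = -(-total // partitions)  # ceiling division
    ranges = []
    start = min_id
    while start <= max_id:
        end = min(start + size - 1, max_id)
        ranges.append((start, end))
        start = end + 1
    return ranges

# Hypothetical 1M-row key range split across 4 parallel loads
print(partition_ranges(1, 1_000_000, 4))
# → [(1, 250000), (250001, 500000), (500001, 750000), (750001, 1000000)]
```

Each (lower, upper) pair would become a WHERE clause for one of the parallel loads; with a 3-gateway cluster, running three or four such chunks concurrently is where the throughput gain would come from.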
Hope this helps,
I'll send you some info once we get the firewalls opened. What I can say is this - if I look at Gen1 dataflow performance, it's nowhere even close to what the copy activity is supposed to do. Based on this documentation ( https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-performance ), I should be able to move our entire DW over in under an hour. That's orders of magnitude faster than what Gen1 dataflows feel like.
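For a back-of-envelope sense of why the gap matters, here is the arithmetic: at the ~2 GB/hr gateway rate reported earlier in this thread, a 1 TB nightly load is hopeless, while even a modest bulk-copy rate changes the picture entirely. The 60 GB/hr figure below is a hypothetical illustration, not a documented copy-activity benchmark:

```python
def transfer_hours(size_gb: float, rate_gb_per_hr: float) -> float:
    """Estimate transfer time in hours for a given data size and throughput."""
    return size_gb / rate_gb_per_hr

# 1 TB (1024 GB) nightly load:
print(transfer_hours(1024, 2))   # → 512.0 hours at the ~2 GB/hr gateway rate
print(transfer_hours(1024, 60))  # ≈ 17.1 hours at a hypothetical 60 GB/hr
```

The point is simply that the dataflow-through-gateway rate is not in the same league as bulk copy; the exact copy-activity numbers depend on source, sink, and parallelism, as the linked performance guide describes.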
Hope this helps!
Hey Scott! Saw your comment in another thread about supporting on-prem connectivity via pipelines - this is a top priority to bring to Fabric as part of the team's path to General Availability. There will also be some improvements to Dataflows to help with ingestion (leveraging the same underlying copy activity / fast copy infrastructure as pipelines). There are still a few items to resolve to get us there, but you can help the team out by voting over on Fabric ideas so they can best prioritize the necessary work: https://aka.ms/fabricideas
Fabric has now gone to General Availability - do you have any idea of the timeframe for supporting the Copy data activity in pipelines through an on-premises data gateway?
Hi @AlexPowers ,
Thank you for this statement, as it makes me feel much better: "this is a top priority to bring to Fabric as part of the team's path to General Availability."
As of today, is there any news on when on-prem connectivity via pipelines will become available?
hi @AlexPowers , just wanted to follow up and see if there's been any progress on using pipelines and the copy activity to bring on-prem SQL Server data into Fabric. It's been a couple of months, hoping we're getting close. Gen2 dataflow performance is just too slow for the amounts of data we're hoping to move to Fabric.
Do you know if the copy data activity in pipelines to on-prem systems is available yet? And if not, are there any timescales as to when it will be?
Hi @AlexPowers , appreciate the reply - thank you very much! I'll definitely vote in the Fabric ideas area, but I'll also throw out there: without this capability, you're really limiting your customer base to those already buying Azure from you, and it doesn't help attract new customers who (like us) can't easily move our on-prem SQL Server to the cloud.
I appreciate the response - thank you!