
nikhil0511
Advocate I

Help related to copy activity in data pipelines

I have different databases, say db1, db2, db3, in SQL Server. I want to load multiple tables (always the same table names from each database) into a lakehouse.

 

Currently, I am writing a copy activity to load a table from one database. I am thinking of using a ForEach activity to achieve this, passing the databases as one array and the table names in another array.

 

How can I achieve this?

9 REPLIES
v-nuoc-msft
Community Support

Hi @nikhil0511 

 

Can you tell me if your problem has been solved? If so, please accept the helpful reply as the solution.

 

Regards,

Nono Chen

MartinGeschke
Frequent Visitor

Hi, I'd like to share a different approach with you: you call the pipeline with different parameters for the database, table, and so on, and build the SQL query using concat.

My example:

@concat('SELECT * FROM [dbo].[', substring(pipeline().parameters.NAV_Mandant, 1, 4), '_NAV_ValueEntry]',
  ' WHERE SourceSystem = ', pipeline().parameters.SourceSystem,
  ' AND NAV_Mandant = ', pipeline().parameters.NAV_Mandant,
  ' AND Entry_No_ > ', string(activity('LastRecW020NAV').output.firstRow),
  ' ORDER BY Entry_No_')
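For the scenario in this thread (same table names in db1, db2, db3), a stripped-down sketch of the same idea, assuming hypothetical pipeline parameters named DatabaseName and TableName:

@concat('SELECT * FROM [', pipeline().parameters.DatabaseName, '].[dbo].[', pipeline().parameters.TableName, ']')

If the connection's login can read every database on the server, three-part naming like this lets a single connection query each database without changing the connection itself.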
chiru_kudupudi
Resolver II

Hi Nikhil,

I think the steps below will be helpful to you:

1. Create a Loop Activity:
- In your Azure Data Factory pipeline, add a ForEach activity (loop) to iterate over the list of databases (e.g., 'db1', 'db2', 'db3').

2. Define Arrays:
- Create two arrays:
- 'databases': Contains the names of your databases ('db1', 'db2', 'db3').
- 'table_names': Contains the corresponding table names (e.g., 'table1', 'table2', 'table3').

3. Connection Strings:
- Maintain a dictionary (or another suitable data structure) that maps each database name to its connection string.
For example, in Python:

connection_strings = {
    "db1": "connection_string_db1",
    "db2": "connection_string_db2",
    "db3": "connection_string_db3"
}

4. Loop Logic:
- Inside the ForEach activity, use the current database name ('db_name') to retrieve the corresponding connection string and table name (see the sketch after these steps).
- Use these values to configure the appropriate data pipeline activity (e.g., Copy Activity) to load the table from the database into the lakehouse.
- You can use the on-premises gateway to connect to the server.

5. Logging and Monitoring:
- Consider adding logging or monitoring activities to track the progress and any potential errors during the loop.
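As a rough sketch of steps 1, 2, and 4 using pipeline dynamic content: since pipeline expressions have no dictionary type like the Python example above, one common workaround is a single array variable (here hypothetically named DatabaseTablePairs) holding objects such as {"db": "db1", "table": "TB1"}:

ForEach Items: @variables('DatabaseTablePairs')
Database inside the loop: @item().db
Table inside the loop: @item().table

Flattening the combinations into one array also avoids nesting loops, since ForEach activities generally can't be nested directly.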

Remember to replace the placeholders with your actual logic for loading the tables.
If you encounter any issues or need further assistance, feel free to ask! 😊

 

Regards,
Chiranjeevi Kudupudi

Hello Chiranjeevi, 

Thank you for your response. 

Correct me if I am wrong. 

I am creating two array variables named Databases and Tables under the Variables section of the data pipeline.

(screenshot: the pipeline Variables pane)

I am clueless about where to create the Python code that uses the connection strings.

I am passing the gateway name as a variable as well, named 'gateway'.

My data connection normally looks like this: {gateway};{databasename}

I am a novice with data pipelines, and it would be good if you could provide screenshots showing how you apply these steps.

Thank you

Nikhil

Hi Nikhil,
I cannot post screenshots, but I can provide the steps to achieve it:

1. Connection Strings:
- You mentioned that you have a gateway variable ('gateway') and a database name variable ('databasename').
- To construct your connection string, use Python to combine these variables:

# Example in Python
import os

# Read the gateway and database names (here, from environment variables)
gateway = os.environ.get('gateway')
database = os.environ.get('databasename')

# Construct the connection string in the {gateway};{databasename} format
connection_string = f"{gateway};{database}"

2. Azure Data Factory Pipeline:
- Create an Azure Data Factory pipeline to orchestrate the data movement.
- In your pipeline, use a ForEach activity to iterate over the list of databases (e.g., 'db1', 'db2', 'db3').

3. Copy Activity:
- Inside the ForEach activity, configure a Copy Data activity for each database.
- Set the source dataset to the corresponding table in the database (see the sketch after these steps).
- Set the sink dataset to your lakehouse (for example, Azure Blob Storage, Azure SQL Data Warehouse, or Delta Lake).

4. Dynamic Mapping:
- Since you're dealing with multiple tables, consider using dynamic mapping.
- Define a mapping data flow that dynamically maps columns based on the source table schema.

5. Logging and Monitoring:
- Add logging and monitoring activities to track the progress and any potential errors during the pipeline execution.

6. Resources:
- Refer to the Copy activity documentation for detailed information on configuring the Copy Data activity:
https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-overview
- Explore the Azure Data Factory GitHub repository for practical examples:
https://github.com/Azure/Azure-DataFactory
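As a loose illustration of steps 2 and 3, if the ForEach iterates over objects shaped like {"db": "db1", "table": "TB1"} (a hypothetical shape, as sketched earlier), the Copy activity could take dynamic content along these lines:

Source query: @concat('SELECT * FROM [', item().db, '].[dbo].[', item().table, ']')
Sink table name: @concat(item().db, '_', item().table)

Note that the leading @ is written only once per expression; nested calls like item() appear without it. The exact fields depend on the connector, and the connection itself may not accept dynamic values for non-Fabric sources.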

Remember to adapt these steps to your specific scenario, and feel free to ask if you need further assistance!

Regards,
Chiranjeevi Kudupudi

Hello Chiranjeevi, 

Thank you for the response. 

 

Like I said earlier, I am a novice with Fabric and its artifacts. I would like to know where you are asking me to write the Python code.

I have three variables in the data pipeline itself: two array-type variables containing the database names and table names, and a string variable for the gateway.

 

Databases = ["dB1", "dB2"]

Tables = ["TB1","TB2"]

Gateway = "SQLServer"

 

Now, in the ForEach activity of the data pipeline, I am adding a copy activity. This copy activity asks for the connection (SQLServer;dB1), the database value (dB1), and the table name (TB1).

Similarly, for the next table the first two values need to stay the same, with the table name as TB2.

In the next iteration, the values should change as follows: SQLServer;dB2, dB2 and TB1.

 

I tried to pass these variables using dynamic content, but I am still unable to pass them correctly.

 

I would like to achieve this.

Thank you

Nikhilesh

 

Are you trying to pass the gateway name "SQLServer" into the copy activity dynamically as well? Because in Fabric, if you use any dynamic content for the connection, it looks like this:

(screenshot: the connection field when set via dynamic content)

So unfortunately you can't dynamically switch between multiple SQL connections; only Fabric artifacts can be selected dynamically.

nikhil0511
Advocate I

Sorry, I forgot to mention: each database connection string is different as well, and we use an on-premises gateway to connect to the server.

As far as I know, you can currently only select Fabric artifacts dynamically as connections. See this Microsoft Idea and this Reddit thread: Copy Data Activity: Unable to set dynamic connection string : r/MicrosoftFabric (reddit.com)
