divyamalhotra12
Frequent Visitor

Data Pipeline - DataFlow Gen 2, Variables assignment

1. I have a dataflow in Data Factory to which I applied some transformations such as 'Unpivot Columns' and 'Group By'. I want to create a pipeline that ensures new data gets added to my dataflow automatically, and that any transformation such as 'Unpivot' or 'Group By' that I applied earlier gets applied to the newly added data as well.

 

2. I have a question regarding the 'Variable' feature in Data Pipeline. I created a pipeline with a Dataflow Gen2 activity as the first activity, and on success of that activity I want to assign its result to a variable. The pipeline runs successfully, but I do not see anything in the output.

 

divyamalhotra12_0-1711387948565.png

 

divyamalhotra12_1-1711387958677.png

 


9 REPLIES
v-gchenna-msft
Community Support

Hi @divyamalhotra12 ,

Thanks for using Fabric Community.

If you are talking about incremental load using Dataflow Gen2, then I recommend this - Pattern to incrementally amass data with Dataflow Gen2 - Microsoft Fabric | Microsoft Learn
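At a high level, that pattern filters the source down to the rows that are newer than a stored watermark and appends only those rows to the destination. As a rough, self-contained sketch of the filtering step in Power Query M (the inline sample table, the ModifiedDate column, and the LastRefresh value are all hypothetical placeholders, not names from the documented pattern):

    let
        // Hypothetical inline sample standing in for your real source step
        Source = #table(
            {"Id", "ModifiedDate"},
            {{1, #date(2024, 3, 1)}, {2, #date(2024, 3, 20)}}
        ),
        // Hypothetical watermark holding the previous run's cutoff;
        // in the documented pattern this would come from stored state
        LastRefresh = #date(2024, 3, 10),
        // Keep only the rows that arrived since the last refresh
        NewRows = Table.SelectRows(Source, each [ModifiedDate] > LastRefresh)
    in
        NewRows

Combined with an Append-style destination, each refresh then adds only the new rows instead of rewriting the whole table.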

I think your understanding of Set Variable is a bit different from how it actually works.

vgchennamsft_0-1711442590160.png

 

vgchennamsft_1-1711442660567.png

 

Can you please check the output? You can find the value there.
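For completeness: the Value box of a Set Variable activity accepts the pipeline expression language, so you can capture the result of the preceding activity there. For example, assuming a String variable and a dataflow activity named 'Dataflow1' (substitute the actual name of your activity), the value expression could be:

    @string(activity('Dataflow1').output)

The assigned value then shows up in the output of the Set Variable activity itself in the run's Output pane, not in the output of the dataflow activity.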

Docs to refer -
Set Variable Activity - Azure Data Factory & Azure Synapse | Microsoft Learn
Set Pipeline Return Value - Azure Data Factory & Azure Synapse | Microsoft Learn

168. Set pipeline return value in Set variable || Access output of child pipeline in main pipeline (...


Hope this is helpful. Please let me know in case of further queries.

divyamalhotra12_0-1711465548513.png

I want to understand the following:

1. Can I update the data in the lakehouse? For example, can I add new rows to the existing columns? How do I update the same table/file? Or do I need to import a new file into the lakehouse?

2. If the data gets updated, will the same transformations apply to the newly added data by running the same pipeline again?

3. If importing a new file is the only option, then would I need to create a new Dataflow Gen2 and apply transformations manually?

 

Ideally, this is my use case: I have a dataset to which I want to apply some transformations. I want the data to be updated periodically and the transformations to be applied automatically to the updated data through data pipelines.

Hi @divyamalhotra12 ,

1. Can I update the data in the lakehouse? For example, can I add new rows to the existing columns? How do I update the same table/file? Or do I need to import a new file into the lakehouse?
Yes, you can update the same table/file present in the lakehouse. You can use the option called 'Append' while configuring the Dataflow Gen2 destination.

vgchennamsft_0-1711537849408.png

 


 

2. If the data gets updated, will the same transformations apply to the newly added data by running the same pipeline again?
Yes, all transformations in the Dataflow Gen2 will be applied to the data on every execution/run.
You can use the schedule option to avoid manual triggering -

vgchennamsft_1-1711537959557.png
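To connect this back to the first question: steps like 'Unpivot Columns' and 'Group By' are stored as Power Query (M) steps inside the dataflow, and the whole script is re-evaluated against whatever the source contains at refresh time. A minimal, self-contained sketch of what those two steps look like in M (the inline sample table and column names are made up for illustration):

    let
        // Hypothetical inline sample standing in for your real source
        Source = #table(
            {"Region", "Jan", "Feb"},
            {{"East", 10, 20}, {"West", 5, 15}}
        ),
        // Unpivot every column except the key column "Region"
        Unpivoted = Table.UnpivotOtherColumns(Source, {"Region"}, "Month", "Sales"),
        // Group by Region and sum the unpivoted values
        Grouped = Table.Group(Unpivoted, {"Region"},
            {{"TotalSales", each List.Sum([Sales]), type number}})
    in
        Grouped

Because these steps are part of the query itself, re-running the dataflow (manually, on a schedule, or from a pipeline) reapplies them to any new data the source returns.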

 


3. If importing a new file is the only option, then would I need to create a new Dataflow Gen2 and apply transformations manually?

I didn't get this point.

Hope this is helpful. Please let me know in case of further queries.

Thank you for the response. 

I have two more questions.

1. I create a data pipeline, choose the Dataflow activity, and select a dataflow that I created. Can I specify some transformations to be performed in the dataflow through the data pipeline? Can I see those steps in the data pipeline?

2. If I create a pipeline, select the 'Dataflow' activity, and then select an already published dataflow from my workspace, what output is expected once I run the pipeline? I do not understand the purpose of this.

Hi @divyamalhotra12 ,

In order to understand what a Data Pipeline is -

vgchennamsft_0-1711625239171.png


List of data transformation options available in a Data Pipeline - Data transformation activities - Microsoft Fabric | Microsoft Learn

Basically, if you have a Dataflow, you can publish it in order to perform the transformation.
You can schedule the published Dataflow directly, without any dependency.
If you want to perform some steps before it executes, you can use the Dataflow activity in a data pipeline along with a series of other activities.

It completely depends on how you want to design your pipeline and consume your Dataflow Gen2.

Hope this is helpful. Please let me know if you have any further queries.

"If you want to do prior steps before its execution you can use dataflow activity in data pipeline along with some series of activities."

 

So, this means I choose the Dataflow activity in my pipeline, select the dataflow, and then click the arrow on the Dataflow activity. That will take me to the dataflow, where I can edit and publish it, and then come back to my pipeline and run it?

 

divyamalhotra12_0-1711929639197.png

 

 

Thank you! It is clear to me now.

Hi @divyamalhotra12 ,

You can actually edit your dataflow and publish it.
But what I actually mean is: in case you have to run prior activities before the execution of the dataflow, you can take help from a data pipeline.

vgchennamsft_0-1711945899308.png


Hope this is helpful. Please let me know in case of further queries.

Hi @divyamalhotra12 ,

We haven't heard from you since the last response and were just checking back to see if we answered your query.
If not, please reply and we will respond with more details and try to help.
