Karthick_Balaje
Regular Visitor

Data Flow Gen 2 Issue

I used Dataflow Gen 2 to push data from Bronze to Silver with a small transformation: basic logic that performs a "Distinct Filtered Rows" operation, filtering rows on a condition (primary key not null and not 0). When I select Save and run, the Dataflow Gen 2 passes validation, runs for a while, and then fails. Even after it fails, I see most of the tables are pushed to Silver, and one table with 4 million rows is alone pushed. Why?

Because of this, I am unable to run my main pipeline, and sometimes I see duplicate rows at my target destination even after selecting "Replace" as the update method (whenever the Dataflow Gen 2 runs).

- Thanks,
Karthick

7 REPLIES
v-prasare
Community Support

May I ask whether you have resolved this issue? If so, could you please share the resolution steps here? This will help other community members with similar problems solve them faster.
If we don't hear back, we'll go ahead and close this thread. For any further discussion or questions, please start a new thread in the Microsoft Fabric Community Forum, and we'll be happy to assist.
Thank you for being part of the Microsoft Fabric Community.

v-prasare
Community Support

Hi @Karthick_Balaje ,

We would like to follow up to see if the solution provided by the super user resolved your issue. Please let us know if you need any further assistance.


@BhaveshPatel & @AntoineW, thanks for your prompt responses.


Thanks,

Prashanth Are

MS Fabric community support

AntoineW
Impactful Individual

@Karthick_Balaje,

 

If I understand correctly, you perform a transformation to de-duplicate the data, but you still see duplicates in the destination.

- With the update method set to "Replace", each run should normally replace all records. Make sure your 4M-row entity is truly set to Replace (and not "Append" on an existing table with manual settings).

- If two dataflows/pipelines write to the same table (or a new run starts before the previous one finishes), you can get duplicates, especially if any writer uses Append. Check that no pipeline or other artifact writing to the table is in "Append" mode.

- Otherwise, create a ticket with the Microsoft support team.
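
To make the Replace versus Append point concrete, here is a minimal Python sketch. It does not call any Fabric API: the `write_table` helper, the `mode` strings, and the sample rows are hypothetical and only model the semantics, assuming a second run reloads the same source rows.

```python
from collections import Counter

def write_table(target, rows, mode):
    """Model a destination write; `mode` is a hypothetical stand-in for the update method."""
    if mode == "replace":
        return list(rows)            # previous contents are discarded
    if mode == "append":
        return target + list(rows)   # previous contents are kept
    raise ValueError(f"unknown mode: {mode}")

def duplicate_keys(table, key):
    """Return the key values that occur more than once in the table."""
    counts = Counter(row[key] for row in table)
    return sorted(k for k, n in counts.items() if n > 1)

source = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

replaced, appended = [], []
for _ in range(2):  # two runs of the same load
    replaced = write_table(replaced, source, "replace")
    appended = write_table(appended, source, "append")

print(len(replaced), duplicate_keys(replaced, "id"))  # 2 []
print(len(appended), duplicate_keys(appended, "id"))  # 4 [1, 2]
```

If a check like `duplicate_keys` on the real target shows every key repeated the same number of times, that pattern would be consistent with a writer running in Append mode or with overlapping runs.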

 

Best regards,

Antoine

Hi @AntoineW, from the source (Azure SQL DB) to Bronze, I used a Copy data activity, and I saw data getting duplicated after ingestion. I then wrote de-duplication logic in Dataflow Gen 2 to remove the duplicates while moving the data from Bronze to Silver (see the attached screenshot). From Silver to Gold, I again used Dataflow Gen 2 to apply transformations and create custom columns for the required business KPIs, and pushed the result to Gold. When I queried the data through the Gold Warehouse SQL analytics endpoint, I saw duplicates again, so I have written SQL logic for de-duplication.
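
As a rough illustration of the Bronze-to-Silver rule described in this thread (keep rows whose primary key is not null and not 0, then keep distinct rows), here is a minimal Python sketch. The function name `clean_rows`, the `id` column, and the sample data are assumptions for illustration only; the actual Dataflow Gen 2 steps are in the screenshot and may differ, for example in whether distinctness is judged on the key or on the whole row.

```python
def clean_rows(rows, key):
    """Drop rows with a null or 0 key, then keep the first row seen per key."""
    seen = set()
    result = []
    for row in rows:
        value = row.get(key)
        if value is None or value == 0:
            continue  # fails the "not null and not 0" condition
        if value in seen:
            continue  # duplicate key, keep only the first occurrence
        seen.add(value)
        result.append(row)
    return result

# Hypothetical Bronze rows containing a duplicate, a 0 key, and a null key.
bronze = [
    {"id": 1, "amount": 10},
    {"id": 1, "amount": 10},
    {"id": 0, "amount": 5},
    {"id": None, "amount": 7},
    {"id": 2, "amount": 3},
]

silver = clean_rows(bronze, "id")
print(silver)  # [{'id': 1, 'amount': 10}, {'id': 2, 'amount': 3}]
```

If the output of a step like this is unique on the key but the destination still shows duplicates, the write step (update method, overlapping runs) is the more likely place to look.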

For both Dataflow Gen 2 items, the update method was set to replace the entire table in the target destination (see the attached screenshot).

Screenshot 2025-09-16 142049.png

 

Karthick_Balaje_1-1758067247735.png

 

Hi @Karthick_Balaje,

Did your issue get resolved, or do you still need help with the scenario above?

 

 

Thanks,

Prashanth

 

BhaveshPatel
Community Champion

@Karthick_Balaje Thank you. You should not have to move data from bronze to silver.

There are two ways you can achieve this: the first is to use Power BI Dataflow Gen 2; the second is to use Notebooks and divide the work into bronze, silver, and gold layers.

 

If I use Power BI Dataflow Gen 2, I do all the transformations there and save the data to a Fabric SQL Database.

BhaveshPatel_0-1757307477544.png

Do one task at a time (one table at a time). It can handle billions of rows at a time in a single table.

 



 

Thanks & Regards,
Bhavesh

Love the Self Service BI.
Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to give Kudos.