Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now

Reply
NikTheRussian
Frequent Visitor

Removing Duplicates (based on 2 or 3 columns) without breaking query folding

Hi folks,

 

We are attempting to remove duplicates from varying tables in a model we have built in our business. 

 

To give context, the fact table pulls from our SQL datawarehouse, meanwhile the dimensional information is from MS Forms & stored in Excel. 

 

When removing duplicates in both facts & dimensions we need to base this on multiple conditions, i.e. 2 or 3 columns being the same. When doing so in Power Query, query folding breaks at this stage. 

We have attempted to sort & append in an index, prior to adding in removal of duplicates. We have tried Table.Distinct using the required columns and yet query folding breaks. We cannot simply apply table level removal of duplicates as the records we require are then removed. 

The alternative solution (for the fact) is pushing this back to source & applying a row partition to then be filtered when brought in using the native query (enablefolding=true) functionality. This still leaves us with the challenge being faced with the dimensions fed by the sharepoint files. 

Can anyone provide some guidance as to how to overcome this please?

1 ACCEPTED SOLUTION
rohit1991
Super User
Super User

Hi @NikTheRussian 

 

The issue occurs because some Power Query steps , like sorting, adding an index, or removing duplicates ,can’t be sent back to the data source. When that happens, query folding stops, and Power Query starts processing the data locally instead of pushing the logic to SQL or Excel.

The best fix is to handle duplicates directly at the data source. If your data is from SQL, write a query using ROW_NUMBER() or RANK() to remove duplicates before loading it into Power BI. This approach keeps query folding active and improves performance.

If your source is Excel or SharePoint, folding isn’t supported, so you’ll need to remove duplicates within Power Query using Table.Distinct or Group By. It might be a bit slower, but that’s expected for these sources.


Did it work? ✔ Give a Kudo • Mark as Solution – help others too!

View solution in original post

5 REPLIES 5
NikTheRussian
Frequent Visitor

Thanks all for your swift responses - apologies for the delay in accepting. 

 

We have applied the row partition in the SQL source, meanwhile will look to implement group by using excel and/or Table.Distinct

v-pgoloju
Community Support
Community Support

Hi @NikTheRussian,

 

Thank you for reaching out to the Microsoft Fabric Forum Community, and special thanks to @KNP and @rohit1991  for prompt and helpful responses.

Just following up to see if the Response provided by community members were helpful in addressing the issue. if the issue still persists Feel free to reach out if you need any further clarification or assistance.

 

Best regards,
Prasanna Kumar

 

KNP
Super User
Super User

Hi @NikTheRussian,

 

Further to @rohit1991's answer. Which is the correct suggestion, to push it back to the source if that is an option.

 

If that's not possible, this resource may be useful to ensure you are using foldable functions.

 
I would have thought, if you added a custom column that combined the columns needed, and then carried on from there, that it would continue to fold. But, at that point, I would likely just write the SQL query and use it in Power Query to save the hassle (I come from a SQL background, so it depends on what you're comfortable with).
 
Have I solved your problem?
Please click Accept as Solution so I don't keep coming back to this post, oh yeah, others may find it useful also ;).
xOIEmaj

If you found this post helpful, please give Kudos.
It gives me a sense of instant gratification and, if you give me Kudos enough times, magical unicorns will appear on your screen.
If you find my signature vaguely amusing, please give Kudos.
image
fabric-SUbadge
Proud to be a Super User!

Greatly appreciate this - thank you

rohit1991
Super User
Super User

Hi @NikTheRussian 

 

The issue occurs because some Power Query steps , like sorting, adding an index, or removing duplicates ,can’t be sent back to the data source. When that happens, query folding stops, and Power Query starts processing the data locally instead of pushing the logic to SQL or Excel.

The best fix is to handle duplicates directly at the data source. If your data is from SQL, write a query using ROW_NUMBER() or RANK() to remove duplicates before loading it into Power BI. This approach keeps query folding active and improves performance.

If your source is Excel or SharePoint, folding isn’t supported, so you’ll need to remove duplicates within Power Query using Table.Distinct or Group By. It might be a bit slower, but that’s expected for these sources.


Did it work? ✔ Give a Kudo • Mark as Solution – help others too!

Helpful resources

Announcements
November Power BI Update Carousel

Power BI Monthly Update - November 2025

Check out the November 2025 Power BI update to learn about new features.

Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Solution Authors
Top Kudoed Authors