ronaldbalza2023
Continued Contributor

Data Cleaning - Power Query - Removing Duplicates

Hi everyone, I have a data source that comes from an FTP folder containing .csv files. It updates every day with new data. However, some data from the previous day is duplicated in the current day's file.

Say today is Monday: some data from the previous day (Sunday) is also included in the update. How can I remove the duplicated data and ensure that it will not be duplicated in the future?

1 ACCEPTED SOLUTION
danextian
Super User

Hi @ronaldbalza2023 ,

 

The Folder.Files function (used when connecting to a folder as a data source) shows when each file in the folder was created and last modified. You can sort by the Date created column before removing duplicates, and wrap that sort step in Table.Buffer so that the sort order is preserved when the duplicates are removed.

[Screenshot: danextian_3-1658714412112.png]

[Screenshot: danextian_1-1658714056527.png]

This datetime column must be included as one of the criteria when removing duplicates. You can remove it afterwards if you don't need it. This technique requires that you do not use the Combine & Transform Data feature, as it doesn't take into account when a file was created.
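
The steps above could look roughly like the query below. This is only a sketch under several assumptions: the folder path, the comma delimiter, the UTF-8 encoding, and the column names OrderID and Amount are all hypothetical placeholders, and OrderID is assumed to be the column that identifies a duplicated row. Here the Date created column acts as the sort criterion that decides which copy of a row survives; sorting in descending order keeps the copy from the newest file.

```powerquery
let
    // Hypothetical path: replace with the folder your FTP files land in
    Source = Folder.Files("C:\Data\DailyExports"),
    // Keep only the .csv files
    CsvFiles = Table.SelectRows(Source, each Text.Lower([Extension]) = ".csv"),
    // Parse each file manually instead of using Combine & Transform Data,
    // so the file's Date created stays available next to its rows
    Parsed = Table.AddColumn(
        CsvFiles,
        "Data",
        each Table.PromoteHeaders(
            Csv.Document([Content], [Delimiter = ",", Encoding = 65001])
        )
    ),
    Kept = Table.SelectColumns(Parsed, {"Date created", "Data"}),
    // Assumed data columns: replace with the real column names in your files
    Expanded = Table.ExpandTableColumn(Kept, "Data", {"OrderID", "Amount"}),
    // Newest file first; Table.Buffer fixes this order in memory
    // before duplicates are removed
    Sorted = Table.Buffer(
        Table.Sort(Expanded, {{"Date created", Order.Descending}})
    ),
    // Keep the first row found for each OrderID (assumed key column)
    Deduped = Table.Distinct(Sorted, {"OrderID"}),
    // Drop the datetime column once it is no longer needed
    Result = Table.RemoveColumns(Deduped, {"Date created"})
in
    Result
```

If you would rather keep the copy from the oldest file, change Order.Descending to Order.Ascending. Because the query reads the whole folder on every refresh, the same rule is applied again each time a new daily file arrives.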

Dane Belarmino | Microsoft MVP | Proud to be a Super User!

