ronaldbalza2023
Continued Contributor

Data Cleaning - Power Query - Removing Duplicates

Hi everyone, I have a data source that comes from .csv files in an FTP folder. It updates every day with new data, but some of the data from the previous day is duplicated in the current day's update.

For example, if today is Monday, some data from the previous day (Sunday) is also included in the update. How can I remove the duplicated data and ensure that it will not be duplicated in the future?

1 ACCEPTED SOLUTION
danextian
Super User

Hi @ronaldbalza2023 ,

 

The Folder.Files function (used when connecting to a folder as a data source) lets you see when each file in the folder was created and last modified. You can sort by the Date created column prior to removing duplicates and wrap that sort step in Table.Buffer.


This datetime column must be included as one of the criteria when removing duplicates. You can remove it afterwards if you don't need it. This technique requires that you do not use the Combine & Transform Data feature, as it doesn't take into account when a file was created.
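
For anyone who can't see the screenshots, here is a rough sketch of how the steps above might look in M. It is only one possible reading of the approach, not the exact query from the screenshots. The folder path, the CSV options, and the column names OrderID and Amount are all hypothetical placeholders, and the sketch treats Date created as the sort criterion applied before Table.Distinct, which is an interpretation.

```m
let
    // Hypothetical folder path; replace it with the location of your .csv files
    Source = Folder.Files("C:\Data\DailyExports"),
    // Keep only .csv files
    CsvFiles = Table.SelectRows(Source, each Text.Lower([Extension]) = ".csv"),
    // Parse each file manually (instead of Combine & Transform Data)
    // so that the Date created column stays next to the parsed rows
    Parsed = Table.AddColumn(
        CsvFiles,
        "Data",
        each Table.PromoteHeaders(
            // Assumed CSV options: comma delimiter, UTF-8 encoding
            Csv.Document([Content], [Delimiter = ",", Encoding = 65001])
        )
    ),
    Kept = Table.SelectColumns(Parsed, {"Date created", "Data"}),
    // Hypothetical column names; list the columns your files actually contain
    DataColumns = {"OrderID", "Amount"},
    Expanded = Table.ExpandTableColumn(Kept, "Data", DataColumns),
    // Sort so rows from the newest file come first, and buffer the result
    // so the sort order is fixed before duplicates are removed
    Sorted = Table.Buffer(
        Table.Sort(Expanded, {{"Date created", Order.Descending}})
    ),
    // Keep one row per key; OrderID is an assumed key column
    Deduped = Table.Distinct(Sorted, {"OrderID"}),
    // Drop the datetime column once it is no longer needed
    Result = Table.RemoveColumns(Deduped, {"Date created"})
in
    Result
```

With Order.Descending, the row that survives for each key should be the one from the most recently created file. Switch to Order.Ascending if you would rather keep the copy from the earliest file.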

 





Dane Belarmino | Microsoft MVP | Proud to be a Super User!


