Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
JasonP
Frequent Visitor

Removing duplicate data

I am trying to tie together multiple vendor data with our internal data. I have an accounting system that separates agents by company first, then a unique number so for instance John Smith at Company A is BH758_000001 and Joe Harris at Company B is UT849_0000010. Internally and with our vendors, we have a unique Agent ID for each agent meaning John Smith will always be 2459 and Joe Harris will always be 3945 no matter what company they are part of with us. 

 

The problem is if John Smith goes to company B and Joe Harris goes to Company A, John Smith will still be 2459 internally but now UT849_000080 in accounting and Joe Harris will still be 3945 but now BH758_000090 in accounting. 

 

As you can imagine, when I create a relationship in my agents table, my vendor table can't be created because there are two 2459 and two 3945 as accounting now has 2 records due to the number it assigned. 

 

My (possible) solution is to create a new column in my agents table that basically looks at our internal number (say 2459), if there is no duplicates it would return that same value. If there is a duplicate and one shows active (the agent would never be active in both) it returns that value and shows 0 for terminated. Where it gets tricky is if both are terminated in which case I would want it to return the value of the instance with the most recent terminated date and return 0 otherwise. Problem is I just started using Powerbi recently and have NO idea how to write any of that. 

 

Just to be clear, I have a table "agent" (accounting number as described), "AgentNum" (Internal number), "AgentStatus" (T or A for terminated or active), and "TermDate". Any help is GREATLY appreciated but mostly curious if its even possible. 

1 ACCEPTED SOLUTION

I figured it out. The main usable difference in the tables I had was an office code. I added the office code to the unique ID on both ends and was able to match. 

 

Thank you all for your help, for future reference, how to I share sample data? My data is sensitive internal data so I don't know how to show you without posting on the web our companies internal numbers and info. 

View solution in original post

4 REPLIES 4
Anonymous
Not applicable

Hi @JasonP,

 

I have an idea to deal with your issue, you can refer to below steps:

 

1. Use CONTAINS function to get the records which both contain the two keys.(duplicate records)
2. Filter inactive records from above records.(inactive records of duplicate records)
3. Use EXCEPT function to exclude the inactive records from original records, then you will get the active records and unique records which you want.

 

For detailed formula, please share some sample data to test.

 

Regards,

Xiaoxin Sheng

I figured it out. The main usable difference in the tables I had was an office code. I added the office code to the unique ID on both ends and was able to match. 

 

Thank you all for your help, for future reference, how to I share sample data? My data is sensitive internal data so I don't know how to show you without posting on the web our companies internal numbers and info. 

You would need to anonymize it. As long as we can see the structure of the tables that is all we need.
dkay84_PowerBI
Microsoft Employee
Microsoft Employee

Can you post some sample data?  

 

Also, not sure if this would work, but can you concantenate the accounting code and Agent ID to get a truly custom key?  The name John Smith would have two keys but at least you wouldn't have duplicates for your relationships, and you could use DAX to determine the relevant key to use based on filtering by term date or agent status.

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.