I'm sure I'm missing something simple, but I'd appreciate the help. I'm importing a big XML file that has some duplication in it.
I changed some info for privacy, but the main point is that I get duplicates in the IP and ID columns. For example, in the first two lines ID 105146 appears twice. In the tracking column, one has AGENT and one has IP. I want to deduplicate all of these by removing the line with IP in the tracking column.
Help would be appreciated. Thanks.
BUMP
Hi @Anonymous,
Can you please share some dummy data that keeps the raw data structure, along with the expected results? It would help us clarify your scenario and test a formula.
Regards,
Xiaoxin Sheng
Raw data looks like this
| Name | Tracking | Owner | Application | IP | QID |
|---|---|---|---|---|---|
| server1 | Agent | john | windows | 1.2.3.4 | 112334 |
| server1 | IP | john | windows | 1.2.3.4 | 112334 |
| server2 | Agent | john | linux | 5.6.7.8 | 113445 |
| server2 | IP | john | linux | 5.6.7.8 | 113445 |
| server3 | Agent | john | sql | 10.11.12.13 | 115667 |
| server3 | IP | john | sql | 10.11.12.13 | 115667 |
There are rows which are duplicates except for the tracking column, which differs between the two duplicate rows.
I need to find the duplicate rows and remove the row with IP in the tracking column. The end result would be this:
| Name | Tracking | Owner | Application | IP | QID |
|---|---|---|---|---|---|
| server1 | Agent | john | windows | 1.2.3.4 | 112334 |
| server2 | Agent | john | linux | 5.6.7.8 | 113445 |
| server3 | Agent | john | sql | 10.11.12.13 | 115667 |
Hi @Anonymous,
I think this may relate to your source data structure. For this scenario, you can add a filter on the category to filter out blank records, or filter on the tracking field to remove rows where it equals IP.
Regards,
Xiaoxin Sheng
You're right, the source data contains the duplication, but I am pulling it from an API and I have no ability to clean it up before ingesting the data.
I considered just filtering out rows where the tracking field equals IP. However, there are plenty of rows where the IP-tracked row is not a duplicate, so I need to keep them in the data. I need to find duplicate rows based on matching IP and ID, and remove the line which has IP in the tracking column.
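The rule described above can be sketched in plain Python to make the logic concrete. This is only an illustration of the filtering condition, not a confirmed solution from the thread: the function name `drop_redundant_ip_rows` is hypothetical, the `server4` row is invented to show a non-duplicate IP row being kept, and the sketch assumes a row counts as a duplicate when another row with the same IP and QID has a tracking value other than IP.

```python
def drop_redundant_ip_rows(rows):
    """Drop rows tracked as "IP" when another row with the same
    (IP, QID) pair is tracked by something else (e.g. "Agent")."""
    # Every (IP, QID) pair that has at least one non-IP tracked row.
    covered = {(r["IP"], r["QID"]) for r in rows if r["Tracking"] != "IP"}
    # Keep non-IP rows always; keep IP rows only when nothing else covers the pair.
    return [
        r for r in rows
        if r["Tracking"] != "IP" or (r["IP"], r["QID"]) not in covered
    ]


columns = ["Name", "Tracking", "Owner", "Application", "IP", "QID"]
raw = [
    ("server1", "Agent", "john", "windows", "1.2.3.4", 112334),
    ("server1", "IP", "john", "windows", "1.2.3.4", 112334),
    ("server2", "Agent", "john", "linux", "5.6.7.8", 113445),
    ("server2", "IP", "john", "linux", "5.6.7.8", 113445),
    ("server3", "Agent", "john", "sql", "10.11.12.13", 115667),
    ("server3", "IP", "john", "sql", "10.11.12.13", 115667),
    # Hypothetical extra row: tracked only by IP, so it must survive.
    ("server4", "IP", "john", "linux", "20.21.22.23", 119999),
]
rows = [dict(zip(columns, values)) for values in raw]

result = drop_redundant_ip_rows(rows)
for r in result:
    print(r["Name"], r["Tracking"])
```

The key point is that the filter tests two things at once: the tracking value and whether the IP/QID pair is already covered by another row. A plain filter on tracking alone would also drop rows like the hypothetical `server4`.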