Power BI is turning 10, and we’re marking the occasion with a special community challenge. Use your creativity to tell a story, uncover trends, or highlight something unexpected.
Get startedJoin us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.
Hello!
I have a set of data that I am trying to remove duplicates from based on the number of filled cells per row. The data is form submissions that have a person's email address and then a number of different fields that could be filled, ranging from 1 - 16 filled cells. I was able to get make a column that tells me how many cells are filled for each of my rows which basically gives me a max. I have removed the duplicates that have the same email address and the same amount of filled cells but now I need to remove any duplicates (based on the email address) that have less than the max amount of filled cells for each email address. I found another a solution in the forums to do this with DAX but I need to be able to do this within the Query Editor so that once I have removed the duplicates I can append this data with other ones.
Here is a snap shot of what the query currently looks like, and I have marked the ones that I would want removed:
Thanks!
Solved! Go to Solution.
I do something similar in one of my reports that should work here (given the lack of incremental load), where I've merged two queries, a bunch of existing data which I give a number 1 in a custom field, and new data, which I give the number 2 in the same custom field. If I sort it on that column in descending order, it gets rid of any old data that's been updated, so the same principle should work for your problem - sort it descending by your #filled value then remove duplicates on your email column?
Thanks for the response , however that doesn't seem to work. Even when I sort by decending and then remove duplicate emails it still takes away some of the emails that have the highest number of filled cells.
For example:
before-
after -
you can see that I now lost the entry with 16 filled cells.
A known issue with sorting and removing duplicates is that you need to buffer the table in between.
You might wrap your sort step like: Table.Buffer(your sortcode).
still relatively new to Power BI, what does "wrap your sort step like: Table.Buffer(your sortcode)" actually mean? are you able to show me by chance?
Yes:
Thanks for this 🙌
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.
Check out the June 2025 Power BI update to learn about new features.
User | Count |
---|---|
72 | |
71 | |
57 | |
38 | |
36 |
User | Count |
---|---|
81 | |
67 | |
61 | |
46 | |
45 |