Check your eligibility for this 50% exam voucher offer and join us for free live learning sessions to get prepared for Exam DP-700.
Get StartedDon't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.
Hello!
I have a set of data that I am trying to remove duplicates from based on the number of filled cells per row. The data is form submissions that have a person's email address and then a number of different fields that could be filled, ranging from 1 - 16 filled cells. I was able to get make a column that tells me how many cells are filled for each of my rows which basically gives me a max. I have removed the duplicates that have the same email address and the same amount of filled cells but now I need to remove any duplicates (based on the email address) that have less than the max amount of filled cells for each email address. I found another a solution in the forums to do this with DAX but I need to be able to do this within the Query Editor so that once I have removed the duplicates I can append this data with other ones.
Here is a snap shot of what the query currently looks like, and I have marked the ones that I would want removed:
Thanks!
Solved! Go to Solution.
I do something similar in one of my reports that should work here (given the lack of incremental load), where I've merged two queries, a bunch of existing data which I give a number 1 in a custom field, and new data, which I give the number 2 in the same custom field. If I sort it on that column in descending order, it gets rid of any old data that's been updated, so the same principle should work for your problem - sort it descending by your #filled value then remove duplicates on your email column?
Thanks for the response , however that doesn't seem to work. Even when I sort by decending and then remove duplicate emails it still takes away some of the emails that have the highest number of filled cells.
For example:
before-
after -
you can see that I now lost the entry with 16 filled cells.
A known issue with sorting and removing duplicates is that you need to buffer the table in between.
You might wrap your sort step like: Table.Buffer(your sortcode).
still relatively new to Power BI, what does "wrap your sort step like: Table.Buffer(your sortcode)" actually mean? are you able to show me by chance?
Yes:
User | Count |
---|---|
123 | |
77 | |
62 | |
50 | |
49 |
User | Count |
---|---|
175 | |
125 | |
60 | |
60 | |
58 |