Don't miss your chance to take the Fabric Data Engineer (DP-600) exam for FREE! Find out how by attending the DP-600 session on April 23rd (pacific time), live or on-demand.
Learn moreNext up in the FabCon + SQLCon recap series: The roadmap for Microsoft SQL and Maximizing Developer experiences in Fabric. All sessions are available on-demand after the live show. Register now
Hi,
I have a data set (~30 000 rows) where instead of removing the rows with duplicate values, I would like to remove the rows that have a value that's unique. There are too many rows to toggle off manually all the values that are only mentioned once. I can't seem to work out how to do this but surely there must be a way. This is because we need a list with solely the duplicates.
Edit: came here with this question because using the keep duplicate rows in the query editor also keeps the singular values.
Thanks!
Solved! Go to Solution.
There'll probably be many ways but there are 2 in the attached Excel workbook below.
In the Power Query editor after loading the data:
Version 1
1. Add an index column
2. Group the rows on the fields you want to include as duplicates (not the index column!) creating All Rows and Count
3. Filter out the 1s form the Count column
4. Expand the Tables column but only the index column
5. Optional: Sort on the Index column to retain original source data row order
6. Remove the Count and Index columns.
Version 2
1. Group the rows on the fields you want to include as duplicates creating a Count column
2. Filter the Count column to keep the 1s
3. Remove the Count column (leaves you a table with only rows that occur once)
4. Merge the Source table with the table in step 3
5. Expand this new column choosing any one of the fields (the field chosen must have no nulls)
6. Filter this expanded column for only nulls
7. Remove this expanded column.
There'll probably be many ways but there are 2 in the attached Excel workbook below.
In the Power Query editor after loading the data:
Version 1
1. Add an index column
2. Group the rows on the fields you want to include as duplicates (not the index column!) creating All Rows and Count
3. Filter out the 1s form the Count column
4. Expand the Tables column but only the index column
5. Optional: Sort on the Index column to retain original source data row order
6. Remove the Count and Index columns.
Version 2
1. Group the rows on the fields you want to include as duplicates creating a Count column
2. Filter the Count column to keep the 1s
3. Remove the Count column (leaves you a table with only rows that occur once)
4. Merge the Source table with the table in step 3
5. Expand this new column choosing any one of the fields (the field chosen must have no nulls)
6. Filter this expanded column for only nulls
7. Remove this expanded column.
If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.
A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.
Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.
| User | Count |
|---|---|
| 5 | |
| 3 | |
| 3 | |
| 3 | |
| 2 |