Hello,
To identify the cleanup I still have to do on a fairly large dataset, I would like to generate a new table in which each column contains the unique values of the corresponding column of my original table.
I have tried Remove Duplicates, but it works on whole rows across all the columns, whereas I want each column to be deduplicated independently.
I don't know if Power BI is the most appropriate tool for this, so I'm open to other solutions if they are a better fit.
Thank you in advance,
I think I understand what you are trying to do, but an example would help.
Have you tried Group By?
The downside is that if you group on multiple columns, it groups on all of them together, like this:
| 1 | 2 | 3 | 4 |
| A | C | C | C |
| A | A | C | C |
| A | A | A | C |
| B | A | A | A |
| B | B | A | A |
| B | B | B | A |
| C | B | B | B |
| C | C | B | B |
| C | C | C | B |
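To make the difference concrete, here is a minimal Python sketch (the rows are hypothetical illustration data, not taken from the real dataset). Grouping on every column keeps one row per distinct combination of values, which is not the same as deduplicating each column on its own.

```python
# Hypothetical illustration rows, not the real dataset.
rows = [
    ("A", 12, "John"),
    ("A", 14, "Bob"),
    ("B", 10, "Bob"),
    ("A", 12, "John"),  # exact duplicate of the first row
]

# Grouping on all columns: one row per distinct combination of values.
# dict.fromkeys drops duplicates while keeping first-seen order.
distinct_rows = list(dict.fromkeys(rows))

# Deduplicating each column independently: transpose, then dedupe per column.
unique_per_column = [list(dict.fromkeys(col)) for col in zip(*rows)]

print(distinct_rows)
print(unique_per_column)
```

Only the exact duplicate row disappears in the first result, while the second result lists each column's distinct values separately.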
Thank you for your help. I'll look into this lead, but it doesn't seem to match what I need. In the meantime, here are some more explanations:
Here's an example of my starting table:
| Columns 1 | Columns 2 | Columns 3 |
| A | 12 | John |
| A | 14 | Bob |
| B | 10 | Bob |
| C | 12 | Bob |
| B | 12 | John |
And this is the table I'd like to create:
| Columns 1 | Columns 2 | Columns 3 |
| A | 12 | John |
| B | 14 | Bob |
| C | 10 | |
This would allow me to easily detect inconsistent values that I still need to correct in my queries.
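For reference, the transformation described above can be sketched in plain Python. This is only an illustration under assumptions: the function name `unique_values_table` is hypothetical, the data is the small example table from this post, and shorter columns are padded with empty strings.

```python
from itertools import zip_longest

def unique_values_table(rows):
    """Return rows in which each column lists its own unique values."""
    columns = zip(*rows)  # transpose rows into columns
    # dict.fromkeys drops duplicates while keeping first-seen order.
    uniques = [list(dict.fromkeys(col)) for col in columns]
    # Columns end up with different lengths, so pad the short ones.
    return [list(r) for r in zip_longest(*uniques, fillvalue="")]

# The example starting table from the post (header kept separately).
header = ["Columns 1", "Columns 2", "Columns 3"]
rows = [
    ["A", 12, "John"],
    ["A", 14, "Bob"],
    ["B", 10, "Bob"],
    ["C", 12, "Bob"],
    ["B", 12, "John"],
]

table = unique_values_table(rows)
print(header)
for row in table:
    print(row)
```

On a large dataset the same idea should still hold, since each column is scanned once, but this has not been tested at scale.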
Hi @Amazeroth ,
After some research, I think the only option is to turn each column into its own table, remove duplicates from each, and then merge the tables back together. For a dataset as large as yours, that approach is apparently not practical. There is no built-in way to do this directly.
Can I ask why you want to do this? After this operation, the rows of the table no longer represent meaningful records.
Best Regards,
Xue Ding
If this post helps, then please consider accepting it as the solution to help other members find it more quickly.
Hi @v-xuding-msft ,
Thanks for your feedback, I came to the same conclusion on my side... I'm thinking of using Python instead.
There were two goals behind this request:
1. Because our dataset is large, comes from several sources, and is very dirty, this table would let me discuss the required cleaning and format changes efficiently with the business teams.
2. Once the dataset is cleaned, it would let me quickly generate documentation on each column and the values it can contain.
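As a rough idea of what the second goal could look like in Python, here is a small sketch that lists the values of each column. The function name `describe_columns` is hypothetical, the data is the example table from earlier in the thread, and the occurrence counts are an extra that was not part of the original request.

```python
from collections import Counter

def describe_columns(header, rows):
    """Map each column name to a Counter of the values it contains."""
    return {name: Counter(col) for name, col in zip(header, zip(*rows))}

# Example table from earlier in the thread.
header = ["Columns 1", "Columns 2", "Columns 3"]
rows = [
    ["A", 12, "John"],
    ["A", 14, "Bob"],
    ["B", 10, "Bob"],
    ["C", 12, "Bob"],
    ["B", 12, "John"],
]

doc = describe_columns(header, rows)
for name, counts in doc.items():
    # One line per column: each distinct value with its number of occurrences.
    values = ", ".join(f"{value!r} ({n})" for value, n in counts.most_common())
    print(f"{name}: {values}")
```

Rare values in such a listing are often the inconsistent ones worth raising with the business teams.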
I will close the post, thanks for your help.
Regards,
Amazeroth