Starting December 3, join live sessions with database experts and the Microsoft product team to learn just how easy it is to get started
Learn moreGet certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now
When using the Transform Data feature, over the View tab, there's an option that we can check: Column distribution, which brings distinct and unique values... What's the difference between these values? By Distinct values can I assume there are duplicates of the same value and for unique there are no duplicates? I already went into the documentation but it still is not clear. Thanks.
https://docs.microsoft.com/en-us/power-query/data-profiling-tools
Hello, does distinct calculates null too?
A good question given that the link that mario_ruiz's description of the "definitive" MicroDoc he linked to falls short.
The page https://docs.microsoft.com/en-us/power-query/data-profiling-tools mentions the terms 'Distinct' and 'Unique' but provides no definition.
MahyarTF and HoangHugo describe it perfectly.
I did a test to see the facts. Findings below the image, some interesting.
Distinct is the count of distinct values as in SQL:
Select count(distinct colname)
from tablename;
Unique is equivalent of:
with uniq(ResellerKey, cnt)
as
(
select ResellerKey, count(*) as cnt
from FactResellerSales
group by ResellerKey
having count(*) = 1
)
select count(*)
from uniq
Enough of the SQL.
This image demonstrates the results. From Excel, the blanks showed null. From CSV with consecutive commas it showed blank. Rather than two images I change the null to blank in the 4th column of the Excel load.
It is clear that null and blank are counted in Distinct. Multiple null/blank did not make it into Unique. Simple image for you to count yourself to full understand.
UPDATED IMAGE: I put a blank in No Duplicates to show that blank/null can be considered Distinct.
Interestingly, I had to close PBI Desktop to update the Empty to 7% in No Duplicates column. Refresh didn't do it. Neither did Apply and Close and then Transform. The profiling must be cached.
Hi,
Distinct mean : count all the values as 1, even if there was more than one.
Unique mean : count only the value that are not repeated in the particular column
Appreciate your Kudos
Hi,
I found it from Online
“Distinct” means total number of different values regardless how many times it appears in the dataset. A name appears in the list multiple times is counted as 1 distinct count.
Whereas, the “Unique” value is total number of values that only appear once.
Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.
User | Count |
---|---|
86 | |
76 | |
74 | |
55 | |
45 |
User | Count |
---|---|
117 | |
114 | |
78 | |
73 | |
68 |