Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!

Reply
mario_ruiz
Helper II
Helper II

Difference between Distinct and Unique when using column distribution

When using the Transform Data feature, over the View tab, there's an option that we can check: Column distribution, which brings distinct and unique values... What's the difference between these values? By Distinct values can I assume there are duplicates of the same value and for unique there are no duplicates? I already went into the documentation but it still is not clear. Thanks. 

 

mario_ruiz_0-1661916957175.png

 

 

https://docs.microsoft.com/en-us/power-query/data-profiling-tools

4 REPLIES 4
AbhinavJoshi
Resolver III
Resolver III

Hello, does distinct calculates null too? 

 

stumit
Frequent Visitor

A good question given that the link that mario_ruiz's description of the "definitive" MicroDoc he linked to falls short.
The page https://docs.microsoft.com/en-us/power-query/data-profiling-tools mentions the terms 'Distinct' and 'Unique' but provides no definition.

MahyarTF and HoangHugo describe it perfectly.

I did a test to see the facts. Findings below the image, some interesting.

Distinct is the count of distinct values as in SQL:
Select count(distinct colname)
from tablename;

Unique is equivalent of:
with uniq(ResellerKey, cnt)
as
(
select ResellerKey, count(*) as cnt
from FactResellerSales
group by ResellerKey
having count(*) = 1
)
select count(*)
from uniq

Enough of the SQL.

This image demonstrates the results. From Excel, the blanks showed null. From CSV with consecutive commas it showed blank. Rather than two images I change the null to blank in the 4th column of the Excel load.

It is clear that null and blank are counted in Distinct. Multiple null/blank did not make it into Unique. Simple image for you to count yourself to full understand.

UPDATED IMAGE: I put a blank in No Duplicates to show that blank/null can be considered Distinct.

Interestingly, I had to close PBI Desktop to update the Empty to 7% in No Duplicates column. Refresh didn't do it. Neither did Apply and Close and then Transform. The profiling must be cached.

 

PBE_Distribution_Distinct_vs_Unique.png

 

 

 

MahyarTF
Memorable Member
Memorable Member

Hi,

Distinct mean : count all the values as 1, even if there was more than one.

Unique mean : count only the value that are not repeated in the particular column

MahyarTF_0-1661921346703.png

Appreciate your Kudos

Mahyartf
HoangHugo
Solution Specialist
Solution Specialist

Hi,

I found it from Online

“Distinct” means total number of different values regardless how many times it appears in the dataset. A name appears in the list multiple times is counted as 1 distinct count.

Whereas, the “Unique” value is total number of values that only appear once. 

Helpful resources

Announcements
April AMA free

Microsoft Fabric AMA Livestream

Join us Tuesday, April 09, 9:00 – 10:00 AM PST for a live, expert-led Q&A session on all things Microsoft Fabric!

March Fabric Community Update

Fabric Community Update - March 2024

Find out what's new and trending in the Fabric Community.