Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
mario_ruiz
Helper II
Helper II

Difference between Distinct and Unique when using column distribution

When using the Transform Data feature, over the View tab, there's an option that we can check: Column distribution, which brings distinct and unique values... What's the difference between these values? By Distinct values can I assume there are duplicates of the same value and for unique there are no duplicates? I already went into the documentation but it still is not clear. Thanks. 

 

mario_ruiz_0-1661916957175.png

 

 

https://docs.microsoft.com/en-us/power-query/data-profiling-tools

4 REPLIES 4
AbhinavJoshi
Responsive Resident
Responsive Resident

Hello, does distinct calculates null too? 

 

stumit
Frequent Visitor

A good question given that the link that mario_ruiz's description of the "definitive" MicroDoc he linked to falls short.
The page https://docs.microsoft.com/en-us/power-query/data-profiling-tools mentions the terms 'Distinct' and 'Unique' but provides no definition.

MahyarTF and HoangHugo describe it perfectly.

I did a test to see the facts. Findings below the image, some interesting.

Distinct is the count of distinct values as in SQL:
Select count(distinct colname)
from tablename;

Unique is equivalent of:
with uniq(ResellerKey, cnt)
as
(
select ResellerKey, count(*) as cnt
from FactResellerSales
group by ResellerKey
having count(*) = 1
)
select count(*)
from uniq

Enough of the SQL.

This image demonstrates the results. From Excel, the blanks showed null. From CSV with consecutive commas it showed blank. Rather than two images I change the null to blank in the 4th column of the Excel load.

It is clear that null and blank are counted in Distinct. Multiple null/blank did not make it into Unique. Simple image for you to count yourself to full understand.

UPDATED IMAGE: I put a blank in No Duplicates to show that blank/null can be considered Distinct.

Interestingly, I had to close PBI Desktop to update the Empty to 7% in No Duplicates column. Refresh didn't do it. Neither did Apply and Close and then Transform. The profiling must be cached.

 

PBE_Distribution_Distinct_vs_Unique.png

 

 

 

MahyarTF
Memorable Member
Memorable Member

Hi,

Distinct mean : count all the values as 1, even if there was more than one.

Unique mean : count only the value that are not repeated in the particular column

MahyarTF_0-1661921346703.png

Appreciate your Kudos

Mahyartf
HoangHugo
Solution Specialist
Solution Specialist

Hi,

I found it from Online

“Distinct” means total number of different values regardless how many times it appears in the dataset. A name appears in the list multiple times is counted as 1 distinct count.

Whereas, the “Unique” value is total number of values that only appear once. 

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.