Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
Abhaykumar
Microsoft Employee
Microsoft Employee

Getting distinct values for multiple columns

Is there a way to get distinct values for multiple columns? Specifically, I want to replicate the following SQL query into PowerBI to create a new table:

 

SELECT DISTINCT Col1, Col2, Col3 FROM TableA;

I can find the DISTINCT keyword, but it only supports one column. 

 

 

2 ACCEPTED SOLUTIONS
ankitpatira
Community Champion
Community Champion

@Abhaykumar You can also select individual columns in query editor and click Remove Duplicates.

 

Capture.PNG

View solution in original post

achinm45
Advocate IV
Advocate IV

Hi @Abhaykumar,

 

Use DAX and in DAX the function SUMMARIZE().

Summarize will pull out distinct values from columns.

 

So your resultset will be new table from Summarize

Table_Output =  Summarize(Table_IN,Col1,Col2,Col3)

 

I hope it helps !

 

BR,

Achin

View solution in original post

31 REPLIES 31
OlivierLam
New Member

You can use the Table.Group function without any aggregation.
= Table.Group(Source, {"Col1", "Col2", "Col3"}, {})

BrianVT
Resolver I
Resolver I

Hi @ImkeF 

 

I followed your approach in this thread to create my first manual query in Power BI.  I wanted to create this list of distinct values from two different columns from two tables, so that I could use the newly created column as the relational data point between those two tables/columns.  However, I'm getting the ambiguity error when trying to relate the second column to it.  Is that not the purpose of creating this union list?  I also tried creating a table that did the same thing, and the ambiguity error popped up there as well.

 

Is there a way to create a stand alone column that is the list of unique values from two columns for the purpose of relating those two columns in one-to-many relationships?

Hi @BrianVT 

sorry, but I cannot follow.

Could you possibly post some sample data?

 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

@ImkeF 

 

 = List.Distinct(List.Union({#"Obligations by FY"[Project Number],#"CJI3 Actual Cost Data"[Project Number]}))

 

So, that's the query I created.  In my model, I then try to create a relationship between this new column, and the two different "Project Number" columns, so that the new "Project Number" column can be used for filtering purposes in visuals.

 

I can create one relationship, but upon creating the second relationship, I get the error about ambiguous paths.  So, I can't do what I wanted with the new query/column. 

I guess I'll post a new question.  Was just trying to figure out a quick way to make the "one" side of my relationships for two tables that is dynamic, vs. manually joining them into a table to bring in on my own and having to manually refresh that table with new unique values all the time.  Creating the table within Power BI by joining the unique values from the two tables seemed like the easiest way to do it, which landed me here.

Hi Brian, 

 

I have a similar use case  - were you able to figure this out?

Sorry, I can't remember what I ended up doing, and don't have access to that file anymore.  More than likely, I solved it via a PowerBI table function, rather than M code. 

 

I remember some situations where I would create a new table that was a union of two other tables, and then create another table that just references that union table (just one column), and somehow making my relationships off of this "3rd leg" table worked without the ambguity errors that happen when using that "in-between" table.  Can't remember if that was my fix for this situation, but you might give that a try.  It's some kind of loophole (at least it is in my mind; I'm sure someone smarter has a techncial explanation for why that works).

achinm45
Advocate IV
Advocate IV

Hi @Abhaykumar,

 

Use DAX and in DAX the function SUMMARIZE().

Summarize will pull out distinct values from columns.

 

So your resultset will be new table from Summarize

Table_Output =  Summarize(Table_IN,Col1,Col2,Col3)

 

I hope it helps !

 

BR,

Achin

perfect!

 

Hi achinm45,

 

Receiving the following error when using your suggested function (below; names anonymized):

"The expression refers to multiple columns. Multiple columns cannot be converted to a scalar value."

 

SUMMARIZE(table,[col1],[col2],[col3])
 
Thank You!

I'm getting the same thing. Did you find a solution?

Despite the existence of many solutions in Power BI (Query Editor, DAX, etc), this seams to be the most elegante solution.
Thants for your hint @achinm45 

Anonymous
Not applicable

I am trying to create a table that shows the person's unique code, department and name.  I am using the Summarize function and am getting duplicates.  How do I exclude the duplicate that does not get associated with a department?

 

Table = Summarize('MO YTD Jan - June Revenue','MO YTD Jan - June Revenue'[Employee Code],'MO YTD Jan - June Revenue'[Department Name])
 
Distinct.JPG
 
Thank you!
 
Kim

If you do it in DAX, just be aware that the full table must be loaded into the data model. Doing it in the query-editor using M would (mostly) fold back to the server, meaning that the SQL-server does the extraction and returns only the results to PBI to be loaded.

 

So if you need to load the full table anyway, DAX would probably be faster.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

You can also wrap a DISTINCT(SELECTCOLUMNS(table, "col1name", Col1fromtable, table, "col2name", Col2fromtable)

How can I copy distinct values from multiple columns into ONE single colum?

 

The shown above always created multiple columns into which the code copies the distinct values from each column.

 

Thanks!

This formula will create a list of distinct values in the query editor (where "Source" is the name of the table/previous step and column names in square brackets):

 

List.Distinct(List.Union({Source[Col1], Source[Col2], Source[Col3]})) 

Not sure if you want this as a standalone query or merge the resulting list with some other table?

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Anonymous
Not applicable

Hi, 

I'm trying this solution to combine two columns of sales order numbers from two separate queries and get an error:

"Expression.SyntaxError: Invalid identifier"

 

here is my code:

= list.distinct(list.union({"ZSD026 - Sales Analysis Report"[#"Sales Order No"],"ZSD029 - Invoice Report Material Level"[#"Sales Order No"}))

Hi @Anonymous ,

the M-language is case sensitive.

 



 

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

Anonymous
Not applicable

Hi ,

 

I'm trying this M code and getting an error "Expression.SyntaxError: Invalid identiffier"

 

Here is the code I'm entering to try and combine sales order number lists from two separate queries:

 

= list.distinct(list.union({"ZSD026 - Sales Analysis Report"[#"Sales Order No"],"ZSD029 - Invoice Report Material Level"[#"Sales Order No"}))

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.