Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Be one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now

Reply
Anonymous
Not applicable

Power Query: Extract distinct values from column as new query

Hi,

 

I have a fairly large query with 1,5 million rows, which is growing by around 5000 rows per day. From that query I want to extract the distinct values from one specific column and create a new table with those values, so I can use it as a dimension and for forming relationships in my model.

 

Right now, I have referenced the first query, deleted all the columns besides the one I need, and then done a "Remove Duplicates", with a result of two distinct text values. This works, but at query refresh, this simple table containing only two text values as a final result will load all 1,5 million rows first, and then clean it up.

 

Is there a more efficient way to complete the same task in Power Query, so I can reduce refresh time?

1 ACCEPTED SOLUTION
v-shex-msft
Community Support
Community Support

Hi @Anonymous,

 

You can try to create a blank query to reference original source, then use list.distinct to remove duplicate records.

 

let
    DistinctSource = List.Distinct(Sheet2[Date])// List.Distinct(QueryName[ColumnName])
in
    DistinctSource

 

Sheet2(22582 rows) -> Date(953 rows)

9.PNG

 

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

View solution in original post

16 REPLIES 16
vangiang
Regular Visitor

hi All. i noticed the icon in front of that Query 2 looks very thin, wonder if that indicates distinct values? Thanks in advance.

vangiang_0-1730736586736.png

 

Hi @vangiang , not sure if this is still helpful, but that means that your query is in List form. The implication is that most of the operations you can apply to the data, like Merge, Append, Replace values and what-not are not available.

To hijack the response given above, you can convert it to a table using the following:

let
    DistinctSource = Table.FromList(List.Distinct(Sheet2[Date]))
in
    DistinctSource
Stephen_Sink
Frequent Visitor

I struggled for a while with this becuase there is some key puntuation missing from this solution. Here's what worked for me - I am refrenceing another query

= List.Distinct(#"QueryName"[ColumnName])



Key things missing was having the # and "" in the right spot

HairyDrumroll
Advocate I
Advocate I

In the Power Query editor:

Right click on your existing query and choose "Reference".

Use "Choose Columns" to select only the column which you want your unique values generated from.

Right click on the column and choose "Remove Duplicates"

I like this solution because it allows you to continue to use table tools and create new custom column off the de-duplicated list. Power BI Guy's solution may be more effecient however.

This is the users current solution, however this is inefficient as it loads the entire table and then performs the transformations.

 

Current best solution is:

 

let
    DistinctSource = List.Distinct(Sheet2[Date])// List.Distinct(QueryName[ColumnName])
in
    DistinctSource

Your code is slightly wrong. I've edited it below 🙂

 

let
    DistinctSource = List.Distinct(Sheet2,"Date")// List.Distinct(QueryName,"ColumnName")
in
    DistinctSource

 

Thanks! This is just the alternative I was looking for.

Anonymous
Not applicable

Hello

I faced the same problem you asked for and i am quite new to power bi and mostly know its basics.

what i did was i created a new table in power bi desktop and  used the formula

 

NewTable = DISTINCT(Tablename [field])
 
and it worked.
v-shex-msft
Community Support
Community Support

Hi @Anonymous,

 

You can try to create a blank query to reference original source, then use list.distinct to remove duplicate records.

 

let
    DistinctSource = List.Distinct(Sheet2[Date])// List.Distinct(QueryName[ColumnName])
in
    DistinctSource

 

Sheet2(22582 rows) -> Date(953 rows)

9.PNG

 

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

Thanks for posting the solution, it definitely helped me tackle the problem that I was facing.

This works. But the query still refreshes all the 1.5 million records! So it does not solve the question.

That means you still have to wait for the query to finish once more. This is made like this to "make shure" Power-Bi have all the data once more pulled from the souse. Quite annoying actually.    

I have a question for this. I am trying to do this myself:
you write // List.Distinct(QueryName[ColumnName]), is this the new query you're working in?

If not, what is it?

Hi @BiBra,

 

Nope, it just means the description of my formula.

 

Regards,

Xiaoxin Sheng

Community Support Team _ Xiaoxin
If this post helps, please consider accept as solution to help other members find it more quickly.

@v-shex-msft Okay, however the code does not work for me

I get this:

fault2.PNG

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Dec Fabric Community Survey

We want your feedback!

Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.