March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hi,
I have a fairly large query with 1,5 million rows, which is growing by around 5000 rows per day. From that query I want to extract the distinct values from one specific column and create a new table with those values, so I can use it as a dimension and for forming relationships in my model.
Right now, I have referenced the first query, deleted all the columns besides the one I need, and then done a "Remove Duplicates", with a result of two distinct text values. This works, but at query refresh, this simple table containing only two text values as a final result will load all 1,5 million rows first, and then clean it up.
Is there a more efficient way to complete the same task in Power Query, so I can reduce refresh time?
Solved! Go to Solution.
Hi @Anonymous,
You can try to create a blank query to reference original source, then use list.distinct to remove duplicate records.
let DistinctSource = List.Distinct(Sheet2[Date])// List.Distinct(QueryName[ColumnName]) in DistinctSource
Sheet2(22582 rows) -> Date(953 rows)
Regards,
Xiaoxin Sheng
hi All. i noticed the icon in front of that Query 2 looks very thin, wonder if that indicates distinct values? Thanks in advance.
Hi @vangiang , not sure if this is still helpful, but that means that your query is in List form. The implication is that most of the operations you can apply to the data, like Merge, Append, Replace values and what-not are not available.
To hijack the response given above, you can convert it to a table using the following:
let DistinctSource = Table.FromList(List.Distinct(Sheet2[Date])) in DistinctSource
I struggled for a while with this becuase there is some key puntuation missing from this solution. Here's what worked for me - I am refrenceing another query
= List.Distinct(#"QueryName"[ColumnName])
Key things missing was having the # and "" in the right spot
In the Power Query editor:
Right click on your existing query and choose "Reference".
Use "Choose Columns" to select only the column which you want your unique values generated from.
Right click on the column and choose "Remove Duplicates"
I like this solution because it allows you to continue to use table tools and create new custom column off the de-duplicated list. Power BI Guy's solution may be more effecient however.
This is the users current solution, however this is inefficient as it loads the entire table and then performs the transformations.
Current best solution is:
let DistinctSource = List.Distinct(Sheet2[Date])// List.Distinct(QueryName[ColumnName]) in DistinctSource
Your code is slightly wrong. I've edited it below 🙂
let DistinctSource = List.Distinct(Sheet2,"Date")// List.Distinct(QueryName,"ColumnName") in DistinctSource
Thanks! This is just the alternative I was looking for.
Hello
I faced the same problem you asked for and i am quite new to power bi and mostly know its basics.
what i did was i created a new table in power bi desktop and used the formula
Hi @Anonymous,
You can try to create a blank query to reference original source, then use list.distinct to remove duplicate records.
let DistinctSource = List.Distinct(Sheet2[Date])// List.Distinct(QueryName[ColumnName]) in DistinctSource
Sheet2(22582 rows) -> Date(953 rows)
Regards,
Xiaoxin Sheng
Thanks for posting the solution, it definitely helped me tackle the problem that I was facing.
This works. But the query still refreshes all the 1.5 million records! So it does not solve the question.
That means you still have to wait for the query to finish once more. This is made like this to "make shure" Power-Bi have all the data once more pulled from the souse. Quite annoying actually.
I have a question for this. I am trying to do this myself:
you write // List.Distinct(QueryName[ColumnName]), is this the new query you're working in?
If not, what is it?
Hi @BiBra,
Nope, it just means the description of my formula.
Regards,
Xiaoxin Sheng
I get this:
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
User | Count |
---|---|
90 | |
89 | |
85 | |
73 | |
49 |
User | Count |
---|---|
169 | |
144 | |
90 | |
70 | |
58 |