
DAX_merchant
Helper II

Lookup to unrelated table in calculated column

Dear forums,

 

It has come to my attention that LOOKUPVALUE is an unreliable function when used on large-scale tables. I don't quite understand why and would love some insight on this.

 

Along the same lines, I was told by a super user friend of mine that TREATAS is best practice. 

I recently experienced this: 

Two unrelated tables, with a common key. 

A calculated column like so (pseudocode):
VAR id = CurrentTable[ID]
RETURN
    CALCULATE ( MAX ( UnrelatedTable[Column] ), UnrelatedTable[ID] = id )

This returned blanks even though the ID columns have the same data type, the result rows were not blank, and there are no duplicates in the lookup table. Basically, none of the low-hanging-fruit fixes apply here; this lookup should work.

 

The table this column is being evaluated in has ~9M rows. Copilot said that with large datasets, storing the variable itself can sometimes fail, along with the context transition from CALCULATE being used in a calculated column.

First, is there any truth to this? 
Second, why does TREATAS work? It is still being called within the CALCULATE, so if my formula looks like:

CALCULATE ( MAX ( UnrelatedTable[Column] ), TREATAS ( VALUES ( CurrentTable[ID] ), UnrelatedTable[ID] ) )

Shouldn't this fail just as often as the other variant? 

 

Any insight into this seemingly random failure, whether it comes from the context transition, the variable storage, or anything else that would cause an otherwise straightforward lookup to fail at scale, would be greatly appreciated.


11 REPLIES
v-sgandrathi
Community Support

Hi @DAX_merchant,

Thank you for bringing this up, and I'd like to acknowledge @MattAllington for their clear explanation.

Their assessment is correct: LOOKUPVALUE can be less reliable in large-scale calculated columns, as it often requires decompressing data and performing row-by-row operations, which can be resource-intensive in Power BI's VertiPaq engine. In comparison, TREATAS is more efficient: it applies filters directly within the compressed model, giving better performance and reliability, especially when working with unrelated tables. While both methods may seem similar, TREATAS is better optimized for context transition and scalability, which explains why it works where LOOKUPVALUE runs into limitations.
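For reference, a minimal sketch of the two patterns being contrasted, written as calculated columns on a generic TableA (table and column names are placeholders, not from the actual model):

// LOOKUPVALUE pattern: resolves the matching TableB row for every row of TableA
Lookup via LOOKUPVALUE =
LOOKUPVALUE ( TableB[ColumnB], TableB[ColumnA], TableA[ColumnA] )

// TREATAS pattern: applies the current row's key to TableB as a filter, then aggregates.
// As discussed further down the thread, an existing active relationship between the
// two tables may additionally require ALL() to undo the filters from context transition.
Lookup via TREATAS =
CALCULATE (
    MAX ( TableB[ColumnB] ),
    TREATAS ( { TableA[ColumnA] }, TableB[ColumnA] )
)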

Thank you.

Hi @DAX_merchant,

 

We wanted to follow up since we haven't heard back from you regarding our last response. We hope your issue has been resolved.

If you need any further assistance, feel free to reach out.

 

Thank you for being a valued member of the Microsoft Fabric Community Forum!

Hi @DAX_merchant,

 

Just wanted to check regarding your question. We haven’t heard back and want to ensure you're not stuck. If you need anything else or have updates to share, we’re here to help!

 

Thank you.

Hi @DAX_merchant,

 

 

Just wanted to follow up and confirm that everything has been going well on this. Please let me know if there's anything you need from our end.
Please feel free to reach out to the Microsoft Fabric Community Forum if you need any further assistance.

MattAllington
Community Champion

I have some insights; others may have more. I don't know the specifics of the LOOKUPVALUE failure, but I know a bit about the way the database is built. Originally, Microsoft used the name "semantic model". They then changed it, and then changed it back (it is again the current term). I wrote about it here: https://exceleratorbi.com.au/why-did-microsoft-rename-datasets-to-semantic-models/

Long story short, the data is not stored in discrete tables as the semantic model implies; instead it is stored in complex compressed structures, where multiple semantic tables can physically end up in a single compressed table structure.

 

Now, I can't give you a clear answer to your question, but what I learnt over my 8 years as an MVP was that sometimes it's just easier to accept that "it's just the way it is" due to the complexity described above. So I'm making this next bit up (based on experience, not fact), but my assumption is that TREATAS is optimised to filter tables in their compressed state; it can then extract the related keys (which likely are not the semantic keys you see in the table view) and correctly apply those to the other semantic table. If you then consider LOOKUPVALUE, almost by definition it uses the opposite approach: it requires the compressed data to be uncompressed and the semantic tables you see in the model to be recreated before the lookup is completed using more traditional logic. I can't say why it fails (as you have said), but I can say that, at least conceptually, it's a much harder task for the database to complete.
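To make that concrete, here is a minimal sketch (names assumed, not taken from the poster's model) of TREATAS acting as a "virtual relationship" in a measure; the calculated-column variants discussed further down the thread are variations of this shape:

// TREATAS transfers whatever TableA keys are visible in the current filter context
// onto TableB, without any physical relationship between the two tables.
Max ColumnB via TREATAS =
CALCULATE (
    MAX ( TableB[ColumnB] ),
    TREATAS ( VALUES ( TableA[ColumnA] ), TableB[ColumnA] )
)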

HTH



* Matt is an 8-time Microsoft MVP (Power BI) and author of the Power BI book Supercharge Power BI.
I will not give you bad advice, even if you unknowingly ask for it.

Matt,

Thank you for the reply and explanation. I have since encountered some errors with TREATAS that I will describe below:

 

Apologies in advance for this long message; this has just been so confusing for me...

 

First, let's imagine the model consists of two tables.
Table A: ~10M rows. Column A is the ID (plenty of duplicates) and column B is to be the lookup (calculated) column. There is also a column C that has an existing relationship to column A in Table B, but I want the lookup to be based on column A.
Table B: ~5K rows. Column A is the unique ID (no dupes, same data type as column A in Table A) and column B is the column I want to bring into Table A.

I have tried 5 versions of a lookup calculated column:

1.
CALCULATE(MAX(tableB[columnB]), tableB[columnA] = tableA[columnA])

2.
VAR currentval = tableA[columnA]
RETURN
CALCULATE(MAX(tableB[columnB]), tableB[columnA] = currentval)

3.
CALCULATE(MAX(tableB[columnB]), TREATAS(VALUES(tableA[columnA]), tableB[columnA]))

4.
CALCULATE(MAX(tableB[columnB]), TREATAS({tableA[columnA]}, tableB[columnA]))

5.
CALCULATE(MAX(tableB[columnB]), TREATAS({tableA[columnA]}, tableB[columnA]), ALL())

Only the 5th version behaves as expected.

 

I can understand why the first two don't work: similarly to LOOKUPVALUE, maybe the implicit filters in CALCULATE need to evaluate the comparison row by row?

 

However, the last three differ only in the use of VALUES and the use of ALL(). 

The LOOKUPVALUE – DAX Guide article shows the use of TREATAS in place of LOOKUPVALUE and uses ALL(), but for the life of me I can't understand why this matters.

Furthermore, when I use VALUES, I get a different error: instead of the lookup sometimes being blank when there should be something, there are times when the lookup should return a blank (for example, when tableA[columnA] is blank) and it instead returns what seems to be a random value from tableA[columnB].

 

What I am really confused about is why ALL() seemingly fixes all these issues. I thought that since I was forcing context with TREATAS or the implicit filter, ALL() shouldn't do anything. But since there is an existing relationship, does CALCULATE do the context transition based on the value of columnC (the column that has an existing active relationship with table B) and then show only the row that matches that index?

 

I guess I don't even really understand the context transition of CALCULATE in this case, since the row-to-filter context transition would make sense if I was referencing tableA, not tableB. Any help would be appreciated!

 

My first response is "why do you want to do this?" The semantic model is designed to work with separate tables. If TableB[column A] is unique (primary key) and TableA[column A] contains duplicates of the same data (foreign key), then why don't you just create a 1-to-many relationship between the two columns? Once done, you can build a visual containing columns from Table B and Table A as if they were one table. This is exactly what the semantic model was built to do; no lookups needed.
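And if a stored column really were still needed once that 1-to-many relationship exists, the relationship-based way to get it would be RELATED rather than CALCULATE/TREATAS; a minimal sketch (column names assumed), as a calculated column on Table A:

// Follows the active many-to-one relationship from Table A to Table B.
ColumnB from TableB = RELATED ( TableB[ColumnB] )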




Matt, good point. 

You can imagine it like the classic order date vs invoice date. Order date is used for most things, but invoice date might need to be related at some point. Of course this works in a measure, but in a calculated column I have not found a way to use RELATED inside a CALCULATE with USERELATIONSHIP.

The point is, you probably don't need it in a calculated column. Just build a table visual on the reporting canvas using columns from both tables. As long as you have a measure from Table A in the visual, it should work as you would expect. 
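For the order date vs invoice date case specifically, the measure-level equivalent is USERELATIONSHIP inside CALCULATE; a sketch under assumed names (the Amount column and Date table are hypothetical):

// Activates the otherwise-inactive invoice-date relationship for this measure only.
Amount by Invoice Date =
CALCULATE (
    SUM ( TableA[Amount] ),
    USERELATIONSHIP ( TableA[InvoiceDate], 'Date'[Date] )
)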




Matt, I will try this out. Despite this, I am still curious as to why the output of these columns would differ. From what you have told me, I am assuming that DAX is simply not designed to perform large scale lookups like this, especially when using sub optimal syntax. 

I have the habit of performing validation in the table view, but it seems like this is not best practice. I want to reiterate that I completely understand where you are coming from, but I still have the desire to understand the software that I am using to a level where I know when to use certain functions versus others, not just from experience, but because I know what is going on under the hood. 

 

Thanks again for your help!

Yes, DAX *is* designed to perform large-scale lookups like this, using 1-to-many relationships between tables with a PK and FK. What it's not designed to do is materialise the results into a new column of permanently stored data in one of the tables. That goes against everything the tool was built to do.

I'm not sure what you mean by "validation". If you mean you want to check whether you have any mismatches, then you can do that without materialising a new column; you just have to think differently. I often build an audit page in a report. For example, you could add a table visual, add the columns from Table B and Table A, and a measure COUNTROWS(Table A). You can use this to find missing values in A and/or B.
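As one possible shape for such an audit measure (a sketch only; table and column names assumed), counting Table A rows whose key has no match in Table B:

// ALL() ignores any report filters on TableB so the check runs against the full key list.
// A result of blank/zero means every TableA key is covered by TableB.
Unmatched TableA Rows =
COUNTROWS (
    FILTER (
        TableA,
        NOT ( TableA[ColumnA] IN ALL ( TableB[ColumnA] ) )
    )
)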



