Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi all,
I am reading this article on joining tables in DAX and I am quite confused as to how these INNERJOINs work. In the section called "Using NATURALLEFTOUTERJOIN and NATURALINNERJOIN without Relationships" the authors claim:
You cannot join these two tables by using ProductKey, because such a these columns havehas the same name but different data lineages in the model. In fact, the following code generates an error:
EVALUATE NATURALLEFTOUTERJOIN( P_A, P_B )
Ok, so from that I understand that the columns need a common lineage. But then a few lines later they state:
In order to join two columns with the same name and no relationships, it is necessary that these columns do not have a data lineage. To obtain that, it is necessary to write the column using an expression that breaks the data lineage, as in the following example.
So with this it seems we have to break the lineage (??). Later, again, the example with the TREATAS seems to be used to create rather than break the lineage. This is where I'm getting lost. Is it break or create lineage that you need for the INNERJOINS to work?
If anyone knows about this I would really appreciate it if you can clarify it for me, or direct me to other resources where I can read about it. I've checked it in another book but it doesn't really go into the details.
Many thanks for your help.
Solved! Go to Solution.
Man... you ask very difficult questions
My limited understanding is following
1) InnerJoin can be created between tables that do not physically exist in the data model. In that case the tables should have a column with same name and same type. For example you can create temporary tables using variables using DATATABLE, Row or other such functions
2) When using INNERJOIN with Tables existing in the Data Model, there should already be a relationship beween these tables.
If there there is no relationship , we can still use INNERJOIN
but first
we need to remove the data lineage (i.e. the tag which identifies the original column in the data model that the values of a column originated from). Otherwise it can create confusion querying physical tables in the data model.
This tag is not removed simply by changing the names of the column using SELECTCOLUMNS or any other simialr function. becasue DAX engine is smart enough to remember where the values originated from. Data lineage is maintained as long as an expression is only made up of one column reference. To beak that lineage we use techniques like adding empty string or a 0 to the column expression.
Hi @AlB,
Hope the blogs below will help you.
https://curbal.com/blog/glossary/naturalleftouterjoin-dax
https://curbal.com/blog/glossary/naturalleftouterjoin-dax
In addition, we could use DAX Studio trace the dax query to analyze the query better.
Best Regards,
Cherry
Hi @v-piga-msft
Thanks for your reply. I checked the videos but, unfortunately, they don't answer my question. In fact the speaker says that she doesn't know how the joins work with no relationships and she is asking for tips herself.
Maybe @Zubair_Muhammad can give us a hand here?
Many thanks
Man... you ask very difficult questions
My limited understanding is following
1) InnerJoin can be created between tables that do not physically exist in the data model. In that case the tables should have a column with same name and same type. For example you can create temporary tables using variables using DATATABLE, Row or other such functions
2) When using INNERJOIN with Tables existing in the Data Model, there should already be a relationship beween these tables.
If there there is no relationship , we can still use INNERJOIN
but first
we need to remove the data lineage (i.e. the tag which identifies the original column in the data model that the values of a column originated from). Otherwise it can create confusion querying physical tables in the data model.
This tag is not removed simply by changing the names of the column using SELECTCOLUMNS or any other simialr function. becasue DAX engine is smart enough to remember where the values originated from. Data lineage is maintained as long as an expression is only made up of one column reference. To beak that lineage we use techniques like adding empty string or a 0 to the column expression.
@AlB thanks for this great question !!!
@Zubair_Muhammad thanks for the amazing explanation !!!
Thanks very much for your reply. I guess it's a good sign that the questions are difficult
Your tips were very useful, particularly the difference between physical tables and those created dynamically. After testing a few things my conclusion is that:
1. Tables created dynamically from scratch, such as with DATATABLE or the table constructor: { ("a", 1) , ("b", 2) }, do not have that lineage tag.
2. Columns in physical tables always have a lineage tag.
In the first case there's no lineage so there's no problem for the joins ; it seems like "no lineage" on both columns is considered "same lineage" by the engine. In the second, there's lineage and it's different. It needs to be removed so that we have two columns with no lineage that can now be joined.
What do you think?
Thanks a lot
And since you like tough questions I have something else regarding the same article on SQLBI.
(Disclaimer: I have posted this question there as well, in case you prefer to wait for their answer. That might be soon or take quite long, depending on how busy they are).
They show an interesting way of creating common lineage through TREATAS before joining the tables:
EVALUATE
VAR B_TreatAs =
TREATAS ( P_A, P_B[ProductKey], P_A[Code], P_A[Color] )
VAR Result =
NATURALLEFTOUTERJOIN ( B_TreatAs, P_B )
RETURN
Result
I have two variations of this code and corresponding questions:
1. The following does not work:
EVALUATE
VAR Result =
CALCULATETABLE(NATURALLEFTOUTERJOIN ( P_A, P_B ), TREATAS ( P_A, P_B[ProductKey], P_A[Code], P_A[Color] ))
RETURN
Result
Here we perform the TREATAS in an argument of the CALCULATETABLE. We are providing the CALCULATETABLE with a table (result of the TREATAS) with the adequate lineage changes. Any idea why this wouldn't work?
2. This does not work either
EVALUATE
VAR B_TreatAs =
TREATAS ( P_A, P_B[ProductKey], P_A[Code], P_A[Color] )
VAR Result =
NATURALLEFTOUTERJOIN ( P_A, P_B )
RETURN
Result
The only difference with the initial code we are using P_A as the first argument of the INNERJOIN instead of the variable B_TreatAs. Here is my reasoning as to why this should work:
TREATAS properly modifies the lineage when the variable B_treatAs is created. So P_B[ProductKey] has already "received" the lineage of P_A[ProductKey]. Once that has happened, the Innerjoin should work with both P_A or B_TreatAs.
What is wrong with my reasoning? Or is it that the variable B_TreatAs is actually not created if the variable is not used/consumed?
Many thanks
Hi Guys,
This link provide your answer >>>
https://stackoverflow.com/questions/51432874/how-to-join-summarized-data-from-two-queries-into-new-t...
Just wanted to say thanks. Adding a text value to the column values solved my issue.
Without the "C", the below didn't worked :).
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
107 | |
100 | |
80 | |
63 | |
58 |
User | Count |
---|---|
148 | |
111 | |
94 | |
84 | |
67 |