Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.
Hello,
So I have two solutions in place but they are causing some issues. Bascially, I have two tables:
1.) Table 1 has ID which is also present in Table 2
2.) Table 2 has ID and other data fields
Either in M or DAX I want a final table which gives me just one row per ID based on the Insert Date and for other ID's that are not present in Table 2 to return NULL:
Table 1
ID | NAME | ROLE |
1 | ABC | DEVELOPER |
2 | ADAS | QA |
3 | MQE | BA |
4 | RTYA | PM |
5 | GJHT | PM |
Table 2
ID | TASK | DURATION | INSERT DATE |
1 | COMPLETE | 2 HRS | 6/22/2018 16:39 |
1 | INCOMPLETE | 3 HRS | 6/24/2018 15:39 |
1 | INCOMPLETE | 4 HRS | 6/24/2018 18:39 |
1 | INCOMPLETE | 3 HRS | 6/24/2018 15:39 |
Final Result
ID | TASK | DURATION | INSERT DATE |
1 | INCOMPLETE | 4 HRS | 6/24/2018 18:39 |
2 | null | null | null |
3 | null | null | null |
4 | null | null | null |
5 | null | null | null |
My DAX table function and M worked till now but is failing today showing weird results.
Solved! Go to Solution.
This calculated table gets close. It looks long, but it's not that complicated. I have attached a PBIX file
Table = VAR x = SELECTCOLUMNS( Table2 , "IDx",[ID] , "TASK" ,[TASK] , "DURATIONx" , INT(SUBSTITUTE('Table2'[DURATION]," HRS","")) , "INSERT DATE",[INSERT DATE] ) VAR y = GROUPBY( x, [IDx], "MAX_DURATIONx", MAXX(CURRENTGROUP(),[DURATIONx]) ) VAR z= SELECTCOLUMNS( GENERATEALL( 'Table1', FILTER( y, [ID] = [IDx] ) ), "IDz",[ID], "MAX_DURATOINx",[MAX_DURATIONx] ) RETURN SELECTCOLUMNS( GENERATEALL( z, FILTER( 'Table2',[ID] = [ID] && [MAX_DURATOINx] & " HRS" = 'Table2'[DURATION] ) ), "ID",[IDz], "TASK",[TASK], "DURATION",[DURATION], "INSERT DATE",[INSERT DATE] )
Yes there are many sites which explains how to do this. For example,Please refer the below URL which explains the left outer join in DAX.
https://curbal.com/blog/joining-table-in-power-bi-with-power-query-and-dax
Thanks
Raj
Hi,
This M code works
let
Source = Table.NestedJoin(Table1,{"ID"},Table2,{"ID"},"Table2",JoinKind.LeftOuter),
#"Expanded Table2" = Table.ExpandTableColumn(Source, "Table2", {"TASK", "DURATION", "INSERT DATE"}, {"TASK", "DURATION", "INSERT DATE"}),
#"Grouped Rows" = Table.Group(#"Expanded Table2", {"ID"}, {{"Max", each List.Max([INSERT DATE]), type datetime}}),
#"Merged Queries" = Table.NestedJoin(#"Grouped Rows",{"ID", "Max"},Table2,{"ID", "INSERT DATE"},"Table2",JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "Table2", {"TASK", "DURATION"}, {"TASK", "DURATION"}),
#"Reordered Columns" = Table.ReorderColumns(#"Expanded Table1",{"ID", "TASK", "DURATION", "Max"})
in
#"Reordered Columns"
This calculated table gets close. It looks long, but it's not that complicated. I have attached a PBIX file
Table = VAR x = SELECTCOLUMNS( Table2 , "IDx",[ID] , "TASK" ,[TASK] , "DURATIONx" , INT(SUBSTITUTE('Table2'[DURATION]," HRS","")) , "INSERT DATE",[INSERT DATE] ) VAR y = GROUPBY( x, [IDx], "MAX_DURATIONx", MAXX(CURRENTGROUP(),[DURATIONx]) ) VAR z= SELECTCOLUMNS( GENERATEALL( 'Table1', FILTER( y, [ID] = [IDx] ) ), "IDz",[ID], "MAX_DURATOINx",[MAX_DURATIONx] ) RETURN SELECTCOLUMNS( GENERATEALL( z, FILTER( 'Table2',[ID] = [ID] && [MAX_DURATOINx] & " HRS" = 'Table2'[DURATION] ) ), "ID",[IDz], "TASK",[TASK], "DURATION",[DURATION], "INSERT DATE",[INSERT DATE] )
@Phil_Seamark - Hey Phil, I tried adding another field from Table 2 to the final outcome and for some reason it stopped returning the other columns, In order to add any more additional columns should that column be reference in every variable defined?
@Phil_Seamark thanksa lot again Phil, last time you gave me a similar solution and this was an expansion to that and like previously I again appreciate you taking out time to detail it out and laying out new DAX functions which always expands my knowledge. I am customizing this to my need, but it is working so far. Thanks a lot again!
Giving this a shot, my actual tables have more data fields than I listed but hopefully my tweaks take them all.
I just tried in different way.
Step 1 : Lets take your Table 2. Add new column with Max insert date for each ID.
Max_Insert_Date = CALCULATE(MAX(Table2[INSERT DATE]),FILTER(Table2,Table2[ID]= EARLIER(Table2[ID])))
Step 2: From step 1,Get the Latest record only
Table3 = FILTER(Table2,Table2[INSERT DATE]=Table2[Max_Insert_Date])
Step 3: Do a left outer join between Table 1 ( Original table - Just take ID column alone) and Table 3 ( derived in step 2- Select all columns but ID).
You can combine these steps as well to reduce the steps .
Thanks
Raj
I am gettign error on the second step. The error is "the expression referrs to multiple columsn. Multiple columns cannot be converted to scalar value"
@nirvana_moksh wrote:I am gettign error on the second step. The error is "the expression referrs to multiple columsn. Multiple columns cannot be converted to scalar value"
Please do the second step in Modelling -> Nea Table and use the formula. SInce the formula returns multiple colums/ rows, we cant use it as measure/ calculated column.
Thanks
Raj
I See, for the last step are you suggesting to use the NATURALLEFTOUTERJOIN in Dax?
Yes there are many sites which explains how to do this. For example,Please refer the below URL which explains the left outer join in DAX.
https://curbal.com/blog/joining-table-in-power-bi-with-power-query-and-dax
Thanks
Raj
Thanks for the article @Anonymous but the last step is something that I always have an issue with because of the PBI has joins in DAX and how it doesnt recognize the join columns and I was running into that again for some reason.