Solved: Re: Steps for Creating Multiple Step Joins on Diff...

MarkDGaal · ‎05-11-2016

I have two tables, Table 1; always has the full detail:

Location | Category 1 | Category 2 | Category 3 | Category 4 | Hours
WAC      | Plane      |Civil       | Recon      | Air        |  1000
WAC      | Plane      | Military   | Patrol     | Land       |  1000
WAC      | Plane      | Civil      |Survey      |Water       |  1000

Table 2 sometime has the full detail and sometimes it does not and never has the "Hours" field:

ID#| Location | Category 1 | Category 2 | Category 3 | Category 4
001| WAC      | Plane      | Civil      | Recon      | Air            
002| WAC      | Plane      | Civil      | Recon      | Air                    
003| WAC      | Plane      | Military   | Patrol     |            
004| WAC      | Plane      | Civil      |            |

*at minimum Table2 always has a match to Table1 based on Location and Category1

I want to create a multiple step join in which the first time I merge Table1 & Table2 it uses Location & Category 1-4 to bring over a count of matching ID#'s as well as the ID# itself. Then, the next time I only use Location & Category 1-3. Followed by Location & Category 1-2.... etc For a final output that looks like:

Location | Category 1 | Category 2 | Category 3 | Category 4 | Hours | CountMatchingTable2.ID#| Table2.ID#
WAC      | Plane      | Civil      | Recon      | Air        |  1000 | 2                      | 001 (match on Loc & C1-4)
WAC      | Plane      | Civil      | Recon      | Air        |  1000 | 2                      | 002 (match on Loc & C1-4) 
WAC      | Plane      | Military   | Patrol     | Land       |  1000 | 1                      | 003 (match on Loc & C1-3) 
WAC      | Plane      | Civil      |Survey      |Water       |  1000 | 1                      | 004 (match on Loc & C1-2)

Thus far I've created Table2 Grouped by Location & Category 1-4, Table 2 grouped by Location & Category 1-3, Table2 Grouped by Location & Category 1-2, etc... Then, I merge (left outer join) Table1 with Table2 on Location and Category1-4. This duplicates the first record in Table1 and ascribes ID# 001 and ID#002 to the records. The 3rd and 4th Records in Table1 are null because Category 4 and 3 do not match respectively.

Now my problem is I would like to only do the remaining (progressively less detailed) joins on the unmatched records in Table1. However, I can't figure out how to do that exclusively on the unmatched records and when I try to merge Table1 again with Table2 on Location & Category1-3 I duplicate the already matched (1st and 2nd records) from Table1. The end goal here is to divide Hours by the number of matching records in Table2 and then create a relationship back to Table2 using ID#.

Also if there is a way to do this without having 4 separate queries for Table2 grouped 4 different ways kindly let me know. Thanks in advance.

ImkeF · ‎05-11-2016

Somehow this request feels a bit strange, but anyway - here we go:

// Table2
let
    Source = YourTable2,
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Location", type text}, {"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}})
in
    #"Changed Type"

// Table1
let
    Source = YourTable1,
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Location", type text}, {"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}, {"Hours", Int64.Type}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    MatchCat4 = Table.NestedJoin(#"Added Index",{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},DistinctCats,{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},"Cat4",JoinKind.LeftOuter),
    StrikesCat4 = Table.AggregateTableColumn(MatchCat4, "Cat4", {{"First", List.Last, "Cat4"}}),
    MatchCat3 = Table.NestedJoin(StrikesCat4,{"Location", "Category 1", "Category 2", "Category 3"},DistinctCats,{"Location", "Category 1", "Category 2", "Category 3"},"NewColumn",JoinKind.LeftOuter),
    StrikesCat3 = Table.AggregateTableColumn(MatchCat3, "NewColumn", {{"First", List.Last, "Cat3"}}),
    MatchCat2 = Table.NestedJoin(StrikesCat3,{"Location", "Category 1", "Category 2"},DistinctCats,{"Location", "Category 1", "Category 2"},"NewColumn",JoinKind.LeftOuter),
    StrikesCat2 = Table.AggregateTableColumn(MatchCat2, "NewColumn", {{"First", List.Last, "Cat2"}}),
    HighestMatchLevel = Table.AddColumn(StrikesCat2, "HighestMatchLevel", each List.Min({[Cat4],[Cat3],[Cat2]}))
in
    HighestMatchLevel

// DistinctCats
let
    Source = Table2,
    #"Grouped Rows" = Table.Group(Source, {"Location", "Category 1", "Category 2", "Category 3", "Category 4"}, {{"First", each List.Min([ID]), type number}, {"CountIDs", each Table.RowCount(_), type number}})
in
    #"Grouped Rows"

// Result
let
    Source = Table2,
    MergeDistinctCats = Table.NestedJoin(Source,{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},DistinctCats,{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},"NewColumn",JoinKind.LeftOuter),
    ShowLookups = Table.ExpandTableColumn(MergeDistinctCats, "NewColumn", {"First", "CountIDs", "AllIDs"}, {"First", "CountIDs", "AllIDs"}),
    MergeTable1OnID = Table.NestedJoin(ShowLookups,{"First"},Table1,{"HighestMatchLevel"},"NewColumn",JoinKind.LeftOuter),
    ShowTotalHours = Table.ExpandTableColumn(MergeTable1OnID, "NewColumn", {"Hours"}, {"TotalHours"}),
    CalculateHoursPerID = Table.AddColumn(ShowTotalHours, "HoursPerID", each [TotalHours]/[CountIDs])
in
    CalculateHoursPerID

The query-names are given as //... at the beginning of each query

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

View solution in original post

ImkeF · ‎05-11-2016

Somehow this request feels a bit strange, but anyway - here we go:

// Table2
let
    Source = YourTable2,
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", Int64.Type}, {"Location", type text}, {"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}})
in
    #"Changed Type"

// Table1
let
    Source = YourTable1,
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Location", type text}, {"Category 1", type text}, {"Category 2", type text}, {"Category 3", type text}, {"Category 4", type text}, {"Hours", Int64.Type}}),
    #"Added Index" = Table.AddIndexColumn(#"Changed Type", "Index", 0, 1),
    MatchCat4 = Table.NestedJoin(#"Added Index",{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},DistinctCats,{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},"Cat4",JoinKind.LeftOuter),
    StrikesCat4 = Table.AggregateTableColumn(MatchCat4, "Cat4", {{"First", List.Last, "Cat4"}}),
    MatchCat3 = Table.NestedJoin(StrikesCat4,{"Location", "Category 1", "Category 2", "Category 3"},DistinctCats,{"Location", "Category 1", "Category 2", "Category 3"},"NewColumn",JoinKind.LeftOuter),
    StrikesCat3 = Table.AggregateTableColumn(MatchCat3, "NewColumn", {{"First", List.Last, "Cat3"}}),
    MatchCat2 = Table.NestedJoin(StrikesCat3,{"Location", "Category 1", "Category 2"},DistinctCats,{"Location", "Category 1", "Category 2"},"NewColumn",JoinKind.LeftOuter),
    StrikesCat2 = Table.AggregateTableColumn(MatchCat2, "NewColumn", {{"First", List.Last, "Cat2"}}),
    HighestMatchLevel = Table.AddColumn(StrikesCat2, "HighestMatchLevel", each List.Min({[Cat4],[Cat3],[Cat2]}))
in
    HighestMatchLevel

// DistinctCats
let
    Source = Table2,
    #"Grouped Rows" = Table.Group(Source, {"Location", "Category 1", "Category 2", "Category 3", "Category 4"}, {{"First", each List.Min([ID]), type number}, {"CountIDs", each Table.RowCount(_), type number}})
in
    #"Grouped Rows"

// Result
let
    Source = Table2,
    MergeDistinctCats = Table.NestedJoin(Source,{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},DistinctCats,{"Location", "Category 1", "Category 2", "Category 3", "Category 4"},"NewColumn",JoinKind.LeftOuter),
    ShowLookups = Table.ExpandTableColumn(MergeDistinctCats, "NewColumn", {"First", "CountIDs", "AllIDs"}, {"First", "CountIDs", "AllIDs"}),
    MergeTable1OnID = Table.NestedJoin(ShowLookups,{"First"},Table1,{"HighestMatchLevel"},"NewColumn",JoinKind.LeftOuter),
    ShowTotalHours = Table.ExpandTableColumn(MergeTable1OnID, "NewColumn", {"Hours"}, {"TotalHours"}),
    CalculateHoursPerID = Table.AddColumn(ShowTotalHours, "HoursPerID", each [TotalHours]/[CountIDs])
in
    CalculateHoursPerID

The query-names are given as //... at the beginning of each query

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

MarkDGaal · ‎05-12-2016

Hummm, okay I've gotten through the code and I think we have a bit of a misunderstanding, let me illustrate with my original tables:

Table1 (i've added an Index as you did in your code):

Location | Category 1 | Category 2 | Category 3 | Category 4 | Hours |Index
WAC      | Plane      |Civil       | Recon      | Air        |  1000 |1
WAC      | Plane      | Military   | Patrol     | Land       |  1000 |2
WAC      | Plane      | Civil      |Survey      |Water       |  1000 |3

Table2 from my original post:

ID#| Location | Category 1 | Category 2 | Category 3 | Category 4
001| WAC      | Plane      | Civil      | Recon      | Air            
002| WAC      | Plane      | Civil      | Recon      | Air                    
003| WAC      | Plane      | Military   | Patrol     |            
004| WAC      | Plane      | Civil      |            |

When I do the Matching on my most detialed level ID# 001 AND 002 both match with Index #1.

Accordingly in my final output table (Index added):

Location | Category 1 | Category 2 | Category 3 | Category 4 | Hours | Index| CountMatchingTable2.ID#| Table2.ID#
WAC      | Plane      | Civil      | Recon      | Air        |  1000 | 1    | 2                      | 001 (match on Loc & C1-4)
WAC      | Plane      | Civil      | Recon      | Air        |  1000 | 1    | 2                      | 002 (match on Loc & C1-4) 
WAC      | Plane      | Military   | Patrol     | Land       |  1000 | 2    | 1                      | 003 (match on Loc & C1-3) 
WAC      | Plane      | Civil      |Survey      |Water       |  1000 | 3    | 1                      | 004 (match on Loc & C1-2)

Index 1 has duplicated and 1000 will be divided by 2 matching ID's splitting 500 hours to ID#001 and ID#002. I believe by using "List.Last" in your "Strikes" this sort of duplication will not be possible.....

I hear you on this feeling odd, but really all that I am trying to accomplish is making sure that my Hours make it back to the ID#'s at the most detailed levels possible and then not have the hours ever double-counted. I'm open to a different approach if you can think of one...

ImkeF · ‎05-12-2016

Hi Mark,

please have a look at my code again. It should do exactly what you've requested (at least at this point): The result-query starts with query 2 (which is in your example the table with the ID's and incomplete Category-allocations).

So please check your table names and the according references in your code.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

MarkDGaal · ‎05-13-2016

Hmmm... no it's basically not allowing ID# 002 to match back with Table1 only ID#001....

Could it be the List.Min in the DistinctCats query? Wouldn't that always just take ID#001 rather than both ID 001 and 002 when the count of IDs is > 1?

ImkeF · ‎05-13-2016

This is the result I got:

Please check if this is your desired result.

If that’s your desired result, please post your query code so that I can have a look to check.

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

MarkDGaal · ‎05-16-2016

Okay, phew finally got my head around this, thanks ImkeF you are correct the code handles the duplication correctly! Thanks for all of your help.

MarkDGaal · ‎05-12-2016

Seems I've hit a snag in the Results query....

That query refers to an "ALLIDS" field that is not present in the //Distincats query did you mean to add an "All Rows" Aggregation to the Group By?

ImkeF · ‎05-12-2016

Sorry, yes, I did that and later removed it. Just throw it out:

ShowLookups = Table.ExpandTableColumn(MergeDistinctCats, "NewColumn", {"First", "CountIDs"}, {"First", "CountIDs"}),

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

ImkeF · ‎05-12-2016

I've changed the code for the first 2 queries. Now the first "Source"-steps reference the 2 tables which should be provided ("YourTable1" and "YourTable2")

So if these tables already exist in your model the easiest way would be to rename them to: "YourTable1" and "YourTable2".

That way these queries will work instantly.

Yes, this will create some redundant code and queries. So if you feel confident enough, just clean up afterwards 🙂

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

MarkDGaal · ‎05-12-2016

@ImkeF Thank you, I'm starting through your code now (I like to learn so I'm doing it step-by-step). It appears that at the first Aggregation of the most detailed level this step takes 4-5 mins of processing time (2cores/4threads at 50% utilization):

#"StrikesCat4" = Table.AggregateTableColumn(#"MatchCat4", "NewColumn", {{"First", List.Last, "Cat4IDMatch"__li__ng_escape____li__ng_escape____li__ng_escape__}__li__ng_escape____li__ng_escape____li__ng_escape__})

Table2 is 493 rows and Table1 is 1478 rows (after Grouping, source is 170,000rows)... should it be that processor intensive? I should also add that ID from Table2 is not an integer field but rather a combination of letters and numbers. Also, what does "List.Last" do as that doesn't appear to be a function I can do through the Aggregation GUI?

EDIT: Table2 Source is a SharePoint Library Table1 Source is an Excel Doc.

EDIT2: Note Saving the Excel Doc source for Table1 already Grouped (1478rows vs 170,000) did not improve the processor time for Matching&Striking in Table1

ImkeF · ‎05-12-2016

Sorry, didn't pay attention to the performance at all. This is about finding the right place for the List.Buffer or Table.Buffer basically.

Try it here:

// DistinctCats
let
    Source = Table2,
    #"Grouped Rows" = Table.Buffer(Table.Group(Source, {"Location", "Category 1", "Category 2", "Category 3", "Category 4"}, {{"First", each List.Min([ID]), type number}, {"CountIDs", each Table.RowCount(_), type number}}))
in
    #"Grouped Rows"

Imke Feldmann (The BIccountant)

If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!

How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries

MarkDGaal · ‎05-12-2016

Well that helped! All the Matchs&Strikes run in 5secs now as opposed to 50mins. Getting to the Results Table now should be done shortly, will let you know if I have any issues.

Steps for Creating Multiple Step Joins on Different Keys

Helpful resources

Power BI Monthly Update - November 2025

Fabric Data Days

FabCon Atlanta 2026

How to Get Your Question Answered Quickly

Fabric Data Days starts November 4th!

Steps for Creating Multiple Step Joins on Different Keys

Helpful resources

Power BI Monthly Update - November 2025

Fabric Data Days

FabCon Atlanta 2026

How to Get Your Question Answered Quickly