Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now

Reply
Fnoorda
Frequent Visitor

loosing rows after expanding merge query

Hi,

 

I am trying to merge two queries. Each query has three keys, I have tried merging with multiple columns and I have merged columns and then tried the merge the two query. Both approaches with no luck.

 

Actually the problem only manifest itself when you expand the resulting query. I seem to loose records.

 

when i click on the merged table column i can see the record it matched on but after i expand the table is empty.

 

I have seen a simular post dated dec 2015 but it had no resolution other than to send a frown.

 

any help would be appreciated.

 

Thanks

 

 

 

10 REPLIES 10
Anonymous
Not applicable

What type of join are you doing in the Merge?  Depending on what type you have chosen will depend on whether you should lose rows on the Left or right of the join.  There is also a type to keep all rows, but you need to be aware of the data gaps this will create when rows can't be matched.

Thanks Ross,

 

I am trying to do an inner join and I am aware of the consequences of the various joins. The situation is this.

File one:

h2_id   Date                 Code

123      12/2/2005        C18

123      12/4/2015        C00

 

File two:

h2_id   Date                 Code

123      12/2/2005        C18

123      12/3/2012        C18

 

This is a simplified picture but I need to rank both file and get separate queries with the first and second ranks.

I used this grouping command to get a ranking:

= Table.Group(#"Sorted Rows", {"h2_id"}, {{"Count", each Table.AddIndexColumn( _, "Index",1,1), type table}})

So now I have:

h2_id   Date                 Code    Rank

123      12/2/2005        C18      1         

123      12/4/2015        C00      2

And

h2_id   Date                 Code    Rank

123      12/2/2005        C18      1

123      12/3/2012        C18      2

 

Now I am trying to get a join on the highlighted records! Is there something peculiar about the index column. I can see a “one” but is it really a one? If I explicitly create “Ones” in each of the columns I get a successful join. I am going to try a

different method to get my first and second ranking by doing a group and Min aggregate.

 

Any advice is welcome.

 

Fred

The highlighting didn't survive my copy/paste but you can see i should get a match on the first record in each file

Anonymous
Not applicable

I'm quite lost at what you are trying to achieve.  You have created a Rank field in both tables, but they are a different type of ranking correct?  What is your expected output?

Ross. The second file is a subset of the first. All records of the first exist in the second.  the first file has aound 16 thousand reocrds and the seciond file has about two thousend. The final result is likely to be a couple of hundred. So I work for the cancer council victoria and files are all to do with types fo cancer.  So the first file has all of the cancers people in aparticular study have had adn the second file is just one type of cancer. but even the one type people can have mutiple times. so the relationship between the two file is m:n. I am looking for all the people that had second cancer after the first theta is different from the first one. 

 

In my example it was just the first step of joining the two file on the first cancer that is failing. It should have matched on 

 

123      12/2/2005        C18      1

 

But when I use the rank created with Table.AddIndexColumn it does not join correctly. I have tried it on a small sample just imported from excel. If I create the index column in excel it joins correctly.

 

The full blown files come from a SQL server DW so creating the index column before hand is noyt an option.

 

Hope i made myself a little clearer.

I have a small pbix file an Excel file that illustrates the point. How can i attach this example to this post?

Anonymous
Not applicable

Still trying to get a handle on the problem.  Based on your description:

 

File 1 contains every instance of a person having cancer in your study.

 

File 2 is the same as file 1, but is filtered down on a specific cancer.

 

The object is to achieve a listing of People (H2_ID?) from File 2, who have had another cancer (Code?) occur after the instance shown in File 2?

 

 

Yes nearly there, The brief looked simple enough to me as well until I asked some questions and the in and outs come in to play.

 

The object is to achieve a Count of People (H2_ID?) by Code from File 2, who have had another cancer (Code?) occur after the instance shown in File 2 and is the first cancer they have had in file 2 is the first cancer in file 1. 

 

 

 

 

 

 

 

 

Anonymous
Not applicable

Perhaps an alternate method?  Here is a quick example:

"File One"

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyVtIBkvpAaGBgCmQ7G1ooxeogyRgDZQyNsMmYgGTAegwMlGJjAQ==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [h2_id = _t, Date = _t, Code = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"h2_id", Int64.Type}, {"Date", type date}, {"Code", type text}}),
    #"Sorted Rows" = Table.Sort(#"Changed Type",{{"Date", Order.Ascending}})
in
    #"Sorted Rows"


"File One First Entry"  (Patients with 1 or less cancers are removed)

let
    Source = #"File One",
    #"Grouped Rows" = Table.Group(Source, {"h2_id"}, {{"First Code", each List.First([Code]), type text}, {"Count", each Table.RowCount(_)}}),
    #"Filtered Rows" = Table.SelectRows(#"Grouped Rows", each ([Count] > 1))
in
    #"Filtered Rows"


"File Two"

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjQyVtIBkvpAaGBgCmQ7G1ooxeogyRgDZQyNYDKxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [h2_id = _t, Date = _t, Code = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"h2_id", Int64.Type}, {"Date", type date}, {"Code", type text}}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Changed Type",{{"Date", type date}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type1",{"h2_id", "Code"},#"File One First Entry",{"h2_id", "First Code"},"File One First Entry",JoinKind.LeftOuter),
    #"Expanded File One First Entry" = Table.ExpandTableColumn(#"Merged Queries", "File One First Entry", {"Count"}, {"Count"})
in
    #"Expanded File One First Entry"



File One, is designed to be the complete list you had mentioned.  From here we reference this into a new table called "File One First Entry" with the purpose of only taking the first cancer entry for each person.  Each row should have have the Person's ID, the First Cancer recorded, and how many cancer findings were recorded in total.  Since we don't need anyone with 1 finding or less, we filter them out.

File Two is loaded with a subsequent Left outer join to the "File One First Entry" table.  This join will only succeed if both the Person ID and Cancer match.  This means the only matches you will receive will be people with the Cancer listed in 'File Two' and where that was their first cancer match.  Unfortunately this will also capture people who had the cancer first, but also had it in all subsequent findings.

 

I know this is deviating from your original design, however it might help resolve your joining problem.

 

I will try your appoach and built on that. I will not be able to do anything tomorrow but will let you know later this week how it went.

 

Thanks very much for your interest and help.

 

Fred

Helpful resources

Announcements
Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Solution Authors