Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Be one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now

Reply
rayinOz
Helper III
Helper III

Finding multiple entries / Creating new table

Hello,

 

So I have a table that has the following columns:

 

| Username | Course Name | Course Status |

 

The table contains data related to course enrolments/completions.

 

There is Course A and Course B.

 

Users are only supposed to enrol in one course, but we have situations where a user enrols in both.

 

So, I want to find the users who are enroled into both courses. Idealy a new table with the users are are enroled into both courses.

 

Thoughts?

 

Thanks!

 

RayinOz

2 ACCEPTED SOLUTIONS
MarcelBeug
Community Champion
Community Champion

Power Query solution:

You can use remove and keep duplicates:

If you can have multiple records per Username/Course Name combination: first select Username and Course Name and remove duplicates (Home - Remove Rows - Remove Duplicates)

Select Username and select Home - Keep Rows - Keep duplciates

Select Username and select Home - Remove Rows - Remove Duplciates

Remove the other columns.

 

Resulting code:

let
    Source = CourseEnrolments,
    #"Removed Duplicates" = Table.Distinct(Source, {"Username", "Course Name"}),
    #"Kept Duplicates" = let columnNames = {"Username"}, addCount = Table.Group(#"Removed Duplicates", columnNames, {{"Count", Table.RowCount, type number}}), selectDuplicates = Table.SelectRows(addCount, each [Count] > 1), removeCount = Table.RemoveColumns(selectDuplicates, "Count") in Table.Join(#"Removed Duplicates", columnNames, removeCount, columnNames, JoinKind.Inner),
    #"Removed Duplicates1" = Table.Distinct(#"Kept Duplicates", {"Username"}),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates1",{"Course Name", "Course Status"})
in
    #"Removed Columns"

 

Specializing in Power Query Formula Language (M)

View solution in original post

Yes: before removing duplicates, sort descending on date, wrap the code in Table.Buffer(....) and then remove duplicates.

 

In case you want to preserve the original sort order: add an index column first.
After removing duplciates, sort on that index column and remove the index column

Specializing in Power Query Formula Language (M)

View solution in original post

6 REPLIES 6
MarcelBeug
Community Champion
Community Champion

Power Query solution:

You can use remove and keep duplicates:

If you can have multiple records per Username/Course Name combination: first select Username and Course Name and remove duplicates (Home - Remove Rows - Remove Duplicates)

Select Username and select Home - Keep Rows - Keep duplciates

Select Username and select Home - Remove Rows - Remove Duplciates

Remove the other columns.

 

Resulting code:

let
    Source = CourseEnrolments,
    #"Removed Duplicates" = Table.Distinct(Source, {"Username", "Course Name"}),
    #"Kept Duplicates" = let columnNames = {"Username"}, addCount = Table.Group(#"Removed Duplicates", columnNames, {{"Count", Table.RowCount, type number}}), selectDuplicates = Table.SelectRows(addCount, each [Count] > 1), removeCount = Table.RemoveColumns(selectDuplicates, "Count") in Table.Join(#"Removed Duplicates", columnNames, removeCount, columnNames, JoinKind.Inner),
    #"Removed Duplicates1" = Table.Distinct(#"Kept Duplicates", {"Username"}),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates1",{"Course Name", "Course Status"})
in
    #"Removed Columns"

 

Specializing in Power Query Formula Language (M)

Marcel,

 

Actually, I can't get it to work correctly. If I select Username and remove duplicates, it will remove all but one entry. Which is OK, however, I do want it to keep the entry with the latest date... 

 

Here's a screenshot of an example of a duplicate entry.

2017-04-20_15-07-47.png

 

 

 

In this table, the highlighted user (aaj) has completed two courses and I want to eliminate one of the entries... the one with the recent date I want to keep... the older date I want to ignore. In this example the dates are the same, so it doesn't matter which one goes away.

 

Does that make sense? Is this possible?

 

Rayinoz

 

 

Yes: before removing duplicates, sort descending on date, wrap the code in Table.Buffer(....) and then remove duplicates.

 

In case you want to preserve the original sort order: add an index column first.
After removing duplciates, sort on that index column and remove the index column

Specializing in Power Query Formula Language (M)

nevermind, i figured it out... duh... LOL

 

= Table.Buffer(Table.Sort(#"Added Conditional Column",{{"Completion Date", Order.Descending}}))

Thanks Marcel... how do I wrap the code in the Table.buffer?

 

Here's the code that I do have.... 

 

    #"Sorted Rows" = Table.Sort(#"Added Conditional Column",{{"Completion Date", Order.Descending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"Username"})
in
    #"Removed Duplicates"

Thanks for your help... I appreciate it so much!

 

Rayinoz

Thank you Marcel!! 

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

Dec Fabric Community Survey

We want your feedback!

Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.

ArunFabCon

Microsoft Fabric Community Conference 2025

Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.

December 2024

A Year in Review - December 2024

Find out what content was popular in the Fabric community during 2024.