Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now

Reply
rayinOz
Helper III
Helper III

Finding multiple entries / Creating new table

Hello,

 

So I have a table that has the following columns:

 

| Username | Course Name | Course Status |

 

The table contains data related to course enrolments/completions.

 

There is Course A and Course B.

 

Users are only supposed to enrol in one course, but we have situations where a user enrols in both.

 

So, I want to find the users who are enroled into both courses. Idealy a new table with the users are are enroled into both courses.

 

Thoughts?

 

Thanks!

 

RayinOz

2 ACCEPTED SOLUTIONS
MarcelBeug
Community Champion
Community Champion

Power Query solution:

You can use remove and keep duplicates:

If you can have multiple records per Username/Course Name combination: first select Username and Course Name and remove duplicates (Home - Remove Rows - Remove Duplicates)

Select Username and select Home - Keep Rows - Keep duplciates

Select Username and select Home - Remove Rows - Remove Duplciates

Remove the other columns.

 

Resulting code:

let
    Source = CourseEnrolments,
    #"Removed Duplicates" = Table.Distinct(Source, {"Username", "Course Name"}),
    #"Kept Duplicates" = let columnNames = {"Username"}, addCount = Table.Group(#"Removed Duplicates", columnNames, {{"Count", Table.RowCount, type number}}), selectDuplicates = Table.SelectRows(addCount, each [Count] > 1), removeCount = Table.RemoveColumns(selectDuplicates, "Count") in Table.Join(#"Removed Duplicates", columnNames, removeCount, columnNames, JoinKind.Inner),
    #"Removed Duplicates1" = Table.Distinct(#"Kept Duplicates", {"Username"}),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates1",{"Course Name", "Course Status"})
in
    #"Removed Columns"

 

Specializing in Power Query Formula Language (M)

View solution in original post

Yes: before removing duplicates, sort descending on date, wrap the code in Table.Buffer(....) and then remove duplicates.

 

In case you want to preserve the original sort order: add an index column first.
After removing duplciates, sort on that index column and remove the index column

Specializing in Power Query Formula Language (M)

View solution in original post

6 REPLIES 6
MarcelBeug
Community Champion
Community Champion

Power Query solution:

You can use remove and keep duplicates:

If you can have multiple records per Username/Course Name combination: first select Username and Course Name and remove duplicates (Home - Remove Rows - Remove Duplicates)

Select Username and select Home - Keep Rows - Keep duplciates

Select Username and select Home - Remove Rows - Remove Duplciates

Remove the other columns.

 

Resulting code:

let
    Source = CourseEnrolments,
    #"Removed Duplicates" = Table.Distinct(Source, {"Username", "Course Name"}),
    #"Kept Duplicates" = let columnNames = {"Username"}, addCount = Table.Group(#"Removed Duplicates", columnNames, {{"Count", Table.RowCount, type number}}), selectDuplicates = Table.SelectRows(addCount, each [Count] > 1), removeCount = Table.RemoveColumns(selectDuplicates, "Count") in Table.Join(#"Removed Duplicates", columnNames, removeCount, columnNames, JoinKind.Inner),
    #"Removed Duplicates1" = Table.Distinct(#"Kept Duplicates", {"Username"}),
    #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates1",{"Course Name", "Course Status"})
in
    #"Removed Columns"

 

Specializing in Power Query Formula Language (M)

Marcel,

 

Actually, I can't get it to work correctly. If I select Username and remove duplicates, it will remove all but one entry. Which is OK, however, I do want it to keep the entry with the latest date... 

 

Here's a screenshot of an example of a duplicate entry.

2017-04-20_15-07-47.png

 

 

 

In this table, the highlighted user (aaj) has completed two courses and I want to eliminate one of the entries... the one with the recent date I want to keep... the older date I want to ignore. In this example the dates are the same, so it doesn't matter which one goes away.

 

Does that make sense? Is this possible?

 

Rayinoz

 

 

Yes: before removing duplicates, sort descending on date, wrap the code in Table.Buffer(....) and then remove duplicates.

 

In case you want to preserve the original sort order: add an index column first.
After removing duplciates, sort on that index column and remove the index column

Specializing in Power Query Formula Language (M)

nevermind, i figured it out... duh... LOL

 

= Table.Buffer(Table.Sort(#"Added Conditional Column",{{"Completion Date", Order.Descending}}))

Thanks Marcel... how do I wrap the code in the Table.buffer?

 

Here's the code that I do have.... 

 

    #"Sorted Rows" = Table.Sort(#"Added Conditional Column",{{"Completion Date", Order.Descending}}),
    #"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"Username"})
in
    #"Removed Duplicates"

Thanks for your help... I appreciate it so much!

 

Rayinoz

Thank you Marcel!! 

Helpful resources

Announcements
November Carousel

Fabric Community Update - November 2024

Find out what's new and trending in the Fabric Community.

Live Sessions with Fabric DB

Be one of the first to start using Fabric Databases

Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.

Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.

Nov PBI Update Carousel

Power BI Monthly Update - November 2024

Check out the November 2024 Power BI update to learn about new features.