Starting December 3, join live sessions with database experts and the Microsoft product team to learn just how easy it is to get started
Learn moreGet certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now
Hello,
So I have a table that has the following columns:
| Username | Course Name | Course Status |
The table contains data related to course enrolments/completions.
There is Course A and Course B.
Users are only supposed to enrol in one course, but we have situations where a user enrols in both.
So, I want to find the users who are enroled into both courses. Idealy a new table with the users are are enroled into both courses.
Thoughts?
Thanks!
RayinOz
Solved! Go to Solution.
Power Query solution:
You can use remove and keep duplicates:
If you can have multiple records per Username/Course Name combination: first select Username and Course Name and remove duplicates (Home - Remove Rows - Remove Duplicates)
Select Username and select Home - Keep Rows - Keep duplciates
Select Username and select Home - Remove Rows - Remove Duplciates
Remove the other columns.
Resulting code:
let Source = CourseEnrolments, #"Removed Duplicates" = Table.Distinct(Source, {"Username", "Course Name"}), #"Kept Duplicates" = let columnNames = {"Username"}, addCount = Table.Group(#"Removed Duplicates", columnNames, {{"Count", Table.RowCount, type number}}), selectDuplicates = Table.SelectRows(addCount, each [Count] > 1), removeCount = Table.RemoveColumns(selectDuplicates, "Count") in Table.Join(#"Removed Duplicates", columnNames, removeCount, columnNames, JoinKind.Inner), #"Removed Duplicates1" = Table.Distinct(#"Kept Duplicates", {"Username"}), #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates1",{"Course Name", "Course Status"}) in #"Removed Columns"
Yes: before removing duplicates, sort descending on date, wrap the code in Table.Buffer(....) and then remove duplicates.
In case you want to preserve the original sort order: add an index column first.
After removing duplciates, sort on that index column and remove the index column
Power Query solution:
You can use remove and keep duplicates:
If you can have multiple records per Username/Course Name combination: first select Username and Course Name and remove duplicates (Home - Remove Rows - Remove Duplicates)
Select Username and select Home - Keep Rows - Keep duplciates
Select Username and select Home - Remove Rows - Remove Duplciates
Remove the other columns.
Resulting code:
let Source = CourseEnrolments, #"Removed Duplicates" = Table.Distinct(Source, {"Username", "Course Name"}), #"Kept Duplicates" = let columnNames = {"Username"}, addCount = Table.Group(#"Removed Duplicates", columnNames, {{"Count", Table.RowCount, type number}}), selectDuplicates = Table.SelectRows(addCount, each [Count] > 1), removeCount = Table.RemoveColumns(selectDuplicates, "Count") in Table.Join(#"Removed Duplicates", columnNames, removeCount, columnNames, JoinKind.Inner), #"Removed Duplicates1" = Table.Distinct(#"Kept Duplicates", {"Username"}), #"Removed Columns" = Table.RemoveColumns(#"Removed Duplicates1",{"Course Name", "Course Status"}) in #"Removed Columns"
Marcel,
Actually, I can't get it to work correctly. If I select Username and remove duplicates, it will remove all but one entry. Which is OK, however, I do want it to keep the entry with the latest date...
Here's a screenshot of an example of a duplicate entry.
In this table, the highlighted user (aaj) has completed two courses and I want to eliminate one of the entries... the one with the recent date I want to keep... the older date I want to ignore. In this example the dates are the same, so it doesn't matter which one goes away.
Does that make sense? Is this possible?
Rayinoz
Yes: before removing duplicates, sort descending on date, wrap the code in Table.Buffer(....) and then remove duplicates.
In case you want to preserve the original sort order: add an index column first.
After removing duplciates, sort on that index column and remove the index column
nevermind, i figured it out... duh... LOL
= Table.Buffer(Table.Sort(#"Added Conditional Column",{{"Completion Date", Order.Descending}}))
Thanks Marcel... how do I wrap the code in the Table.buffer?
Here's the code that I do have....
#"Sorted Rows" = Table.Sort(#"Added Conditional Column",{{"Completion Date", Order.Descending}}), #"Removed Duplicates" = Table.Distinct(#"Sorted Rows", {"Username"}) in #"Removed Duplicates"
Thanks for your help... I appreciate it so much!
Rayinoz
Thank you Marcel!!
Starting December 3, join live sessions with database experts and the Fabric product team to learn just how easy it is to get started.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early Bird pricing ends December 9th.
User | Count |
---|---|
86 | |
76 | |
74 | |
56 | |
45 |
User | Count |
---|---|
117 | |
105 | |
77 | |
66 | |
64 |