Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Calling all Data Engineers! Fabric Data Engineer (Exam DP-700) live sessions are back! Starting October 16th. Sign up.

Reply

Create a duplicate flag in M query

Hello

 

I have created a concatendated column in M query to identify duplicates. I have sorted it. Now I want to created a column that has a duplicate flag (i.e. if not duplicate then 0, if duplicate then 1). 

 

Please see below how I have created it in Excel, however I have no idea how to create this in M query. the "let" query is not working for some reason. 

 

I have also pasted a view of my "concatenate" column, it is formatted as text. The table is called api_data program_4378 

 

KatieFarrand12_0-1694442538017.png 

KatieFarrand12_1-1694442610087.png

 

Warm regards,

Katie

3 REPLIES 3
BA_Pete
Super User
Super User

Hi Katie,

 

In Power Query, select your [Concatenate] column then go to the Home tab > Group By.

In the Group By dialog, create one aggregated column that uses the Count Rows operator, and another that uses the All Rows operator.

Once you apply this, you'll see that your All Rows column is a nested table. Expand this column back out to restore all your original columns.

Now, any row that has a Count Rows column value greater than 1 is a duplicate.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Hi Pete,

 

Thank you for this. I have followed your instructions but I have not gotten the correct outcome. What I wanted is an indicator of 0 if it is the first unique number, and 1 if it is a dupliate. Then I can filter out anything with a 1 and be left with a unique set of data points to work with in my report.  Anyway, I will just use the 'remove duplicates' function to achieve my purpose. Thanks for your response though; I learned something new!

 

Ah, I see, sorry. I misunderstood your use-case.

Yes, remove duplicates will work just fine if all you want to do is create a unique list. Just keep an eye on which duplicated value Power Query decides to remove in case you have other columns that aren't duplicated in each row. I think it selects all duplicates after the first occurrence in the scan to delete which sounds logical, but depends on how your table is sorted.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Helpful resources

Announcements
FabCon Global Hackathon Carousel

FabCon Global Hackathon

Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Kudoed Authors