Solved: Re: Need to ignore duplicated rows

AndrejZevzikov · ‎02-08-2021

Hi, first of all, sorry for my English; I will try my best.

Our system has a bug: it sometimes duplicates almost the same information two or three times. It looks something like this:

ID  created_at        customer_id  Start_date  End_date    work_time
1   2021.01.01 10:48  13           2021.01.01  2021.01.21  21
2   2021.01.01 10:48  13           2021.01.01  2021.01.21  21
3   2021.01.01 10:50  13           2021.01.01  2021.01.21  21
4   2021.01.05 11:12  5            2021.01.05  2021.01.08  4
5   2021.01.05 11:13  5            2021.01.05  2021.01.08  4

For work_time, I created a new column with = DATEDIFF([Start_date], [End_date], DAY) + 1

The task is to calculate the average work time, but I can't use the AVERAGE function because the duplicates skew the math:

Expected: (21+4)/2 = 12.5. With duplicates: (21+21+21+4+4)/5 = 14.2

Maybe you have some ideas on how to remove the duplicated rows, or maybe there is another solution for my task.

And again, sorry for my English :)

amitchandak · ‎02-08-2021

@AndrejZevzikov, create a measure like

AVERAGEX (
    SUMMARIZE (
        'Table',
        'Table'[customer_id],
        'Table'[created_at],
        'Table'[work_time]
    ),
    [work_time]
)

Or you can delete the duplicates in Power Query:

https://www.youtube.com/watch?v=Hc5bIXkpGVE
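
One caveat about the measure above: in the sample data the duplicates differ in created_at (10:48 vs. 10:50), so grouping on created_at keeps rows 1 and 3 as separate groups. The result only matches the expected 12.5 there because each customer ends up with the same number of groups. A minimal sketch of a variant that groups only on the columns that are identical across duplicates, assuming the table is named 'Table' as in the post and that rows sharing customer_id, Start_date, End_date and work_time always describe the same record (the measure name is made up for illustration):

```dax
-- Hypothetical measure name; 'Table' is the placeholder table name from the post.
Average Work Time =
AVERAGEX (
    -- One row per distinct combination of these columns.
    -- created_at is deliberately left out, because duplicates can differ in it.
    SUMMARIZE (
        'Table',
        'Table'[customer_id],
        'Table'[Start_date],
        'Table'[End_date],
        'Table'[work_time]
    ),
    'Table'[work_time]
)
```

On the five sample rows this produces two groups (work_time 21 and 4), so the average is (21+4)/2 = 12.5. If two genuinely different records can share all four columns, this grouping would merge them, so the assumption needs checking against the real data.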

PaulDBrown · ‎02-08-2021

@AndrejZevzikov

Your best option is to delete the duplicate rows in Power Query, as @amitchandak rightly suggests.

If you cannot access Power Query, you can use DAX to create a new table as your "work" table (and ignore the original completely). In the ribbon under Modeling, select "New Table" and type the equivalent DAX for your table. Do not include the "ID" column: it is unique for every row, so including it would just reproduce the table you already have.

New Table =
SUMMARIZE (
    'Old Table',
    'Old Table'[created_at],
    'Old Table'[customer_id],
    'Old Table'[Start_date],
    'Old Table'[End_date],
    'Old Table'[work_time]
)

This removes the rows that are exact duplicates.

Beware that you will still have rows (for example, IDs 1 and 3 in your sample) which are the same except for the time they were created. If you know these are duplicates, you will have to define business logic to identify them and then delete them.
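
One possible business rule, sketched here under the assumption that rows sharing customer_id, Start_date, End_date and work_time are the same record regardless of created_at, is to group on those columns and keep only the earliest timestamp. The table name 'Deduplicated Table' and the column name first_created_at are made up for illustration:

```dax
-- Hypothetical calculated table: one row per record, ignoring created_at differences.
Deduplicated Table =
ADDCOLUMNS (
    -- Distinct combinations of the columns that identify a record.
    SUMMARIZE (
        'Old Table',
        'Old Table'[customer_id],
        'Old Table'[Start_date],
        'Old Table'[End_date],
        'Old Table'[work_time]
    ),
    -- CALCULATE triggers context transition, so MIN is evaluated
    -- only over the original rows that belong to the current group.
    "first_created_at", CALCULATE ( MIN ( 'Old Table'[created_at] ) )
)
```

With the sample rows from the question this would leave two rows (customer 13 with work_time 21 and customer 5 with work_time 4), so a plain AVERAGE over the work_time column of the resulting table gives 12.5. Whether "same customer and same dates" really means "duplicate" is a decision only the data owner can make.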

To add a new "ID" column, choose "New Column" in the ribbon and type:

ID =
RANK.EQ ( 'New Table'[created_at], 'New Table'[created_at], ASC )

Now you have a clean table to work with, and you can ignore the original.

AndrejZevzikov · ‎02-08-2021

Nice, it seems to be working perfectly.

Thanks!
