
Need to ignore duplicated rows

Hi, first of all, sorry for my English; I will try my best.

Our system has a bug: it sometimes duplicates almost the same information two or three times. It looks something like this:

 

ID  created_at        customer_id  Start_date  End_date    work_time
1   2021.01.01 10:48  13           2021.01.01  2021.01.21  21
2   2021.01.01 10:48  13           2021.01.01  2021.01.21  21
3   2021.01.01 10:50  13           2021.01.01  2021.01.21  21
4   2021.01.05 11:12  5            2021.01.05  2021.01.08  4
5   2021.01.05 11:13  5            2021.01.05  2021.01.08  4

 

For work_time I created a new column with = DATEDIFF ( [Start_date], [End_date], DAY ) + 1

The task is to calculate the average work time, but I can't use the AVERAGE function because the duplicated rows skew the result. The correct average counts each record once, while averaging over all rows does not:

(21+4)/2 = 12.5   (21+21+21+4+4)/5 = 14.2

Maybe you have an idea how to remove the duplicated rows, or maybe there is another solution for my task.

And again, sorry for my English :)

1 ACCEPTED SOLUTION
amitchandak
Super User

@AndrejZevzikov, create a measure like:

AVERAGEX (
    SUMMARIZE ( 'Table', 'Table'[customer_id], 'Table'[created_at], 'Table'[work_time] ),
    'Table'[work_time]
)

 

Or you can delete the duplicates in Power Query:

https://www.youtube.com/watch?v=Hc5bIXkpGVE
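
A variation on the measure above, offered as a sketch rather than a tested fix. In the sample data, rows 1 to 3 differ only in created_at (10:48 vs 10:50), so grouping by created_at still leaves near-duplicates in the summarized table. With the five sample rows the measure above averages (21+21+4+4)/4 = 12.5, which matches the expected result only because each customer happens to have exactly two distinct timestamps. If rows that agree on customer_id, Start_date, End_date and work_time can be treated as one record (an assumption about the data, not something confirmed in this thread), created_at can be left out of the grouping. The measure name and the table name 'Table' are placeholders.

```dax
-- Hypothetical measure name; 'Table' is a placeholder for the real table.
-- Assumption: rows with the same customer_id, Start_date, End_date and
-- work_time are duplicates of one record, whatever their created_at is.
Average Work Time =
AVERAGEX (
    SUMMARIZE (
        'Table',
        'Table'[customer_id],
        'Table'[Start_date],
        'Table'[End_date],
        'Table'[work_time]
    ),
    'Table'[work_time]
)
```

On the sample rows this grouping leaves one row per customer (work_time 21 and 4), so the average is (21+4)/2 = 12.5.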


3 REPLIES
PaulDBrown
Community Champion

@AndrejZevzikov 

Your best option is to delete the duplicate rows in Power Query, as @amitchandak rightly suggests.

If you cannot access Power Query, you can use DAX to create a new table as your "work" table (and ignore the original completely). In the ribbon under Modeling, select "New Table" and type the equivalent DAX for your table (do not include the "ID" column: it is unique for every row, so including it would just reproduce the table you already have):

 

New Table =
SUMMARIZE (
    'Old Table',
    'Old Table'[created_at],
    'Old Table'[customer_id],
    'Old Table'[Start_date],
    'Old Table'[End_date],
    'Old Table'[work_time]
)

 

and you will get this:

[Image: new table.JPG]

Beware that you have two rows (highlighted in the image) which are the same except for the time they were created. If you know these are duplicates, you will have to define business logic to identify them and then delete them.
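
One possible business rule, sketched here under an assumption that is not confirmed in the thread: rows that agree on customer_id, Start_date, End_date and work_time are the same record, and only the earliest created_at needs to be kept. 'Old Table' is the table name used above; "Clean Table" and "first_created_at" are hypothetical names chosen for illustration.

```dax
-- Hypothetical calculated table; replaces 'New Table' if the rule fits.
-- Assumption: created_at is the only column that differs between duplicates.
Clean Table =
ADDCOLUMNS (
    SUMMARIZE (
        'Old Table',
        'Old Table'[customer_id],
        'Old Table'[Start_date],
        'Old Table'[End_date],
        'Old Table'[work_time]
    ),
    -- CALCULATE turns each grouped row into a filter, so MIN sees only
    -- the original rows belonging to that group.
    "first_created_at", CALCULATE ( MIN ( 'Old Table'[created_at] ) )
)
```

If this table is used in place of 'New Table', the ID column described next would rank on [first_created_at] instead of [created_at].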

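To add a new "ID" column, choose "New Column" in the ribbon and type the formula below. Note that RANK.EQ gives tied values the same rank, so this ID is unique only as long as no two rows share a created_at value: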

 

ID =
RANK.EQ ( 'New Table'[created_at], 'New Table'[created_at], ASC )

 

[Image: Id col.JPG]

Now you have a clean table to work with, and you can ignore the original.

[Image: model.JPG]

 












Nice, it seems to be working perfectly.

Thanks!
