Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
Anonymous
Not applicable

Remove old duplicates

We have a table in Excel with CRM data. Each CRM opportunity appears multiple times. Meaning if an opportunity moves through the different CRM stages for each stage a row will be created. The CRM stages are numbered from 01, 02, 03 04 etc. We would like to create a report with just the most current stage (=highest number). What we did was the following in Power BI Desktop:

 

1) Sort the column 'Sales Stage' in descending order so that the highest (is latest) sales stage would be on top

2) Then selected the column Opportunity ID and removed the duplicates

 

Unfortunately the remaining row for each opportunity was the oldest one (sales stage 01)

 

We tried to do the same as above, but this time first sorting the Sales Stages' in ascending order.

 

But again we were left with Sales Stage 01

 

How can we remove duplicate opportunities (based on the field Opportunity ID) while retaining the latest (=highest) sales stage?

1 ACCEPTED SOLUTION

@Anonymous It is almost the same thing. The UI is a bit diffrent.

 

In PowerBI Desktop go to  "Edit Queries" 

 

1.PNG

 

Select the table/query taht you import from excel -> Select the ID column only and sort ( doesn't matter asc or desc) ->

Select the Sales Stage column now & sort z-a ( the biggest on top )

Notice the small numbers for the order of sorting in the column?

 

1.PNG

 

now the tricky part

 Click the formula icon (fx) in the bar. This will create a new step 'Custom" and also in the bar will be the name of the previous step     = #"Sorted Rows"

 

2.PNG

 

Now after equal sign wirite =Table.Buffer( #"Sorted Rows")

SO just write the formula and include the text already there - previous step.

 

3.PNG

 

 

You already know how to remove duplicates so that is.

 

Hope it works

 

 

 

Konstantinos Ioannou

View solution in original post

9 REPLIES 9
Anonymous
Not applicable

With Excel and Power Query, I had a Formula.Firewall issue when using the buffer function proposed as solution. So I found using this alternative (answer from ExcelApostle) with an index column simple and convenient:

 

There seems to be an implicit index constructed in the backend that is overwritten when you add an index column. After adding the index column, the latest sort is respected when removing duplicates. Lot faster than grouping / merging or buffering the table! Would have been nice to see this in the manual, as we cannot be sure whether after updates this implicit behavior will stay the same.

Anonymous
Not applicable

I had the formula.fiewall issue, but it was simply because I still applied the @remove duplicates to the original table instead of #"buffered". Once corrected, it worked like a charm. I tried the Excel Apostle's solution but to no avail. It still removed new duplicates and kept the old ones. But yes, I created an indexed column in the query and sorted descending before doing the buffer step.

Eric_Zhang
Employee
Employee

@Anonymous

 

You can also try a way in DAX.

 

Capture.PNG

 

latest stage =
CALCULATE (
    LASTNONBLANK ( 'CRM TABLE'[Sales Stage ], "" ),
    ALLEXCEPT ( 'CRM TABLE', 'CRM TABLE'[id] )
)

isLatest = IF('CRM TABLE'[latest stage]='CRM TABLE'[Sales Stage ],"Y","N")

Then apply a filter  like isLatest="Y" accordingly.

konstantinos
Memorable Member
Memorable Member

@Anonymous 

 

If sales stage is a number ( or duplicate the column - convert to number ), then you can select all columns exept "Sales Stage" , and then in Transform tab "Group by" and change the new column operation to to 'max' and then choose the sales stage column.

 

That is the easy BUT might still leave you duplicate ID if you have ID had different values for another column.

 

If that is the case ( more likely ) check this post by Ken Plus 

http://www.excelguru.ca/blog/2016/05/25/keep-the-most-recent-entry/

 

 

Konstantinos Ioannou
Anonymous
Not applicable

We do have different values in another column. The post from Ken Plus assumes we are using PowerQuery. But we are using Power BI Desktop and I don't understand how to make that work in Power BI Desktop (if at ll possible).

 

Thank you,

Oscar Broekman

@Anonymous It is almost the same thing. The UI is a bit diffrent.

 

In PowerBI Desktop go to  "Edit Queries" 

 

1.PNG

 

Select the table/query taht you import from excel -> Select the ID column only and sort ( doesn't matter asc or desc) ->

Select the Sales Stage column now & sort z-a ( the biggest on top )

Notice the small numbers for the order of sorting in the column?

 

1.PNG

 

now the tricky part

 Click the formula icon (fx) in the bar. This will create a new step 'Custom" and also in the bar will be the name of the previous step     = #"Sorted Rows"

 

2.PNG

 

Now after equal sign wirite =Table.Buffer( #"Sorted Rows")

SO just write the formula and include the text already there - previous step.

 

3.PNG

 

 

You already know how to remove duplicates so that is.

 

Hope it works

 

 

 

Konstantinos Ioannou
Anonymous
Not applicable

Excellent! You made my day. I just had to create an Indexed column in the query editor, sort descending on Index, and then buffer.

Awesome! It did work for me.  @konstantinos Where did you learn that from? I need to up my game too.

Anonymous
Not applicable

Thank you. I am able to follow all steps, until ' Click the formula icon (fx) in the bar. ' I don't see that white bar in my screen. How do I make that white bar show?

 

 

Helpful resources

Announcements
Europe Fabric Conference

Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

Power BI Carousel June 2024

Power BI Monthly Update - June 2024

Check out the June 2024 Power BI update to learn about new features.

RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.