Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
geminitp
Regular Visitor

Addressing Partial Duplicates

Hi All, I have question about partial duplicates. I would like to find out about best practices and/or solution to get OrderTable & CustomerInfo  into DesiredTable.

 

I have used various join to merge the table but I can't fix the CustomerID which are duplicate.

I would like to have  a distinct and unique ID for CustomerID. When I used ReplaceValue (111), it replaced all. Appreciate any advice. Thanks. Paul

 

 

OrderTable

CustomerID              CustomerName          Orders

111                           John                             PC

111                           Paul                              Labtop

111                           Mary                             Handphone

112                           Charlie                          Switch

 

 

CustomerInfo

CustomerName        Address

John                         Boston  

Paul                          New York

Mary                         Texas

Charie                       California

 

DesiredTable

CustomerID              CustomerName          Orders                   Address

100                           John                             PC                        Boston

101                           Paul                              Labtop                 New York

111                           Mary                             Handphone         Texas

112                           Charlie                          Switch                  California 

 

4 REPLIES 4
BA_Pete
Super User
Super User

Hi @geminitp ,

 

I'm not sure whether you have oversimplified your data here, but you are going to find it incredibly challenging to create the output table you require with the data structure you have provided.

 

The first question that really needs addressing is: why are Customer ID's not unique?

The second question is: how unique are customer names? If they are actually John, Jack etc. you've got little chance. If they are ABC Widget Co., ACME Dynamite Inc. etc. (i.e. entirely unique and will not repeat) then you could probably join on the [CustomerName] fields in each table without using the [CustomerID] field at all.

 

You *may* be able to get a unique [CustomerID] column by merging the [CustomerID] and [CustomerName] columns in Power Query. This would give you a new ID column that would show values like 111John, 111Paul, 111Mary, 112Charlie etc. although, as above, these types of names may repeat frequently like your [CustomerID] and therefore this may still not create a truly unique ID field. Even once you've done this, there's no way of doing the same in the CustomerInfo table to create a common field to join on.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




amitchandak
Super User
Super User

@geminitp , to fix this we need something unique. Like in sample data customer name is unique.  You can add an index column

https://stackoverflow.com/questions/45715963/creating-an-index-column-for-power-bi

 

Starting with 100.  in first table

and then you can merge two tables on customer name

https://radacad.com/append-vs-merge-in-power-bi-and-power-query

Thanks for  advice  @amitchandak , @BA_Pete . I have simplified the table to convey  a problem  involving partial table, and duplication as error entry.  The CustomerID is used as record identifer for an evolving Table and some CustomerID have errors , my thought is to simplify flow and correct CustomerID to be use as identifier.  This CustomerID is just like an Social Security No. which is supposed to be distinct and unique.  

1) The new index column may work us single out unique entry but  I probably have to figure out  a way to ReplaceValue for  CustomerID errors.  

 

2) Concatenate CustomerID & Name probably wont work if I need to index to other table using CustomerID.

 

 

@geminitp ,

 

OK, so it looks like you really need to get to the root cause of why some of your [CustomerID] are showing as errors. Are the errors generated by Power Query, or are the errors in the data source and are being pulled into Power Query?

If you're not sure, select one of the error cells in Power Query and share a screenshot of what the errors look like. Please remove any sensitive information from your screenshot.

 

Next, you need to look at whether you can bring [CustomerID] into your CustomerDetails table. If [CustomerID] exists in the system I would assume that it would feature in the CustomerDetails table. From there, a simple merge will get you what you need.

 

Pete



Now accepting Kudos! If my post helped you, why not give it a thumbs-up?

Proud to be a Datanaut!




Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.