Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Enhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.

Reply
Konrad909090
Helper II
Helper II

Data mapping from 3 different sources.

Hi Experts,

I come to ask if it is possible to achieve data mapping in PowerBI. Currently the team is doing this in excel.

This is a description of my task:


Mapping of Microsoft master data to our company and company B master data. Thousands of records(5k+).

Mapping phase:
1. after the ‘vlookup’ phases
2. then use Excel's fuzzy logic tool, which does another mapping of about 50%.
3. finally, the rest requires some skills in model building or automation or Excel tool:
Pseudo logic:
*If account names are similar, check if regions are the same, then check if BRANCHES are the same = 90% Confidence map.
*For the rest, IF REGIONS are the same AND BRANCHES are the same, then HOW similar the names are = recommended mapping with 60% confidence.
4 Finally, complete manually


I am attaching dummy data which I hope will map the problem.
I am more concerned with the approach to the solution is it better to stay in excel or can it be done in PowerBI?
I would appreciate any hints.

dummy data 

4 REPLIES 4
Konrad909090
Helper II
Helper II

Hi @lbendlin ,
Thank you for your reply.

Good question. I don't have too much knowledge on this subject. When trying to play around with dummy data, I used Fuzzy matching in PowerQuery and Fuzzy Lookup with a similarity value of 0.8. From what I observed, for example, the name Google Inc and Google were matched as similar in both approaches.

From what I've figured out Power Query in Excel uses the Jaccard Index algorithm, will it be suitable?

I am also concerned with the step-by-step approach to the problem?
Is it worth merging the 3 data sets into one or keeping them in separate tables and doing comparisons?

What would you advise in such a situation?


My setup for Fuzzy Matching i PQ. 

Konrad909090_0-1738137783693.png

 

How do you decide which of the tables has the "authoritative"  value? Or do you just want to end up with similarity groups?

I think the authoritative board will be the Microsoft board and to that I will try to map my company and company B's data.
I understand enough that I need to somehow match in one line the relevant data for one company. So that from these 3 sources the data will be the same based on the different key. And this is pretty much where my understanding of the approach to the subject ends. The data I got looks much more complicated than that. There are dozens of columns in one file. I see only 2 keys that are common to these 3 files.

lbendlin
Super User
Super User

What is your definition of "similar"?  Levenshtein distance?

Helpful resources

Announcements
July 2025 community update carousel

Fabric Community Update - July 2025

Find out what's new and trending in the Fabric community.

July PBI25 Carousel

Power BI Monthly Update - July 2025

Check out the July 2025 Power BI update to learn about new features.

Join our Fabric User Panel

Join our Fabric User Panel

This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.