The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Dears I'd like to have as an output of a fuzzy merge also the similarity score to apply some fine tuning to the algorithm. I saw on power query documentation there was an option to expand that value from the query https://docs.microsoft.com/en-us/power-query/merge-queries-fuzzy-match , but I can't find it in power query editor on the desktop.
Do you have any "trick" to help me with this?
Thanks in advance
A
Solved! Go to Solution.
Yes, that documentation should have noted that this is a feature available only for Power Query Online (app.powerbi.com). This feature is still not transported to Power BI Desktop / Power Query Excel. You need to be having at least Power BI Pro subscription to use Dataflows which will enable you to use PQ Online.
See the snip from Power Query Online.
How to join 2 columns in fuzzy merge
Below is the table with columns.
Within same Department_Id, how to do Fuzzy Merge/match on Employee_Name with score 65% and number of matches = 10
There is a trick that will work in Excel. You can use Advanced Editor and edit the code directy to include the SimilarityColumnName option. NOTE THERE IS A CATCH. IF you edit the SOURCE paramters through the GUI the SimilarityColumnName is lost (overwritten). Seems like a simple fix for MS to add to Excel.
Example: Source = Table.FuzzyNestedJoin(Contacts, {"Employer"}, Companies, {"Company name"}, "All Companies", JoinKind.LeftOuter, [IgnoreCase=true, IgnoreSpace=true, Threshold=0.90, TransformationTable=CompanyTransformationTable,SimilarityColumnName="matchscore"]),
Somewhat related.... if you use a Tranformation Table to map values, the match returns a 0.95 score. This seems very counterintuitive to me. If I am using a transformation table I am forciing it to a match, and the similarity score should be 1. It took me along time to figure this out and turn down my threshold to .95 to get my transformation table to work. Ideally it would return 0.9999 so we could ieintify hits from the transformation table.
If I add the SimilarityColumnName="matchscore" to the code, how do I get that column to display in the table once I run the query? I do have a step after the "Source" step to Remove Other Columns - do I have to add anything to the code there? Thanks!
thank you for your prompt reply.
I know about the option and using it.
What I would like to check if I can expand after the merge a column displaying the similarity score (please compare the MSFT documentation in my original post).
A
Yes, that documentation should have noted that this is a feature available only for Power Query Online (app.powerbi.com). This feature is still not transported to Power BI Desktop / Power Query Excel. You need to be having at least Power BI Pro subscription to use Dataflows which will enable you to use PQ Online.
See the snip from Power Query Online.
See the merge here. If you still don't find this, please post a screenshot of your merge
User | Count |
---|---|
11 | |
7 | |
5 | |
5 | |
4 |
User | Count |
---|---|
16 | |
14 | |
8 | |
6 | |
6 |