Advance your Data & AI career with 50 days of live learning, dataviz contests, hands-on challenges, study groups & certifications and more!
Get registeredGet Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now
Hi ALL
I need a little help.
Is it possible to use Power BI to data mining? Let me explain my case.
I have a project where i have a table "dProducts" with produtc name, product code, produtc family, product group, etc.
Ex:
business_unit | group_name | modelgroup_name | family_name | Fiscal Code | product_ number | produc_name | description | description_pt-br |
X | A | AA-28 | DMMX | 11111111 | 4582763 | AAA-C28X | MULTIMETER W | MULTIMETRO 287 C/ REGISTRO E GRAFICO P/ ELETRONICO |
X | A | AA-28 | DMMX | 11111111 | 4585036 | AAA-C28XA | TRUE-RMS ELECTRONIC LOGGING | MULTIMETRO 287 C/ REGISTRO E GRAFICO E SOFTWARE |
X | A | AA-28 | DMMX | 11111111 | 4584916 | AAA-C28XB | INDUSTRIAL LOGGING MULTIMETER | MULTIMETRO 289 C/ REGISTRO E GRAFICO P/ INDUSTRIAL |
X | A | AA-28 | DMMX | 11111111 | 4585051 | AAA-C28XC | INDUSTRIAL LOGGING MULTIMETER | MULTIMETRO 289 C/ REGISTRO, GRAFICO E SOFTWARE |
And I do need to verify (row by row) a huge table "fImports" to search inside field "Product Description" to find each product from my table "dProducts" and show partial or total match %.
Ex.:
ORDER_NUMBER | NCM | PRODUCT_DESCRIPTION_PT-BR |
564520000900001 | 90303100 | 2675778 - MULTIMETRO 289 |
564520000900002 | 90303100 | 4691232 - MULTIMETROS C/ TESTE DE ISOLACAO 1587 |
564520000900003 | 90303100 | 1564549 - MULTIMETRO |
564520000900004 | 90303100 | 1564560 - MULTIMETRO 177 |
564520000900005 | 90303100 | 1564549 - 79 MULTIMETRO |
564520001000001 | 90303100 | 4281915 - 28IIEX/BR MULTIMETRO DIGITAL PORTATIL 28IIEX |
564520001100001 | 90303100 | 4228705 - 117 MULTIMETRO PARA ELETRICISTA |
564520001100002 | 90303100 | 4367953 - MULTIMETRO PARA ELETRICISTAS 107 |
564520001100003 | 90303100 | 4404210 - MULTIMETRO DIGITAL -15B+ESP |
564520001100004 | 90303100 | 2538803 - MULTIMETRO -116 |
564520001200001 | 90303100 | 3026976 - TESTADOR DE COMPONENTES LCR55-A |
1891670001200001 | 90303100 | AM-32002 MULTIMETER, PVC MULTIMETRO |
654720001000001 | 90303100 | 12155 - MULTIMETRO SEM DISPOSITIVO REGISTRADOR, ACON. EM CXS. CONT. 100 PCS. C/U. |
1015950000100001 | 90303100 | CODIGO DO IMPORTADOR: AU325 - MULTIMETRO DIGITAL SENDO MULTIMETRO NA COR PRETA COMPOSTO PRINCIPALMENTE DE PLASTICO E COMPONENTES ELETRONICOS, CONTENDO ACESSORIOS COMO CABO POSITIVO E NEGATIVO UTILIZADOS DURANTE A MEDICAO. |
1907910000500001 | 90303100 | 3104045010 - Multimetro Multi medidor Optima VAF |
1907910000500002 | 90303100 | 3104045011 - Multimetro Multi medidor Delta Power |
1907910000500003 | 90303100 | 3104045012 - Multimetro Multi medidor Delta Energy |
1907680002100001 | 90303100 | C-14588 - CAPACIMETRO DIGITAL - FERRAMENTA UTILIZADA PARA TESTAR CAPACITORES DE FASE OU DE PARTIDA MATERIAL CONSTITUINTE PLASTICO |
25640000300001 | 90303100 | MULTIMETRO DIGITAL, REF. MB74117 |
525150000600001 | 90303100 | 4367953 - MULTIMETRO PARA ELETRICISTAS 107 |
1941690000100001 | 90303100 | U1194-66001 MULTÍMETRO DIGITAL DE MÃO EM FORMA DE ALICATE AMPERIMETRO, MEDIÇÃO DE CORRENTE EFICAZ ATE 600 V, DMM COM MEDIÇÃO DE RESISTENCIA , CAPACITANCIA, TENSAO CC E CA E CORRENTE CA, CCMICROM E CAMICROM MODELO U1194A, SEM DISPOSITIVO REGISTRADOR, |
Main thoughts:
1) As both tables have NCM code first rule may be search product from dProducts on fImports when Fiscal Code (NCM) = NCM
2) Search for dProduct product_ number on FImports PRODUCT_DESCRIPTION_PT-BR, in some cases if product_ number matches that´s it, but we may check all others fields just in case...
3) Search for dProduct modelgroup_name on FImports PRODUCT_DESCRIPTION_PT-BR
4) Search for dProduct family_name on FImports PRODUCT_DESCRIPTION_PT-BR
5) Search for dProduct produc_name on FImports PRODUCT_DESCRIPTION_PT-BR
6) Search for dProduct description_pt-br on FImports PRODUCT_DESCRIPTION_PT-BR, something like TOTAL FOUND WORDS / TOTAL WORDS = X% and define a business rule to accept over 80%, for example...
Hi @EvertonRosa ,
To search columns in Power Query, you can use Merge feature, please refer:
Tutorial: Shape and combine data in Power BI Desktop
Since there is no matched results based on NCM columns from your sample, currently it would return a blank table.
For fuzzy merge to create a column, please refer this blog:
How to fuzzy match the dirty data and horizontal display the corresponding table
Reference function in Power Query: Table.AddFuzzyClusterColumn
Best Regards,
Community Support Team _ Yingjie Li
Thanks @v-yingjl
I am trying a different approach, cause in some cases result will be only text similarity.
NCM is onlyne first step validation.
Greg, thanks for your help.
What I do want is IF product number is matched (2) or on (3-6) it gets one or more hypothesis with more than 80% (similarity), is to create a new column with product_number value from dProduct on fImports.
I would prefer to create this on Power Query, instead DAX.
@EvertonRosa Will have to research what is possible in Power Query. There is fuzzy matching available for Merge queries but seems like you might want things to be more customized. Might have to do something with a custom function. I'm not clear on whether your product description columns are identical for both tables or if they would be if you dropped the numeric digits and dash from the second table? From your description of what you want, it seems like nothing would match between the two tables or am I missing something?
In DAX, I created a highly customizable fuzzy matching pattern.
Fuzzy - Microsoft Power BI Community
In DAX you could use FIND or SEARCH to look for words in other columns as well.
Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!
Check out the October 2025 Power BI update to learn about new features.