erlicp
Frequent Visitor

Fuzzy Grouping

Hello, 

 

I have a long list of account names that I have compiled from several files into a dataflow. I then applied the fuzzy group function to the entire list of accounts. Picture below for reference:

erlicp_1-1673979703611.png

My question is: how do I make use of the grouped account names? In Power BI Desktop there are data groups that I currently use to group these account names together. In the screenshot below you can see the Harvard account name group highlighted in red.

erlicp_3-1673980028132.png

 

My question is: is there any way to use fuzzy grouping to create these data groups inside my dataflow, or do I have to create the groups manually in the desktop version?

 

*Please be gentle, I am very new to data analytics and Power BI.

 

 

1 ACCEPTED SOLUTION
edhans
Super User

Data Groups are done in the model, not in Power Query or source data. 

If you wanted to group them in the dataflow, you'd need to create a conditional column that adds the right grouping as another column:
if [field] = "Boston's Children" then "Harvard Med"
else if [field] = "something else" then "Harvard Public Health"

and so on, ending with a final else branch (for example, else [field]), because an M if expression always requires one.

I'd argue the conditional column is the better approach from a modeling standpoint, but it is more tedious than drag-and-drop data grouping.
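
To make the conditional column concrete, here is a minimal sketch of how it might look as a full query step. The sample table, the column names Account and Account Group, and the "Acme Corp" row are assumptions for illustration only; in a real dataflow, Source would be the combined account list.

```powerquery
let
    // Hypothetical sample data standing in for the combined account list
    Source = #table(
        type table [Account = text],
        {
            {"Boston's Children"},
            {"something else"},
            {"Acme Corp"}
        }
    ),
    // Add the grouping as a new column.
    // The final else keeps any unmatched account name as its own group.
    AddGroup = Table.AddColumn(
        Source,
        "Account Group",
        each
            if [Account] = "Boston's Children" then "Harvard Med"
            else if [Account] = "something else" then "Harvard Public Health"
            else [Account],
        type text
    )
in
    AddGroup
```

Because the group is stored as an ordinary column, it travels with the dataflow and is available to every report that consumes it.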



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

erlicp
Frequent Visitor

Yeah, I'm not sure either of these options is feasible: the list of accounts contains roughly 500k rows. Maybe I am better off using the built-in ML modules to try to group the accounts. I've already done at least a few thousand manually, which I could use as a potential training set.


edhans
Super User

You could create a list of group names to merge against and use that to create your values. Fuzzy Merge is available, which means you don't have to create an entry for every one of the 500K possible values.
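
As a rough sketch of the fuzzy merge idea, the query below joins the accounts against a small lookup table using Table.FuzzyNestedJoin. The table contents, the column names MatchName and AccountGroup, and the threshold of 0.8 are all assumptions chosen for illustration; the right threshold depends on the data.

```powerquery
let
    // Hypothetical account list; in practice this is the 500K-row table
    Accounts = #table(
        type table [Account = text],
        {{"Harvard Medical School"}, {"Harvard Med School"}, {"Acme Corp"}}
    ),
    // Hypothetical lookup list: a name to match against and the group it belongs to
    GroupList = #table(
        type table [MatchName = text, AccountGroup = text],
        {{"Harvard Medical School", "Harvard Med"}}
    ),
    // Fuzzy left outer join: every account is kept, and close matches get a nested row
    Merged = Table.FuzzyNestedJoin(
        Accounts, {"Account"},
        GroupList, {"MatchName"},
        "Matches",
        JoinKind.LeftOuter,
        [IgnoreCase = true, IgnoreSpace = true, Threshold = 0.8, NumberOfMatches = 1]
    ),
    // Pull the group name out of the nested match table; unmatched accounts get null
    Expanded = Table.ExpandTableColumn(Merged, "Matches", {"AccountGroup"}, {"Account Group"})
in
    Expanded
```

The lookup list only needs one row per group (or per common variant), not one row per account, which is what keeps this manageable at 500K rows.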
erlicp
Frequent Visitor

"You could create a list to merge": when you say "create a list", are you referring to making a transformation table?


edhans
Super User

I just mean a list of items (a table, not a Power Query "List") that could be pulled in and then fuzzy merged against your accounts. For example, if you turned down the similarity threshold in Fuzzy Merge, pretty much anything with Harvard in it would match and could be grouped into the Harvard section.

It is like all AI-type features, though: it may work 95-97% of the time, and for the rest you have to keep adding exceptions.
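
One possible way to handle the exceptions is to keep them in their own small table and let an exact match there take priority over the fuzzy result. The sketch below assumes a hypothetical Overrides table, invented sample rows, and a deliberately low threshold of 0.4; none of these come from the thread, and which rows actually match at a given threshold has to be checked against real data.

```powerquery
let
    // Hypothetical accounts, including one that should NOT land in the Harvard group
    Accounts = #table(
        type table [Account = text],
        {{"Harvard Medical School"}, {"Harvard Public Health"}, {"Harvard Street Bakery"}}
    ),
    // Hypothetical lookup list for the loose fuzzy match
    GroupList = #table(
        type table [MatchName = text, AccountGroup = text],
        {{"Harvard", "Harvard"}}
    ),
    // Hypothetical exceptions, maintained by hand as mismatches are found
    Overrides = #table(
        type table [Account = text, OverrideGroup = text],
        {{"Harvard Street Bakery", "Other"}}
    ),
    // Loose fuzzy match: a low Threshold accepts weaker similarities
    Fuzzy = Table.FuzzyNestedJoin(
        Accounts, {"Account"},
        GroupList, {"MatchName"},
        "Matches",
        JoinKind.LeftOuter,
        [IgnoreCase = true, Threshold = 0.4, NumberOfMatches = 1]
    ),
    FuzzyExpanded = Table.ExpandTableColumn(Fuzzy, "Matches", {"AccountGroup"}, {"FuzzyGroup"}),
    // Exact join against the exceptions table
    WithOverrides = Table.NestedJoin(
        FuzzyExpanded, {"Account"},
        Overrides, {"Account"},
        "Override",
        JoinKind.LeftOuter
    ),
    OverrideExpanded = Table.ExpandTableColumn(WithOverrides, "Override", {"OverrideGroup"}, {"OverrideGroup"}),
    // Priority: manual override, then fuzzy group, then the account name itself
    Final = Table.AddColumn(
        OverrideExpanded,
        "Account Group",
        each
            if [OverrideGroup] <> null then [OverrideGroup]
            else if [FuzzyGroup] <> null then [FuzzyGroup]
            else [Account],
        type text
    ),
    Result = Table.SelectColumns(Final, {"Account", "Account Group"})
in
    Result
```

Keeping the exceptions in a separate table means the fuzzy settings can stay loose while the corrections accumulate in one place.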
