Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

To celebrate FabCon Vienna, we are offering 50% off select exams. Ends October 3rd. Request your discount now.

Reply
Anonymous
Not applicable

Clean column containing inconsistent data entered people

Hi,

 

I have a table of meetings and a column with topics. These topics are entered by the person conducting the meeting and aren't always standarized, but do contain a keyword. I read you can use Table.AddFuzzyClusterColumn to standardize any spelling or case issues and then look up values in a transform table. I tried this approach but I don't think I have enough data for the clustering algorithm to work. I was thinking if I can strip out the keyword in the Topics column and look up this value in the transformation table this would do the job.

 

Here is a simplified example. I want to standarized topics names in the "Topics" column by checking the "From" column in the TransformationTable and return the value in the "To" column. The final table will combine information from both tables. 

heroyo_1-1616172584232.png

 

 

How can this be done, or is there a better way to do this?

 

See PXI file here: https://drive.google.com/file/d/1YX8DmMVKRH4IYcKpus2Pyl9JsErvpeXa/view?usp=sharing 

1 ACCEPTED SOLUTION

Hello @Anonymous 

 

you can use Table.FuzzyNestedJoin. Use the joinOptions to define treshold and other options to change the outcome. However you didn't specify that all content is always replaced with the value of the to-column or you need a replacement. If the second is the case you could use both of your column from and to to search for the "from" and replace it with the "to"

Here to code

let
    Topic = let
        Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXKOUHBNSU9VitWJVjIC8uEcYyDHsag0KVHBOSc1sSggsbgYLG4C0oQQiQUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Meeting Id" = _t, Column1 = _t]),
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Meeting Id", Int64.Type}, {"Column1", type text}})
    in
        #"Changed Type",
Transformation = let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wck1JT1XSUXL0D9Z1jtB1dXF3VYrViVZyzklNLApILC4GyiHYsbEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [From = _t, To = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"From", type text}, {"To", type text}})
in
    #"Changed Type",

JoinFuzzy = Table.FuzzyNestedJoin
(
    Topic,
    "Column1",
    Transformation,
    "From",
    "New",
    JoinKind.LeftOuter,
    [Threshold=0.5, IgnoreCase=true]
),
    #"Expanded New" = Table.ExpandTableColumn(JoinFuzzy, "New", {"To"}, {"To"})
in
    #"Expanded New"

THis is the output

Jimmy801_0-1616264193034.png


If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

 

View solution in original post

9 REPLIES 9
Jimmy801
Community Champion
Community Champion

Hello @Anonymous 

 

did you check out the solutions proposed?

 

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

Anonymous
Not applicable

I'm not sure the best mod to accomplish what you are looking for is the use of fuzzy functions. In the attached file you can find an attempt with the Table.GroupFyzzy function but if you don't make any preliminary transformations, in my opinion it doesn't work. The solution I propose makes use of the List.PosiztioOf function with the help of a comparison criterion that can be further refined. The simple criterion I applied is the one for which the key you indicate is contained in the string and in this case the string is "translated"

Jimmy801
Community Champion
Community Champion

Hello @Anonymous 

 

could you please make an example of your data and what's your expected output? Please post some usable data.

 

If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

Anonymous
Not applicable

@Jimmy801 I've updated my orginal post with a picture of a simplified example. Hopefully this can clarify the issue. 

Hello @Anonymous 

 

you can use Table.FuzzyNestedJoin. Use the joinOptions to define treshold and other options to change the outcome. However you didn't specify that all content is always replaced with the value of the to-column or you need a replacement. If the second is the case you could use both of your column from and to to search for the "from" and replace it with the "to"

Here to code

let
    Topic = let
        Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXKOUHBNSU9VitWJVjIC8uEcYyDHsag0KVHBOSc1sSggsbgYLG4C0oQQiQUA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Meeting Id" = _t, Column1 = _t]),
        #"Changed Type" = Table.TransformColumnTypes(Source,{{"Meeting Id", Int64.Type}, {"Column1", type text}})
    in
        #"Changed Type",
Transformation = let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wck1JT1XSUXL0D9Z1jtB1dXF3VYrViVZyzklNLApILC4GyiHYsbEA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [From = _t, To = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"From", type text}, {"To", type text}})
in
    #"Changed Type",

JoinFuzzy = Table.FuzzyNestedJoin
(
    Topic,
    "Column1",
    Transformation,
    "From",
    "New",
    JoinKind.LeftOuter,
    [Threshold=0.5, IgnoreCase=true]
),
    #"Expanded New" = Table.ExpandTableColumn(JoinFuzzy, "New", {"To"}, {"To"})
in
    #"Expanded New"

THis is the output

Jimmy801_0-1616264193034.png


If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too

Have fun

Jimmy

 

@Anonymous please provide data per links below. We cannot paste images into Power BI.

 

How to get good help fast. Help us help you.

How To Ask A Technical Question If you Really Want An Answer

How to Get Your Question Answered Quickly - Give us a good and concise explanation
How to provide sample data in the Power BI Forum - Provide data in a table format per the link, or share an Excel/CSV file via OneDrive, Dropbox, etc.. Provide expected output using a screenshot of Excel or other image. Do not provide a screenshot of the source data. I cannot paste an image into Power BI tables.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
Anonymous
Not applicable

@edhans thank you for the link, it is useful to understand how to post better questions. I posted a link to my pbxi file with all the data, and thought that along with the description, it was clear. 

 

Can you tell me how I could have been more clear? I am genuinely courious, as I a new here and would like help maintain a good experience for everyone.  

It isn't a matter of clarity so much as usability of data. Screen caps are great for expected results, but you are asking me or others to key this data in. Provide a file link or a table per the links I gave. Then we can spend time solving the problem, not retyping data.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting
Anonymous
Not applicable

@edhans This is the part I don't understand, if you see my orginal post I linked to a PBI file containing my data and my test query. I'm not sure I could have presented the data in a better way.

Helpful resources

Announcements
September Power BI Update Carousel

Power BI Monthly Update - September 2025

Check out the September 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors
Top Kudoed Authors