Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
TMielke
New Member

Count substrings in comma-separated string of varying length

Hello,

 

I have a special problem which can be simplified to a table like this:

 

Number of ItemsItemsDesired measure
1Apple_xxyyy 41
3Banana_xxyxy 2, Melon_xxxyy 3, Orange_xxy 60
5Banana_xxyyx 2, Apple_yyyy 2, Apple_xxxx 3, Apple_yxyx 5,  Orange_yxyx 13

 

Now what I need is to count the occurence of special substrings (let's say "Apple_" for example) considering the comma delimiters. So within each comma delimited substring, I need to do a check (e.g. contains "Apple_") if true, sum it up. The problem is, that the number of items can be huge (let's say up to 500). How can I do this? I would prefer a solution without having to split the original column, however if there's no other way, it could be done.

 

Thanks a lot in advance!

1 ACCEPTED SOLUTION
Ahmedx
Super User
Super User

5 REPLIES 5
Hema_Gupta
Frequent Visitor

Hi,

Hope this helps.

Hema_Gupta_0-1708426480973.png

 

danextian
Super User
Super User

Hi @TMielke ,

 

Assuming that you have a column that indicates what text to find and that for each delimiter two instances of the text to find is to be counted just once  (say Apple_xxxx and apple_yyyy are counted once), this can be fairly done in Power Query. Here's a sample custom column formula

let 
texttofind = Text.Lower([text to find]),
substrings = Text.Split([Items],","),
lowercased = List.Transform(substrings, Text.Lower),
filtered = List.Select(lowercased, each Text.Contains(_, texttofind))
in List.Count(filtered)

here's a sample M code

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUXIsKMhJja+oqKysVDAB8kFiiSAxpVidaCVjIM8pMQ8IQUoqKhWMdBR8U3Py84BcoBYFYx0F/6LEvHSwCQpmQNUGQJwLUgHWb4qiv7ICpB9iYyXIQjgPaFoFyDCoHNAqBVOdQwtghoMFQC4zRrguFgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Number of Items" = _t, Items = _t, #"Desired measure" = _t, #"text to find" = _t]),
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Number of Items", Int64.Type}, {"Items", type text}, {"Desired measure", Int64.Type}}),
    #"Added Custom" = Table.AddColumn(#"Changed Type", "Count", each let 
texttofind = Text.Lower([text to find]),
substrings = Text.Split([Items],","),
lowercased = List.Transform(substrings, Text.Lower),
filtered = List.Select(lowercased, each Text.Contains(_, texttofind))
in List.Count(filtered))
in
    #"Added Custom"

danextian_0-1708426247162.png

 










Did I answer your question? Mark my post as a solution!


Proud to be a Super User!









"Tell me and I’ll forget; show me and I may remember; involve me and I’ll understand."
Need Power BI consultation, get in touch with me on LinkedIn or hire me on UpWork.
Learn with me on YouTube @DAXJutsu or follow my page on Facebook @DAXJutsuPBI.
Ahmedx
Super User
Super User

pls try this

Screenshot_2.png

this one seems to work for me exactly as expected! thanks a lot!

Arul
Super User
Super User

@TMielke ,

try this measure,

Count Measure =
LEN ( SUBSTITUTE ( MAX ( 'Table'[Items] ), "Apple_", "Apple_0" ) )
    - LEN ( MAX ( 'Table'[Items] ) )

Arul_0-1708424931610.png

 





Did I answer your question? Mark my post as a solution!

Proud to be a Super User!


LinkedIn


Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.