Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Don't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.

Reply
gopalv
Microsoft Employee
Microsoft Employee

Count unique strings in column containing list of strings

I have data like the following:

Timestamp

Subscriptions

2020-06-11sub1, sub2
2020-06-11sub2, sub3
2020-06-12sub5, sub6
2020-06-12sub5

 

I'd like to be able to plot a chart which shows the number of unique subscriptions at each time stamp. So for timestamp 2020-06-11, the value would be 3, and for timestamp 2020-06-12, the value would be 2. Is this possible?

1 ACCEPTED SOLUTION

Hi @gopalv ,

 

Based on the requirements you have posted, i think you will need to get your table in the correct format.

 

For calculating it only once, you can create a CC

 

CAL = CALCULATE(MAX('Table'[Usage]),FILTER('Table','Table'[Usage] = EARLIER('Table'[Usage]) && 'Table'[Compute Type] = EARLIER('Table'[Compute Type]) && 'Table'[Timestamp] = EARLIER('Table'[Timestamp]) && 'Table'[Subscription] > EARLIER('Table'[Subscription])))
 
1.jpg
 
 
Regards,
Harsh Nathani
WOULD appreciate a KUDOS 🙂 🙂

View solution in original post

8 REPLIES 8
harshnathani
Community Champion
Community Champion

Hi @gopalv ,

 

 

You can use Power Query

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjIwMtA1MNM1NFTSUSouTTLUUQCSRkqxOphyRmA5Y1Q5I4icKVjODLscCaImSrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Timestamp = _t, #"(blank)" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Timestamp", type date}, {"(blank)", type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"(blank)", "Subscriptions"}}),
SplittedCol = Table.TransformColumns(#"Renamed Columns",{{"Subscriptions", each Text.Split(_,",")}}),
#"Expanded Subscriptions" = Table.ExpandListColumn(SplittedCol, "Subscriptions"),
#"Trimmed Text" = Table.TransformColumns(#"Expanded Subscriptions",{{"Subscriptions", Text.Trim, type text}})
in
#"Trimmed Text"

 

 

 

 

Original Table : Have Added some Values

1.jpg

 

 

 

2.JPG

 

 

 

3.JPG

 

 

 

Create a simple measure 

 

Sub = DISTINCTCOUNT('Table (2)'[Subscriptions])

 

4.JPG

 

 

Regards,

Harsh Nathani

Did I answer your question? Mark my post as a solution! Appreciate with a Kudos!! (Click the Thumbs Up Button)

Hi @gopalv ,

 

 

You can also use this code in Power Query 

 

let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMjIwMtA1MNM1NFTSUSouTTLUUQCSRkqxOphyRmA5Y1Q5I4icKVjODLscCaImSrGxAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Timestamp = _t, #"(blank)" = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Timestamp", type date}, {"(blank)", type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"(blank)", "Subscriptions"}}),
#"Grouped Rows" = Table.Group(#"Renamed Columns", {"Timestamp"}, {{"a", each Text.Combine([Subscriptions], ", "), type text}}),
#"Renamed Columns1" = Table.RenameColumns(#"Grouped Rows",{{"a", "Combined"}}),
#"RemovedDuplicates" = Table.TransformColumns (#"Renamed Columns1",{{"Combined", each Text.Combine(List.Distinct(List.Transform(Text.Split(_,","), Text.Trim)),", ")}})
in
#"RemovedDuplicates"

 

1.jpg

 

 

Create a measure

 

CountOfItems = LEN(MAX('Table (3)'[Combined])) - LEN(SUBSTITUTE(MAX('Table (3)'[Combined]),",","")) + 1
 
 
Regards,
HN
az38
Community Champion
Community Champion

Hi @gopalv 

try a measure

Measure = CALCULATE(
SUMX('Table', 
LEN('Table'[Subscriptions])-LEN(SUBSTITUTE('Table'[Subscriptions], ",", "")) + 1
), 
ALLEXCEPT('Table','Table'[Timestamp])
)

do not hesitate to give a kudo to useful posts and mark solutions as solution
LinkedIn
gopalv
Microsoft Employee
Microsoft Employee

In my example, I have multiple rows per timestamp. Would this code still work in that case?

Hi @gopalv ,

 

Yes, it will.

 

The first post.. Removes the delimiter and creates a new row for each time stamp and then does a DISTINCT COUNT.

 

 

The second post .. Combines the rows for a particular time stamp, then removes the duplicates and Counts the no. of words.

 

 

Regards,

Harsh Nathani

 

I'd prefer a DAX solution, if possible. The given DAX solution doesn't appear to be correct and the problem with splitting subscriptions in PowerQuery is that the resulting table ends up double- or triple-counting values in another column (not shown) that represent usage across all subscriptions at each timestamp.

 

For example, I start with:

TimestampSubscriptionsComputetypeUsage (across all subscriptions)
2020-06-11sub1, sub2Compute135
2020-06-11sub3, sub4Compute289
2020-06-12sub5, sub6Compute132
2020-06-12sub7, sub8Compute282

 

Your solution will give me a table that looks like:

 

TimestampSubscriptionComputeUsage
2020-06-11sub1Compute135
2020-06-11sub2Compute135
2020-06-11sub3Compute289
2020-06-11sub4Compute289
...   
2020-06-12sub8Compute282

 

Now, if I try to plot usage over time without any filters, I will suddenly have double as much usage at every time point.

Hi,

To plot usage over time, drag Year/Month from the Calendar Table and write this mwasure

=min(Data[Usage])

Hope this helps.


Regards,
Ashish Mathur
http://www.ashishmathur.com
https://www.linkedin.com/in/excelenthusiasts/

Hi @gopalv ,

 

Based on the requirements you have posted, i think you will need to get your table in the correct format.

 

For calculating it only once, you can create a CC

 

CAL = CALCULATE(MAX('Table'[Usage]),FILTER('Table','Table'[Usage] = EARLIER('Table'[Usage]) && 'Table'[Compute Type] = EARLIER('Table'[Compute Type]) && 'Table'[Timestamp] = EARLIER('Table'[Timestamp]) && 'Table'[Subscription] > EARLIER('Table'[Subscription])))
 
1.jpg
 
 
Regards,
Harsh Nathani
WOULD appreciate a KUDOS 🙂 🙂

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!

Jan25PBI_Carousel

Power BI Monthly Update - January 2025

Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.

Jan NL Carousel

Fabric Community Update - January 2025

Find out what's new and trending in the Fabric community.