Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
projman
Frequent Visitor

How to count unique strings across multiple columns

Hi

I am hoping someone can help me with a table i am struggling with.

 

I have multiple columns with varying numbers of rows in each.  Across the rows/columns are various single word strings, and i am just looking to output a single column that identifies each string found and how many times it occurs.  A particular string may appear multiple times in different columns (i.e. the string 'mystring' may occur 10 times in column A and also 15 times in column C)

 

Example input:

 

ShoehatgloveSockSock
hatjacketjacketgloveglove
hatjacketshoe hat
jacketjacketSweater  

 

output i am looking for:

Shoe2
Hat4
Jacket5
Glove3
Sweater1
Sock2

 

Note that my actual dataset is huge, and i wont know all the words that are in there to manually query each word.

Any help is gratefully received

1 ACCEPTED SOLUTION

@projman OK, I mocked this up. PBIX is attached under signature. I think I left out a DISTINCT in the table calc so that is there now.

Table2 = 
VAR tmpCol1 = SELECTCOLUMNS(Data,"Column",[Column1])
VAR tmpCol2 = SELECTCOLUMNS(Data,"Column",[Column2])
VAR tmpCol3 = SELECTCOLUMNS(Data,"Column",[Column3])
VAR tmpCol4 = SELECTCOLUMNS(Data,"Column",[Column4])
VAR tmpCol5 = SELECTCOLUMNS(Data,"Column",[Column5])
VAR tmpTable = DISTINCT(UNION(tmpCol1,tmpCol2,tmpCol3,tmpCol4,tmpCol5))
RETURN tmpTable

You want Data and Table2 tables in the PBIX.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

View solution in original post

7 REPLIES 7
projman
Frequent Visitor

Once again - thank you!

this has definitely taken me closer, but its still listing out each work multiple times if it appears in multiple columns.  For example, if 'shoe' appears in 'column 1' 20 times and 'column 4' 5 times, its listing out 'shoe' in the new column twice - once with a distinct count of 20 and the other showing 5.

I feel the last thing needed here is to merge them all together, but i dont think this is a simple case of removing dupes

@projman OK, I mocked this up. PBIX is attached under signature. I think I left out a DISTINCT in the table calc so that is there now.

Table2 = 
VAR tmpCol1 = SELECTCOLUMNS(Data,"Column",[Column1])
VAR tmpCol2 = SELECTCOLUMNS(Data,"Column",[Column2])
VAR tmpCol3 = SELECTCOLUMNS(Data,"Column",[Column3])
VAR tmpCol4 = SELECTCOLUMNS(Data,"Column",[Column4])
VAR tmpCol5 = SELECTCOLUMNS(Data,"Column",[Column5])
VAR tmpTable = DISTINCT(UNION(tmpCol1,tmpCol2,tmpCol3,tmpCol4,tmpCol5))
RETURN tmpTable

You want Data and Table2 tables in the PBIX.


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Thank you!  Thats the column finally sorted 🙂 thank you so much

For some reason, my measure however is just showing '1' for every value.  Im not sure why this is as im using the same code as you are, but with my variables in there. I shall keep looking into it...

Sorted it!  I had an extra distint in my MC Count measure that basically meant it was only ever looking at 1 string! All working now, thank you so much for your help!

Greg_Deckler
Super User
Super User

@projman Well, generally you would want to unpivot those columns in Power Query but you could also use MC Aggregations: Multi-Column Aggregations (MC Aggregations) - Microsoft Power BI Community


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

This is great, thank you!

 

Unpivoting them isn't a problem.  In fact, they were originally just a single column, but I extracted them out based on a colon delimiter.  I still have the original single column data, however, I couldn't see how to split that out be individual word, only by combinations of word.  i.e. shoe;jacket and shoe;jacket;sock, when it was in that form, which is why i split them into individual columns

 

So using your pbix I think I am halfway there!  Using mc Count Distinct I now have the total count of distinct strings, but how do I go about outputting each string to a new row and then showing its total count occurrence next to it?

 

 

 

@projman Well, you could create a separate unrelated table like this:

Table = 
VAR tmpCol1 = SELECTCOLUMNS(Data,"Column",[Value1])
VAR tmpCol2 = SELECTCOLUMNS(Data,"Column",[Value2])
VAR tmpCol3 = SELECTCOLUMNS(Data,"Column",[Value3])
VAR tmpCol4 = SELECTCOLUMNS(Data,"Column",[Value4])
VAR tmpTable = UNION(UNION(UNION(tmpCol1,tmpCol2),tmpCol3),tmpCol4)
RETURN tmpTable

Then your measure would be:

MC Count = 
VAR value = MAX('Table'[Column])
VAR tmpCol1 = SELECTCOLUMNS(Data,"Column",[Value1])
VAR tmpCol2 = SELECTCOLUMNS(Data,"Column",[Value2])
VAR tmpCol3 = SELECTCOLUMNS(Data,"Column",[Value3])
VAR tmpCol4 = SELECTCOLUMNS(Data,"Column",[Value4])
VAR tmpTable = UNION(UNION(UNION(tmpCol1,tmpCol2),tmpCol3),tmpCol4)
VAR tmpValue = COUNTROWS(FILTER(tmpTable,[Column]=value))
RETURN tmpValue

 

As for the Power Query thing, you would split out the values based on your delimiter. You would then select your other columns, right-click and "unpivot other columns". This would then give you a column with all of your single words in a single column in theory.

 

 

 


@ me in replies or I'll lose your thread!!!
Instead of a Kudo, please vote for this idea
Become an expert!: Enterprise DNA
External Tools: MSHGQM
YouTube Channel!: Microsoft Hates Greg
Latest book!:
The Definitive Guide to Power Query (M)

DAX is easy, CALCULATE makes DAX hard...

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.