Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.
Hi
I am hoping someone can help me with a table i am struggling with.
I have multiple columns with varying numbers of rows in each. Across the rows/columns are various single word strings, and i am just looking to output a single column that identifies each string found and how many times it occurs. A particular string may appear multiple times in different columns (i.e. the string 'mystring' may occur 10 times in column A and also 15 times in column C)
Example input:
Shoe | hat | glove | Sock | Sock |
hat | jacket | jacket | glove | glove |
hat | jacket | shoe | hat | |
jacket | jacket | Sweater |
output i am looking for:
Shoe | 2 |
Hat | 4 |
Jacket | 5 |
Glove | 3 |
Sweater | 1 |
Sock | 2 |
Note that my actual dataset is huge, and i wont know all the words that are in there to manually query each word.
Any help is gratefully received
Solved! Go to Solution.
@projman OK, I mocked this up. PBIX is attached under signature. I think I left out a DISTINCT in the table calc so that is there now.
Table2 =
VAR tmpCol1 = SELECTCOLUMNS(Data,"Column",[Column1])
VAR tmpCol2 = SELECTCOLUMNS(Data,"Column",[Column2])
VAR tmpCol3 = SELECTCOLUMNS(Data,"Column",[Column3])
VAR tmpCol4 = SELECTCOLUMNS(Data,"Column",[Column4])
VAR tmpCol5 = SELECTCOLUMNS(Data,"Column",[Column5])
VAR tmpTable = DISTINCT(UNION(tmpCol1,tmpCol2,tmpCol3,tmpCol4,tmpCol5))
RETURN tmpTable
You want Data and Table2 tables in the PBIX.
Once again - thank you!
this has definitely taken me closer, but its still listing out each work multiple times if it appears in multiple columns. For example, if 'shoe' appears in 'column 1' 20 times and 'column 4' 5 times, its listing out 'shoe' in the new column twice - once with a distinct count of 20 and the other showing 5.
I feel the last thing needed here is to merge them all together, but i dont think this is a simple case of removing dupes
@projman OK, I mocked this up. PBIX is attached under signature. I think I left out a DISTINCT in the table calc so that is there now.
Table2 =
VAR tmpCol1 = SELECTCOLUMNS(Data,"Column",[Column1])
VAR tmpCol2 = SELECTCOLUMNS(Data,"Column",[Column2])
VAR tmpCol3 = SELECTCOLUMNS(Data,"Column",[Column3])
VAR tmpCol4 = SELECTCOLUMNS(Data,"Column",[Column4])
VAR tmpCol5 = SELECTCOLUMNS(Data,"Column",[Column5])
VAR tmpTable = DISTINCT(UNION(tmpCol1,tmpCol2,tmpCol3,tmpCol4,tmpCol5))
RETURN tmpTable
You want Data and Table2 tables in the PBIX.
Thank you! Thats the column finally sorted 🙂 thank you so much
For some reason, my measure however is just showing '1' for every value. Im not sure why this is as im using the same code as you are, but with my variables in there. I shall keep looking into it...
Sorted it! I had an extra distint in my MC Count measure that basically meant it was only ever looking at 1 string! All working now, thank you so much for your help!
@projman Well, generally you would want to unpivot those columns in Power Query but you could also use MC Aggregations: Multi-Column Aggregations (MC Aggregations) - Microsoft Power BI Community
This is great, thank you!
Unpivoting them isn't a problem. In fact, they were originally just a single column, but I extracted them out based on a colon delimiter. I still have the original single column data, however, I couldn't see how to split that out be individual word, only by combinations of word. i.e. shoe;jacket and shoe;jacket;sock, when it was in that form, which is why i split them into individual columns
So using your pbix I think I am halfway there! Using mc Count Distinct I now have the total count of distinct strings, but how do I go about outputting each string to a new row and then showing its total count occurrence next to it?
@projman Well, you could create a separate unrelated table like this:
Table =
VAR tmpCol1 = SELECTCOLUMNS(Data,"Column",[Value1])
VAR tmpCol2 = SELECTCOLUMNS(Data,"Column",[Value2])
VAR tmpCol3 = SELECTCOLUMNS(Data,"Column",[Value3])
VAR tmpCol4 = SELECTCOLUMNS(Data,"Column",[Value4])
VAR tmpTable = UNION(UNION(UNION(tmpCol1,tmpCol2),tmpCol3),tmpCol4)
RETURN tmpTable
Then your measure would be:
MC Count =
VAR value = MAX('Table'[Column])
VAR tmpCol1 = SELECTCOLUMNS(Data,"Column",[Value1])
VAR tmpCol2 = SELECTCOLUMNS(Data,"Column",[Value2])
VAR tmpCol3 = SELECTCOLUMNS(Data,"Column",[Value3])
VAR tmpCol4 = SELECTCOLUMNS(Data,"Column",[Value4])
VAR tmpTable = UNION(UNION(UNION(tmpCol1,tmpCol2),tmpCol3),tmpCol4)
VAR tmpValue = COUNTROWS(FILTER(tmpTable,[Column]=value))
RETURN tmpValue
As for the Power Query thing, you would split out the values based on your delimiter. You would then select your other columns, right-click and "unpivot other columns". This would then give you a column with all of your single words in a single column in theory.
Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City
Check out the April 2024 Power BI update to learn about new features.
User | Count |
---|---|
114 | |
100 | |
81 | |
70 | |
62 |
User | Count |
---|---|
148 | |
116 | |
104 | |
90 | |
65 |