Microsoft Fabric Community Conference 2025, March 31 - April 2, Las Vegas, Nevada. Use code FABINSIDER for a $400 discount.
Register nowGet inspired! Check out the entries from the Power BI DataViz World Championships preliminary rounds and give kudos to your favorites. View the vizzies.
Hi PowerBi experts, I have 2 questions:
1. I have a survey dataset where I'm trying to create a category with a condition that if the ID column has more than 5 entries, it should be tagged as either "global" or "regional". For example, if ID 1 contains 4 regions, it should be tagged as regional. Any ID with more than 5 or more regions should be tagged "global". How can I do this in PowerQuery? (see image below)
2. In the same dataset, I have a column that contains open-ended responses. The responses contain a lot of text but I need to extract out only certain words such as the name of countries they mention. For example, in the image below, one respondent has mentioned that they work in Australia and Canada. How can I create another column which only extracts the name of countries by ID? I have a list of countries in a seperate table as reference if needed.
Thank you!
Solved! Go to Solution.
For the first question, use a Group BY @vivi_nainai2021 grouping on the ID.
I then added a custom column with the formula:
if [Count] >= 5 then "Global" else "Regional"
It returns this:
I then expanded the Regional Column that was inside the All Rows column and deleted the temporary Count column, and final result is this:
The code for this is here:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUUpUitWBsJLgrGQ4KwXMMgayUsEsEyArDc5Kh7My4KxMOCsLzspWio0FAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Region = _t]),
#"Grouped Rows" = Table.Group(Source, {"ID"}, {{"Count", each Table.RowCount(_), Int64.Type}, {"All Rows", each _, type table [ID=nullable text, Region=nullable text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Type", each if [Count] >= 5 then "Global" else "Regional", type text),
#"Expanded All Rows" = Table.ExpandTableColumn(#"Added Custom", "All Rows", {"Region"}, {"Region"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded All Rows",{"Count"})
in
#"Removed Columns"
How to use M code provided in a blank query:
1) In Power Query, select New Source, then Blank Query
2) On the Home ribbon, select "Advanced Editor" button
3) Remove everything you see, then paste the M code I've given you in that box.
4) Press Done
5) See this article if you need help using this M code in your model.
On the last question, one approach could be this - very easy to do with the UI.
You will have something like this:
You can now merge the Value column shown with your list of countries, and filter out wherever there was a null. You know which ones belonged with each other by the index. So the 0 record had both Australia and Canada, but the 1 record had only Germany. Again, here it the full code example:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCsnILFbISCxWcCwtLilKzMlMVEjMS1FwTsxLTElUyE3NK8nMz0tN0VOK1YEqzgIqVMjJLC4pVnBPLcpNzKsESsYCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Replaced Value" = Table.ReplaceValue(Source,".","",Replacer.ReplaceText,{"Column1"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",",","",Replacer.ReplaceText,{"Column1"}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Replaced Value1", "Column1", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5", "Column1.6"}),
#"Added Index" = Table.AddIndexColumn(#"Split Column by Delimiter", "Index", 0, 1, Int64.Type),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value")
in
#"Unpivoted Other Columns"
You could definitely get more fancy using List.Generate to go through each record, but the code would be about 4 times a long and 10 times more complex, and I think the above logic will get the job done.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingFor the first question, use a Group BY @vivi_nainai2021 grouping on the ID.
I then added a custom column with the formula:
if [Count] >= 5 then "Global" else "Regional"
It returns this:
I then expanded the Regional Column that was inside the All Rows column and deleted the temporary Count column, and final result is this:
The code for this is here:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlTSUUpUitWBsJLgrGQ4KwXMMgayUsEsEyArDc5Kh7My4KxMOCsLzspWio0FAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Region = _t]),
#"Grouped Rows" = Table.Group(Source, {"ID"}, {{"Count", each Table.RowCount(_), Int64.Type}, {"All Rows", each _, type table [ID=nullable text, Region=nullable text]}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Type", each if [Count] >= 5 then "Global" else "Regional", type text),
#"Expanded All Rows" = Table.ExpandTableColumn(#"Added Custom", "All Rows", {"Region"}, {"Region"}),
#"Removed Columns" = Table.RemoveColumns(#"Expanded All Rows",{"Count"})
in
#"Removed Columns"
How to use M code provided in a blank query:
1) In Power Query, select New Source, then Blank Query
2) On the Home ribbon, select "Advanced Editor" button
3) Remove everything you see, then paste the M code I've given you in that box.
4) Press Done
5) See this article if you need help using this M code in your model.
On the last question, one approach could be this - very easy to do with the UI.
You will have something like this:
You can now merge the Value column shown with your list of countries, and filter out wherever there was a null. You know which ones belonged with each other by the index. So the 0 record had both Australia and Canada, but the 1 record had only Germany. Again, here it the full code example:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCsnILFbISCxWcCwtLilKzMlMVEjMS1FwTsxLTElUyE3NK8nMz0tN0VOK1YEqzgIqVMjJLC4pVnBPLcpNzKsESsYCAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Replaced Value" = Table.ReplaceValue(Source,".","",Replacer.ReplaceText,{"Column1"}),
#"Replaced Value1" = Table.ReplaceValue(#"Replaced Value",",","",Replacer.ReplaceText,{"Column1"}),
#"Split Column by Delimiter" = Table.SplitColumn(#"Replaced Value1", "Column1", Splitter.SplitTextByDelimiter(" ", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5", "Column1.6"}),
#"Added Index" = Table.AddIndexColumn(#"Split Column by Delimiter", "Index", 0, 1, Int64.Type),
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Added Index", {"Index"}, "Attribute", "Value")
in
#"Unpivoted Other Columns"
You could definitely get more fancy using List.Generate to go through each record, but the code would be about 4 times a long and 10 times more complex, and I think the above logic will get the job done.
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI ReportingI figured out question 1. Simply counted the number of rows per ID by name of region. Still looking for an asnwer to the second question 🙂
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code FABINSIDER for a $400 discount!
Check out the February 2025 Power BI update to learn about new features.
User | Count |
---|---|
28 | |
27 | |
23 | |
14 | |
10 |
User | Count |
---|---|
24 | |
21 | |
17 | |
11 | |
9 |