Earn the coveted Fabric Analytics Engineer certification. 100% off your exam for a limited time only!
I have a table that has a column called Tags each row of which contains a list. In each list there is a set of field/value pairs (e.g. ENVIRONMENT and Production").
I would like to extract all the field names from the list and check if the word "environment" appears as a fieldname. But I also want to check of " environment" or "environment " exists along with any other typical typos like incorrect/different case. I cannot find any way of using a wildcard such as "*environment*" in my transformation step.
I'm trying to use this set of functions:
= Table.AddColumn(#"Added Custom", "TagNames", each List.FindText( Record.FieldNames([Tags]),"ENVIRONMENT"))
This works but only for the exact match (including case) on "ENVIRONMENT".
Solved! Go to Solution.
@vgeldbr try this dynamic version and you need to adapt to your scenario
let
fx=(input)=>
Web.Page(
"<script>
var x='"&input&"';
var b=x.search(/environment/gmi);
document.write(b);
</script>"){0}[Data]{0}[Children]{1}[Children]{0}[Text],
Source = #table({"ColumnA", "ColumnB"}, {{[ENVIRONMENT=1,Data=1],1},{[environment=1,Data=1],2},{[ environment=1,Data=1],3},{[environment =1,Data=1],4},{[environ=1,Data=1],5},{[Data=1,ENVIRONMENT=1],6}}),
#"Added Custom" = Table.AddColumn(Source, "Custom", each let
x = Table.FromList(Table.ColumnNames(Table.FromRecords({[ColumnA]}))),
y= Table.FromList(x[Column1]),
z = Table.AddColumn(y,"Custom",each fx([Column1])),
a=Table.SelectRows(z,each [Custom]<>"-1")
in a),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each (Table.IsEmpty([Custom]) = false)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Custom"})
in
#"Removed Columns"
I started with a differnt data as following coordinate of ENVIRONMENT
Source = #table({"ColumnA", "ColumnB"}, {{[ENVIRONMENT=1,Data=1],1},{[environment=1,Data=1],2},{[ environment=1,Data=1],3},{[environment =1,Data=1],4},{[environ=1,Data=1],5},{[Data=1,ENVIRONMENT=1],6}})
@vgeldbr still dobale
let
fx=(input)=>
Web.Page(
"<script>
var x='"&input&"';
var b=x.search(/environment/gmi);
document.write(b);
</script>"){0}[Data]{0}[Children]{1}[Children]{0}[Text],
Source = #table({"ColumnA", "ColumnB"}, {{[ENVIRONMENT=1,Data=1],1},{[environment=1,Data=1],2},{[ environment=1,Data=1],3},{[environment =1,Data=1],4},{[environ=1,Data=1],5}}),
#"Filtered Rows" = Table.SelectRows(Source, each (fx(Table.ColumnNames(Table.FromRecords({[ColumnA]})){0}) <> "-1"))
in
#"Filtered Rows"
in your case , you need to change the index number and it will be (looking at the screenshot you provided)
#"Filtered Rows" = Table.SelectRows(Source, each (fx(Table.ColumnNames(Table.FromRecords({[Tags]})){6}) <> "-1"))
Fantastic and almost there. I don't understand the function part fully and the solution appears to be failing where the number of items in the record is 6 or fewer. My screenshot example had 7 columns in the record but in fact as these are tags there could be any number from none to infinite. I'm guessing that the issue relates to the value of 6 in the filter argument:
(fx(Table.ColumnNames(Table.FromRecords({[Tags]})){6}) <> "-1")
But I don't understand the function itself to figure out why:
let
Source = (input)=>
Web.Page(
"<script>
var x='"&input&"';
var b=x.search(/environment/gmi);
document.write(b);
</script>"){0}[Data]{0}[Children]{1}[Children]{0}[Text]
in
Source
@vgeldbr as far as understading the funtion, please read two of my blog posts
Can you please provide a sizeable sample data?
Thanks. I've read the blog posts and have a basic grasp. I believe I'm stuck because I cannot find any documentation explaining the {6} parameter for Table.Columnames. Microsoft documentation does not mention any parameters.
Table.ColumnNames(Table.FromRecords({[Tags]})){6})
I will try to create a sample data set - my actual data is many millions of rows so need to spend some time doing that unless there is a ready pointer to above.
if you look at the following screenshot, in a 0 based inedx (PQ uses 0 based index), ENVIRONMENT's position is 6
and through the following code
Table.SelectRows(Source, each (fx(Table.ColumnNames(Table.FromRecords({[Tags]})){6}) <> "-1"))
I am asking PQ to invoke the custome function(fx) on each element (cell) of TAG hoping when that funtion runs it always find ENVIRONMENT ot variation of ENVIRONMENT in the 6th place.
@vgeldbr would be lot easier to debug with sample data
@smpa01 any thoughts on how to extract sample data? It is from a very large SQL database. I can grab the first 100 lines but if I post that PBIX you won't be able to connec to the source and I suspect it won't work for you. I can't see any obvious way to take the data in the query off line and post it. I tried DAX Studio but that does not export the reccords and lists in the columns.
In terms of your other response, unfortuntely it looks like in the data ENVIRONMENT is not always at the same location. So, not only is there not a constant number of columns in the record but the record I need is not at a consistent location. And of course the original problem remains is tha the tag name may have typos and other issues with it hence the need for a wildcard search.
@vgeldbr try this dynamic version and you need to adapt to your scenario
let
fx=(input)=>
Web.Page(
"<script>
var x='"&input&"';
var b=x.search(/environment/gmi);
document.write(b);
</script>"){0}[Data]{0}[Children]{1}[Children]{0}[Text],
Source = #table({"ColumnA", "ColumnB"}, {{[ENVIRONMENT=1,Data=1],1},{[environment=1,Data=1],2},{[ environment=1,Data=1],3},{[environment =1,Data=1],4},{[environ=1,Data=1],5},{[Data=1,ENVIRONMENT=1],6}}),
#"Added Custom" = Table.AddColumn(Source, "Custom", each let
x = Table.FromList(Table.ColumnNames(Table.FromRecords({[ColumnA]}))),
y= Table.FromList(x[Column1]),
z = Table.AddColumn(y,"Custom",each fx([Column1])),
a=Table.SelectRows(z,each [Custom]<>"-1")
in a),
#"Filtered Rows" = Table.SelectRows(#"Added Custom", each (Table.IsEmpty([Custom]) = false)),
#"Removed Columns" = Table.RemoveColumns(#"Filtered Rows",{"Custom"})
in
#"Removed Columns"
I started with a differnt data as following coordinate of ENVIRONMENT
Source = #table({"ColumnA", "ColumnB"}, {{[ENVIRONMENT=1,Data=1],1},{[environment=1,Data=1],2},{[ environment=1,Data=1],3},{[environment =1,Data=1],4},{[environ=1,Data=1],5},{[Data=1,ENVIRONMENT=1],6}})
Thanks so much for the detailed help @smpa01 . I think I can get there now. Makes me realize the underlying data is a mess. I guess with tags that is not entirely unexpected!
@vgeldbr yes it possible
let
fx=(input)=>
Web.Page(
"<script>
var x='"&input&"';
var b=x.search(/environment/gm);
document.write(b);
</script>"){0}[Data]{0}[Children]{1}[Children]{0}[Text],
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSs0ryyzKz8tNzStRitWJVlJAF0DiKyALKMXGAgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Filtered Rows" = Table.SelectRows(#"Changed Type", each (fx([Column1]) <> "-1"))
in
#"Filtered Rows"
Reference - https://community.powerbi.com/t5/Community-Blog/Using-JavaScript-in-power-query-for-regex-Part2/ba-p...
Pbis is attached
This is great! But still stuck because my column contains a record and the record may or may not contain the Tag "ENVIRONMENT" or any of the variants I mentioned. I am able to extract the ENVIRONMENT field from the record if it exists but I need to be able to extract all variations of the field name not just ENVIRONMENT. See two screenshots:
User | Count |
---|---|
139 | |
113 | |
103 | |
73 | |
63 |
User | Count |
---|---|
136 | |
125 | |
107 | |
70 | |
61 |