March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Good afternoon everyone,
I would like to ask for your help, as I am facing the following issue. I work with a product table, where I have a field for products. In this field, I have the name and the weight of each product (except for some that do not have it). In the table, I do not have a field just for the weight, and today the need arose for me to have this field. I thought about doing something similar to Excel's "text to columns", but so far, this solution is not meeting my needs, or perhaps I am not sure how to do it. I would like to ask for your help to see if it is possible to separate only the weight, creating another column through Power Query.
Some examples of products:
Chocolate 80g
White Chocolate 90g
Biscuit 60g Gold
Product from 100g to 500g
I would like to point out that they do not follow a pattern of being always at the far right. In some cases, the weight is in the middle of the name, in others to the right, and in other cases, the "g" is separated.
Do you know how to develop a solution like this through Power Query?
Thank you in advance.
Solved! Go to Solution.
Hi @Lwpro,
it is not an easy task because we don't see exact data, but something like this could help.
Edit 2024-03-01: I've updated my code a bit
It is mandantory to have space after each unit of measure (or it has to be at the end of text string)
Result:
You have to define units of measure in 2nd step (just separate them with coma)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ZY+xCsIwFEV/5VJwc3htreioDgGhKDg4lA6ljWkgTSRNBP/epA62mPHcw819VZWcetMa1TiOlGgFYEcsqddVcu9lgL94T2LiRzm2XjpsyfU4aC1f3I6NfSMQgfiYUd2kXq3pfOvwsGaI9QLOoKC/IgjcpAo9E59/CbZQ04wYMso2OHvNF1HpBUrjwhoUeRwRvfzwtFJ9zwmFNo4ABoWLbbTgyyDgMC8taC7UHw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Item = _t]),
DefineUnitsOfMeasureHere = "g, kg, ml, l",
Ad_Weight = Table.AddColumn(Source, "Weight", each [ l_units = List.Buffer(List.Transform(Text.Split(DefineUnitsOfMeasureHere, ","), Text.Trim)),
l_removedSpaces = List.ReplaceMatchingItems(List.RemoveItems(Text.Split(Text.Upper([Item]), " "), {""}), List.Zip( { List.Transform(l_units, Text.Upper), l_units } )),
t_replaceUnitsSpaces =
List.Accumulate(
List.Zip( { List.Transform(l_units, each " " & _), l_units } ),
Text.Combine(l_removedSpaces, " "),
(s,c)=> Text.Replace(s, c{0}, c{1} )
),
//Transform to lower text list and remove rows with numbers only
l_weight1 = List.Select(List.Transform(Text.Split(t_replaceUnitsSpaces, " "), Text.Lower), each (try Number.From(_) otherwise _) is text),
//Select only valid weight
l_weight2 =
List.Accumulate(
l_units,
{},
(s,c)=> s & List.Transform( l_weight1, each
if Text.EndsWith(_, c)
and (try Number.From(Text.Replace(_, c, "")) otherwise _) is number
then _ else null )
),
l_weightFinal = List.RemoveNulls(l_weight2),
t_result = if List.Count(l_weight2) > 1 then Text.Combine(l_weight2, " to ") else l_weight2{0}
][t_result], type text)
in
Ad_Weight
Hi @Lwpro,
Alternatively, you can try something like this.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs7IT87PSSxJVVCwMHBXitWJVgrPyARyERKWBulgcafM4uTSzBIFBQUzg3QFEHDPz0kBSwUU5aeUJpcopBXl5yoYGgClS/IVTA3QNJoZKKQrBGfmlKUWgcWRrVAA2h0LAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Item = _t]),
getWeight = Table.AddColumn(Source, "Weight", each
[
a = List.RemoveMatchingItems( List.Transform( Splitter.SplitTextByCharacterTransition( each true, {"G", "g"})([Item]), (x)=> Text.Select(Text.Trim(x), {"0".."9"}) ), {"", null}),
b = if List.Count(a) >1 then Text.Combine( a, "g to ") & "g" else List.First(a) & "g"
][b], type text
)
in
getWeight
That will get you this result.
or maybe even this, as there appear to be no other numbers contained within your string.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs7IT87PSSxJVVCwMHBXitWJVgrPyARyERKWBulgcafM4uTSzBIFBQUzg3QFEHDPz0kBSwUU5aeUJpcopBXl5yoYGgClS/IVTA3QNJoZKKQrBGfmlKUWgcWRrVAA2h0LAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Item = _t]),
getWeight = Table.AddColumn(Source, "Weight", each
[
a = Text.Split([Item], " "),
b = List.RemoveMatchingItems( List.Transform(a, each Text.Select(_, {"0".."9"})), {"", null}),
c = if List.Count(b) >1 then Text.Combine( b, "g to ") & "g" else List.First(b) & "g"
][c], type text
)
in
getWeight
I hope this is helpful
Hi @Lwpro,
Alternatively, you can try something like this.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs7IT87PSSxJVVCwMHBXitWJVgrPyARyERKWBulgcafM4uTSzBIFBQUzg3QFEHDPz0kBSwUU5aeUJpcopBXl5yoYGgClS/IVTA3QNJoZKKQrBGfmlKUWgcWRrVAA2h0LAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Item = _t]),
getWeight = Table.AddColumn(Source, "Weight", each
[
a = List.RemoveMatchingItems( List.Transform( Splitter.SplitTextByCharacterTransition( each true, {"G", "g"})([Item]), (x)=> Text.Select(Text.Trim(x), {"0".."9"}) ), {"", null}),
b = if List.Count(a) >1 then Text.Combine( a, "g to ") & "g" else List.First(a) & "g"
][b], type text
)
in
getWeight
That will get you this result.
or maybe even this, as there appear to be no other numbers contained within your string.
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcs7IT87PSSxJVVCwMHBXitWJVgrPyARyERKWBulgcafM4uTSzBIFBQUzg3QFEHDPz0kBSwUU5aeUJpcopBXl5yoYGgClS/IVTA3QNJoZKKQrBGfmlKUWgcWRrVAA2h0LAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Item = _t]),
getWeight = Table.AddColumn(Source, "Weight", each
[
a = Text.Split([Item], " "),
b = List.RemoveMatchingItems( List.Transform(a, each Text.Select(_, {"0".."9"})), {"", null}),
c = if List.Count(b) >1 then Text.Combine( b, "g to ") & "g" else List.First(b) & "g"
][c], type text
)
in
getWeight
I hope this is helpful
Hi @m_dekorte, to be honest I didn't know about Splitter.SplitTextByCharacterTransition so thank you 🙂
you just should repair this:
your v1
your v2
my version:
@Lwpro, if you find something else - let us know 😉
Yes, that limitation was called out.
If a representative sample was provided there's no need for adjustment. Time will tell...
Great stuff.
Hi @Lwpro,
it is not an easy task because we don't see exact data, but something like this could help.
Edit 2024-03-01: I've updated my code a bit
It is mandantory to have space after each unit of measure (or it has to be at the end of text string)
Result:
You have to define units of measure in 2nd step (just separate them with coma)
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("ZY+xCsIwFEV/5VJwc3htreioDgGhKDg4lA6ljWkgTSRNBP/epA62mPHcw819VZWcetMa1TiOlGgFYEcsqddVcu9lgL94T2LiRzm2XjpsyfU4aC1f3I6NfSMQgfiYUd2kXq3pfOvwsGaI9QLOoKC/IgjcpAo9E59/CbZQ04wYMso2OHvNF1HpBUrjwhoUeRwRvfzwtFJ9zwmFNo4ABoWLbbTgyyDgMC8taC7UHw==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Item = _t]),
DefineUnitsOfMeasureHere = "g, kg, ml, l",
Ad_Weight = Table.AddColumn(Source, "Weight", each [ l_units = List.Buffer(List.Transform(Text.Split(DefineUnitsOfMeasureHere, ","), Text.Trim)),
l_removedSpaces = List.ReplaceMatchingItems(List.RemoveItems(Text.Split(Text.Upper([Item]), " "), {""}), List.Zip( { List.Transform(l_units, Text.Upper), l_units } )),
t_replaceUnitsSpaces =
List.Accumulate(
List.Zip( { List.Transform(l_units, each " " & _), l_units } ),
Text.Combine(l_removedSpaces, " "),
(s,c)=> Text.Replace(s, c{0}, c{1} )
),
//Transform to lower text list and remove rows with numbers only
l_weight1 = List.Select(List.Transform(Text.Split(t_replaceUnitsSpaces, " "), Text.Lower), each (try Number.From(_) otherwise _) is text),
//Select only valid weight
l_weight2 =
List.Accumulate(
l_units,
{},
(s,c)=> s & List.Transform( l_weight1, each
if Text.EndsWith(_, c)
and (try Number.From(Text.Replace(_, c, "")) otherwise _) is number
then _ else null )
),
l_weightFinal = List.RemoveNulls(l_weight2),
t_result = if List.Count(l_weight2) > 1 then Text.Combine(l_weight2, " to ") else l_weight2{0}
][t_result], type text)
in
Ad_Weight
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
User | Count |
---|---|
20 | |
13 | |
10 | |
9 | |
7 |
User | Count |
---|---|
39 | |
27 | |
16 | |
15 | |
10 |