Power BI is turning 10, and we’re marking the occasion with a special community challenge. Use your creativity to tell a story, uncover trends, or highlight something unexpected.
Get startedJoin us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.
Hi there everyone,
I am trying to process a set of data that contains the gross weight of shipments, and its driving me nuts, due to how bad and inconsistent the data is. I get values such as below:
1,234.50 kg
1.234,5Kg
12.o3 Kg
approx. 1,5 kg
0, 12 kg
1.234
1.23
1.234 + 4,56
I manage to trim all the spaces and text just fine. What I can't fix is how to deal with the different decimal and thousands separators. The only consistent way to tell from the data is whether there are 2 or 3 digits after the separator.
I'd appreciate any ideas on how to deal with this.
Thanks!
Solved! Go to Solution.
This may work, if your differentiation is that a token followed by two (or one) digits is the decimal token:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtQxMjbRMzVQyE5XitWJVgIThnpAQR1Tb2QhI718YwVkkcSCgqL8Cj0FQx1TFM0GOgqGRpjGQU2BMYBCUIYhQo2CtgLQWjOl2FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Removed Blank Rows" = Table.SelectRows(#"Changed Type",
each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
//Clean the data
#"Remove Leading/Trailing text" = Table.TransformColumns(#"Removed Blank Rows",
{"Column1", each Text.Trim(_,{" ",".","a".."z","A".."Z"}), type text}),
#"'o' to '0'"=Table.ReplaceValue(#"Remove Leading/Trailing text","o","0",Replacer.ReplaceText,{"Column1"}),
#"Remove Spaces" =Table.ReplaceValue(#"'o' to '0'"," ","",Replacer.ReplaceText,{"Column1"}),
//Convert the number based on algorithm that
// if there are one or two digits after the `dot` or `comma` then that token = decimal token
#"Add Converted" = Table.AddColumn(#"Remove Spaces", "Converted Number", each
let
#"Split Plus" = Text.Split([Column1],"+"),
#"Split Dot" = List.Combine(List.Transform(#"Split Plus", each Text.Split(_,"."))),
#"Split Comma" = List.Combine(List.Transform(#"Split Plus", each Text.Split(_,","))),
#"Decimal" = if (Text.Length(#"Split Dot"{1}?) <= 2)??false then "."
else if ((Text.Length(#"Split Comma"{1}?)??0) <= 2)??false then ","
else null,
#"Convert" = if #"Decimal" = "."
then List.Transform(#"Split Plus", each Number.From(_, "en-US"))
else List.Transform(#"Split Plus", each Number.FromText(_,"en-DE"))
in
List.Sum(#"Convert"), type number)
in
#"Add Converted"
See if the transform is what you expect
Original
Transformed
Hi @Neg, different approach here. Jut keep in mind that it works with sample data - but in real world there can be other cases...
Result
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtQxMjbRMzVQyE5XitUB8vWAfB1TbyjPSC/fWAHKSSwoKMqv0FMw1DGFqTbQUTA0QtEKZyGEFLQVgCaaKcXGAgA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
Ad_CorrectValue = Table.AddColumn(Source, "CorrectValue", each
[ chars = Text.ToList("+-,.oO"),
a = Text.Select([Column1], {"0".."9"} & chars),
b = Text.ToList(Text.Trim(a, chars)),
c = List.ReplaceMatchingItems(b, {{"o", "0"}, {"O", "0"}}),
d = if List.Contains(c, {"+", "-"}) then c else if List.PositionOf(c, ",") < List.PositionOf(c, ".") then List.RemoveMatchingItems(c, {","}) else
[ d1 = List.ReplaceMatchingItems(c, {{",", "."}}),
d2 = if List.Count(List.Select(d1, (x)=> x = ".")) > 1 then List.Transform(List.Difference(List.Positions(d1), {List.PositionOf(d1, ".", Occurrence.First)}), (y)=> d1{y}) else d1
][d2],
e = Text.Combine(d),
f = Expression.Evaluate(e)
][f], type number)
in
Ad_CorrectValue
Thanks, I'll try this solution next time!
This may work, if your differentiation is that a token followed by two (or one) digits is the decimal token:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMtQxMjbRMzVQyE5XitWJVgIThnpAQR1Tb2QhI718YwVkkcSCgqL8Cj0FQx1TFM0GOgqGRpjGQU2BMYBCUIYhQo2CtgLQWjOl2FgA", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Column1", type text}}),
#"Removed Blank Rows" = Table.SelectRows(#"Changed Type",
each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null}))),
//Clean the data
#"Remove Leading/Trailing text" = Table.TransformColumns(#"Removed Blank Rows",
{"Column1", each Text.Trim(_,{" ",".","a".."z","A".."Z"}), type text}),
#"'o' to '0'"=Table.ReplaceValue(#"Remove Leading/Trailing text","o","0",Replacer.ReplaceText,{"Column1"}),
#"Remove Spaces" =Table.ReplaceValue(#"'o' to '0'"," ","",Replacer.ReplaceText,{"Column1"}),
//Convert the number based on algorithm that
// if there are one or two digits after the `dot` or `comma` then that token = decimal token
#"Add Converted" = Table.AddColumn(#"Remove Spaces", "Converted Number", each
let
#"Split Plus" = Text.Split([Column1],"+"),
#"Split Dot" = List.Combine(List.Transform(#"Split Plus", each Text.Split(_,"."))),
#"Split Comma" = List.Combine(List.Transform(#"Split Plus", each Text.Split(_,","))),
#"Decimal" = if (Text.Length(#"Split Dot"{1}?) <= 2)??false then "."
else if ((Text.Length(#"Split Comma"{1}?)??0) <= 2)??false then ","
else null,
#"Convert" = if #"Decimal" = "."
then List.Transform(#"Split Plus", each Number.From(_, "en-US"))
else List.Transform(#"Split Plus", each Number.FromText(_,"en-DE"))
in
List.Sum(#"Convert"), type number)
in
#"Add Converted"
See if the transform is what you expect
Original
Transformed
Wow! That was fast and helped enormously. Thanks!
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.
Check out the June 2025 Power BI update to learn about new features.
User | Count |
---|---|
9 | |
9 | |
7 | |
6 | |
6 |