Get certified in Microsoft Fabric—for free! For a limited time, the Microsoft Fabric Community team will be offering free DP-600 exam vouchers. Prepare now
My dataset contains a column where all products purchased in a single transaction are combined in a single field. These products are separated by a distinct character (the character is "|") so I do not have a problem delimiting them, however, they are not in any particular order, and each customer could potentially purchase any combination of products.
My first guess was to separate them into separate columns. When I separate them into new columns it gives me columns"Product.1" "Product.2" etc. but if I then try to sort by one of the columns it will not identify later instances of the same value in other columns:
Name | Product | Product.1 | Product.2 | Product.3 | Product.4 |
Val Kilmer | a | b | c | d | a | b | c | d |
Simon Bolivar | c | a | b | c | a | b | null |
C.S. Lewis | b | d | | b | d | null | null |
(In the above example, the "A" product in column "Product.1" column is not associated with the "a" in "Product.2")
I am sure there is another approach to this, I am just not seeing it.
This is part of a template where multiple datasets will be imported over time. All of the datasets will have the same Columns but not necessarily the same products, so if at all possible a solution that is based on some kind of logic versus looking up specific values would be ideal. I know this may be a big ask, but I am not able to control how the data is coming to me and any help would be greatly appreciated.
Solved! Go to Solution.
Hi @HaleCarGuy, try this. It should be dynamic.
Dynamic split without product sort result:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCkvMUfDOzMlNLVLSUUpUqFFIAuJkIE4B8lUMDZRidaKVgjNz8/MUnPJzMssSQepA8mC1IDVGEDXOesF6Cj6p5ZnFQEGQISkKNSBpY6B0LAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Product = _t, #"Purchase Price" = _t]),
Ad_NoOfComns = Table.AddColumn(Source, "No of Columns", each List.Count(Text.Split([Product], "|")), Int64.Type),
Columns = [ a = List.Max(Ad_NoOfComns[No of Columns]),
b = List.Transform({1..a}, (x)=> "Product" & Text.From(x))
][b],
StepBack = Source,
SplitColumnDynamic = Table.SplitColumn(StepBack, "Product", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), Columns),
TextTrimDynamic = Table.TransformColumns(SplitColumnDynamic, List.Transform(Columns, (colName)=> { colName, each if Text.Trim(_) = "" then null else Text.Trim(_), type text } ) )
in
TextTrimDynamic
Dynamic split with product sort result
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCkvMUfDOzMlNLVLSUUpUqFFIAuJkIE4B8lUMDZRidaKVgjNz8/MUnPJzMssSQepA8mC1IDVGEDXOesF6Cj6p5ZnFQEGQISkKNSBpY6B0LAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Product = _t, #"Purchase Price" = _t]),
Ad_NoOfComns = Table.AddColumn(Source, "No of Columns", each List.Count(Text.Split([Product], "|")), Int64.Type),
Ad_Products = Table.AddColumn(Ad_NoOfComns, "Products", each List.Select(List.Transform(Text.Split([Product], "|"), Text.Trim), (x)=> x <> "") , type list),
Products = List.Buffer(List.Distinct(List.Combine(Ad_Products[Products]))),
StepBack = Ad_Products,
Ad_ProductColumns = List.Accumulate(
List.Zip({{1..List.Count(Products)}, Products}),
StepBack,
(s,c)=> Table.AddColumn(s, "Product" & Text.From(c{0}), each if List.Contains([Products], c{1}) then c{1} else null, type text)
),
RemovedColumns = Table.RemoveColumns(Ad_ProductColumns,{"No of Columns", "Products"})
in
RemovedColumns
Hi @HaleCarGuy, try this. It should be dynamic.
Dynamic split without product sort result:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCkvMUfDOzMlNLVLSUUpUqFFIAuJkIE4B8lUMDZRidaKVgjNz8/MUnPJzMssSQepA8mC1IDVGEDXOesF6Cj6p5ZnFQEGQISkKNSBpY6B0LAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Product = _t, #"Purchase Price" = _t]),
Ad_NoOfComns = Table.AddColumn(Source, "No of Columns", each List.Count(Text.Split([Product], "|")), Int64.Type),
Columns = [ a = List.Max(Ad_NoOfComns[No of Columns]),
b = List.Transform({1..a}, (x)=> "Product" & Text.From(x))
][b],
StepBack = Source,
SplitColumnDynamic = Table.SplitColumn(StepBack, "Product", Splitter.SplitTextByDelimiter("|", QuoteStyle.Csv), Columns),
TextTrimDynamic = Table.TransformColumns(SplitColumnDynamic, List.Transform(Columns, (colName)=> { colName, each if Text.Trim(_) = "" then null else Text.Trim(_), type text } ) )
in
TextTrimDynamic
Dynamic split with product sort result
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WCkvMUfDOzMlNLVLSUUpUqFFIAuJkIE4B8lUMDZRidaKVgjNz8/MUnPJzMssSQepA8mC1IDVGEDXOesF6Cj6p5ZnFQEGQISkKNSBpY6B0LAA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Name = _t, Product = _t, #"Purchase Price" = _t]),
Ad_NoOfComns = Table.AddColumn(Source, "No of Columns", each List.Count(Text.Split([Product], "|")), Int64.Type),
Ad_Products = Table.AddColumn(Ad_NoOfComns, "Products", each List.Select(List.Transform(Text.Split([Product], "|"), Text.Trim), (x)=> x <> "") , type list),
Products = List.Buffer(List.Distinct(List.Combine(Ad_Products[Products]))),
StepBack = Ad_Products,
Ad_ProductColumns = List.Accumulate(
List.Zip({{1..List.Count(Products)}, Products}),
StepBack,
(s,c)=> Table.AddColumn(s, "Product" & Text.From(c{0}), each if List.Contains([Products], c{1}) then c{1} else null, type text)
),
RemovedColumns = Table.RemoveColumns(Ad_ProductColumns,{"No of Columns", "Products"})
in
RemovedColumns
Thanks for the answer! This might be making progress, but when I do that, it duplicates all of the data in the row. Now as an example, the price is entered in three times throwing off all of the other calculations in my template.
This is a very simplified example below as the data has 30+ columns but it demonstrates the issue:
Name | Product | Purchase Price |
Val Kilmer | a | b | c | d | $540 |
Becomes:
Name | Product | Purchase Price |
Val Kilmer | a | $540 |
Val Kilmer | b | $540 |
Val Kilmer | c | $540 |
Val Kilmer | d | $540 |
Sorry I should have mentioned there was many other columns before.
Use the other option of the splitter - split into rows.
Check out the October 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
69 | |
61 | |
18 | |
16 | |
13 |