Get certified in Microsoft Fabric—for free! For a limited time, the Microsoft Fabric Community team will be offering free DP-600 exam vouchers. Prepare now
Hello,
I have looked around and cannot find an answer that works for me. I have data from a SmartSheet form pulled into Power BI and a few of the columns allow for multiple selections. I would like to separate these out to unique columns for my visualizations. Below is a before and after I am trying to accomplish. I apprecaite any help you can bring.
Existing | New | |||||||||
Project Name | Project Info | Design | Project Name | Project Info.1 | Project Info.2 | Design.1 | Design.2 | Design.3 | Design.4 | |
1 | Quality | Floor Fixture, Architectural | 1 | Quality | null | Floor Fixture | Architectural | null | null | |
2 | Completeness | Design Development | 2 | null | Completeness | null | null | Design Development | null | |
3 | Quality, Completeness | Adherence, Floor Fixture | 3 | Quality | Completeness | Floor Fixture | null | null | Adherence |
Solved! Go to Solution.
No problem. Here's how I think you should be structuring your table for 1) maximum efficiency and 2) ease of reporting:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("XY3BCoMwEER/Jew5F9svkAbvnkMOkg4aWBNJVtG/79JLqbd5MDPPe+rI0rhPnOTSNHAp1QzplL3Cmr7GJQmi0sQUrKeHll5l3RiCjNYUHVqas3E4wGVbkeXbfP6OrblN+veCihxV8WekED4=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Project Name" = _t, #"Project Info" = _t, Design = _t]),
chgTypes = Table.TransformColumnTypes(Source,{{"Project Name", Int64.Type}, {"Project Info", type text}, {"Design", type text}}),
// Relevant steps ----->
unpivOthCols = Table.UnpivotOtherColumns(chgTypes, {"Project Name"}, "improvArea", "Value"),
addImprovItem = Table.AddColumn(unpivOthCols, "improvItem", each Text.Split([Value], ", ")),
expandImprovItem = Table.ExpandListColumn(addImprovItem, "improvItem"),
// <----- Relevant steps
remOthCols = Table.SelectColumns(expandImprovItem,{"Project Name", "improvArea", "improvItem"})
in
remOthCols
This gives us the following fact table output structure:
This fact structure then makes calculations over the whole table very easy, for example:
_noofProjects = DISTINCTCOUNT(factProjects[Project Name])
_noofImprovAreas = DISTINCTCOUNT(factProjects[improvArea])
_noofImprovItems = COUNTROWS(factProjects)
These measures can be used in visuals with any of your fact table columns to provide the insights you're looking for. The way you were asking for your data to be structured previously would have required a separate measure for each [Project Info ~] column, and each [Design ~] column, which would have quickly become difficult to maintain, challenging to get visuals to show what you wanted, and impossible for improvement areas and improvement items to be used in slicers etc.
Above the fact table you would likely have a dimProject dimension table which would have a single row per [Project Name], and would contain info such as Project Manager, Client, Project Start Date/End Date etc. which would be related to this fact table on Dimension[Project Name] ONE : MANY Fact[Project Name], but this is getting slightly beyond scope of this post I think.
Pete
Proud to be a Datanaut!
Hi @kene76 ,
While I appreciate that @tackytechtom has done a great job in creating what you've asked for, I think this is essentially an XY Problem in the making, and you're likely asking for fundamentally the wrong thing.
Power Query should be used to get data into the most efficient structure possible before being sent to the data model to be reformatted into your desired layout. Your request appears to be trying to create the format/layout in Power Query while moving the data a long way away from an efficient structure.
Are you able to describe/provide examples of the types of visuals you want to display based on this dataset so we can then look at the most efficient (and future-proofed) data structure please?
My initial thought would be to break your table out into separate fact and dimension tables, but would be helpful to ensure this would fit the needs of your desired visuals.
Pete
Proud to be a Datanaut!
Thank you for the response and you are most likely correct. I have only been using Power BI for a year and keep getting asked to perform more and more advanced reporting. This data will ultimately report the perfomance of our franchise teams. So we have multiple categories to grade them on and instead of doing a 1 - 5 scoring method, we are giving them a simple pass/fail. If they fail, we call out what specific taks they need to improve - that is what is above. Our project team fills out a form after project completion and they can multi-select everything that needs to be improved which is what the screen shot above shows. I want to separte these out so I can show charts that detail which projects need assistance with Floor Fixtures, for example. The form brings in the data in one cell and comma separates each value. With what I currently know, having each one in a separate column seemed the logical choice.
Thank you for your time.
No problem. Here's how I think you should be structuring your table for 1) maximum efficiency and 2) ease of reporting:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("XY3BCoMwEER/Jew5F9svkAbvnkMOkg4aWBNJVtG/79JLqbd5MDPPe+rI0rhPnOTSNHAp1QzplL3Cmr7GJQmi0sQUrKeHll5l3RiCjNYUHVqas3E4wGVbkeXbfP6OrblN+veCihxV8WekED4=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Project Name" = _t, #"Project Info" = _t, Design = _t]),
chgTypes = Table.TransformColumnTypes(Source,{{"Project Name", Int64.Type}, {"Project Info", type text}, {"Design", type text}}),
// Relevant steps ----->
unpivOthCols = Table.UnpivotOtherColumns(chgTypes, {"Project Name"}, "improvArea", "Value"),
addImprovItem = Table.AddColumn(unpivOthCols, "improvItem", each Text.Split([Value], ", ")),
expandImprovItem = Table.ExpandListColumn(addImprovItem, "improvItem"),
// <----- Relevant steps
remOthCols = Table.SelectColumns(expandImprovItem,{"Project Name", "improvArea", "improvItem"})
in
remOthCols
This gives us the following fact table output structure:
This fact structure then makes calculations over the whole table very easy, for example:
_noofProjects = DISTINCTCOUNT(factProjects[Project Name])
_noofImprovAreas = DISTINCTCOUNT(factProjects[improvArea])
_noofImprovItems = COUNTROWS(factProjects)
These measures can be used in visuals with any of your fact table columns to provide the insights you're looking for. The way you were asking for your data to be structured previously would have required a separate measure for each [Project Info ~] column, and each [Design ~] column, which would have quickly become difficult to maintain, challenging to get visuals to show what you wanted, and impossible for improvement areas and improvement items to be used in slicers etc.
Above the fact table you would likely have a dimProject dimension table which would have a single row per [Project Name], and would contain info such as Project Manager, Client, Project Start Date/End Date etc. which would be related to this fact table on Dimension[Project Name] ONE : MANY Fact[Project Name], but this is getting slightly beyond scope of this post I think.
Pete
Proud to be a Datanaut!
Thank you so much for this detailed answer. I have used a version of this to get the end result I needed.
No problem, glad it's sorted.
Don't forget to give a thumbs-up to any answers that have helped you along the way 👍
Pete
Proud to be a Datanaut!
Hi @kene76 ,
Here a (admittedly pretty hacky) solution. It is not super dynamic meaning in case you get new columns with comma separated lists, you need to add additional steps. However, if your existing columns add new attributes to the comma separated lists, the solution should work and adapt accordingly.
Before:
After:
Here the code in Power Query M that you can paste into the advanced editor (if you do not know, how to exactly do this, please check out this quick walkthrough)
let Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("XY3BCoMwEER/Jew5F9svkAbvnkMOkg4aWBNJVtG/79JLqbd5MDPPe+rI0rhPnOTSNHAp1QzplL3Cmr7GJQmi0sQUrKeHll5l3RiCjNYUHVqas3E4wGVbkeXbfP6OrblN+veCihxV8WekED4=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [#"Project Name" = _t, #"Project Info" = _t, Design = _t]), #"Split Column by Delimiter" = Table.SplitColumn(Source, "Project Info", Splitter.SplitTextByDelimiter(", ", QuoteStyle.Csv), {"Project Info.1", "Project Info.2"}), #"Split Column by Delimiter1" = Table.SplitColumn(#"Split Column by Delimiter", "Design", Splitter.SplitTextByDelimiter(", ", QuoteStyle.Csv), {"Design.1", "Design.2"}), #"Changed Type" = Table.TransformColumnTypes(#"Split Column by Delimiter1",{{"Project Name", Int64.Type}, {"Project Info.1", type text}, {"Project Info.2", type text}, {"Design.1", type text}, {"Design.2", type text}}), #"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"Project Name"}, "Attribute", "Value"), #"Split Column by Delimiter2" = Table.SplitColumn(#"Unpivoted Columns", "Attribute", Splitter.SplitTextByDelimiter(".", QuoteStyle.Csv), {"Attribute.1", "Attribute.2"}), #"Removed Columns" = Table.RemoveColumns(#"Split Column by Delimiter2",{"Attribute.2"}), #"Filtered Rows" = Table.SelectRows(#"Removed Columns", each ([Attribute.1] = "Design")), #"Grouped Rows" = Table.Group(#"Filtered Rows", {"Value"}, {{"Count", each Table.RowCount(_), Int64.Type}}), #"Added Index" = Table.AddIndexColumn(#"Grouped Rows", "Index", 1, 1, Int64.Type), #"Merged Queries" = Table.NestedJoin(#"Filtered Rows", {"Value"}, #"Added Index", {"Value"}, "Added Index", JoinKind.LeftOuter), #"Expanded Added Index" = Table.ExpandTableColumn(#"Merged Queries", "Added Index", {"Index"}, {"Added Index.Index"}), #"Added Custom" = Table.AddColumn(#"Expanded Added Index", "Custom", each [Attribute.1] & "." & Number.ToText( [Added Index.Index])), #"Removed Columns1" = Table.RemoveColumns(#"Added Custom",{"Attribute.1", "Added Index.Index"}), #"Filtered Rows1" = Table.SelectRows(#"Removed Columns", each ([Attribute.1] = "Project Info")), #"Grouped Rows1" = Table.Group(#"Filtered Rows1", {"Value"}, {{"Count", each Table.RowCount(_), Int64.Type}}), #"Added Index1" = Table.AddIndexColumn(#"Grouped Rows1", "Index", 1, 1, Int64.Type), #"Merged Queries1" = Table.NestedJoin(#"Filtered Rows1", {"Value"}, #"Added Index1", {"Value"}, "Added Index1", JoinKind.LeftOuter), #"Expanded Added Index1" = Table.ExpandTableColumn(#"Merged Queries1", "Added Index1", {"Index"}, {"Added Index1.Index"}), #"Added Custom1" = Table.AddColumn(#"Expanded Added Index1", "Custom", each [Attribute.1] & "." & Number.ToText( [Added Index1.Index] )), #"Merged Queries2" = Table.NestedJoin(#"Removed Columns1", {"Project Name"}, #"Added Custom1", {"Project Name"}, "Added Custom1", JoinKind.LeftOuter), #"Expanded Added Custom1" = Table.ExpandTableColumn(#"Merged Queries2", "Added Custom1", {"Value", "Custom"}, {"Added Custom1.Value", "Added Custom1.Custom"}), #"Pivoted Column1" = Table.Pivot(#"Expanded Added Custom1", List.Distinct(#"Expanded Added Custom1"[#"Added Custom1.Custom"]), "Added Custom1.Custom", "Added Custom1.Value"), #"Pivoted Column" = Table.Pivot(#"Pivoted Column1", List.Distinct(#"Pivoted Column1"[Custom]), "Custom", "Value") in #"Pivoted Column"
Once again, there is probably a smarter version to do this (maybe with some advanced functions). Yet, it might get you where you want anyway.
Let me know if this helps 🙂
/Tom
https://www.tackytech.blog/
https://www.instagram.com/tackytechtom/
Did I answer your question❓➡️ Please, mark my post as a solution ✔️ |
Also happily accepting Kudos 🙂 |
Feel free to connect with me on LinkedIn! |
#proudtobeasuperuser |
Check out the October 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
67 | |
61 | |
23 | |
17 | |
12 |