Join us for an expert-led overview of the tools and concepts you'll need to pass exam PL-300. The first session starts on June 11th. See you there!
Get registeredPower BI is turning 10! Let’s celebrate together with dataviz contests, interactive sessions, and giveaways. Register now.
Hello all - I have 3 tasks I have been able to accomplish on a small sample set of data, but with my full dataset the refresh never finishes due to what I am assuming is a performance issue on the larger table. I am looking for an alternative method to accomplish at least 2 of the 3 tasks (and the third if possible).
Starting data:
Table named "Sales", relevant columns (there may be duplicates in this table, as there are also sales by month in other columns):
Code | Dashboard.1 | Dashboard.2 | Dashboard.3 |
XXXX | CAR ACCOUNT | CAR ACCOUNT | null |
YYYY | CAR ACCOUNT | CAR ACCOUNT | null |
ZZZZ | BOAT ACCOUNT | null | BOAT ACCOUNT |
AAAA | null | PLANE ACCOUNT | null |
BBBB | null | PLANE ACCOUNT | null |
CCCC | TRAIN ACCOUNT | null | TRAIN ACCOUNT |
DDDD | TRAIN ACCOUNT | null | TRAIN ACCOUNT |
Table named "SpecificLists", relevant columns (there are no duplicates in this table):
Code | Updated Name | Dashboard |
YYYY | CAR ACCOUNT (BY LIST) | Dashboard.2 |
CCCC | TRAIN ACCOUNT (BY LIST) | Dashboard.3 |
Table named "Columns" (no duplicates in this table):
Dashboard |
Dashboard.1 |
Dashboard.2 |
Dashboard.3 |
Table named "Dashboard Assignments", relevant columns (no duplicates in this table):
Account_Dashboard | Name |
CAR ACCOUNT|Dashboard.1 | CAR ACCOUNT |
BOAT ACCOUNT|Dashboard.1 | BOAT ACCOUNT |
TRAIN ACCOUNT|Dashboard.1 | TRAIN ACCOUNT |
CAR ACCOUNT (BY LIST)|Dashboard.2 | CAR ACCOUNT (BY LIST) |
PLANE ACCOUNT|Dashboard.2 | PLACE ACCOUNT |
BOAT ACCOUNT|Dashboard.3 | BOAT ACCOUNT |
TRAIN ACCOUNT (BY LIST)|Dashboard.3 | TRAIN ACCOUNT (BY LIST) |
Desired Results (if these can be combined into one transformation, even better!):
Important Note: the number of Dashboards is Dynamic, so any reference to Column Names must be Dynamic where "Dashboard.#" is concerned.
Task 1: In the Sales table, find the intersection of Sales[Code]=SpecificLists[Code] and the corresponding column in Sales that matches SpecificLists[Dashboard], and replace the field value in that corresponding column with the [Code].
Example results:
Code | Dashboard.1 | Dashboard.2 | Dashboard.3 |
XXXX | CAR ACCOUNT | CAR ACCOUNT | null |
YYYY | CAR ACCOUNT | YYYY | null |
ZZZZ | BOAT ACCOUNT | null | BOAT ACCOUNT |
AAAA | null | PLANE ACCOUNT | null |
BBBB | null | PLANE ACCOUNT | null |
CCCC | TRAIN ACCOUNT | null | CCCC |
DDDD | TRAIN ACCOUNT | null | TRAIN ACCOUNT |
I was able to accomplish this on a small data set by merging Sales with SpecificLists on the Code column, and expanding the Dashboard column, then using List.Generate:
ColNames = #"Columns"[Dashboard],
NumColNames = List.Count(ColNames),
#"Merged Queries" = Table.NestedJoin(
#"Sales",
{"Code"},
SpecificLists,
{"Code"},
"SpecificLists",
JoinKind.LeftOuter
),
#"Expanded Dashboard" = Table.ExpandTableColumn(
#"Merged Queries",
"SpecificLists",
{"Dashboard"},
{"Dashboard"}
),
ReplaceWithCodes = List.Generate(
() => [i = - 1, Table = #table({}, {})],
each [i] < NumColNames,
each [
i = [i] + 1,
Table = Table.ReplaceValue(
(if i <= 0 then #"Expanded Dashboard" else [Table]),
each Record.Field(_, ColNames{i}),
each if [Dashboard] = ColNames{i} then [Code] else Record.Field(_, ColNames{i}),
Replacer.ReplaceText,
{ColNames{i}}
)
]
),
#"Expand Last Iteration" = List.LastN(ReplaceWithCodes, 1){0}[Table],
#"Removed Columns" = Table.RemoveColumns(#"Expand Last Iteration", {"Dashboard"})
However, on the full dataset (300,000+ rows, 48 columns - 8 of which are "Dashboard.#" columns) this code never finished running. I am assuming this is a performance issue with the size of the dataset.
Task 2: I am able to accomplish this task, but if this can be combined with Task 1, I can reduce the number of steps in the code and hopefully shorten the refresh time. Replace the Code that was placed in each Dashboard.# column in the first Task with the corresponding SpecificLists[Updated Name].
Current solution that also works on the large data set:
#"ReplaceCodeValues" = List.Accumulate(
List.Numbers(0, Table.RowCount(SpecificLists)),
#"Removed Columns",
(state, current) =>
Table.ReplaceValue(
state,
SpecificLists[Code]{current},
SpecificLists[Updated Name]{current},
Replacer.ReplaceValue,
ColNames
)
)
Result:
Code | Dashboard.1 | Dashboard.2 | Dashboard.3 |
XXXX | CAR ACCOUNT | CAR ACCOUNT | null |
YYYY | CAR ACCOUNT | CAR ACCOUNT (BY LIST) | null |
ZZZZ | BOAT ACCOUNT | null | BOAT ACCOUNT |
AAAA | null | PLANE ACCOUNT | null |
BBBB | null | PLANE ACCOUNT | null |
CCCC | TRAIN ACCOUNT | null | TRAIN ACCOUNT (BY LIST) |
DDDD | TRAIN ACCOUNT | null | TRAIN ACCOUNT |
Task 3: Remove all values in each Dashboard.# column where the Record Value &"|"& Dashboard.# (e.g., "CAR ACCOUNT|Dashboard.2", etc.) is not found in the "Dashboard Assignments" table.
Example Results:
Code | Dashboard.1 | Dashboard.2 | Dashboard.3 |
XXXX | CAR ACCOUNT | null | null |
YYYY | CAR ACCOUNT | CAR ACCOUNT (BY LIST) | null |
ZZZZ | BOAT ACCOUNT | null | BOAT ACCOUNT |
AAAA | null | PLANE ACCOUNT | null |
BBBB | null | PLANE ACCOUNT | null |
CCCC | TRAIN ACCOUNT | null | TRAIN ACCOUNT (BY LIST) |
DDDD | TRAIN ACCOUNT | null | null |
Again, I was able to accomplish this on a small data set by using List.Generate:
RemoveNoMatch = List.Generate(
() => [i = - 1, Table = #table({}, {})],
each [i] < NumColNames,
each [
i = [i] + 1,
Table = Table.ReplaceValue(
(if i <= 0 then #"ReplaceCodeValues" else [Table]),
each Record.Field(_, ColNames{i}),
each try
(
(
let
AccountDashboardKey = Text.Combine({Record.Field(_, ColNames{i}), ColNames{i}}, "|")
in
Table.SelectRows(
#"Dashboard Assignments",
each [Account_Dashboard] = AccountDashboardKey
)
){0}[#"Name"]
)
otherwise
null,
Replacer.ReplaceValue,
{ColNames{i}}
)
]
),
#"Expand Last Iteration2" = List.LastN(RemoveNoMatch, 1){0}[Table]
But again, on the full dataset (300,000+ rows, 48 columns - 8 of which are "Dashboard.#" columns) this code never finished running.
I need to accomplish the above without adding any additional columns for the Dashboards (i.e., no "Dashboard.2 Updated" or anything like that). Task 2 is currently solved but I am hoping it can be worked into Task 1 if there is a solution for Task 1.
Also, I am sure my existing code is not as elegant as it could be, so any improvements there are much appreciated as well!
Solved! Go to Solution.
let
Sales = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
SpecificLists = Table.Buffer(Table.Group(Excel.CurrentWorkbook(){[Name="SpecificLists"]}[Content],"Code",{"n",each _})),
Columns = Table.Buffer(Excel.CurrentWorkbook(){[Name="Columns"]}[Content]),
DashboardAssignments = Table.Buffer(Excel.CurrentWorkbook(){[Name="DashboardAssignments"]}[Content]),
Custom1 = Table.FromRecords(
Table.TransformRows(
Sales,
each let a=SpecificLists{[Code=[Code]]}?[n]?,
b=Record.ToTable(_)
in Record.FromTable(
Table.ReplaceValue(
b,
each [Name],
each List.Contains(Columns[Dashboard],[Name]),
(x,y,z)=>if z
then DashboardAssignments{[Account_Dashboard=(a{[Dashboard=y]}?[Updated Name]? ??x)&"|"&y]}?[Name]?
else x,
{"Value"}
)
)
)
)
in
Custom1
let
Sales = Excel.CurrentWorkbook(){[Name="Sales"]}[Content],
SpecificLists = Table.Buffer(Table.Group(Excel.CurrentWorkbook(){[Name="SpecificLists"]}[Content],"Code",{"n",each _})),
Columns = Table.Buffer(Excel.CurrentWorkbook(){[Name="Columns"]}[Content]),
DashboardAssignments = Table.Buffer(Excel.CurrentWorkbook(){[Name="DashboardAssignments"]}[Content]),
Custom1 = Table.FromRecords(
Table.TransformRows(
Sales,
each let a=SpecificLists{[Code=[Code]]}?[n]?,
b=Record.ToTable(_)
in Record.FromTable(
Table.ReplaceValue(
b,
each [Name],
each List.Contains(Columns[Dashboard],[Name]),
(x,y,z)=>if z
then DashboardAssignments{[Account_Dashboard=(a{[Dashboard=y]}?[Updated Name]? ??x)&"|"&y]}?[Name]?
else x,
{"Value"}
)
)
)
)
in
Custom1
So it took me a while, but I think I understand what each piece is doing (although it needs more time to really sink in) and all I can say is you've opened up a whole new world of M language for me, haha. Thanks again!
This worked like a dream! Refresh is manageable on the large dataset at about 1.5 hours, and so far the results look correct. Thank you for the quick response - now for me to dissect your answer to understand everything in your code.