So this is more of a 'Power Query - best practices' scenario. I have two tables: an Accounts table (~600k rows, from SQL via Dataflow) and a Reports table (~200 rows, from SharePoint via Dataflow).
The Reports table is a user-generated table in which users enter ranges and variables to which a report key should be assigned. This table is used as an input for a reporting software program, so there are validations in place to make sure that it is consistent.
I am combining these tables into a MappedAccounts table by pulling a buffered Reports table into each row of the Accounts table, then filtering it down.
This works for what I need - however the refresh is taking almost half an hour! I work with hundreds of millions of rows that use query folding and refresh in a matter of minutes, so it's not great that the report mapping process is taking so long. Am I utilising Power Query in the best way to do this?
I have created example tables - the actual data is not as consistent as this, but it gives you a picture of what I am doing:
//Reports
let
    Source =
        Table.FromRecords(
            {
                [CompanyID = 0, AccountFrom = 100000, AccountTo = 199999, LocationID = "H", ReportKey = 1],
                [CompanyID = 0, AccountFrom = 200000, AccountTo = 299999, LocationID = "C", ReportKey = 2],
                [CompanyID = 1, AccountFrom = 300000, AccountTo = 399999, LocationID = "T", ReportKey = 3],
                [CompanyID = 1, AccountFrom = 400000, AccountTo = 499999, LocationID = "*", ReportKey = 4]
            }
        )
in
    Source
//Accounts
let
    Source =
        Table.FromRecords(
            {
                [Linekey = {1..999}, CompanyID = 0, Account = Number.Round(Number.RandomBetween(100000, 199999), 0), LocationID = "H"],
                [Linekey = {1000..1999}, CompanyID = 0, Account = Number.Round(Number.RandomBetween(200000, 299999), 0), LocationID = "C"],
                [Linekey = {2000..2999}, CompanyID = 1, Account = Number.Round(Number.RandomBetween(300000, 399999), 0), LocationID = "T"],
                [Linekey = {3000..3999}, CompanyID = 1, Account = Number.Round(Number.RandomBetween(400000, 499999), 0), LocationID = "B"]
            }
        ),
    ExpandLineKey = Table.ExpandListColumn(Source, "Linekey")
in
    ExpandLineKey
//MappedAccounts
let
    ReportSource = Table.Buffer(Reports),
    AccountSource = Accounts,
    AddReportKey =
        Table.AddColumn(AccountSource, "ReportKey",
            each
                let
                    //pull through the fields from the parent table
                    companyid = [CompanyID],
                    account = [Account],
                    location = [LocationID],
                    //apply filters to the reportsource
                    filtercompanyid =
                        Table.SelectRows(ReportSource,
                            each [CompanyID] = companyid
                        ),
                    filteraccount =
                        Table.SelectRows(filtercompanyid,
                            each [AccountFrom] <= account and
                                [AccountTo] >= account
                        ),
                    filterlocation =
                        if [LocationID] = "*"
                        then filteraccount
                        else
                            Table.SelectRows(filteraccount,
                                each [LocationID] = location
                            )
                in
                    Table.FirstN(filterlocation, 1)
        ),
    ExpandHeaderKey =
        Table.ExpandTableColumn(AddReportKey,
            "ReportKey",
            {"ReportKey"}, {"ReportKey"}
        )
in
    ExpandHeaderKey
Is there a better way to achieve this type of table merging without the costly overhead of iteratively filtering tables?
Hi @JamesMcEwan ,
Firstly, +2 for the perfectly executed provision of example data.
Now, I usually advise to do EVERYTHING in Power Query, and to NEVER use DAX calculated columns. However, conditional merges are the one scenario where I break from this. There are a number of ways to do it in Power Query, but none that I've found so far comes even close to matching the performance of DAX for this.
Send both your original tables to your data model, then, on your Accounts table, create a calculated column like this:
reportKey =
CALCULATE(
    VAR __compRow = VALUES(Accounts[CompanyID])
    VAR __locnRow = VALUES(Accounts[LocationID])
    VAR __acctRow = VALUES(Accounts[Account])
    RETURN
        MAXX(
            FILTER(
                Reports,
                Reports[CompanyID] = __compRow
                    && Reports[LocationID] = __locnRow
                    && Reports[AccountFrom] <= __acctRow
                    && Reports[AccountTo] >= __acctRow
            ),
            Reports[ReportKey]
        )
)
You'll want to tweak a bit to handle your "*" location escape, but this should give you a solid starting point.
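For example (untested, and assuming a "*" in Reports[LocationID] is meant to match any location), the wildcard handling might look something like this:
reportKey =
CALCULATE(
    VAR __compRow = VALUES(Accounts[CompanyID])
    VAR __locnRow = VALUES(Accounts[LocationID])
    VAR __acctRow = VALUES(Accounts[Account])
    RETURN
        MAXX(
            FILTER(
                Reports,
                Reports[CompanyID] = __compRow
                    && ( Reports[LocationID] = __locnRow
                        || Reports[LocationID] = "*" )    // assumption: "*" report rows match any location
                    && Reports[AccountFrom] <= __acctRow
                    && Reports[AccountTo] >= __acctRow
            ),
            Reports[ReportKey]
        )
)
If an account could match both a specific-location row and a "*" row, MAXX will just take the higher ReportKey, so you may want to adjust the tie-breaking.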
Give it a go and let me know how you get on.
Pete
Proud to be a Datanaut!
Hi @BA_Pete,
Thanks for the input - I had thought of doing it via DAX but wanted to keep all transformations in Power Query. It's good to hear confirmation that this might be an exception to the ETL best practice.
The code you provided will work for what I need. I will let you know how I get on.
Thanks again.
Glad it helps! You get out what you put in! 😁
No problem.
Note that this is just my personal preference, certainly not an official exception to best practice.
This is still doable in PQ if you really want to do it - just let me know - but performance will likely be significantly worse.
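For reference, a rough, untested sketch of one common PQ pattern - an ordinary merge on CompanyID followed by a range filter - might look like this. Note that it treats "*" report rows as match-any, and, unlike your Table.FirstN step, it drops accounts with no matching report row and duplicates accounts that match more than one:
//MappedAccounts (possible PQ alternative - untested sketch)
let
    // Ordinary merge on the equality key first, so the range test only runs against matching CompanyID rows
    Merged = Table.NestedJoin(Accounts, {"CompanyID"}, Reports, {"CompanyID"}, "ReportRows", JoinKind.Inner),
    Expanded = Table.ExpandTableColumn(Merged, "ReportRows",
        {"AccountFrom", "AccountTo", "LocationID", "ReportKey"},
        {"AccountFrom", "AccountTo", "Report.LocationID", "ReportKey"}),
    // Keep rows where the account falls in the range and the location matches (treating "*" report rows as match-any)
    Filtered = Table.SelectRows(Expanded, each
        [Account] >= [AccountFrom] and [Account] <= [AccountTo]
        and ([Report.LocationID] = "*" or [Report.LocationID] = [LocationID])),
    Cleaned = Table.RemoveColumns(Filtered, {"AccountFrom", "AccountTo", "Report.LocationID"})
in
    Cleaned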
Pete
Proud to be a Datanaut!