Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
I have the following situation that I would like to fix with power query. I have a sales fact table and a department dimension. In the sales table, I have a seller column, where each seller belongs to a default department. But sometimes, during a certain time period, the seller can temporarily switch departments.
So, in the Sales table I would like to create a 'Current Department' column that calculates the current department based on the invoice date:
if sales.[Invoice Date] >= Department.[FromDate] and sales.[Invoice Date] <= Department.[ToDate] then Department.[Department] else sales.[DepartmentDefault]
This is the result
Help? I tried to use a merge with left outer join between those two tables, but I 'duplicate' the rows in sales, to include the different periods for the same seller, so then the values I have in other columns (like sales value, quantity, margin) get duplicated as well and the totals for time periods get messed up.
Solved! Go to Solution.
You can add a column in sales with this formula:
Table.SelectRows(Department,
(Dept) => (Dept[FromDate] < [Invoice Date]) and
(Dept[ToDate] >= [Invoice Date]) and
(Dept[Seller] = [Seller])
)[Department]{0}
This will lookup the desired value from te Department-table (and if it is null, replace it with the value of your current row).
But it might be slow. To speed it up you can "partition" your Sales table by grouping it on Seller, merge with Department on seller and apply the above selection-formula in the partitioned fields: Just omit the "(Dept[Seller] = [Seller])"-part then.
Another alternative is to create an intermediate table that you dont load to the data model where you expand the dates of the time intervals of your Department-table so that every day will have one row. Then you can simply merge that new date-column with your Sales-table.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Invoking @ImkeF
You can add a column in sales with this formula:
Table.SelectRows(Department,
(Dept) => (Dept[FromDate] < [Invoice Date]) and
(Dept[ToDate] >= [Invoice Date]) and
(Dept[Seller] = [Seller])
)[Department]{0}
This will lookup the desired value from te Department-table (and if it is null, replace it with the value of your current row).
But it might be slow. To speed it up you can "partition" your Sales table by grouping it on Seller, merge with Department on seller and apply the above selection-formula in the partitioned fields: Just omit the "(Dept[Seller] = [Seller])"-part then.
Another alternative is to create an intermediate table that you dont load to the data model where you expand the dates of the time intervals of your Department-table so that every day will have one row. Then you can simply merge that new date-column with your Sales-table.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
@ImkeF why is this necessary?
@ImkeF wrote:
You can add a column in sales with this formula:
Table.SelectRows(Department,
(Dept) => (Dept[FromDate] < [Invoice Date]) and
(Dept[ToDate] >= [Invoice Date]) and
(Dept[Seller] = [Seller])
)[Department]{0}
This will lookup the desired value from te Department-table (and if it is null, replace it with the value of your current row).
In other words, what exactly is (Dept) => doing here?
This was super helpful for me, but not 100% sure what is going on. I cannot figure out why I cannot simply use something like
=Table.AddColumn(#"Changed Type", "Other", each Table.SelectRows(OtherTable, each [OtherTableColumn] = 1))
I can do this with other steps in the current query, using something like this:
Table.AddColumn(#"Changed Type", "Other", each Table.SelectRows(#"Changed Type", each [SomeColumn] = 1))
But obviously the table I'm referencing in my first example (OtherTable) isn't the same as a table in another step (#"Changed Type").
Your code works beautifully, but not tracking it...
DAX is for Analysis. Power Query is for Data Modeling
Proud to be a Super User!
MCSA: BI Reporting@ImkeF Thank you for the solution! I have tried it on the sample dataset, and I get an error for the rows where i am expecting it to be replaced with the current row (value in DepartmentDefault).
You'll probably get the errors are probably where there is no match with the other table?
Either replace it with the value from the other column or write a conditional statement that uses that column if the other operatoin failed.
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Thank you very much @ImkeF, that worked! I used try...otherwise to fix it. I implemented it now in my original data model also and as expected, the processing takes much longer now. My data model is in AAS actually, and I connect to it using Direct Query. So I am wondering at which level my performance will be affected. Because we are processsing the AAS every hour right now, so will this code in power query only make the automatic processing longer? Or will it also have an effect on the rendering of the reports in PowerBI for the end user that is using this generated column? Haven't tested it yet, so I'll get to that soon after.
I am also looking into the partitioning option, and it's a bit too advanced for me, so I need to study it a bit to understand how to implement in my case and how that affects my model.
@ImkeF, Since my M skills are not yet at that level so I can implement the partitioning option, I decided to implement the solution at a different level. We actually have full control over the SQL database, so we actually created a separate view of the table where we added the extra column by following the logic I detailed. So we used SQL to fix this, and I guess the performance has improved a lot, since creating that view takes just seconds.
In your experience, if I do have the option to fix such issues at SQL database level, is it preffered to do it there rather than in M?
There is a clear answer to it:
Cleaning at the data source beats cleaning in Power BI ... always 😉
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
and a link to an article I've written about "partitioning": https://wp.me/p6lgsG-yh
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
Check out the September 2024 Power BI update to learn about new features.
Learn from experts, get hands-on experience, and win awesome prizes.
User | Count |
---|---|
116 | |
103 | |
87 | |
35 | |
35 |
User | Count |
---|---|
152 | |
98 | |
81 | |
61 | |
55 |