The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event! Join us in Stockholm, Sweden from September 24-27, 2024.
2-for-1 sale on June 20 only!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
Hi,
I've 2 tables:
- the first contains a forecast on a month (with a breakdown by customers and products) => imported in PQ as "OLD"
- the second countains the true realization on the same month => imported in PQ as "NEW"
I would like to get a third table giving me the difference New vs Old on all possible breakdown customer/products
Then append the 3 tables and Load to a Pivot Table, for which I can select the type of data I want to display: True, Forcast, Change vs Forecast.
I tried a way that works but my query is long and may crash sometimes:
Duplicate OLD, multiply each value column *-1 (18 columns!), then Append NEW and OLD(-1) and then GroupBy... it works but that's soooo long.
alternatively I unpivot my 18 columns, multiply *-1 and then repivoting... but that's even worst in term of timings.
PowerQuery Only, I work in Excel not in Power BI.
Do you know a better way ?
Thank you.
Hello @LaurentZ
here another approach by using Table.Join, expanding the result, making your calculation and reshaping it.
After that join all 3 created tables
let
New = #table
(
{"Scenario","Customer","Product","Quantity"},
{
{"Actuals","A","1","5"}, {"Actuals","B","1","5"}, {"Actuals","B","2","1"}
}
),
Old = #table
(
{"Scenario","Customer","Product","Quantity"},
{
{"Forecast","A","1","3"}, {"Forecast","B","2","2"}
}
),
Difference =
let
ToJoin = Table.NestedJoin(New, {"Customer", "Product"}, Old, {"Customer", "Product"}, "Old", JoinKind.FullOuter ),
#"Expanded Old" = Table.ExpandTableColumn(ToJoin, "Old", {"Scenario", "Customer", "Product", "Quantity"}, {"Old.Scenario", "Old.Customer", "Old.Product", "Old.Quantity"}),
#"Replaced Value" = Table.ReplaceValue(#"Expanded Old",null,"0",Replacer.ReplaceValue,Table.ColumnNames(#"Expanded Old")),
#"Added Custom" = Table.AddColumn(#"Replaced Value", "Custom", each Number.From([Quantity])-Number.From([Old.Quantity])),
#"Added Custom1" = Table.AddColumn(#"Added Custom", "Custom.1", each "Difference"),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom1",{"Custom.1", "Custom", "Customer", "Product"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"Custom", "Quantity"}, {"Custom.1", "Scenario"}})
in
#"Renamed Columns",
Append = New & Old & Difference,
#"Changed Type" = Table.TransformColumnTypes(Append,{{"Quantity", Int64.Type}})
in
#"Changed Type"
Copy paste this code to the advanced editor in a new blank query to see how the solution works.
If this post helps or solves your problem, please mark it as solution (to help other users find useful content and to acknowledge the work of users that helped you)
Kudoes are nice too
Have fun
Jimmy
Dear @Jimmy801 , @v-alq-msft , @ziying35 , @Anonymous , @Greg_Deckler ,
Thank you all for your answer.
I saw some solutions but all are working with small tables and limited volume of data.
So let me share a table (excel) with totally fake data (figures are generated with RANDBETWEEN function).
OneDrive - Excel File (edit I also put a pbix file, but I need an output in Excel, not in PowerBI)
So there are 2 tabs: NEW and OLD with Customer / Product in common, and up to 12 "Value Drivers", already imported in PowerQuery.
The aim here is to create a table corresponding to NEW - OLD, by keeping the same level of data (customer/product/all value drivers), named DIF
And at the end, append NEW, OLD & DIF and play with the Status to display the data I want (Actuals, Forecast, Difference) - no problem here I know how to do it.
Until now I tried this:
Take OLD, multiply all Value Column * -1, then append NEW and OLD-1 and then GroupBy to only kepp the difference... very long.
I'm using this workaround but I would really like to get a better way, more flexible if you know how to proceed.
Thank you.
Laurent
I had not given code of any kind, I had limited myself to sharing some reflections (of common sense) on the dimensional analysis of the problem.
One of those considerations (*) I have roughly put it into practice on the tables you provided, just to get an idea of the times.
It seems to me that everything you ask for can be done within minutes.
I attach a link to the pbix file.
(*) the one that suggests the Roman dictum "divide et impera!"
empty post
Imke Feldmann (The BIccountant)
If you liked my solution, please give it a thumbs up. And if I did answer your question, please mark this post as a solution. Thanks!
How to integrate M-code into your solution -- How to get your questions answered quickly -- How to provide sample data -- Check out more PBI- learning resources here -- Performance Tipps for M-queries
I got the following result in less than 1':
you have to group by Customer field both table NEW and OLD:
then produce the difference Table:
let
Source = Table.NestedJoin(OLD, {"Customer"}, NEW, {"Customer"}, "NEW", JoinKind.LeftOuter),
#"Expanded NEW" = Table.ExpandTableColumn(Source, "NEW", {"cust"}, {"NEW.cust"}),
#"Added Custom" = Table.AddColumn(#"Expanded NEW", "diff", each diff([NEW.cust],[cust])),
#"Removed Other Columns" = Table.SelectColumns(#"Added Custom",{"diff"}),
#"Renamed Columns" = Table.RenameColumns(#"Removed Other Columns",{{"diff", "cust"}})
in
#"Renamed Columns"
which use this function:
let
diff = (old,new)=>
let
names=Table.ColumnNames(old),
mm=Table.TransformRows(old, (rowOld)=> Record.FromList({"Difference",rowOld[Customer],rowOld[Product]}&List.Transform({3..20}, each let rowNew=new{[Product=rowOld[Product]]}? in Record.FieldValues(rowOld){_}-Record.FieldValues(rowNew){_}),names))
in Table.FromRecords(mm)
in
diff
and finally put all together:
let
Source = Table.Combine({OLD[[cust]], NEW[[cust]], Difference}),
#"Expanded cust" = Table.ExpandTableColumn(Source, "cust", {"Scenario", "Customer", "Product", "Val Driv 1", "Val Driv 2", "Val Driv 3", "Val Driv 4", "Val Driv 5", "Val Driv 6", "Val Driv 7", "Val Driv 8", "Val Driv 9", "Val Driv 10", "Val Driv 11", "Val Driv 12", "Val Driv 13", "Val Driv 14", "Val Driv 15", "Val Driv 16", "Val Driv 17", "Val Driv 18"}, {"Scenario", "Customer", "Product", "Val Driv 1", "Val Driv 2", "Val Driv 3", "Val Driv 4", "Val Driv 5", "Val Driv 6", "Val Driv 7", "Val Driv 8", "Val Driv 9", "Val Driv 10", "Val Driv 11", "Val Driv 12", "Val Driv 13", "Val Driv 14", "Val Driv 15", "Val Driv 16", "Val Driv 17", "Val Driv 18"})
in
#"Expanded cust"
PS
you could find usefull place some if.. the .. else in the function to check case where new and old doesn't match!
Thank you all for your answers and help.
I was off few days, so let me have a look at all your solutions and test them, then I'll tell you which solution suits me.
Hi, @LaurentZ
Based on your description, I created data to reproduce your scenario. The pbix file is attached in the end.
Actual:
Forecast:
You may create a blank query and input the following codes to create a table for difference.
let
Source = Table.AddColumn(Actual, "New Scenario", each "Difference"),
#"Added Custom" = Table.AddColumn(Source, "New Quantity", each let customer=[Customer],product=[Product],
tab=Table.SelectRows(Forecast,each [Customer]=customer and [Product]=product)
in
if Table.RowCount(tab)>0 then
[Quantity]-tab[Quantity]{0}
else [Quantity]),
#"Removed Columns" = Table.RemoveColumns(#"Added Custom",{"Scenario", "Quantity"}),
#"Reordered Columns" = Table.ReorderColumns(#"Removed Columns",{"New Scenario", "Customer", "Product", "New Quantity"}),
#"Renamed Columns" = Table.RenameColumns(#"Reordered Columns",{{"New Scenario", "Scenario"}, {"New Quantity", "Quantity"}})
in
#"Renamed Columns"
Then you may create another blank query for final result as below.
let
#"Appended Query" = Table.Combine({ Actual, Forecast, New})
in
#"Appended Query"
Best Regards
Allan
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Put it in very general terms, the question is schematized by saying that having to "cross" two tables of 10 ^ 5 lines, 10 ^ 10 comparisons will have to be made in the worst case. All these comparisons take a certain amount of time, which can be different if you use different lookup or merge techniques or something else.
but if instead we could divide the problem into, say, 100 under problems of 10 ^ 3 elements, the comparisons would be 10 ^ 3 X 10 ^ 3 X 100. Therefore, we would have a time 100 times shorter, with the same technique, compared to the original time.
In this case, how could one divide the problem into subproblems?
We should have more information to find an effective solution.
For example, how many customers are there?
How many products are there?
####
another way would be to limit the search for the row corresponding to the current new row in the old table to a small subset of the rows instead of scrolling through all 100,000 rows for each search. If you manage to limit the search to a group of 1000 lines, you improve the search time by 100 times.
just to give an idea, but an effective solution strongly depends on the structure of the tables, you can sort the tables in increasing order with respect to the customer column and when you do the search you use Table.PositionOf to find the first (and only) occurrence of the row with the same customer and product. The next search is done starting from the previous position on a reduced table table.skip (tab, positionof); again you can limit the search towards the end of the table by stopping when records are found with customerIDold> customerIDnew.
in this way, for each search, a few hundred comparisons are made instead of 100000. In theory, obtaining a gain of 1000 times!?!?!?!?
@LaurentZ - Not really enough information to go on, please first check if your issue is a common issue listed here: https://community.powerbi.com/t5/Community-Blog/Before-You-Post-Read-This/ba-p/1116882
Also, please see this post regarding How to Get Your Question Answered Quickly: https://community.powerbi.com/t5/Community-Blog/How-to-Get-Your-Question-Answered-Quickly/ba-p/38490
The most important parts are:
1. Sample data as text, use the table tool in the editing bar
2. Expected output from sample data
3. Explanation in words of how to get from 1. to 2.
Ok, then
I've this table in Excel imported in Power Query as NEW:
Scenario | Customer | Product | Quantity |
Actuals | A | 1 | 5 |
Actuals | B | 1 | 5 |
Actuals | B | 2 | 1 |
I've this table in Excel imported in Power Query as OLD:
Scenario | Customer | Product | Quantity |
Forecast | A | 1 | 3 |
Forecast | B | 2 | 2 |
First, I want a creation of a table by Power Query, giving me the difference between Tables NEW and OLD
Scenario | Customer | Product | Quantity |
Difference | A | 1 | 2 |
Difference | B | 1 | 5 |
Difference | B | 2 | -1 |
and finaly append those 3 tables to a final table:
Scenario | Customer | Product | Quantity |
Actuals | A | 1 | 5 |
Actuals | B | 1 | 5 |
Actuals | B | 2 | 1 |
Forecast | A | 1 | 3 |
Forecast | B | 2 | 2 |
Difference | A | 1 | 2 |
Difference | B | 1 | 5 |
Difference | B | 2 | -1 |
How would you proceed ?
Note that this is very simplified, database are both >100,000 lines by tables and I've 18 columns of drivers for values (Quantity, turnover...)
Hi, @LaurentZ
Can you remove sensitive information from your real data and upload it to a cloud drive and share the link here? That way, when people help you solve the problem, they can write code that is more comprehensive in terms of operational etticiency
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the June 2024 Power BI update to learn about new features.
User | Count |
---|---|
34 | |
19 | |
19 | |
16 | |
13 |