Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more
Dear experts,
I hope you can help me with the following: I have a 4GB financial dataset where we have all data (sales, customers etc) by row, with the years under each other. So it means that the long list of data is simply showing transactions that can be in previous year or current year.
What I need to end up with, is a dataset that puts the years next to each other, meaning that for 1 customer and product we see the sales in previous year and - if applicable - the sales in current year.
The steps I have taken are pretty basic:
1. First I break the CSV file by tab into columns and do some minor changes, not relevant
2. Then I group by year, month, customer code, SKU, and 3/4 other variables with summing sales, margin and costs
3. Next I unpivot the Sales, margin and costs
4. I create a string between the attributed column and the year column
5. Finally I pivot the string column to put previous year and current year next to each other
Now the good part is that the code does what it is supposed to do... The bad part is that it is extremely slow. It can take up to 2 hours which is too long for a 4gb file, and also the operations are basic I would say.
Please tell me if you see any opportunity to improve this code or approach it differently.
THANK YOU!
let
Source = Csv.Document(File.Contents("C:\Users\busserb\OneDrive - BM Europe BV\SharePoint Pricing\7. BMS\BMS Sales Data Dump\20220620_SalesData_BMS_2021-2022ytd.txt"),[Delimiter=","]),
#"Split Column by Delimiter" = Table.SplitColumn(Source, "Column1", Splitter.SplitTextByDelimiter("#(tab)", QuoteStyle.Csv), {"Column1.1", "Column1.2", "Column1.3", "Column1.4", "Column1.5", "Column1.6", "Column1.7", "Column1.8", "Column1.9", "Column1.10", "Column1.11", "Column1.12", "Column1.13", "Column1.14", "Column1.15", "Column1.16", "Column1.17", "Column1.18", "Column1.19", "Column1.20", "Column1.21", "Column1.22", "Column1.23", "Column1.24", "Column1.25", "Column1.26", "Column1.27", "Column1.28", "Column1.29", "Column1.30", "Column1.31", "Column1.32", "Column1.33", "Column1.34", "Column1.35", "Column1.36", "Column1.37", "Column1.38", "Column1.39"}),
#"Promoted Headers" = Table.PromoteHeaders(#"Split Column by Delimiter", [PromoteAllScalars=true]),
#"Removed Columns" = Table.RemoveColumns(#"Promoted Headers",{"Customer Name", "Customer assigned segment level 1", "Customer assigned segment level 2", "Customer assigned segment level 3", "Customer Role", "Customer Size", "External vs Intercompany customer", "Product description", "Product family level 1", "Product family level 2", "Product family level 3", "Item Group", "IsIFRS15", "Sales_IFRS15", "Sales_PostIFRS15", "Price Calculation Category Origin", "Price Calculation", "Price Calculation Category"}),
#"Removed Columns1" = Table.RemoveColumns(#"Removed Columns",{"OrderNo_Sales", "BU/Region", "Branch", "Sales Manager"}),
#"Reordered Columns1" = Table.ReorderColumns(#"Removed Columns1",{"Month", "Year", "Customer Code", "Product code (SKU)", "One Time Article flag", "Sales channel", "Price Calculation Origin", "isServiceItem", "Sales Agent", "Manual overwrite?", "Base Unit of Measure", "Currency", "Qty sold", "Invoice Value", "Cost Value", "Margin Value", "Theoritical Revenue"}),
#"Changed Type" = Table.TransformColumnTypes(#"Reordered Columns1",{{"Qty sold", type number}, {"Invoice Value", type number}, {"Cost Value", type number}, {"Margin Value", type number}, {"Theoritical Revenue", type number}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Month", "Year", "Customer Code", "Product code (SKU)", "One Time Article flag", "Sales channel", "Price Calculation Origin", "isServiceItem", "Sales Agent", "Manual overwrite?", "Base Unit of Measure", "Currency"}, {{"Volumes", each List.Sum([Qty sold]), type nullable number}, {"Net Sales", each List.Sum([Invoice Value]), type nullable number}, {"Costs", each List.Sum([Cost Value]), type nullable number}, {"Margin", each List.Sum([Margin Value]), type nullable number}, {"Theoretical Sales", each List.Sum([Theoritical Revenue]), type nullable number}}),
#"Unpivoted Columns" = Table.UnpivotOtherColumns(#"Grouped Rows", {"Month", "Year", "Customer Code", "Product code (SKU)", "One Time Article flag", "Sales channel", "Price Calculation Origin", "isServiceItem", "Sales Agent", "Manual overwrite?", "Base Unit of Measure", "Currency"}, "Attribute", "Value"),
#"Merged Columns" = Table.CombineColumns(#"Unpivoted Columns",{"Attribute", "Year"},Combiner.CombineTextByDelimiter(" | ", QuoteStyle.None),"Headers"),
#"Pivoted Column" = Table.Pivot(#"Merged Columns", List.Distinct(#"Merged Columns"[Headers]), "Headers", "Value")
in
#"Pivoted Column"
Handling this type of transformation in M with a source file of that size is going to be slow for sure, and it will be even slower if it's served from SharePoint. There's a couple things you could do to marginally speed things up, but that's not the crux of your issue. All the steps starting from Grouped Rows would be better done in DAX. Shape your data with M so that it's in tabular, granular format, then compute the aggregates with DAX measures (avoid calculated columns though).
Just 1 remark: I built the model following the STAR method to avoid unnecessary columns
Check out the November 2025 Power BI update to learn about new features.
Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!