Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
newtoPowerBI
New Member

Query Performance Slow 2,500 KB File Reading as 6GB

Hello,

 

I need help figuring out why a small .csv file is coming into Power Query and reading as 6GB. I have a few merged queries in my workbook so I am wondering if that is whats causuing the slow down. Can anyone help?

 

Below is the source query

 

let
Source = Table.Buffer(Csv.Document(File.Contents("C:\Users\eddy.gramajo\Desktop\Invoice Upload\Billing Report (Post-Upload).csv"),[Delimiter=",", Columns=22, Encoding=1252, QuoteStyle=QuoteStyle.None])),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"OPPORTUNITY_ID", type text}, {"Arrangement Id", type text}, {"Customer Name", type text}, {"Bill To Customer Name", type text}, {"Item Number", type text}, {"Item Description", type text}, {"Sales Order", Int64.Type}, {"Sales Order Line", Int64.Type}, {"Sales Order Id", Int64.Type}, {"Sales Order Line Id", Int64.Type}, {"Qty Ordered", type text}, {"So Book Date", type text}, {"Qty Shipped", type text}, {"Invoice Number", type text}, {"Invoice Line Id", Int64.Type}, {"Invoice Line", Int64.Type}, {"Qty Invoiced", type number}, {"Invoice Date", type date}, {"Ext Sell Price", type number}, {"Rec Amt", type number}, {"Def Amt", type number}, {"Cancelled Flag", type text}})
in
#"Changed Type"

 

 

Here is an example of a merged query

 

let
Source = Table.Buffer(Csv.Document(File.Contents("C:\Users\eddy.gramajo\Desktop\Invoice Upload\INV_API_01Nov2017_30Nov2017_3cded995-0889-4798-97cb-86c7ff08e2d3-2017.12.05 (1).csv"),[Delimiter=",", Columns=73, Encoding=1252, QuoteStyle=QuoteStyle.None])),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"TRAN_TYPE", type text}, {"OPPORTUNITY_ID", type text}, {"OPPORTUNITY_NAME", type text}, {"PRODUCT_LINE", type text}, {"PRODUCT_FAMILY", type text}, {"PRODUCT_CLASS", type text}, {"PRODUCT_CATEGORY", type text}, {"BILL_TO_CUSTOMER_NUMBER", Int64.Type}, {"BILL_TO_CUSTOMER_NAME", type text}, {"BILL_TO_STATE", type text}, {"BILL_TO_COUNTRY", type text}, {"SHIP_TO_CUSTOMER_NUMBER", Int64.Type}, {"SHIP_TO_CUSTOMER_NAME", type text}, {"SHIP_TO_STATE_OR_PROVINCE", type text}, {"SHIP_TO_COUNTRY", type text}, {"PO_NUMBER", type text}, {"SO_BOOK_DATE", type date}, {"SALES_ORDER", Int64.Type}, {"SALES_ORDER_ID", Int64.Type}, {"SALES_ORDER_LINE_ID", Int64.Type}, {"SALES_ORDER_LINE", Int64.Type}, {"ORDER_TYPE", type text}, {"INVOICE_DATE", type date}, {"INVOICE_NUMBER", type text}, {"INVOICE_ID", Int64.Type}, {"INVOICE_LINE", Int64.Type}, {"INVOICE_LINE_ID", Int64.Type}, {"ORIG_INV_LINE_ID", type text}, {"ITEM_ID", Int64.Type}, {"ITEM_NUMBER", type text}, {"ITEM_DESCRIPTION", type text}, {"QUANTITY_ORDERED", type text}, {"QUANTITY_INVOICED", type number}, {"QUANTITY_TIER", type text}, {"UNIT_OF_MEASURE", type text}, {"UNIT_LIST_PRICE", type number}, {"UNIT_SELL_PRICE", type number}, {"EXT_LIST_PRICE", type number}, {"EXT_SELL_PRICE", type number}, {"RULE_START_DATE", type date}, {"RULE_END_DATE", type date}, {"DURATION_IN_MONTHS", Int64.Type}, {"TRANS_CURR_CODE", type text}, {"BASE_CURR_CODE", type text}, {"EX_RATE", Int64.Type}, {"RCURR_EX_RATE", Int64.Type}, {"DEF_ACCTG_SEG1", Int64.Type}, {"REV_ACCTG_SEG1", Int64.Type}, {"DEF_ACCTG_SEG2", type text}, {"REV_ACCTG_SEG2", type text}, {"DEFERRED_REVENUE_FLAG", type text}, {"UNBILLED_ACCOUNTING_FLAG", type text}, {"UNDELIVERED_FLAG", type text}, {"ELIGIBLE_FOR_FV", type text}, {"ELIGIBLE_FOR_CV", type text}, {"STANDALONE_FLAG", type text}, {"FLAG_97_2", type text}, {"PCS_FLAG", type text}, {"RETURN_FLAG", type text}, {"CANCELLED_FLAG", type text}, {"ORG_ID", Int64.Type}, {"BUSINESS_UNIT", type text}, {"REGION", type text}, {"DEPARTMENT", type text}, {"SALES_TEAM", type text}, {"DISTRIBUTION_TYPE", type text}, {"SOB_ID", Int64.Type}, {"CUSTOMER_CLASS", type text}, {"INDUSTRY", type text}, {"METRO", type text}, {"GLOBAL_EMP_COUNT", type text}, {"OPEN_JOB_COUNT", type text}, {"SEC_ATTR_VALUE", Int64.Type}}),
#"Removed Columns" = Table.RemoveColumns(#"Changed Type",{"QUANTITY_TIER", "GLOBAL_EMP_COUNT", "OPEN_JOB_COUNT"}),
#"Merged Queries" = Table.NestedJoin(#"Removed Columns",{"SALES_ORDER_LINE_ID"},#"All Arrangement Report",{"Sales Order Line Id"},"All Arrangement Report",JoinKind.LeftOuter),
#"Expanded All Arrangement Report" = Table.ExpandTableColumn(#"Merged Queries", "All Arrangement Report", {"Ext Sell Price"}, {"All Arrangement Report.Ext Sell Price"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded All Arrangement Report", each ([All Arrangement Report.Ext Sell Price] <> null)),
#"Merged Queries1" = Table.NestedJoin(#"Filtered Rows",{"INVOICE_LINE_ID"},#"Billing Report (Pre-Upload)",{"Invoice Line Id"},"Billing Report (Pre-Upload)",JoinKind.LeftOuter),
#"Expanded Billing Report (Pre-Upload)" = Table.ExpandTableColumn(#"Merged Queries1", "Billing Report (Pre-Upload)", {"Invoice Line Id"}, {"Billing Report (Pre-Upload).Invoice Line Id"}),
#"Filtered Rows1" = Table.SelectRows(#"Expanded Billing Report (Pre-Upload)", each ([#"Billing Report (Pre-Upload).Invoice Line Id"] = null)),
#"Removed Columns1" = Table.RemoveColumns(#"Filtered Rows1",{"Billing Report (Pre-Upload).Invoice Line Id"}),
#"Filtered Rows2" = Table.SelectRows(#"Removed Columns1", each [DEPARTMENT] = "Sales and Marketing : B2B Sales and Marketing : 110 Sales - Employer"),
#"Renamed Columns" = Table.RenameColumns(#"Filtered Rows2",{{"All Arrangement Report.Ext Sell Price", "SO Amount"}}),
#"Merged Queries2" = Table.NestedJoin(#"Renamed Columns",{"SALES_ORDER_LINE_ID"},#"Billing Report (Pre-Upload)",{"Sales Order Line Id"},"Billing Report (Pre-Upload)",JoinKind.LeftOuter),
#"Aggregated Billing Report (Pre-Upload)" = Table.AggregateTableColumn(#"Merged Queries2", "Billing Report (Pre-Upload)", {{"Ext Sell Price", List.Sum, "Sum of Billing Report (Pre-Upload).Ext Sell Price"}}),
#"Replaced Value" = Table.ReplaceValue(#"Aggregated Billing Report (Pre-Upload)",null,0,Replacer.ReplaceValue,{"Sum of Billing Report (Pre-Upload).Ext Sell Price"}),
#"Added Custom" = Table.AddColumn(#"Replaced Value", "Total Billings to Date", each [#"Sum of Billing Report (Pre-Upload).Ext Sell Price"] +[EXT_SELL_PRICE]),
#"Changed Type1" = Table.TransformColumnTypes(#"Added Custom",{{"Sum of Billing Report (Pre-Upload).Ext Sell Price", type number}, {"Total Billings to Date", type number}}),
#"Rounded Off" = Table.TransformColumns(#"Changed Type1",{{"Total Billings to Date", each Number.Round(_, 2), type number}}),
#"Added Custom1" = Table.AddColumn(#"Rounded Off", "Rebill?", each if [Total Billings to Date] > [SO Amount] then "Yes" else "No" ),
#"Filtered Rows3" = Table.SelectRows(#"Added Custom1", each ([#"Rebill?"] = "No")),
#"Renamed Columns1" = Table.RenameColumns(#"Filtered Rows3",{{"Sum of Billing Report (Pre-Upload).Ext Sell Price", "Sum of Billing Report Ext Sell Price"}})
in
#"Renamed Columns1"

5 REPLIES 5
yoava
Regular Visitor

Hi @Anonymous @newtoPowerBI,

 

I am also experianceing unusual performance degredataion with small files being read as very large ones

I have a 2,500 KB csv that shows the same behaviour described above. 

 

 

 

let
    Source = Csv.Document(File.Contents("C:\input.csv"),[Delimiter=",", Columns=2, Encoding=65001, QuoteStyle=QuoteStyle.Csv]),
    #"Use First Row as Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
    #"Trimmed Text" = Table.TransformColumns(#"Use First Row as Headers",{{"ADD_CONFIG", Text.Trim, type text}}),
    #"Cleaned Text" = Table.TransformColumns(#"Trimmed Text",{{"ADD_CONFIG", Text.Clean, type text}}),
    #"Filtered Rows" = Table.SelectRows(#"Cleaned Text", each [ADD_CONFIG] <> null and [ADD_CONFIG] <> ""),
    #"Parsed XML" = Table.TransformColumns(#"Filtered Rows",{{"ADD_CONFIG", Xml.Tables}}),
    #"Expanded ADD_CONFIG" = Table.ExpandTableColumn(#"Parsed XML", "ADD_CONFIG", List.Union(List.Transform(#"Parsed XML"[ADD_CONFIG], each Table.ColumnNames(_))))
in
    #"Expanded ADD_CONFIG"

 

 

Using some trial and errors, i figured the performance degradation is directly related to the following expression:

 

    #"Expanded ADD_CONFIG" = Table.ExpandTableColumn(#"Parsed XML", "ADD_CONFIG", List.Union(List.Transform(#"Parsed XML"[ADD_CONFIG], each Table.ColumnNames(_))))

 

If i replace it with an actual comma seperated list (generated using the same List.Union(List.Transform)), the expansion is immediate:

 

= Table.ExpandTableColumn(#"Parsed XML", "ADD_CONFIG", {"foobar0","foobar1","foobar10","foobar100","foobar101","foobar102","foobar103","foobar104","foobar107","foobar108","foobar109","foobar11","foobar110","foobar112","foobar12","foobar13","foobar14","foobar15","foobar16","foobar17","foobar18","foobar19","foobar2","foobar20","foobar21","foobar22","foobar23","foobar24","foobar25","foobar27","foobar28","foobar29","foobar3","foobar30","foobar31","foobar32","foobar33","foobar34","foobar35","foobar36","foobar37","foobar38","foobar39","foobar4","foobar40","foobar41","foobar42","foobar43","foobar44","foobar45","foobar46","foobar47","foobar48","foobar49","foobar5","foobar50","foobar51","foobar52","foobar53","foobar54","foobar55","foobar56","foobar57","foobar58","foobar59","foobar6","foobar60","foobar61","foobar62","foobar63","foobar64","foobar65","foobar67","foobar68","foobar69","foobar7","foobar70","foobar71","foobar72","foobar73","foobar74","foobar75","foobar76","foobar77","foobar78","foobar79","foobar8","foobar81","foobar82","foobar83","foobar84","foobar85","foobar86","foobar87","foobar88","foobar89","foobar9","foobar91","foobar92","foobar93","foobar94","foobar95","foobar96","foobar97","foobar98","foobar99"})

 

 Note that this expansion does not create additional rows, only additional columns. 

Anonymous
Not applicable

Hi @newtoPowerBI,

 

Can you please share pbix file and the csv file for test ? It is hard to find out the issue from your query formula.

 

Notice: please do mask on sensitive data before sharing.

 

Regards,

Xiaoxin Sheng

HI @Anonymous

 

It looks like when I merge and aggregate data from another table it increases the size significantly. In one case I merged a query and aggregated a sum of a column and the file read as 10GB. Does anyone know a way to not have this happen?


Thank you!


Eddy

Anonymous
Not applicable

Hi @newtoPowerBI,

 

Have you try to expand summary columns in your merge table?(column which stored list, records, tables )

When you expand merge table, it will generate lots duplicate rows.

 

Regards,

Xiaoxin Sheng

HI @Anonymous,

 

Here is a link to the data where most of the query is derived from.

 

https://www.dropbox.com/sh/5kzn1vm38mkis87/AABSRMdhGpGDi6TsOSkpZ5u5a?dl=0

 

Thank you!

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.