Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more

Reply
Anonymous
Not applicable

Redshift native query file size

So the short version of the story is that when we were creating the data set in Power Query and using M, we were having a lot of issues with the dataset not refreshing and getting a lot of timeout errors. We decided to rebuild using the native query in the Redshift connector and had it all done as a single SQL statement. After doing so, there haven't been any refresh issues so far, but the filesize is 44% larger, so I'm just trying to figure out why.

 

Our old data model for our commission line items was built using the Redshift connector. It pulled in the entire table of 20 years of data, deleted half of the columns, then merged in 2-3 columns each from 3 other tables. One of those extra columns is a run date, which is then filtering to keep anything within the last 400 days. When all is said and done, (for size testing purposes) I disabled the load on those other tables and only kept the commission items table. It ends up being about 15.5 million rows and 12 columns, with a file size of 310MB.

 

When I rebuild the same merges/appends as in the description above using the optional SQL statement in the same Redshift connector, with the only difference being it's pulling the last 12 months from the start of the current month instead of the last 400 days. All the other merges/appends are identical. The table ends up being 13.8 million rows and 12 columns, but the file size is 446MB.

 

I get the row count being less because is has a month less of data. But why is the file size nearly 44% larger? The final table size is exactly the same in terms of columns and there are fewer rows. The content of what is there is identical. So why does the SQL queried table take up so much more space? We are really loving the improved refreshing with the native query, but not too thrilled with how much larger the file size is.

 

Is there any way to tweak the native query so that the file size is more in lines of the older M code way we did it? I just can't figure out where the bloat is coming from.

 
1 REPLY 1
Anonymous
Not applicable

Hi @Anonymous ,

 

As the official document said,Power Query gives you the flexibility to import data from wide variety of databases that it supports. It can run native database queries, which can save you the time it takes to build queries using the Power Query interface. 

 

Here is a blog talking about a similar scenario, you may kindly refer to:

Solved! Query folding for native SQL in Power BI

Differences between the M Language and DAX in Power BI

Query Folding On SQL Queries In Power Query

 

Best Regards,
Eyelyn Qin

Helpful resources

Announcements
Power BI DataViz World Championships

Power BI Dataviz World Championships

The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!

December 2025 Power BI Update Carousel

Power BI Monthly Update - December 2025

Check out the December 2025 Power BI Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Solution Authors