ace12
Regular Visitor

Project doubles in size when creating dimension tables

I have one large data set (about 300 MB, roughly 700k rows).

 

I would like to create a series of dimension tables to reduce the size. My process has been to duplicate the original query, remove the columns I don't need, remove duplicates, and add an index column. Then I merge it back into the original query and keep just the index column.

 

However, when I refresh my project it goes to 600 MB, and when I add another query and follow the same process it goes to 900 MB. How can I avoid this? I thought creating dimension tables would make the project run smoother with less data.

4 REPLIES
Anonymous
Not applicable

Hi,

Thanks for the solutions @Anonymous and @Joe_Barry provided. I want to offer some more information for users to refer to.

Hello @ace12, based on your description you can consider using a calculated table to create the dimension table in Power BI Desktop, such as:

 

Table = SUMMARIZE ( 'Table', 'Table'[Index], ... )

 

You can refer to the following link about it:

Using calculated tables in Power BI Desktop - Power BI | Microsoft Learn

 

Best Regards!

Yolo Zhu

If this post helps, please consider accepting it as the solution to help other members find it more quickly.

Anonymous
Not applicable

First, are you confusing the query size with the data model size? You might see the table queried 3 times, but your resulting data model will be much smaller if you've replaced a non-integer dimension with integers.

 

Second, instead of duplicating the table, duplicate just the data source, remove all of the other columns (after duplicating the filter logic if necessary), then make the rows distinct and add the index. Then perform your join. If both queries are sorted the same way on the join keys, you can use Table.Join instead of NestedJoin and pass JoinAlgorithm.SortMerge for the joinAlgorithm parameter. That way the data is streamed rather than the whole table being loaded into memory.
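As a rough sketch of that approach in Power Query (M): the query names Sales and DimCategory and the column names Category and Dim.Index are placeholders, not from this thread.

let
    // SortMerge assumes both inputs are sorted on the join keys, so sort
    // explicitly rather than relying on source order.
    SortedFact = Table.Sort(Sales, {{"Category", Order.Ascending}}),
    SortedDim = Table.Sort(Table.PrefixColumns(DimCategory, "Dim"), {{"Dim.Category", Order.Ascending}}),
    // Table.Join with JoinAlgorithm.SortMerge streams the rows instead of
    // buffering the whole table in memory the way NestedJoin does.
    Joined = Table.Join(
        SortedFact, "Category",
        SortedDim, "Dim.Category",
        JoinKind.LeftOuter,
        JoinAlgorithm.SortMerge
    ),
    // Keep the surrogate key (Dim.Index) and drop the text column it replaces.
    Result = Table.RemoveColumns(Joined, {"Category", "Dim.Category"})
in
    Result

The dimension columns are prefixed because Table.Join returns all columns from both tables, so clashing names would need to be renamed first; that detail is an assumption about how the queries are set up, not something stated in the reply above.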

 

--Nate

Joe_Barry
Super User

Hi @ace12 

 

Which source are you getting your data from? Can you write a SQL query to bring in just the columns you need?
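If the source does turn out to be a relational database, a native query is one way to bring in only the needed columns. A minimal sketch assuming SQL Server, with placeholder server, database, table, and column names:

let
    Source = Sql.Database("myserver", "mydb"),
    // Select only the columns the model actually uses.
    Slim = Value.NativeQuery(Source, "SELECT OrderID, OrderDate, CustomerID, Amount FROM dbo.Sales")
in
    Slim

Note that a hand-written native query can limit query folding for later steps, so it works best as the very first step of the query.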

 

I would also do the following:

 

  • Remove any unnecessary columns from your Fact table before duplicating.
  • Convert any DateTime columns to Date if you don't need the time. If you do, create a separate column for time (see the sketch after this list).
  • Do you have a Date table in your model? (See my bio.) If yes, turn off Auto date/time for new files in both sections of the settings.
    (screenshot: the Auto date/time option in Power BI Desktop settings)

  • There is also something I came across recently; you will need to install Tabular Editor, but it can reduce the size by a few hundred MB. https://datamozart.substack.com/p/hidden-little-gem-that-can-save-your
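For the DateTime point above, here is a minimal Power Query (M) sketch; the query name Sales and the OrderDateTime column are placeholders, not taken from this thread.

let
    // Add a separate time-of-day column only if a visual actually needs it.
    AddTime = Table.AddColumn(Sales, "OrderTime", each DateTime.Time([OrderDateTime]), type time),
    // A pure date column has far fewer distinct values than a full DateTime,
    // so it compresses much better in the data model.
    AddDate = Table.AddColumn(AddTime, "OrderDate", each DateTime.Date([OrderDateTime]), type date),
    Result = Table.RemoveColumns(AddDate, {"OrderDateTime"})
in
    Result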

 

Hope this helps

Joe






ace12
Regular Visitor

I don't have access to SQL on my work computer, but that could be an option. Thanks for your response. I didn't realize the duplicated query would still take up so much space after I removed the columns I didn't need.
