katushka_enko
Frequent Visitor

How to exclude large historical table from refresh in Power BI?

Hi Power BI Community,

I’m facing an issue with refresh performance in my Power BI report due to a large historical dataset. I’ve tried the Hot & Cold Data approach, but the historical table is still being queried during refresh, even though it’s marked as "not included in report refresh."

 My setup:

  • Historical Table (COLD): ~200 million rows, 93 columns, stored in an SQL database.
    • Enable Load: Enabled
    • Include in Report Refresh: Disabled
  • Current Table (HOT): ~17 million rows, but growing every month.
    • Enable Load: Enabled
    • Include in Report Refresh: Enabled
  • Final Table (APPEND): Combining both tables in Power Query using Table.Combine({Historical, Current}).
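
For reference, the final append query is essentially just this (simplified sketch – Historical and Current are the two staging queries above):

    // Final (APPEND) – combines the static cold table with the refreshing hot table
    let
        Source = Table.Combine({Historical, Current})
    in
        Source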

The issue:

  • Even though Historical is excluded from refresh, Power BI still queries it when updating the final appended table.
  • The Incremental Refresh approach is not feasible due to file size constraints and the number of columns (93).
  • The SQL queries are simple SELECT statements, but they take too long due to the large data volume.
  • The goal is to only refresh the Current table while keeping the Historical table static in the model.

What I’ve tried so far:

  1. Disabled refresh for Historical – but Power BI still queries it during Append.
  2. Table.Buffer() – didn’t work, as Power BI still re-executes the SQL query.
  3. Using Dataflows – not an option at the moment.
  4. DirectQuery for Historical – makes queries slower due to large data volume.

My questions:

1) How can I append the Current and Historical tables while ensuring that Power BI does NOT re-query the Historical data during refresh?
2) Is there a way to store the Historical table in the model without it being queried every time the Current table refreshes?

Any advice or workarounds would be greatly appreciated!

1 ACCEPTED SOLUTION
katushka_enko
Frequent Visitor

Hi everyone!
I wanted to share how I finally resolved the issue I posted about earlier regarding excluding a large historical table from refresh in Power BI.

🧩 The problem recap:

  • We had a large fact table (~200M rows, 93 columns), and incremental refresh failed during full dataset refresh due to timeout and memory limits – partitions weren’t being created.

  • We attempted a Hot & Cold Data approach by splitting into historical and current tables in Power Query and appending them, but even with “Include in refresh” disabled for historical, Power BI still queried it during refresh.

  • Using DAX UNION to combine both tables increased dataset size and hurt performance due to additional in-memory tables and complex measures.

What worked:

  • We switched to custom partitioning using Tabular Editor and SQL Server Management Studio (SSMS).

  • I created one partition per year manually and gradually loaded data year by year (a sketch of one partition's source expression is shown after this list).

  • This approach allowed us to:

    • Successfully configure incremental refresh.

    • Avoid timeouts.

    • Reduce memory pressure during refresh.
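
For anyone wondering what "one partition per year" looks like in practice: in Tabular Editor, each partition of the fact table gets its own Power Query (M) source expression. A rough sketch of one year's expression is below – the server, database, table and date column names are placeholders rather than our real ones:

    // Partition "Fact 2021" – one copy of this per year, with the date boundaries changed
    let
        Source   = Sql.Database("my-sql-server", "MyWarehouse"),          // placeholder server/database
        Fact     = Source{[Schema = "dbo", Item = "FactHistory"]}[Data],  // placeholder table
        ThisYear = Table.SelectRows(
                       Fact,
                       each [TransactionDate] >= #datetime(2021, 1, 1, 0, 0, 0)
                        and [TransactionDate] <  #datetime(2022, 1, 1, 0, 0, 0)
                   )
    in
        ThisYear

Once the partitions exist, each one can be processed individually from SSMS via the XMLA endpoint, which is what kept every single load small enough to avoid the timeouts.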

🚀 Bonus:

  • Once all partitions were added and processed, incremental refresh ran smoothly.

  • This method avoided unnecessary reloading of historical data during refresh and ensured performance remained stable.

I hope this helps someone facing a similar challenge. Feel free to ask if you’d like more technical details! 😊


11 REPLIES

Hi @katushka_enko ,

Please accept your reply as the solution, so other community members can easily find it.

Thank you.

v-menakakota
Community Support

Hi @katushka_enko ,

I hope this information is helpful. Please let me know if you have any further questions or if you'd like to discuss this further. If this answers your question, please Accept it as a solution and give it a 'Kudos' so others can find it easily.

Thank you.

Hi @katushka_enko ,

May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will help other community members with similar problems find the answer faster.

Thank you.

 


BA_Pete
Super User

Hi @katushka_enko ,

 

Firstly, disabling refresh on a query only works if that query is not referenced downstream. The fact that you are appending your current and historical tables means that your history table is referenced and is therefore forced to refresh regardless of this setting. You could try using this setting but keeping the current and historical fact tables separate (i.e. no downstream references to historical), but this may cause you further complications with time intelligence etc. Or you could potentially use this method and then UNION the two tables in the model using DAX, but you're adding quite a significant processing step by doing that.

 

For what you're trying to achieve here, and the sheer number of rows (and growing) that you're dealing with, I think you will need to use either incremental refresh and/or custom partitions. If your historical data is 100% cold and has absolutely 0% chance of needing to be refreshed then you could get away with just using incremental refresh, but I think a 'belt and braces' approach using both would be prudent (what happens when current data needs to be archived to cold storage, for example?).

Using custom partitions will allow you to load your incremental refresh history in bite-size chunks, as well as only refreshing selected chunks going forward.

There are plenty of resources online about how to set up custom partitions but, I'll warn you, it's not for the faint-hearted!

 

Pete







Hi @BA_Pete ,

Thanks for your response!

We actually used Incremental Refresh, and it worked well initially. However, the problem arose when we had to make changes to the dataset – each time we modified it, we had to refresh the entire dataset from scratch. Since the dataset size and number of records increased significantly, Incremental Refresh started failing consistently, throwing timeout errors. Eventually, the capacity got overloaded and stopped working altogether.

We also tried managing partitions via XMLA, but that required loading both the historical and current tables into the model and then combining them using DAX UNION, which didn’t seem optimal in terms of performance.

A few questions regarding your recommendations:

1) Do you have any recommendations for handling Incremental Refresh in cases where dataset size is too large, and full refresh becomes unmanageable?
2) Regarding custom partitions – are there any hidden pitfalls or challenges we should be aware of before implementing this approach?

 

Hi @katushka_enko ,

 

1) Yes, load the incremental refresh history in chunks using custom partitions.

2) Yes:

    i) It's pretty technical and requires more advanced skills and knowledge. It's not the sort of thing you will easily be able to hand over to a new/junior developer and expect them to be able to pick it up straight away.

    ii) Using XMLA endpoints prevents you from being able to download items (models etc.) back out of the Service. This is no different from incrementally refreshed models, but can have wider-reaching effects depending on how many touchpoints you actually use this method for.

 

As I said previously, custom partitions aren't for the faint of heart. In addition to just getting the partitions created that can be run manually, you'll probably want/need to set up some Power Automate structure(s) behind it to handle the loads and manage partition successes/failures etc.

You can also do all of this in dataflows: set the Enhanced Compute Engine to ON, then use DirectQuery plus an Agg table into your model for reporting. Alternatively, you can break up all of your history and current data into dataflows of smaller, equal parts (e.g. by fiscal year) and gradually combine them through a series of dataflows or within your model.

 

Once you start getting into these types of data sizes, your options obviously become more limited. I don't think you'll get to a place where it performs like a 1m row import model. You'll probably need to test a number of these options, or combinations of them, and pick whichever is the best of the bad bunch.

 

Pete







Hi @katushka_enko ,

 

Thank you for reaching out to us on the Microsoft Fabric Community Forum.

 

Instead of refreshing all partitions, configure incremental refresh to retain only the necessary historical data while limiting refreshes to recent partitions. For example, store up to five years of data but refresh only the last three months.
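
As a reminder of what that policy relies on: the table's source query has to be filtered on the reserved RangeStart and RangeEnd datetime parameters so the service can generate the partitions. A minimal sketch, with example server, database and column names:

    // Source query for the incrementally refreshed table – RangeStart/RangeEnd are the
    // reserved parameters Power BI substitutes per partition; the other names are examples only.
    let
        Source   = Sql.Database("my-sql-server", "MyWarehouse"),
        Fact     = Source{[Schema = "dbo", Item = "FactCurrent"]}[Data],
        Filtered = Table.SelectRows(
                       Fact,
                       each [TransactionDate] >= RangeStart and [TransactionDate] < RangeEnd
                   )
    in
        Filtered

The "store five years / refresh three months" split is then configured in the table's incremental refresh policy in Power BI Desktop.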

Rather than maintaining a single large dataset, break it into smaller datasets based on business logic (e.g., by year or region) and refresh them individually.
If using Power BI Premium, Hybrid Tables allow mixing Import mode (for Current) and DirectQuery (for Historical), reducing memory usage.

Custom partitions provide better control over refresh but require XMLA scripting and management. Here are the potential pitfalls:
1. Even with partitioning, if the total dataset is too large, it can still hit capacity limits.
2. It requires external tools such as SSMS, Tabular Editor, or the Power BI REST APIs.

If this post was helpful, please give us Kudos and consider marking it as the Accepted Solution to help other members find it more easily.


Akash_Varuna
Community Champion

Hi @katushka_enko, you could check these workarounds for your questions:

  1. To append the Current and Historical tables without re-querying the Historical table during refresh, you can export the Historical table as a static file (CSV or Parquet). Import this file into Power BI, disable "Include in Report Refresh," and keep it static. Then, append it to the Current table in Power Query or use a DAX-calculated table (UNION(Historical, Current)) within the model. A minimal sketch of the static import is shown after this list.

  2. To store the Historical table in the model without it being queried every time the Current table refreshes, load the Historical table once as a static dataset and ensure it is excluded from refresh. Avoid transformations involving the Historical table in Power Query to prevent Power BI from triggering a query.
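
For point 1, a minimal sketch of what the static import could look like (the file path is hypothetical):

    // Historical – loaded once from a static Parquet export instead of hitting SQL on every refresh
    let
        Source = Parquet.Document(File.Contents("C:\Exports\Historical.parquet"))  // hypothetical path
    in
        Source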
If this post helped, please give it a kudos and accept it as a solution. Thanks in advance!
