PeterBJones
New Member

Power BI incremental update of fact table partitions question

Hello Power BI Experts,

 

We have a question and are wondering if anyone out there knows the answer.

 

We are testing partitioned updates of summary tables in the Power BI service. Our test data comes from a data warehouse that contains many summary tables, and we are testing partitioning by month for a start.

 

We added an audit timestamp to the rows in the summary table, and we added the actual first-of-month date to the summary fact table rather than just the integer key. Then we configured Power BI to treat the table as partitioned. We loaded the whole table, changed the audit timestamp on one row in one partition, and that partition was then deleted and reloaded.
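The mechanism being tested here can be sketched outside Power BI. This is an illustrative model only (not Power BI internals): monthly partitions are snapshotted by their highest audit timestamp, and only partitions whose snapshot moved get a delete-and-reload. Table and column names (sale_sum, month_start, audit_ts) are hypothetical, and SQLite stands in for the warehouse.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE sale_sum (
        month_start TEXT,   -- first-of-month date, the partition key
        amount      REAL,
        audit_ts    TEXT    -- audit timestamp, updated when a row changes
    )""")
con.executemany(
    "INSERT INTO sale_sum VALUES (?, ?, ?)",
    [("2024-01-01", 100.0, "2024-02-01 00:00"),
     ("2024-02-01", 200.0, "2024-03-01 00:00")])

def max_audit_by_partition(con):
    """Snapshot: highest audit_ts seen in each monthly partition."""
    rows = con.execute(
        "SELECT month_start, MAX(audit_ts) FROM sale_sum GROUP BY month_start")
    return dict(rows)

baseline = max_audit_by_partition(con)

# One row in the January partition is touched, as in the test described above.
con.execute(
    "UPDATE sale_sum SET audit_ts = '2024-04-01 00:00' "
    "WHERE month_start = '2024-01-01' AND amount = 100.0")

current = max_audit_by_partition(con)
stale = [p for p, ts in current.items() if ts != baseline.get(p)]
print(stale)  # only the January partition needs a delete-and-reload
```

This matches the observed behaviour: touching one row's audit timestamp marks exactly one monthly partition for reload while the others are left alone.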

 

This is all working exactly as documented and we are really happy the feature is there. We had not heard of it before, so if you don't know about it either, you might want to read up on it; it's a very valuable feature.

 

My question is that Power BI is generating a LOT of SQL against the database to look at all tables, all columns, and all constraints. This one test generated 130K of SQL running against the database. An example follows below.

 

Now, I can understand why it is looking at vf_sale_txn_sum_01_inc: this is our test table that holds sales for incremental update testing, so it has to look at that. But it is also looking at every other table and view in the database via the "or" conditions.

 

And since we plan to have a LOT of tables / views in our data warehouses, this is a problem.

 

If it has to generate and run this code for every incremental update, then what we would have to do is create subject-area databases, where the number of views in each subject area is lower and the database is just a set of views pointing back into the data warehouse.

 

That is no problem, and we can do it easily if we have to. We are just wondering why this very large amount of SQL is being generated to look at the metadata for all the tables.
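The reason the generated SQL grows with the warehouse can be sketched in miniature. This is a hedged illustration (names like vf_sale_txn_sum_01_inc and vm_NNN are stand-ins, and SQLite's sqlite_master stands in for sys.objects): a metadata query that enumerates every candidate object does work proportional to the catalog size, which is exactly what a slim subject-area database of views would shrink.

```python
import sqlite3

con = sqlite3.connect(":memory:")
for i in range(100):                       # a warehouse with many objects
    con.execute(f"CREATE TABLE vm_{i:03d} (k INTEGER)")
con.execute("CREATE TABLE vf_sale_txn_sum_01_inc (month_start TEXT)")

# Broad scan: every object in the catalog is examined, as in the posted query.
broad = con.execute("SELECT COUNT(*) FROM sqlite_master").fetchone()[0]

# Scoped scan: only the one table the refresh actually touches.
scoped = con.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name = ?",
    ("vf_sale_txn_sum_01_inc",)).fetchone()[0]

print(broad, scoped)   # 101 objects scanned broadly vs 1 scanned scoped
```

A subject-area database that exposes only the needed views reduces the `broad` number without changing the data, which is why that workaround is effective even though it doesn't change Power BI's behaviour.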

 

And we are wondering whether there is an option to make it look only at the tables needed, something easier than creating subject-area databases pointing back into the data warehouse?

 

Thank You in advance for any assistance you may be able to give us.

 

Peter Jones

bida.ro 

 

select
convert(nvarchar, fk.object_id) [FK_NAME], cast(f.constraint_column_id as bigint) [ORDINAL],
s1.name [TABLE_SCHEMA_1], o1.name [TABLE_NAME_1], c1.name [PK_COLUMN_NAME_1],
s2.name [TABLE_SCHEMA_2], o2.name [TABLE_NAME_2], c2.name [PK_COLUMN_NAME_2],
f.constraint_object_id, f.constraint_column_id
from sys.foreign_key_columns f join sys.foreign_keys fk on f.constraint_object_id = fk.object_id
join sys.objects o1 on o1.object_id = f.parent_object_id join sys.schemas s1 on s1.schema_id = o1.schema_id
join sys.objects o2 on o2.object_id = f.referenced_object_id join sys.schemas s2 on s2.schema_id = o2.schema_id
join sys.columns c1 on c1.object_id = o1.object_id and c1.column_id = f.parent_column_id
join sys.columns c2 on c2.object_id = o2.object_id and c2.column_id = f.referenced_column_id
where ((s1.name = N'dbo' and o1.name = N'vf_sale_txn_sum_01_inc')
or (s1.name = N'dbo' and o1.name = N'vm_xxxxxx')
-----up to 100 tables it seems.
)
union
select
convert(nvarchar, fk.object_id) [FK_NAME], cast(f.constraint_column_id as bigint) [ORDINAL],
s1.name [TABLE_SCHEMA_1], o1.name [TABLE_NAME_1], c1.name [PK_COLUMN_NAME_1],
s2.name [TABLE_SCHEMA_2], o2.name [TABLE_NAME_2], c2.name [PK_COLUMN_NAME_2],
f.constraint_object_id, f.constraint_column_id
from sys.foreign_key_columns f join sys.foreign_keys fk on f.constraint_object_id = fk.object_id
join sys.objects o1 on o1.object_id = f.parent_object_id join sys.schemas s1 on s1.schema_id = o1.schema_id
join sys.objects o2 on o2.object_id = f.referenced_object_id join sys.schemas s2 on s2.schema_id = o2.schema_id
join sys.columns c1 on c1.object_id = o1.object_id and c1.column_id = f.parent_column_id
join sys.columns c2 on c2.object_id = o2.object_id and c2.column_id = f.referenced_column_id
where ((s2.name = N'dbo' and o2.name = N'vf_sale_txn_sum_01_inc')
or (s2.name = N'dbo' and o2.name = N'vm_xxxxxx')
-----up to 100 tables it seems.
)
order by f.constraint_object_id, f.constraint_column_id;

2 ACCEPTED SOLUTIONS
nilendraFabric
Community Champion

Hello @PeterBJones 


Great observation first of all.

 

 

First, let's understand why Power BI is doing such a broad scan.

 

Power BI prioritizes data reliability over performance in complex environments

 

 

Power BI performs these broad schema checks during incremental refreshes to enforce data integrity and ensure compatibility across your entire data model, even for tables not directly used in the report.

 

Here’s why this happens:
1. Relational Integrity Enforcement
Power BI automatically validates foreign key relationships to prevent orphaned records or broken dependencies. For example, if `vf_sale_txn_sum_01_inc` has a FK to `vm_xxxxxx`, Power BI scans all related tables to ensure:
  • Referenced rows exist.
  • Data types match.
  • Constraints are respected.
This generates `sys.foreign_keys` queries with `OR` conditions for all tables in the same schema, even unused ones.

 

 


2. Schema Consistency Validation
Power BI checks `INFORMATION_SCHEMA` and system tables to:
• Detect schema changes (e.g., new columns, modified data types).
• Verify partition boundaries align with source data structures.


3. Query Folding Verification
If transformations (e.g., filters, calculated columns) break query folding, Power BI reverts to in-memory processing, triggering:
• Full scans of system tables to rebuild metadata.
• Redundant checks on all tables to compensate for lost server-side optimizations.
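Point 1 can be made concrete with a small sketch. This is illustrative only (SQLite's PRAGMA foreign_key_list stands in for the sys.foreign_key_columns join in the posted query, and the table names mirror the ones from the question): the FK metadata for one table can in principle be read per table, which is what a narrowly scoped catalog query looks like, as opposed to one big OR over the whole schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE vm_xxxxxx (id INTEGER PRIMARY KEY)")
con.execute("""
    CREATE TABLE vf_sale_txn_sum_01_inc (
        id    INTEGER PRIMARY KEY,
        vm_id INTEGER REFERENCES vm_xxxxxx(id)   -- the FK being validated
    )""")

# Per-table FK lookup: each row is (id, seq, ref_table, from_col, to_col, ...)
fks = con.execute("PRAGMA foreign_key_list(vf_sale_txn_sum_01_inc)").fetchall()
for fk in fks:
    print(f"{fk[3]} -> {fk[2]}.{fk[4]}")   # vm_id -> vm_xxxxxx.id
```

The contrast with the posted query is the point: Power BI's generated SQL asks about up to 100 tables at once, while a per-table lookup like this touches only the object being refreshed.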

 

 

A few things you can try to minimise it:

 

Use explicit SQL queries in Power Query to push filters to the source, avoiding GUI-based transformations that break folding.

 

https://learn.microsoft.com/en-us/power-bi/connect-data/incremental-refresh-troubleshoot

 

Isolate incremental tables: move the table to a dedicated schema to limit FK checks.

 

https://www.sqlshack.com/an-overview-of-power-bi-incremental-refresh

 

Create filtered views to exclude non-essential FKs.

 

If you have Premium, use XMLA endpoints to skip the metadata checks.

 

Hope this helps.

 

Thanks

 


Poojara_D12
Super User

Hi @PeterBJones 

Power BI generates a large number of SQL queries to retrieve metadata about all tables, columns, and constraints in your database, rather than focusing only on the partitioned table used for incremental updates. This happens because Power BI performs a metadata discovery process to ensure schema consistency before executing updates. In large data warehouses with many tables and views, this can create excessive overhead.

To reduce it, you can:
  • Create a separate schema or subject-area database containing only the required tables, limiting unnecessary queries.
  • Restrict Power BI's database permissions so it only has access to the relevant tables.
  • Use DirectQuery instead of Import mode, where applicable, to minimise metadata queries.
  • Build views that abstract just the necessary tables, and monitor query execution in SQL Profiler to guide performance tuning.

Since Power BI does not offer direct control over metadata scans, restructuring the database setup is the most effective way to reduce unwanted queries.

 


Did I answer your question? Mark my post as a solution, this will help others!
If my response(s) assisted you in any way, don't forget to drop me a "Kudos"

Kind Regards,
Poojara - Proud to be a Super User
Data Analyst | MSBI Developer | Power BI Consultant
Consider Subscribing my YouTube for Beginners/Advance Concepts: https://youtube.com/@biconcepts?si=04iw9SYI2HN80HKS


6 REPLIES
v-ssriganesh
Community Support

Hi @PeterBJones,
I wanted to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions. If my response has addressed your query, please accept it as a solution and give a 'Kudos' so other members can easily find it.
Thank you.

v-ssriganesh
Community Support

Hi @PeterBJones,

May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will be helpful for other community members who have similar problems to solve it faster.

Thank you.

v-ssriganesh
Community Support

Hello @PeterBJones,
Thank you for your detailed observation and question. It's great to hear that partitioned updates in Power BI are working well for you! To address your concern about the extensive SQL queries generated during incremental refresh: the response provided by @nilendraFabric is correct. I'd like to add a few points that might help further optimize your scenario.

  • Ensure your database has proper indexing to support the catalog queries (e.g., those touching sys.foreign_keys and sys.objects) and improve query performance.
  • If you're using Power BI Service, dataflows can pre-process and store incremental updates before loading into the model, reducing direct metadata queries.
  • If your database supports native partitioning, structuring partitions at the source can further optimize incremental refresh behavior.
  • If you're using SQL Server, running SQL Server Profiler during refreshes can help analyse the exact queries Power BI generates and identify possible optimizations.

If this helps, please Accept it as a solution and drop a "Kudos" so other members can find it more easily.
Thank you.

Hi @PeterBJones,
I hope this information is helpful. Please let me know if you have any further questions or if you'd like to discuss this further. If this answers your question, please Accept it as a solution and give it a 'Kudos' so others can find it easily.
Thank you.

 
