Hi everyone,
We are currently migrating to Power BI and need some clarity around how Power BI handles and stores data, especially for long-term planning.
Specifically:
How does Power BI store data internally when using Import or DirectQuery modes?
Where is the data actually stored in both cases?
What is the internal data flow architecture?
Since our organisation plans to work with historical data spanning 5+ years, plus data for the upcoming years, we want to understand:
How scalable is Power BI (especially with Premium Per User) for long-term data retention?
Are there limits or constraints for storing large datasets over several years?
How can we upgrade capacity if we reach storage or performance limits?
Would appreciate guidance or documentation that explains:
The architecture behind data storage and processing in Power BI
Long-term scalability strategies
How to plan capacity upgrades proactively
Thanks in advance!
Specifically:
How does Power BI store data internally when using Import or DirectQuery modes?
A: In DirectQuery mode it does not store anything; the data stays in the source. In Import mode the data is stored in VertiPaq (a columnar, in-memory engine that compresses the data).
Where is the data actually stored in both cases?
A: See the answer above.
What is the internal data flow architecture?
A: Please clarify what you mean.
How scalable is Power BI (especially with Premium Per User) for long-term data retention?
A: That is not a problem; you can buy additional capacity if you need it.
Are there limits or constraints for storing large datasets over several years?
A: Not as long as you have enough capacity; for very large historical tables, see also the incremental refresh sketch after this list.
How can we upgrade capacity if we reach storage or performance limits?
A: There are several ways, one dynamic and one static. With the dynamic option your capacity (and your bill) goes up automatically if you exceed the limits; with the static one you have to upgrade your licence manually.
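On the large historical datasets point, a common pattern worth looking at is incremental refresh: you define the reserved RangeStart/RangeEnd datetime parameters in Power Query, filter the fact table on them, and the service then partitions the table by date and only refreshes the recent partitions instead of all 5+ years. Below is a minimal sketch; the server, database, table and column names (SalesDW, dbo.FactSales, OrderDate) are placeholders, and the two parameters are normally created through Manage Parameters in Power BI Desktop rather than typed as free-form M.

// Parameter queries (names must be exactly RangeStart and RangeEnd; the service
// overrides their values for each partition at refresh time):
//   RangeStart:  #datetime(2019, 1, 1, 0, 0, 0)
//                meta [IsParameterQuery = true, Type = "DateTime", IsParameterQueryRequired = true]
//   RangeEnd:    #datetime(2025, 1, 1, 0, 0, 0)
//                meta [IsParameterQuery = true, Type = "DateTime", IsParameterQueryRequired = true]

// Fact table query: filtering on the parameters folds back to the source,
// so each refresh only re-reads the partitions that changed.
let
    Source   = Sql.Database("sql-server-name", "SalesDW"),              // placeholder source
    Sales    = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
    Filtered = Table.SelectRows(
        Sales,
        each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd       // half-open range avoids duplicate rows
    )
in
    Filtered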
If this helped, please consider giving kudos and mark as a solution
Francesco Bergamaschi
MBA, M.Eng, M.Econ, Professor of BI
Thanks, @FBergamaschi, for your response!
To clarify what I meant by internal data flow architecture:
I'm trying to understand how the data flows internally when we use either Import or DirectQuery mode — specifically:
In Import mode, how exactly is the data ingested and transformed through Power Query and then stored into VertiPaq?
Does Power BI apply any compression or optimization during that process?
In DirectQuery, since data isn't stored, how does Power BI handle the query lifecycle and caching?
I'm looking to understand the architecture behind these layers — like:
Data source → Power Query → Model → Storage (VertiPaq or Live Query) → Report rendering.
Also, regarding capacity upgrades:
Could you please explain more about the Dynamic vs Static capacity scaling options?
Since we’re planning a long-term migration to Power BI, understanding how scalable and maintainable this will be over the next 10+ years is important.
Appreciate your insights!
In Import mode, how exactly is the data ingested and transformed through Power Query and then stored into VertiPaq?
A: The data is loaded into VertiPaq column by column and compressed using dictionary and value encoding.
Does Power BI apply any compression or optimization during that process?
A: Yes, see the answer above.
In DirectQuery, since data isn't stored, how does Power BI handle the query lifecycle and caching?
A: In DirectQuery, every query is translated into SQL (or the source's native query language) and sent to the data source at query time.
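To make the "translated to SQL" point a bit more concrete, here is a minimal Power Query sketch (source, table and column names are placeholders): when the steps fold, the whole query is rewritten as one native query against the source, which you can check in Power BI Desktop by right-clicking the last applied step and choosing View Native Query. In Import mode that native query runs once per refresh and the result is compressed into VertiPaq; in DirectQuery mode, report visuals generate comparable source queries at run time instead of reading a local cache.

// Minimal sketch of a folding query; placeholder names throughout.
let
    Source  = Sql.Database("sql-server-name", "SalesDW"),                   // placeholder source
    Sales   = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
    Recent  = Table.SelectRows(Sales, each [OrderDate] >= #date(2020, 1, 1)),
    Grouped = Table.Group(
        Recent,
        {"ProductKey"},
        {{"TotalAmount", each List.Sum([Amount]), type number}}             // folds to SUM(Amount) ... GROUP BY ProductKey
    )
in
    Grouped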
For capacity upgrades, since it is a broad topic, I think it is better to refer to the Fabric documentation here:
https://learn.microsoft.com/en-us/fabric/enterprise/scale-capacity
If this helped, please consider giving kudos and mark as a solution
Francesco Bergamaschi
MBA, M.Eng, M.Econ, Professor of BI