Soumeli
New Member

Large data volume

For a report, I have around 500 million rows of data. The users want all historical data, so the volume will keep increasing. How can I handle this much data in Power BI without filtering on a date range? Please help.

7 REPLIES
v-tejrama
Community Support

Hi @Soumeli ,

 

Thank you @krishnakanth240 for the response provided!


Has your issue been resolved? If the response provided by the community member addressed your query, could you please confirm? It helps us ensure that the solutions provided are effective and beneficial for everyone.


Thank you for your understanding.

krishnakanth240
Continued Contributor

Hi @Soumeli 

 

Pattern 1 - Import + Incremental Refresh

Even if users want all history, refresh does not need to reload all history.
How?
Keep historical partitions static
Refresh only recent data (e.g., last 7–30 days)

 

Key point
Incremental refresh is about refresh cost, not about filtering what users can see.
It is effectively required for large datasets.
It still works with full historical reporting (see the sketch below).
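
To make that concrete, here is a minimal Power Query (M) sketch of the date filter that incremental refresh requires; the server, database, table, and column names are placeholders, and the actual policy (how much history to store, how many days to refresh) is configured on the table in Power BI Desktop rather than in M.

// Hypothetical fact query prepared for incremental refresh.
// RangeStart and RangeEnd must exist as datetime parameters in the model;
// the service substitutes their values per partition at refresh time.
let
    Source = Sql.Database("your-server", "your-database"),
    Fact = Source{[Schema = "dbo", Item = "FactTransactions"]}[Data],
    // Filter on the datetime column so each partition loads only its own slice.
    // Use >= on one boundary and < on the other to avoid overlapping rows.
    Filtered = Table.SelectRows(
        Fact,
        each [TransactionDateTime] >= RangeStart and [TransactionDateTime] < RangeEnd
    )
in
    Filtered

Because the filter folds back to the source, historical partitions stay untouched and only the recent ones are re-queried.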


What not to do

Import raw transactional data at full grain
Rely on slicers to “limit” memory usage
Expect Power BI Pro to behave like a database
Recalculate heavy logic in DAX instead of ETL

 

Recommended Architecture by License
Power BI Pro
Import aggregated fact
Incremental refresh
Drill-through to limited detail
Max practical size: 10–30M rows

 

Power BI Premium/PPU
Large semantic model
Aggregations
Composite models
Can scale to 100M+ rows

 

With Fabric / Data Warehouse
Raw Data (500M+)
→ Warehouse / Lakehouse
→ Aggregated Fact Tables
→ Power BI Semantic Model


Power BI is designed to be a semantic and analytical layer.


Finally
For 500M+ rows and growing:
Pre-aggregate the data
Use incremental refresh
Expose full history via aggregates
Allow drill-through for detail
Plan for Premium/Fabric if growth continues

 

Please mark it as a solution and give kudos if this helps you. Thank you!

cengizhanarslan
Solution Sage

Power BI cannot handle 500M+ growing rows “as-is” in Import mode. With Pro, semantic models are limited to ~1 GB compressed, so this volume will exceed the limit very quickly. DirectQuery alone won’t solve it either; queries will be slow and the user experience poor once people start slicing or drilling.

The practical pattern is:

  • Keep raw historical data outside Power BI (SQL, Databricks, Fabric, etc.).

  • Expose pre-aggregated tables (daily / monthly / entity level) to Power BI (see the sketch after this list).

  • Use incremental refresh or composite models so recent data is detailed and older data stays aggregated.
    Users still “see all history”, but not at raw-row level unless they explicitly drill.
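
To sketch that pre-aggregation step under stated assumptions (a relational source that supports query folding; placeholder server, database, and FactSales column names):

// Daily / entity-level aggregate built in Power Query.
// Table.Group folds to a GROUP BY in the source, so the raw 500M rows never leave the database.
let
    Source = Sql.Database("your-server", "your-dw"),
    Sales = Source{[Schema = "dbo", Item = "FactSales"]}[Data],
    Daily = Table.Group(
        Sales,
        {"SaleDate", "EntityKey"},
        {
            {"TotalAmount", each List.Sum([Amount]), type nullable number},
            {"RowCount", each Table.RowCount(_), Int64.Type}
        }
    )
in
    Daily

This aggregate is loaded in Import mode (optionally with incremental refresh), while the detail stays in the source for explicit drill-through.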

If this must keep growing indefinitely, the realistic long-term solution is a warehouse / Fabric Lakehouse + Power BI semantic model. Power BI becomes the semantic and visualization layer, not the place where 500M raw rows live.

_________________________________________________________
If this helped, ✓ Mark as Solution | Kudos appreciated
Connect on LinkedIn
danextian
Super User

Hi @Soumeli 

 

Do you have the capacity to support data at this scale? I’m asking because one of my semantic models already exceeds 350 million rows and is well beyond the 1 GB dataset limit for a Pro workspace - even after extensive optimization.

 

I’ve stripped the model down as much as possible: the datetime column was split into separate date and time fields, the time was rounded to 30-minute intervals, and all non-essential columns were removed. Despite these steps, the dataset remains over the limit and is expected to grow toward 500 million records and beyond. Given the volume and the need to keep the data in Import mode, incremental refresh is already enabled.
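
For anyone attempting a similar reduction, here is a hedged Power Query sketch of the datetime split and 30-minute rounding described above; the inline sample table and column names are placeholders for the real fact query.

// Split a datetime into a date column and a time column rounded down to 30 minutes,
// then drop the original column. Fewer distinct values compress much better in the model.
let
    // Tiny stand-in table; in practice this is the multi-hundred-million-row fact query.
    Fact = #table(
        type table [EventDateTime = datetime, Amount = number],
        {
            {#datetime(2025, 1, 1, 8, 47, 12), 10},
            {#datetime(2025, 1, 1, 9, 5, 3), 20}
        }
    ),
    WithDate = Table.AddColumn(Fact, "EventDate", each DateTime.Date([EventDateTime]), type date),
    WithTime = Table.AddColumn(
        WithDate,
        "EventTime30",
        each #time(
            Time.Hour([EventDateTime]),
            30 * Number.IntegerDivide(Time.Minute([EventDateTime]), 30),
            0
        ),
        type time
    ),
    Trimmed = Table.RemoveColumns(WithTime, {"EventDateTime"})
in
    Trimmed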

 

I think this needs to go back to the stakeholders. It’s worth clarifying whether keeping all 500 million records is truly necessary, or whether it’s simply something they want just in case. An aggregated dataset (by date and a few key grains) may be more than sufficient for the actual reporting needs. Note: I have several "just in case" semantic models, but certainly not at this scale.

 

It would also help to understand what they intend to do with the data. In practice, it’s unlikely anyone will be analyzing or viewing all 500 million rows at that level of detail, so retaining that granularity may not add much value compared to a well-designed aggregated model.

 





Dane Belarmino | Microsoft MVP | Proud to be a Super User!

Did I answer your question? Mark my post as a solution!


"Tell me and I’ll forget; show me and I may remember; involve me and I’ll understand."
Need Power BI consultation, get in touch with me on LinkedIn or hire me on UpWork.
Learn with me on YouTube @DAXJutsu or follow my page on Facebook @DAXJutsuPBI.
Ritaf1983
Super User

Hi @Soumeli 

Power BI is not designed to visualize hundreds of millions of raw rows directly.
At this scale, the solution is not loading everything, but changing the architecture.

Recommended approach:

Pre-aggregate data in the source (DWH / SQL / Lake)

Use aggregated tables for reporting

Combine Import + DirectQuery with Aggregations

Use Incremental Refresh if Import mode is involved

“All historical data” does not mean “all rows shown at once”.
Power BI visuals work on aggregations, not raw data exploration.

If this post helps, then please consider accepting it as the solution to help other members find it more quickly.

Regards,
Rita Fainshtein | Microsoft MVP
https://www.linkedin.com/in/rita-fainshtein/
Blog : https://www.madeiradata.com/profile/ritaf/profile
burakkaragoz
Community Champion

Hi @Soumeli ,

You are dealing with a "Big Data" scenario where the standard rules of Power BI Desktop no longer apply. 500 million rows will almost certainly crash a standard Import model (especially on Pro licenses) and will be painfully slow in pure DirectQuery mode without optimization.

Building on the excellent advice about Aggregations, here are two specific architectural patterns you should investigate to make this usable:

1. The "Hybrid Table" Approach (Incremental Refresh)

If you are on a Premium (or PPU) capacity, you can use Hybrid Tables.

  • How it works: You keep the last 2-3 months of data in Import mode (for blazing fast speed on recent data) and leave the rest of the 500M rows in a DirectQuery partition (archived). Note that the built-in incremental refresh setting covers the opposite split (latest data in DirectQuery, history imported); keeping the archive in DirectQuery instead typically means defining the partitions yourself over the XMLA endpoint.

  • The Benefit: To the user, it looks like one single seamless table. They can scroll back 10 years, and Power BI automatically switches the query method behind the scenes. They get speed where it matters (recent data) and access where it matters (history).

2. Fabric "Direct Lake" (The Modern Solution)

If your organization has enabled Microsoft Fabric, this is the silver bullet for your problem.

  • How it works: You load your 500M rows into a Delta Table in OneLake. You connect Power BI using Direct Lake mode.

  • The Benefit: It gives you the performance of Import mode without having to copy the data into memory. It can handle hundreds of millions of rows because it loads columns into memory on-demand from the OneLake storage.

Critical "User Management" Advice

Even with these tools, you must manage user expectations.

  • Users typically say "I want raw data," but they usually mean "I want to see the total, and then drill down."

  • Trying to render 500M individual points on a line chart will just show a solid block of color (and likely time out).

  • Ensure your report defaults to a high-level view (Year/Month) and only allows the "Table View" of raw records when the user has filtered down to a manageable subset (e.g., using the "Drillthrough" feature).

Start with Aggregations as suggested above; it is the most robust way to solve this today without requiring new licenses immediately.


If my response resolved your query, kindly mark it as the Accepted Solution to assist others. Additionally, I would be grateful for a 'Kudos' if you found my response helpful.
This response was assisted by AI for translation and formatting purposes.

grazitti_sapna
Super User

Hi @Soumeli , 

Handling 500 million rows in Power BI Desktop, with full historical data and no date filtering, needs a careful architecture. Power BI can do this, but not with the default Import mode.


What you can do:
DirectQuery + Aggregations

This pattern is designed exactly for your scenario.

How it works

  • Detailed historical data (500M rows) → stays in source DB

  • Aggregated data → stored in Power BI Import

  • Power BI automatically switches between them (see the sketch after this list)
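
A hedged sketch of the two tables behind that pattern, assuming a SQL source that already exposes a pre-aggregated daily view (all server, database, view, and column names below are placeholders):

// Detail table - kept in DirectQuery so the 500M rows stay in the database.
// Query "FactSales":
let
    Source = Sql.Database("your-server", "your-db"),
    Detail = Source{[Schema = "dbo", Item = "FactSales"]}[Data]
in
    Detail

// Aggregate table - imported from a pre-aggregated view in the source.
// Query "FactSalesAgg":
let
    Source = Sql.Database("your-server", "your-db"),
    Agg = Source{[Schema = "dbo", Item = "vw_FactSalesDaily"]}[Data]
in
    Agg

The storage modes (DirectQuery for the detail, Import for the aggregate, Dual for shared dimensions) and the column mappings that let visuals hit the aggregate automatically are then set in Model view under Manage aggregations, not in M.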

or 

Push Data to a Semantic Layer

If you control the backend:

Best sources for this scale

  • Azure SQL / SQL Server

  • Snowflake

  • Databricks SQL Warehouse

  • Fabric Lakehouse / Warehouse

Use:

  • Pre-aggregated views

  • Partitioned tables

  • Indexed columns

Power BI becomes only a visualization layer.

 

🌟 I hope this solution helps you unlock your Power BI potential! If you found it helpful, click 'Mark as Solution' to guide others toward the answers they need.

💡 Love the effort? Drop the kudos! Your appreciation fuels community spirit and innovation.

🎖 As a proud SuperUser and Microsoft Partner, we’re here to empower your data journey and the Power BI Community at large.


Let’s keep building smarter solutions together!


