
Aleksandra1984
New Member

Power BI: How to Efficiently Analyze 10 Years of Sales Data from PostgreSQL?

Good morning!
I would like to ask for your help. I’m a beginner data analyst. In our company (which operates in the retail sector), we want to implement Power BI analyses and reports. We have a local database (PostgreSQL) that we connect to via ODBC. Currently, our database contains a sales data table with information from the past 10 years. Around 20,000 new records (rows) are added every day.

I’m wondering how to load large datasets into Power BI in a way that allows us to analyze, for example, data from the last 2–3 years. How is this typically done in practice?

I understand that setting up incremental refresh in Power BI would be an efficient solution, but I’m not sure how to do it in practice. Do I first need to load the entire sales data table from the last 10 years into Power Query? Is it even possible to load such a large table into Power Query? I assume there’s a more practical approach. I’ve also read that ODBC doesn’t always handle query folding correctly...

I would appreciate any guidance or suggestions.

2 ACCEPTED SOLUTIONS
DataNinja777
Super User

Hi @Aleksandra1984 ,

 

You're correct that loading 10 years of data directly into Power BI Desktop is not a practical approach. The standard and most efficient method is incremental refresh: you work with only a small sample of data in Power BI Desktop to build your report, then configure the Power BI Service to handle the full historical load and the subsequent daily updates. This keeps your development environment fast and responsive.

 

Your first step is to define a filterable date range within the Power Query Editor. You achieve this by creating two required, case-sensitive parameters. Create a new parameter named RangeStart with a Date/Time type and set its value to a recent date, like the first of last month. Then, create a second Date/Time parameter named RangeEnd and set its value to the end of that same month. These parameters will define the small sample of data you work with locally.
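For reference, each parameter is its own query, and the M that Power BI Desktop generates looks roughly like this in the Advanced Editor (the dates below are placeholders; pick any recent month):

// RangeStart – must use exactly this name and the Date/Time type
#datetime(2025, 8, 1, 0, 0, 0) meta [IsParameterQuery = true, Type = "DateTime", IsParameterQueryRequired = true]

// RangeEnd – acts as an exclusive upper bound in the filter step below
#datetime(2025, 9, 1, 0, 0, 0) meta [IsParameterQuery = true, Type = "DateTime", IsParameterQueryRequired = true]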

 

With the parameters created, you'll apply a filter to your sales table's primary date column (e.g., OrderDate). Use the "Custom Filter" option to set up a rule where the date column is after or equal to the RangeStart parameter and is before the RangeEnd parameter. Using "is before" for the end date is crucial to prevent data duplication. After applying this filter, only that small, one-month slice of data will be loaded into your Power BI Desktop file when you click "Close & Apply," allowing you to build your model and visuals quickly.
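As a sketch, the full query might then look like this in the Advanced Editor (the server, database, schema, and table names are made up; OrderDate is the example column from above):

let
    Source = PostgreSQL.Database("db-host", "retail"),            // hypothetical server and database
    Sales = Source{[Schema = "public", Item = "sales"]}[Data],    // hypothetical schema and table
    // Half-open interval: >= RangeStart but strictly < RangeEnd, so a row can
    // never fall into two adjacent refresh partitions.
    FilteredRows = Table.SelectRows(Sales, each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd)
in
    FilteredRows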

 

Once your report design is complete, you can configure the incremental refresh policy itself. Back in the main Power BI Desktop view, right-click your sales table and select "Incremental refresh." Here, you'll define the rules for the Power BI Service. You can set it to archive data starting 10 years before the refresh date, which handles your full historical load. Then, you'll configure it to incrementally refresh data from a much shorter, more recent period, such as the last 7 days, to efficiently capture your daily new records.

 

After publishing your report to a Power BI Premium workspace, the initial data refresh will be a long, one-time operation as the service pulls in all 10 years of historical data. However, every scheduled refresh after that will be significantly faster, as it only needs to process and append the last 7 days of data you specified.

You also correctly pointed out a potential issue regarding query folding and ODBC. Query folding is the process by which Power Query translates your filter steps into a native SQL query that is executed directly by your PostgreSQL database. This is vital because it means the database does the heavy lifting of filtering the data before anything is sent to Power BI. A generic ODBC connection can sometimes fail to fold queries correctly, so it's highly recommended to use the native PostgreSQL connector built into Power BI instead. You can verify that folding is working by right-clicking your filter step in Power Query: if the "View Native Query" option is available, you're in good shape; if it's grayed out, folding has most likely broken and performance will suffer dramatically.
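For illustration, the difference is in the very first step of the query (the DSN, host, and database names here are placeholders):

// Generic ODBC – query folding support is limited and can silently break:
Source = Odbc.DataSource("dsn=PostgresDSN", [HierarchicalNavigation = true])

// Native PostgreSQL connector – basic steps such as the RangeStart/RangeEnd filter fold to the server:
Source = PostgreSQL.Database("db-host", "retail")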

 

Finally, for optimal performance, remember to model your data using a star schema rather than a single flat table. This means having a central fact table for sales connected to dimension tables for things like products, customers, and dates. Always include a dedicated date dimension table and mark it as such in your model to enable powerful time intelligence calculations in DAX. Your chosen path of using Import mode combined with incremental refresh is the ideal solution for this scenario, providing the best possible analysis performance.
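If you prefer to build the date dimension in Power Query rather than DAX, a minimal sketch looks like this (the start date is arbitrary; a DAX CALENDAR or CALENDARAUTO table works just as well):

let
    StartDate = #date(2015, 1, 1),
    EndDate = Date.From(DateTime.LocalNow()),
    // One row per day from StartDate through today
    Dates = List.Dates(StartDate, Duration.Days(EndDate - StartDate) + 1, #duration(1, 0, 0, 0)),
    AsTable = Table.FromList(Dates, Splitter.SplitByNothing(), {"Date"}),
    Typed = Table.TransformColumnTypes(AsTable, {{"Date", type date}}),
    WithYear = Table.AddColumn(Typed, "Year", each Date.Year([Date]), Int64.Type),
    WithMonth = Table.AddColumn(WithYear, "Month", each Date.Month([Date]), Int64.Type)
in
    WithMonth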

Best regards,


v-dineshya
Community Support

Hi @Aleksandra1984 ,

Thank you for reaching out to the Microsoft Community Forum.

 

Hi @DataNinja777 , Thank you for your prompt response.

 

Hi @Aleksandra1984 , in addition to @DataNinja777's response, I am adding some more points.

 

Please try the following:

 

1. Filter at the source: in Power Query, use a SQL statement to bring in only the last 3 years of data. Please refer to the sample SQL query below.

 

-- Rolling three-year window instead of a hard-coded start date
SELECT * FROM sales WHERE Sales_date >= CURRENT_DATE - INTERVAL '3 years';
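If you run such a statement from Power Query, a minimal sketch looks like this (the host and database names are made up); on connectors that support it, EnableFolding = true lets steps added after the native query still fold:

let
    Source = PostgreSQL.Database("db-host", "retail"),   // hypothetical server and database
    Recent = Value.NativeQuery(
        Source,
        "SELECT * FROM sales WHERE Sales_date >= CURRENT_DATE - INTERVAL '3 years'",
        null,
        [EnableFolding = true]   // later transformations can still fold into this query
    )
in
    Recent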

 

2. Check that Sales_date and the other columns you filter on are indexed. Consider creating materialized views or partitioned tables for better performance.

 

3. Store data older than 3 years in aggregated form, and keep detailed transactions only for the last 3 years.

Note: Always filter by date to reduce data volume.

 

Please refer to the official Microsoft documentation:

Incremental refresh for semantic models in Power BI - Power BI | Microsoft Learn

 

I hope this information helps. Please do let us know if you have any further queries.

 

Regards,

Dinesh


6 REPLIES
Shahid12523
Community Champion

Don’t load all 10 years of sales → keep only last 2–3 years in detail, aggregate older data.

Use PostgreSQL connector (not ODBC) so query folding works.

Set up Incremental Refresh with RangeStart/RangeEnd → Power BI only refreshes recent partitions; older partitions are left untouched.

If data is huge, consider a hybrid model (Import recent + DirectQuery old) or pre-aggregate old data in PostgreSQL.

 

 

Best approach: PostgreSQL connector + Incremental Refresh.

Shahed Shaikh

Hi @Aleksandra1984 ,

We haven't heard back from you since the last response and wanted to check whether you have found a resolution yet. If you have any further queries, do let us know.

 

Regards,

Dinesh


