ROCKYDO12
Helper III

Incremental Refresh and CSV Files

Hey,

 

I have a shared folder set up which contains two folders: one holds historical data and the other holds data that gets updated daily. The CSV files have a one-to-one relationship between the historical folder and the daily-updated folder. I am trying to combine each historical CSV file with its daily-updated counterpart. I have done this successfully by adding the folder path and transforming the data so that I keep only the two files with the one-to-one relationship, then combining them.

 

For the next step, I create the RangeStart and RangeEnd date parameters using the date field in my dataset. Then I right-click the table, configure the incremental refresh settings, and publish the report.
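For context, the filter step described above typically looks something like this in Power Query (a minimal sketch; the column name Date and the query name CombinedData are placeholders, not names from the actual report):

```powerquery
// Incremental refresh filter sketch. RangeStart and RangeEnd must be
// DateTime parameters, and the comparison must be inclusive on only one
// side (>= and <) so no row can land in two partitions.
FilteredRows = Table.SelectRows(
    CombinedData,
    each [Date] >= RangeStart and [Date] < RangeEnd
)
```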

 

Once I publish the report, it won't let me refresh and states "Scheduled refresh is disabled because at least one data source is missing credentials". I also read somewhere that I have to establish a gateway connection, but I don't have any. Can anyone please help?

 

Thanks

8 ACCEPTED SOLUTIONS

20 REPLIES
lbendlin
Super User

Here's an article to get you going.

https://community.powerbi.com/t5/Community-Blog/Incremental-Refresh-with-Slow-Changing-Source-Data/b...

 

If you place your CSV files on a SharePoint site, then you don't need a gateway.

Hi @lbendlin ,

 

I am trying to do incremental refresh based on file name (.csv files) from SharePoint. I've tried using your link but it didn't work. It's a SharePoint folder where we have all the files in this format: "monthly_cost_hourly_2021-01", "monthly_cost_hourly_2021-02", etc. I want to retain the last two years of .csv files but have the refresh run only for the current month. Do you have anything handy based on my requirement?

Regards,

Laiq

Please open a new post if you have a different question.  In your case you need to convert the file name 

 

monthly_cost_hourly_2021-01

 

into a DateTime value like

 

2021-01-01T00:00:00.000

 

Which you can then use as a filter for RangeStart and RangeEnd.
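A minimal sketch of that conversion as Power Query steps (the Text.Middle offsets assume the exact "monthly_cost_hourly_YYYY-MM" naming pattern; adjust them for your real file names, and "Files" stands in for your file-listing query):

```powerquery
// Derive a DateTime from file names like "monthly_cost_hourly_2021-01.csv".
// Text.Middle([Name], 20, 7) extracts "2021-01" for this exact naming pattern,
// and appending "-01" pins it to the first day of the month.
AddFileDate = Table.AddColumn(
    Files, "FileDate",
    each DateTime.From(Date.FromText(Text.Middle([Name], 20, 7) & "-01")),
    type datetime
),
// The derived column can then drive the incremental refresh filter:
Filtered = Table.SelectRows(
    AddFileDate,
    each [FileDate] >= RangeStart and [FileDate] < RangeEnd
)
```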

If I connect to a shared folder on my local network, will the gateway work? Or does the gateway need a local database connection? Would I need SQL Server set up locally to hold the data from the shared folder in order for the gateway to work?

Thanks for sharing the link. I took a quick look through and this will help me a lot! What is the benefit of putting the files on SharePoint? Is there any way around this, or will a gateway work with CSV files?

A gateway will work if needed, but it's a lot of extra effort - the gateway needs to stay up 24x7, needs to be updated every month, and is very finicky if you have proxies, etc. If you can avoid that, your solution will be much easier.

Amazing, thank you! So one more question... Is there a way I can connect my shared folder (multiple CSV files included) in SharePoint, or would I have to add each file? I have a data export updating these CSV files on the shared drive. If these files get updated on the shared drive, would they do the same on SharePoint? Any good links?

Check the code sample in the blog post - it enumerates all CSV files in a particular location and then decides which ones to load based on some sort of datestamp - could be last modified date, or a date that is part of the file name.

 

(Note: Incremental Refresh requires DateTime values so you need to do a little bit of casting)
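A sketch of that pattern for a SharePoint folder (the site URL is a placeholder; "Date modified" is the column the SharePoint file listing provides, and you would add your own header promotion and column typing after the combine):

```powerquery
let
    // List every file on the site, then narrow to CSVs.
    Source = SharePoint.Files("https://yourtenant.sharepoint.com/sites/YourSite", [ApiVersion = 15]),
    CsvOnly = Table.SelectRows(Source, each Text.EndsWith([Name], ".csv")),
    // Decide which files to load using the datestamp - here, last modified date,
    // which is already a DateTime, so it can filter against the parameters directly.
    InWindow = Table.SelectRows(
        CsvOnly,
        each [Date modified] >= RangeStart and [Date modified] < RangeEnd
    ),
    // Only now parse and combine the surviving files.
    Combined = Table.Combine(List.Transform(InWindow[Content], each Csv.Document(_)))
in
    Combined
```

Filtering the file listing before expanding the Content column means files outside the refresh window are never downloaded or parsed, which is what makes this efficient with folder sources.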

Okay, thank you. Do I need to split the CSV files into months or quarters in order to do incremental refresh, or will it work with just one CSV file?

That is entirely up to you.  It depends on the size of the CSV files - try not to go over 500-ish MB per file.  It also depends on the "reach"  of changes you expect for older data. In our case we can see transactions change not only for the current quarter but also for prior quarters, so our buffer for the incremental refresh is bigger.  If your data is immutable (doesn't change after submission) then you can use a much tighter interval.

 

NOTE: Incremental Refresh is not the same as selective refresh.  If you want selective refresh you can look into refreshing individual partitions in your dataset.

 

https://docs.microsoft.com/en-us/power-bi/connect-data/incremental-refresh-xmla

Okay, good to know, thanks. Essentially, we just want to incrementally refresh any changes to the old data while loading in the new data. The purpose would be to avoid a full refresh when we can refresh only the changed old data and detect any new data. Would this be a selective refresh, then? In the past, using different software, I would use a Record ID to perform this refresh rather than a date/datetime field, so this is all new to me.

"Essentially, we just want to incrementally refresh any changes to the old data, while loading in the new data. "

 

That is exactly what incremental refresh cannot do for you.  This would be a selective or differential refresh.  As I mentioned you can fudge it by enforcing a wide enough flush-and-fill window.  For example instead of 

 

"keep last five years and refresh last 7 days"

 

you would specify

 

"keep last five years and refresh last three months" 

 

with the assumption that your data changes all happen for that window.  If you have a risk of older data changing then you need to make the window wider, or do an occasional full refresh or refresh impacted partitions. 

 

There is that "detect data changes"  option but I haven't found it to be useful for this scenario yet as it requires a different column - which we mostly don't have.

Okay, amazing, this saves me so much time trying to test and figure this out. Which column would I need for detect data changes? I have a "Last Changed On" field in my dataset.

If you have that field and are confident that it does what it says then yes, you can specify it in the extra setting. That is supposed to instruct the Power BI service to refresh all partitions that fall into that range.  You will still want to validate that it does this correctly, and you don't end up with duplicate, missing or un-updated data.

 

This article is worth reading: https://docs.microsoft.com/en-us/power-bi/admin/service-premium-incremental-refresh#override-increme...

 

I assume you know how to check partition refreshes with SSMS or DAX Studio?

I do not, any link or explanation would be very helpful.

I am using Power BI Pro, and it states that only datasets on Premium capacities support using tools like SQL Server Management Studio (SSMS) to view and selectively refresh individual partitions. Power BI Pro datasets, on the other hand, do not provide tool access through the XMLA endpoint, so troubleshooting incremental refresh issues may require a little more trial and error. Could you shed some light on how I could perform this?

Sorry, I don't have experience beyond Premium Capacity, and no sandbox available to test other setups. Maybe someone else can chime in.

No problem, thank you. Could I purchase just one Premium license to develop the report and its incremental refresh so I can use this method of testing, while the rest of the users have Pro licenses? In other words: can I publish and set up incremental refresh using Premium, and then have Pro users use these reports?

Premium is a capacity, not a license.  The smallest capacity is a P1.

 

What you can look into is PPU - Premium Per User. That might work in your scenario.
