karimm
Helper II

Incremental refresh on api considering last modified date

Hi all,

 

I'm creating reports based on JIRA data.

JIRA contains entities named "Issues" where each issue has fields of "Created" and "Updated" to identify when the issue was created and when it was last modified respectively.

 

The data is being fetched using REST APIs. And the refresh happens every 4 hours.

 

Considering that I have access to a workspace on Premium capacity, I was wondering whether I could use the Incremental Refresh  mechanism in order to allow for quicker refreshes and bigger datasets.

The idea is to first extract all entries from the last 2 years, for example, and subsequently extract only those issues that have been created or modified in the last 2 days.

 

Any issue can be modified at any point in time. When an issue created a long time ago is modified today, I want it to be retrieved in the refresh and to replace the respective row in the dataset.

 

I have read and watched so many blogposts/videos about this feature but I'm still perplexed on whether this can be done, and more specifically the part where "Updated" is in the last 2 days.

If yes, should I extract the Updated column from the API and filter it to be between RangeStart and RangeEnd, or should I filter on RangeStart and RangeEnd and then use the "Detect Changes" option with the "Updated" column?

 

Thank you

 

 

1 ACCEPTED SOLUTION
lbendlin
Super User

Incremental refresh expects immutable data. What you are looking for is differential refresh.

 

One workaround is to make the refresh scope large enough so you can shoehorn your data into the incremental refresh partition paradigm. Ask yourself what the average maximum period of time is that an issue change can still make a difference in your business process, and then set that as the change period for the incremental refresh (for example "last updated in the last three months").
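To make the "immutable data" point concrete, here is a minimal Python sketch of the idea. It is only a toy model and not Power BI's actual engine: the monthly partition key, the `refresh` helper, and the sample issue `PROJ-1` are all assumptions made for illustration. It shows that when rows are partitioned on "Updated" and only the recent partition is reloaded, the old copy of a modified issue stays behind in its original partition.

```python
from datetime import datetime

def partition_key(ts):
    """Hypothetical monthly partition key, e.g. '2022-03'."""
    return ts.strftime("%Y-%m")

def refresh(partitions, source, keys_to_refresh):
    """Reload only the named partitions from the source; others are left untouched."""
    for key in keys_to_refresh:
        partitions[key] = [
            dict(row) for row in source if partition_key(row["updated"]) == key
        ]

def all_rows(partitions):
    """Flatten every partition into the list of rows the model would expose."""
    return [row for part in partitions.values() for row in part]

# Initial load: one issue (made-up sample) last updated in March 2022.
source = [{"key": "PROJ-1", "status": "Open", "updated": datetime(2022, 3, 10)}]
partitions = {}
refresh(partitions, source, ["2022-03"])

# The issue is modified in May 2024, and only the recent partition is refreshed.
source[0].update(status="Done", updated=datetime(2024, 5, 20))
refresh(partitions, source, ["2024-05"])
narrow = [(r["key"], r["status"]) for r in all_rows(partitions)]
print(narrow)  # [('PROJ-1', 'Open'), ('PROJ-1', 'Done')]

# Widening the refresh scope to include the old partition removes the stale copy.
refresh(partitions, source, ["2022-03", "2024-05"])
wide = [(r["key"], r["status"]) for r in all_rows(partitions)]
print(wide)  # [('PROJ-1', 'Done')]
```

The second refresh is the workaround described above: the stale row only disappears once the partition that originally held it falls inside the refresh scope.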


7 REPLIES 7
PBILover
Helper V

@karimm: Did you get any solution on this?

Sofim86
Frequent Visitor

Hi all! Could anyone share with me how to set up incremental refresh on the Jira API? As far as I know, you can't set it on non-foldable queries. I really appreciate your help!

as far as I know, you can't set it on nonfoldable queries. 

That is inaccurate.  Folding is desired but not mandatory.  All you need is a DateTime or Date Integer value that can be compared against RangeStart and RangeEnd.
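As a rough illustration of that comparison, here is a small Python sketch (the real thing would be written in Power Query, so treat this only as the logic). The function names are hypothetical. It assumes a half-open window, inclusive at the start and exclusive at the end, so that a row sitting exactly on a boundary belongs to one partition only. The second helper assumes you push the window into the Jira request as a JQL clause using the "yyyy-MM-dd HH:mm" date format; check how your Jira instance handles time zones before relying on it.

```python
from datetime import datetime

def in_window(updated, range_start, range_end):
    """Half-open comparison: range_start <= updated < range_end."""
    return range_start <= updated < range_end

def jql_for_window(range_start, range_end):
    """Build a JQL clause for the window (assumed 'yyyy-MM-dd HH:mm' format)."""
    fmt = "%Y-%m-%d %H:%M"
    return (
        f'updated >= "{range_start.strftime(fmt)}" '
        f'AND updated < "{range_end.strftime(fmt)}"'
    )

# Example window standing in for RangeStart and RangeEnd.
start = datetime(2024, 5, 1)
end = datetime(2024, 5, 3)

print(in_window(datetime(2024, 5, 1), start, end))  # True: start is inclusive
print(in_window(datetime(2024, 5, 3), start, end))  # False: end is exclusive
print(jql_for_window(start, end))
```

When the query does not fold, filtering inside the request like this is what keeps each partition refresh from downloading the whole issue history first.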

AnonymousPerson
Advocate V

@karimm I landed here with the EXACT same issue with Jira and Updated dates. What did you end up doing?

 

All I can think of is to import my [Jira Issues] table using incremental refresh for the past 2 days, realizing that there will be duplicate Issue Keys. So my table is basically a heap (instead of having Issue Key as the key), with old obsolete records.

 

Then, I would create a calculated {Issues} table in DAX using some fancy logic to get the row with the most recent updated timestamp per Issue Key.  That would be the table used in reports/visuals.
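The "most recent row per Issue Key" logic could look roughly like the Python sketch below. This is not DAX, just the idea, and the field names `key` and `updated` and the sample rows are made up for illustration.

```python
from datetime import datetime

def latest_per_key(rows):
    """Keep, for each issue key, only the row with the newest 'updated' value."""
    latest = {}
    for row in rows:
        current = latest.get(row["key"])
        if current is None or row["updated"] > current["updated"]:
            latest[row["key"]] = row
    return list(latest.values())

# Heap table with an obsolete copy of PROJ-1 (made-up sample data).
heap = [
    {"key": "PROJ-1", "status": "Open", "updated": datetime(2022, 3, 10)},
    {"key": "PROJ-2", "status": "Open", "updated": datetime(2024, 5, 19)},
    {"key": "PROJ-1", "status": "Done", "updated": datetime(2024, 5, 20)},
]

issues = latest_per_key(heap)
print([(r["key"], r["status"]) for r in issues])
# [('PROJ-1', 'Done'), ('PROJ-2', 'Open')]
```

Note that this only hides the obsolete rows from the reports. They are still stored in the heap table, so the dataset keeps growing with every modification.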

 

Another hacky solution I thought about would be to use a Power BI datamart to just pull the data into a SQL format, which would make it a bit easier (at least for me) to model for uniqueness while still getting 5-second refreshes.

The only "solution"  is to monitor partition "truthiness"  and periodically do full refreshes of these partitions if they become too corrupted.


Thank you so much for your reply. Much appreciated.

Unfortunately, an issue might have been created even 2 years ago and get updated today. This happens mainly because of the many automations doing bulk updates.

So using the workaround you suggested will still lead to big partitions, something that I wanted to avoid.

I hoped the Power BI incremental refresh mechanism would allow me to avoid the need for a more sophisticated DWH. Seems it can't be done in this case 🙂

 

Thank you for your help. I've learnt something important!

 

P.S. I don't remember seeing the assumption of "immutable data" in the documentation. I guess it should be mentioned more explicitly...
