Hi all,
I'm creating reports based on JIRA data.
JIRA contains entities named "Issues" where each issue has fields of "Created" and "Updated" to identify when the issue was created and when it was last modified respectively.
The data is fetched using REST APIs, and the refresh happens every 4 hours.
Considering that I have access to a workspace on Premium capacity, I was wondering whether I could use the Incremental Refresh mechanism in order to allow for quicker refreshes and bigger datasets.
The idea is to first extract all entries from the last 2 years, for example, and subsequently extract only those issues that have been created or modified in the last 2 days.
Any issue can be modified at any point in time. When an issue created a long time ago is modified today, I want it to be retrieved in the refresh and to replace the respective row in the dataset.
I have read and watched many blog posts and videos about this feature, but I'm still unsure whether this can be done, specifically the part where "Updated" is in the last 2 days.
If yes, should I extract the Updated column from the API and filter it to be between RangeStart and RangeEnd, or should I filter on RangeStart and RangeEnd and then use the "Detect Changes" option with the "Updated" column?
Thank you
Incremental refresh expects immutable data. What you are looking for is differential refresh.
One workaround is to make the refresh scope large enough so you can shoehorn your data into the incremental refresh partition paradigm. Ask yourself what the longest typical period is during which an issue change can still make a difference in your business process, and then set that as the change period for the incremental refresh (for example, "last updated in the last three months").
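To make the "immutable data" point concrete, here is a small Python simulation (not Power BI code). It assumes the table is partitioned by day on the Updated value and that each refresh reloads only the partitions inside the refresh window. The function name `refresh_partitions`, the issue key `A-1`, and the row layout are made up for illustration.

```python
from datetime import date

def refresh_partitions(model, source, refresh_days):
    """Reload only the given day partitions, bucketing source rows by their Updated date."""
    for day in refresh_days:
        model[day] = [row for row in source if row["updated"] == day]
    return model

# Initial load: issue A-1 was last updated two years ago.
source = [{"key": "A-1", "updated": date(2022, 9, 1), "status": "Open"}]
model = refresh_partitions({}, source, [date(2022, 9, 1)])

# The issue is modified today, so the source now holds only the new version.
source = [{"key": "A-1", "updated": date(2024, 9, 1), "status": "Done"}]
# Incremental refresh touches only the recent partition.
model = refresh_partitions(model, source, [date(2024, 9, 1)])

rows = [row for partition in model.values() for row in partition]
statuses = sorted(row["status"] for row in rows)
print(statuses)  # ['Done', 'Open']: the stale copy survives in the old partition
```

The old partition is never reloaded, so the model ends up with two versions of the same issue. Widening the refresh window only helps for changes that fall inside it.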
Hi all! Could anyone share how to set up incremental refresh on the Jira API? As far as I know, you can't set it on nonfoldable queries. I really appreciate your help!
as far as I know, you can't set it on nonfoldable queries.
That is inaccurate. Folding is desired but not mandatory. All you need is a DateTime or Date Integer value that can be compared against RangeStart and RangeEnd.
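As a rough illustration of that comparison, written in Python rather than Power Query M: the sketch below treats the window as half-open (start included, end excluded) so a row never lands in two adjacent partitions, and also builds a JQL clause so the filter can be pushed to Jira instead of applied after download. The function names are hypothetical, and the `yyyy-MM-dd HH:mm` date format is my understanding of what JQL accepts, so check it against your Jira instance (including its time zone handling).

```python
from datetime import datetime

def in_window(updated, range_start, range_end):
    """True when Updated falls in the half-open window [range_start, range_end)."""
    return range_start <= updated < range_end

def jql_for_window(range_start, range_end):
    """Build a JQL clause restricting issues to the same window (minute resolution)."""
    fmt = "%Y-%m-%d %H:%M"
    return (
        f'updated >= "{range_start.strftime(fmt)}" '
        f'AND updated < "{range_end.strftime(fmt)}"'
    )

range_start = datetime(2024, 9, 1)
range_end = datetime(2024, 9, 3)

print(in_window(datetime(2024, 9, 2, 14, 30), range_start, range_end))  # True
print(in_window(range_end, range_start, range_end))                     # False
print(jql_for_window(range_start, range_end))
# updated >= "2024-09-01 00:00" AND updated < "2024-09-03 00:00"
```

In Power BI the same two boundaries would come from the RangeStart and RangeEnd parameters that the service fills in for each partition.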
@karimm I landed here with the EXACT same issue with Jira and Updated dates. What did you end up doing?
All I can think of is to import my [Jira Issues] table using incremental refresh for the past 2 days, realizing that there will be duplicate Issue Keys. So my table is basically a heap (instead of having Issue Key as the key), with old, obsolete records.
Then, I would create a calculated {Issues} table in DAX using some fancy logic to get the row with the most recent updated timestamp per Issue Key. That would be the table used in reports/visuals.
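The "most recent row per Issue Key" logic is sketched below in Python rather than DAX, just to pin down what the calculated table would need to return. The column names `key`, `updated`, and `status` and the sample rows are assumptions for illustration.

```python
from datetime import datetime

def latest_per_key(rows):
    """Keep, for each Issue Key, only the row with the newest Updated timestamp."""
    latest = {}
    for row in rows:
        current = latest.get(row["key"])
        if current is None or row["updated"] > current["updated"]:
            latest[row["key"]] = row
    return list(latest.values())

# A heap with an obsolete copy of A-1 left over from an earlier refresh.
heap = [
    {"key": "A-1", "updated": datetime(2022, 9, 1, 8, 0), "status": "Open"},
    {"key": "A-2", "updated": datetime(2024, 8, 30, 9, 0), "status": "In Progress"},
    {"key": "A-1", "updated": datetime(2024, 9, 1, 10, 0), "status": "Done"},
]

issues = latest_per_key(heap)
print(sorted((row["key"], row["status"]) for row in issues))
# [('A-1', 'Done'), ('A-2', 'In Progress')]
```

Note that this only hides the obsolete rows from reports. They still sit in the imported table and count toward its size.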
Another hacky solution I thought about would be to use a Power BI datamart to just pull the data into a SQL format, which would make it a bit easier (at least for me) to model for uniqueness while still getting 5-second refreshes.
The only "solution" is to monitor partition "truthiness" and periodically do full refreshes of these partitions if they become too corrupted.
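One possible way to put a number on that "truthiness", sketched in Python under my own assumptions: treat a row as stale when a newer row with the same Issue Key exists anywhere in the table, compute the stale share per partition, and flag partitions above a threshold for a full reload. The `partition` labels, the function names, and the 20% threshold are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def stale_share_by_partition(rows):
    """Return, per partition, the fraction of rows superseded by a newer row with the same key."""
    newest = {}
    for row in rows:
        if row["key"] not in newest or row["updated"] > newest[row["key"]]:
            newest[row["key"]] = row["updated"]
    total = defaultdict(int)
    stale = defaultdict(int)
    for row in rows:
        total[row["partition"]] += 1
        if row["updated"] < newest[row["key"]]:
            stale[row["partition"]] += 1
    return {partition: stale[partition] / total[partition] for partition in total}

def partitions_to_reload(rows, threshold=0.2):
    """List partitions whose stale share exceeds the (assumed) threshold."""
    shares = stale_share_by_partition(rows)
    return sorted(p for p, share in shares.items() if share > threshold)

rows = [
    {"partition": "2022-09", "key": "A-1", "updated": datetime(2022, 9, 1)},
    {"partition": "2022-09", "key": "A-2", "updated": datetime(2022, 9, 5)},
    {"partition": "2024-09", "key": "A-1", "updated": datetime(2024, 9, 1)},
]

print(stale_share_by_partition(rows))  # {'2022-09': 0.5, '2024-09': 0.0}
print(partitions_to_reload(rows))      # ['2022-09']
```

Reloading a flagged partition with the same Updated filter would drop its stale rows, because the source no longer returns those issues for that date range.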
Thank you so much for your reply. Much appreciated.
Unfortunately, an issue might have been created 2 years ago and still get updated today. This happens mainly because many automations do bulk updates.
So the workaround you suggested would still lead to big partitions, which is something I wanted to avoid.
I hoped the Power BI incremental refresh mechanism would let me avoid the need for a more sophisticated DWH. It seems it can't be done in this case 🙂
Thank you for your help. I've learnt something important!
P.S. I don't remember seeing the assumption of "immutable data" in the documentation. I guess it should be mentioned more explicitly...