Hi,
I have a Gen1 dataflow in Power BI that sources data from Oracle through an on-premises data gateway. The dataflow has 5 tables. Three of them are configured with incremental refresh, and two of those three also have Detect data changes enabled. We triggered the first refresh, which should ideally complete in about 5 hrs (it has ~600M records), but it ran for more than 18 hrs, so we cancelled it.
We then triggered the refresh again. Ideally this should have been another full refresh, where Power BI loads all data into partitions and the refresh logs list every partition. But the second refresh completed in 3.5 hrs and, surprisingly, the refresh logs show only the latest partitions (i.e. only the partitions where data had changed). Does this mean the earlier refresh, which we cancelled after 18 hrs, had already loaded some data, and the next refresh reused it and ran only an incremental refresh?
I am confused by this behaviour. Can someone please explain it?
@vamshikrishna20, could you please respond to the Super User's follow-up question so we can make further progress on your request?
Thanks,
prashanth
Hi @vamshikrishna20 ,
Power BI incremental refresh only re-processes partitions where data has changed, or those that failed in a previous run. When you cancel a refresh after it has already loaded some partitions, Power BI doesn't roll everything back. It picks up from the last successful partition next time and reloads only the partitions that need it.
So after a cancelled refresh, your next refresh is typically incremental and much faster, which is why you only saw the latest partitions in the logs. This is normal behavior: Power BI is designed to save time and resources by not reloading data it has already ingested.
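To make the selection logic described above concrete, here is a toy model in Python. It is only a sketch of the behaviour as explained in this reply, not Power BI's actual implementation. The `Partition` class, its fields, and `partitions_to_refresh` are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Partition:
    """Hypothetical stand-in for one incremental refresh partition."""
    name: str
    loaded: bool                # True if an earlier run committed this partition
    stored_max: Optional[str]   # change-detection value recorded at the last load
    source_max: Optional[str]   # current change-detection value in the source


def partitions_to_refresh(partitions: List[Partition]) -> List[str]:
    """Return partitions that were never loaded or whose data changed."""
    return [
        p.name
        for p in partitions
        if not p.loaded or p.stored_max != p.source_max
    ]


# Made-up state after a cancelled run: older partitions were already
# committed, and only the newest one changed in the source.
parts = [
    Partition("2023", True, "2023-12-31", "2023-12-31"),
    Partition("2024", True, "2024-12-31", "2024-12-31"),
    Partition("2025", True, "2025-06-01", "2025-07-10"),
]
print(partitions_to_refresh(parts))
```

Under these assumptions only the 2025 partition is selected, which would match a short second refresh whose log lists only the latest partitions.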
If you ever need a true full refresh (reload all partitions from scratch), you’d have to change the incremental refresh policy or delete and re-create the dataflow. But for most cases, letting Power BI handle it this way is fine.
For more insight, check the refresh history and the CSV logs to see exactly which partitions loaded and how many rows were written.
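If the CSV log is large, a short script can total the rows written per partition. The following is a minimal sketch in Python. The column names `Table`, `Partition`, `Status` and `Rows`, and the sample data, are assumptions made up for illustration, so adjust them to the headers in your downloaded log.

```python
import csv
import io
from collections import defaultdict


def rows_per_partition(csv_text, table_col="Table",
                       partition_col="Partition", rows_col="Rows"):
    """Sum the row counts in a refresh log per (table, partition) pair.

    The column names are assumptions; pass the real headers from your log.
    """
    totals = defaultdict(int)
    for record in csv.DictReader(io.StringIO(csv_text)):
        raw = (record.get(rows_col) or "").strip()
        # Non-numeric values such as "NA" are counted as 0 rows.
        rows = int(raw) if raw.isdigit() else 0
        totals[(record[table_col], record[partition_col])] += rows
    return dict(totals)


# Made-up sample data, not taken from a real refresh log.
SAMPLE_LOG = """Table,Partition,Status,Rows
Orders,2025-06,Completed,120000
Orders,2025-07,Completed,98000
Customers,NA,Failed,NA
"""

for key, rows in rows_per_partition(SAMPLE_LOG).items():
    print(key, rows)
```

For a real log, read the file contents first (for example with `open(path, newline="").read()`) and pass the text to `rows_per_partition`.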
@lbendlin We manually cancelled the refresh. The refresh log for that run shows all tables as failed and NA for all other fields.
When you ingest the dataflow into a semantic model, can you see which partitions are actually filled?
"and we cancelled the refresh"
Not exactly. You requested a cancellation. The refresh may have advanced too far to be cancelled. Look at the resulting CSV log; it should have information about the partitions and the rows per partition.