eisaac
Advocate I

Data Pipeline Scheduled Run reports Success but doesn't modify Warehouse

I've set up a pipeline that pulls data from a lakehouse (with shortcuts to tables in another workspace) into a data warehouse via two stored procedures.

It succeeds when I run the pipeline manually, or when I schedule it for the near future (5 to 15 minutes from the current time). However, every overnight run reports success and runs for the expected duration, but does not seem to actually modify the data warehouse (I'm checking by comparing the max timestamp in the source tables against the destination tables).

Has anyone else observed anything similar, or does anyone know an appropriate workaround for the time being? I will file a bug as soon as I find the appropriate place to do so (I'm relatively new to Fabric and these forums).

More info: I've tried having the data pipeline use scripts instead of stored procedures, adding a step that runs the same timestamp check in the warehouse (in case it was a sync issue and my test queries were forcing an endpoint refresh), and deleting and recreating the pipeline. I've also tried increasing the time between the ingest in the source workspace and the pipeline, but saw the same behavior (the gap was originally 2 hours; I increased it to 4 hours to test whether that was the issue).
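
For anyone who wants to automate the same check, here is a minimal sketch of the max-timestamp comparison described above. It assumes the two MAX(timestamp) values have already been fetched from the source and destination tables by whatever query tool you use; the function name, the `tolerance` parameter, and the sample values are hypothetical and not part of the original pipeline.

```python
from datetime import datetime, timedelta
from typing import Optional


def destination_is_stale(
    source_max: Optional[datetime],
    destination_max: Optional[datetime],
    tolerance: timedelta = timedelta(0),
) -> bool:
    """Return True when the destination lags the source by more than `tolerance`."""
    if source_max is None:
        # Empty source: nothing to load, so the destination cannot be behind.
        return False
    if destination_max is None:
        # Source has rows but the destination has none.
        return True
    return source_max - destination_max > tolerance


# Hypothetical values standing in for MAX(timestamp) from each side.
source_max = datetime(2024, 5, 2, 1, 0)
destination_max = datetime(2024, 5, 1, 1, 0)
print(destination_is_stale(source_max, destination_max))  # prints True
```

A non-zero `tolerance` may be useful if the source keeps receiving rows while the pipeline is running, so a small lag is not reported as a failed load.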

1 ACCEPTED SOLUTION
eisaac
Advocate I

Support got back to me quickly; we found a workaround that involves running the pipeline twice (the second time, it succeeds). It sounds like I'm not the only person who has observed this bug, and the product team is working on a fix.


14 REPLIES
renevanduren
New Member

Hello,

I am seeing the same behaviour with data updates in a lakehouse when running an orchestration data pipeline.

Running it twice is a workaround, but it comes at a (capacity) cost. Has anyone found a better solution in the meantime, or is this still an issue?

Hi, I found a solution in a separate post.

The issue is that the lakehouse SQL endpoint can take a while to update. The solution there includes a Python script that manually refreshes the SQL endpoint. I added this notebook to the pipeline after every lakehouse update and have not had any more issues.

you can find the solution in the following link: 

https://community.fabric.microsoft.com/t5/Dataflow/DataFlow-Gen2-runs-successfully-but-data-is-missi...

Anonymous
Not applicable

I think I have exactly the same problem. I am pulling from a SharePoint folder into a lakehouse, then using a stored proc to copy into a staging table. The pipeline runs successfully each day, but no data gets copied to the staging table. If I run it a second time it seems to work ok. I have thought it was a timing issue - i.e. when the stored proc runs it is somehow pulling the old data - so I am trying a wait between the two tasks. But very frustrating as logically it should work!

I am afraid it's been a long time since logic left the building.  😉  The product on the backend is too complex and there's a long string of bugs plaguing its use.  That being said, does this happen with a Trial license or a paid SKU?  When I was using Trial, I noticed there were lots of sync delays across artifacts, and one needs to give changes time to propagate.  But on a paid SKU, I'd think that for the money, Microsoft would make the effort to provide a feedback loop that is at least as close as possible to real-time.  But who knows.

If I were you, I'd open a support ticket.  That's what I do when I run into such issues. I came across so many bugs that I ended up opening more than two dozen tickets over about three months.  So go forth, Padawan, and may the Force be with you!  😊

Anonymous
Not applicable

Thanks for responding (and the sympathy!) 

I put an arbitrary 60-second delay between the tasks and it worked fine this morning 🙂 So maybe it is a timing issue. I have thought about some kind of check function, but there could be a danger of infinite loops and a nice cash generator for MS!
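
On the infinite-loop worry: one way to get a check function without that risk is to cap the number of attempts. Below is a minimal sketch, assuming you can supply your own freshness check as a callable (for example, one that compares max timestamps). The names `wait_until_fresh`, `is_fresh`, `max_attempts`, and `delay_seconds` are hypothetical, and nothing here calls a Fabric API.

```python
import time
from typing import Callable


def wait_until_fresh(
    is_fresh: Callable[[], bool],
    max_attempts: int = 10,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `is_fresh` up to `max_attempts` times; return True once it passes."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        if is_fresh():
            return True
        if attempt < max_attempts:
            # Wait only between attempts, never after the final one.
            sleep(delay_seconds)
    # Give up instead of looping forever; the caller decides what to do next.
    return False


# Simulated check that reports stale twice, then fresh (no real delay here).
results = iter([False, False, True])
print(wait_until_fresh(lambda: next(results), max_attempts=5, delay_seconds=0))  # prints True
```

Because the loop is bounded, the worst case is `max_attempts` checks plus the waits between them, so the capacity cost stays predictable. When the function returns False, failing the step is probably safer than letting the stored procedure run against old data.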

Hi, I am experiencing the same issue. Could you please give more details about the solution? How do you run it twice? Also, does this mean that we will use more CUs (and be charged more)?

By "run it twice", I mean that I scheduled the entire pipeline to run twice daily, 15 minutes apart. I have no idea why, but the changes are reflected when I do this. Unfortunately, a delay didn't work for the issue I encountered: several hours later the changes were still not reflected. In the UI where the pipeline is scheduled, there should be a '+' to add another time (which is how I'm running it twice). This was the solution we came up with after working with Microsoft's Technical Support on the issue.

The issue is that the inserts/updates you make in Fabric don't immediately get reflected in your SELECT statements. The solution for me and for others was to add a 1-minute "settle-in" delay after inserting into your tables, before you start querying them or running SPs that rely on the updated data.

On what type of SKU do you need to do this?

Apparently, any. We run ours on F32, but I've seen the same behaviour on Trial (which is F64)

Element115
Power Participant
Power Participant

Are you using any pipeline variables to filter the rows you want to ingest from the source?  And if so, do you use only one such variable but have two different Set variable activities modifying it in conjunction with a conditional IF activity?

No variables used, no IF activity; the procedure is actually runnable as a Script, but I put it as a Procedure because that is where the other developers I am working with would think to look for it.

Dronec
Advocate II
Advocate II

Looks like a bug to me. I usually log support tickets for Fabric here: https://admin.powerplatform.microsoft.com/
