Hi,
Just to start off, I'm not keen on doing physical deletes, but we need to carry out ad-hoc deletions of individual records after they have been loaded into a dataflow, and I'm looking for a workaround.
Currently I have an incremental load that runs to keep a dataflow in sync with a Sales Opportunity table on SQL Server.
Occasionally, a record is physically deleted from this table and its Opportunity ID is written to a separate table for audit purposes.
What I've tried is to use the audit table in a left anti-join to filter out the deleted records in a Transform, and then use linked dataflows so I can load the audit table (daily), the Opportunity Staging (half-hourly), and then the Opportunity Transform after that.
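For reference, the anti-join step in my Transform looks roughly like the sketch below; the entity and column names (Opportunity_Staging, Audit_DeletedOpportunities, OpportunityID) are simplified stand-ins for our real ones:

```
let
    // Staging entity kept in sync with the SQL Server table
    Source = Opportunity_Staging,
    // Keep only rows whose OpportunityID does NOT appear in the audit table
    RemoveDeleted = Table.NestedJoin(
        Source, {"OpportunityID"},
        Audit_DeletedOpportunities, {"OpportunityID"},
        "AuditMatch", JoinKind.LeftAnti
    ),
    // The anti-join adds a nested table column we don't need, so drop it
    Result = Table.RemoveColumns(RemoveDeleted, {"AuditMatch"})
in
    Result
```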
So that the Opportunity Transform follows the Staging dataflow, I've set up a Power Automate flow to trigger the dataflow refreshes in sequence. All of this goes through our on-premises gateway.
If I refresh through the Power BI service it's fine, but through Power Automate it's hit and miss, and I often get the message "The backend request failed with error code '400'", which I'm struggling to diagnose.
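As I understand it, the Power Automate connector wraps the Power BI REST API, so the flow is issuing something roughly equivalent to the calls below (the workspace and dataflow IDs are placeholders). My working theory is that the '400' comes back when a refresh is requested while a previous transaction is still running, so checking the transaction status before triggering the next dataflow may be worth a try:

```
# Trigger a dataflow refresh
POST https://api.powerbi.com/v1.0/myorg/groups/{workspaceId}/dataflows/{dataflowId}/refreshes
Content-Type: application/json

{ "notifyOption": "MailOnFailure" }

# Check recent refresh transactions before kicking off the next dataflow
GET https://api.powerbi.com/v1.0/myorg/groups/{workspaceId}/dataflows/{dataflowId}/transactions
```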
I wondered if there's a better way of removing individual records from dataflows, as the only way to be sure at the moment is to manually re-load the historic data, which is not ideal. We've found dataflows useful for reporting, as they give us much better control than direct access to SQL, so we'd like to keep the cloud model. Just wondering if other tools such as Data Factory might give us more flexibility?
Unfortunately, I haven't yet been able to test this as a Fabric pipeline (instead of Power Automate) because I'm waiting for our gateway software to be updated.
Neither dataflows nor datasets offer change data capture (CDC) yet, and Microsoft have so far been unreceptive to our pleas. You need to flush and fill the entire partition.
Thanks, that's interesting to know. It wouldn't be quite as bad if it were possible to schedule a "full" load once in a while (e.g. overnight).
You have full control over partition refresh on the dataset side (it behaves just like OLAP cube partitions), but to my knowledge there is no such option for dataflows. To force a full load you have to make metadata changes (i.e. mess with the Power Query).
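On the dataset side, if your capacity exposes the XMLA endpoint, a full refresh of a single partition is just a TMSL script run against the workspace; a minimal sketch, where the database, table, and partition names are placeholders:

```
{
  "refresh": {
    "type": "full",
    "objects": [
      {
        "database": "SalesModel",
        "table": "Opportunity",
        "partition": "Opportunity-History"
      }
    ]
  }
}
```

You can execute that from SSMS connected to the XMLA endpoint, or script it from automation. Dataflows expose no equivalent, hence the metadata-change workaround.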
Thanks again. I remember seeing a "Guy in a Cube" video about refreshing dataset partitions, but unfortunately we're not at the stage where we're building new reports from datasets yet. Dataflows have proved very useful to us except for this issue. Thanks for confirming that you have to "tweak" the Power Query to force a full load, as that's been my experience so far. It's almost as if once the data is checked in, they don't want it to leave - aka "Hotel California" 😀
The only reason I can see for dataflows is to shield you (the developer) from slow data sources. If your data sources aren't slow, don't use dataflows.