Hello,
Unfortunately, we are encountering more and more issues with Fabric. The latest is a Gen2 Dataflow that has now been running for 7 hours, while it usually takes about 1 hour to complete. I understand there is no cancel option as of now, so how can we stop the Dataflow and restart it?
Also, when will the option to cancel refreshes be released?
@JayJay11 can you paste the request ID and/or dataflow ID for this? I want to take a look at why the refresh got stuck.
@pqian_MSFT - Just curious, do you know when the cancel refresh option will be available for DFG2? The release plan says Q4 2023, but the page hasn't been updated since 10/5/2023.
Cancellation is coming soon: it has been tested and is rolling out in the March release.
Here you go:
Thank you!!
Thanks @JayJay11, the difference between the last fast run and this run seems to be in the number of OData requests. The fast run issued only 4 requests; the slow run is still running and issuing thousands of OData requests.
I'll need to look into these requests to see what they are. Meanwhile, if you can share the query definition, that would be helpful too.
Actually, I am not able to see any history on this dataflow indicating that there were 1-hour runs before. Was this perhaps another dataflow? Can you share the dataflow/request IDs of the 1-hour run?
Here is the Gen1 Dataflow that has been running quite smoothly for weeks: a3571cca-34a2-4f0b-8f38-4c53d77378fd
And here is the query:
let
Source = OData.Feed("https://myREDACTED-api.s4hana.cloud.sap/sap/opu/odata/sap/API_GLACCOUNTLINEITEM", null, [Implementation="2.0"]),
GLAccountLineItem_table = Source{[Name="GLAccountLineItem",Signature="table"]}[Data],
#"Removed Other Columns" = Table.SelectColumns(GLAccountLineItem_table, {"CompanyCode", "AccountingDocument", "Ledger", "GLRecordType", "ControllingArea", "ChartOfAccounts", "GLAccount", "BusinessTransactionType", "CostCenter", "ProfitCenter", "FunctionalArea", "Segment", "PartnerCompany", "GlobalCurrency", "AmountInGlobalCurrency", "BaseUnit", "Quantity", "DebitCreditCode", "FiscalPeriod", "FiscalYearVariant", "FiscalYearPeriod", "PostingDate", "DocumentDate", "AccountingDocumentType", "PostingKey", "LastChangeDateTime", "OriginObjectType", "GLAccountType", "DocumentItemText", "Product", "Supplier", "Customer", "OffsettingAccount", "WBSElementInternalID", "ProjectInternalID"}),
#"Filter Ledger = 0L" = Table.SelectRows(#"Removed Other Columns", each ([Ledger] = "0L")),
#"Changed column type" = Table.TransformColumnTypes(#"Filter Ledger = 0L", {{"PostingDate", type date}, {"DocumentDate", type date}, {"LastChangeDateTime", type date}})
in
#"Changed column type"
Again, thank you for checking this. Curious to see the differences between the Gen1 and Gen2 Dataflow ...
Ah, I had missed that the working dataflow is Gen1. Do you know if it's running on Premium capacity or shared?
I'll look into the long running time for Gen 2. At a glance it seems to be network bound (specifically, the number of requests to the OData endpoint).
It is running on a shared capacity workspace with Premium Per User license.
I looked into a few runs of your Gen 1 and Gen 2 refreshes. The main difference is the time spent in OData/GetResponse. There are thousands of these OData requests, and they average 0.5s each on the Gen 1 nodes.
However, on the Gen 2 nodes, they average around 3.79s each.
We see the network throughput on Gen 2 node at around 100kbps, and the entire refresh is network IO bound.
Both the Gen 1 and Gen 2 nodes are from our Germany data center, and so there shouldn't be any latency differences.
Is the SAP server perhaps throttling the Gen 2 refresh for whatever reason?
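As a rough sanity check, the per-request averages quoted above can be compared with the reported run times. The sketch below is only back-of-the-envelope arithmetic: it assumes the requests are issued one after another, that request time dominates the refresh, and that both runs issue the same number of requests. None of that is confirmed in this thread, and the function names are made up for illustration.

```python
# Figures quoted in this thread.
GEN1_SECONDS_PER_REQUEST = 0.5   # average OData/GetResponse time on Gen 1 nodes
GEN2_SECONDS_PER_REQUEST = 3.79  # average OData/GetResponse time on Gen 2 nodes
GEN1_RUN_HOURS = 1.0             # "about 1 hour" for the fast run


def slowdown_factor(fast_seconds: float, slow_seconds: float) -> float:
    """How many times slower each request is on the slow nodes."""
    return slow_seconds / fast_seconds


def implied_request_count(run_hours: float, seconds_per_request: float) -> int:
    """Request count implied by a run time, ASSUMING serial requests dominate it."""
    return round(run_hours * 3600 / seconds_per_request)


def projected_run_hours(request_count: int, seconds_per_request: float) -> float:
    """Run time for the same request count at a different per-request latency."""
    return request_count * seconds_per_request / 3600


factor = slowdown_factor(GEN1_SECONDS_PER_REQUEST, GEN2_SECONDS_PER_REQUEST)
requests = implied_request_count(GEN1_RUN_HOURS, GEN1_SECONDS_PER_REQUEST)
gen2_hours = projected_run_hours(requests, GEN2_SECONDS_PER_REQUEST)

print(f"slowdown per request: {factor:.2f}x")
print(f"implied request count: {requests}")
print(f"projected Gen 2 run time: {gen2_hours:.2f} h")
```

Under those assumptions the per-request slowdown alone would stretch a 1-hour refresh to roughly 7.5 hours, which is in the same range as the 7 to 8 hours reported for the Gen2 run.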
Thank you for checking. The Gen1 and Gen2 dataflows have the exact same queries. Only the Gen2 Dataflow has destinations in a Lakehouse. They both use the exact same data sources in the same Gateway.
I just tried re-running one of the Gen1 flows, no issues. However, at this time I cannot even load the smallest table anymore to a Lakehouse using a Gen2 flow with the same queries, see here:
I re-created the Gen2 flows many times already, tried a different Lakehouse, installed the newest version of the Data Gateway .. nothing is helping. I don't know what we can do at this point...
This refresh just happened, right? I'll need to wait for the telemetry to look at it.
Meanwhile, neither of your SAP OData dataflows above (Gen1/Gen2) is refreshing on a gateway. They are both running on our cloud nodes. Did you mean to use a Gateway? If so, make sure the _connection_ itself has a gateway, not just the Options dialog.
Ok, this newest error from dataflow 51ff6412-e905-4ab9-a5b5-daab3f976d3d is indeed coming from the GW. And because of that, I don't get to see any logs or details (it's on the GW machine itself).
Can you drill into the entity refresh history to see what's going on?
If there isn't anything useful, you need to review Gateway logs with additional logging enabled:
Follow this guide: Monitor and optimize on-premises data gateway performance | Microsoft Learn
You can send the additional logs to pqian at microsoft dot com.
When I drill into one of them, I see the following message:
999999 Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: The container exited unexpectedly with code 0x000000FF (original German: "Der Container wurde unerwartet mit Code 0x000000FF beendet"). PID: 9992. Details: GatewayObjectId: 0675d568-7f5a-494f-8900-497c15e6ad9b
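When matching a message like this against gateway logs or Windows Event logs, it can help to pull out the exit code, process ID, and gateway object ID. The following is a minimal sketch that relies only on the fields visible in the message above; the helper name is hypothetical, and the patterns are an assumption about the message layout, not a documented format.

```python
import re


def parse_mashup_error(message: str) -> dict:
    """Extract exit code, PID and gateway object ID; missing fields become None."""
    code = re.search(r"0x[0-9A-Fa-f]+", message)
    pid = re.search(r"PID:\s*(\d+)", message)
    gateway = re.search(r"GatewayObjectId:\s*([0-9a-fA-F-]{36})", message)
    return {
        "exit_code": int(code.group(0), 16) if code else None,
        "pid": int(pid.group(1)) if pid else None,
        "gateway_object_id": gateway.group(1) if gateway else None,
    }


# Language-independent tail of the message quoted in this thread.
sample = (
    "Code 0x000000FF. PID: 9992. Details: "
    "GatewayObjectId: 0675d568-7f5a-494f-8900-497c15e6ad9b"
)
parsed = parse_mashup_error(sample)
print(parsed)
```

The patterns avoid the localized part of the text, so they should behave the same on the German and English wording.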
Actually, we just updated the Gateway to the newest version. Since then, we have even more problems. What is the currently supported Gateway version from Fabric perspective? (so we can downgrade again..)
The latest gateway should always be the one Fabric recommends. This does sound like an installation issue. The 0x000000FF code is not one that I recognize; it seems something is interfering with mashup containers in your GW environment.
Definitely pull the GW logs to see if there are additional errors, and also look into the Windows Event logs.
Worst-case scenario: pave the machine and do a fresh installation of the GW.
Hello @JayJay11 ,
We haven't heard from you since the last response and are just checking back to see if you have a resolution yet.
If you have a resolution, please share it with the community, as it can be helpful to others.
Otherwise, please respond with more details and we will try to help.
Hi @Anonymous, no, there is no solution, as a dataflow currently cannot be stopped and I don't know why the flow even runs for 8 hours. It simply seems that Gen2 Dataflows are unstable at the moment.
Hello @JayJay11 ,
Apologies for the issue you are facing.
If it's a bug, we would definitely like to know about it and address it properly.
Please go ahead and raise a support ticket to reach our support team: Link
After creating the support ticket, please provide the ticket number, as it will help us track down more information.
Thank you.
As of now, the UI feature to cancel a refresh is not available, but it can be done via the API:
https://fabric.guru/cancelling-dataflow-gen2-refresh
Timeline for the cancel feature:
https://learn.microsoft.com/en-us/fabric/release-plan/data-factory#cancel-refresh
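For readers who want a feel for what an API-based cancel might look like, here is a hedged sketch. It is not taken from the linked post, which may use a different method. It builds (but does not send) a call to the Power BI REST API's "Cancel Dataflow Transaction" endpoint; whether that endpoint works for a Gen2 Dataflow is an assumption here. The workspace ID, transaction ID, and token below are placeholders, and the function name is hypothetical.

```python
import urllib.request

POWER_BI_API = "https://api.powerbi.com/v1.0/myorg"


def build_cancel_request(group_id: str, transaction_id: str, access_token: str) -> urllib.request.Request:
    """Build a POST request that asks the service to cancel a dataflow transaction.

    ASSUMPTION: the documented Power BI 'Cancel Dataflow Transaction' endpoint
    is applicable to the refresh in question; this is not confirmed for Gen2.
    """
    url = f"{POWER_BI_API}/groups/{group_id}/dataflows/transactions/{transaction_id}/cancel"
    return urllib.request.Request(
        url,
        data=b"",  # the call carries no request body
        method="POST",
        headers={"Authorization": f"Bearer {access_token}"},
    )


# Placeholder values for illustration only.
request = build_cancel_request(
    group_id="00000000-0000-0000-0000-000000000000",
    transaction_id="11111111-1111-1111-1111-111111111111",
    access_token="<token>",
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

The transaction ID would have to come from the dataflow's refresh history or a transactions listing, and the token needs permission on the workspace.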