I'm trying to write data from a Dataflow (not Gen2) to a SQL DB in Fabric. The query is set to refresh incrementally, but when I try to publish/refresh the query, I get the following error message:
Apply_all_login_data_inc_Incremental_Over_Periods: There was a problem refreshing the dataflow: 'Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: DataSource.Error: We can't update the value. The operation or its arguments aren't supported by the value.
When I view the refresh details, this is what is returned:
I'm at a loss as to where to even start troubleshooting. The query itself builds a table with a list of IDs, date/time values, and a pair of parameters that are used to make API calls. It looks something like this:
I then add the function as a custom column with the following step:
The query runs successfully, and I then configure incremental refresh to trigger off the Date column in the above table.
Next I add a default data destination for the query output, with the new data replacing existing data and the schema fixed, since this is what the settings are forced to be with incremental refreshes.
When I publish the dataflow, or try to refresh, I get the error messages above. The new table does get created in the destination Fabric SQL DB, which I expect since the all_incremental_data_Incremental_CreateDestination publishing step succeeds, but the other two publishing steps fail.
If I remove the data destination from the query results table in the DF, the DF then publishes/refreshes without any issues. In addition, as a troubleshooting step, I created another Fabric DB and pointed the query table to that new DB, still using the incremental refresh, to see if it was possibly an issue with my first DB getting corrupted, but the results were the same.
If anyone has any suggestions as to where I can start looking to diagnose and troubleshoot, that is greatly appreciated.
These resources are useful for understanding Data Lake and Data Lakehouse concepts: open-source material, based on the Linux operating system.
Overview - Spark 4.0.0 Documentation
Welcome to the Delta Lake documentation — Delta Lake Documentation
This is only possible in Notebooks, and you have to use Python. Python is the universal language in both OLTP (Online Transaction Processing) and OLAP (Online Analytical Processing) these days.
This is how it works (on the Linux operating system):
Python --> Data Lake (Apache Spark) --> Data Lakehouse (Delta Lake)
Currently, this is not supported (the WRITE method) using T-SQL (INSERT).
You should use incremental refresh via Dataflow Gen2, or alternatively use Python notebooks. You have to understand Data Lake, why it started around 2010 with Matei Zaharia, and why it has evolved into the Data Lakehouse. In this limited space, I cannot write all of it.
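As a minimal sketch of that flow, assuming a Fabric notebook attached to a Lakehouse where a `spark` session is already provided (all table, column, and watermark values below are hypothetical), the Delta Lake write looks like this:

```python
# Minimal sketch: append incremental rows to a Lakehouse Delta table from a
# Fabric notebook, where a `spark` session is already provided.
# All table, column, and watermark values below are hypothetical.
from pyspark.sql import functions as F

# Read rows that arrived since the last refresh (hypothetical staging table).
new_rows = (
    spark.read.table("staging_login_data")
    .where(F.col("Date") > F.lit("2025-01-01"))  # placeholder watermark
)

# Append them to the destination Delta table; Delta Lake provides the
# transactional write that the Gen1 dataflow destination cannot.
new_rows.write.format("delta").mode("append").saveAsTable("login_data_incremental")
```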
Hi @AnnOminous ,
I wanted to check if you had the opportunity to review the information provided. Please feel free to contact us if you have any further questions.
Thank you.
No. I just finished setting it up using DFG2, but I'm still getting the same error message.
Hi @AnnOminous ,
Thank you for reaching out to the Microsoft Fabric Community Forum.
To clarify, writing data from a Dataflow (non-Gen2) directly into a Fabric SQL DB (or any Fabric-native destination such as Lakehouse or Warehouse) is not supported. This limitation is what’s leading to the MashupException.Error during the publishing and refresh operations, particularly when incremental refresh and parameterized function logic are involved.
As correctly noted by @BhaveshPatel, to perform this kind of operation successfully, you'll need to migrate your solution to Dataflow Gen2, which is fully integrated with the Fabric ecosystem. Dataflow Gen2 supports writing to Fabric SQL DB, Lakehouse, and Warehouse while aligning with the unified data model and schema expectations required by incremental refresh scenarios.
Further recommended next steps are as follows:
Hope this helps. Please reach out for further assistance.
Thank you.
You cannot write from a Dataflow to Fabric SQL. For that, you need Dataflow Gen2. Also, you need Apache Spark with Delta Lake (Lakehouse).
Dataflow Gen2 = Fabric SQL = Fabric Data Warehouse = Semantic Model = Unified Data Structure
@BhaveshPatel, I tried setting it up using just DFG2 pointing to my Fabric SQL instance, but I'm still getting the same error message.
Can you provide a little more detail why I also need Spark with Delta Lake?
Hi @AnnOminous
You need to use Notebooks only (this is for advanced users). You also need to understand Data Lake and Delta Lake with Apache Spark. To answer the question: this is writing data back to Fabric SQL (INSERT). This is a script.
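As a minimal sketch, such a script could look like this, assuming a pyodbc connection to the database's T-SQL endpoint (the driver, server, database, table, and sample rows below are all hypothetical placeholders):

```python
# Sketch of a notebook script that writes rows back to a Fabric SQL database
# over its T-SQL endpoint with a plain INSERT. The driver, server, database,
# and table names are hypothetical placeholders.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<your-fabric-sql-endpoint>;"      # hypothetical endpoint
    "Database=<your-database>;"
    "Authentication=ActiveDirectoryInteractive;"
)

# Sample rows to insert as (Id, LoadDate); in practice these would come from
# the staging output of the incremental refresh.
rows = [(101, "2025-01-01T00:00:00"), (102, "2025-01-02T00:00:00")]

cur = conn.cursor()
cur.executemany(
    "INSERT INTO dbo.login_data (Id, LoadDate) VALUES (?, ?)",
    rows,
)
conn.commit()
cur.close()
conn.close()
```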
Cheers
Bhavesh Patel
@BhaveshPatel, Sorry, why can't I just write an INSERT command in T-SQL to write the data from the staging table to the "permanent" table once the incremental refresh has updated the staging table with the latest data?
@BhaveshPatel, got it, thank you again. If this is for advanced users then, pretty obviously, I'm just starting to work with it, so I don't see it as a feasible solution for me at this time.
Frustrating that MS is building a data management framework that doesn't support T-SQL for what seems to me to be a straightforward process, but I appreciate the information you provided; it helps me understand it a bit better.
One last request, if I may: since the amount of information required to understand this well can't be captured in this forum, are there any links you recommend to help me understand data lakes and Apache Spark better? I've done some searching but am still trying to find some good sites for learning these things.