Hi,
I have a daily pipeline in Microsoft Fabric where:
# read (Fabric Spark connector; these imports are required for synapsesql)
import com.microsoft.spark.fabric
from com.microsoft.spark.fabric.Constants import Constants

df_table = (spark.read
    .option(Constants.WorkspaceId, bronze_workspaceId)
    .synapsesql("lake_bronze.table_brz")
)

# transform
.....

# save to the Silver lakehouse
df_table.write.mode("overwrite").option("mergeSchema", "true").saveAsTable("lake_silver.schema.table_slv")

# read Silver back through the SQL endpoint and write to the Gold warehouse
(spark.read.synapsesql("lake_silver.schema.table_slv")
    .write.mode("overwrite")
    .option(Constants.WorkspaceId, gold_workspaceId)
    .synapsesql("warehouse_gold.dbo.table_gld"))

Both notebooks are triggered sequentially from a main notebook.
Problem:
Sometimes records are missing in Gold.
Questions:
Any guidance or examples would be appreciated.
Hi @tayloramy, @Gpop13,
I’m using synapsesql because my architecture relies on separate workspaces for Bronze, Silver, and Gold layers, all managed through Git CI/CD. This approach makes it easier to handle cross-workspace reads and writes without relying on shortcuts or workspace-specific artifacts, which are not version-controlled and could cause inconsistencies during branch merges.
Another reason is that the Gold layer is a Warehouse, and Fabric notebooks can only attach Lakehouses, not Warehouses. Therefore, synapsesql is the only way to write data from Spark into the Gold Warehouse.
That said, I might be missing a better approach. If there’s a recommended pattern for cross-workspace Lakehouse → Warehouse writes from a notebook that avoids snapshot latency and still works well with CI/CD, I’d love to hear your suggestions.
Hi @tyro_ploter,
Personally, I just attach the lakehouse from the other workspace to the notebook. I've never had an issue with it.
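For reference, here is a minimal sketch of what that read can look like; the workspace name in the path is a placeholder for your setup, not anything from this thread:

# Read the Silver Delta table directly over OneLake, bypassing the SQL endpoint.
# "SilverWorkspace" is a placeholder workspace name.
silver_path = ("abfss://SilverWorkspace@onelake.dfs.fabric.microsoft.com/"
               "lake_silver.Lakehouse/Tables/schema/table_slv")
df_silver = spark.read.format("delta").load(silver_path)

# Or, once the lakehouse is attached to the notebook, simply:
df_silver = spark.table("lake_silver.schema.table_slv")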
If you must use the SQL endpoint, use the API to force a refresh first, then wait for 5 minutes.
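If you do go the refresh route, something like this can be run from a notebook. This is a sketch only: the refreshMetadata route is the preview REST API as I understand it, so verify the exact path and your SQL endpoint ID against the current Fabric docs. silver_workspace_id and silver_sql_endpoint_id are placeholders.

# Hedged sketch: ask Fabric to sync the Silver SQL analytics endpoint's metadata
# before reading through it.
import sempy.fabric as fabric

client = fabric.FabricRestClient()
response = client.post(
    f"/v1/workspaces/{silver_workspace_id}/sqlEndpoints/{silver_sql_endpoint_id}"
    "/refreshMetadata?preview=true",
    json={},
)
response.raise_for_status()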
If you found this helpful, consider giving some Kudos. If I answered your question or solved your problem, mark this post as the solution.
Hi @tayloramy,
Thank you for the suggestion!
In my case, the main reason is that Gold is a Warehouse, and Fabric notebooks can only attach Lakehouses, not Warehouses. I need to write a DataFrame into a Warehouse in another workspace, and as far as I know, synapsesql is the supported way to do this.
I’ll consider your idea of forcing a refresh and adding a wait before the read.
Hi @tyro_ploter,
@Gpop13 is correct: read the data from the lakehouse using Spark (spark.table) and then write it to the warehouse using synapsesql. It's the read operation that's causing you grief right now.
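As a sketch using the table names from your post (assuming the Silver lakehouse is attached to the notebook, and keeping your existing Constants import):

# Read straight from the Delta table: no SQL endpoint, no sync lag.
df_silver = spark.table("lake_silver.schema.table_slv")

# Write to the Gold warehouse exactly as you do today.
(df_silver.write
    .mode("overwrite")
    .option(Constants.WorkspaceId, gold_workspaceId)
    .synapsesql("warehouse_gold.dbo.table_gld"))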
If you found this helpful, consider giving some Kudos. If I answered your question or solved your problem, mark this post as the solution.
Hi @tyro_ploter, the suggestion is only to see whether synapsesql can be avoided when reading from the lakehouse; I believe you can still use it to write to the warehouse. Because synapsesql reads through the SQL endpoint, that is likely why you are facing this issue.
Hi @tyro_ploter,
@Gpop13 is asking the right questions.
spark.read.synapsesql(...) reads through the SQL analytics endpoint, which can lag behind the underlying Delta tables while its metadata syncs.
I recommend reading directly from the delta table using spark.table(...).
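If you want to confirm that endpoint lag is the culprit, a quick diagnostic right after the Silver write is to compare both read paths; this is a sketch using the names from your pipeline:

# Count via the Delta table vs. via the SQL endpoint, immediately after the write.
delta_count = spark.table("lake_silver.schema.table_slv").count()
endpoint_count = spark.read.synapsesql("lake_silver.schema.table_slv").count()
print(f"delta: {delta_count}, sql endpoint: {endpoint_count}")
# A lower endpoint count means the endpoint has not yet synced the latest write.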
If you found this helpful, consider giving some Kudos. If I answered your question or solved your problem, mark this post as the solution.
Hi @tyro_ploter,
What is the reason behind using synapsesql to read? I presume Silver is a lakehouse.
Can we not use spark.read.table("lake_silver.schema.table_slv") and then write it to the Gold warehouse?