RenatoDM
Regular Visitor

Spark Data Lineage

I see that the OpenLineage libraries are included by default as built-in libraries in Spark. When a notebook reads from and writes to OneLake, does it emit lineage events automatically? According to Copilot it does, and lineage visualization in Purview is optional. Where are those events stored? I see a SparkLineage folder in OneLake, but it is always empty. I have not been able to find clear documentation on this topic. I would appreciate any comments. Thank you.

1 ACCEPTED SOLUTION
nilendraFabric
Super User

Hi @RenatoDM

The `SparkLineage` folder in OneLake is not populated by default. Its presence suggests compatibility with the OpenLineage standard, but nothing is written there without explicit configuration.

To emit granular OpenLineage events (e.g., column-level lineage), you must:
• Implement a SparkListener to intercept Spark execution plans.
• Configure diagnostic emitters to route logs to Azure Storage or Log Analytics.
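
As a rough sketch of the second step, the snippet below builds the Spark properties for one diagnostic emitter that targets Azure Storage. The property names follow the `spark.synapse.diagnostic.emitter.*` pattern from the Synapse diagnostic emitters article linked in this reply. The helper `build_emitter_conf`, the destination name `MyDestination1`, and the placeholder URI and key are hypothetical, and the exact property names and supported categories should be checked against the documentation for your Fabric runtime.

```python
def build_emitter_conf(name, uri, secret, categories=("Log", "EventLog", "Metrics")):
    """Return Spark properties for one Azure Storage diagnostic emitter.

    Assumption: property names and default categories follow the pattern in
    the Synapse diagnostic emitters documentation; verify them for Fabric.
    """
    prefix = f"spark.synapse.diagnostic.emitter.{name}"
    return {
        # Name of the emitter destination being declared.
        "spark.synapse.diagnostic.emitters": name,
        f"{prefix}.type": "AzureStorage",
        f"{prefix}.categories": ",".join(categories),
        f"{prefix}.uri": uri,
        f"{prefix}.auth": "AccessKey",
        f"{prefix}.secret": secret,
    }


# Placeholder values only; replace them with your own storage details.
conf = build_emitter_conf(
    name="MyDestination1",
    uri="https://<storage-account>.blob.core.windows.net/<container>/<folder>",
    secret="<storage-access-key>",
)

# Print one "key value" pair per line, ready to paste into a Spark config.
for key, value in sorted(conf.items()):
    print(f"{key} {value}")
```

Properties like these are normally applied before the Spark session starts (for example through the Spark properties of the environment or pool), so setting them from inside a running notebook may have no effect.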

 

 

Native Purview integration captures basic item-level lineage (e.g., notebook → Lakehouse table), but it does not populate `SparkLineage`.
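
To make the difference between item-level and column-level lineage concrete, here is a small sketch that reads one OpenLineage run event and prints both views. The event is a hand-written, hypothetical sample: the job, table, and column names are invented, and required fields such as `eventTime`, `run`, and `producer` are left out for brevity. Only the overall shape (`inputs`, `outputs`, and the `columnLineage` dataset facet with `fields` and `inputFields`) follows the OpenLineage specification. It does not show what Fabric or Purview actually emits.

```python
import json

# Hypothetical, trimmed sample event; all names are made up for illustration.
SAMPLE_EVENT = json.loads("""
{
  "eventType": "COMPLETE",
  "job": {"namespace": "demo", "name": "notebook_load_sales"},
  "inputs": [{"namespace": "onelake", "name": "raw_sales"}],
  "outputs": [{
    "namespace": "onelake",
    "name": "sales_clean",
    "facets": {
      "columnLineage": {
        "fields": {
          "amount_usd": {
            "inputFields": [
              {"namespace": "onelake", "name": "raw_sales", "field": "amount"},
              {"namespace": "onelake", "name": "raw_sales", "field": "fx_rate"}
            ]
          }
        }
      }
    }
  }]
}
""")


def item_level_edges(event):
    """Dataset-to-dataset edges: every input is linked to every output."""
    return [
        (src["name"], dst["name"])
        for src in event.get("inputs", [])
        for dst in event.get("outputs", [])
    ]


def column_level_edges(event):
    """Column-to-column edges read from each output's columnLineage facet."""
    edges = []
    for dst in event.get("outputs", []):
        fields = dst.get("facets", {}).get("columnLineage", {}).get("fields", {})
        for column, detail in fields.items():
            for src in detail.get("inputFields", []):
                edges.append((f"{src['name']}.{src['field']}", f"{dst['name']}.{column}"))
    return edges


print(item_level_edges(SAMPLE_EVENT))
print(column_level_edges(SAMPLE_EVENT))
```

The item-level view only says that one table feeds another, which is roughly the granularity described for the native Purview integration. The column-level view needs the extra facet, which is why it depends on a listener that inspects the Spark plan.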

 

https://learn.microsoft.com/en-us/azure/synapse-analytics/spark/azure-synapse-diagnostic-emitters-az...
