I see that the OpenLineage libraries are included by default as built-in libraries in Spark. When a notebook reads from and writes to OneLake, does it emit lineage events automatically? According to Copilot it does, and lineage visualization in Purview is optional. Where are those events stored? I see a SparkLineage folder in OneLake, but it is always empty. I have not been able to find clear documentation on this topic. I would appreciate any comments. Thank you.
Hi @RenatoDM
The `SparkLineage` folder in OneLake is not populated by default. Its presence suggests compatibility with OpenLineage standards, but explicit configuration is required.
To emit granular OpenLineage events (e.g., column-level lineage), you must:
• Implement a SparkListener to intercept Spark execution plans.
• Configure diagnostic emitters to route logs to Azure Storage or Log Analytics.
Native Purview integration captures basic item-level lineage (e.g., notebook → Lakehouse table) but doesn’t populate `SparkLineage`.
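As a rough illustration of the two steps above, the sketch below builds the Spark properties one might set on an environment or session. It is only a sketch under assumptions: the listener class and `spark.openlineage.*` keys come from the OpenLineage Spark integration, the `spark.synapse.diagnostic.emitter.*` keys follow the Fabric/Synapse diagnostic emitter naming, and the emitter name, namespace, and storage URI are hypothetical placeholders. Authentication settings for the emitter are deliberately omitted, and this thread does not confirm that the built-in libraries in Fabric honor these properties.

```python
def lineage_spark_conf(emitter_name: str, storage_uri: str, namespace: str) -> dict[str, str]:
    """Build Spark properties that register the OpenLineage listener and
    declare an Azure Storage diagnostic emitter (auth settings omitted)."""
    prefix = f"spark.synapse.diagnostic.emitter.{emitter_name}"
    return {
        # Step 1: register the listener from the OpenLineage Spark integration.
        "spark.extraListeners": "io.openlineage.spark.agent.OpenLineageSparkListener",
        # Console transport: lineage events are written to the driver log.
        "spark.openlineage.transport.type": "console",
        # Hypothetical namespace used to group the emitted jobs.
        "spark.openlineage.namespace": namespace,
        # Step 2: declare a diagnostic emitter that ships logs to Azure Storage.
        "spark.synapse.diagnostic.emitters": emitter_name,
        f"{prefix}.type": "AzureStorage",
        f"{prefix}.categories": "DriverLog,EventLog",
        f"{prefix}.uri": storage_uri,
    }


# Placeholder values; replace them with your own names and storage location.
conf = lineage_spark_conf(
    emitter_name="LineageSink",
    storage_uri="https://<account>.blob.core.windows.net/<container>/<folder>",
    namespace="fabric-notebooks",
)

# Print one "key value" pair per line for copying into Spark properties.
for key, value in conf.items():
    print(f"{key} {value}")
```

With the console transport the lineage events end up in the driver log, so routing `DriverLog` through the emitter is what would carry them to storage in this setup. Switching `.type` to a Log Analytics destination would need its own connection settings, which are not shown here.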