adamlob
Advocate I

Hash Function for Row Compare

Hi,

 

I'm working on a Data Pipeline that loads data into a Dataverse table. I do a row compare to detect changes between loads, so I am only loading rows that have changed.

 

Is there any way to hash the concatenated values of each row? At the moment it seems I can only build the concatenation as plain text and then convert it to Binary. Hashing would help save space.

1 ACCEPTED SOLUTION
AntoineW
Super User

Hi @adamlob,

 

In Dataflow Gen2 / Fabric Data Pipeline (Data Factory), there’s no native “Hash” transformation yet.

You can use a Notebook to do that:

 

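from pyspark.sql.functions import sha2, concat_ws

# df is an existing DataFrame. Join all of its columns with "|" and hash the result with SHA-256.
# Note: concat_ws skips NULL values rather than writing a placeholder.
df_hashed = df.withColumn(
    "row_hash",
    sha2(concat_ws("|", *df.columns), 256),
)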
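
One caveat worth checking: `concat_ws` skips NULL values, so two rows whose NULLs sit in different columns can join to the same string and get the same hash. The plain-Python sketch below mirrors the hashing logic for a single row so the effect can be seen without a Spark session. The `row_hash` helper and the `<NULL>` marker are illustrative assumptions, not part of Spark, and Python's `str()` will not format every type the way a Spark string cast does. In PySpark, a comparable fix would be to wrap each column as `coalesce(col(c).cast("string"), lit("<NULL>"))` before passing it to `concat_ws`.

```python
import hashlib

NULL_MARKER = "<NULL>"  # hypothetical placeholder; pick one that cannot occur in real data


def row_hash(values, sep="|", skip_nulls=False):
    """Return the SHA-256 hex digest of a row's values joined by `sep`.

    skip_nulls=True imitates concat_ws, which drops NULLs entirely;
    skip_nulls=False substitutes NULL_MARKER so column positions stay distinct.
    """
    if skip_nulls:
        parts = [str(v) for v in values if v is not None]
    else:
        parts = [NULL_MARKER if v is None else str(v) for v in values]
    return hashlib.sha256(sep.join(parts).encode("utf-8")).hexdigest()


row_a = ("42", None, "x")
row_b = ("42", "x", None)

# Dropping NULLs makes both rows join to "42|x", so the hashes collide.
collides = row_hash(row_a, skip_nulls=True) == row_hash(row_b, skip_nulls=True)

# With a marker the joined strings differ ("42|<NULL>|x" vs "42|x|<NULL>").
distinct = row_hash(row_a) != row_hash(row_b)

print(collides, distinct)  # True True
```

A value that itself contains the separator can cause a similar collision, so it may be worth choosing a separator that is unlikely to appear in the data.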

 

Then save it back to your Lakehouse table and use that hash for change-detection.
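
The change-detection step can be sketched as a comparison between the hashes stored from the previous load and the hashes of the current load. This is plain Python over dictionaries so it runs anywhere; the `changed_keys` helper, the key names, and the sample hash values are made up for illustration. In a notebook the same comparison would more likely be a join between two DataFrames on the key column.

```python
def changed_keys(previous_hashes, current_hashes):
    """Return keys that are new or whose hash differs from the previous load."""
    return sorted(
        key
        for key, digest in current_hashes.items()
        if previous_hashes.get(key) != digest
    )


# key -> row_hash, as stored after the previous load and as computed now
previous = {"A1": "hash-1", "A2": "hash-2", "A3": "hash-3"}
current = {"A1": "hash-1", "A2": "hash-2b", "A4": "hash-4"}

print(changed_keys(previous, current))  # ['A2', 'A4']
```

Only A2 (changed) and A4 (new) would be loaded. A3 is missing from the current load, and this sketch does not treat that as a change, so deletions would need separate handling.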

Benefits:

  • Very fast and scalable,

  • Produces fixed-length SHA-256 strings (64 hexadecimal characters),

  • Easy to use as a comparison key.
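
On the space question: the hex string encodes a 32-byte digest, so if the target column is binary, storing the raw bytes halves the size. The short check below uses Python's `hashlib` on an arbitrary sample string (an assumption for illustration); in Spark, `unhex` from `pyspark.sql.functions` should give the same conversion from the hex string to binary.

```python
import hashlib

# Arbitrary sample input standing in for one concatenated row.
digest_hex = hashlib.sha256("42|<NULL>|x".encode("utf-8")).hexdigest()
digest_raw = bytes.fromhex(digest_hex)  # same digest as raw bytes

print(len(digest_hex))  # 64
print(len(digest_raw))  # 32
```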

Doc:

https://spark.apache.org/docs/latest/api/sql/index.html#sha2

 

 

Hope this helps!

Best regards,

Antoine

