adamlob
Advocate I

Hash Function for Row Compare

Hi,

 

I'm working on a Data Pipeline that loads data into a Dataverse table. I do a row compare to detect changes between loads, so I am only loading rows that have changed.

 

Is there any way to hash the concatenated columns of each row? At the moment it seems I can only build the concatenation as plain text and then convert it to Binary. Hashing would help save space.

1 ACCEPTED SOLUTION
AntoineW
Super User

Hi @adamlob,

 

In Dataflow Gen2 / Fabric Data Pipeline (Data Factory), there’s no native “Hash” transformation yet.

You can use a Notebook to do it:

 

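from pyspark.sql.functions import sha2, concat_ws

# Join every column with "|" and store the SHA-256 hex digest as row_hash.
# Note: concat_ws skips NULL values.
df_hashed = df.withColumn(
    "row_hash",
    sha2(concat_ws("|", *df.columns), 256)
)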
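One caveat with the snippet above: `concat_ws` skips NULL values, so two rows that differ only in which column is NULL can end up with the same concatenated string and therefore the same hash. The sketch below reproduces that behaviour in plain Python with `hashlib` (not Spark) so it can be run anywhere. The function names and the `<NULL>` sentinel are assumptions made for illustration.

```python
import hashlib

SEP = "|"

def hash_skipping_nulls(row):
    # Mimics concat_ws: None values are dropped before joining.
    joined = SEP.join(str(v) for v in row if v is not None)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def hash_null_safe(row, null_token="<NULL>"):
    # Replace None with a sentinel so every column keeps its position.
    joined = SEP.join(null_token if v is None else str(v) for v in row)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

row_a = ("a", None, "b")
row_b = ("a", "b", None)

print(hash_skipping_nulls(row_a) == hash_skipping_nulls(row_b))  # True: collision
print(hash_null_safe(row_a) == hash_null_safe(row_b))            # False
```

In PySpark, one way to get the null-safe behaviour would be to wrap each column as `coalesce(col(c).cast("string"), lit("<NULL>"))` before passing it to `concat_ws`. Pick a sentinel that cannot appear in your real data.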

 

Then save the result back to your Lakehouse table and compare the hash between loads to detect changed rows.
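The comparison step itself can be sketched as follows. This is a minimal plain-Python illustration of the logic, not Fabric or Spark code. The function `rows_to_load`, the keys, and the short placeholder hashes are all hypothetical.

```python
def rows_to_load(previous_hashes, incoming_hashes):
    """Return keys that are new or whose hash differs from the previous load."""
    return sorted(
        key
        for key, row_hash in incoming_hashes.items()
        if previous_hashes.get(key) != row_hash
    )

# Hypothetical business keys mapped to (shortened) row hashes.
previous = {"acct-1": "aaa", "acct-2": "bbb"}
incoming = {"acct-1": "aaa", "acct-2": "ccc", "acct-3": "ddd"}

print(rows_to_load(previous, incoming))  # ['acct-2', 'acct-3']
```

In a notebook the same idea would typically be a left join of the incoming DataFrame to the stored one on the business key, keeping rows where the stored hash is missing or different.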

Benefits:

  • Fast and scalable, since Spark computes the hash in parallel across the data,

  • Produces fixed-length SHA-256 hex strings (64 characters),

  • Easy to use as a comparison key.

Documentation:

https://spark.apache.org/docs/latest/api/sql/index.html#sha2

 

 

Hope this helps!

Best regards,

Antoine

