sudhav
Helper V

Need data validation using a Notebook activity

Hi Team,

There is an incremental load scenario, which is working fine.

Input source: ADLS. Every day one file arrives in ADLS, and it is appended to a table in the lakehouse.

On the 1st day I have one file with 3 records. On the 2nd day another file arrives in ADLS, again with 3 records, so there are 6 records in total in ADLS at the end of the 2nd day.

I need the same number of records in the LH target table at the end of the 2nd day.

So I need to count the number of records in the source and the sink. I am able to do this with Lookup activities, but I need this data validation in a Notebook activity. I am able to write the PySpark query as well.

 

code for SourceCount in notebook
spark.conf.set("fs.azure.account.auth.type.sudhadls.dfs.core.windows.net","SAS TOKEN")
access_path = "wasbs://input@sudhadls.blob.core.windows.net/"
df1 = spark.read.csv(access_path + "*.csv", header="True")
 
df1.count()
 
code for SinkCount in notebook
df2 = spark.sql("SELECT * FROM LH_Demo.Test")
df2.count()
 
Now I need to check whether df1.count() from the source side and df2.count() from the sink side are equal or not.
 
if df1.count() == df2.count():
    print("Record count validation passed.")
else:
    print("Record count validation failed")
 
This is the code I am using in the notebook.
Now my question: if I put this notebook in a Notebook activity, it publishes fine. But in my scenario, if I get "Record count validation failed", I need to trigger a mail using the Outlook activity. So in the Outlook activity I need to use the Notebook activity output.
 
I have done the same scenario using Lookup activities, but I need it with a Notebook activity. Please help me, team.
TIA

 

 

 

1 ACCEPTED SOLUTION
HimanshuS-msft
Community Support

Hello @sudhav 
Thanks for using the Fabric community.
As I understand it, the ask is how to run validation logic from a notebook and send an email based on the result.

Please correct me if my understanding is not right.

 

Step 1 

We can achieve this by adding mssparkutils.notebook.exit() in the notebook.

 

df1 = spark.createDataFrame(
    [
        (1, "foo"),  # create your data here, be consistent in the types.
        (2, "bar"),
    ],
    ["id", "label"]  # add your column names here
)


df2 = spark.createDataFrame(
    [
        (1, "foo"),  # create your data here, be consistent in the types.
        (2, "bar"),
        (3, "bar1"),
    ],
    ["id", "label"]  # add your column names here
)


df1.count()
df2.count()

if df1.count() == df2.count():
    print("Record count validation passed.")
    mssparkutils.notebook.exit(100)
else:
    print("Record count validation failed")
    mssparkutils.notebook.exit(200)
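
The mapping from counts to exit values can also be isolated in a small function, which makes it easy to check outside the pipeline. This is only a minimal sketch: exit_value_for is a hypothetical helper name, plain integers stand in for df1.count() and df2.count() so it runs without Spark, and it returns strings because Step 3 below compares the variable against the string '200'.

```python
def exit_value_for(source_count: int, sink_count: int) -> str:
    """Map a pair of row counts to the exit values used in this thread."""
    # "100" = counts match, "200" = counts differ (same convention as above).
    return "100" if source_count == sink_count else "200"


# In the notebook, the counts would come from the two DataFrames, e.g.:
#   source_count = df1.count()   # run each count once and reuse the result
#   sink_count = df2.count()
#   mssparkutils.notebook.exit(exit_value_for(source_count, sink_count))
# Plain integers stand in for them here so the sketch runs without Spark.
print(exit_value_for(6, 6))  # prints 100
print(exit_value_for(6, 5))  # prints 200
```

Storing each count in a variable also avoids running the count job more than once per DataFrame.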

 

Step 2 

When you run the notebook from a Notebook activity, add a Set variable activity and read the exit value into a variable (named SendAnEmail in Step 3).

The expression should be something like this:

@activity('Notebook1').output.result['exitValue']

 

Step 3 

Add an If Condition activity and add the below expression:

@equals(variables('SendAnEmail'),'200')
Inside the True condition, add the Office 365 Outlook activity.
I have tested this logic and it works fine.
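
If the email should also mention the actual counts, one option is to pack them into the exit value as a JSON string instead of a bare code. The following is a rough sketch under assumptions, not something tested in this thread: build_exit_payload and the key names are hypothetical, it assumes the exit value reaches the pipeline as a plain string, and the If Condition above would then have to check the status field rather than compare against '200'.

```python
import json


def build_exit_payload(source_count: int, sink_count: int) -> str:
    """Build a JSON string holding the validation status and both counts."""
    status = "passed" if source_count == sink_count else "failed"
    return json.dumps(
        {
            "status": status,
            "source_count": source_count,
            "sink_count": sink_count,
        }
    )


# Plain integers stand in for df1.count() and df2.count() in this sketch.
payload = build_exit_payload(6, 5)
print(payload)

# In the notebook, the payload would then be handed back to the pipeline:
#   mssparkutils.notebook.exit(payload)
```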
 

Hope this helps.

Thanks
Himanshu


2 REPLIES 2

Great, thank you Himanshu. I thought no one would answer this lengthy question, but you proved me wrong, and it's working as expected. Thank you, man.
