sudhav
Helper V

Need a data validation step using a notebook activity

Hi Team,

I have an incremental load scenario that is working fine.

Input source: ADLS. Every day one file arrives in ADLS and is appended to a table in the lakehouse.

On day 1 there is one file with 3 records.

On day 2 another file arrives in ADLS, also with 3 records, so there are 6 records in ADLS at the end of day 2.

I need the same number of records in the lakehouse (LH) target table at the end of day 2.

So I need to count the records in the source and the sink. I can do this with Lookup activities, but I need this data validation in a notebook activity. I have already written the PySpark code for it.

 

Code for the source count in the notebook:
spark.conf.set("fs.azure.account.auth.type.sudhadls.dfs.core.windows.net","SAS TOKEN")
access_path = "wasbs://input@sudhadls.blob.core.windows.net/"
df1 = spark.read.csv(access_path + "*.csv", header=True)

df1.count()

Code for the sink count in the notebook:
df2 = spark.sql("SELECT * FROM LH_Demo.Test")
df2.count()

Now I need to check whether df1.count() from the source side and df2.count() from the sink side are equal:

if df1.count() == df2.count():
    print("Record count validation passed.")
else:
    print("Record count validation failed")
 
This is the code I am using in the notebook.
My question: when I put this notebook in a notebook activity, the pipeline publishes fine. But if the result is "Record count validation failed", I need to trigger an email using the Outlook activity, so the Outlook activity needs the notebook activity's output.

I have implemented the same scenario with Lookup activities, but I need it with a notebook activity. Please help.
TIA

 

 

 

1 ACCEPTED SOLUTION
HimanshuS-msft
Community Support

Hello @sudhav
Thanks for using the Fabric community.
As I understand it, the ask is how to run validation logic from a notebook and send an email.

Please correct me if my understanding is not right.

 

Step 1

We can achieve this by adding mssparkutils.notebook.exit() to the notebook.

 

df1 = spark.createDataFrame(
    [
        (1, "foo"),  # create your data here, be consistent in the types.
        (2, "bar"),
    ],
    ["id", "label"]  # add your column names here
)


df2 = spark.createDataFrame(
    [
        (1, "foo"),  # create your data here, be consistent in the types.
        (2, "bar"),
        (3, "bar1"),
    ],
    ["id", "label"]  # add your column names here
)


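# Count once and keep the results, so the comparison uses actual numbers.
source_count = df1.count()
sink_count = df2.count()

# The exit value is passed as a string; Step 3 compares it against '200'.
if source_count == sink_count:
    print("Record count validation passed.")
    mssparkutils.notebook.exit("100")
else:
    print("Record count validation failed")
    mssparkutils.notebook.exit("200")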
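As a rough sketch of the same decision, separated from Spark so it can be checked on its own: the helper name `validation_exit_value` is hypothetical, and the codes "100" and "200" are simply the ones used in this answer. The counts 6 and 5 are made-up sample values.

```python
PASSED = "100"  # exit value used in this thread for a successful validation
FAILED = "200"  # exit value used in this thread for a failed validation


def validation_exit_value(source_count: int, sink_count: int) -> str:
    """Return the exit value to hand back to the pipeline (hypothetical helper)."""
    return PASSED if source_count == sink_count else FAILED


# Assumed usage inside the notebook, where df1, df2 and mssparkutils exist:
#   mssparkutils.notebook.exit(validation_exit_value(df1.count(), df2.count()))
print(validation_exit_value(6, 6))
print(validation_exit_value(6, 5))
```

Keeping the comparison in a small function makes it easy to reuse with the real source and sink counts from the question.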

 

Step 2 

When you run the notebook from a notebook activity, add a Set variable activity after it and read the exit value into a variable (named SendAnEmail in Step 3).

The expression should be something like this:

@activity('Notebook1').output.result['exitValue']
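To make the shape of that expression concrete, here is a small Python imitation of it. The dictionary below is a hypothetical, trimmed-down stand-in for the notebook activity output that contains only the fields the expression touches; the real output carries more fields than this.

```python
# Hypothetical stand-in for the output of the 'Notebook1' activity.
# Only the path used by the expression (result -> exitValue) is modelled.
notebook1_output = {"result": {"exitValue": "200"}}


def read_exit_value(activity_output: dict) -> str:
    """Imitate output.result['exitValue'] from the pipeline expression."""
    return str(activity_output["result"]["exitValue"])


# Plays the role of the SendAnEmail pipeline variable.
send_an_email = read_exit_value(notebook1_output)
print(send_an_email)
```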

 

Step 3 

Add an If Condition activity with the expression below:

@equals(variables('SendAnEmail'),'200')
Inside the True branch, add the Office 365 Outlook activity.
I have tested this logic and it works fine.
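The branching in Step 3 can be pictured with the following Python sketch. It is only an imitation of the If Condition, not pipeline code, and the function name `should_send_email` is made up for illustration. The comparison is between strings, matching the quoted '200' in the expression.

```python
FAILED_EXIT_VALUE = "200"  # failure code returned by the notebook in Step 1


def should_send_email(send_an_email: str) -> bool:
    """Imitate @equals(variables('SendAnEmail'),'200')."""
    return send_an_email == FAILED_EXIT_VALUE


for value in ("100", "200"):
    if should_send_email(value):
        print(value, "-> True branch: run the Office 365 Outlook activity")
    else:
        print(value, "-> False branch: no email")
```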
 
[Screenshot: HimanshuSmsft_0-1692219059099.png]

Hope this helps.

Thanks
Himanshu


2 REPLIES

Great, thank you Himanshu. I thought no one would answer such a lengthy question, but you proved me wrong, and it is working as expected. Thank you.
