
Tom_CA
Helper I

Azure Pipeline - Lookup and CopyData

Hello colleagues, I'm attempting to store the JSON output from my Lookup activity in a file within my Lakehouse. I've experimented with the Copy Data activity, but it hasn't yielded the desired results. Could you suggest an effective method for saving the Lookup activity's output to a file?

 

There is no "inline" option as in Azure Data Factory.

 

Thank you

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi @Tom_CA,

 

While you can't write the JSON output of the Lookup activity to a file directly within the data pipeline, there is a workaround: use a notebook to store the Lookup activity's JSON output as a file in the Lakehouse.

 

Please follow these steps:

 

First, run the Lookup activity successfully in the pipeline.

[screenshot: successful Lookup activity run in the pipeline]

 

Then, copy the JSON result from the activity's output.

[screenshot: copying the Lookup activity's JSON output]

 

In a notebook, use the following code to save the JSON result as a temporary file. Replace the sample object below with your own JSON output.

import json
import os
import shutil

# Sample Lookup output; replace with the JSON copied from your pipeline
test = {
    "firstRow": {
        "SalesOrderNumber": "SO51555",
        "SalesOrderLineNumber": 7,
        "OrderDate": "2021-06-23T00:00:00Z",
        "CustomerName": "Chloe Garcia",
        "Email": "chloe27@adventure-works.com",
        "Item": "Patch Kit/8 Patches",
        "Quantity": 1,
        "UnitPrice": 2.29,
        "Tax": 0.1832
    }
}

# Write to temporary file
temp_path = "/tmp/test.json"
with open(temp_path, 'w') as f:
    json.dump(test, f)

print(f"Schema temporarily saved to {temp_path}")

[screenshot: temporary file written from the notebook]

 

Next, use the following code to copy the file into the Files section of the Lakehouse. The path is the Lakehouse's File API path.

# File API path
lakehouse_dir = "/lakehouse/default/Files/"
lakehouse_path = lakehouse_dir + "test.json"

if not os.path.exists(lakehouse_dir):
    os.makedirs(lakehouse_dir)

# Use shutil.copy to copy files to the target location
shutil.copy(temp_path, lakehouse_path)

print(f"Schema saved to {lakehouse_path}")

[screenshots: test.json visible in the Lakehouse Files section]
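 

As a side note: in a Fabric notebook, the Files section of the attached default Lakehouse is mounted at /lakehouse/default/Files/, so a minimal sketch that skips the temporary file and writes the JSON there directly would look like this (it assumes a default Lakehouse is attached to the notebook):

import json

# 'test' is the dictionary holding your Lookup output (defined in the earlier cell);
# a tiny placeholder is shown here so the snippet runs on its own
test = {"firstRow": {"SalesOrderNumber": "SO51555"}}

# Assumes a default Lakehouse is attached, so its Files section is
# mounted at /lakehouse/default/Files/ in the notebook session
with open("/lakehouse/default/Files/test.json", "w") as f:
    json.dump(test, f)

Either way works; the temporary-file route just keeps a local copy you can inspect before the file lands in the Lakehouse.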

 

Finally, you can use this code to check that the stored JSON output is correct.

import json
import pandas as pd

# set Lakehouse path
lakehouse_path = "/lakehouse/default/Files/test.json"

# read the file with pandas; lines=True treats each line as one JSON record,
# and the file written above is a single-line JSON object, so this yields one row
df = pd.read_json(lakehouse_path, orient='records', lines=True)

schema_loaded = json.loads(df.to_json(orient='records'))

print("Loaded schema:")
print(json.dumps(schema_loaded, indent=4))

[screenshot: loaded JSON printed in the notebook]
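 

If you don't need a DataFrame, a plain json.load gives the same check with less machinery (a minimal sketch, assuming the same path as above):

import json

# read the stored file back and pretty-print it
with open("/lakehouse/default/Files/test.json") as f:
    schema_loaded = json.load(f)

print("Loaded schema:")
print(json.dumps(schema_loaded, indent=4))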

 

By using a notebook, you can save the JSON file to the Files section of the Lakehouse.

 

If you have any other questions, please feel free to contact me.

 

Best Regards,
Yang
Community Support Team

 

If any post helps, then please consider accepting it as the solution to help other members find it more quickly.
If I misunderstood your needs or you still have problems, please feel free to let us know. Thanks a lot!


2 REPLIES 2

Hey @Anonymous, thank you for your suggestion. It's really interesting, and I will implement it. I'd like to add some important observations about passing parameters to the notebook:

  1. String conversion: In my tests, it was necessary to convert the Lookup output to a string before passing it as a parameter to the notebook. This is crucial to ensure that the data is transmitted correctly.
  2. Parsing in the notebook: If you want to work with a real JSON object at the end of your process, it's essential to parse the string in the notebook after receiving it, for example:

     array_json = json.loads(JsonVar)

     where 'JsonVar' is the parameter received in the notebook (see the sketch after this list).
  3. Complete flow:
    • Pipeline: Convert the Lookup output to a string.
    • Parameter passing: Use the string as a parameter for the notebook.
    • Notebook: Parse the string back into a JSON object.
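
A minimal sketch of both sides, assuming the Lookup activity is called 'Lookup1' and the notebook defines a base parameter named 'JsonVar' (both names are placeholders):

# --- Pipeline side ---
# In the Notebook activity, set the base parameter 'JsonVar' (type String)
# with an expression such as:
#     @string(activity('Lookup1').output)

# --- Notebook side ---
import json

# In the real notebook, JsonVar arrives via the parameter cell;
# the literal below only illustrates what the string looks like.
JsonVar = '{"firstRow": {"SalesOrderNumber": "SO51555", "Quantity": 1}}'

# Parse the string back into a JSON object
array_json = json.loads(JsonVar)

print(array_json["firstRow"]["SalesOrderNumber"])  # SO51555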

This process ensures that you can work with complex data structures while maintaining data integrity during the transfer between the pipeline and the notebook.

Does that make sense? This approach provides flexibility and robustness when dealing with JSON data in Azure Data Factory or Synapse Analytics pipelines.

Thank you for your support.
