I have the following use case. I have an FTP where only internal network traffic is allowed and files are stored there every 15 minutes or so. Once I process the files on the FTP I would like to MOVE the files to an Archive folder ON the FTP.
Because the FTP only allows internal network traffic, I am unable to reach it from Fabric via notebooks. I already have python functions that I COULD use if I could access from Fabric. There is no IP range that I can provide to whitelist either, even if external traffic was allowed.
I can use a data pipeline to connect to the FTP and retrieve the files through the data gateway, so that gets me halfway there. However, I am unable to move the files using the FTP tool. My only option is to leave the file or delete it when done.
How can I solve for this within a data pipeline? Basically I need to know how I can issue FTP commands within a data pipeline in a ForEach loop once the file successfully transfers.
I KNOW we could change our process to either start storing these files directly in OneLake or convert to an SFTP with a publicly available endpoint. But I am trying to figure out how to use the tools within Fabric to solve this fairly simple use case.
Thank you for any/all help!
Hi @OldDogNewTricks ,
Since it is not possible to issue FTP commands directly from Fabric, this can be solved by using a custom activity in the data pipeline.
Add a custom activity to the data pipeline that can run a script. This script will handle the FTP commands.
Write a script (e.g. Python) that connects to the FTP server and moves the processed files to the archive folder. Here is a basic example using Python and the ftplib library:
from ftplib import FTP

def move_file(ftp, source_path, dest_path):
    # An FTP rename across directories performs a server-side move
    ftp.rename(source_path, dest_path)

def main():
    # Connect and authenticate (replace with your server and credentials)
    ftp = FTP('ftp.yourserver.com')
    ftp.login(user='username', passwd='password')

    # List of files to move
    files_to_move = ['file1.txt', 'file2.txt']

    for file in files_to_move:
        source_path = f'/path/to/source/{file}'
        dest_path = f'/path/to/archive/{file}'
        move_file(ftp, source_path, dest_path)

    ftp.quit()

if __name__ == "__main__":
    main()
Upload this script to a location accessible by your data pipeline. Configure the custom activity to run this script after the files are processed.
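As a further sketch (untested here), the same ftplib calls could be wrapped so that each invocation moves a single file, which would pair naturally with a ForEach loop that passes the file name along once the transfer succeeds. The script name, host, credentials, and folder paths below are placeholders:

import sys
from ftplib import FTP

def main():
    # Hypothetical invocation: python move_one_file.py <host> <user> <password> <source_path> <dest_path>
    host, user, password, source_path, dest_path = sys.argv[1:6]

    ftp = FTP(host)
    ftp.login(user=user, passwd=password)
    try:
        # Server-side rename moves the file into the archive folder on the FTP itself
        ftp.rename(source_path, dest_path)
    finally:
        ftp.quit()

if __name__ == "__main__":
    main()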
Best Regards
Yilong Zhou
If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.
Yilong - I appreciate your suggestion, however I am not sure how to get this to work for the following reasons:
1. I do not have a custom script activity available. I do see a Script activity, but the only connections available are SQL based (Snowflake, SQL Server, Azure SQL, Warehouse in Fabric). I also see an Azure Function activity, but I don't have any Azure Functions (see the sketch after this list).
2. Even IF I could write a script and have the pipeline run it, I would not know how to ensure that the pipeline has access to it from Fabric. Upload it to OneLake and use it from a Lakehouse? Again, there are no connections available that are NOT SQL based in the Script activity that I see in the data pipeline.
3. The issue is that the FTP only allows internal network traffic, so even IF I could get a Python script loaded into a "Custom script" activity, I cannot make that script use the data gateway to connect to the FTP.
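For reference only: the Azure Function activity mentioned in point 1 is one way to run custom code from a Fabric pipeline, but a standard Function App still could not reach an internal-only FTP unless it were VNet-integrated into that network, and it does not go through the on-premises data gateway. If that networking piece were solved, a rough sketch of such a function (Python v2 programming model; the route, parameter names, and credential handling are illustrative only) might look like:

import azure.functions as func
from ftplib import FTP

app = func.FunctionApp()

@app.route(route="archive_file", auth_level=func.AuthLevel.FUNCTION)
def archive_file(req: func.HttpRequest) -> func.HttpResponse:
    # Illustrative parameters passed in the body by the pipeline's Azure Function activity
    body = req.get_json()
    source_path = body["source_path"]
    dest_path = body["dest_path"]

    # In practice, credentials should come from app settings or Key Vault, not the request
    ftp = FTP(body["host"])
    ftp.login(user=body["user"], passwd=body["password"])
    try:
        # Server-side rename moves the file into the archive folder on the FTP
        ftp.rename(source_path, dest_path)
    finally:
        ftp.quit()

    return func.HttpResponse(f"Moved {source_path} to {dest_path}", status_code=200)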
Thank you for your help, I'm open to other options, but it does not appear this will work.
At this point it seems like my options are the following:
1. Change our business process to either copy the files from the FTP, delete them there, and keep them in OneLake instead, or simply leave the files on the FTP. It is less than ideal to have to change a business process because of a simple limitation in the toolset, one that existing tools handle without any problems today.
2. Write a pipeline that downloads and deletes the files from the current source folder on the FTP and then writes them back to the archive destination on the FTP. Obviously this is less than ideal due to the extra CU we will incur and the extra time it will take.