The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
Hi all,
I have a spark job definition which contains a main definition file to programmatically extract a zip file from a lakehouse and extract the contents to a configured destination folder on the same lakehouse. I am using below code for file extraction:
def extract_zip_files(zipfilepath,outputpath):
with zipfile.ZipFile(zipfilepath,'r') as zfile:
zfile.extractall(path=outputpath)
The zip file path (ABFS path), zip file name and destination path (ABFS path) are passed from the command line arguments.
I am encountering below runtime issue while running the spark job:
FileNotFoundError: [Errno 2] No such file or directory: 'abfss://<My workspace ID>@onelake.dfs.fabric.microsoft.com/<My Lakehouse ID>/Files/MyFile.zip'
Please let me know if you have any suggestions or if need more information regarding this issue.
Thank you @Anonymous for taking a look at this issue. I am trying to extract zip files using spark job defintion instead of a notebook. Within my spark job code, I have tried passing the file paths (ABFS, relative path, file api path) via command line arguments but encountered same error i.e. FileNotFoundError: [Errno 2] No such file or directory.
Is there an approach to extract zip files within a spark job definition?
Hi @MK007,
I also not find the fact path of files, perhaps you can try to upload the files to environment and use correspond path to extract data: (you can try to use f"{notebookutils.nbResPath} + 'file the relative path'to read data from resource folder)
zipfilepath=f"{notebookutils.nbResPath}/env/Test/test.zip"
Regards,
Xiaoxin Sheng
Hi @MK007,
I test with your code in notebook and find the 'zipfilepath' and 'outputpath' parameters should be the api path of files: (you can get it by right-click on the zip file and choose 'copy file api path' option)
import zipfile
zipfilepath="/lakehouse/default/Files/test.zip"
outputpath="/lakehouse/default/Files/Test/"
zfile= zipfile.ZipFile(zipfilepath,'r')
zfile.extractall(path=outputpath)
Result:
Regards,
Xiaoxin Sheng