Hey team,
We have a set of CSV files in a folder in OneLake, and we want to write PySpark code that returns each file's name and its creation date/time.
How do we achieve this? Are there any functions that can give us the file creation dates?
Thanks
I'm not sure it's possible to get the creation date of a OneLake file, given that OneLake is an abstraction layer over ADLS Gen2.
You can obtain the last-modified date (which, for a file that has never been modified, is the same as the creation date) from the Get Metadata activity of a pipeline - see the ADF documentation below for which metadata fields are available.
https://learn.microsoft.com/en-us/azure/data-factory/control-flow-get-metadata-activity
In a PySpark notebook you'd use mssparkutils/notebookutils and the .fs.ls function:
# Each entry exposes name, isDir, isFile, path, size, and modifyTime
files = mssparkutils.fs.ls('Your directory path')
for file in files:
    print(file.name, file.isDir, file.isFile, file.path, file.size, file.modifyTime)
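
If you want the listing as a DataFrame with readable timestamps rather than raw epoch values, here's a minimal sketch; the folder path is a placeholder, and it assumes modifyTime is milliseconds since the Unix epoch, which is how mssparkutils reports it:

from datetime import datetime, timezone

# Placeholder path - substitute your own Files/ or abfss:// path
folder = 'Files/csv_input'

files = mssparkutils.fs.ls(folder)
rows = [
    # modifyTime is epoch milliseconds, so divide by 1000 before converting
    (f.name, datetime.fromtimestamp(f.modifyTime / 1000, tz=timezone.utc))
    for f in files
    if f.isFile
]

# A small DataFrame makes the listing easy to filter, join, or save
df = spark.createDataFrame(rows, ['file_name', 'last_modified_utc'])
df.show(truncate=False)

Keep in mind that for an unmodified file this last-modified value doubles as the creation time, but once a file is rewritten the original create date is lost.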
If this helps, please consider Accepting as a Solution to help others find it more easily
Hi @msprog ,
Thank you @spencer_sa for the valuable input!
As spencer_sa suggested, mssparkutils.fs.ls() is an efficient approach for retrieving file metadata in OneLake, and it should help you resolve the issue.
If this helps, please give us Kudos and consider accepting it as a solution to help other members find it more quickly.
Thank you for being a valued member of the Microsoft Fabric Community Forum!
Regards,
Pallavi.