Hey team,
We have a set of CSV files in a folder in OneLake, and we want to write PySpark code that returns each file's name and its creation date/time.
How do we achieve this? Are there any functions that can give us the file creation dates?
Thanks
I'm not sure it's possible to get the creation date of a OneLake file, given that OneLake is an abstraction layer over ADLS Gen2.
You can obtain the last-modified date (which, for a file that has never been modified, is the same as the creation date) from the Get Metadata activity of a pipeline - see the ADF documentation below for which metadata fields are available.
https://learn.microsoft.com/en-us/azure/data-factory/control-flow-get-metadata-activity
In a PySpark notebook you'd use mssparkutils/notebookutils and the .fs.ls function:
# Each entry exposes name, isDir, isFile, path, size, and modifyTime
files = mssparkutils.fs.ls('Your directory path')
for file in files:
    print(file.name, file.isDir, file.isFile, file.path, file.size, file.modifyTime)
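
If you want the listing as a DataFrame with readable timestamps rather than raw epoch values, here's a minimal sketch; the folder path is a placeholder, and it assumes modifyTime is milliseconds since the Unix epoch, which is how mssparkutils reports it:

from datetime import datetime, timezone

# Placeholder path - substitute your own Files/ or abfss:// path
folder = 'Files/csv_input'

files = mssparkutils.fs.ls(folder)
rows = [
    # modifyTime is epoch milliseconds, so divide by 1000 before converting
    (f.name, datetime.fromtimestamp(f.modifyTime / 1000, tz=timezone.utc))
    for f in files
    if f.isFile
]

# A small DataFrame makes the listing easy to filter, join, or save
df = spark.createDataFrame(rows, ['file_name', 'last_modified_utc'])
df.show(truncate=False)

Keep in mind that for an unmodified file this last-modified value doubles as the creation time, but once a file is rewritten the original create date is lost.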
If this helps, please consider Accepting as a Solution to help others find it more easily
Hi @msprog ,
Thank you @spencer_sa for the valuable input!
As spencer_sa suggested, mssparkutils.fs.ls() is an efficient approach for retrieving file metadata in OneLake, and it should help you resolve the issue.
If this helps, please give us Kudos and consider accepting it as a solution to help other members find it more quickly.
Thank you for being a valued member of the Microsoft Fabric Community Forum!
Regards,
Pallavi.