I am unable to fetch the file modification time when listing files with mssparkutils.fs.ls.
Is there any way to get the modification time of a file in a lakehouse?
Hi @mahid721 ,
Yes, you can definitely get the file modification time from a Lakehouse file, even though mssparkutils.fs doesn't expose this directly.
The simplest approach is to use the Spark API with the Hadoop FileSystem. Here's a code snippet you can try in your notebook:
import datetime
from pyspark.sql import SparkSession

# Get the Spark session
spark = SparkSession.builder.getOrCreate()

# Path to your file in the lakehouse (adjust this path to your file)
file_path = "Files/your_folder/your_file.csv"

# Get the Hadoop FileSystem bound to the session's configuration
fs = spark._jvm.org.apache.hadoop.fs.FileSystem.get(spark._jsc.hadoopConfiguration())

# Get the file status, which carries the modification time in epoch milliseconds
file_status = fs.getFileStatus(spark._jvm.org.apache.hadoop.fs.Path(file_path))
mod_time_ms = file_status.getModificationTime()

# Convert to a readable datetime string
mod_time = datetime.datetime.fromtimestamp(mod_time_ms / 1000).strftime('%Y-%m-%d %H:%M:%S')
print(f"Last modified time: {mod_time}")

Alternatively, if you want this in a DataFrame (for example, for multiple files), you can list the directory with the same FileSystem object:
# List the files in a directory with their details (modificationTime is epoch milliseconds)
statuses = fs.listStatus(spark._jvm.org.apache.hadoop.fs.Path("Files/your_folder/"))
files_info = spark.createDataFrame(
    [(s.getPath().getName(), s.getLen(), s.getModificationTime()) for s in statuses],
    ["name", "size", "modificationTime"])
files_info.select("name", "size", "modificationTime").show()

Hope this helps with your problem!
If my response resolved your query, kindly mark it as the Accepted Solution to assist others. Additionally, I would be grateful for a 'Kudos' if you found my response helpful.
Hi @mahid721
May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will help other community members with similar problems solve them faster.
Thank you.
Hi @mahid721
I hope this information is helpful. Please let me know if you have any further questions or if you'd like to discuss this further. If this answers your question, please Accept it as a solution and give it a 'Kudos' so others can find it easily.
Thank you.
Hi @mahid721
Thank you for reaching out to the Microsoft Fabric community forum.
I wanted to check if you had the opportunity to review the information provided by @burakkaragoz . Please feel free to contact us if you have any further questions. If his response has addressed your query, please accept it as a solution and give a 'Kudos' so other members can easily find it.
Thank you.
Hi @mahid721 ,
Just checking in – did the steps I shared help resolve the issue?
✅ If it’s working now, feel free to mark the response as the Accepted Solution. This helps others who face the same issue find the fix faster.
✨ And of course, a little Kudos would be much appreciated!
If you're still running into trouble, let me know what you've tried so far and I’ll help you dig deeper. We’ll get it sorted!