March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Hi there,
I'm experiencing slow read times when loading data from delta tables into data frames using PySpark in Synapse notebooks.
This does not include the time taken for the Spark cluster to spin up.
The Delta table I am loading data from is relatively small, approximately 1 million rows, and it takes about 30 seconds to load these rows into a DataFrame.
Compared to SQL Server, this is very slow.
The simple syntax I'm using is:
Hi @pbix
Where are you executing this query: in Fabric or in Synapse? If it's in Synapse, what Spark pool size is used to run the notebook?
If you are using Fabric, what type of environment is it: Trial or Dedicated Capacity? If it's dedicated capacity, what are the SKU size and the node size? If it's trial, what node size is used?
In Synapse serverless, how did you test it? Was it simply a select * from table, or some other way?
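To make the "how did you test it" numbers comparable, it may help to time each step separately, since in Spark a read mostly defines a plan and the bulk of the work happens when an action runs. Below is a minimal sketch of such a timing helper. The helper name `time_step`, the labels, and the stand-in steps are all hypothetical, and the commented PySpark lines assume a notebook where `spark` is the active session and `table_path` points at your Delta table.

```python
import time
from typing import Any, Callable, Dict


def time_step(label: str, fn: Callable[[], Any], timings: Dict[str, float]) -> Any:
    """Run fn(), store its wall-clock duration (seconds) under label, return its result."""
    start = time.perf_counter()
    result = fn()
    timings[label] = time.perf_counter() - start
    return result


# Stand-in steps so the sketch runs without a cluster.
timings: Dict[str, float] = {}
rows = time_step("define", lambda: range(1_000_000), timings)       # cheap, like defining a read
total = time_step("action", lambda: sum(1 for _ in rows), timings)  # does the actual work
print(total, timings)

# In a notebook, the same helper could wrap the real calls
# (assumptions: `spark` is the session, `table_path` is a hypothetical path):
# df = time_step("define", lambda: spark.read.format("delta").load(table_path), timings)
# n = time_step("action", lambda: df.count(), timings)
```

If most of the 30 seconds lands in the action step rather than the read definition, that points at execution (pool size, file layout) rather than the load call itself.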