On this page:
https://learn.microsoft.com/en-us/fabric/data-engineering/spark-compute
It says:
"For example, a Fabric capacity SKU F64 has 64 capacity units, which is equivalent to 128 Spark VCores. You can use these Spark VCores to create nodes of different sizes for your custom Spark pool, as long as the total number of Spark VCores doesn't exceed 128."
Then directly under that there is the following table:
[Image: Fabric capacity table from the docs, which lists F64 with 384 Spark VCores]
How did it go from 64 CU = 128 VCores to 64 CU = 384 VCores?
Then, on this page: https://learn.microsoft.com/en-us/fabric/data-engineering/spark-job-concurrency-and-queueing
We get another table with further conflicting info:
[Image: concurrency limits and queueing table from the docs]
I understand that the P1 SKU is burstable/autoscaling, but as far as I know the F64 is not.
And even so, the burstable P1 SKU only gets an additional 8 CUs, so going by 1 CU = 2 VCores, that would only be an additional 16 VCores. By that math, 128 VCores base + 16 VCores bursting = 144 VCores, not 384 VCores.
Can anyone help clarify this for me? I don't know if I'm misunderstanding something or if the documentation is incorrect.
What is the actual count of Spark VCores available? Why is it calculated as 64 CU = 128 VCores, but the tables show 384?
Spark bursting and Spark autoscaling are two different concepts. Both of them apply to all F SKUs and P SKUs, I think.
Spark autoscaling must not be confused with the P SKU capacity autoscale feature.
Spark bursting is a feature which helps with concurrency (allowing a combined VCore consumption of 3x the capacity's VCore limit, in cases where multiple Spark jobs are running at the same time). Concurrency limits and queueing in Apache Spark for Fabric - Microsoft Fabric | Microsoft Learn
Spark autoscale is a feature which dynamically allocates or deallocates nodes within a single, individual job. Apache Spark compute for Data Engineering and Data Science - Microsoft Fabric | Microsoft Learn
Bursting:
Example calculation: an F64 SKU offers 128 Spark VCores. The burst factor applied for an F64 SKU is 3, which gives a total of 384 Spark VCores. The burst factor is only applied to help with concurrency and does not increase the max cores available for a single Spark job. That means a single notebook, Spark job definition, or lakehouse job can use a pool configuration of max 128 VCores, and 3 jobs with the same configuration can run concurrently. If notebooks use a smaller compute configuration, they can run concurrently until the max utilization reaches the 384 Spark VCore limit.
Concurrency limits and queueing in Apache Spark for Fabric - Microsoft Fabric | Microsoft Learn
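To make the arithmetic concrete, here is a minimal sketch in plain Python. The CU-to-VCore ratio and the F64 burst factor come from the docs quoted above; the function names are just illustrative:

```python
# Minimal sketch of the F SKU burst arithmetic described above.
# Constants are from the quoted docs; function names are my own.

VCORES_PER_CU = 2   # 1 capacity unit = 2 Spark VCores
BURST_FACTOR = 3    # burst factor for an F64 per the concurrency docs

def base_vcores(capacity_units: int) -> int:
    """Base Spark VCores for a capacity (no bursting)."""
    return capacity_units * VCORES_PER_CU

def burst_vcores(capacity_units: int, burst_factor: int = BURST_FACTOR) -> int:
    """Combined VCore ceiling across *concurrent* jobs with bursting."""
    return base_vcores(capacity_units) * burst_factor

f64_base = base_vcores(64)    # 64 CU * 2 = 128 VCores
f64_burst = burst_vcores(64)  # 128 * 3 = 384 VCores across concurrent jobs
single_job_max = f64_base     # one job is still capped at 128

print(f64_base, f64_burst, single_job_max)  # 128 384 128
```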
Autoscaling:
Autoscale for Apache Spark pools allows automatic scale up and down of compute resources based on the amount of activity. When you enable the autoscale feature, you set the minimum and maximum number of nodes to scale. When you disable the autoscale feature, the number of nodes set remains fixed. You can alter this setting after pool creation, although you might need to restart the instance.
Apache Spark compute for Data Engineering and Data Science - Microsoft Fabric | Microsoft Learn
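And a rough sketch of what an autoscale node range means in VCore terms. The node sizes and their VCore counts are taken from the Spark compute docs; checking the pool's ceiling against the 128-VCore single-job cap on an F64 is my own addition:

```python
# Sketch: how an autoscale min/max node range maps to a VCore range.
# Node sizes per the Spark compute docs; the comparison against the
# single-job cap (128 VCores on F64) is my own illustration.

NODE_SIZE_VCORES = {
    "Small": 4, "Medium": 8, "Large": 16, "X-Large": 32, "XX-Large": 64,
}

def pool_vcore_range(node_size: str, min_nodes: int, max_nodes: int):
    """VCore range a pool can autoscale across for a given node size."""
    per_node = NODE_SIZE_VCORES[node_size]
    return min_nodes * per_node, max_nodes * per_node

# e.g. a Medium-node pool autoscaling between 1 and 10 nodes
lo, hi = pool_vcore_range("Medium", 1, 10)
print(lo, hi)     # 8 80 -- scales up and down within a single job's run
print(hi <= 128)  # True: fits under the F64 single-job cap
```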
For example on an F64 SKU, I think this is how it works: the capacity gives you 64 CU × 2 = 128 Spark VCores as a base. A single job (notebook, Spark job definition, or lakehouse job) can use at most 128 VCores, and with the burst factor of 3 several jobs can run concurrently as long as their combined consumption stays under 384 VCores, e.g. three 128-VCore jobs, or more jobs on smaller pool configurations.
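A toy sketch of that admission behaviour, assuming jobs submitted past the ceiling wait in the queue (per the queueing docs, background jobs are queued while interactive notebook jobs are throttled); the function name and job sizes here are my own, purely illustrative:

```python
# Toy model of concurrency on an F64: jobs are admitted until the
# combined VCores hit the 384 burst ceiling; anything beyond that
# waits. Fabric's real admission logic is more nuanced than this.

BURST_CEILING = 384  # F64: 128 base VCores * burst factor of 3

def admit(running_vcores: int, requested_vcores: int) -> bool:
    """True if a new job fits under the burst ceiling."""
    return running_vcores + requested_vcores <= BURST_CEILING

running = 0
for requested in [128, 128, 128, 64]:  # three full-size jobs, then one more
    if admit(running, requested):
        running += requested
        print(f"{requested}-VCore job admitted (running total: {running})")
    else:
        print(f"{requested}-VCore job queued (would exceed {BURST_CEILING})")
```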
The documentation is extremely confusing.
Spark Compute: here the text says "SKU F64 has 64 capacity units, which is equivalent to 128 Spark VCores." There is no reference at all to bursting, and then the table mentions 384 VCores out of the blue.
Spark Job Concurrency: here the burst factor for F64 is 3.
Burstable Capacity: here the burstable scale factor is 1-12x. True, this refers to warehousing, but it makes everything quite confusing.
Thank you, this was incredibly helpful!