On this page:
https://learn.microsoft.com/en-us/fabric/data-engineering/spark-compute
It says:
"For example, a Fabric capacity SKU F64 has 64 capacity units, which is equivalent to 128 Spark VCores. You can use these Spark VCores to create nodes of different sizes for your custom Spark pool, as long as the total number of Spark VCores doesn't exceed 128."
Then directly under that there is the following table:
[Image: Fabric capacity table from the docs, which lists F64 with 384 Spark VCores]
How did it go from 64 CU = 128 VCores to 64 CU = 384 VCores?
Then, on this page: https://learn.microsoft.com/en-us/fabric/data-engineering/spark-job-concurrency-and-queueing
We get another table with further conflicting info:
[Image: concurrency limits and queueing table from the docs]
I understand that the P1 SKU is burstable/autoscaling, but as far as I know the F64 is not.
And even so, the burstable P1 SKU only gets an additional 8 CUs, so going by 1 CU = 2 VCores, that would only be an additional 16 VCores. By that math, 128 VCores base + 16 VCores bursting = 144 VCores, not 384 VCores.
Can anyone help clarify this for me? I don't know if I'm misunderstanding something or if the documentation is incorrect.
What is the actual count of Spark VCores available? Why is it calculated as 64 CU = 128 VCores, but the tables show 384?
Spark bursting and Spark autoscaling are two different concepts. Both of them apply to all F SKUs and P SKUs, I think.
Spark autoscaling must not be confused with the P SKU capacity autoscale feature.
Spark bursting is a feature which helps with concurrency (allowing a combined VCore consumption of 3x the capacity's VCore limit, in cases where multiple Spark jobs are running at the same time). Concurrency limits and queueing in Apache Spark for Fabric - Microsoft Fabric | Microsoft Learn
Spark autoscale is a feature which dynamically allocates or deallocates nodes within a single, individual job. Apache Spark compute for Data Engineering and Data Science - Microsoft Fabric | Microsoft Learn
Bursting:
Example calculation: an F64 SKU offers 128 Spark VCores. The burst factor applied for an F64 SKU is 3, which gives a total of 384 Spark VCores. The burst factor is only applied to help with concurrency and does not increase the max cores available for a single Spark job. That means a single notebook, Spark job definition, or lakehouse job can use a pool configuration of max 128 VCores, and 3 jobs with the same configuration can run concurrently. If notebooks use a smaller compute configuration, they can run concurrently until the max utilization reaches the 384 Spark VCore limit.
Concurrency limits and queueing in Apache Spark for Fabric - Microsoft Fabric | Microsoft Learn
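To make the arithmetic concrete, here is a minimal sketch in plain Python. The CU-to-VCore ratio and the F64 burst factor come from the docs quoted above; the function names are just illustrative:

```python
# Minimal sketch of the F SKU burst arithmetic described above.
# Constants are from the quoted docs; function names are my own.

VCORES_PER_CU = 2   # 1 capacity unit = 2 Spark VCores
BURST_FACTOR = 3    # burst factor for an F64 per the concurrency docs

def base_vcores(capacity_units: int) -> int:
    """Base Spark VCores for a capacity (no bursting)."""
    return capacity_units * VCORES_PER_CU

def burst_vcores(capacity_units: int, burst_factor: int = BURST_FACTOR) -> int:
    """Combined VCore ceiling across *concurrent* jobs with bursting."""
    return base_vcores(capacity_units) * burst_factor

f64_base = base_vcores(64)    # 64 CU * 2 = 128 VCores
f64_burst = burst_vcores(64)  # 128 * 3 = 384 VCores across concurrent jobs
single_job_max = f64_base     # one job is still capped at 128

print(f64_base, f64_burst, single_job_max)  # 128 384 128
```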
Autoscaling:
Autoscale for Apache Spark pools allows automatic scale up and down of compute resources based on the amount of activity. When you enable the autoscale feature, you set the minimum and maximum number of nodes to scale. When you disable the autoscale feature, the number of nodes set remains fixed. You can alter this setting after pool creation, although you might need to restart the instance.
Apache Spark compute for Data Engineering and Data Science - Microsoft Fabric | Microsoft Learn
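And a rough sketch of what an autoscale node range means in VCore terms. The node sizes and their VCore counts are taken from the Spark compute docs; checking the pool's ceiling against the 128-VCore single-job cap on an F64 is my own addition:

```python
# Sketch: how an autoscale min/max node range maps to a VCore range.
# Node sizes per the Spark compute docs; the comparison against the
# single-job cap (128 VCores on F64) is my own illustration.

NODE_SIZE_VCORES = {
    "Small": 4, "Medium": 8, "Large": 16, "X-Large": 32, "XX-Large": 64,
}

def pool_vcore_range(node_size: str, min_nodes: int, max_nodes: int):
    """VCore range a pool can autoscale across for a given node size."""
    per_node = NODE_SIZE_VCORES[node_size]
    return min_nodes * per_node, max_nodes * per_node

# e.g. a Medium-node pool autoscaling between 1 and 10 nodes
lo, hi = pool_vcore_range("Medium", 1, 10)
print(lo, hi)     # 8 80 -- scales up and down within a single job's run
print(hi <= 128)  # True: fits under the F64 single-job cap
```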
For example on an F64 SKU, I think this is how it works: the capacity gives you 64 CU × 2 = 128 Spark VCores as a base. A single job (notebook, Spark job definition, or lakehouse job) can use at most 128 VCores, and with the burst factor of 3 several jobs can run concurrently as long as their combined consumption stays under 384 VCores, e.g. three 128-VCore jobs, or more jobs on smaller pool configurations.
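A toy sketch of that admission behaviour, assuming jobs submitted past the ceiling wait in the queue (per the queueing docs, background jobs are queued while interactive notebook jobs are throttled); the function name and job sizes here are my own, purely illustrative:

```python
# Toy model of concurrency on an F64: jobs are admitted until the
# combined VCores hit the 384 burst ceiling; anything beyond that
# waits. Fabric's real admission logic is more nuanced than this.

BURST_CEILING = 384  # F64: 128 base VCores * burst factor of 3

def admit(running_vcores: int, requested_vcores: int) -> bool:
    """True if a new job fits under the burst ceiling."""
    return running_vcores + requested_vcores <= BURST_CEILING

running = 0
for requested in [128, 128, 128, 64]:  # three full-size jobs, then one more
    if admit(running, requested):
        running += requested
        print(f"{requested}-VCore job admitted (running total: {running})")
    else:
        print(f"{requested}-VCore job queued (would exceed {BURST_CEILING})")
```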
The documentation is extremely confusing.
Spark Compute: here the text says "SKU F64 has 64 capacity units, which is equivalent to 128 Spark VCores." There is no reference at all to bursting, and then the table mentions 384 VCores out of the blue.
Spark Job Concurrency: here the burst factor for F64 is 3.
Burstable Capacity: here the burstable scale factor is 1-12x. True, this refers to warehousing, but it makes everything quite confusing.
Thank you, this was incredibly helpful!