We’ve made a major enhancement to the Notebook Integration with Fabric User Data Functions (UDFs)—you can now use Pandas DataFrames and Series as input and output types, powered by native integration with Apache Arrow!
This enhancement brings higher performance, improved efficiency, and better scalability to your Fabric Notebooks—enabling seamless function reuse for large-scale data processing in Python, PySpark, Scala, and R.
As part of our initial preview, we introduced the ability to:
display(myFunction.functionDetails).
This helped teams modularize logic, reduce redundancy, and improve productivity across collaborative data science and engineering projects.
In this update, Pandas DataFrames and Series are now supported as first-class input and output types for UDFs—enabled by deep integration with Apache Arrow, a highly efficient columnar memory format optimized for analytics workloads.
Instead of manually converting large datasets to JSON, developers can now natively pass Pandas DataFrames to UDFs, operate on them efficiently, and return processed results—all with minimal latency and memory overhead.
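For contrast, here is a minimal sketch of the kind of manual JSON round trip that this update makes unnecessary. It is an illustration only: the variable names are hypothetical, and the exact workaround teams used before this update is an assumption, not something the preview documented.

```python
import json

import pandas as pd

# Hypothetical sample data, matching the shape used elsewhere in this post
df = pd.DataFrame({
    "driver_id": [1, 2, 1],
    "revenue": [100.0, 150.0, 200.0],
})

# Manual workaround (assumed): serialize every row to JSON text before the call
payload = df.to_json(orient="records")

# The receiving side then has to parse the text and rebuild the DataFrame
rebuilt = pd.DataFrame(json.loads(payload))

print(rebuilt)
```

Every value passes through a text representation and back, which is the serialization and parsing work that passing a DataFrame directly is meant to avoid.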
Let’s say you want to aggregate total revenue by driver across a dataset with millions of rows. Now, you can pass a Pandas DataFrame into a shared UDF and perform that operation directly:
Python
import pandas as pd
# Get the function
agg_func = notebookutils.udf.getFunctions("AggregateRevenueByDriver")
# Sample input as Pandas DataFrame
df = pd.DataFrame({
    "driver_id": [1, 2, 1],
    "revenue": [100.0, 150.0, 200.0]
})
# Call UDF with DataFrame input and receive DataFrame output
result_df = agg_func.aggregate(df)
# Display result
print(result_df)
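The post does not show what the aggregate function does internally. As a rough sketch, assuming it simply sums revenue per driver, its body could look like the plain pandas function below. The implementation is hypothetical, and the Fabric-specific registration of the function as a UDF is omitted, so this runs in any Python environment with pandas installed.

```python
import pandas as pd


def aggregate(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical UDF body: total revenue per driver."""
    # as_index=False keeps driver_id as a regular column in the result
    return df.groupby("driver_id", as_index=False)["revenue"].sum()


# Same sample input as the notebook example above
sample = pd.DataFrame({
    "driver_id": [1, 2, 1],
    "revenue": [100.0, 150.0, 200.0],
})

totals = aggregate(sample)
print(totals)
```

With the sample input, driver 1 totals 300.0 and driver 2 totals 150.0. Because the function takes and returns a DataFrame, the notebook can pass its data in directly without reshaping it first.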
Scala
val aggFunc = notebookutils.udf.getFunctions("AggregateRevenueByDriver")
// Sample input
val input = Seq(
  (1, 100.0),
  (2, 150.0),
  (1, 200.0)
).toDF("driver_id", "revenue")
// Call UDF and get DataFrame output
val result = aggFunc.aggregate(input)
// Show result
result.show()
R
agg_func <- notebookutils.udf.getFunctions("AggregateRevenueByDriver")
# Sample input
df <- data.frame(
  driver_id = c(1, 2, 1),
  revenue = c(100.0, 150.0, 200.0)
)
# Call the UDF
result <- agg_func$aggregate(df)
# View result
print(result)
With this Arrow-powered enhancement, you can:
Try the new UDF functionality today by using NotebookUtils in your Fabric Notebook. Start by registering a Pandas-compatible UDF, then pass in your DataFrames and let Apache Arrow handle the data transfer under the hood.
For more information, refer to the NotebookUtils for Fabric documentation.