If you haven’t already, check out Arun Ulag’s hero blog “Microsoft Build 2026: Building Agentic Apps with Microsoft Fabric and Microsoft Databases” for a complete look at all of our Microsoft Build announcements across our Fabric and database offerings.
_______________________________
As data volumes grow, concurrency rises, and analytics workloads become more dynamic and AI-driven, performance becomes harder to predict and harder to scale. Every query sits in the critical path, adding pressure to the warehouse, and every second counts. This is the core tension in analytics today. The expectations have changed, but the underlying technology has not, leaving agents, applications, and AI systems waiting on data. To meet this moment, analytics needs a new kind of execution engine.
GPU-accelerated Fabric Data Warehouse is purpose-built to deliver fast, predictable analytics at scale, meeting the needs of analysts, analytics engineers, and developers alike. It fundamentally shifts the role of the data warehouse from a system built for reporting into an execution layer for applications, agents, and AI systems that continuously reason over data, enabling faster decisions and more productive ways of working with data.
Using NVIDIA accelerated computing, we have built an engine that takes the SQL queries you know and love and makes them faster than ever before, especially as concurrency, scale, and complexity increase. The importance of this work is recognized in the research community, including a recent SIGMOD Best Industry Paper award for CoddSpeed: Hardware Accelerated Query Processing in Microsoft Fabric.
GPU-accelerated Data Warehouse is simple to use, with no query rewrites or new system to manage. You can turn it on from workspace settings, and it applies to all SQL Analytics Endpoints and Data Warehouses in the workspace. Select “run,” and eligible queries are automatically accelerated, allowing teams to execute faster with less manual tuning and simplifying the path from development to production.
Our customers across professional services, healthcare, and manufacturing are already seeing value from GPU-accelerated Fabric Data Warehouse:
“In healthcare, timely insights matter, and GPU-accelerated Fabric Data Warehouse will help ensure critical data is available when clinicians and leaders need it most. We’re seeing up to 5x improvement in our query speeds, which allows our teams to spend less time managing performance and more time delivering meaningful insights.”
- Shaun McDonald, IT Manager, UNC Health
______________________________
“Our experience with GPU-accelerated Fabric Data Warehouse has been excellent. The capability integrates seamlessly into our existing architecture and has delivered meaningful performance improvements across a range of queries, with complex workloads running 3.4x faster at single concurrency and more consistently.
At WTW, we operate a shared data platform supporting multiple applications and reporting workloads. GPU-accelerated Fabric Data Warehouse enables us to serve high volumes of queries more efficiently, directly improving the responsiveness of our downstream analytics and reporting. We see the potential for this capability to help unify how we deliver data across our platform.”
- Andrew Bradbrook, Director – Systems Architecture, WTW
______________________________
“The feature GPU-accelerated Fabric Data Warehouse integrated smoothly with our existing Fabric environment and was straightforward to enable and evaluate. We observed noticeable performance improvements for analytics-heavy warehouse workloads, including faster execution of complex queries. This capability is especially valuable for large-scale reporting scenarios, enabling quicker access to insights and supporting a scalable analytics platform that can improve decision-making speed.”
- Rajkumar Maheshwar, Sr. Manager, Enterprise Data Engineering, Benjamin Moore
______________________________
Across these customers, GPU-accelerated Fabric Data Warehouse directly translates into more responsive reporting, faster access to insights, and less time spent managing performance. Organizations can enable the capability quickly and apply it to real-world scenarios without re-architecting their data warehouse. The net impact is clear: teams can handle higher query volumes, scale analytics more efficiently, and deliver insights faster to the business.
When you run a query, GPU acceleration happens automatically. Queries flow through the SQL frontend, where they are parsed and optimized, just like today. From there, the query moves into the distributed engine, which breaks it into fragments and distributes work across the system.
The difference is in how the queries are executed. Instead of sending all work to CPUs, the optimizer can push eligible portions of the query plan to GPUs. This happens intelligently, accelerating operations like large joins and aggregations, while the rest executes on CPUs. If a query is not eligible for GPU execution, it seamlessly runs on CPUs, preserving correctness.
The result is a system that looks the same to all users: the same T-SQL surface, the same tools, and the same architecture. What changes is performance. By introducing GPUs into the execution engine, Fabric Data Warehouse can process more data in parallel, reduce latency for complex queries, and deliver the consistent, low-latency performance required for modern analytics, applications, and AI systems.
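The placement step described above can be pictured with a small sketch. This is an illustration only, not the Fabric optimizer: the `Operator` type, the `GPU_ELIGIBLE_KINDS` set, the `MIN_ROWS_FOR_GPU` threshold, and the operator names are all hypothetical, chosen to mirror the idea that large joins and aggregations may be pushed to GPUs while everything else stays on CPUs.

```python
from dataclasses import dataclass

# Hypothetical: operator kinds this sketch treats as GPU-eligible.
# The post names large joins and aggregations as examples; the real
# eligibility rules are internal to the engine.
GPU_ELIGIBLE_KINDS = {"join", "aggregate"}

# Hypothetical: row-count cutoff standing in for "large" in this sketch.
MIN_ROWS_FOR_GPU = 1_000_000


@dataclass(frozen=True)
class Operator:
    name: str
    kind: str
    estimated_rows: int


def place(op: Operator) -> str:
    """Return 'gpu' for eligible operators, otherwise fall back to 'cpu'."""
    if op.kind in GPU_ELIGIBLE_KINDS and op.estimated_rows >= MIN_ROWS_FOR_GPU:
        return "gpu"
    return "cpu"


def place_plan(plan: list[Operator]) -> dict[str, str]:
    """Assign a device to every operator in a query plan fragment."""
    return {op.name: place(op) for op in plan}


plan = [
    Operator("join_orders_customers", "join", 50_000_000),
    Operator("sum_by_region", "aggregate", 50_000_000),
    Operator("join_small_lookup", "join", 500),
    Operator("format_label", "string_udf", 50_000_000),
]

placement = place_plan(plan)
for name, device in placement.items():
    print(f"{name}: {device}")
```

In this sketch the fallback is the default branch: any operator that does not meet the eligibility test runs on the CPU path, so every plan still gets a complete placement.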
To quantify the impact of GPU acceleration, we ran industry benchmarks against three comparable cloud data warehouse providers and observed up to 7x faster performance across reporting, application, and AI-driven analytics scenarios. These are workloads where systems are typically pushed hardest, with multiple users and agents issuing queries simultaneously.
Figure: Query runtime across concurrency levels (1, 16, 64 users). GPU-accelerated Fabric Data Warehouse delivers lower execution times, with up to 7× faster performance at high concurrency.
At a 100 GB data scale, what stands out is not just the speed of individual queries, but how the system behaves under load. As concurrency increases, most data warehouses slow down and become less predictable. In contrast, Fabric’s GPU-accelerated warehouse maintains stable performance, completing the full 22-query workload in approximately five seconds whether one user or 64 concurrent users are running queries.
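To put the stable runtime in perspective, the arithmetic below estimates aggregate throughput at the two concurrency levels mentioned. It is a back-of-the-envelope sketch, assuming that each concurrent user runs the full 22-query workload and that the runtime stays at roughly five seconds in both cases; the function name and the resulting figures are illustrative, not measured results.

```python
QUERIES_PER_WORKLOAD = 22   # from the text: the full 22-query workload
RUNTIME_SECONDS = 5.0       # from the text: approximately five seconds


def aggregate_throughput(users: int,
                         queries_per_user: int = QUERIES_PER_WORKLOAD,
                         runtime_seconds: float = RUNTIME_SECONDS) -> float:
    """Queries completed per second across all users.

    Assumption: every user runs the whole workload and all users
    finish within the same runtime.
    """
    return users * queries_per_user / runtime_seconds


single_user = aggregate_throughput(1)
sixty_four_users = aggregate_throughput(64)

print(f"1 user:   {single_user:.1f} queries/s")
print(f"64 users: {sixty_four_users:.1f} queries/s")
```

Under these assumptions, holding the runtime flat while concurrency grows from 1 to 64 means aggregate throughput grows by the same factor of 64.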
NVIDIA accelerated computing is essential to this capability. We use custom CUDA kernels for critical operations like joins, aggregations, scans, and arithmetic, and leverage LibTorch for memory management and general-purpose operations. This allows us to harness massive GPU parallelism to accelerate real-world analytical workloads while maintaining database correctness and reliability.
We also take advantage of modern GPU hardware features, including high-bandwidth memory and highly parallel CUDA cores, to efficiently process large volumes of data and execute complex analytical queries, the same foundation that powers AI workloads at scale. This capability comes out of deep technical collaborations between our engineering team and NVIDIA engineers to bring GPU-accelerated Fabric Data Warehouse from research into production. The result is a deeply integrated system that delivers significant performance gains while remaining seamless and fully compatible with existing T-SQL workflows.
As Todd Mostak, Senior Director, Analytics and Data Intelligence at NVIDIA, points out:
“Complex SQL joins and large-scale scans that fuel agentic AI workloads turn traditional data processing systems into a bottleneck, especially under high user concurrency. By intelligently offloading compute-intensive operations to NVIDIA accelerated computing, Microsoft Fabric Data Warehouse performs 6x faster than CPU-powered Fabric Data Warehouse. Data teams can now run resource-heavy T-SQL queries with ultra-low latency, unlocking faster, more scalable production AI applications.”
As applications, agents, and AI systems become more data-driven, performance is no longer just a technical concern. It becomes part of the user experience. Every query matters, and every delay is visible. GPU-accelerated Fabric Data Warehouse addresses this shift by bringing a new execution model to analytics, one that delivers predictable performance at scale and enables a new class of scenarios. This is not just an improvement in speed. It’s a fundamental shift in how analytics systems are built and used, powering AI-driven apps and agents. This is the new foundation for modern analytics.
Get started
GPU-accelerated Fabric Data Warehouse will be available soon in an early access preview across four regions. You can sign up today to get access and start building apps and delivering AI and BI insights faster than ever.