NHariGouthami

Fabric Data Agent with GitHub Copilot Agent Mode


🚀 Introduction

Microsoft Fabric is revolutionizing data analytics by offering a unified platform for data engineering, data science, and business intelligence. One of its standout features is the Fabric Data Agent, which enables conversational analytics—allowing users to interact with their data using natural language queries.
In this blog, I’ll walk you through how I created a Fabric Data Agent using GitHub Copilot Agent Mode in Visual Studio Code. This approach dramatically reduces the time and effort required to build intelligent data agents.

📘 What is a Fabric Data Agent?

A Fabric Data Agent is a smart interface that connects to your data warehouse or lakehouse and allows users to:
  • Ask natural language questions
  • Receive AI-generated SQL queries
  • Explore data relationships and insights

💸 Benefits of Using Fabric Data Agents:

  • Conversational Analytics: Empower users to interact with data intuitively
  • Time Efficiency: Reduce manual query writing and schema exploration
  • Scalability: Easily adapt to new datasets and business needs
  • Integration: Seamlessly works with Microsoft Fabric and Power BI

🧠 Why GitHub Copilot Agent Mode?

Traditionally, creating a data agent involves manually exploring schema, drafting AI instructions, and writing example queries—often taking a full day. With GitHub Copilot Agent Mode, I completed all of this in less than an hour, saving time and enabling faster iteration.

📈 Step-by-Step Workflow

1. Connect to SQL Analytics Endpoint in VS Code

  • Install the MSSQL extension in VS Code
  • Enable GitHub Copilot Agent Mode
  • Authenticate and connect to your Fabric Lakehouse or Warehouse
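
A quick way to verify the same connection outside of VS Code is a pyodbc round trip against the SQL analytics endpoint. This is a minimal sketch with placeholder server and database names; Fabric endpoints use Microsoft Entra authentication, handled here via ActiveDirectoryInteractive:

```python
# Minimal connectivity check against a Fabric SQL analytics endpoint.
# <your-endpoint> and <your-database> are placeholders - copy the real
# values from the endpoint's connection details in Fabric.
import pyodbc

conn_str = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<your-endpoint>.datawarehouse.fabric.microsoft.com;"
    "Database=<your-database>;"
    "Authentication=ActiveDirectoryInteractive;"  # Microsoft Entra sign-in
    "Encrypt=yes;"
)
conn = pyodbc.connect(conn_str)
print(conn.execute("SELECT @@VERSION;").fetchone()[0])
```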

2. Explore Schema and Sample Data

  • Ask Copilot to read the tables in the schema
  • Sample data from each table to understand values and relationships (see the sketch below)
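
Under the hood this is ordinary metadata querying. Here is a sketch of the same exploration done by hand, reusing the connection pattern from the previous snippet (the dbo schema is an assumption; adjust to yours):

```python
# List tables in a schema, then pull a few rows from each to understand
# values and relationships - the same steps Copilot performs when asked
# to explore the data.
import pyodbc

conn = pyodbc.connect(conn_str)  # conn_str as in the previous sketch
cur = conn.cursor()

cur.execute(
    "SELECT TABLE_NAME FROM INFORMATION_SCHEMA.TABLES "
    "WHERE TABLE_SCHEMA = 'dbo' AND TABLE_TYPE = 'BASE TABLE'"
)
tables = [row.TABLE_NAME for row in cur.fetchall()]

for table in tables:
    cur.execute(f"SELECT TOP 5 * FROM dbo.[{table}]")
    print(table, [col[0] for col in cur.description])  # column names
    for row in cur.fetchall():
        print("  ", row)
```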

3. Generate AI Instructions Automatically

  • Copilot helped draft detailed AI instructions based on schema and sample data
  • Instructions included table relationships, key fields, and query guidelines
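
The generated instructions naturally depend on your schema, but to give a flavor, an instruction block typically reads something like the following (the tables, columns, and rules here are purely illustrative, not taken from the author's agent):

```
You are a data agent over the Sales warehouse.
- dbo.FactSales joins dbo.DimCustomer on CustomerKey and dbo.DimDate on DateKey.
- Revenue means SUM(SalesAmount); prefer aggregations over row-level dumps.
- Filter dates through DimDate keys rather than string comparisons.
- Cap result sets at 100 rows unless the user asks otherwise.
```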

4. Create Example Queries in JSON Format

  • Categorized queries: Core, Leadership, Advanced, Quick Stats
  • Used placeholders for dynamic filtering
  • Saved them as a JSON file and imported it into the Fabric Data Agent (see the sketch below)
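
The post doesn't include the file itself, and the exact import format the Fabric Data Agent expects may differ, so treat this as an illustrative sketch of categorized example queries with placeholders (all table names hypothetical):

```python
# Illustrative sketch only - the real import format expected by the
# Fabric Data Agent may differ. Shows the idea: categorized example
# queries with placeholders for dynamic filtering.
import json

example_queries = {
    "Core": [
        {
            "question": "What was total revenue in {year}?",
            "sql": "SELECT SUM(SalesAmount) FROM dbo.FactSales "
                   "WHERE YEAR(OrderDate) = {year};",
        },
    ],
    "Quick Stats": [
        {
            "question": "How many active customers do we have?",
            "sql": "SELECT COUNT(*) FROM dbo.DimCustomer WHERE IsActive = 1;",
        },
    ],
}

with open("example_queries.json", "w") as f:
    json.dump(example_queries, f, indent=2)
```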

📝 Final Thoughts

This approach allows developers to:
  • Use VS Code as a unified interface for schema exploration, query generation, and agent creation
  • Save time and improve accuracy when building data agents
If you're working with Microsoft Fabric and want to empower users with conversational analytics, this method is a game-changer. What used to be a time-consuming manual process can now be done in minutes, making it ideal for agile teams and fast-paced development environments.
👉 Bonus Tip: If you prefer working in notebooks or want to automate agent creation further, you can also use the Fabric SDK to create, publish, and evaluate agents programmatically.
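
As a rough idea of what that looks like, here is a sketch based on the fabric-data-agent-sdk samples; names and signatures may vary by version, so verify against the current SDK documentation:

```python
# Rough sketch of programmatic agent creation from a Fabric notebook.
# Assumes the fabric-data-agent-sdk package; the agent and lakehouse
# names are hypothetical.
# %pip install fabric-data-agent-sdk
from fabric.dataagent.client import create_data_agent

agent = create_data_agent("sales-data-agent")             # hypothetical name
agent.update_configuration(instructions="...")            # AI instructions from step 3
agent.add_datasource("SalesLakehouse", type="lakehouse")  # hypothetical source
agent.publish()                                           # make the agent available

```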
Would love to hear your thoughts or improvements on this workflow. Happy building!

Comments

@NHariGouthami thanks for sharing this information. This workflow works, based on my own testing of a similar approach. While GitHub Copilot in VS Code with the MSSQL extension creates a very useful AI-powered development environment, I also found it quite time-consuming to create the data schema context for Copilot before it's able to produce meaningful, usable results. The amount of work grows quickly when there are many large tables to discover for the data agent, or when there are connected data sources like multiple lakehouses or warehouses. In such a scenario the interaction with Copilot, even though it's faster than doing similar work manually, becomes a bottleneck in the process.

I have found that the bottleneck can be removed by creating a simple MCP server that runs locally in VS Code and interacts with Fabric using both the Fabric REST API and the SQL endpoint. With an MCP server and GitHub Copilot in agentic mode it is possible to fully automate schema discovery with data sampling and context population, including creating and configuring Fabric items along the way if necessary. I am in the process of creating a public GitHub repo with my MCP server code and can share it if anyone is interested. Thanks again for sharing a very useful topic.
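
For readers curious what such a local server might look like, here is a minimal sketch using the Python MCP SDK's FastMCP helper; the tool below is a hypothetical stub, not code from the commenter's implementation:

```python
# Minimal local MCP server sketch (hypothetical - not the commenter's
# implementation). A real server would authenticate to Fabric and call
# the REST API / SQL endpoint inside each tool.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("fabric-discovery")

@mcp.tool()
def list_lakehouse_tables(workspace_id: str, lakehouse_id: str) -> list[str]:
    """List table names in a lakehouse via the Fabric REST API (stub)."""
    # e.g. GET https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/lakehouses/{lakehouse_id}/tables
    raise NotImplementedError("wire up the Fabric REST API with an Entra token")

if __name__ == "__main__":
    mcp.run()  # stdio transport, so VS Code can launch it as a local MCP server
```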

@apturlov 

Thanks for sharing this detailed note — really interesting insights!

In my case, I’m not explicitly providing any data context to GitHub Copilot. The agent itself automatically reads all tables, gets sample data, and understands the relationships between them. So, I’m not entirely sure what you meant by the need to manually create the schema context before Copilot starts producing meaningful results.

That said, your MCP approach sounds quite intriguing — especially the part about automating schema discovery and context population via Fabric REST API and SQL endpoint. I’d love to learn more about it once you make the repo public. Please do share the details when it’s ready.

Thanks again for sharing — this is a very insightful discussion!

Thank you for this, I will try this out!

@NHariGouthami I agree that GHC (GitHub Copilot) can discover the data context on its own if you provide the connection information for the Fabric SQL endpoint via a VS Code extension like MSSQL. The extension exposes a number of MCP-like tools that enable GHC in agentic mode to execute a sequence of operations for data context discovery. Extensions like MSSQL in such a scenario act on your behalf, using your credentials for the data source, and don't provide GHC with any specific information about the Microsoft Fabric context. For example, SQL endpoints are created in Fabric for various data items like a Lakehouse, a Warehouse, or a mirrored SQL database. All SQL endpoints look very similar from the MSSQL and GHC points of view, without any understanding of item-specific features or limitations, such as support for particular SQL commands, data types, etc. This is where a dedicated MCP server can be more helpful than generic SQL endpoint access.
I am working on finalizing the content of the public GitHub repo for my Fabric MCP server and planning to release the initial version this week. I will add the repo URL in a comment once it's ready.

For anyone interested, the MCP server proof of concept (initial version) has been published on GitHub here: aturlov-alta/fabric-mcp: Simple MCP server for Microsoft Fabric. As this is just an initial release, it offers limited support for conversational analytics, covering Fabric lakehouses only. I am already working on extending this support to mirrored databases and warehouses and, in the future, to all analytical stores that support a SQL endpoint. In addition, at this initial stage only service principal authentication is supported, which limits usability, but dual authentication support (both service principal and user) is already being added. As usual, any and all feedback and contributions are welcome.