vojtechsima

Programmatically Refresh & Sync SQL Analytics Endpoint Metadata in Microsoft Fabric

What's SQL Analytics Endpoint

In Microsoft Fabric, if you use a Lakehouse item, you automatically get one SQL Analytics Endpoint per Lakehouse. This is a read-only interface (though you can create views) that lets you query delta tables with T-SQL. The endpoint isn't really for managing the lakehouse, but rather for consuming the data in your BI solutions.

 

Because the SQL Analytics Endpoint (SAE) runs on top of the Lakehouse, there are some restrictions. For example, the SAE only exposes data stored in delta format (delta tables), and there may be a slight lag between writing data to the lakehouse and seeing that same data through the SAE.

 

Syncing data with Lakehouse and SQL Analytics Endpoint

When your SAE is active (it is treated as idle after 15 minutes of inactivity), it scans the Lakehouse metadata and automatically syncs it into the SQL "output" layer. This is not instant; there can be a short delay from a few seconds up to a couple of minutes.

If you have a workflow that immediately uses newly written data in an external semantic model (not the built-in Lakehouse SAE semantic model), you can run into this lag during a Power BI semantic model refresh when you use Import mode.

 

To avoid this mismatch, you can do a few things:

  • Wake up the SAE by running a query against it and give it a moment to catch up.
  • Manually click Refresh in the SQL Analytics Endpoint explorer.
  • Programmatically refresh the SAE through the Fabric REST API.

 

If you have a scheduled data pipeline in Fabric, the programmatic option is usually the better choice, because you want the whole process fully automated without you babysitting every refresh.

 

Automatically refresh the lakehouse SQL Analytics Endpoint via Fabric REST API

The easiest way to sync & refresh your metadata is via the Fabric REST API endpoint called: Refresh Sql Endpoint Metadata.

You can automate this by calling a notebook from your data pipeline (or from another notebook), or by using a Web activity in your data pipeline. To pull this off, you need the following permissions:

  • The caller must have a Contributor or higher workspace role.
  • The caller is a User, Service Principal, or Managed Identity.
  • Alternatively: Item.ReadWrite.All delegated permissions.

 

 The call itself requires:

 

  • Lakehouse's workspace Id
  • SQL Analytics Endpoint Id
  • Alternatively: Lakehouse Id

 

If you want to run it just for one lakehouse, I'll show you how to get these Ids from the URLs inside the Power BI or Fabric service.

Find your workspace and the lakehouse item, and click on the SQL Analytics endpoint item/icon:

[Screenshot: selecting the SQL analytics endpoint item in the workspace]

 

 

Then you get a link like this (with either the powerbi or fabric.microsoft prefix); the two Ids you need are right in the path (a small parsing sketch follows the list below):

https://app.powerbi.com/groups/185e12d5-de32-4fd9-894e-c0df0ec58cbd/mirroredwarehouses/fb50503a-5ee8-4b3d-873b-18d72d78313e

https://app.fabric.microsoft.com/groups/185e12d5-de32-4fd9-894e-c0df0ec58cbd/mirroredwarehouses/fb50503a-5ee8-4b3d-873b-18d72d78313e
  • Workspace Id
    • group - 185e12d5-de32-4fd9-894e-c0df0ec58cbd
  • Lakehouse SQL Analytics Endpoint
    • mirroredwarehouses - fb50503a-5ee8-4b3d-873b-18d72d78313e
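If you'd rather grab those programmatically, here is a small, hypothetical helper that just pulls the two GUIDs out of the copied URL with a regex (the URL is the example from above):

import re

# Example URL copied from the SQL analytics endpoint item (see above)
url = "https://app.powerbi.com/groups/185e12d5-de32-4fd9-894e-c0df0ec58cbd/mirroredwarehouses/fb50503a-5ee8-4b3d-873b-18d72d78313e"

# The first GUID in the path is the workspace Id, the second is the SQL analytics endpoint Id
workspace_id, sql_endpoint_id = re.findall(
    r"[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}", url
)

print(workspace_id, sql_endpoint_id)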

 

To get the lakehouse Id (even though it's not strictly needed), switch to the Lakehouse view and read its Id from the URL in the same way (an API-based sketch for resolving the SQL endpoint Id from a lakehouse Id follows the list below):

[Screenshot: Lakehouse item selected, with its Id visible in the URL]

 

 

.../lakehouses/c0a542d2-c97d-4bf1-94cd-573b7fc55e56

  • Lakehouse Id
    • lakehouses - c0a542d2-c97d-4bf1-94cd-573b7fc55e56
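If you only have the Lakehouse Id, you can also resolve the SQL analytics endpoint Id programmatically. Below is a sketch using the Get Lakehouse REST endpoint; the property path (properties.sqlEndpointProperties.id) follows the public docs, but double-check it against the response you actually get:

import requests

# Built-in token helper, available inside Fabric notebooks
token = notebookutils.credentials.getToken("https://api.fabric.microsoft.com")

workspace = "<workspace_id>"
lakehouse = "<lakehouse_id>"

# Get Lakehouse returns the SQL analytics endpoint details under properties.sqlEndpointProperties
resp = requests.get(
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace}/lakehouses/{lakehouse}",
    headers={"Authorization": f"Bearer {token}"},
)
sql_endpoint_id = resp.json()["properties"]["sqlEndpointProperties"]["id"]
print(sql_endpoint_id)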

Call Refresh Sql Endpoint Metadata REST API Endpoint

I recommend running this as the last activity before your semantic model refresh: once all of your write activities are complete, only then call it.

 

In a notebook, use non-Spark Python:

[Screenshot: notebook language set to non-Spark Python]

 

 

Paste this:

import requests
import json

# Authenticate with Microsoft Fabric API
token = notebookutils.credentials.getToken("https://api.fabric.microsoft.com")

# Configuration
workspace = "<workspace_id>"
lakehouse_sql_endpoint = "<sql_analytics_endpoint_id>"

# API request headers
shared_headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json"
}

# Request body with timeout configuration
json_body = {
    "timeout": {
        "timeUnit": "Minutes",
        "value": 2
    }
}

# Refresh SQL Analytics Endpoint metadata
sync_sql_analytics_endpoint = requests.post(
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace}/sqlEndpoints/{lakehouse_sql_endpoint}/refreshMetadata",
    headers=shared_headers,
    json=json_body
)

# Display the response
display(sync_sql_analytics_endpoint.json())

Let me quickly break down a couple of points:

 

notebookutils.credentials.getToken("https://api.fabric.microsoft.com")

 

This is a built-in function that gets you a bearer token for a given scope (here, the Fabric API). It lets you skip creating a service principal and setting up delegated permissions. If you have Contributor or higher access to the workspace, this will work; if not, you have to use a service principal and set it up properly.
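For the service-principal route, here is a hedged sketch using the azure-identity package; the tenant/client values are placeholders, the service principal still needs access to the workspace, and the tenant setting that allows service principals to use Fabric APIs has to be enabled:

from azure.identity import ClientSecretCredential

# Placeholders -- substitute your Entra ID app registration details
credential = ClientSecretCredential(
    tenant_id="<tenant_id>",
    client_id="<app_client_id>",
    client_secret="<app_client_secret>",
)

# Request a token for the Fabric API scope and use it in the Authorization header as above
token = credential.get_token("https://api.fabric.microsoft.com/.default").token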

 

json_body = {
    "timeout": {
        "timeUnit": "Minutes",
        "value": 2
    }
}

 

This is the required body; it sets how long the sync is allowed to run before it times out.

 

sync_sql_analytics_endpoint = requests.post(
    f"https://api.fabric.microsoft.com/v1/workspaces/{workspace}/sqlEndpoints/{lakehouse_sql_endpoint}/refreshMetadata",
    headers=shared_headers,
    json=json_body
)

 

This is the part that actually runs the sync. The endpoint supports long-running operations: the request kicks off an operation that can take a while, and you then check whether it has finished. In practice, you either get a 200 straight away with the result body, or a 202 Accepted, meaning the operation has started and you poll it until it completes (a polling sketch follows).
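Here is a minimal polling sketch for the 202 case. It follows the general Fabric long-running-operation pattern (Location and Retry-After headers, a status field, and a /result sub-resource); treat those names as assumptions and verify them against the actual response:

import time
import requests

def wait_for_result(response, headers, default_poll_seconds=5):
    # 200: the call completed synchronously, the body is already the result
    if response.status_code == 200:
        return response.json()

    # 202: the operation is still running -- poll the URL in the Location header
    if response.status_code == 202:
        operation_url = response.headers["Location"]
        retry_after = int(response.headers.get("Retry-After", default_poll_seconds))
        while True:
            time.sleep(retry_after)
            state = requests.get(operation_url, headers=headers).json()
            if state.get("status") in ("Succeeded", "Failed"):
                break
        # finished operations expose their payload on the /result sub-resource
        return requests.get(f"{operation_url}/result", headers=headers).json()

    response.raise_for_status()

# result = wait_for_result(sync_sql_analytics_endpoint, shared_headers)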

 

The result body shows all the delta tables, sync status and some timestamps:

[Screenshot: response body listing each delta table with its sync status and timestamps]

 

 

If you hit status 'Success', it means you had some unsynced data and this request synced it successfully. 'NotRun' means the sync didn't run, which usually just means there was no new data since the last sync, so there was nothing to do. If the status is 'Failure', something went wrong.
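If you want your pipeline to fail loudly on a bad sync, you can check the result body before moving on. The field names below ("value", "tableName", "status") are assumptions based on the response shown above; verify them against your actual payload:

# Assumed field names ("value", "tableName", "status") -- check against your actual response
result = sync_sql_analytics_endpoint.json()

failed_tables = [
    t.get("tableName")
    for t in result.get("value", [])
    if t.get("status") == "Failure"
]

if failed_tables:
    raise RuntimeError(f"SQL analytics endpoint sync failed for: {', '.join(failed_tables)}")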