Jeanxyz
Power Participant

How to extract DMV data from multiple warehouses/lakehouses?

Just out of curiosity, I'm wondering whether it's possible to create a DMV query that loops over all the lakehouses and warehouses across multiple workspaces. That way, a Fabric admin could monitor queries in one place.

 

Below is my DMV query; it extracts the queries run in stock_lakehouse during the past hour. I want to avoid hard-coding the lakehouse name. Is it possible to get a list of all the workspaces, and then a list of the lakehouses and warehouses in each workspace?

SELECT *
FROM stock_lakehouse.queryinsights.exec_requests_history 
WHERE start_time >= DATEADD(MINUTE, -60, GETUTCDATE())

 


5 REPLIES
v-echaithra
Community Support

Hi @Jeanxyz ,

We'd like to follow up regarding your recent question. Kindly confirm whether the issue has been resolved, or whether further assistance is still required. We are available to support you and are committed to helping you reach a resolution.

Best Regards,
Chaithra E.

v-echaithra
Community Support

Hi @Jeanxyz ,

Thank you for reaching out to the Microsoft Community.

Currently, Microsoft Fabric does not support running a single DMV query that loops over all lakehouses and warehouses across different workspaces. There is no built-in or native "cross-workspace" DMV querying mechanism available. However, it is possible to build an automated, orchestrated solution using tools like Microsoft Fabric APIs, Azure Functions, or Data Factory pipelines to dynamically collect this information.

The process begins by retrieving a list of all workspaces. To do this, you'll need admin permissions and can use endpoints such as admin/workspaces from the Power BI REST API or Microsoft Graph API. Once you have the workspaces, you can loop through each one and identify the Fabric items within them. Specifically, you should look for items of type "Lakehouse" and "Warehouse" in each workspace. These can be found using Fabric APIs or Graph API beta endpoints.
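For illustration, here is a minimal Python sketch of that discovery step. It assumes you already hold an Entra access token with Fabric admin scope (token acquisition, paging, and error handling are omitted), and the exact response shapes and field names may differ between the admin and core endpoints; treat this as a sketch, not a definitive implementation:

import requests

BASE = "https://api.fabric.microsoft.com/v1"
access_token = "<entra-access-token>"  # assumption: acquired elsewhere, e.g. with msal
headers = {"Authorization": f"Bearer {access_token}"}

# List all workspaces in the tenant (admin endpoint, requires a Fabric admin role)
resp = requests.get(f"{BASE}/admin/workspaces", headers=headers).json()
workspaces = resp.get("workspaces", [])

# For each workspace, keep only lakehouses and warehouses
targets = []
for ws in workspaces:
    items = requests.get(f"{BASE}/workspaces/{ws['id']}/items", headers=headers).json()
    for item in items.get("value", []):
        if item.get("type") in ("Lakehouse", "Warehouse"):
            # field names vary between endpoints, so fall back from one to the other
            targets.append((ws.get("displayName") or ws.get("name"), item["displayName"]))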

After identifying the lakehouses and warehouses, you can run the appropriate DMV queries for each. This can be done using a Fabric Notebook, Data Pipeline, or Azure Function. For example, you might run a query like:

SELECT * FROM <lakehouse_or_warehouse>.queryinsights.exec_requests_history
WHERE start_time >= DATEADD(MINUTE, -60, GETUTCDATE())
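From a notebook, the per-item loop could look like the following sketch. It assumes pyodbc is available, that the items to query are already known from the discovery step, and that they live in the same workspace so three-part names resolve against one SQL analytics endpoint (the endpoint address below is a placeholder, not a real value; it is shown in each item's settings):

import pyodbc
import pandas as pd

sql_endpoint = "<workspace>.datawarehouse.fabric.microsoft.com"  # placeholder
items_to_query = ["stock_lakehouse", "sales_warehouse"]          # e.g. from the discovery step

# Connect once per workspace using Entra interactive authentication
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    f"Server={sql_endpoint};Authentication=ActiveDirectoryInteractive;"
)

frames = []
for item in items_to_query:
    sql = (f"SELECT * FROM [{item}].queryinsights.exec_requests_history "
           f"WHERE start_time >= DATEADD(MINUTE, -60, GETUTCDATE())")
    df = pd.read_sql(sql, conn)
    df["source_item"] = item  # tag each row with the item it came from
    frames.append(df)

history = pd.concat(frames, ignore_index=True)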

The results from each query should then be aggregated and stored in a central location, such as a designated lakehouse or warehouse (e.g., admin_monitoring_lakehouse). This centralized data can then be used for reporting or monitoring via Power BI dashboards or other Fabric tools.
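Inside a Fabric notebook, the final write can be a short step like the sketch below. It assumes the notebook's default lakehouse is the admin_monitoring_lakehouse mentioned above and that history is the combined pandas frame from the loop (the spark session is pre-created in Fabric notebooks):

# Convert the combined pandas frame to Spark and append it to a Delta table
spark_df = spark.createDataFrame(history)
spark_df.write.mode("append").saveAsTable("query_history")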

Jeanxyz
Power Participant

It looks like one can get a list of items in a Fabric workspace via the sempy package. I will try this in a notebook when I get some time this weekend and will post here if I get good news.

 

https://learn.microsoft.com/en-us/python/api/semantic-link-sempy/sempy.fabric?view=semantic-link-pyt...

Hi @Jeanxyz ,

 

I have a script. It will display all items in the respective workspaces. I use it often in customer projects and it helps enormously.

 

# 1) Install the Semantic Link package
%pip install --upgrade semantic-link-sempy -q

# 2) Import the libraries

import sempy.fabric as fabric
import pandas as pd
from IPython.display import display, HTML

def summarize_workspace_items(full_display: bool = True):
    
    # 3) Get Workspaces from the Tenant
    workspaces = fabric.list_workspaces()
    workspaces = workspaces.query("Type == 'Workspace'")  # keep regular workspaces, excluding the personal "My workspace"
    
    rows = []
    
    for ws_id, ws_name in zip(workspaces["Id"], workspaces["Name"]):
        items = fabric.list_items(workspace=ws_id)
        if items.empty:
            continue
        
        # Automatically determine all item types
        for item_type in items["Type"].unique():
            subset = items[items["Type"] == item_type].copy()
            if subset.empty:
                continue
            
            subset["workspace_name"] = ws_name
            subset["Item Type"] = item_type
            subset["item_id"] = subset["Id"]
            subset["item_name"] = subset.get("Name", subset.get("displayName", subset.get("Display Name")))
            
            rows.append(subset[["item_id", "item_name", "workspace_name", "Item Type"]])
    
    # 4) Combine all rows
    if rows:
        df = pd.concat(rows, ignore_index=True)
    else:
        df = pd.DataFrame(columns=["item_id", "item_name", "workspace_name", "Item Type"])
    
    # 5) Sort the DataFrame
    df = df.sort_values(["workspace_name", "item_name"]).reset_index(drop=True)
    
    # 6) Display the full table if requested
    if full_display:
        pd.set_option("display.max_rows", None)
        pd.set_option("display.max_columns", None)
        pd.set_option("display.width", 200)
        display(HTML(df.to_html(max_rows=None, max_cols=None)))
    
    return df

# Execute the function and get the DataFrame
df_items = summarize_workspace_items()
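Building on the returned DataFrame, a small follow-up sketch can narrow df_items down to just the lakehouses and warehouses that the queryinsights query from earlier in the thread applies to:

# Keep only the item types that expose queryinsights
dmv_targets = df_items[df_items["Item Type"].isin(["Lakehouse", "Warehouse"])]

for _, row in dmv_targets.iterrows():
    print(f"{row['workspace_name']}: {row['item_name']} ({row['Item Type']})")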

I hope that helps you.

 

Best regards

Jeanxyz
Power Participant

Thanks a lot, it works. I also managed to extract queries from warehouses/lakehouses across workspaces, thanks to the item extraction function.
