anmolmalviya05

How to Discover and Document Data Sources in Power BI Semantic Models

While working with a mature Power BI semantic model, we faced a simple yet critical question:

 

Where is the data for each table actually coming from?

 

Over time, the model had grown significantly and started pulling data from multiple systems such as SQL databases, APIs, SharePoint, flat files, and third-party tools. Although Power BI allows you to inspect each table’s data source individually through Power Query, there was no single consolidated view showing all tables and their underlying data sources in one place.

 

This became a challenge during audits, troubleshooting, and knowledge transfer. Opening each query in Power Query one by one was time-consuming and error-prone, especially under tight project timelines.

 

To solve this problem, we built a simple automated approach that extracts the data source information for every table directly from the semantic model.

 

In this article, I will walk through the concept and implementation.

 

Introduction

In modern Power BI and Microsoft Fabric environments, semantic models often become complex and integrate data from multiple systems such as:

  • SQL databases

  • APIs

  • SharePoint

  • Flat files

  • Third-party tools like Solver / BI360

Understanding where each table originates is extremely important for:

  • Data governance

  • Model documentation

  • Troubleshooting issues

  • Client handovers and knowledge transfer

However, Power BI does not provide a built-in consolidated view that maps tables to their data sources.

 

Business Problem

When working with large Power BI models, identifying the data source of each table becomes difficult because:

  • Sources must be checked table by table in Power Query

  • There is no centralized source documentation

  • Manual inspection is slow and error-prone

  • It creates challenges during audits, troubleshooting, and project transitions

A scalable method was needed to automatically extract and document data sources across the entire semantic model.

 

Conceptual Approach

Each imported table in a Power BI semantic model contains a partition that stores the Power Query (M) expression used to load the data.

Inside these M expressions, Power BI connectors are called using functions such as:

  • Sql.Database

  • Web.Contents

  • SharePoint.Contents

  • BI360.Contents

By parsing these M expressions as text, we can programmatically extract key details such as:

  • Table Name

  • Connector Type

  • Source Location

This information can then be exported into a structured format such as a CSV file for documentation and auditing purposes.
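To make the parsing idea concrete before moving to Tabular Editor, here is a minimal Python sketch of the same text-based extraction, run against a few hypothetical M expressions (the table names and sources below are made up for illustration):

```python
import re

# Sample M expressions, like those stored in semantic-model partitions
# (table names and sources here are hypothetical).
partitions = {
    "Sales": 'let Source = Sql.Database("myserver.database.windows.net", "SalesDB") in Source',
    "Docs": 'let Source = SharePoint.Contents("https://contoso.sharepoint.com/sites/bi") in Source',
    "Budget": "let Source = #table({}, {}) in Source",
}

# One pattern covers any connector call of the form Namespace.Function("firstArg", ...)
pattern = re.compile(r'([A-Za-z0-9_]+)\.([A-Za-z0-9_]+)\("([^"]+)"')

rows = []
for table, expr in partitions.items():
    m = pattern.search(expr)
    if m:
        rows.append((table, f"{m.group(1)}.{m.group(2)}", m.group(3)))
    else:
        # Custom or complex M that the simple pattern cannot classify
        rows.append((table, "Unknown / Complex M", "See M Expression"))

for row in rows:
    print(",".join(row))
```

The same single pattern is connector-agnostic: it captures `Sql.Database`, `SharePoint.Contents`, or any other `Namespace.Function("...")` call without listing connectors explicitly.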

 

Step-by-Step Implementation

 

Step 1: Open Tabular Editor

Open your Power BI report and launch Tabular Editor using the External Tools menu.

Steps:

  1. Open the Power BI report in Power BI Desktop

  2. Go to External Tools

  3. Click Tabular Editor

This connects Tabular Editor directly to the Power BI semantic model metadata.

 

Step 2: Open Advanced Scripting

In Tabular Editor, navigate to:

File → Advanced Scripting

The Advanced Scripting window allows you to run C# scripts against the model metadata, enabling automation tasks such as extracting table properties, measures, and partition information.

 

Step 3: Run the Connector-Agnostic Script

Paste the following script into the Advanced Scripting window and adjust the output file path as required.

 

using System.Text.RegularExpressions;

var result = "TableName,Connector,Source\n";

foreach (var t in Model.Tables)
{
    // Calculated tables have no M partition expression
    string connector = "Calculated";
    string source = "Calculated";

    if (t.Partitions.Count > 0 && !string.IsNullOrEmpty(t.Partitions[0].Expression))
    {
        var expr = t.Partitions[0].Expression;

        // Match any connector call of the form Namespace.Function("firstArgument", ...)
        var match = Regex.Match(
            expr,
            @"([A-Za-z0-9_]+)\.([A-Za-z0-9_]+)\(""([^""]+)""",
            RegexOptions.IgnoreCase
        );

        if (match.Success)
        {
            connector = $"{match.Groups[1].Value}.{match.Groups[2].Value}";
            source = match.Groups[3].Value;
        }
        else
        {
            connector = "Unknown / Complex M";
            source = "See M Expression";
        }
    }

    result += $"{t.Name},{connector},{source}\n";
}

// Adjust the output path before running
System.IO.File.WriteAllText(
    @"C:\Users\YourUser\Downloads\PBIX_Table_Sources.csv",
    result
);

 

This script performs the following actions:

  • Iterates through all tables in the semantic model

  • Reads the Power Query expression from each table partition

  • Uses Regex parsing to detect connector functions

  • Extracts the connector name and source location

  • Writes the results to a CSV file
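For a model like the one described above, the generated CSV might look like this (table names and sources here are hypothetical):

```
TableName,Connector,Source
Sales,Sql.Database,myserver.database.windows.net
Docs,SharePoint.Contents,https://contoso.sharepoint.com/sites/bi
Budget,Unknown / Complex M,See M Expression
```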

Step 4: Review the Output

After running the script:

  1. Open the generated CSV file

  2. Load it into Excel

  3. Convert the data into a table using Ctrl + T

  4. Use filters to review entries marked as Unknown / Complex M

These rows usually contain custom M logic or complex transformations that require manual inspection.
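If you prefer to triage the output programmatically rather than filtering in Excel, a small Python sketch can pull out the rows needing manual review (the CSV content below is a hypothetical example of the script's output):

```python
import csv
import io

# Hypothetical contents of the exported PBIX_Table_Sources.csv;
# in practice, open the file produced by the Tabular Editor script instead.
csv_text = """TableName,Connector,Source
Sales,Sql.Database,myserver.database.windows.net
Docs,SharePoint.Contents,https://contoso.sharepoint.com/sites/bi
Budget,Unknown / Complex M,See M Expression
"""

reader = csv.DictReader(io.StringIO(csv_text))
needs_review = [row["TableName"] for row in reader
                if row["Connector"] == "Unknown / Complex M"]

print("Tables needing manual inspection:", needs_review)
```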

 

Business Value

This approach provides several practical benefits:

  • Improved governance by documenting data sources automatically

  • Faster audits with a centralized view of table origins

  • Reduced onboarding time for new developers

  • Better transparency across complex Power BI models

  • Repeatable automation that can be reused across projects

Instead of manually checking each query, teams can generate a complete data source inventory in seconds.

 

Conclusion

Large Power BI semantic models often hide an important piece of information: where the data actually comes from.

With a small automation script using Tabular Editor and model metadata, we can quickly generate comprehensive documentation of table data sources.

This approach eliminates a major blind spot in Power BI models and provides a scalable way to maintain transparency, governance, and documentation across enterprise BI environments.


Let's Connect on LinkedIn


Subscribe to my YouTube channel for Microsoft Fabric and Power BI updates.