Johann_32
Advocate II

Power BI Dataflow Monitoring

Hello!

 

In our Fabric capacities, we supply system-based data (from sources such as databases) only via Power BI Dataflows. This means that consuming Power BI dataflows is essential in our company when working with Power BI.

 

For example, if a dataflow refresh fails, it is not enough for us to receive the automatic failure email; instead, we need overall monitoring. When a refresh fails, we may be responsible for fixing the root cause as quickly as possible so that the newest data is supplied.

 

Because of this importance, we are having a hard time building proper dataflow monitoring that gives us the following required information:

 

  • Which Power BI reports/semantic models consume which dataflows (currently the only way is to check the lineage in the Power BI service or Purview one by one)

  • How much time each refresh consumes (currently available only in the Refresh Summary of the Power BI admin portal)

  • How many rows are processed per refresh (the only way is to manually check the refresh history of each refresh in the Power BI service)

  • An overview of all existing dataflows and their status (whether a dataflow refresh has failed or succeeded)
    --> There is a solution available from Microsoft, but not for Power BI dataflows (Link)

 

We are trying to gather all of the above-mentioned information and show it in one consolidated report / evaluation.

 

We know the REST API exists ("Get Dataflow Transactions", linked below), but the processed rows and other required information seem to be missing there as well.

Dataflows - Get Dataflow Transactions - REST API (Power BI Power BI REST APIs) | Microsoft Learn

 

Is there anybody with a similar case who could share best practices?

 

Thanks in advance & best regards,

Johann


1 ACCEPTED SOLUTION
v-hashadapu
Community Support

Hi @Johann_32 , Thank you for reaching out to the Microsoft Community Forum.

 

You need to automate the tracking of refresh status, refresh duration, processed row counts and downstream dependencies. There is no single place in Power BI or Fabric where you can do all of this, and APIs like Get Dataflow Transactions are incomplete. However, you can build a complete solution using the Power BI REST APIs, Power Query (M) workarounds, a Fabric Lakehouse and Power Automate.

 

Start by retrieving metadata for all dataflows across your tenant. If you have admin rights, use the GET /admin/dataflows endpoint to fetch all dataflows at once. If not, you’ll need to iterate through each workspace using GET /groups and GET /groups/{groupId}/dataflows. Store each dataflow’s ID, name and workspace ID in a Delta table in a Fabric Lakehouse. This gives you a scalable, persistent structure to hold your monitoring data.
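
A minimal sketch of that inventory step, assuming a Fabric notebook running Python with the requests library and an already acquired Azure AD access token (the PBI_ACCESS_TOKEN environment variable, the function names and the column names are illustrative, not an official sample):

import os
import requests

BASE = "https://api.powerbi.com/v1.0/myorg"
HEADERS = {"Authorization": f"Bearer {os.environ['PBI_ACCESS_TOKEN']}"}

def list_dataflows_as_admin():
    # Tenant-wide inventory; requires Power BI admin permissions.
    resp = requests.get(f"{BASE}/admin/dataflows", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()["value"]

def list_dataflows_per_workspace():
    # Fallback without admin rights: walk the workspaces you can access.
    rows = []
    groups = requests.get(f"{BASE}/groups", headers=HEADERS).json()["value"]
    for g in groups:
        flows = requests.get(f"{BASE}/groups/{g['id']}/dataflows",
                             headers=HEADERS).json()["value"]
        for f in flows:
            rows.append({"workspaceId": g["id"], "workspaceName": g["name"],
                         "dataflowId": f["objectId"], "dataflowName": f["name"]})
    return rows

In a Fabric notebook, the resulting rows can then be written to the Lakehouse, for example with spark.createDataFrame(rows).write.mode("overwrite").saveAsTable("dataflow_inventory") (the table name is just an example).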

 

For refresh tracking, call GET /groups/{groupId}/dataflows/{dataflowId}/transactions (the "Get Dataflow Transactions" endpoint) to get the refresh history. This API returns the status, start time and end time of each refresh. You can calculate the refresh duration by subtracting the start time from the end time. Store this data in a separate Delta table and append each new refresh record. This lets you track long-running dataflows and identify patterns in failures over time.
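
A sketch of that refresh-history step under the same assumptions (reusing BASE and HEADERS from the previous snippet); the transaction field names (id, status, startTime, endTime) follow the documented Get Dataflow Transactions response, so verify them against your tenant:

from datetime import datetime

def _parse_ts(ts: str) -> datetime:
    # The API returns UTC ISO-8601 timestamps with a trailing "Z"; drop any
    # fractional seconds so strptime handles them on every Python 3 version.
    return datetime.strptime(ts.split(".")[0].rstrip("Z"), "%Y-%m-%dT%H:%M:%S")

def dataflow_refresh_history(group_id: str, dataflow_id: str):
    # "Get Dataflow Transactions" for one dataflow.
    url = f"{BASE}/groups/{group_id}/dataflows/{dataflow_id}/transactions"
    resp = requests.get(url, headers=HEADERS)
    resp.raise_for_status()
    rows = []
    for t in resp.json()["value"]:
        end = _parse_ts(t["endTime"]) if t.get("endTime") else None
        rows.append({
            "dataflowId": dataflow_id,
            "transactionId": t["id"],
            "status": t["status"],            # e.g. Success / Failed
            "startTime": t["startTime"],
            "durationSeconds": (end - _parse_ts(t["startTime"])).total_seconds()
                               if end else None,
        })
    return rows

Appending the output of each run to a refresh-log Delta table gives you the duration and failure history over time.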

 

Row counts aren’t exposed in any Power BI API, so the only accurate way to track them is by modifying your dataflow’s Power Query (M) logic. Inside the query, add a small metadata table that uses Table.RowCount to return the number of rows processed. For example:

 

let
    // Existing query logic of the dataflow entity (placeholder kept from the original)
    Source = ...,
    // Number of rows produced by this refresh
    RowCount = Table.RowCount(Source),
    // One-row "Metric / Value" table that can be loaded as a separate entity
    Stats = #table({"Metric", "Value"}, {{"RowCount", RowCount}})
in
    Stats

 

You can output this table alongside your data and read it into another dataflow or Power BI dataset. This is the only reliable, refresh-specific way to track row counts, and you should apply it to high-priority dataflows first.

 

To identify which datasets and reports use each dataflow, use the Scanner (metadata scanning) API. This is the only method that provides a complete and accurate lineage map. Use POST /admin/workspaces/getInfo (with lineage=true) to trigger a scan, poll GET /admin/workspaces/scanStatus/{scanId}, and call GET /admin/workspaces/scanResult/{scanId} to get the results. This links dataflow IDs to dataset IDs and report IDs. Store this lineage mapping in another table in your Lakehouse. If you don’t have Premium or admin access, use the upstreamDataflows property from dataset metadata and cross-reference with reports, though this is less complete.
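
A sketch of that lineage step under the same assumptions; the query parameters and property names (lineage, upstreamDataflows, targetDataflowId) follow the documented Scanner API payloads, but treat them as something to verify in your tenant:

import time

def dataflow_lineage(workspace_ids):
    # Trigger a metadata scan for the given workspaces (max. 100 per call).
    scan = requests.post(
        f"{BASE}/admin/workspaces/getInfo?lineage=true&datasourceDetails=true",
        headers=HEADERS, json={"workspaces": workspace_ids}).json()
    scan_id = scan["id"]

    # Poll until the scan finishes, then download the result.
    while requests.get(f"{BASE}/admin/workspaces/scanStatus/{scan_id}",
                       headers=HEADERS).json()["status"] != "Succeeded":
        time.sleep(5)
    result = requests.get(f"{BASE}/admin/workspaces/scanResult/{scan_id}",
                          headers=HEADERS).json()

    # Each dataset lists its upstream dataflows; invert that into
    # dataflow -> dataset rows for the lineage table.
    lineage = []
    for ws in result.get("workspaces", []):
        for ds in ws.get("datasets", []):
            for up in ds.get("upstreamDataflows", []):
                lineage.append({"dataflowId": up["targetDataflowId"],
                                "datasetId": ds["id"],
                                "datasetName": ds["name"],
                                "workspaceId": ws["id"]})
    return lineage

Reports can then be joined to this mapping through the datasetId that each report entry in the same scan result carries.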

 

With metadata, refresh logs, row counts and lineage stored in Delta tables, you can now build a Power BI report that connects directly to your Lakehouse. Use a matrix or table to show each dataflow with its latest status, duration, rows processed and the names of linked datasets and reports. Include card visuals for failure counts, line charts for duration over time and filters for workspace and dataflow name. Highlight failed refreshes using conditional formatting.

 

Finally, use Power Automate to send alerts when a dataflow refresh fails. The “When a dataflow refresh completes” trigger lets you check the status, and if it’s “Failed,” you can send an email or Teams message with the failure details and downstream impact. This ensures your team can respond immediately.

 

If this helped solve the issue, please consider marking it “Accept as Solution” and giving a ‘Kudos’ so others with similar queries may find it more easily. If not, please share the details, always happy to help.
Thank you.


2 REPLIES 2
Johann_32
Advocate II

Thank you for the detailed reply and the ideas for solving our challenges here. We will try to implement it like this, step by step.

