AMD0791
Advocate III

Scanner Results missing detailed metadata for datasets

I have a PowerShell script that I run manually from time to time to pull results from the Scanner API and save them to JSON files. I then feed those JSON files to a Dataflow and load the results into Fabric Lakehouse tables.

 

I last ran the script in February for 132 workspaces (the script splits them into smaller lists to send to the API) and ended up with 2 files of 53 MB and 14 MB.
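As a minimal sketch of the batching step described above (written in Python, not the PowerShell script from the post, which isn't shown): the function name `chunk_workspace_ids` and the default batch size of 100 are assumptions for illustration. The batch size should be checked against the limit in the current Scanner API reference.

```python
from typing import Iterator, List, Sequence


def chunk_workspace_ids(
    workspace_ids: Sequence[str], batch_size: int = 100
) -> Iterator[List[str]]:
    """Yield successive batches of workspace IDs, each at most batch_size long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(workspace_ids), batch_size):
        yield list(workspace_ids[start:start + batch_size])


# Placeholder IDs for illustration; real calls would use workspace GUIDs.
ids = [f"workspace-{n}" for n in range(132)]
batches = list(chunk_workspace_ids(ids))
print([len(batch) for batch in batches])  # [100, 32]
```

Each batch would then be sent as the body of one getInfo request, producing one scan result per batch.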

I ran it today for 140 workspaces and ended up with 2 files of 20 MB and 3 MB.

It looks like all of the detailed metadata for the dataset tables is missing from the new files.

 

This is the request the PowerShell script sends:

https://api.powerbi.com/v1.0/myorg/admin/workspaces/getInfo?lineage=True&datasourceDetails=True&data...
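The URL above is cut off, so it isn't visible which flags follow `datasourceDetails`. The sketch below does not reconstruct it. It only shows, in Python, one way to build a getInfo URL with every metadata flag spelled out so the flags are easy to verify. The parameter names `datasetSchema`, `datasetExpressions` and `getArtifactUsers` are taken from the Scanner API documentation as I understand it and should be checked against the current reference. The helper name `build_get_info_url` is hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.powerbi.com/v1.0/myorg/admin/workspaces/getInfo"


def build_get_info_url(
    lineage: bool = True,
    datasource_details: bool = True,
    dataset_schema: bool = True,
    dataset_expressions: bool = True,
    get_artifact_users: bool = False,
) -> str:
    """Return the getInfo URL with each optional flag written out explicitly."""
    # Assumption: these are the query parameter names the API expects.
    params = {
        "lineage": lineage,
        "datasourceDetails": datasource_details,
        "datasetSchema": dataset_schema,
        "datasetExpressions": dataset_expressions,
        "getArtifactUsers": get_artifact_users,
    }
    # urlencode converts each bool with str(), giving "True" / "False".
    return f"{BASE_URL}?{urlencode(params)}"


print(build_get_info_url())
```

If table and column metadata is missing from every dataset, a dropped or misspelled `datasetSchema` flag is worth ruling out first. If it is missing from only some datasets, the flags are less likely to be the cause.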

 

The tenant settings mentioned in this article are enabled for a security group and I am a member of that security group.

Set up metadata scanning in an organization - Microsoft Fabric | Microsoft Learn

 

3 REPLIES
Acroustique
Advocate I

Any luck finding a solution, @AMD0791?

I have a similar issue where the scan only returns the datasourceUsages (and datasourceInstanceId) for a limited number of datasets, without rhyme or reason...

AMD0791
Advocate III

After a little investigation, it looks like the latest scan returned detailed metadata for some datasets but not for others in the same workspace.
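One way to see exactly which datasets came back without table metadata is to walk the saved scan JSON. This is a minimal Python sketch, assuming the scan result has a top-level `workspaces` array whose entries carry a `datasets` array, and that each dataset carries `tables` when schema metadata was returned. The function name and the sample data are made up for illustration.

```python
from typing import Dict, List


def datasets_missing_tables(scan_result: dict) -> Dict[str, List[str]]:
    """Map workspace name -> names of datasets with no (or empty) 'tables'."""
    missing: Dict[str, List[str]] = {}
    for workspace in scan_result.get("workspaces", []):
        names = [
            ds.get("name", ds.get("id", "<unnamed>"))
            for ds in workspace.get("datasets", [])
            if not ds.get("tables")  # key absent, None, or empty list
        ]
        if names:
            key = workspace.get("name", workspace.get("id", "<unnamed>"))
            missing[key] = names
    return missing


# Fabricated sample shaped like the assumed scan result, for illustration only.
sample = {
    "workspaces": [
        {
            "name": "Sales",
            "datasets": [
                {"name": "Orders", "tables": [{"name": "FactOrders"}]},
                {"name": "Returns"},
                {"name": "Budget", "tables": []},
            ],
        },
        {"name": "Empty", "datasets": []},
    ]
}
print(datasets_missing_tables(sample))  # {'Sales': ['Returns', 'Budget']}
```

For a real file, load it with `json.load` and pass the resulting dict in. Running the same check on the February files and on today's files would show which datasets lost their table metadata between the two scans.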

 

I'm wondering whether I'm getting caught by the limitations spelled out in this article, but that doesn't make sense, because the metadata is blank on datasets that have been refreshing daily: Run metadata scanning - Microsoft Fabric | Microsoft Learn

From the article:

  • Semantic models that haven't been refreshed or republished will be returned in API responses but without their subartifact information and expressions. For example, semantic model name and lineage are included in the response, but not the semantic model's table and column names.
  • Semantic models containing only DirectQuery tables will return subartifact metadata only if some sort of action has been taken on the semantic model, such as someone building a report on top of it, someone viewing a report based on it, etc.

 

How recently does a dataset need to have been republished for its metadata to be returned? This limitation seems like it would make it very difficult to get an initial full scan of metadata if every dataset has to have been republished very recently. And in this case, I'm getting metadata for datasets last published a year ago, but not for a dataset republished in March.

The workspaces I'm scanning are Premium workspaces, so they shouldn't be subject to the size limits.

 

 



