The ultimate Microsoft Fabric, Power BI, Azure AI, and SQL learning event: Join us in Stockholm, September 24-27, 2024.
Save €200 with code MSCUST on top of early bird pricing!
Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started
I have been using the 'Admin - Datasets GetDatasourcesAsAdmin' REST api (https://learn.microsoft.com/en-us/rest/api/power-bi/admin/datasets-get-datasources-as-admin) in our scripts for a while now and it has been working as expected, however, I have been getting error with Status code : 429. You have exceeded the amount of requests allowed in the current time frame and further requests will fail. Retry in xxxx seconds.
The error occurs after 30 odd requests.
Has anything changed recently on this api?
Solved! Go to Solution.
Hi all,
I got a reply from Microsoft about the issue:
We have confirmation about a new throttling limit on this specific API call to 50 calls per hour.
Product Group informs they are aware this new limit is affecting users and their actual implementations, for this reason they share this workaround in the meantime a solution is applied about the new throttling limit:
“While we work on refining the throttling limits for the GetDatasourcesAsAdmin API, we understand the importance of providing viable alternatives for our users. As a temporary workaround, we recommend leveraging our scanner APIs to access all data source information for your tenant. The scanner APIs offer a robust solution for retrieving datasets and related data source information.
For more information, please refer “-
https://learn.microsoft.com/en-us/rest/api/power-bi/admin/workspace-info-get-scan-result
I hope they align the final throttling limit with GetXXXUsersAsAdmin APIs (200 calls per hour) so that with have consistency with our scripts. Or that they implement an API that return all datasources of all semantic models of the tenant in a single call.
Using scanner API is more involved (asynchronous calls, complex Json result) requiring a longer implementation.
Best regards,
Frederick
Is the old limit back? Today I was able to get all datasources as before end of January (via Datasets GetDatasourcesAsAdmin) ....
Anyone has similar experience?
Is the old limit back? Today I was able to get all datasources as before end of January (via Datasets GetDatasourcesAsAdmin) ....
Anyone has similar experience?
Yes, I think the old limit is back. I was able to run it across multiple calls and get the data without any issues
Yes same here, able to now run previous code and return all datasources.
I confirm that the old, less restricted limit is back! Thanks all for having reported the issue and to Microsoft to revert to the previous (and documented) limit.
Frederick
Hi all,
I got a reply from Microsoft about the issue:
We have confirmation about a new throttling limit on this specific API call to 50 calls per hour.
Product Group informs they are aware this new limit is affecting users and their actual implementations, for this reason they share this workaround in the meantime a solution is applied about the new throttling limit:
“While we work on refining the throttling limits for the GetDatasourcesAsAdmin API, we understand the importance of providing viable alternatives for our users. As a temporary workaround, we recommend leveraging our scanner APIs to access all data source information for your tenant. The scanner APIs offer a robust solution for retrieving datasets and related data source information.
For more information, please refer “-
https://learn.microsoft.com/en-us/rest/api/power-bi/admin/workspace-info-get-scan-result
I hope they align the final throttling limit with GetXXXUsersAsAdmin APIs (200 calls per hour) so that with have consistency with our scripts. Or that they implement an API that return all datasources of all semantic models of the tenant in a single call.
Using scanner API is more involved (asynchronous calls, complex Json result) requiring a longer implementation.
Best regards,
Frederick
@samsonfrb Thank you for sharing the responce from MS Support.
I have tried Admin - WorkspaceInfo GetScanResult, however it seems that part what I am the most interested in i.e. regarding datasources is missing in my response.
Missing part in comparison to sample response:
], "datasourceInstances": [ { "datasourceType": "XXX", "connectionDetails": { "server": "XXX", "database": "XXX" }, "datasourceId": "XXX", "gatewayId": "XXX" } ],
Have you tried it on your own? Are you getting info regarding datasources (similar as above)?
Regards,
Karo
Hi,
Did you set &datasourceDetails=true whey you called PostWorkspaceInfo?
I didn't have the chance to try these APIs yet but I saw that you must request this information in order to get it.
Frederick
@samsonfrb Good hint, thanks for it! It seems that I need to set up both lineage=True & datasourceDetails=True to get the info I am looking for.
Thanks @samsonfrb
Regards
Amine Jerbi
If I answered your question, please mark this thread as accepted
and you can follow me on
My Website, LinkedIn and Facebook
Same here, working until end of Jan 2024 now gives response code 429 after processing 25 datasets.
I dont understand what the limit is based on, I looped in the api in python with a 60 mins wait timer after the 429 error occurs. The number is not consistent. Sometimes its 28, sometimes 41, ranges between 25-45.
Hi mrraut,
We have the same problem. It started on January 28 and it now seems to allow only 50 calls per hour. It was working fine for a few months before HTTP Error 409 started to appear. We opened a ticket with Microsoft, I hope they resolve it soon. Or that they add a new API call returning all datasources from all semantic models of the tenant in a single call.
I'll let you know if they come back to me with a resolution.
Frederick
Let me know what MS says. I ended up adding a 60 min sleep timer and took 2 days for the api/python code to fetch us the results for all the datasets.
Sure, here is the python script I have been running to get the information needed. This was working correctly till a few weeks ago, but I have been seeing the 429 error since yesterday. I run this every 2-3 months as needed.
######## Loading the required libraries
# import adal
import requests
# import os
import pandas as pd
from azure.identity import ClientSecretCredential
from time import sleep
tenant = 'xxx'
client = 'xxx'
client_secret = 'xxx'
api = 'https://analysis.windows.net/powerbi/api/.default'
# Generates the access token for the Service Principal
auth = ClientSecretCredential(authority = 'https://login.microsoftonline.com/',
tenant_id = tenant,
client_id = client,
client_secret = client_secret)
access_token = auth.get_token(api)
access_token = access_token.token
headers = {'Authorization': f'Bearer {access_token}'}
workspacetable_id = []
workspacetable_name = []
workspace = requests.get('https://api.powerbi.com/v1.0/myorg/admin/groups?%24top=5000&%24filter=state%20eq%20%27Active%27%20and%20type%20eq%20%27Workspace%27&%24expand=users', headers=headers)
if workspace.status_code == 200:
for workspace_info in workspace.json()['value']:
workspacetable_id.append(workspace_info['id'])
workspacetable_name.append(workspace_info['name'])
Workspace_Info = pd.DataFrame({
"Workspace ID":workspacetable_id,
"Workspace Name":workspacetable_name
})
dataset_id = []
dataset_name = []
dataset_url = []
dataset_owner = []
dataset_workspaceid = []
dataset_connection_details = []
dataset_datasource_id = []
dataset_gateway_id = []
datasets = requests.get('https://api.powerbi.com/v1.0/myorg/admin/datasets', headers=headers)
for datasets_info in datasets.json()['value']:
datasouces = requests.get('https://api.powerbi.com/v1.0/myorg/admin/datasets/'+datasets_info['id']+'/datasources', headers=headers)
#Deleted dataset ids are sometimes retained by PowerBI service, when you try to get datascource information with the above api, it errors out. The below if statement is to mitigate that situation
if datasouces.status_code != 200:
continue
for datasources_info in datasouces.json()['value']:
try:
dataset_id.append(datasets_info['id'])
except:
dataset_id.append("")
try:
dataset_name.append(datasets_info['name'])
except:
dataset_name.append("")
try:
dataset_url.append(datasets_info['webUrl'])
except:
dataset_url.append("")
try:
dataset_owner.append(datasets_info['configuredBy'])
except:
dataset_owner.append("")
try:
dataset_workspaceid.append(datasets_info['workspaceId'])
except:
dataset_workspaceid.append("")
try:
dataset_connection_details.append(datasources_info['connectionDetails'])
except:
dataset_connection_details.append("")
try:
dataset_datasource_id.append(datasources_info['datasourceId'])
except:
dataset_datasource_id.append("")
try:
dataset_gateway_id.append(datasources_info['gatewayId'])
except:
dataset_gateway_id.append("")
Dataset_Info = pd.DataFrame({
"Dataset ID":dataset_id,
"Dataset Name": dataset_name,
"Dataset URL": dataset_url,
"Dataset Owner": dataset_owner,
"Workspace ID": dataset_workspaceid,
"Connection Details":dataset_connection_details,
"Datasource ID":dataset_datasource_id,
"Gateway ID":dataset_gateway_id
})
Dataset_Info_Complete = pd.merge(Dataset_Info, Workspace_Info, on='Workspace ID', how='left')
Dataset = Dataset_Info_Complete[['Workspace Name','Dataset Name','Dataset URL','Dataset Owner','Connection Details','Dataset ID','Workspace ID','Datasource ID','Gateway ID']]
Dataset.to_excel('E:/Dataset Info.xlsx', sheet_name='PBI Reports',index=False)
Never mind I was mistaken, that was for workspace API. Sorry.
However did you know that Get Dataset API can't go over 200 requests?
Regards
Amine Jerbi
If I answered your question, please mark this thread as accepted
and you can follow me on
My Website, LinkedIn and Facebook
I am running the get dataset api only once. The api I am running multiple times is this (below) and no limitaions have been documented yet.
https://learn.microsoft.com/en-us/rest/api/power-bi/admin/datasets-get-datasources-as-admin
That is getting the workspace information. The filtered result is <200 workspaces. There is no issue with that api and does not time out. Neither any other api in the same session. The issue is with the
datasouces = requests.get('https://api.powerbi.com/v1.0/myorg/admin/datasets/'+datasets_info['id']+'/datasources', headers=headers)
part of the code. This times out after 30 odd requests. We have about 1250 datasets and the api was successfuly able to fetch the datasource information for all those datasets till last month
Hi @mrraut
Nothing has changed in my knowledge and it shouldn't happen. maybe the process when you run the script is taking too long and causing the problem
Can you show me the Script please?
Regards
Amine Jerbi
If I answered your question, please mark this thread as accepted
and you can follow me on
My Website, LinkedIn and Facebook
Sure, here is the python script I have been running to get the information needed. This was working correctly till a few weeks ago, but I have been seeing the 429 error since yesterday. I run this every 2-3 months as needed.
######## Loading the required libraries
# import adal
import requests
# import os
import pandas as pd
from azure.identity import ClientSecretCredential
from time import sleep
tenant = 'xxx'
client = 'xxx'
client_secret = 'xxx'
api = 'https://analysis.windows.net/powerbi/api/.default'
# Generates the access token for the Service Principal
auth = ClientSecretCredential(authority = 'https://login.microsoftonline.com/',
tenant_id = tenant,
client_id = client,
client_secret = client_secret)
access_token = auth.get_token(api)
access_token = access_token.token
headers = {'Authorization': f'Bearer {access_token}'}
workspacetable_id = []
workspacetable_name = []
workspace = requests.get('https://api.powerbi.com/v1.0/myorg/admin/groups?%24top=5000&%24filter=state%20eq%20%27Active%27%20and%20type%20eq%20%27Workspace%27&%24expand=users', headers=headers)
if workspace.status_code == 200:
for workspace_info in workspace.json()['value']:
workspacetable_id.append(workspace_info['id'])
workspacetable_name.append(workspace_info['name'])
Workspace_Info = pd.DataFrame({
"Workspace ID":workspacetable_id,
"Workspace Name":workspacetable_name
})
dataset_id = []
dataset_name = []
dataset_url = []
dataset_owner = []
dataset_workspaceid = []
dataset_connection_details = []
dataset_datasource_id = []
dataset_gateway_id = []
datasets = requests.get('https://api.powerbi.com/v1.0/myorg/admin/datasets', headers=headers)
for datasets_info in datasets.json()['value']:
datasouces = requests.get('https://api.powerbi.com/v1.0/myorg/admin/datasets/'+datasets_info['id']+'/datasources', headers=headers)
#Deleted dataset ids are sometimes retained by PowerBI service, when you try to get datascource information with the above api, it errors out. The below if statement is to mitigate that situation
if datasouces.status_code != 200:
continue
for datasources_info in datasouces.json()['value']:
try:
dataset_id.append(datasets_info['id'])
except:
dataset_id.append("")
try:
dataset_name.append(datasets_info['name'])
except:
dataset_name.append("")
try:
dataset_url.append(datasets_info['webUrl'])
except:
dataset_url.append("")
try:
dataset_owner.append(datasets_info['configuredBy'])
except:
dataset_owner.append("")
try:
dataset_workspaceid.append(datasets_info['workspaceId'])
except:
dataset_workspaceid.append("")
try:
dataset_connection_details.append(datasources_info['connectionDetails'])
except:
dataset_connection_details.append("")
try:
dataset_datasource_id.append(datasources_info['datasourceId'])
except:
dataset_datasource_id.append("")
try:
dataset_gateway_id.append(datasources_info['gatewayId'])
except:
dataset_gateway_id.append("")
Dataset_Info = pd.DataFrame({
"Dataset ID":dataset_id,
"Dataset Name": dataset_name,
"Dataset URL": dataset_url,
"Dataset Owner": dataset_owner,
"Workspace ID": dataset_workspaceid,
"Connection Details":dataset_connection_details,
"Datasource ID":dataset_datasource_id,
"Gateway ID":dataset_gateway_id
})
Dataset_Info_Complete = pd.merge(Dataset_Info, Workspace_Info, on='Workspace ID', how='left')
Dataset = Dataset_Info_Complete[['Workspace Name','Dataset Name','Dataset URL','Dataset Owner','Connection Details','Dataset ID','Workspace ID','Datasource ID','Gateway ID']]
Dataset.to_excel('E:/Dataset Info.xlsx', sheet_name='PBI Reports',index=False)
Ok, I have met this problem before but using Powershell scripts. Deleted Datasets in service can take a while to completly disappear and that could cause the problem. I saw in your script the condition when Dataset is deleted.... but I am not sure that the process in running the script takes all in effect or not.
Anyway here is my script if it can help you. Didn't use your condition and it works just fine
Get-PowerBIDataset -Scope Organization |
Foreach {
$dsID = $_.Id;
$dsName = $_.Name;
Get-PowerBIDatasource -DatasetId $dsID -Scope Organization |
Foreach {
[PSCustoMObject]@{
'WSID' = $wsId;
'WSName' = $wsName;
'DSID' = $dsID;
'DSName' = $dsName;
'SrcId' = $_.DatasourceId;
'SrcName' = $_.Name;
'DatasourceType' = $_.DatasourceType;
'ConnectionString' = $_.ConnectionString;
'ConnectionDetails.server' = $_.ConnectionDetails.server;
'ConnectionDetails.database' = $_.ConnectionDetails.database;
'ConnectionDetails.Url' = $_.ConnectionDetails.Url;
'GatewayId' = $_.GatewayId
}
}
}| ConvertTo-Json | Out-File -FilePath "D:\Logs\Inventaire\Datasources\PowerBI_Datasources_$(Get-Date -Format yyyyMMdd).json" -Encoding utf8 -Force
By the way, I use App Registration to connect so I don't go through the access token limitations.
Regards
Amine Jerbi
If I answered your question, please mark this thread as accepted
and you can follow me on
My Website, LinkedIn and Facebook
Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.
Check out the August 2024 Power BI update to learn about new features.
User | Count |
---|---|
51 | |
26 | |
14 | |
14 | |
12 |
User | Count |
---|---|
107 | |
38 | |
25 | |
23 | |
19 |