Fabric is Generally Available. Browse Fabric Presentations. Work towards your Fabric certification with the Cloud Skills Challenge.
Hi all
I´m a newbie to spark. Hopefully this is not too silly question.
I´m trying to call the Power BI admin API endpoint PostWorkspaceInfo.
The endpoint needs a list of modified workspaces as body. I have created all the logic to get a list of the modified workspace guids and I have put them into batches so I can call the PostWorkspaceInfo endpoint to ask for the metadata. My problem is that the list over modified workspaces is coming from a temporary view and when I try to pass it to the json part of the request.post it fails with BadRequest. I have examined the content of the column and it´s exactly as it should be. If I copy it and paste it straight into the request.post the call works. For some reason the text I´m inputting from the view is not read correctly. I´m guessing that I need to somehow make the request.post parse it as JSON but I have no idea how.
For this question I have limited the number of workspaces to 3 and sanitized the guids. Normally I would have hundreds/thousands of workspaces in batches of 100 hence the for loop below.
The column BodyExpression from below returns this: {"workspaces": ["034c87da-bbbb-4df8-becc-555e5916d6fa","043654b6-7868-aaaa-bdad-b1cf9ad95f4c","073c7513-dddd-40c3-bb6e-48d0f0b00b42"]}
My code below runs but the API returns error (see below code)
#Import necessary libraries
import msal
import requests
import json
import pandas as pd
from datetime import date, timedelta, time
#Set Client ID and Secret for Service Principal
client_id = ""
client_secret = ""
authority_url = "https://login.microsoftonline.com/northinsights.com"
scope = ["https://analysis.windows.net/powerbi/api/.default"]
activityDate = date.today()
#Use MSAL to grab token
app = msal.ConfidentialClientApplication(client_id, authority=authority_url, client_credential=client_secret)
result = app.acquire_token_for_client(scopes=scope)
#Set URL for changed workspaces
url = "https://api.powerbi.com/v1.0/myorg/admin/workspaces/modified"
#Get changed workspaces
if 'access_token' in result:
access_token = result['access_token']
header = {'Content-Type':'application/json', 'Authorization':f'Bearer {access_token}'}
api_call = requests.get(url=url, headers=header)
print(json.loads(api_call.text))
df1 = spark.createDataFrame(json.loads(api_call.text))
df1.createOrReplaceTempView("modifiedworkspaces")
display(df1)
%%sql
--Batch workspace ids into 100
create or replace temporary view WorkspaceBatches as
WITH WorkspaceRanked AS
(
SELECT ROW_NUMBER() OVER ( ORDER BY id) AS Rank
, id
FROM modifiedworkspaces
),
WorkspaceBatched AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY(Rank % 100) ORDER BY Rank ASC) AS Batch
, id
FROM WorkspaceRanked
)
SELECT replace(concat('{"workspaces": ["' , array_join(array_sort(collect_set(id)),","),'"]}'),',','","') AS BodyExpression, Batch
--INTO WorkspaceBatches
FROM WorkspaceBatched
GROUP BY Batch;
SELECT * FROM WorkspaceBatches
#Try to post workspaces to get inventory
post_WorkspaceInfoUrl = "https://api.powerbi.com/v1.0/myorg/admin/workspaces/getInfo?lineage=True&datasourceDetails=True&getArtifactUsers=True&datasetSchema=True&datasetExpressions=True"
dfbatches = spark.sql("SELECT Batch, BodyExpression FROM WorkspaceBatches")
#display(dfbatches2)
rows_looped = dfbatches.select("Batch", "BodyExpression").collect()
# looping through each batch
for rows in rows_looped:
#print(rows.value)
post_WorkspaceInfo = requests.post(url=post_WorkspaceInfoUrl, headers=header, json = rows.BodyExpression )
print(post_WorkspaceInfo.text)
The code above runs but the response is a bad request: {"error":{"code":"BadRequest","message":"Bad Request","details":[{"message":"Error converting value "{"workspaces": ["034c87da-bbbb-4df8-becc-555e5916d6fa","043654b6-7868-aaaa-bdad-b1cf9ad95f4c","073c7513-dddd-40c3-bb6e-48d0f0b00b42"]}" to type 'Microsoft.PowerBI.ServiceContracts.Api.RequiredWorkspaces'. Path '', line 1, position 3096.","target":"requiredWorkspaces"}]}}
If I paste the value of the BodyExrpression into the code instead (like below). It works correctly.
Codesnippet:
post_WorkspaceInfoUrl = "https://api.powerbi.com/v1.0/myorg/admin/workspaces/getInfo?lineage=True&datasourceDetails=True&getArtifactUsers=True&datasetSchema=True&datasetExpressions=True"
dfbatches = spark.sql("SELECT Batch, BodyExpression FROM WorkspaceBatches")
rows_looped = dfbatches.select("Batch", "BodyExpression").collect()
for rows in rows_looped:
post_WorkspaceInfo = requests.post(url=post_WorkspaceInfoUrl, headers=header, json = {"workspaces": ["034c87da-bbbb-4df8-becc-555e5916d6fa","043654b6-7868-aaaa-bdad-b1cf9ad95f4c","073c7513-dddd-40c3-bb6e-48d0f0b00b42"]} )
print(post_WorkspaceInfo.text)
I tried converting the BodyExpression to JSON and then passing it into the request.post but it didn´t work. I´ve tried all kinds of conversions but nothing works for me.
I´m expecting that the string I pass into would work the same way as when I paste the same string directly into the function.
Solved! Go to Solution.
The problem is solved. I needed to use data = rows.bodyexpression
instead json = rows.bodyexpression because I´m passing string and not json in.
The problem is solved. I needed to use data = rows.bodyexpression
instead json = rows.bodyexpression because I´m passing string and not json in.
User | Count |
---|---|
20 | |
4 | |
4 | |
3 | |
3 |