Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateEnhance your career with this limited time 50% discount on Fabric and Power BI exams. Ends August 31st. Request your voucher.
I have access to an API that lets me retrieve results of tests undertaken by my company employees. Each month, employees get one or more tests. The total set of tests all employees get in a month are grouped into a gropu called a PST, and identified by a unique ID named PstId. There are two API endpoinst related to my question here.
https://us.api.knowbe4.com/v1/phishing/security_tests
https://us.api.knowbe4.com/v1/phishing/security_tests/{PstId}/recipients?per_page={RecordsPerPage}&page={PageNumber}
I can easily call the 1st endpoing to get a list of PSTs, like so:
PSTs = Json.Document
(
Web.Contents
(
"https://us.api.knowbe4.com/v1/phishing/security_tests",
[
Headers=
[
ApiKeyName=#"APIKeyName",
Authorization="Bearer " & #"APIKeyValue"
]
]
)
)
Next, I can also retireve data from the 2nd endpoint using a function (the API doesn't return the total number of pages or total number of records, so I have to do this bit dynamically).
fnGetTestResults = (PstId as text, PerPage as number, PageNum as number) =>
let
Source = Json.Document
(
Web.Contents
(
"https://us.api.knowbe4.com",
[
RelativePath="/v1/phishing/security_tests/" & PstId & "/recipients",
Query=
[
per_page=Number.ToText(PerPage),
page=Number.ToText(PageNum)
],
Headers=
[
ApiKeyName=APIKeyName,
Authorization="Bearer " & APIKeyValue
]
]
)
)
in
Source
So far so good. However, my end goal is to retireve all data from all tests, and set up the PowerBI Service so that going forward, this will be auto refreshed. To do that, I created a 2nd function, called fnIteratePSTs. I pass the PstId to this a parameter, and from within it, I called the fnGetTestResults function.
fnIteratePSTs = (PstId as text) =>
let
Source = List.Generate(
()=> [ Offset = 1, Results = fnGetTestResults(PstId, 100, 1) ],
each not (List.IsEmpty([ Results ])),
each [ Results = fnGetTestResults(PstId, 100, [Offset] + 1 ), Offset = [Offset] + 1 ],
each [Results]
)
in
Source
Now, I call this function using the 'Invoke Custom Function' option from the PSTs table I created previously. This basically goes through all the PstIds, calls the fnIteratePSTs() function, which in tern iteratively calls the fnGetTestResults() function, all of which results in retireving results of all tests.
TestResults =
let
Source = PSTs,
#"Removed Other Columns" = Table.SelectColumns(Source,{"campaign_id", "pst_id", "name", "Campaign Start", "Status"}),
#"Changed Type" = Table.TransformColumnTypes(#"Removed Other Columns",{{"pst_id", type text}, {"campaign_id", type text}, {"name", type text}, {"Status", type text}}),
#"Renamed Columns" = Table.RenameColumns(#"Changed Type",{{"campaign_id", "CampaignId"}, {"name", "Period Title"}, {"pst_id", "PstId"}}),
#"Reordered Columns" = Table.ReorderColumns(#"Renamed Columns",{"CampaignId", "Period Title", "PstId"}),
#"Invoked Custom Function" = Table.AddColumn(#"Reordered Columns", "Results", each fnIteratePSTs([PstId])),
#"Expanded Results" = Table.ExpandListColumn(#"Invoked Custom Function", "Results")
in
#"Expanded Results"
This works fine, and I can get all the results.
However, when I publish this report to the PowerBI Server, and schedule a refresh (or even if I simply manually refresh from PBI service), it fails with the following message:
Data source error: [Unable to combine data] Section1/TestResults/Expanded Results references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.. The exception was raised by the IDataReader interface. Please review the error message and provider documentation for further information and corrective action. Table: TestResults.
Cluster URI: WABI-WEST-US-E-PRIMARY-redirect.analysis.windows.net
Activity ID: c0ad9498-2a5d-403f-bf2a-eb03b4d8df7f
Request ID: f7b6cb6d-ac16-26ec-4ae8-631e237cfb11
Time: 2024-09-13 19:22:40Z
After some digging, I found out that it's the 'Expanded Results' step in the TestResults table that causes this issue. If I remove that step, and simply stop at the 'Invoke Custom Function', then PBI Service no longer reaises this issue. But without that step (and steps after that), the retirived data is useless, since expanded results column is what contains all the actual test results.
What is causing this and how do I fix it?
If this is not the right way to handle a situation like this, then what is?
Show what
fnIteratePSTs
does.
Apologies, it was mislabled in the post. This is the function:
fnIteratePSTs = (PstId as text) =>
let
Source = List.Generate(
()=> [ Offset = 1, Results = fnGetTestResults(PstId, 100, 1) ],
each not (List.IsEmpty([ Results ])),
each [ Results = fnGetTestResults(PstId, 100, [Offset] + 1 ), Offset = [Offset] + 1 ],
each [Results]
)
in
Source
This will only hurt a little: Power Query M Primer (Part 13): Tables—Table Think II | Ben Gribaudo
Are you referring to 'disable parallel loading' as a fix? I tried it, which makes the refresh time much longer (over 2 hours), but even then it eventually failed. Same error as before.
No, I am referring to the "everything in different partitions" vs "everything in the same partition" advice. Your indicated problem was the reference to other queries, not the performance.
In that case, I'm afraid I don't quite understand how to incorporate that solution into mine. The adivse is:
A shortcut solution: If you encounter one of these errors, place the code specific to each data source in a separate query (one query per data source) and then reference those queries from another query that combines between them.
I don't quite understand what partitioning is, but going by the above, I think my sources are already in different queries? PSTs table is one, where I have the list of PstIds. Then my TestResults is a different query, which references the first since it needs to PstIds. Is that not the separation it speaks of? I don't see how else I can separate them - I can't call TestResults API without a PstId, and unless I hardcode it (which defeats the purpose of trying to do Auto Refresh anyway), I don't understand how I can call it without referncing PSTs table.
I also tried the next piece of advise:
Technically, if all data sources have a public privacy level or if all have an organization privacy level, they can be freely combined between, so the data protection level isn’t doing any protecting. Disabling it gets it out of the way, avoiding the coding constraints it imposes and any performance overhead it incurs.
But that too doesn't help, and fails when refreshing from PBI Service.
Check out the July 2025 Power BI update to learn about new features.
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.