Hi Folks
We are running several gateways (version 3000.186.14, August 2023, clustered as well) and have just noticed that both the DatasetId and WorkspaceId are missing almost all of the time (99.99% of log entries).
We do log shipping (and near real-time monitoring), and without this information it is difficult to trace errors back to something actionable, since we run at least four different application environments across multiple gateways.
Are we missing anything in the gateway setup that would ensure this information is captured?
Thanks 🙂
Hey @rkaratzas ,
In addition to what @lbendlin already mentioned, maybe updating your gateways to the October 2023 release will help; according to this article, artifact details are already part of the basic logging: https://powerbi.microsoft.com/en-us/blog/on-premises-data-gateway-october-2023-release/
Hopefully, this helps to tackle your challenge.
Regards,
Tom
OK, so I gather that having those two IDs populated is not reliable.
Maybe the next question I would have, since we are trying to make log errors an actionable thing: is there a way to trace anything back to the running "application" (application being dashboards, apps, reports, automations, etc.)?
We've contemplated writing a utility that goes out into the "Azure landscape" (the various workspaces associated with different environments) and finds where the query text (or anything else identifiable in the error) lives. (A very complicated search, requiring crawling the workspaces, unpacking the objects, etc.)
we are trying to make log errors an actionable thing
Oh, good. Let me know when you find something. I have been at it for a couple of months and have yet to find a meaningful metric. For example, the report shown here (screenshot not included) looks really pretty but is also meaningless, since none of these reported issues seem to affect actual refreshes or indicate any real gateway issues.
We had more luck with telemetry that we collected ourselves. Sadly the SystemCounter reports are not usable as they are only flushed to disk very occasionally.
The EvaluationContext data is still your best bet. Raise a ticket with Microsoft to figure out why it doesn't populate for you (make sure to use a recent-ish gateway version, post June 2023).
Yes, so it's pretty easy:
1. We grabbed a copy of this and have been using it as one of our monthly gateway reports: https://github.com/RuiRomano/pbigtwmonitor (we also use the vanilla one developed by the gateway team).
2. I wrote a PowerShell script to log-ship every X minutes (via Task Scheduler on EACH gateway server; see the first sketch after this list for the task registration):
# Allow this script to run under the local machine execution policy
Set-ExecutionPolicy -Scope LocalMachine -ExecutionPolicy Unrestricted -Force
# Set the UNC path to the directory
$uncPath = "\\yourserver\c$\Users\PBIEgwService\AppData\Local\Microsoft\On-premises data gateway\*.log"
# Set the UNC path to the log destination directory
$destinationPath = "\\NASS1\UW_Workstation\PowerBIEnterpriseDataGatewayLogs24Hour\yourserver"
# Calculate the cutoff date and time for files modified in the last 24 hours
$now = Get-Date
$cutoff = $now.AddHours(-24)
# List files modified in the last 24 hours
$filesToCopy = Get-ChildItem -Path $uncPath -Recurse | Where-Object { $_.LastWriteTime -ge $cutoff }
# Copy the selected files to the destination
$filesToCopy | ForEach-Object { Copy-Item -Path $_.FullName -Destination $destinationPath }
# Delete files older than the cutoff date in the destination directory
Get-ChildItem -Path $destinationPath | Where-Object { $_.LastWriteTime -lt $cutoff } | Remove-Item -Force
3. Register a Certified Dataset in Azure
4. PowerShell to refresh the dataset every X minutes (similar to this, and see the second sketch after this list):
https://stackoverflow.com/questions/52697381/power-bi-dataset-refresh-using-powershell
(note that clientid is really the Application ID in the Azure portal)
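Regarding step 2, here is a minimal sketch of registering the log-shipping script as a recurring task; the script path, task name, and the 30-minute interval are placeholders, and it assumes the built-in ScheduledTasks cmdlets on the gateway server:
# Register the log-shipping script as a recurring scheduled task (run once per gateway server).
# Path, task name, and interval are placeholders - adjust to your environment.
$action  = New-ScheduledTaskAction -Execute "powershell.exe" -Argument "-NoProfile -ExecutionPolicy Bypass -File C:\Scripts\Ship-GatewayLogs.ps1"
$trigger = New-ScheduledTaskTrigger -Once -At (Get-Date) -RepetitionInterval (New-TimeSpan -Minutes 30)
Register-ScheduledTask -TaskName "Ship Gateway Logs" -Action $action -Trigger $trigger -User "SYSTEM" -RunLevel Highest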
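And regarding step 4, a rough sketch of the refresh call itself (tenant ID, client ID/secret, workspace and dataset IDs are all placeholders; it assumes a service principal that is allowed to use the Power BI REST APIs and has access to the workspace):
# Acquire a token via the client-credentials flow (clientid = the Application ID in the Azure portal)
$tenantId     = "<tenant-id>"
$clientId     = "<application-id>"
$clientSecret = "<client-secret>"
$tokenBody = @{
    grant_type    = "client_credentials"
    client_id     = $clientId
    client_secret = $clientSecret
    scope         = "https://analysis.windows.net/powerbi/api/.default"
}
$token = (Invoke-RestMethod -Method Post -Uri "https://login.microsoftonline.com/$tenantId/oauth2/v2.0/token" -Body $tokenBody).access_token
# Kick off a refresh of the dataset (a workspace is a "group" in the REST API)
$groupId   = "<workspace-id>"
$datasetId = "<dataset-id>"
Invoke-RestMethod -Method Post -Uri "https://api.powerbi.com/v1.0/myorg/groups/$groupId/datasets/$datasetId/refreshes" -Headers @{ Authorization = "Bearer $token" } -Body (@{ notifyOption = "NoNotification" } | ConvertTo-Json) -ContentType "application/json"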
Does that make sense?
PS: Note that I already submitted a feature request for log-shipping functionality to become part of the gateway interface (configurable via what is already there in the config files).
We are in the process of morphing the monthly report UI into a near-realtime UI for the support folks. We will also create an additional report for EACH development team (DEV, UAT, Production), attempting to have them fix actionable things related to the errors in EACH environment or by product 🙂 We've learned a lot looking at the logs (like generalized scenarios where things time out, are missing fields/parameters, etc.) 🙂
Monthly?!?! What's the point?
Almost realtime? Dream on. As I said, the logs are flushed to disk infrequently.
What we really need is the ability to push log events into streaming datasets - as they are happening.
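(For what it's worth, the push side would be a single REST call per event. A minimal sketch, assuming a streaming/push dataset whose Push URL and key you copy from its "API info" page - every value below is a placeholder:)
# Hypothetical example: push one log event into a Power BI streaming (push) dataset as it happens
$pushUrl = "https://api.powerbi.com/beta/<tenant-id>/datasets/<dataset-id>/rows?key=<key>"
$row = @{
    Timestamp    = (Get-Date).ToString("o")
    GatewayName  = $env:COMPUTERNAME
    ActivityType = "QueryStart"        # placeholder value
    Message      = "example log event"
}
# The push URL expects a JSON array of rows
Invoke-RestMethod -Method Post -Uri $pushUrl -Body (ConvertTo-Json @($row)) -ContentType "application/json"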
Fun story: we have a situation where we get DOZENS of new QueryStart log files EVERY DAY due to the unintended consequences of including the query text in the log. In itself that's a good thing, but we have an app that fires a ginormous SQL query at a DirectQuery source every 10 seconds on average. You would be looking at 60 GB of log data for a monthly report.
Monthly: it works for us. Using this data helps us understand and forecast resource usage, e.g. knowing when we will need to add more gateways to the cluster.
Almost Realtime: Support can isolate errors within an hour window (that's our hope). That obviously helps us with errors, but data egress from Azure (and failures along the way) is not free. Failures are costly, and it's worth the effort to keep customers happy and save the company money. Often, the things that fail are aspects we can clearly identify from the logs; for example, we see a notably high percentage of failures when the mashup (or similar) does not complete within 1.5 minutes.
Last I looked (we do monthlies tonight/tomorrow), we were around 430 GB.
Streaming sounds great, once the gateway team offers us something that can connect directly with our Synapse warehouse. (Overall, though, we lean toward the cheap/conservative side, trying to keep much of our "typical" reporting on local PBIRS servers. It gets expensive and difficult if everything is in Azure, since if we can't monitor the gateways and what's going on, we can't manage them.) 😉
Almost Realtime: Support can isolate errors within an hour window (that's our hope).
As I said, this does not match our experience. Daily at best with the current flushing mechanism (unless you resort to brute force service restarts which is not a good idea).
Help me understand: log files are just that (simple file-system objects that behave like any other file), right?
You can pull (a copy of) those "things" at will; you just look at the file's last-modified property. (Easy peasy, but I'd really like to understand what you are experiencing that requires any restarting or flushing.) - Rob
Grabbing a copy of those items (what's shown in the PowerShell) works just fine. (Remember, it's working against the file system where the logs are stored, so you are dealing with it in a read-only context: grab what changed within your modified window, load it into the dashboard mentioned, and it works - or at least it has for the last 3 weeks for us 🙂.) We are still ramping up on all of the technical details in the logs, AND I've registered a support call with Microsoft to get further clarity on the ActivityType codes, so we can translate them into layman's terms for support.
grabbing a copy of those items (what's shown in the PowerShell) works just fine
That is only half of the story. A last-modified timestamp on a file does not reveal the actual last row of data written into that file. Again, our experience is that the gateway service only flushes data to the files occasionally - every couple of hours at best and every 24 hours at worst.
Obviously, some logs are (reused) in a round-robin fashion, some are not.
What the log spooler does (when it writes to the logs, etc.) needs its logic "plumbing" extended so that, when a "log shipping" feature is available, it does the best it can to empty whatever is buffered in the spooler as part of that feature. (The fact that the spooler hasn't done a complete flush could be controlled: invoking the log-shipping feature would roll over to the next log sequence, closing all currently open log files and making them available for shipping.)
Here's a sample of what gets grabbed using the PowerShell mentioned:
(Poop, my company's URL isolation security won't let me paste or upload that screenshot, but it's simple to run what's there in PowerShell, replacing the locations with yours 🙂)
We are using OneDrive as a cheap replacement for the log shipping process that you mention. But OneDrive can only sync what's flushed. Not bad, but also not sufficient for our needs.
So far so good for us (seeing what we are getting per ship cycle), but obviously a log-shipping feature in the gateway software would keep most of us folks (like you and me) a lot happier 🙂
Again, our philosophy: "If you can't monitor it, you can't manage it"
Those "EvaluationContext" details are only available for a subset of data sources. What are the main data sources processed on your gateways?
We see the dataset details for 370,000 out of 560,000 log events.