Good day,
We get data from Walkscore for our properties through an API. Every month I update our properties with a new table in Power Query, re-add the custom columns that contain the API code, and save. When I do, my file is out of commission for the next ten minutes while it fetches data from the API. The same thing happens when I save and close, then refresh the data from the home screen.
(I ran the same table in DAX Studio, and it only took 36 milliseconds.)
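The custom column logic is roughly this shape (a simplified sketch; the endpoint, parameter names, and key handling are placeholders rather than our exact code):

let
    // Placeholder name for the monthly properties table
    Properties = PropertySource,

    // One HTTP request per property row; every refresh replays all of
    // these calls, which is what ties the file up for ten minutes
    GetWalkScore = (addr as text, latitude as number, longitude as number) =>
        let
            Response = Web.Contents(
                "https://api.walkscore.com/score",
                [Query = [
                    format = "json",
                    address = addr,
                    lat = Number.ToText(latitude),
                    lon = Number.ToText(longitude),
                    wsapikey = "<our key>"
                ]]
            )
        in
            try Json.Document(Response)[walkscore] otherwise null,

    Scored = Table.AddColumn(Properties, "WalkScore",
        each GetWalkScore([Address], [Latitude], [Longitude]), type nullable number)
in
    Scored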
How can I make our API run faster??
Sounds like you're doing all of this in Power BI Desktop, correct? If that's the case, consider moving your API calls to dataflows. You might have to adapt your Power Query code so that it can work in the service, but in my experience it's well worth it.
Thanks for the reply. Do you mean like this:
Adding the API as a data source in the Service: [screenshot]
And then connecting to it once I'm back in the desktop? [screenshot]
I want to do all my work in the desktop.
Yes, that's what I mean, though those screenshots are old; dataflows went out of beta years ago.
If you don't want to use dataflows then don't; I guess being locked out of Power BI Desktop for ten minutes is not such a big deal after all.
I just copied the screenshots off a 2018 instructional video.
https://www.youtube.com/watch?v=veuxofp0ZIg
I do want to circumvent these constant updates. I just added a calculated column to the table, and that alone prompted another 10-minute call to the API.
Yep, dataflows have been in General Availability since April 2019, and it's generally considered a best practice to at least stage, and often transform, your source data in them, upstream of datasets. This gives you separation of concerns, with ETL in dataflows and data modeling in datasets.
I'd do the raw API calls in one dataflow, or, if there's a lot of historical data, I'd partition the API calls across two dataflows (historical + refreshable). I'd then feed that into another dataflow where additional logic (e.g. calculated columns) is applied, and that's what I'd finally ingest into the dataset in Power BI Desktop.
You don't have to do it this way, but this is my considered advice based on having been there and done that quite a few times.
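As a sketch, the enrichment dataflow then does nothing but cheap derived logic against a linked entity from the staging dataflow, so model changes never replay the API calls (entity and column names below are made up):

let
    // Linked entity pointing at the dataflow that owns the API calls
    Raw = WalkscoreRaw,

    // Derived columns live here, away from the expensive requests
    Enriched = Table.AddColumn(Raw, "ScoreBucket",
        each if [WalkScore] <> null and [WalkScore] >= 70
            then "Very Walkable" else "Car-Dependent",
        type text)
in
    Enriched

Power BI Desktop then connects to the enriched entity, so a dataset refresh only reads stored results instead of hitting the API.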
So I've put my API in the service, but I'm not sure how to integrate this into my data model. It's a POST request, so I've brought the necessary address columns into the service for it to work out the scores it needs to return.
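Roughly, the query in the dataflow looks like this (the endpoint and body shape are placeholders, not our real ones):

let
    // Address columns staged in the service (placeholder table name)
    Addresses = PropertyAddresses,

    // Supplying Content makes Web.Contents issue a POST
    PostScore = (addr as text) =>
        let
            Response = Web.Contents(
                "https://api.example.com/score",
                [
                    Headers = [#"Content-Type" = "application/json"],
                    Content = Json.FromValue([address = addr])
                ]
            )
        in
            try Json.Document(Response)[score] otherwise null,

    Scored = Table.AddColumn(Addresses, "Score", each PostScore([Address]))
in
    Scored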
Now when I'm back in the desktop, I have connected to the dataflow, but it returns nothing, even after I refresh in both the service and desktop.
How do I get it to populate?
Have you saved and refreshed the dataflow? It's distinct from dataset refreshes.
Yes I had... but I just ran it again, and now it's working. Thanks.