I have an API link that I use as a data source. However, I wanted to do all of the ETL outside of Power Query, to reduce the load on Power BI's memory, so I thought about doing the ETL in Python and loading the data into Power BI from Python. Do you know if it's possible to refresh automatically without needing the Gateway? My challenge is to automate the refresh of this dashboard without my laptop having to stay connected to the gateway.
Hi, @madpimenta013
Power BI indeed has the capability to integrate with Python, which means you can use Python to do your ETL process. The problem, however, is that for any data source not located in the cloud, like an on-premises SQL Server or a Python script running on your personal computer, Power BI generally requires the use of a Gateway to connect and refresh data.
If your ETL process with Python is getting data from an API, the Python script itself isn't actually the data source from Power BI's perspective. The API is the actual source. It might be possible to avoid using a Gateway by transforming your ETL process into a cloud-based solution. Here are a few general steps you could consider:
1. Cloud-based Python Environment: Set up a Python environment in the cloud using a service like Azure Functions, Google Cloud Functions, AWS Lambda, or any other environment where you can run Python. This is where the API calls and the ETL process will run.
2. Storing the Data: After processing your data, store the result in a cloud-based database or data warehouse that Power BI can connect to directly, such as Azure SQL Database, Azure Data Lake, Google BigQuery, Amazon Redshift, etc.
3. Connecting Power BI to the Cloud Source: Set up Power BI to connect directly to this cloud-based data source. These connections usually don't require a Gateway, since Power BI can reach them directly over the internet.
4. Automating the ETL Process: Finally, schedule your Python environment to run the ETL process at regular intervals. How you do this depends on the cloud environment you choose (for example, Azure Functions' timer triggers); a rough sketch follows below.
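To make this concrete, here's a rough, minimal sketch of steps 1, 2, and 4 combined, assuming an Azure Functions timer trigger (Python v2 programming model), a hypothetical REST endpoint, and an Azure SQL Database table. Every name, URL, schedule, and connection string below is a placeholder to adapt to your own API and storage.

```python
# Minimal sketch: timer-triggered Azure Function that calls an API, transforms
# the data with pandas, and loads it into Azure SQL Database.
# The endpoint, schedule, table name, and env vars are placeholders.
import os
import logging

import azure.functions as func
import pandas as pd
import pyodbc
import requests

app = func.FunctionApp()

@app.timer_trigger(schedule="0 0 6 * * *", arg_name="timer")  # every day at 06:00 UTC
def refresh_sales_data(timer: func.TimerRequest) -> None:
    # 1. Pull raw data from the API (placeholder URL and auth header)
    response = requests.get(
        "https://api.example.com/v1/sales",
        headers={"Authorization": f"Bearer {os.environ['API_TOKEN']}"},
        timeout=60,
    )
    response.raise_for_status()
    # Assumes the API returns a JSON array of records
    df = pd.DataFrame(response.json())

    # 2. Transform: whatever cleaning you do in Power Query today moves here
    df["order_date"] = pd.to_datetime(df["order_date"])
    df = df[df["amount"] > 0]

    # 3. Load into a cloud database that Power BI can reach without a gateway
    with pyodbc.connect(os.environ["SQL_CONNECTION_STRING"]) as conn:
        cursor = conn.cursor()
        cursor.execute("TRUNCATE TABLE dbo.sales_staging")
        cursor.fast_executemany = True
        cursor.executemany(
            "INSERT INTO dbo.sales_staging (order_date, amount) VALUES (?, ?)",
            list(df[["order_date", "amount"]].itertuples(index=False, name=None)),
        )
        conn.commit()
    logging.info("Loaded %d rows", len(df))
```

With something like this deployed, Power BI just connects to the cloud table (dbo.sales_staging here) and schedules its own refresh in the service, with no gateway involved.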
Remember that any solution will still need to manage authentication and secure data transfer, especially if your API or data sources require it.
It's also worth noting that this approach means a significant shift from a mostly Power BI-centric setup to a more distributed, cloud-based architecture, which can add cost and complexity. Keep that in mind when evaluating whether this is the right solution for your situation.
This thread might also be helpful. See if this article helps you:
https://pbi-guy.com/2022/01/07/refresh-a-power-bi-dataset-with-python/
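In case it's useful alongside the article, here's a rough sketch of the approach it describes: calling the Power BI REST API from Python to kick off a dataset refresh. It assumes an Azure AD app registration (service principal) that is allowed to use the Power BI REST APIs and has access to the workspace; all of the IDs and the secret below are placeholders.

```python
# Rough sketch: trigger a Power BI dataset refresh via the REST API.
# Tenant/client IDs, secret, workspace (group) ID, and dataset ID are placeholders.
import msal
import requests

TENANT_ID = "<tenant-id>"
CLIENT_ID = "<app-registration-client-id>"
CLIENT_SECRET = "<client-secret>"
GROUP_ID = "<workspace-id>"
DATASET_ID = "<dataset-id>"

# Acquire an Azure AD token for the Power BI service
auth_app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)
token = auth_app.acquire_token_for_client(
    scopes=["https://analysis.windows.net/powerbi/api/.default"]
)

# Ask the Power BI service to refresh the dataset (returns 202 Accepted on success)
response = requests.post(
    f"https://api.powerbi.com/v1.0/myorg/groups/{GROUP_ID}/datasets/{DATASET_ID}/refreshes",
    headers={"Authorization": f"Bearer {token['access_token']}"},
)
response.raise_for_status()
print("Refresh request accepted:", response.status_code)
```

You could run a script like this at the end of the ETL job, so the dataset refreshes as soon as new data lands in the cloud source.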
That makes total sense. Thank you for your reply.
Good to know that it helped. I used GPT to help write up such detailed info. 😂