asfdwd
Frequent Visitor

Scheduled Refresh for Data imported from Amazon S3

Hi,

 

One of the data tables in my Power BI report is imported from Amazon S3.

The only way I know to do this is with a Python script like the one below.

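import io

import boto3
import pandas as pd

# Placeholders: replace with your own bucket name and object key
bucket = 'name of your bucket'
key = 'name of your file'

# boto3 picks up AWS credentials from the environment or ~/.aws/credentials
s3 = boto3.client('s3')
f = s3.get_object(Bucket=bucket, Key=key)

# Parse the CSV bytes: first row is the header, first column becomes the index
shape = pd.read_csv(io.BytesIO(f['Body'].read()), header=0, index_col=0)
# Replace missing values with 0
shape = shape.fillna(0)
print(shape)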

However, it seems that the Power BI scheduled refresh service doesn't support Python.


 

Is it possible to set up a scheduled refresh for data imported from Amazon S3?

1 ACCEPTED SOLUTION
TomMartens
Super User

Hey @asfdwd ,

 

At the moment, there is no way other than Python to get S3 data into Power BI. This might change in the future when Power BI Dataflows Gen2 (a feature of Microsoft Fabric) becomes available.

Using Python has its own downsides: it requires a gateway in personal mode, and the gateway machine also needs a Python installation. A gateway in personal mode comes with limitations of its own: https://learn.microsoft.com/en-us/power-bi/connect-data/service-gateway-personal-mode?WT.mc_id=DP-MV...

My recommendation: use Azure Data Factory to get the S3 data (https://learn.microsoft.com/en-us/azure/data-factory/connector-amazon-s3-compatible-storage?tabs=dat...). Once that is done, you can trigger a dataset refresh from an Azure Data Factory pipeline (https://community.fabric.microsoft.com/t5/Service/Trigger-the-dataset-refresh-after-Azure-ETL-proces...). I have to admit that this is not trivial compared to the Power BI Desktop solution you have, but it is the most stable solution I can currently think of.
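
To give an idea of what the refresh trigger involves: it comes down to a single POST to the Power BI REST API, which a Data Factory pipeline would typically send from a Web activity. Below is a minimal Python sketch of that request, not a finished implementation. It assumes you already have a Microsoft Entra ID (Azure AD) access token that is allowed to refresh the dataset; the function names and the workspace/dataset IDs are placeholders I made up for illustration.

```python
import json
import urllib.request

# Base URL of the Power BI REST API
API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_refresh_request(group_id, dataset_id, access_token):
    """Build (but do not send) the POST request that starts a dataset refresh.

    group_id is the workspace ID; all three arguments are placeholders
    that you have to supply yourself.
    """
    url = f"{API_ROOT}/groups/{group_id}/datasets/{dataset_id}/refreshes"
    # NoNotification: do not send a mail when the refresh finishes
    body = json.dumps({"notifyOption": "NoNotification"}).encode("utf-8")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Authorization", f"Bearer {access_token}")
    request.add_header("Content-Type", "application/json")
    return request


def trigger_refresh(group_id, dataset_id, access_token):
    """Send the refresh request and return the HTTP status code."""
    request = build_refresh_request(group_id, dataset_id, access_token)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling trigger_refresh("<workspace id>", "<dataset id>", token) only queues the refresh; the refresh itself runs asynchronously in the service, so the pipeline should call it after the S3 copy step has completed.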

 

Everything S3-related will become simpler with Microsoft Fabric, but Fabric is currently in preview and comes with extra costs (as does the Azure Data Factory approach).

 

Hopefully, this gives you some new ideas that help you tackle your challenge.

 

Regards,

Tom



Hamburg, Germany
