Jeanxyz
Impactful Individual

Pull AWS S3 data into Power BI

I need to import some CSV files from AWS S3 into Power BI. Below is the Python script I am trying to use. I have received an API URL and a token from a colleague. How should I fill in the bucket and key variables?

********************************************************************************************

import boto3
import pandas as pd
import io

bucket="<input bucket name>" 
key="<file name>"

s3 = boto3.client('s3')
f = s3.get_object(Bucket=bucket, Key=key)
shape = pd.read_csv(io.BytesIO(f['Body'].read()), header=0, index_col=0)
shape = shape.apply(lambda x: x.fillna(0))
print(shape)
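
For anyone with the same question: in S3 terms, the bucket is the name of the top-level container and the key is the full path of the object inside it. A minimal sketch, assuming the URL you were given is an S3 location (either an `s3://` URI or a virtual-hosted `https://<bucket>.s3.<region>.amazonaws.com/<key>` address); the helper name `split_s3_location` and the example locations are hypothetical. If the URL points to something else (for example a custom API endpoint), this does not apply and you will need the bucket name and object path from your AWS admin.

```python
from urllib.parse import unquote, urlparse


def split_s3_location(location):
    """Split an S3 location into (bucket, key).

    Hypothetical helper: handles s3:// URIs and virtual-hosted-style
    https URLs only; anything else raises ValueError.
    """
    parsed = urlparse(location)
    key = parsed.path.lstrip("/")

    if parsed.scheme == "s3":
        # s3://<bucket>/<key>
        return parsed.netloc, key

    host = parsed.netloc
    if parsed.scheme in ("http", "https") and ".s3." in host and host.endswith(".amazonaws.com"):
        # https://<bucket>.s3.<region>.amazonaws.com/<key>
        # Keys in https URLs are percent-encoded, so decode them.
        return host.split(".s3.", 1)[0], unquote(key)

    raise ValueError(f"Not a recognised S3 location: {location}")


# Example with made-up values
bucket, key = split_s3_location("s3://example-bucket/reports/sales.csv")
print(bucket, key)
```

The two values can then be passed as `Bucket=bucket, Key=key` to `get_object` in the script above.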

1 ACCEPTED SOLUTION
Jeanxyz
Impactful Individual

Thanks, @amitchandak. I will go through the tutorials when I get some time.

I talked to our AWS admin and made some changes to the Python script, and it works now (see the script below).

Limitations of the Python script connector:

1. This import mode is slow, so I can only import a small CSV file. If the CSV file contains even a small error, the import query fails. Is there a way to ignore CSV reading errors?

2. To import multiple files from the S3 bucket, I need to write a loop in the Python script.
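
On the first point, one option that may help is the `on_bad_lines` parameter of `pandas.read_csv` (available from pandas 1.3). A minimal sketch, assuming the errors are rows with too many fields; the helper name `read_csv_skipping_bad_rows` and the sample data are made up for illustration. Note that skipped rows are dropped silently, so the row count can differ from the source file.

```python
import io

import pandas as pd


def read_csv_skipping_bad_rows(raw_bytes):
    """Parse CSV bytes, dropping rows that have more fields than the header."""
    # on_bad_lines="skip" requires pandas >= 1.3
    return pd.read_csv(io.BytesIO(raw_bytes), header=0, on_bad_lines="skip")


# Made-up sample: the second data row has an extra field and is skipped
sample = b"id,value\n1,10\n2,20,999\n3,30\n"
clean = read_csv_skipping_bad_rows(sample)
print(clean)
```

In the S3 script, the same argument can be added to the existing `pd.read_csv(...)` call.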

*****************************************

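import io

import boto3
import pandas as pd

# Placeholders: replace "xx" with your own values.
my_bucket_name = "xx"    # S3 bucket name
my_file_path = "xx.csv"  # object key (path of the file inside the bucket)
my_key = "xx"            # AWS access key ID
my_secret = "xx"         # AWS secret access key

session = boto3.Session(aws_access_key_id=my_key, aws_secret_access_key=my_secret)
s3Client = session.client("s3")

# Download the object and parse the CSV into a DataFrame.
f = s3Client.get_object(Bucket=my_bucket_name, Key=my_file_path)
aws_data = pd.read_csv(io.BytesIO(f['Body'].read()), header=0, index_col=0)
print(aws_data)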
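
On the second limitation, the loop could look roughly like the sketch below. It assumes all CSV files share the same columns and sit under one key prefix; the function name `read_csvs_from_s3` and the `source_file` column are my own additions, not part of boto3 or Power BI. It uses the boto3 `list_objects_v2` paginator to list the keys and `get_object` to download each one. I left out `index_col=0` here so every column is kept when the frames are concatenated.

```python
import io

import pandas as pd


def read_csvs_from_s3(s3_client, bucket, prefix=""):
    """Download every .csv object under `prefix` and stack them into one DataFrame."""
    frames = []
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is missing from the response when nothing matches
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if not key.lower().endswith(".csv"):
                continue
            body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
            frame = pd.read_csv(io.BytesIO(body), header=0)
            frame["source_file"] = key  # remember which file each row came from
            frames.append(frame)
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# Usage with the same placeholders as the script above:
# import boto3
# session = boto3.Session(aws_access_key_id=my_key, aws_secret_access_key=my_secret)
# aws_data = read_csvs_from_s3(session.client("s3"), my_bucket_name, prefix="")
# print(aws_data)
```

Because every file is downloaded in full, this will be at least as slow as the single-file import, so it is probably only practical for a handful of small files.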


2 REPLIES
amitchandak
Super User

@Jeanxyz, I hope you are using a Python script as the source. See:

https://towardsai.net/p/cloud-computing/how-we-connected-amazon-s3-to-microsoft-powerbi-in-5-minutes

 

How to make Python work with Power BI: https://youtu.be/5D0BkNsu5CM
