visheshjain
Impactful Individual

Fetching AWS Athena data into Fabric

Hello everyone,

 

I have all my SQL tables in AWS Athena.

 

Is there some way we can get all those tables into Fabric?

 

I am looking to mirror/shortcut the Athena data into Fabric.

 

Thank you,

Vishesh Jain

Did I answer your question?
If yes, then please mark my post as a solution!

Thank you,
Vishesh Jain

Proud to be a Super User!



14 REPLIES
visheshjain
Impactful Individual

Hello,

 

I was not able to get the data from Athena directly into Fabric; however, I was able to load the data from S3 into a Lakehouse and create a Lakehouse table from it using the SQL endpoint.

 

I created a report from this Lakehouse table and published it to the service, but here is the issue.

 

I added new data to S3, and when I refresh the semantic model in the service, the new data does not show up.

 

Could anyone please help me with this?

 

Thank you,

Did I answer your question?
If yes, then please mark my post as a solution!

Thank you,
Vishesh Jain

Proud to be a Super User!



Hi @visheshjain,

Thanks for the detailed follow-up. The issue you are facing is likely due to how the data is being loaded from S3 into the Lakehouse. Fabric doesn't automatically sync new files added to S3 unless you've built a process, such as a Dataflow Gen2, Notebook, or Pipeline, to handle that refresh regularly.

You can set up a recurring Dataflow Gen2 or Notebook in Fabric that reads from your S3 path and updates the Lakehouse table, which makes sure the new S3 data is picked up before the semantic model refresh runs. If you're using COPY INTO or reading Parquet/CSV files directly from S3, you'll need to rerun that logic whenever new files are added; see the sketch below.
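For illustration, here's a minimal notebook sketch of that load step. It assumes the S3 data is Parquet and is exposed through an S3 shortcut named "sales_s3" under the Lakehouse Files section (the shortcut name and the table name "sales" are placeholders for your setup):

# Minimal sketch, run inside a Fabric notebook attached to the Lakehouse.
# "sales_s3" (an S3 shortcut under Files/) and the table name "sales"
# are hypothetical; adjust the path, format, and table name to your setup.

# `spark` is predefined in Fabric notebooks.
df = spark.read.parquet("Files/sales_s3/")

# Overwrite the Lakehouse table so both old and newly added S3 files are included.
df.write.format("delta").mode("overwrite").saveAsTable("sales")

Because the read re-lists the shortcut folder on every run, rerunning or scheduling this notebook picks up any files added to S3 since the last load.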

Once the Lakehouse table is refreshed with the latest data from S3, the semantic model refresh in the Power BI Service will pick up the changes correctly.

 

Best Regards,

Hammad.

Hello @v-mdharahman,

 

Could you please help me out with how to design the solution?

 

I want to design the solution in such a way that after new data is loaded into S3, when the user refreshes the dataset in the Power BI service, all the data, new and old, is fetched from S3 and the report is updated.

 

Thank you,

Did I answer your question?
If yes, then please mark my post as a solution!

Thank you,
Vishesh Jain

Proud to be a Super User!



Hi @visheshjain,

To automatically reflect the latest S3 data whenever the dataset is refreshed in the Power BI Service, you'll need to set up a Fabric pipeline or scheduled process that refreshes the Lakehouse table before the semantic model refresh runs.

If you use a Fabric pipeline to orchestrate the flow, first create a Notebook or Dataflow Gen2 with logic to load all data from your S3 folder into a Lakehouse table. In a notebook you can read the files with spark.read.parquet() or spark.read.csv() and write them out with saveAsTable() or write.format("delta").save(), similar to the sketch in my earlier reply; in a Warehouse, COPY INTO serves the same purpose.

Now create a Fabric pipeline with two activities: the first runs the Notebook/Dataflow that loads data from S3 into the Lakehouse, and the second refreshes the semantic model (dataset) using the semantic model refresh activity. This ensures the Lakehouse is updated with fresh S3 data before the report's semantic model is refreshed.

Lastly, schedule the pipeline or trigger it with external tools. You can schedule it to run at regular intervals, or use APIs/automation to run it when new files are added to S3 (e.g., from AWS Lambda if you want near-real-time sync).
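As a rough illustration of the Lambda route: since triggering the Fabric pipeline itself from outside may not be possible, one pattern is an S3-triggered Lambda that instead calls the long-standing Power BI REST API to request a semantic model refresh (the Lakehouse load would still run on the pipeline's own schedule). The workspace/dataset IDs and the token acquisition below are placeholders, not a definitive implementation:

# Hedged sketch of an AWS Lambda handler fired by an S3 "object created" event.
# It requests a semantic model refresh via the Power BI REST API.
# GROUP_ID, DATASET_ID, and get_aad_token() are placeholders for your setup.
import json
import urllib.request

GROUP_ID = "<workspace-guid>"   # placeholder
DATASET_ID = "<dataset-guid>"   # placeholder

def get_aad_token():
    # Placeholder: acquire an Azure AD token for a service principal
    # that has been granted access to the workspace (e.g., via MSAL).
    raise NotImplementedError

def lambda_handler(event, context):
    url = (f"https://api.powerbi.com/v1.0/myorg/groups/{GROUP_ID}"
           f"/datasets/{DATASET_ID}/refreshes")
    req = urllib.request.Request(
        url,
        data=json.dumps({"notifyOption": "NoNotification"}).encode(),
        headers={"Authorization": f"Bearer {get_aad_token()}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # 202 Accepted on success
        return {"statusCode": resp.status}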

 

If I've misunderstood your needs or you still have problems, please feel free to let us know.

Best Regards,
Hammad.

Hi @v-mdharahman,

 

Notebooks do not seem to be a viable option, as I do not know how to trigger them when a refresh is initiated from the user's side, and running so many notebooks will only create unnecessary load on the capacity.

 

Can you provide more information? I doubt that a Lambda in AWS can trigger a dataflow/pipeline in Fabric.

 

Thank you for responding!

Did I answer your question?
If yes, then please mark my post as a solution!

Thank you,
Vishesh Jain

Proud to be a Super User!



Hi @visheshjain,

You're absolutely right: triggering a Fabric Notebook for every semantic model refresh could put unnecessary load on your capacity, and currently there's no native way to trigger a Dataflow Gen2 or Pipeline directly when a user clicks "Refresh" on a Power BI dataset.

Also, as you pointed out, AWS Lambda cannot directly trigger Fabric Pipelines, since there's no public webhook or API for Fabric pipelines today. So instead of writing S3 data into a Lakehouse table and creating a semantic model on top of that table, you can connect your semantic model directly to the files in S3 using a Dataflow Gen2 that reads from S3 and lands data in a Lakehouse folder.

Then, you build your semantic model on top of that folder-based Lakehouse table.

 

Best Regards,

Hammad

Hello @v-mdharahman,

 

Yes I could do that but here is a problem with that.

 

I have created joins in Athena with other tables, and if I bring the data in directly from S3 I will have to perform those joins in Power Query, which everyone knows is an expensive operation for a table of 6 million rows and counting, and that without incremental refresh working, since Power Query cannot fold the query.

 

If you have any other solution/suggestion I am all ears.

 

Thank you,

Did I answer your question?
If yes, then please mark my post as a solution!

Thank you,
Vishesh Jain

Proud to be a Super User!



v-mdharahman
Community Support

Hi @visheshjain,

Thanks for reaching out to the Microsoft fabric community forum.

At the moment, Microsoft Fabric doesn’t provide a native connector to directly pull tables from AWS Athena into a Lakehouse or Warehouse. While Power BI Desktop does have an Athena connector that lets you import data for reporting, this connector isn’t currently available in Fabric Dataflows or Pipelines.

However, there’s a supported workaround using Amazon S3. You can export your Athena query results to S3 (in formats like CSV or Parquet), and then use the Amazon S3 connector in Fabric to bring that data into your Lakehouse.
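For reference, here's a hedged sketch of that export step from the AWS side, using boto3 and Athena's UNLOAD statement to write Parquet to S3 (the bucket, database, and query below are placeholders for your environment):

# Hedged sketch: export an Athena query result to S3 as Parquet using boto3.
# Bucket names, database, region, and query are placeholders.
import time
import boto3

athena = boto3.client("athena", region_name="us-east-1")

resp = athena.start_query_execution(
    QueryString=(
        "UNLOAD (SELECT * FROM my_table) "
        "TO 's3://my-bucket/fabric-export/my_table/' "
        "WITH (format = 'PARQUET')"
    ),
    QueryExecutionContext={"Database": "my_database"},
    ResultConfiguration={"OutputLocation": "s3://my-bucket/athena-results/"},
)

# Poll until the query finishes before pointing Fabric at the exported files.
while True:
    state = athena.get_query_execution(
        QueryExecutionId=resp["QueryExecutionId"]
    )["QueryExecution"]["Status"]["State"]
    if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
        break
    time.sleep(2)

The s3://my-bucket/fabric-export/my_table/ folder can then be read into the Lakehouse with the Amazon S3 connector or an S3 shortcut.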

 

If a direct Athena connector is important for your use case, as already recommended by @lbendlin, try submitting an idea in the Ideas Forum so the product team can prioritize it.

 

I would also like to take a moment to thank @suparnababu8 and @lbendlin for actively participating in the forum and for the solutions you've been sharing. Your contributions make a real difference.

 

If I've misunderstood your needs or you still have problems, please feel free to let us know.

Best Regards,
Hammad.
Community Support Team

Hi @visheshjain,

As we haven't heard back from you, I'm just following up on our previous message. I'd like to confirm whether you've successfully resolved this issue or need further help. If yes, you are welcome to share your workaround and mark it as a solution so that other users can benefit as well. If you find a reply particularly helpful, you can also mark it as a solution.

And if you're still looking for guidance, feel free to give us an update, we’re here for you.

 

Best Regards,

Hammad.

Hi @visheshjain,
Hope everything's going smoothly on your end. We haven't heard back from you, so I wanted to check whether the issue got sorted.
Still stuck? No worries, just drop us a message and we can jump back in on the issue.

 

Best Regards,

Hammad.

Hi @v-mdharahman 

 

Please read my previous response!

 

Thank you,

Did I answer your question?
If yes, then please mark my post as a solution!

Thank you,
Vishesh Jain

Proud to be a Super User!



suparnababu8
Super User

Hi @visheshjain 

 

There is no direct connector available in Fabric to pull tables from AWS Athena as of now. Currently, the only supported AWS connector is the S3 bucket connector.

 

If you want an Athena connector, please submit your idea here: Fabric Ideas - Microsoft Fabric Community

 

Thank you!

 

Did I answer your question? Mark my post as a solution!

Proud to be a Super User!

lbendlin
Super User

There is a standard AWS Athena connector in Power BI.  Are you looking for something else (shortcuts or mirroring)?

Hi @lbendlin,

 

Yes, I am looking to mirror the tables. The data is in S3, and the tables have been created on it in Athena.

 

Do you know if having the data in Fabric will reduce refresh times or not?

Currently the gateway takes about 45-50 mins for a 6 million row dataset.

 

Thank you!

Vishesh Jain

Did I answer your question?
If yes, then please mark my post as a solution!

Thank you,
Vishesh Jain

Proud to be a Super User!


