jaryszek
Post Prodigy

Architecture setup using Azure Blob Storage - advice needed

Hello,

I was using a custom GitHub connector via Web.Contents to load separate CSV tables.
But it really isn't working out.

It's slow, not reliable, and there are issues with refreshing online.

I have to switch my approach.

I was thinking about Azure Blob Storage workflow:

1. From GitHub, using GitHub Actions, I will push changes into Azure Blob Storage under a branch-specific path like main/data/..., dev/data/...

And after I can:

a) either use this data from Azure Blob Storage directly within Power BI using the AzureStorage.Blobs connector: get all the files inside the branch path and filter them in each separate query (rough sketch after this list).
b) use Dataflow Gen2 with a Lakehouse to preprocess the data and shape it into a form that is well optimized for Power BI. After that, load the whole branch table with the table names into Power BI and use it for the separate queries.
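
For option a), I imagine the query would look roughly like this (the storage account, container, and file names are just placeholders for my setup):

let
    // List the containers in the storage account (account URL is a placeholder)
    Source = AzureStorage.Blobs("https://mystorageaccount.blob.core.windows.net"),
    // Pick the container that GitHub Actions writes into
    Container = Source{[Name = "data"]}[Data],
    // Keep only the blobs under the branch path I care about
    BranchFiles = Table.SelectRows(Container, each Text.StartsWith([Name], "main/data/")),
    // Pick one CSV from the filtered list (file name is a placeholder)
    SalesBlob = BranchFiles{[Name = "main/data/sales.csv"]}[Content],
    // Parse the CSV and promote the first row to headers
    Sales = Table.PromoteHeaders(Csv.Document(SalesBlob, [Delimiter = ",", Encoding = 65001]), [PromoteAllScalars = true])
in
    Sales

Each separate query would then just change the path filter and the file name.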

What do you think? What is the best approach?

thank you for your help,
Jacek

1 REPLY
v-venuppu
Community Support

Hi @jaryszek ,

Thank you for reaching out to Microsoft Fabric Community.

Switching to Azure Blob Storage, Dataflow Gen2, and a Lakehouse is a smart move if you're aiming for a more reliable, scalable, and high-performance data pipeline into Power BI.

 

Instead of using Power BI's Web.Contents connector to pull data directly from GitHub (which is often slow, fragile, and doesn't refresh well in the Power BI Service), you can automate the process using GitHub Actions. These actions push your data into Azure Blob Storage, which acts as your raw data staging area.

 

From there, Dataflow Gen2 in Microsoft Fabric takes over. It ingests and transforms the data, loading it into a Lakehouse. This step lets you clean and optimize the data before Power BI even touches it.
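
Since Dataflow Gen2 queries are authored in Power Query M, the shaping step inside the dataflow might look roughly like the sketch below, with the query's output mapped to a Lakehouse table as its data destination (the account, container, path, and column names here are only placeholders):

let
    // Read the raw CSV that GitHub Actions dropped into the staging container
    Source = AzureStorage.Blobs("https://mystorageaccount.blob.core.windows.net"),
    Raw = Source{[Name = "data"]}[Data]{[Name = "main/data/sales.csv"]}[Content],
    Parsed = Table.PromoteHeaders(Csv.Document(Raw, [Delimiter = ",", Encoding = 65001]), [PromoteAllScalars = true]),
    // Clean and type the columns once here, so every report reuses the same optimized table
    Typed = Table.TransformColumnTypes(Parsed, {{"OrderDate", type date}, {"Amount", type number}}),
    // Drop rows that would break the model downstream
    Cleaned = Table.SelectRows(Typed, each [Amount] <> null)
in
    Cleaned

Each table you want in the Lakehouse becomes one such query, and Power BI later reads the already-cleaned result instead of re-parsing the CSVs on every refresh.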

 

Once your Lakehouse is ready, Power BI connects to it using Direct Lake mode. This means no more waiting for data refreshes: Power BI reads the data directly from OneLake in memory, giving you near-instant performance. Plus, you can reuse the same clean tables across multiple reports, which saves time and keeps everything consistent.

 

In short, this setup gives you an enterprise-grade architecture:

 

GitHub is your source of truth and automation engine.
Blob Storage is your raw data landing zone.
Dataflow Gen2 handles the ETL (Extract, Transform, Load).
Lakehouse becomes your central data model.
Power BI focuses purely on visualization and insights.

 

It’s a clean, efficient, and scalable approach that makes managing and sharing data much easier in the long run.

 

If this post helps, then please consider accepting it as the solution to help other members find it more quickly.

Thank you.
