Hi all
I am not sure if I am in the right place for this question, but since it is data ingestion / storage related, I thought I'd give it a try 🙂
As there are quite a number of tools and options within Fabric and everything is still quite new, I wanted to cross-check with the community to find the most suitable data integration architecture for the following scenario.
- 4 Main sources: 2 Web APIs (with pagination), one Database (SQL) behind a Firewall, and Dynamics Business Central
- Data from one Web API is mostly event driven, the other is a payment gateway
- Use Power BI for visualization, preferably using centralized datasets with measures
- Not a huge amount of data for now (below 1 TB), but the solution should be scalable
- Small company, so preferably Power BI licenses only or a lower Fabric license
I lean towards using Dataflows Gen2 for all sources, connecting to the internal DB with the Power BI Gateway, and dropping everything into a Datamart. That way, I believe, it could all be done using Premium Per User licenses. But that would also mean that everyone consuming reports based on that would need PPU, right?
Alternatively I could bring it all into a Lakehouse or Warehouse. That would require a Fabric license.
Looking for some thoughts, ideas? What would you do? Or is there anything else that might need to be considered?
Thanks for the help!
Hi,
It's very difficult to give you a ready-to-use solution. If you are interested in consulting, you can reach me privately.
However, I would consider the following:
- Microsoft Fabric is a very good environment for implementing data mesh, and it is evolving in this direction. It's important that you understand data mesh. The best way to do so is to read the original papers and then notice how Fabric can improve on the original proposal.
- Once you understand data mesh (and I believe you already understand the single source of truth), you will need to decide how to combine both. Yes, combine: contrary to what many believe, they are not opposites. They work together.
- Inside each single source of truth, which can be one data mesh domain or the single source of truth for the whole company, depending on your decision, you will need to build a medallion architecture. The medallion architecture allows the data to evolve from the production model to the BI model, and there are many tricks in the implementation that let you evolve this model as you learn how different your intelligence model needs to be from the production one.
The bad news is that my videos about this are not published yet, but they will be soon.
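To make the medallion idea a bit more concrete, here is a minimal sketch in plain Python of the three layers (bronze = raw, silver = cleaned, gold = reporting-ready). The records, field names, and the functions `to_silver` and `to_gold` are all made up for illustration. In Fabric these steps would more likely be Dataflows Gen2 or notebooks writing to Lakehouse tables, so treat this only as a picture of the layering, not as an implementation.

```python
from datetime import date

# Bronze: hypothetical raw payment records, kept exactly as the source delivered them.
bronze = [
    {"id": "1", "amount": "10.50", "paid_on": "2024-01-03", "customer": " Alice "},
    {"id": "2", "amount": "n/a", "paid_on": "2024-01-03", "customer": "Bob"},
    {"id": "1", "amount": "10.50", "paid_on": "2024-01-03", "customer": " Alice "},
    {"id": "3", "amount": "4.25", "paid_on": "2024-01-04", "customer": "alice"},
]


def to_silver(rows):
    """Silver: typed, de-duplicated rows; rows with unparseable amounts are skipped."""
    seen, silver = set(), []
    for row in rows:
        if row["id"] in seen:
            continue  # duplicate delivery of the same record
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # a real pipeline would probably quarantine this row instead
        seen.add(row["id"])
        silver.append({
            "id": int(row["id"]),
            "amount": amount,
            "paid_on": date.fromisoformat(row["paid_on"]),
            "customer": row["customer"].strip().title(),
        })
    return silver


def to_gold(rows):
    """Gold: an aggregate shaped for reporting, here the total amount per customer."""
    totals = {}
    for row in rows:
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + row["amount"]
    return totals


silver = to_silver(bronze)
gold = to_gold(silver)
```

The point of keeping bronze untouched is that the cleaning rules in `to_silver` can change later without having to re-extract anything from the sources.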
Kind Regards,
Dennes
Thank you very much for your reply and insight.
I have heard of data mesh and the medallion architecture and it is for sure something to keep in mind when scaling the solution.
As of now, the primary objective is to get all sources into one source of truth, as the company is still very small. We need to build an architecture based on an enterprise business model that was defined together with the business and analytics teams. That means appending / merging and cleaning data from all the above sources into an easy-to-understand Kimball star schema, quickly and at low cost.
With the future data architecture in mind, dataflows plus a lakehouse seem most appropriate, as that setup can accommodate the principles of data mesh.
Can anyone point me to a cost calculation using the Fabric pricing? Really all I need seems to be a couple of dataflows plus a lakehouse, but I am having a hard time finding out what cost that would incur.
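As a rough illustration of what "merging into a star schema" means here, the following plain-Python sketch splits already-merged flat rows into one dimension table and one fact table. The sample rows, the column names, and the `build_star` function are hypothetical, and the surrogate key is just a running counter. In practice this logic would live in a dataflow or notebook rather than in a script like this.

```python
# Hypothetical flat rows, as they might look after appending and cleaning the sources.
orders = [
    {"order_id": 1, "customer": "Contoso", "country": "DE", "amount": 120.0},
    {"order_id": 2, "customer": "Fabrikam", "country": "US", "amount": 80.0},
    {"order_id": 3, "customer": "Contoso", "country": "DE", "amount": 45.5},
]


def build_star(rows):
    """Split flat rows into a customer dimension and a sales fact table."""
    dim_customer = {}  # natural key -> dimension row carrying a surrogate key
    fact_sales = []
    for row in rows:
        natural_key = (row["customer"], row["country"])
        if natural_key not in dim_customer:
            dim_customer[natural_key] = {
                "customer_key": len(dim_customer) + 1,  # simple running surrogate key
                "customer": row["customer"],
                "country": row["country"],
            }
        # The fact row keeps only the measure and a reference to the dimension.
        fact_sales.append({
            "order_id": row["order_id"],
            "customer_key": dim_customer[natural_key]["customer_key"],
            "amount": row["amount"],
        })
    return list(dim_customer.values()), fact_sales


dim_customer, fact_sales = build_star(orders)
```

Measures in the Power BI dataset would then aggregate `amount` from the fact table and slice it by the attributes in the dimension.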
Hi,
About the architecture: you can use only a few ETL steps to go directly to a Kimball star schema, or you can clean and evolve the data step by step, which combines the medallion architecture with your final single source of truth.
Keep in mind that the self-service nature of Power BI may end up "moving" you from a single source of truth to a data mesh sooner than you expect. If this happens without coordination, you end up with data silos.
About the cost: you buy what are called Fabric capacities, and you buy them in Azure. Here is the price list: https://azure.microsoft.com/en-us/pricing/details/microsoft-fabric/
The challenge will be to discover what capacity will be enough for your processing needs.
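For a first back-of-the-envelope estimate, something like the sketch below may help. It assumes pay-as-you-go billing, where cost is roughly capacity units × price per CU-hour × hours the capacity is running. The rate in the code is a placeholder and NOT a real price, so take the actual figure for your region from the pricing page. The `monthly_cost` function is made up for this example, and storage is billed separately and is not included.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def monthly_cost(capacity_units, price_per_cu_hour, hours_running=HOURS_PER_MONTH):
    """Rough pay-as-you-go compute cost for one capacity over a month."""
    return capacity_units * price_per_cu_hour * hours_running


# Placeholder rate for illustration only; look up the real one on the pricing page.
PLACEHOLDER_RATE = 0.20

# An F2 capacity has 2 capacity units (the number in the SKU name is the CU count).
always_on = monthly_cost(2, PLACEHOLDER_RATE)

# Same capacity, but paused outside roughly 10 hours on 22 working days.
business_hours = monthly_cost(2, PLACEHOLDER_RATE, hours_running=10 * 22)

print(f"always on:      {always_on:.2f}")
print(f"business hours: {business_hours:.2f}")
```

This only tells you what a given capacity costs, not whether it is large enough. For that you still need to run your real dataflows on a small capacity and watch how much of it they consume.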
Kind Regards,
Dennes
Hi @Data-Crunch - Thanks for using Fabric Community,
We haven't heard back from you on the last response and wanted to check whether your query has been resolved. If not, please reply with more details and we will try to help.
Hi @Anonymous thanks for the follow up.
The comments were definitely helpful, but choosing the right tools for a capable initial architecture is something I will have to find out by trial and error, I believe.
Hi @Data-Crunch ,
Glad to know that you got some insights on your query. We hope you keep using this forum and encourage others to do the same. You can always help other community members by answering their queries.
Thanks
Chenna Gopi Krishna