Get certified for free when you join Fabric Data Days 2026 and dive into Fabric, Power BI, SQL, AI, and other essential data skills.
Join nowTry your skills in the Power BI Dataviz World Championship! Round one ends June 26. Join now
Good day,
I previously requested assistance on designing a small data platform for Portfolio reporting
Solved: Data Environment Options - Microsoft Fabric Community
My source enviornment is across 2 AWS regions using frontend report exports API (RDS PosgresSQL + S3 + mongodb).
The API forms data logic java developed API from S3. These product data export API are my primary ingest using then secondary else prod source rds + sr+ mongo) with a service account. So i need to go small scale as i will try and motivate for funding but i need a data store, ingest, compute/transform orch, api loop and powerbi to read. So i received some boost where i was told the reports and data needs to be resuable which gives me encouragement of store + semantic sql views but my cost is low, the AI narratives is only a side benefit not the core purpose
So i hve a current report using a azure function consuming 150 USD, and thought i could now decom this report and push for F2 license of 280 usd with Fabirc+PythinIngest & orchestration for PowerBI direct query reporting and i could use i think Management studio to query data and build reporting overtime. Cam i do this and still be Compliant (GDPR) meaning raw data resides in region but agg is in fabric, is the store limitation etc
like @Natarajan_M pointed out archetetcurei could use
-------
F2 (~$262/month) is your entire stack in one purchase:
Fabric Lakehouse — your storage (replaces needing Azure SQL or Dataverse)
Fabric Notebooks — your ETL (Python script that loops the API per client weekly, writes to Lakehouse)
Fabric Pipelines — your scheduler (automates the notebook runs)
Direct Lake — Power BI reads from Lakehouse with no import refresh needed
Copilot — AI narratives, natural language Q&A, DAX generation
You don't need Dataverse, Azure SQL, or to migrate off AWS. Just read from the product API and store results in Fabric Lakehouse.
Architecture:
Product API → Fabric Notebook (loops per client) → Lakehouse → Power BI + Copilot
-------
i want to design. motivate and fool proof this to my cto, does anyone challenge this and is my best approach it remains low cost, covers full overtime storage, how does storage and ingest and consukme add on top of the F2 cost, i cant afford F3 they will be happy with manual refresh them but im trying to build a small data pltform with this i could possibly push to 400usd if i motivate well because i ma reusing existing cost that is wasted but i need to be sure the pythin can loop these aws client facing api or query the regions s3 + rds and orchesatrete and transform for agg data maniputation in pythin, then sql to build semantic views and then consume or even compute for some ML FC in the future as added benefits like the AI copilot. I cant afford F3 as i think its roughly 600usd
Please help, is this a good and best strongest method and my powerbi reports plugs to the fabric sql views remember i need to agg on my ingest for compliance and i eed to motivate why fabric because we AWS product org, my angle is that easy integrate with powerbi and that i could write scripts accross the fabric store etc comnining both regions data as concoildated. My prod aws analytics account env was decom due to cost but in my new role of consulting and client facing reporting need to show what our product api can do but alsio aid our consulting team, hence narratives , fc etc
Please help if you have any suggestions, improvements, better ideas, documnts, links, erd, motivation etc to help me
Thank You
Solved! Go to Solution.
Hi @icassiem ,
1. ERD / Architecture Example.
AWS (API / S3 / RDS / Mongo)
|
Fabric Notebook (Python ingest + transform, remove PII)
|
Lakehouse Tables (Silver Layer only)
|
SQL Views (Semantic Layer)
|
Power BI (Direct Lake / DirectQuery)
Ex: fact_usage_metrics, dim_client, dim_product and dim_date
2. Motivation + Risks: Replace Azure Function ($150) and Potential DB / ETL tools with single Fabric capacity.
Note: Reusable data for reports, Centralized semantic layer and Shared across consulting + clients. Risks you must call out, Capacity limits (F2) as Small compute (2 CUs) Can slow if too many users and large joins
Solution: Pre-aggregate data and Small models
3. Medallion — Do you need it?
Try below architecture.
API --> Python (clean + aggregate + remove PII) --> Silver tables
|
SQL Views (Gold)
Note: No Bronze (save cost), Silver = your storage and Gold = SQL views (no extra storage)
4. Orchestration: Best option is Fabric Pipelines, You can Schedule notebook runs, Retry handling and No extra cost (included in F2).
5. Query using SSMS: YES , this is supported, Please refer below steps.
Go to Lakehouse
Open SQL Endpoint
Copy connection string
Connect via SSMS (SQL Auth / AAD)
Note: Fabric exposes T-SQL endpoint over Lakehouse
6. Dev Access + Power BI Connectivity:
Development: You can use Fabric web portal (primary), VS Code (optional for notebooks) and Git integration available.
Power BI: Direct Lake Connect to Lakehouse tables and No gateway needed.
Note: Gateway only needed if you connect to on-prem data.
7. Why NOT AWS RDS:
AWS RDS --> Operational DB not analytics, Power BI , RDS --> Security risk (prod exposure), Scaling --> Expensive + manual, Transformation -->No built-in ETL.
Note: Fabric contains Separate analytics layer, No impact on prod, Built-in ETL + BI and Lower total cost.
8. Copilot usage: Works in Power BI (narratives, DAX, Q&A) and Fabric notebooks (code assist)
Note: It is not full pipeline automation and Not replacing ETL. It is a“Enhancement layer, not core architecture”.
9. Python vs SQL roles: Python Ingest APIs, Transform, Aggregate and Remove PII. SQL contains Views, Joins and Semantic layer.
10. Ingestion Pattern (overwrite vs CDC): Simple approach (recommended for F2) Overwrite or append and Use timestamps. Better approach: Incremental loads (by date) and Partition tables.
Note: CDC Overkill for your case and Requires more complexity. You do NOT need Azure Data Factory. No extra infra needed Everything is SaaS and no hidden infra costs.
11. Backup Plan (Power BI only approach): Yes you can, but No central data store, Repeated API calls, Not reusable and Not scalable.
Note: It acceptable only for Very small MVP only. Power BI-only creates datasets and is not a reusable data platform.
I hope this information helps. Please do let us know if you have any further queries.
Regards,
Dinesh
Hi @icassiem ,
Thank you for reaching out to the Microsoft Community Forum.
Yes, your proposed Fabric architecture is one of the best low-cost options available today and 100% achievable on F2 capacity.
Regarding your queries, please refer below.
Can Python in Fabric call AWS APIs + loop clients?
Yes, Fabric Notebooks run Python and can call REST APIs, loop through tenants/clients, handle JSON/CSV responses and write to Lakehouse tables. This replaces your Azure Function entirely.
Can it read AWS S3 / RDS if needed?
Yes, it reads via REST APIs, JDBC for PostgreSQL and S3 connectors.
Can you build SQL semantic views?
Yes, Fabric Lakehouse exposes a SQL endpoint, you can create views, Aggregate data and Build reporting models. This becomes your “semantic layer”.
Power BI integration?
Yes, you can integrate.
GDPR & Compliance?
This is compliant if you do not store raw PII in Fabric, you mask/anonymize during ingest and you define retention policy and purpose limitation.
Note: Fabric is used only for aggregated and reporting-level data, not raw personal data. That aligns with GDPR principles.
Architecture Improvements:
1. Layer your Lakehouse properly, even in small setup: Bronze (raw API response - temporary), Silver (cleaned + structured) and Gold (aggregated reporting tables). Then delete Bronze data if needed (for GDPR). Power BI reads from Gold.
2. Use incremental ingestion, instead of full reload only pull changed data per run.
3. Add simple orchestration via Pipelines, schedule notebook weekly/daily and retry logic for API failures.
4. Keep datasets small, because you are on F2, avoid huge joins and Pre-aggregate early.
Please refer below links.
Lakehouse end-to-end scenario: overview and architecture - Microsoft Fabric | Microsoft Learn
What is a lakehouse? - Microsoft Fabric | Microsoft Learn
How to use notebooks - Microsoft Fabric | Microsoft Learn
Direct Lake overview - Microsoft Fabric | Microsoft Learn
Microsoft Fabric - Pricing | Microsoft Azure
Governance and compliance in Microsoft Fabric - Microsoft Fabric | Microsoft Learn
Standards compliance in Microsoft Fabric - Microsoft Fabric | Microsoft Learn
I hope this information helps. Please do let us know if you have any further queries.
Regards,
Dinesh
@v-dineshya Thank You very much
1. do you have an ERD envioronment & solution archetetcure examples or supporting links and docs please
2. and motivation or risks i need to cover for my proposal
3. i probably cant do the medallian due to storage and cost becuase i guess that additional costs, maybe into silver layer where the pythin pipeline agg and removes PII data, maybe only a silver with the sql views as the gold mantic layer iots not storage but its bus tranformed or pre access layer?
4. orchestration?
5. how do i query ssms?
6. how would i access fabric from pc for dev and powerbi connect a gateway to sql views?
7. have I covered all angles theres nothing else like a db in azure or they will ask why not a rds db in aws but that gives powerbi security to rds prod issus and a db on prod product and ev etc
8. thereafter i can use copilot in powrbi narratives, does it allow m,e to use copilot else where in my data management or ingest, transform?
9. so everything is via python correct, the ingest to store to transfomr and sql only for views and query?
10. would my ingest be overrite be or is there a cdc manner or can i use data factory with pythoin or is that dddiotnal cost and is pc dveelopement that needs to deploy to server are those all additional costs?
Sorry for all the ques, im really trying to be sure i cover all agles etc
Please help
I don't think it's a big ask, motivating from 170 USD to 263 USD due to storage and doing internal reporing with F2 pluss the client essay narratives from the Ask to be distrubuted and summaries are good bonuses - so this is my first prize pitch
But wrost case could i do this in powerbi, i know i can schedule refresh python in the services but could powerquery loop per clinet api do the ingest and publish the extracts as datasets for me then to do the repoeting from the datasets almost like dtamrts as powerbi store?
"just thinking out loud"
Please Help
Hi @icassiem ,
1. ERD / Architecture Example.
AWS (API / S3 / RDS / Mongo)
|
Fabric Notebook (Python ingest + transform, remove PII)
|
Lakehouse Tables (Silver Layer only)
|
SQL Views (Semantic Layer)
|
Power BI (Direct Lake / DirectQuery)
Ex: fact_usage_metrics, dim_client, dim_product and dim_date
2. Motivation + Risks: Replace Azure Function ($150) and Potential DB / ETL tools with single Fabric capacity.
Note: Reusable data for reports, Centralized semantic layer and Shared across consulting + clients. Risks you must call out, Capacity limits (F2) as Small compute (2 CUs) Can slow if too many users and large joins
Solution: Pre-aggregate data and Small models
3. Medallion — Do you need it?
Try below architecture.
API --> Python (clean + aggregate + remove PII) --> Silver tables
|
SQL Views (Gold)
Note: No Bronze (save cost), Silver = your storage and Gold = SQL views (no extra storage)
4. Orchestration: Best option is Fabric Pipelines, You can Schedule notebook runs, Retry handling and No extra cost (included in F2).
5. Query using SSMS: YES , this is supported, Please refer below steps.
Go to Lakehouse
Open SQL Endpoint
Copy connection string
Connect via SSMS (SQL Auth / AAD)
Note: Fabric exposes T-SQL endpoint over Lakehouse
6. Dev Access + Power BI Connectivity:
Development: You can use Fabric web portal (primary), VS Code (optional for notebooks) and Git integration available.
Power BI: Direct Lake Connect to Lakehouse tables and No gateway needed.
Note: Gateway only needed if you connect to on-prem data.
7. Why NOT AWS RDS:
AWS RDS --> Operational DB not analytics, Power BI , RDS --> Security risk (prod exposure), Scaling --> Expensive + manual, Transformation -->No built-in ETL.
Note: Fabric contains Separate analytics layer, No impact on prod, Built-in ETL + BI and Lower total cost.
8. Copilot usage: Works in Power BI (narratives, DAX, Q&A) and Fabric notebooks (code assist)
Note: It is not full pipeline automation and Not replacing ETL. It is a“Enhancement layer, not core architecture”.
9. Python vs SQL roles: Python Ingest APIs, Transform, Aggregate and Remove PII. SQL contains Views, Joins and Semantic layer.
10. Ingestion Pattern (overwrite vs CDC): Simple approach (recommended for F2) Overwrite or append and Use timestamps. Better approach: Incremental loads (by date) and Partition tables.
Note: CDC Overkill for your case and Requires more complexity. You do NOT need Azure Data Factory. No extra infra needed Everything is SaaS and no hidden infra costs.
11. Backup Plan (Power BI only approach): Yes you can, but No central data store, Repeated API calls, Not reusable and Not scalable.
Note: It acceptable only for Very small MVP only. Power BI-only creates datasets and is not a reusable data platform.
I hope this information helps. Please do let us know if you have any further queries.
Regards,
Dinesh
@v-dineshya Wow, thank you so so much
1. I am new to fabric/onelake, any examples, sources, links or materials on how to actually implement, setup and dev this?
2. any more on the motivation and way of working, just checking there's anything you missed please that can help me show how this design will provide long term sustaianabilty too please?
3. On the Fabirc is it similar to SQL/Azure where theres databse sections with sql agent for jobs and and ssis to importat pakacges or the pything is directly inserted in sceduler task?
4. What would the DB size limit be, can i do overtime? like you said best to overwrite but i have 10 client product data and no clue what size im looking at, is there a limitation like 1TB or do i get charged on top of the F2 $262 for storage or does it include everything with a size limit?
5. Are there more F2 AI Benefits other thant the copilot summaries and client performance analyses etc for my enevironment and reporting please?
Hi @icassiem ,
Please refer below practical implementation for your project.
1. Setup: Create Fabric workspace, Enable Fabric capacity (F2) and create a Lakehouse
2. Ingestion: Create notebook with below sample code.
import requests
import pandas as pd
clients = ["client1", "client2"]
for c in clients:
data = requests.get(f"https://api/{c}").json()
df = pd.json_normalize(data)
# remove PII
df = df.drop(columns=["email","name"], errors="ignore")
# aggregate
df = df.groupby(["product","date"]).sum().reset_index()
# write to lakehouse
df.to_parquet(f"/lakehouse/default/{c}_data")
3. SQL Layer: In Lakehouse SQL endpoint, create a view with below code.
CREATE VIEW vw_client_metrics AS
SELECT
client,
product,
SUM(metric) AS total_metric
FROM silver_table
GROUP BY client, product;
4. Orchestration: Create Pipeline, Add Notebook activity and Schedule weekly.
5. Power BI: Connect via Direct Lake and SQL endpoint.
Long-Term Sustainability:
1. Reusability Layer: Your SQL views is a reusable datasets. Instead of:
Report --> API --> logic, now you have API --> Fabric --> Shared data model --> Multiple reports. This reduces duplication massively.
2. This is enterprise-grade design.
Layer Responsibility
AWS Operational data
Fabric Analytics
Power BI Visualization
Fabric vs Traditional SQL/Azure Stack:
Traditional (Azure SQL) as SQL DB (storage), SQL Agent (jobs) and SSIS (ETL)
Fabric Equivalent:
Traditional Fabric
SQL DB Lakehouse
SQL Agent Pipelines
SSIS Notebooks (Python/Spark)
Stored procedures SQL views
Scheduling: Python runs inside Notebook. Pipelines trigger it and No separate SSIS packages.
Storage Size & Limits: There is NO fixed DB size limit like SQL DB. Fabric uses OneLake (data lake storage).
Cost model:
Component Cost
F2 capacity ~$262/month
Storage Charged separately
Storage pricing: ~$23 per TB/month (approximate).
Note: Keep aggregated data only, Partition by date and Avoid raw logs.
Can it scale over time?
YES, No hard limit and Pay-as-you-grow
AI Capabilities in F2:
1. In Notebooks: Code generation (Python), Debugging assistance and Data exploration.
2. In Power BI: Narrative summaries, Q&A and DAX suggestions.
3. Data exploration: It has limited Auto insights.
Please refer below links.
End-to-end tutorials in Microsoft Fabric - Microsoft Fabric | Microsoft Learn
Microsoft OneLake documentation - Microsoft Fabric | Microsoft Learn
Pipeline Overview - Microsoft Fabric | Microsoft Learn
Connect and Query SQL Database in Microsoft Fabric | Microsoft Learn
Microsoft Fabric quotas - Microsoft Fabric | Microsoft Learn
I hope this information helps. Please do let us know if you have any further queries.
Regards,
Dinesh
@v-dineshya Thank You 👍
schu i have a lot to learn, last ques:
1. Is the storage/pipeline/notebooks all in PowerBI and not Azure, like a PowerBI Extended capability framework and everything is managed from powerbi?
2. I wont go more than 1TB, do you think its wise to go for the F4 Reservation 1yr or pay as you go better as my current model we thinking of monthly refresh of 10 reports, are there benefits from F2 to F4 other than processing power? so its F2 $263+$23 or Reservation F4 $312+$23 - is it worth the motivation does the business see the difference like extra featires etc or just pocessing?
3. This "Data exploration: It has limited Auto insights.", i want to provide client analyses monthly but in full essay narratives for 15 clients or is this something different, is thois the per visual summary limit?
4. So i need to create a PowerBI worksapce and assign the license to it and i work within that workspace on powerbi or azure - is azure falling away what about the synapse dw/databricks if fabric can also do medallion & databricks?
5. With the Fabric Notebooks i can do forecasting and schedule in fabirc peipleine correct but access data in the Lakehouse or does fabric also have databircks available etc?
Sorry for all the ques, im learning so much.
Hi @icassiem ,
Please refer below.
1. It’s not just Power BI, It’s Microsoft Fabric a SaaS platform. It is managed from the Power BI / Fabric portal. You access everything via https://app.fabric.microsoft.com. Inside that UI you get Power BI, Lakehouse, Notebooks, Pipelines and SQL endpoint.
2. There are NO extra features unlocked in F4, only more capacity. You said 10 clients, Monthly refresh, 10 reports and Aggregated data. This is VERY LOW load.
changes from F2 --> F4
Area Difference
Compute More power (2x)
Concurrency More users supported
Performance Faster loads/queries
Features SAME features
Note: Start with F2, Enough for your pipelines, reporting, development and monitor usage metrics. If necessary Upgrade later.
3. Fabric / Power BI does Per report: Smart narrative visual and Generates insights per dataset.
Copilot: Can generate summaries, insights and explanations.
Note: It is not a multi-client report generator engine and not automatically generating 15 separate essays per client. AI narrative is assistive, not a full reporting engine.
Workaround: Use Python in Fabric, Generate text summaries, Store in table
and display in Power BI.
4. Yes, You don’t need Azure SQL, Data Factory and Synapse. Fabric replaces everything.
5. Fabric Notebooks are Spark-based, Python and SQL. It can Ingest data, Transform and Forecast. yes you can do forecasting. Please refer below example.
from sklearn.linear_model import LinearRegression
You can schedule it by Using Pipelines: Run notebook weekly/monthly and Chain tasks.
You don't need Databricks, Fabric already includes Spark, Notebooks and ML capability.
I hope this information helps. Please do let us know if you have any further queries.
Regards,
Dinesh
@v-dineshya Thank You so so much, you have been a star in helping me. Thank You for your patience and knowledge sharing 🙏
3 Last things:
1. This is important for me "Note: It is not a multi-client report generator engine and not automatically generating 15 separate essays per client." so i will start with F2 as there's no additional capability benefit with F4 but i was hoping to have a slicer to select a client and the Q&A and ask "give me a full analyses of the past 6 months based on the selected client for example or should i say give me a summary past 6 months per client" as this is internal reporting this part - would that workm is the Q&A an LLM in PowerBI could do that?
2. I am ready to accept as solution, im just wondering im going to need more future help as i go along, how do i do can i reach out to you or should i create another ques but this might go deeper into fabric?
3. Are the copilot narratives strong in detail meaning, if my page is a quarterly view per slicer client, could it generate a essay or is that only in the Q&A or summary per visuals as i would like a page detail narrative of all visuals based on slicer client selected?
Sorry for the questions
Hi @icassiem ,
1. Can Q&A generate a full client-specific 6‑month analysis?
Yes, but not reliably as a full “consulting-grade essay”. Power BI Q&A (LLM) is an LLM-powered semantic query layer. It can Understand natural language, Generate measures/visuals and Summarize trends. It can produce a response, but not structured like a proper report/essay, Output varies (not deterministic), Not multi-paragraph consulting-quality narrative and works best for quick insights and short summaries.
2. Can Copilot generate full-page narrative (your slicer-based report)?
This is closer to what you want but still not perfect. Power BI has “Smart Narrative” visual, it reads entire page (all visuals). Responds to slicers and filters. It generates multi-sentence summaries, trend descriptions and KPI insights.
Yes, your slicer-based idea WILL work with Smart Narrative, Copilot and Q&A. but it will give summaries, not full essays. If you want true client-level narrative reports, the best solution is Generate narratives in Fabric (Python) and serve them dynamically in Power BI.
Please refer below practical setup.
Phase 1 (F2 setup): Use smart Narrative visual, Simple Copilot Q&A. It show value fast.
Phase 2 (upgrade your “wow factor”): Add Python-generated narratives and
Stored narrative tables.
You want Client-level, 6-month, detailed narrative (essay style). Please refer below workaround.
Combine Fabric + Python + Power BI.
1. Generate narrative in Notebook. Please refer below sample python code.
def generate_summary(df):
growth = df["revenue"].pct_change().mean()
trend = "growth" if growth > 0 else "decline"
summary = f"""
Over the past 6 months, the client shows a trend of {trend}.
Key drivers include product usage increases and seasonal variation.
Peak performance occurred in {df['month'].max()}.
"""
return summary
2. Store in table:
client | narrative | date_range
3. Show in Power BI: Use table/text visual and filter by slicer. Now you get Stable, Repeatable, Client-specific and “Essay-like”.
I hope this information helps. Please do let us know if you have any further queries. If you face any issues or need any help in future. please create a thread in Fabric community . we are happy to help you .
Home - Microsoft Fabric Community
Regards,
Dinesh
@v-dineshya Thank You Very Much, really appreciate your help ❤️🙏
Wish me luck, i will post in the fabric in the near future
| User | Count |
|---|---|
| 17 | |
| 11 | |
| 11 | |
| 8 | |
| 7 |
| User | Count |
|---|---|
| 40 | |
| 35 | |
| 35 | |
| 34 | |
| 20 |