Hey folks,
I'm trying to figure out what the best solution is going to be for my use case. There's no easy way to get direct connections to any of our databases, so for roughly 90% of my source data, everything arrives as daily CSV files delivered via email. Think of these as daily snapshots from 30 data sources, totaling fewer than 500k records per day across all 30 data sets.
We can no longer easily manage this with SharePoint feeding Power BI. We need to move to something more automated and sustainable for our transformations and calculations.
What would be the best Fabric solution to manage the ingestion and transformation of all of this data, so that we can do the transformations in Python and publish the cleansed data to Power BI or a T-SQL endpoint?
I was thinking of a Fabric data warehouse on Premium capacity. We stage the files daily into a Fabric lakehouse parent folder with 30 sub-folders; each sub-folder would receive the daily files as appends. Those appends could be handled by a Power Automate flow that grabs the attachments out of our mailbox and writes them into the correct folder.
Then we use either a Dataflow or notebooks to connect to the lakehouse folders and ingest the data into 30 different tables in the data warehouse, and from there use notebooks to transform and load into 30 silver or gold tables?
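For what it's worth, here is a minimal sketch of what the notebook ingest step could look like for one of the 30 folders, assuming a default lakehouse is attached to the notebook and the Power Automate flow drops CSVs into Files/bronze/<source_name>/. The folder layout, source name, and table names are illustrative, not anything Fabric prescribes.

```python
# Sketch of a Fabric notebook ingest step for one source folder.
# Assumes a default lakehouse is attached, so relative "Files/..." paths resolve.
from pyspark.sql import functions as F

source_name = "source_01"                        # one of the 30 sub-folders (illustrative)
csv_path = f"Files/bronze/{source_name}/*.csv"   # daily CSV drops from the Power Automate flow

df = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv(csv_path)
    .withColumn("ingested_at", F.current_timestamp())  # track when each row landed
)

# Append into a bronze Delta table; downstream notebooks can build the
# 30 silver/gold tables from these bronze tables.
df.write.format("delta").mode("append").saveAsTable(f"bronze_{source_name}")
```

The same pattern could be looped over all 30 sub-folders, or parameterized and scheduled from a pipeline.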
"We can no longer easily manage this with SharePoint feeding Power BI."
Can you please elaborate? With a sufficient quota, a SharePoint Document Library seems to work well for us, especially if you avoid enumerating folder contents across the whole site.
A Fabric Lakehouse would be an obvious choice, but its tables are based on Parquet (Delta Lake) files, and the general guidance is to avoid the "many small files" scenario that you have; large files in the gigabyte range are preferred.
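One way to mitigate the small-files issue after many small daily appends is to compact the Delta table periodically. A rough sketch, assuming a hypothetical bronze table name like the one above:

```python
# Compact a Delta table that has accumulated many small files from daily appends.
# OPTIMIZE rewrites the data into fewer, larger Parquet files.
spark.sql("OPTIMIZE bronze_source_01")

# Optionally clean up the old, superseded files once nothing reads prior versions
# (the default Delta retention check still applies).
spark.sql("VACUUM bronze_source_01")
```

This could be run on a schedule (e.g., weekly) from a notebook or pipeline rather than after every daily load.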
If you can run a Fabric trial, you should test it out and see what the performance is like, including the transforms from CSV to Parquet (which I believe are built in).
Are you bringing the data directly from SharePoint into a Fabric data warehouse, into individual tables like I described? If that were possible, then we could do the transformations in the data warehouse, and that would work.
"Are you bringing the data directly from SharePoint into a Fabric data warehouse, into individual tables like I described?"
We are not there yet - we are using SharePoint as our Data Lake for now. But with Data Factory you can bring the data over if you need to.