Re: Building flat file pipeline

davidz106 · 11-09-2023

Hi everyone,

I've recently come across Microsoft Fabric and its connected services, and I'm intrigued by the functionality where you can simply drop files in an explorer interface, and they become available in OneLake, ready for use in tools like Power BI.

Our organization deals with a lot of semi-structured flat files (mainly CSV and XLSX). Currently, we rely on Python scripts to parse these files into a structured table format and load them into a SQL database. However, our current Luigi-based Python pipelines are not well maintained, and a transition to another (Microsoft-based) tool would be very beneficial.

I'm interested in migrating this process to the Fabric ecosystem in a way that maintains the simplicity shown in the promo videos, but also ensures that file parsing happens automatically to deliver the formatted data we need in OneLake.

I'm completely new to Fabric and Microsoft's modern data services, so I would appreciate any insights on:

Which specific Microsoft services within the Fabric framework are best suited for this task?
What are the recommended first steps to start this integration?

I'm aiming for a solution that minimizes manual intervention and maximizes efficiency in making these semi-structured files readily available and usable for our analytics and reporting tools.

I would love it if somebody were willing to prepare a demo on this subject, maybe with a very simple, minimal example dataset. We can arrange it as a paid one-time job.

Anonymous · 11-14-2023

Hi @davidz106 ,

Thanks for using Fabric Community.

To guide you better, I have a few questions:
1. Can you please share some dummy data or sample files (2 or 3)?
2. How do you expect the data to be loaded into the SQL DB? I would like to understand the sink table structure.
3. How big is the data?
4. Can you please share the challenges in your current Python pipeline implementation?
5. Are you looking for real-time triggers? For example, whenever a file is uploaded in OneLake file explorer, processing starts and the file is loaded into the SQL DB.

Please share your responses so that I can guide you better.

davidz106 · 11-22-2023

Hi v-gchenna-msft,

1. and 3. Sample data could be a simple 20x20 table with one empty row. By 'parsing' I mean operations such as skipping empty rows in a DataFrame (df).

2. We have thousands of files, so uploading them directly into Power BI is not an option; a data pipeline is needed. The parsing functions are already written in Python, but we could use another tool if needed.

4. The current solutions work, but we want a more robust approach, preferably inside the Microsoft environment, since we would then use this database as input to Power BI.

5. This is the gist of my question and exactly what we are aiming for. You put it nicely: we upload a file in OneLake file explorer, processing starts, and the file is loaded into the SQL DB.

A tutorial on that would be very welcome.
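
To make the parsing and loading steps concrete, here is a minimal sketch of what is meant by "skip empty rows, then load to a SQL DB". It is only an illustration under assumptions: it uses Python's standard `csv` module instead of a DataFrame, an in-memory SQLite database as a stand-in for the real SQL DB, and hypothetical names (`parse_rows`, `load_rows`, the `measurements` table, the sample data). It does not show the Fabric side, i.e. how an upload would trigger the processing.

```python
import csv
import io
import sqlite3


def parse_rows(text):
    """Return (header, rows) from CSV text, skipping rows whose cells are all empty."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    # A row is kept only if at least one cell contains non-whitespace text.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    return header, rows


def load_rows(conn, table, header, rows):
    """Create the table if it does not exist and insert the parsed rows."""
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()


# Hypothetical sample file content with one empty row in the middle.
sample = "id,name,value\n1,a,10\n,,\n2,b,20\n"

header, rows = parse_rows(sample)

# SQLite in memory stands in for the target SQL DB (assumption for the sketch).
conn = sqlite3.connect(":memory:")
load_rows(conn, "measurements", header, rows)
```

In a real pipeline the same two functions would run once per uploaded file, with the connection pointing at the actual database.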

Anonymous · 11-16-2023

Hi @davidz106 ,

We haven’t heard back from you on the last response and were just checking to see whether you had a chance to look at the reply above.

Anonymous · 11-19-2023

Hi @davidz106 ,

We haven’t heard back from you on the last response and were just checking to see whether you had a chance to look at the reply above.
