dnauflett
Frequent Visitor

Where Do I Start With Consuming Source Files That Can Be Queried in Fabric Data Warehouse

Hello,

 

I know how to do this in AWS, but I want to make sure I start on the correct path with Microsoft Fabric Warehouse. We receive 520 files from our core system every day. We want to create a persistent stage in Fabric Warehouse so our DGO team can query this data historically (we only stage one day's worth of data in SQL Server for warehouse processing). Today we have to manually load whichever day they want to inspect into SQL Server, which is painful for everyone.

 

The files themselves do NOT have column headers, nor do they all contain dates that denote the Business Effective Date of the data within. We do have metadata for each table that tells a process the structure, and we can easily transform this metadata so Fabric can use it to know the structure of each file. The names of the individual files will be the same every day. We have an internal process for moving files (move-it) within our company that can write to Azure.

 

In another life, we used AWS S3 to place each day's files in a date-named directory (e.g., 20240625) and used the date on the folder as the Business Effective Date. We used a crawler that crawled the directory and made the new data available. I was looking to do something similar in Microsoft Fabric; I just don't know where to start.

 

So my question: within the Fabric ecosystem, what is the best strategy for getting those daily files loaded into Fabric so they can be queried across multiple days?

1 ACCEPTED SOLUTION

What BI tool will you use? (Power BI?)

 

Will you query the files directly from your BI tool? Or will you load the data from the files into Delta tables first, and then query the Delta tables from your BI tool?

 

Could you write the contents of your files into a Delta table (using a notebook), and use the notebook to add the Business Effective Date as a column in the Delta table?
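
Something like this rough, untested sketch might illustrate the idea. Everything here is a made-up placeholder (the folder layout, schema, and table name), assuming the files land in a Lakehouse attached to the notebook:

```python
from pyspark.sql import functions as F

business_date = "20240625"  # hypothetical date-named folder, like your S3 layout

# The files have no headers, so the schema comes from your metadata.
# DataFrameReader.schema() accepts a DDL string like this directly.
ddl = "account_id STRING, balance DECIMAL(18,2)"

df = (
    spark.read  # `spark` is predefined in Fabric notebooks
    .option("header", "false")
    .schema(ddl)
    .csv(f"Files/landing/{business_date}/accounts.csv")
    # Derive the Business Effective Date from the folder name.
    .withColumn("BusinessEffectiveDate", F.to_date(F.lit(business_date), "yyyyMMdd"))
)

# Append each day's load so the table accumulates history.
df.write.format("delta").mode("append").saveAsTable("accounts")
```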


8 REPLIES
dnauflett
Frequent Visitor

Thanks, frithjof_v.

From what I was reading, I think you confirmed that I need to park the actual files in a Lakehouse. I was just wondering how to get the metadata layer on top of them so we can query the Lakehouse and the files using our BI tool.


We are going to continue using MicroStrategy. After months of trying, and multiple high-end consultants, we can't get Power BI to do multi-fact-table analysis.

The plan is to query the data from MicroStrategy. I'm very new to the Microsoft ecosystem and did not know how to get started. Not having file headers is definitely a complicating matter. I'm a little familiar with pipelines, but I've never used notebooks. Would I have to write a notebook job for each file, or can one use metadata? I'm going to start reading about notebooks.

I'm not quite sure what is meant by multi-fact-table analysis.

If it is the same as a multi-fact star schema, then it should be quite straightforward to work with in Power BI: https://community.fabric.microsoft.com/t5/Desktop/creating-a-multi-fact-star-schema/m-p/272010

I don't have experience with MicroStrategy.

 

I think it should be possible with only one notebook for all the files. The notebook should be able to loop through the files and do the transformations, adding the schema, column headers, etc., using Python code, roughly like the sketch below. I don't have much experience with notebooks, and I'm not sure what a crawler is, but I think a notebook should be well suited to manipulating files in directories, applying a schema (column headers), and writing to Delta tables.
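
As a very rough, untested sketch of that metadata-driven loop (every name here is hypothetical; in practice you would read the file list and schema strings from wherever you keep your transformed metadata):

```python
from pyspark.sql import functions as F

business_date = "20240625"  # hypothetical date-named folder in the Lakehouse

# Hypothetical metadata: file name -> schema as a DDL string.
file_metadata = {
    "accounts.csv": "account_id STRING, balance DECIMAL(18,2)",
    "transactions.csv": "txn_id STRING, account_id STRING, amount DECIMAL(18,2)",
}

for file_name, ddl in file_metadata.items():
    df = (
        spark.read  # `spark` is predefined in Fabric notebooks
        .option("header", "false")  # the source files have no column headers
        .schema(ddl)  # .schema() accepts a DDL string, supplying names + types
        .csv(f"Files/landing/{business_date}/{file_name}")
        .withColumn("BusinessEffectiveDate",
                    F.to_date(F.lit(business_date), "yyyyMMdd"))
    )
    # One Delta table per file; appending daily accumulates history,
    # so the tables can be queried across dates.
    df.write.format("delta").mode("append").saveAsTable(file_name.removesuffix(".csv"))
```

The same loop should scale to all 520 files, as long as the metadata drives it.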

 

A notebook can be run on a schedule, or triggered by a data pipeline.

 

If you can share some screenshots or illustrations of the files and the data structures you will be working with, it would be easier to understand and give suggestions.

 

Hi @dnauflett 

 

Thanks for using Microsoft Fabric Community.

We haven't heard from you since the last response and wanted to check whether you have found a resolution yet. If you have, please share it with the community, as it can be helpful to others. Otherwise, reply with more details and we will try to help.

 

Thank you.

Hi @dnauflett 

 

We haven't heard from you since the last response and wanted to check whether you have found a resolution yet. If you have, please share it with the community, as it can be helpful to others. If you have any question relating to the current thread, please let us know and we will try our best to help you. If you have a question about a different issue, we request that you open a new thread.

 

Thank you.

Sorry for my slow responses; I'm trying to figure out how to get started. I'm going to do some reading on notebooks to see if we could use them. We have about 520 files that we need to expose daily. I was hoping Fabric had some type of crawler that could crawl a directory structure and use metadata to define and expose the data for our BI tool.

frithjof_v
Community Champion

I've never done something like that before, but if you want to work with files and directories, I think you need to use a Lakehouse (for storing and working with the directories and files) instead of the Warehouse.

 

It sounds like a notebook (or a combination of a notebook and a data pipeline) could be the relevant tool.

Or a Spark job definition (I've never tried one).
