Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! It's time to submit your entry. Live now!

Reply
pmscorca
Kudo Kingpin
Kudo Kingpin

Filtering files using a SQL view or table

Hi,

I need to retrieve several historical csv files from a on-premise folder.

The name of each file mainly has a prefix, a middle string and a final date, in yyyymmdd format, plus the csv extension.

F.e. the name of a file could be sourceprefixfile_middlestring_20251231.csv.

In the remote folder there are a file for each day: sourceprefixfile_middlestring_20251201.csv, sourceprefixfile_middlestring_20251202.csv, sourceprefixfile_middlestring_20251203.csv and so on; but I need to get the end of month file as sourceprefixfile_middlestring_20251031.csv, sourceprefixfile_middlestring_20251130.csv, sourceprefixfile_middlestring_20251231.csv, etc and not all files.

So, I've implemented a SQL view in a warehouse to obtain the end of month files respect to each source prefix file starting from 2022-01-01. This view exposes the source prefix file and the month end date (in yyyymmdd format).
Then, I've tried to filter all files by the SQL view using a pipeline: I've implemented a lookup activity to read the SQL view, then a get metadata activity to get all file name (childitems), but I'd like to match the corresponding output arrays possibly without reading all files more times. I cannot implement an exact match, I can filter a file respect to the source prefix file and the month end date, returned by the lookup activity and not to an exact name.

Then I should use a for each activity to iterate the filtered files in order to copy them in a lakehouse.

Now, any suggests to me to implement the right filter condition for the filter activity? Many thanks

6 REPLIES 6
ssrithar
Resolver II
Resolver II

 Hi @pmscorca ,

 

Here are a few additional best practices and enhancements you can consider to make your pipeline more reliable, maintainable, and scalable when filtering and copying historical end-of-month CSV files from on-prem folders using a SQL view + Data Pipeline approach in Microsoft Fabric:

 

Make Filtering Case-Insensitive (If Needed) -- If filenames might have inconsistent casing (e.g.,SourcePrefix_20240131.csv)

Trim Extra Whitespace in Lookup Output-- To ensure the prefix and month_end columns are trimmed to prevent false negatives

 

 

If this post helps, then please appreciate giving a Kudos or accepting as a Solution to help the other members find it more quickly.
If I misunderstand your needs or you still have problems on it, please feel free to let us know. Thanks a lot!

Hi @pmscorca 

Following up to confirm if the earlier responses addressed your query. If not, please share your questions and we’ll assist further.

Ok, about this issue I think that the For each should be sequential, isn't it?

Thanks

Hi @pmscorca 

Yes, ForEach is the right approach here. You will use ForEach to loop through each SQL row.

 

Thank You.

Vinodh247
Super User
Super User

you already have everything that you need in the lookup. The simplest approach would be:

  1. Lookup gets rows: (prefix, month_end_yyyymmdd)

  2. Get-metadata gets all filenames (childItems)

  3. Filter activity keeps only files where:
    filename startswith(prefix) AND filename contains(month_end)

In Fabric pipelines the filter condition looks like this:

@And(
startswith(item().name, item().prefix),
contains(item().name, item().month_end_yyyymmdd)
)

If your lookup returns an array of objects, wrap the filter under a ForEach on the lookup output, but do NOT do multiple Get Metadata calls:

  • 1 Get Metadata

  • 1 ForEach over lookup records

  • Inside: Filter on the same childItems list

This prevents repeated enumeration of the folder. You generate a single list of files and apply logical filtering per prefix/date combination.

 

Please 'Kudos' and 'Accept as Solution' if this answered your query.

Regards,
Vinodh
Microsoft MVP [Fabric]
LI: https://www.linkedin.com/in/vinodh-kumar-173582132
Blog: vinsdata.in

Hi, thanks for your reply.

I'm implementing this solution:

  1. Lookup activity to read the SQL view in warehouse, getting some rows with some columns (source prefix file and month end date);
  2. Get metadata activity to access to on-premise folder with csv files and to get the child items;
  3. For each activity to iterate respect to lookup output --> @activity('Lookup').output.value;
    inside the For each setting two variables to save source prefix file and month end date for the current item;
  4. Filter activity to filter the output child items returned from the Get metadata activity.

The filter condition used is:

@And(
startswith(item().name, variables('SourcePrefixFile')),
contains(item().name, variables('MonthEndDate'))
)

I think that the For each should be sequential.

 

Thanks

 

Helpful resources

Announcements
December Fabric Update Carousel

Fabric Monthly Update - December 2025

Check out the December 2025 Fabric Holiday Recap!

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Solution Authors