Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Request now

Reply
RaiSta
New Member

how to merge lists of files

Hi all,

 

currently I got stuck and need an idea how to proceed...

  • I've got a net-folder (one folder!) with many (several thousand) files, mostly .csv.
  • Most files belong to at least 3 different categories, to be distinguished by there name schema (for instance one category are files with "EVT", others with "LOG" and a third category with "Rate" in their names.
  • Usually (but not always(!) ) I have triplets with one file of each category, so each triplet consists of one file from category "EVT", one from "LOG" and one from "Rate".

Now my problem is how to build these triplets inside Power Query?

  • The only property all files of a single triplet have in common are somewhat simlar (but not necessarily identical) dates and times of file-modification --> this is how a human observer can identify them.
  • The differences between file-modification-times of the files of a triplet normally are within 1-2 minutes
  • Nothing else indicates, how they belong together to the same triplet
  • For a human observer the files to be combined into one triplet are - not easy - but can be identified due to their modification-times. But ... how to teach PQ to do so?

Up to now I have tables of the files in the folder, seperated into "EVT"-, "LOG"- and "Rate"- files, together with corresponding columns of their respective modification-times. Also for "EVT" and "LOG"-files I found a way to group them by some naming convention they follow. But how to add  the "Rate"-files ... no corresponding naming convention, only their modification-times...

I would like to have a condition to select corresponding files by setting one type as 'standard' and comparing the other group, whether their time-column ist within 'standard'-time ± 0:01:00, but how to do this in PQ? Especially due to that not all "EVT" + "LOG"-files pairs have corresponding "Rate"-files...

 

Any idea, how to automate this in PQ?

 

Tanks,

RaiSta

1 ACCEPTED SOLUTION
jbwtp
Memorable Member
Memorable Member

Hi @RaiSta,

 

I think you can use something like this:

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcg0L0TUyMDIyBGKlWB0UAWN0AROwgI+/O6oWJAGIlqDEklSIiKGBIZoI2JRYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [FileName = _t]),
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "FileName", Splitter.SplitTextByDelimiter("-", QuoteStyle.Csv), {"FileType", "FileDateTime"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"FileType", type text}, {"FileDateTime", Int64.Type}}),
    EVTs = Table.SelectRows(#"Changed Type1", each ([FileType] = "EVT")),
    #"Added Custom" = Table.AddColumn(EVTs, "Data", (x) => Table.SelectRows(#"Changed Type1", each [FileDateTime] = x[FileDateTime]))[Data]
in
    #"Added Custom"

 

 

The key is the condition in

 

 

 

#"Added Custom" = Table.AddColumn(EVTs, "Data", (x) => Table.SelectRows(#"Changed Type1", each [FileDateTime] = x[FileDateTime]))[Data]

 

Just replace [FileDateTime] = x[FileDateTime] with an appropriate condition getting you +/- 1 min as applicable in your case.

 

Cheers,

John

View solution in original post

3 REPLIES 3
jbwtp
Memorable Member
Memorable Member

Hi @RaiSta,

 

I think you can use something like this:

 

 

let
    Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcg0L0TUyMDIyBGKlWB0UAWN0AROwgI+/O6oWJAGIlqDEklSIiKGBIZoI2JRYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [FileName = _t]),
    #"Split Column by Delimiter" = Table.SplitColumn(Source, "FileName", Splitter.SplitTextByDelimiter("-", QuoteStyle.Csv), {"FileType", "FileDateTime"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Split Column by Delimiter",{{"FileType", type text}, {"FileDateTime", Int64.Type}}),
    EVTs = Table.SelectRows(#"Changed Type1", each ([FileType] = "EVT")),
    #"Added Custom" = Table.AddColumn(EVTs, "Data", (x) => Table.SelectRows(#"Changed Type1", each [FileDateTime] = x[FileDateTime]))[Data]
in
    #"Added Custom"

 

 

The key is the condition in

 

 

 

#"Added Custom" = Table.AddColumn(EVTs, "Data", (x) => Table.SelectRows(#"Changed Type1", each [FileDateTime] = x[FileDateTime]))[Data]

 

Just replace [FileDateTime] = x[FileDateTime] with an appropriate condition getting you +/- 1 min as applicable in your case.

 

Cheers,

John

Hi John,

 

U finally pushed me solve my problem! Thank you very much!

 

Only, how to understand:

Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45Wcg0L0TUyMDIyBGKlWB0UAWN0AROwgI+/O6oWJAGIlqDEklSIiKGBIZoI2JRYAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [FileName = _t])

beside that it creates a sample for the further explanation (I had no idea how to create a sample to explain what I was looking for...) it seems to me to be some sophisticated binary magic.

 

Is there a way for mortal men (and women...) to understand, what this 'holerith code' does?

 

Thanks a lot!

RaiSta

jbwtp
Memorable Member
Memorable Member

Hi @RaiSta,

 

You, probably, should not bother yourself with trying to understand this bit of the code (aside from that it generates a table). I, personally, don't :). You are right this is like reading a binary code, quite facinating, but also quite pointless.

 

This is the step whhich is genarated by PQ when you use "Enter Data" functinality (button in the main menu):

 

jbwtp_0-1667339852494.png

 

Cheers,

John

Helpful resources

Announcements
Fabric Data Days Carousel

Fabric Data Days

Advance your Data & AI career with 50 days of live learning, contests, hands-on challenges, study groups & certifications and more!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.

Top Kudoed Authors