Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Grow your Fabric skills and prepare for the DP-600 certification exam by completing the latest Microsoft Fabric challenge.

Reply
RipeKiwi
Frequent Visitor

Speed up a combine from folder

Hi!

I currently have a combine going on using this DAX code:

 

let
    Source = Folder.Files("M:\Smart\reports"),
    #"Filtered Rows" = Table.SelectRows(Source, each ([Extension] = ".TXT") and ([Name] = "DailyOpenAM.TXT")),
    #"Inserted Text Between Delimiters" = Table.AddColumn(#"Filtered Rows", "FolderDate", each Text.BetweenDelimiters([Folder Path], "\", "\", 2, 0), type date),
    #"Changed Type" = Table.TransformColumnTypes(#"Inserted Text Between Delimiters",{{"FolderDate", type date}}),
    #"Filtered Rows1" = Table.SelectRows(#"Changed Type", each [FolderDate] > #date(2020, 12, 31) and [FolderDate] < #date(2022, 1, 1)),
    #"Filtered Hidden Files1" = Table.SelectRows(#"Filtered Rows1", each [Attributes]?[Hidden]? <> true),
    #"Invoke Custom Function1" = Table.AddColumn(#"Filtered Hidden Files1", "Transform File (9)", each #"Transform File (9)"([Content])),
    #"Expanded Table Column1" = Table.ExpandTableColumn(#"Invoke Custom Function1", "Transform File (9)", Table.ColumnNames(#"Transform File (9)"(#"Sample File (9)"))),
    #"Removed Columns" = Table.RemoveColumns(#"Expanded Table Column1",{"Column2", "Column7", "Column9", "Column10", "Column12", "Column13", "Column14", "Column15", "Column16", "Column17", "Column18", "Column19", "Column20", "Column21", "Column23", "Column28", "Column29", "Column30", "Content", "Name", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}),
    #"Replaced Value" = Table.ReplaceValue(#"Removed Columns",". ","",Replacer.ReplaceText,{"Column1", "Column3", "Column4", "Column8", "Column11", "Column24", "Column25", "Column26"}),
    #"Replaced Value1" = Table.ReplaceValue(#"Replaced Value","'","",Replacer.ReplaceText,{"Column31"}),
    #"Trimmed Text" = Table.TransformColumns(#"Replaced Value1",{{"Column1", Text.Trim, type text}, {"Column3", Text.Trim, type text}, {"Column4", Text.Trim, type text}, {"Column5", Text.Trim, type text}, {"Column6", Text.Trim, type text}, {"Column8", Text.Trim, type text}, {"Column11", Text.Trim, type text}, {"Column22", Text.Trim, type text}, {"Column24", Text.Trim, type text}, {"Column25", Text.Trim, type text}, {"Column26", Text.Trim, type text}, {"Column27", Text.Trim, type text}, {"Column31", Text.Trim, type text}, {"Column32", Text.Trim, type text}}),
    #"Promoted Headers" = Table.PromoteHeaders(#"Trimmed Text", [PromoteAllScalars=true]),
    #"Changed Type2" = Table.TransformColumnTypes(#"Promoted Headers",{{"7/7/2021", type date}, {"C", type text}, {"Sym", type text}, {"Type", type text}, {"Quantity", type text}, {"Amount", type text}, {"CP No", type text}, {"No", type text}, {"Date", type text}, {"Comments", type text}, {"Contract No", type text}, {"Depo", type text}, {"SOD", type text}, {"Name", type text}, {"te", type text}}),
    #"Filtered Rows2" = Table.SelectRows(#"Changed Type2", each [Cusip] <> "---------------------------" and [C] <> "C" and [C] <> "ZZZ"),
    #"Renamed Columns" = Table.RenameColumns(#"Filtered Rows2",{{"7/7/2021", "FolderDate"}}),
    #"Changed Type3" = Table.TransformColumnTypes(#"Renamed Columns",{{"Quantity", Int64.Type}, {"Amount", Currency.Type}, {"Date", type date}, {"SOD", Int64.Type}, {"ate", type number}}),
    #"Filtered Rows3" = Table.SelectRows(#"Changed Type3", each ([No] <> "") and [FolderDate] <> null and [FolderDate] <> ""),
    #"Added Custom" = Table.AddColumn(#"Filtered Rows3", "P&L", each if [Type] = "S/B" then [Amount]*[ate]/36000 else [Amount]*[ate]/-36000),
    #"Changed Type1" = Table.TransformColumnTypes(#"Added Custom",{{"M&L", Currency.Type}}),
    #"Added Custom1" = Table.AddColumn(#"Changed Type1", "Info Date", each [#"FolderDate"] - #duration(1,0,0,0), type date)
in
    #"Added Custom1"

 

 

It works great, however there are 5 years (and counting) worth of folders (an updated one every day) and it was very slow to update recent information. I thought it would be better if I built a separate "combine and clean" for each year and then append them to one "Master" query and just refresh the current year and master file. Unfortuantely, it is still super slow. Any thoughts?

Here are some sample files in a folder structure.
Questions welcome!

1 ACCEPTED SOLUTION
watkinnc
Super User
Super User

 Can't you perform all of these steps in the transform file instead of combining them and then doing the transforms?

--Nate


I’m usually answering from my phone, which means the results are visualized only in my mind. You’ll need to use my answer to know that it works—but it will work!!

View solution in original post

5 REPLIES 5
watkinnc
Super User
Super User

No, I mean all of the filtering and removing columns and replacing values and the type changes and the text trimming that you show in your first post. Can't you do that in the "Sample file"?

 

--Nate


I’m usually answering from my phone, which means the results are visualized only in my mind. You’ll need to use my answer to know that it works—but it will work!!

I restarted the project with this implemented - so much neater!

Got it, I conflated 2 separate issues I was experiencing. I have made your recommended adjustments. However, when I combine the query, I have to do the editing again. What am I missing?

watkinnc
Super User
Super User

 Can't you perform all of these steps in the transform file instead of combining them and then doing the transforms?

--Nate


I’m usually answering from my phone, which means the results are visualized only in my mind. You’ll need to use my answer to know that it works—but it will work!!

Hey @watkinnc, I have tried that using the code below, but it did not work. 

 

= (Parameter8 as binary) => 
let
    SourceSemicolon = Csv.Document(Parameter8, [Delimiter=";", Columns=32, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    SourceComma = Csv.Document(Parameter8, [Delimiter=",", Columns=32, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    Source = if Table.RowCount(SourceSemicolon) > 1 then SourceSemicolon else SourceComma
in
    Source

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.

Top Solution Authors
Top Kudoed Authors