This is best Fabric, Power BI, SQL and AI community event. How do we know? The last event sold out! Save €200 with code FABCMTY200.
Register nowA new Data Days event is coming soon! This time we’re going bigger than ever. Fabric, Power BI, SQL, AI and more. Don't miss out.
Good day everyone. I am having quite a problem trying to figure out how to accomplish my goal using Power Query, if it's even possible.
I keep a running index of all of the projects that my team works on at work. I am trying to figure out if there is a way to use Power Query to get the most recent file (date modified) from a group of subfolders, and filter out the subfolders that do not contain that file. I'll do my best to explain the file structure below, as I believe that is what is causing problems.
Root Folder is Projects/2026. Then within 2026 there are project number folders labeled PR-26001 etc. Then within PR-26001 there are folders labeled Draft 01, Draft 02, Draft 03, etc. Finally, within that "Draft" folders there is a folder labeled "Equipment Proposal". I want to be able to get the most recent file from all of the "Equipment Proposal" folders of all of the "PR-26***" folders, and get rid of the "Draft" folders that do not contain that file. The final output should be the folder path to the most recent file, as well as it's date modified.
One thing to mention that is also giving me problems, is the name of the file I'm trying to get does not stay consistant.
Please let me know if you have any questions that may help you understand what I'm trying to accomplish.
Solved! Go to Solution.
The grouping is currently set at the Draft folder level, which is why you’re seeing one file per Draft instead of one per PR folder.
You should group at the PR-26*** level so all Draft folders are considered together. Then, use the maximum modified date within each group and filter to that row. This will give you only the most recent file across all Draft folders for each PR.
If you still get multiple rows, sort within each group and keep just the top row to ensure a single result.
Please try this approach and let us know how it works for you.
@JPorterValmetal wrote:Wie erwähnt, brauche ich nur die neueste Datei aus dem neuesten "Draft"-Ordner jedes "PR-26***"-Ordners. Deshalb probiere ich den zweiten Codesatz aus. Ich habe die Fehler behoben, aber ich habe etwa 20 Minuten gewartet und es lädt immer noch alle Schritte.
*****************************************************************************************************************
In diesem Fall brauchst Du tatsächlich den unteren Code. Aber wenn Du 20 Minuten brauchst, dann stimmt da was nicht. Da hat sich sicher ein Kopier- oder Übersetzungsfehler eingeschlichen.
Hab Dir mal einen Code gemacht, bei dem zumindest Übersetzungsfehler ausgeschlossen sein dürften. Probier mal den:
let
Quelle = Folder.Files("C:\Users\User\OneDrive\Dokumente\All_PQ\Beispiele\Mehrere Daten aus einem Quellordner auslesen\CSV Files Mit Unterordner"),
LowerKeys = Table.TransformColumns(Quelle,{{"Extension", Text.Lower, type text}}),
NurCSV = Table.SelectRows(LowerKeys, each ([Extension] = ".csv")),
FolderGroups = Table.Group(NurCSV, {"Folder Path"}, {{"Foldergroup", each _, type table [Content=binary, Name=text, Extension=text, Date accessed=nullable datetime, Date modified=nullable datetime, Date created=nullable datetime, Attributes=nullable record, Folder Path=text]}}),
AddGroupIndex = Table.AddIndexColumn(FolderGroups, "GroupIndex", 0, 1, Int64.Type),
ExpandGI = Table.ExpandTableColumn(AddGroupIndex, "Foldergroup", {"Content", "Date modified", "Name"}, {"Content", "Date modified", "Name"}),
SortRows = Table.Sort(ExpandGI,{{"GroupIndex", Order.Ascending}, {"Date modified", Order.Descending}}),
Filter = Table.AddColumn(SortRows, "LastDat",
each
let
GrInd = [GroupIndex]
in
List.Max(Table.Column(Table.SelectRows(SortRows,
each [GroupIndex] = GrInd ),"Date modified"))),
AddCheckCol = Table.AddColumn(Filter, "Check", each [Date modified] = [LastDat]),
LatestFiles = Table.SelectRows(AddCheckCol, each ([Check] = true)),
RemoveOtherCols = Table.SelectColumns(LatestFiles,{"Folder Path", "Name", "Date modified", "Content"})
in
RemoveOtherCols
Was mir noch einfällt:
Wenn die Quelldaten in Excelfiles stehen, muss der Code natürlich auf Excelfiles angepasst werden!
Im Schritt:
NurCSV = Table.SelectRows(LowerKeys, each ([Extension] = ".csv")),
muss der Filter von .csv auf .xlsx geändert werden.
Falls aber auch Makro- oder Vorlagendateien in den Ordnern stehen und ausgewertet werden sollen, kannst Du den Schritt aber auch so abändern:
NurCSV = Table.SelectRows(LowerKeys, each Text.Contains([Extension],".xl")),...oder nach welchen Dateityp (doc, txt, jpg, mpeg,....) Du auch immer suchst...
Du nutzt offenbar den unteren Code. Da Du aber immer nur EINE Datei sehen willst, musst Du den oberen nutzen. Ich füge ihn Dir hier nochmal ein:
let
//Den Ordner auswählen, in welchem sich die Dateien befinden - Pfad anpassen
//evtl. vorhandene Unterordner werden berücksichtigt
Quelle = Folder.Files("C:\Users\User\OneDrive\Dokumente\All_PQ\Beispiele\Mehrere Daten aus einem Quellordner auslesen\CSV Files Mit Unterordner"),
LowExtension = Table.TransformColumns(Quelle,{{"Extension", Text.Lower, type text}}),
//Nur csv-Dateien anzeigen
NurCSV = Table.SelectRows(LowExtension, each ([Extension] = ".csv")),
//Nach Datei(en) mit neuestem Bearbeitungsdatum filtern
LastModified = Table.SelectRows(NurCSV, let latest = List.Max(NurCSV[Date modified]) in each [Date modified] = latest),
SpalteUmbenennen = Table.RenameColumns(LastModified, {{"Name", "Datei"}}),
DelColumns = Table.SelectColumns(SpalteUmbenennen, { "Content", "Datei"}),
#"Importierte CSV" = Csv.Document(DelColumns[Content]{0},[Delimiter=";", Columns=4, Encoding=1252, QuoteStyle=QuoteStyle.None]),
HeaderHoch = Table.PromoteHeaders(#"Importierte CSV", [PromoteAllScalars=true])
in
HeaderHochDie Fehlermeldung ist tatsächlich der Übersetzung geschuldet, denn die vermisste Spalte FolderGroup heißt im Original:
"Ordnergruppe".
War aber, wie gesagt, sowieso der falsche Code, den Du benutzt hast.
Den Code sicherheitshalber nicht übersetzen lassen... 🙂 Zumindest die in meinem ersten Post grün dargestellten Texte nicht. Oder wird hier auch der Code generell übersetzt? Wär ja blöd...
Falls nicht, dann einfach den Code von hier kopieren und in Deinem Editor einfügen.
I got the the code you put here to work, however it's not what I want. It only shows one single file that is the newest from the entire source data.
As I mentioned I need the newest file from only the newest "Draft" folder of each "PR-26***" folder. That's why I'm trying the second code set. I got the errors fixed, but I've waited about 20 minutes and it's still loading all of the steps.
Hi,
da es sehr unwahrscheinlich ist, dass mehrere User mehrere Dateien sekundengenau gleichzeitig ändern, kommt eigentlich immer nur eine Datei in Frage. Ich hab Dir mal ein Beispiel für mehrere csv Dateien in einem Ordner inklusive Unterordner gemacht. Im ersten Beispiel wird nur die neueste Datei aller Ordner ausgegeben.
Um die neuesten Dateien aller Ordner und Unterorder zu bestimmen, müsste noch nach den Ordnern gruppiert werden (im Beispiel 2).
Für Exceldateien ist das Vorgehen ein wenig anders, da die Daten sowohl in normalen als auch in formatierten Tabellen oder Namensbereichen liegen können. Es müsste deshalb noch nach der Art der Datenquelle gefiltert werden. Aus Unkenntnis Deiner Struktur beschränke ich mich deshalb auf csv Dateien.
let
//Den Ordner auswählen, in welchem sich die Dateien befinden - Pfad anpassen
//evtl. vorhandene Unterordner werden berücksichtigt
Quelle = Folder.Files("C:\Users\User\OneDrive\Dokumente\All_PQ\Beispiele\Mehrere Daten aus einem Quellordner auslesen\CSV Files Mit Unterordner"),
LowExtension = Table.TransformColumns(Quelle,{{"Extension", Text.Lower, type text}}),
//Nur csv-Dateien anzeigen
NurCSV = Table.SelectRows(LowExtension, each ([Extension] = ".csv")),
//Nach Datei(en) mit neuestem Bearbeitungsdatum filtern
LastModified = Table.SelectRows(NurCSV, let latest = List.Max(NurCSV[Date modified]) in each [Date modified] = latest),
SpalteUmbenennen = Table.RenameColumns(LastModified, {{"Name", "Datei"}}),
DelColumns = Table.SelectColumns(SpalteUmbenennen, { "Content", "Datei"}),
#"Importierte CSV" = Csv.Document(DelColumns[Content]{0},[Delimiter=";", Columns=4, Encoding=1252, QuoteStyle=QuoteStyle.None]),
HeaderHoch = Table.PromoteHeaders(#"Importierte CSV", [PromoteAllScalars=true])
in
HeaderHochIm folgenden Beispiel wird die jeweils zuletzt geänderte Datei auch aller Unterordner angezeigt: Also für jeden Ordner die jeweils letzte....
Das Ergebnis sähe in etwa so aus:
let
Quelle = Folder.Files("C:\Users\User\OneDrive\Dokumente\All_PQ\Beispiele\Mehrere Daten aus einem Quellordner auslesen\CSV Files Mit Unterordner"),
Kleinbuchstaben = Table.TransformColumns(Quelle,{{"Extension", Text.Lower, type text}}),
NurCSV = Table.SelectRows(Kleinbuchstaben, each ([Extension] = ".csv")),
Pfadgruppen = Table.Group(NurCSV, {"Folder Path"}, {{"Ordnergruppe", each _, type table [Content=binary, Name=text, Extension=text, Date accessed=nullable datetime, Date modified=nullable datetime, Date created=nullable datetime, Attributes=nullable record, Folder Path=text]}}),
AddGruppenindex = Table.AddIndexColumn(Pfadgruppen, "Gruppenindex", 0, 1, Int64.Type),
ExpandGI = Table.ExpandTableColumn(AddGruppenindex, "Ordnergruppe", {"Content", "Date modified", "Name"}, {"Content", "Date modified", "Name"}),
#"Sortierte Zeilen" = Table.Sort(ExpandGI,{{"Gruppenindex", Order.Ascending}, {"Date modified", Order.Descending}}),
fkt = (DatListe as list, GrIndex as number) =>
let
letzte = Table.SelectRows(#"Sortierte Zeilen",each [Gruppenindex] = GrIndex)[Date modified]{0}
in letzte,
Liste = Table.AddColumn(#"Sortierte Zeilen", "Liste", each fkt(#"Sortierte Zeilen"[Date modified],[Gruppenindex])),
Checken = Table.AddColumn(Liste, "Behalten", each [Date modified] = [Liste]),
JustFirstFile = Table.SelectRows(Checken, each ([Behalten] = true)),
#"Andere entfernte Spalten" = Table.SelectColumns(JustFirstFile,{"Content","Folder Path", "Name", "Date modified"})
in
#"Andere entfernte Spalten"
Da ich die Ordnerstruktur jetzt nicht nachstellen will, musst Du aus der Quelle in der Spalte Folder Path noch alle unerwünschten Ordner herausfiltern.
In der Abfrage befindet sich noch die Spalte Content, die Du brauchst, wenn Du die Dateien weiter bearbeiten willst. Aber das war ja hier nicht gefragt.
I'm trying to follow your second example code, but it's a little difficult with the language barrier. I've used Google Translate to change the code into English. The copy pasted it.
I'm getting an error in the code on the "let last = table.selectrows" section. It says "Token ',' expected". I don't know how to fix this.
Alright. I was able to get the syntax errors fixed. Now I am running into a different error.
Should be possible with Power Query but it may not be fast. How many files overall are in that folder structure?
Also note that "Last Modified" is a slippery slope. Very easy to get false positives on that.
The amount of files is really dependent on how many drafts we end up with. On average I would say propably 5 files counting everything from the PR-26*** folder down. I don't really need it to be fast.
Connect to the Sharepoint site via "Sharepoint Folders", click "Transform data" *
Filter for the desired file name patterns (and folder names if needed)
Sort by Last Modified Date descending
Keep the top 1 row.
*) This will give you all the files in all the folders on that sharepoint site.
I think maybe explanation of what the final result should be was confusing. Keep top row 1 gives me only the most recent file from the entire PR-26*** list of files. I want the most recent file from the group of "Draft" folders in each PR-26*** folder with the file path to each file. Also I should mention there is no SharePoint involved, so this would all be "from file - from folder".
Connect to the parent folder, transform data
Group by Folder Path, aggregate by Max Last Modified Date but also include "All Rows"
Expand the "Rows" column
filter all rows where the Last Modified Date equals the Max Last Modified Date
I'll give that a shot on Monday when I'm at work and let you know how it goes.
Just following up did you have a chance to test this on your side? Please let us know how it went or if you’re still experiencing the same issue.
I'm a little confused on the last step. When I add a date filter by equals, I have to choose a specific date. Do I need to make a custom collumn comparing the Last Modified Date, and the Max Modified Date. Then filter out everything that is false from the custom column?
I used the custom column that I described above and I'm still having the same issue. It's not getting rid of the "draft" folders that have older files in them. It's giving me the latest file that is in every "draft" folder.
Here is a screenshot of the folder structure when there are multiple draft folders.
I want the final result to show the most recent file path out of the "draft" folders. If draft 4 has the most recent file, show only that file path. If draft 02 has the most recent file show only the draft 02 file path.
The current result I am getting is the file path for the most recent file in ALL of the draft folders.
The grouping is currently set at the Draft folder level, which is why you’re seeing one file per Draft instead of one per PR folder.
You should group at the PR-26*** level so all Draft folders are considered together. Then, use the maximum modified date within each group and filter to that row. This will give you only the most recent file across all Draft folders for each PR.
If you still get multiple rows, sort within each group and keep just the top row to ensure a single result.
Please try this approach and let us know how it works for you.
Ok. I I got it. I'm not sure if I did it the best way, but I made it work. I split my file path column using "\" as a delimeter. Grouped the PR-26*** folders based on max date modified. I made a new query using the same source folder, and did an inner merge on the two queries using max modified date and last modified date.
I'm very new to Power Query as you can probably tell. How do change the level at which the grouping is done? I've always used the "Group By" button.
@JPorterValmetal wrote:I want the final result to show the most recent file path out of the "draft" folders. If draft 4 has the most recent file, show only that file path. If draft 02 has the most recent file show only the draft 02 file path.
Dann verwende den oberen Beispielcode. Und wie ich schon sagte:
... solltest Du aus der Quelle in der Spalte Folder Path noch alle unerwünschten Ordner herausfiltern.
Check out the May 2026 Power BI update to learn about new features.
Sign up to receive a private message when registration opens and key events begin.
If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.
| User | Count |
|---|---|
| 4 | |
| 3 | |
| 2 | |
| 1 | |
| 1 |
| User | Count |
|---|---|
| 7 | |
| 6 | |
| 6 | |
| 6 | |
| 4 |