The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
Hi everyone,
I'm not looking through M-code that well yet and hope that somebody can help me with my problem:
I'm importing .txt files through folder import, but only want the latest report to be considered in Power BI. So I thought to create a true/false column, if each Source.Name is equal to the last row's Source.Name and filter this column afterwards. Thatfore I would already have an index column if necessary..
I tried something similar to Table.AddColumn(#"LastStep", "TrueFalse", each if [Index] = List.Max([Index]) then 1 else 0) which doesn't work and is also not 100% my case.
It would additionally be great to add one exception Source.Name like "Produktqualitaet_archive" to be set true as well.
Thanks a lot for your replies in advance!
Solved! Go to Solution.
Hi @flowgistics
Have been very busy lately. So you have all the files in one folder like this? And you want the archive data file and the latest version of xyz? Are you going to use the Date created or the date in filename to identify the latest version?
here is one using Date created
let
Source = Folder.Files("folderpath"),
#"Filtered Rows" = Table.SelectRows(Source, each Text.Contains([Name],"archive") or [Date created] = List.Max(Source[Date created]))
in
#"Filtered Rows"
are my descriptions not comprehensible or is the problem too tricky that no solution can be found? 🤔
Hi @flowgistics
The code should be like this, but as you said, it is not your case, so you need to provide the logic how you define the latest
Table.AddColumn(#"LastStep", "TrueFalse", each if [Index] = List.Max(#"LastStep"[Index]) then 1 else 0)
here is one sample of mine
let
Source = Folder.Files("C:\Users\----"),
filterFilename = Table.SelectRows(Source, each Text.Contains([Name], "Filename")),
findLatest = Table.SelectRows(filterFilename, each [Date created] = List.Max(filterFilename[Date created])),
....
in
...
Hello @Vera_33 , thanks for your reply. With the Table.AddColumn command Power Query seems to go into an endless loop, trying to load more than 1GB into the Power Query (output of the status bar in the right low corner..) before I stopped it. The original files "only" have 36MB.. any ideas why?
*line 1-6 were added by Power BI automatically by choosing folder import, not nice?*
the same loop also happened by trying out your provided sample 😕
But as I read your filterFilename command line, you will always filter after "Filename" hardcoded right? I would need a solution, where this Filename changes twice a month, automatically..?
Thanks for any help.
Hello @Vera_33 , thanks for your reply. With the Table.AddColumn command Power Query seems to go into an endless loop, trying to load more than 1GB into the Power Query (output of the status bar in the right low corner..) before I stopped it. The original files "only" have 36MB.. any ideas why?
*line 1-6 were added by Power BI automatically by choosing folder import, not nice?*
the same loop also happened by trying out your provided sample 😕
But as I read your filterFilename command line, you will always filter after "Filename" hardcoded right? I would need a solution, where this Filename changes twice a month, automatically..?
Thanks for any help.
Hi @flowgistics
You M code looks like you are using Sample query, I am not. I read from folder directly and you can see this example is using Excel.Workbook to read a single sheet from a single Excel file, if you read CSV, you can use Csv.Document, it depends...and if you have multiple, you can use Table.Combine
let
Source = Folder.Files("C:\Users\folderpath"),
filterFilename = Table.SelectRows(Source, each Text.Contains([Name], "filename")),
findLatest = Table.SelectRows(filterFilename, each [Date created] = List.Max(filterFilename[Date created])),
Custom1 = Table.PromoteHeaders(Table.Skip( Excel.Workbook(findLatest[Content]{0})[Data]{0},3))
in
Custom1
And yes, my filename is hard coded, what do you mean Filename changes? So maybe you need a parameter to put filename every time when it changes?
Hi @Vera_33, first: sorry for the late reply.
I'm still struggling with Power Query, found a solution in DAX but the problem is that the model slows down every month caused by additional added data..
Once again: I have .txt input data, that get's reported twice a month into one folder (-> changing filenames, e.g. for August: "xyz_20210801_0600" and "xyz_20210815_0600") with current data, but I'm only interested in the latest report. Additionally I have archive data from 2017 until 2020 (with identical columns).
Two questions:
thanks for all answers!
Hi @flowgistics
I don't seem to understand your question
Two questions:
For example, you read a folder, you can use the columns Date created or others to identify the latest version OR extract the date in your filename as shown in your sample to compare and filter to the files you need
Hi @Vera_33
For better understanding I've created a small example of my handled data:
Source.Name | ProductionID | ProductionTime | ProductionAmount | Index | ||||
archiveData | 111111 | 01.04.2017 06:35 | 59500 | 1 | ||||
archiveData | 111112 | 01.04.2017 07:20 | 59700 | 2 | ||||
… | … | … | … | … | ||||
archiveData | 112879 | 01.04.2020 04:50 | 60300 | 90013 | ||||
xyz_20210715 | 121111 | 01.04.2020 06:30 | 60000 | 90014 | ||||
xyz_20210715 | 121112 | 01.04.2020 06:44 | 59000 | 90015 | ||||
… | … | … | … | … | ||||
xyz_20210715 | 131456 | 15.07.2021 05:10 | 61500 | 125627 | ||||
xyz_20210801 | 121111 | 01.04.2020 06:30 | 60000 | 125628 | ||||
xyz_20210801 | 121112 | 01.04.2020 06:44 | 59000 | 125629 | ||||
… | … | … | … | … | ||||
xyz_20210801 | 141378 | 01.08.2021 04:57 | 60200 | 130235 | ||||
xyz_20210815 | 121111 | 01.04.2020 06:30 | 60000 | 130236 | ||||
xyz_20210815 | 121112 | 01.04.2020 06:44 | 59000 | 130237 | ||||
… | … | … | … | … | ||||
xyz_20210815 | 151621 | 15.08.2021 05:16 | 61000 | 135365 |
So at the moment the folder import loads archive data rows from 2017 until the morning of the 1st April 2020, as well as reports created twice a month with the current data from the 1st April 2020 until the morning the report was created..
In my model, I want to load archive data as well as the rows of the latest available report which would be "xyz_20210815" at the moment.
Hope that explanation helps?
Hi @flowgistics
Have been very busy lately. So you have all the files in one folder like this? And you want the archive data file and the latest version of xyz? Are you going to use the Date created or the date in filename to identify the latest version?
here is one using Date created
let
Source = Folder.Files("folderpath"),
#"Filtered Rows" = Table.SelectRows(Source, each Text.Contains([Name],"archive") or [Date created] = List.Max(Source[Date created]))
in
#"Filtered Rows"
Hi @Vera_33
Thanks that works!
Regarding using the filename instead: can you think of a solution using the current date in order to compare it to the closest date of a filename and then select this file?
Hi @flowgistics
you mean 20210815 extracted from filename and compare with today? Can't use the Max date...you might get some date in the future?