Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
admincaleb007
Frequent Visitor

Power Query - Most Recent File within a week

Hi,

I was looking at some of my data and noticed a large anomaly.  Looking into it, it seems that there were 2 different files created on 2 different days, due to an error.  Because of this, powerbi is reading both files because they are from the same week, and aggregating them together.

 

admincaleb007_0-1669744382204.png

 

I'm trying to figure out how to only use the most recent file from within the same week.

So for this example, it would be Week 48 has 2 files, so filter out the older one (Nov 23 2022) keeping the new one.  This would also need to be able to interpret the year, as there is data for Week 48 for many years prior as well.

4 REPLIES 4
v-zhangti
Community Support
Community Support

Hi, @admincaleb007 

 

This problem is easy to solve in Desktop. Add year and week columns in the date table.

vzhangti_0-1669794068706.png

Sample data:

vzhangti_1-1669794160027.png

Measure = 
IF (SELECTEDVALUE ( 'Table'[Date modified] )
        = CALCULATE (MAX ( 'Table'[Date modified] ),
            ALLEXCEPT ( 'Date', 'Date'[Weeknum], 'Date'[Year] )),
    1,
    0
)

vzhangti_2-1669794201001.png

Is this the result you expect?

 

Best Regards,

Community Support Team _Charlotte

If this post helps, then please consider Accept it as the solution to help the other members find it more quickly

Hi!

Thank you for the reply.

However, when looking at your data, I see an issue with my explanation.

 

I have 40 groups.  Each group has 2 separate categories (inside, outside).  Regardless of category, all the files are created on the same day 99% of the time.  However, there are times when there is an error, and the files are ran a second time successfully.

 

Below is an example filtered down to the 2 most recent errors.  Row 3, Group A with Category inside but more recently created 11/24/2022 would be used.  Row 4, Group A with Category Outside has no errors, and is the most recent result for that week, so that data is used.  I used green color to highlight some of the examples, and which files would be used. 

 

Group NamesCategoryDate CreatedWeeknumYear
Ainside11/23/2022 0:00482022
Ainside11/24/2022 0:00482022
Aoutside11/23/2022 0:00482022
Binside11/23/2022 0:00482022
Binside11/24/2022 0:00482022
Boutside11/23/2022 0:00482022
Boutside11/24/2022 0:00482022
Cinside11/23/2022 0:00482022
Cinside11/24/2022 0:00482022
Coutside11/23/2022 0:00482022
Dinside11/23/2022 0:00482022
Dinside11/24/2022 0:00482022
Doutside11/23/2022 0:00482022
Einside11/23/2022 0:00482022
Einside11/24/2022 0:00482022
Eoutside11/23/2022 0:00482022
Eoutside11/24/2022 0:00482022
Finside11/23/2022 0:00482022
Finside11/24/2022 0:00482022
Foutside11/23/2022 0:00482022
Foutside11/24/2022 0:00482022
Ginside11/23/2022 0:00482022
Ginside11/24/2022 0:00482022
Goutside11/23/2022 0:00482022

I decided to create a new column called "merged" with power query, which combines the "group name", "category", "year", "week of year".

For example:

A,inside,2022,48

A,inside,2022,48

A,outside,2022,48

B,inside,2022,48

B,outside,2022,48

B,outside,2022,48

 

This enables me to quickly see which items have duplicate files on the same weeks.  I then created an "Age" column that shows the amount of days after the file was created.

 

MergedAge
A,inside,2022,486.00:00:00
A,inside,2022,487.00:00:00
A,outside,2022,487.00:00:00
B,inside,2022,487.00:00:00
B,outside,2022,486.00:00:00
B,outside,2022,487.00:00:00

 

With the previous table additions, I believe I can use the combination of Age and Merged columns to filter out the older erroneous data.  Is that possible?  Is that a measure, or some other query?  I'm not sure how to actually use this data to use the most recent file for a given category, week, year, team.

Hi @v-zhangti 

I am trying to understand the measure you wrote, but recreate it for what I have built.

 
MostRecent = IF(selectedvalue('Folder Paths'[Merged])
    = CALCULATE(MIN('Folder Paths'[Age])),
    1,
    0
    )
The above measure is not going to work, but I believe with some additional understanding and help, it would create the same filter as you previously showed.
 
MostRecent = CALCULATE(
    MIN(
        'Folder Paths'[Age]),
        FILTER('Folder Paths','Folder Paths'[Merged])
    )
(error : Cannot Convert the values under merged of type text to type True/False)
 
Are you able to assist?  I believe my most recent reply makes the problem easier to resolve:
When there is a duplicate value under "Merged", then use the row value where "Age" is the lowest.

Helpful resources

Announcements
LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.