March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hello everyone!
I'm building a data model where I have to import data from several flat files. The sales data is provided to me in Excel and it is currently split into separate files for each quarter. I have them all saved in a folder. Each file is roughly 500k rows and 50 columns. Once I import each of these files, they go through some transformation steps where I: remove a couple of columns and unnecessary rows, change data types (partly text, partly fixed decimal), replace nulls with 0.
Ideally, I would like to automatically apply the steps from prior files to newly imported files. Currently I make the same changes for each file. Moreover, I would like to ensure the model loads only the new data once available.
Could you please advise how to create an efficient appending and cleaning process?
Solved! Go to Solution.
@Anonymous The below steps could be used to filter your data on the most recent:
- Create a conditional column for Month:
- Ensure that Fiscal Year, Quarter, and Month are text data types
- Add a Date Custom Column using this code:
= [Month] & "/" & "1" & "/" & [Fiscal Year]
- Change the Date column's data type to Date
- Apply the "Is Latest" Date Filter to your Date column
@Anonymous You're welcome. Here's a good read on the approach: https://learn.microsoft.com/en-us/power-query/connectors/folder
@Anonymous I would look into using either a Folder connector or SharePoint Folder connector, and the Is Latest filter on a the date column aligned to the updated date of the file.
If you're not aware, using a Folder connector will enable the import and append of new data each time it's added to your folder and the Power BI report is refreshed. The Is Latest filter will ensure that only the most recent data is loaded.
Let me know if you have any questions.
I've followed the instructions and I'm able to connect to the folder and apply transformation steps to all the files. However, when I save a new file and refresh the model, I noticed that all the other files get loaded again; as seen in this snapshot, where Q2 is only the second out of the 13 files I need to include in the model. How do I load and transform only the newly saved data?
@Anonymous Is there an Updated Date for each file that you can apply an Is Latest filter in the Query Editor to?
Each file has a column for Fiscal Year and one for quarter. Could those be used as identifiers?
@Anonymous The below steps could be used to filter your data on the most recent:
- Create a conditional column for Month:
- Ensure that Fiscal Year, Quarter, and Month are text data types
- Add a Date Custom Column using this code:
= [Month] & "/" & "1" & "/" & [Fiscal Year]
- Change the Date column's data type to Date
- Apply the "Is Latest" Date Filter to your Date column
Thank you! I will try to replicate what you suggested, but I'm afraid it will be a bit trickier as I don't have a month dimension. Still, the query should recognize that the cronology is Q1,Q2,Q3,Q4. I will take time to work on this in the following days and hopefully reach/mark a solution.
@Anonymous You're welcome. It's OK that you don't have a month dimension. The process I described creates a month column, using the first month of each quarter. It assumes a fiscal year-end of December though, so you'd need to tweak the code if your fiscal year-end is different.
Yes. Using the fiscal year and quarter columns in the query editor, you could produce an "updated date" column and apply the Is Latest filter to it.
I can show you an example in a few hours. In the meantime, I recommend looking into the Text.Combine M function and in general how to combine columns in the Query Editor.
Thanks, will look into it and update the post if it works.
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
Your insights matter. That’s why we created a quick survey to learn about your experience finding answers to technical questions.
Arun Ulag shares exciting details about the Microsoft Fabric Conference 2025, which will be held in Las Vegas, NV.
User | Count |
---|---|
134 | |
91 | |
90 | |
64 | |
58 |
User | Count |
---|---|
201 | |
137 | |
107 | |
72 | |
68 |