Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
CCART92
Frequent Visitor

Get and append data in PBI the optimal way - flat files

Hello everyone!

 

I'm building a data model where I have to import data from several flat files. The sales data is provided to me in Excel and it is currently split into separate files for each quarter. I have them all saved in a folder. Each file is roughly 500k rows and 50 columns. Once I import each of these files, they go through some transformation steps where I: remove a couple of columns and unnecessary rows, change data types (partly text, partly fixed decimal), replace nulls with 0.


Ideally, I would like to automatically apply the steps from prior files to newly imported files. Currently I make the same changes for each file. Moreover, I would like to ensure the model loads only the new data once available.

Could you please advise how to create an efficient appending and cleaning process?

1 ACCEPTED SOLUTION

@CCART92  The below steps could be used to filter your data on the most recent:

- Create a conditional column for Month:

bchager6_0-1705335586903.png

 

- Ensure that Fiscal Year, Quarter, and Month are text data types

 

- Add a Date Custom Column using this code:
= [Month] & "/" & "1" & "/" & [Fiscal Year]


- Change the Date column's data type to Date

 

- Apply the "Is Latest" Date Filter to your Date column

 

bchager6_1-1705335859429.png

 

 

View solution in original post

10 REPLIES 10
bchager6
Super User
Super User

@CCART92  You're welcome. Here's a good read on the approach: https://learn.microsoft.com/en-us/power-query/connectors/folder

bchager6
Super User
Super User

@CCART92  I would look into using either a Folder connector or SharePoint Folder connector, and the Is Latest filter on a the date column aligned to the updated date of the file.

If you're not aware, using a Folder connector will enable the import and append of new data each time it's added to your folder and the Power BI report is refreshed. The Is Latest filter will ensure that only the most recent data is loaded.

Let me know if you have any questions.

I've followed the instructions and I'm able to connect to the folder and apply transformation steps to all the files. However, when I save a new file and refresh the model, I noticed that all the other files get loaded again; as seen in this snapshot, where Q2 is only the second out of the 13 files I need to include in the model. How do I load and transform only the newly saved data?

CCART92_0-1705331286388.png

 

@CCART92  Is there an Updated Date for each file that you can apply an Is Latest filter in the Query Editor to?

Each file has a column for Fiscal Year and one for quarter. Could those be used as identifiers?

@CCART92  The below steps could be used to filter your data on the most recent:

- Create a conditional column for Month:

bchager6_0-1705335586903.png

 

- Ensure that Fiscal Year, Quarter, and Month are text data types

 

- Add a Date Custom Column using this code:
= [Month] & "/" & "1" & "/" & [Fiscal Year]


- Change the Date column's data type to Date

 

- Apply the "Is Latest" Date Filter to your Date column

 

bchager6_1-1705335859429.png

 

 

Thank you! I will try to replicate what you suggested, but I'm afraid it will be a bit trickier as I don't have a month dimension. Still, the query should recognize that the cronology is Q1,Q2,Q3,Q4. I will take time to work on this in the following days and hopefully reach/mark a solution.

@CCART92  You're welcome. It's OK that you don't have a month dimension. The process I described creates a month column, using the first month of each quarter. It assumes a fiscal year-end of December though, so you'd need to tweak the code if your fiscal year-end is different.

Yes. Using the fiscal year and quarter columns in the query editor, you could produce an "updated date" column and apply the Is Latest filter to it.

 

I can show you an example in a few hours. In the meantime, I recommend  looking into the Text.Combine M function and in general how to combine columns in the Query Editor.

Thanks, will look into it and update the post if it works.

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.