Append large data files - best practice?
Hi, I have used Power Query to combine historical data in our organization.
The problem is that SAP WebI can only export to Excel or CSV format.
Since every month has about 125,000 rows and 15 columns, I worked around this by creating one Excel file per quarter, combining the files with Append, and then creating relationships between the appended table and other tables.
However, it is still very heavy to use, as the appended data has reached about 8 million rows and I am continually adding more.
Given my limited export options, can anybody suggest a better solution?

If these CSV files are all in the same folder, can't you just use the Folder connector so that all of your new files are added to the dataset upon refresh?
--Nate
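For readers who want to see the logic behind the folder approach, here is a minimal sketch in Python rather than Power Query. It only illustrates the idea of reading every CSV file in one folder and appending the rows. The function name `combine_csv_folder` and the added `source_file` column are hypothetical, and it assumes all files are UTF-8 and share the same header row.

```python
import csv
from pathlib import Path


def combine_csv_folder(folder, pattern="*.csv"):
    """Yield every data row from all CSV files in `folder`.

    Assumption: every file has the same header row. Each row is a dict
    keyed by column name, with an extra `source_file` entry so the
    origin of a row can still be traced after appending.
    """
    # Sorting keeps the append order stable between runs.
    for path in sorted(Path(folder).glob(pattern)):
        with path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                row["source_file"] = path.name
                yield row
```

Dropping a new export into the folder is then enough for it to be picked up on the next run, for example with `rows = list(combine_csv_folder("exports"))`, where `exports` is a placeholder folder name.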

You can use incremental refresh with files, if you can parse a date value from the filenames. Please see this video.
https://www.youtube.com/watch?v=IVMdg16yBKE
Pat
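To make the filename idea concrete, here is a minimal sketch in Python rather than Power Query M. It assumes the exports carry a year and month in their names, such as `sales_2024-03.csv` (a hypothetical naming scheme), and that the refresh window is half-open, with the start inclusive and the end exclusive, which is how the `RangeStart` and `RangeEnd` parameters are normally applied in an incremental refresh filter. The function names are illustrative only.

```python
import re
from datetime import datetime

# Assumed naming scheme: a "YYYY-MM" token somewhere in the filename.
FILENAME_DATE = re.compile(r"(\d{4})-(0[1-9]|1[0-2])")


def file_month(filename):
    """Return the first day of the month encoded in `filename`, or None."""
    match = FILENAME_DATE.search(filename)
    if match is None:
        return None
    return datetime(int(match.group(1)), int(match.group(2)), 1)


def files_in_window(filenames, range_start, range_end):
    """Keep only files whose month falls in [range_start, range_end)."""
    selected = []
    for name in filenames:
        month = file_month(name)
        # Half-open comparison so a file never lands in two partitions.
        if month is not None and range_start <= month < range_end:
            selected.append(name)
    return selected
```

The point of filtering on the filename first is that files outside the window never need to be opened, so older periods can stay untouched while only the newest files are read.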

I am also interested in finding the best practice in a case like this.
So far, I have split a large fact table by year, turned off refresh for prior years, and appended the data in Power Query.
I am not sure if this will improve performance.

Hi, @bilingual
I'd like to suggest you try the following data reduction techniques:
- Remove unnecessary columns
- Remove unnecessary rows
- Group by and summarize
- Optimize column data types
- Prefer Power Query custom columns over calculated columns
- Disable Power Query query load
- Disable auto date/time
- Switch to Mixed mode
For further information, you may refer to the following document.
Data reduction techniques for Import modeling
Best Regards
Allan
If this post helps, please consider accepting it as the solution to help other members find it more quickly.
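As an illustration of the "Group by and summarize" item in the list above, here is a minimal sketch in Python of what the reduction does to the row count. It is not Power Query code, and the function name `summarize` and the column names in the usage note are hypothetical. It assumes the value column holds numbers that can be summed.

```python
from collections import defaultdict


def summarize(rows, keys, value):
    """Collapse detail rows into one row per distinct combination of `keys`.

    `rows` is an iterable of dicts, `keys` the grouping columns, and
    `value` the numeric column to total. Columns not listed are dropped,
    which is where the size reduction comes from.
    """
    totals = defaultdict(float)
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group] += float(row[value])
    # Rebuild one dict per group, in first-seen order.
    return [{**dict(zip(keys, group)), value: total} for group, total in totals.items()]
```

For example, `summarize(rows, ["month", "product"], "amount")` would keep one row per month and product. Whether that grain is acceptable depends on which reports still need the individual detail rows.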

Have you considered using DirectQuery and incremental refresh to load the new data while keeping the old data intact?

Unfortunately, it does not work with CSV or Excel files.
