Optimizing data loading process from APIs
Hi,
I have a report that loads about a hundred tables from APIs. The process is very slow, and I have applied most of the optimizations I can think of, but I'm still struggling to load everything in a usable amount of time. The tables are loaded and then combined into a fact table with about 16 million records.
Is it possible to split these 16 million records across different tables and schedule them to refresh at different times? If not, is there anything else I can try?
Thank you.
Hi @qmest ,
I think the first thing you can try is to load the data in batches using paging. Fetching a portion of the records at a time reduces memory usage and can improve performance.
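For illustration, here is a minimal Python sketch of paged loading. The `page` and `page_size` parameter names and the empty-page end condition are assumptions about the API, not details from this thread; adjust them to match the real paging scheme (offset/limit, cursor tokens, etc.).

```python
import requests  # third-party: pip install requests

def fetch_all_pages(url, page_size=1000):
    """Pull every record from a paged endpoint, one batch at a time.

    Hypothetical paging scheme: the API takes `page`/`page_size` query
    parameters and returns an empty list once the pages run out.
    """
    records, page = [], 1
    while True:
        response = requests.get(
            url, params={"page": page, "page_size": page_size}, timeout=60
        )
        response.raise_for_status()
        batch = response.json()
        if not batch:
            break
        records.extend(batch)
        page += 1
    return records
```

Fetching in fixed-size batches keeps each request small, and a failure can be retried per page instead of restarting the whole pull.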
Second, you can cache API responses so that repeated refreshes avoid redundant API calls. Optimizing the settings of the source database can also help it handle large amounts of data more efficiently.
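As a sketch of that caching idea, the snippet below memoizes API responses on disk so a request is only sent once until its cache file is deleted. The cache layout and key scheme here are my own invention, not part of any Power BI feature.

```python
import hashlib
import json
import pathlib

import requests  # third-party: pip install requests

CACHE_DIR = pathlib.Path("api_cache")
CACHE_DIR.mkdir(exist_ok=True)

def cached_get(url, params=None):
    """Return the JSON body for a GET request, reusing a local copy.

    The cache key hashes the URL plus query parameters, so each distinct
    request hits the API only once per cache lifetime.
    """
    key = hashlib.sha256(
        json.dumps([url, params], sort_keys=True).encode()
    ).hexdigest()
    cache_file = CACHE_DIR / f"{key}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text())
    response = requests.get(url, params=params, timeout=60)
    response.raise_for_status()
    cache_file.write_text(json.dumps(response.json()))
    return response.json()
```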
Finally, you might also consider cloud-based services that provide tools for processing large data sets efficiently.
Best Regards
Yilong Zhou
If this post helps, then please consider accepting it as the solution to help the other members find it more quickly.
Hi @qmest, you can use incremental load, data partitioning, data preprocessing, and transforming the data outside the Power BI environment to optimize your report.
You can also use DirectQuery mode.
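In Power BI itself, incremental refresh is configured through the RangeStart/RangeEnd parameters rather than written by hand, but the same watermark idea can be sketched in Python. The `modified_since` filter below is a hypothetical API parameter, assumed for illustration only.

```python
import datetime as dt
import json
import pathlib

import requests  # third-party: pip install requests

STATE_FILE = pathlib.Path("last_refresh.json")

def incremental_fetch(url):
    """Fetch only the records changed since the previous refresh.

    Assumes the API accepts a hypothetical `modified_since` filter;
    the first run falls back to a full load.
    """
    if STATE_FILE.exists():
        since = json.loads(STATE_FILE.read_text())["since"]
    else:
        since = "1970-01-01T00:00:00Z"  # no watermark yet: full load
    # Capture the new watermark before the request, so records that
    # change mid-fetch are picked up again on the next run.
    now = dt.datetime.now(dt.timezone.utc).isoformat()
    response = requests.get(url, params={"modified_since": since}, timeout=60)
    response.raise_for_status()
    STATE_FILE.write_text(json.dumps({"since": now}))
    return response.json()
```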
https://analyticpulse.blogspot.com
Hi,
I'm not sure whether it is the right approach, but you can try it.
You can try pulling your data from your REST API source into a data lake, storing it in Parquet format, and building on top of that. To do this, you could write a PySpark or Scala program.
Please refer to this for more details: https://medium.com/@senior.eduardo92/rest-api-data-ingestion-with-pyspark-5c9c9ce89c9f
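As a rough sketch of that pipeline, assuming PySpark is installed: the endpoint URL and lake path below are hypothetical placeholders, and a real job would add auth and the paging logic discussed above.

```python
import requests  # third-party: pip install requests
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("api_to_parquet").getOrCreate()

# Hypothetical endpoint; substitute the real API, auth, and paging.
payload = requests.get("https://api.example.com/facts", timeout=60).json()

# Turn the list of JSON records into a DataFrame (schema is inferred)
# and land it in the lake as Parquet, which is columnar and much
# faster to re-read than calling the API again.
df = spark.createDataFrame(payload)
df.write.mode("overwrite").parquet("/data/lake/facts")

# Downstream consumers then read the lake, not the API.
print(spark.read.parquet("/data/lake/facts").count())
```

Once the data lands in Parquet, the heavy API pull can run on its own schedule, and the report only has to read from the lake.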