
pallavi_r
Super User

How to handle refresh of SharePoint metadata with a huge volume of data

Hi All,

 

I want to understand how to approach this refresh problem. SharePoint is taking a long time to refresh metadata from a list with a huge number of files.

I did a PoC with each connector.

Number of files: 2,500, so the number of rows capturing the metadata: 2,500.

1. SharePoint List 1.0 connector: takes 40 minutes.

2. SharePoint List 2.0 connector: not usable, as it does not capture metadata at the subfolder level.

3. SharePoint Folder connector: returns only file-related info; the metadata we are looking for is not captured.

4. SharePoint REST: fast enough to load millions of records within seconds, but the metadata we need is not available by default.

I had to create a custom column that passes the list item ID and expands the field values as text; with that I was able to retrieve all the metadata:

/_api/web/lists/getbytitle('documents')/items(id)/FieldValuesAsText
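To make the cost concrete: the custom column effectively issues one such request per item, so refresh time grows linearly with the file count. A Python sketch of the URL pattern (the site URL below is a placeholder):

```python
def field_values_urls(site_url, list_title, item_ids):
    """One REST call per item: 2,500 rows means 2,500 round trips,
    which is why the refresh time scales with the number of files."""
    return [
        f"{site_url}/_api/web/lists/getbytitle('{list_title}')"
        f"/items({i})/FieldValuesAsText"
        for i in item_ids
    ]

urls = field_values_urls(
    "https://contoso.sharepoint.com/sites/docs", "documents", range(1, 2501)
)
```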

And after creating these custom columns, it is just as slow; in fact it takes longer than the List 1.0 connector: 60 minutes to refresh 2,501 records.

5. OData feed: takes 45 minutes to refresh. Here I also added a custom column.

Another point: with the SharePoint REST API and the OData feed, we cannot do a scheduled refresh.

 

Question 1 - What would be the best way to tackle metadata capture if we are expecting more than 100k files in SharePoint libraries? With such a slow refresh, it will time out.

 

Question 2 - Many threads suggest trying SharePoint PnP PowerShell, the Microsoft Graph API, or Power Automate. My question is: would these solve the refresh issue? Can they handle large SharePoint lists?

 

Question 3 - I assume the size of the metadata is directly proportional to the number of files. Is there any other factor involved?

 

Question 4 - In the SharePoint REST and OData queries, the slowness occurs at the custom column creation step, where I pass the ID as a parameter to retrieve FieldValuesAsText:

/_api/web/lists/getbytitle('listname')/items(id)/FieldValuesAsText

Is there any other approach?

 

Kindly advise. I will be grateful for any suggestions, experience, or thoughts on any of these questions.

 

Thanks,

Pallavi

 

 

1 ACCEPTED SOLUTION
v-pnaroju-msft
Community Support

Hi @pallavi_r,

 

We sincerely appreciate your inquiry through the Microsoft Fabric Community Forum.

 

The challenge of refreshing metadata in SharePoint while handling a large volume of files is a significant concern and requires careful evaluation of various approaches and tools.

Based on your query, the primary issue appears to be the inefficient retrieval of metadata, particularly when using the FieldValuesAsText endpoint for each item individually. This approach results in a large number of API calls, which can lead to SharePoint Online throttling. The number of selected columns, their types, and the overall complexity of the SharePoint list structure all contribute to performance bottlenecks.

 

Please find below some suggested approaches that may help resolve the issue:

  1. Extracting metadata for over 100,000 files in SharePoint requires a strategy that minimises individual item requests. Implementing batch processing is recommended to prevent timeouts.
  2. Both PnP PowerShell and Microsoft Graph API are specifically designed to manage large SharePoint lists efficiently. These tools provide better scalability for metadata refresh operations.
  3. Apart from the number of files, factors such as network latency, SharePoint throttling limits, the complexity of metadata fields, and the number of columns retrieved should be taken into account to enhance performance.
  4. To accelerate the retrieval of custom columns, queries should be optimised to fetch all necessary fields in a single request, thereby improving efficiency.
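As an illustration of point 2, a Microsoft Graph client handles large lists by paging: each response carries an @odata.nextLink continuation URL until the last page. The helper below only sketches that loop; the HTTP and authentication layer is left out and stubbed as a callable:

```python
def fetch_all_items(fetch_json, first_url):
    """Collect items from every page of a Graph/OData listing by
    following the @odata.nextLink continuation URL."""
    items, url = [], first_url
    while url:
        page = fetch_json(url)             # e.g. an authenticated GET returning parsed JSON
        items.extend(page.get("value", []))
        url = page.get("@odata.nextLink")  # absent on the last page
    return items
```

With Microsoft Graph, first_url would be a list-items query such as /sites/{site-id}/lists/{list-id}/items?expand=fields with a $top page size; verify the exact shape against the current Graph documentation.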

Additionally, please refer to the links below for further guidance:
Make batch requests with the REST APIs | Microsoft Learn
Working with SharePoint sites in Microsoft Graph - Microsoft Graph v1.0 | Microsoft Learn
PnP PowerShell | PnP PowerShell
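As a rough sketch of point 4 (the site, list, and field names below are placeholders, and the exact query support should be verified in your tenant): the FieldValuesAsText expansion can be requested on the collection query itself, so one paged request returns items together with their text metadata instead of one items(id)/FieldValuesAsText call per row.

```python
from urllib.parse import urlencode

def list_items_url(site_url, list_title, text_fields, page_size=1000):
    """Build a single collection query that selects and expands
    FieldValuesAsText for a page of items in one round trip."""
    select = ",".join(["Id"] + [f"FieldValuesAsText/{f}" for f in text_fields])
    query = urlencode(
        {"$select": select, "$expand": "FieldValuesAsText", "$top": str(page_size)},
        safe="$,/",  # keep OData characters readable in the query string
    )
    return f"{site_url}/_api/web/lists/getbytitle('{list_title}')/items?{query}"
```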

If you find our response helpful, we kindly request you to mark it as the accepted solution and provide kudos. This will assist other community members facing similar challenges.

 

Thank you.


5 REPLIES
v-pnaroju-msft
Community Support

Hi pallavi_r,

We have not received a response from you regarding the query and were following up to check if you have found a resolution. If you have identified a solution, we kindly request you to share it with the community, as it may be helpful to others facing a similar issue.

 

If you find the response helpful, please mark it as the accepted solution and provide kudos, as this will help other members with similar queries.

 

Thank you.

v-pnaroju-msft
Community Support

Hi pallavi_r,

 

Thank you for your response.

Please find below the steps that may help in resolving the issue:

  1. To optimise the process, instead of making individual calls for each item, consider using batch processing to fetch metadata in bulk. By batching multiple requests into a single call, you can significantly reduce latency.
  2. Both PnP PowerShell and Microsoft Graph API are generally more efficient than Power BI connectors, as they are optimised for large-scale operations and can bypass some of the limitations associated with Power BI connectors.
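To sketch point 1: SharePoint's OData $batch endpoint accepts a multipart/mixed body in which each part wraps an individual GET, so many item reads travel in one HTTP round trip. This is only an outline of the payload shape (the boundary and endpoints are illustrative; authentication and the POST to /_api/$batch are omitted):

```python
def build_batch_body(site_url, endpoints, boundary="batch_a1b2c3"):
    """Assemble a multipart/mixed $batch payload with one GET part per
    endpoint. The result would be POSTed to {site_url}/_api/$batch with
    Content-Type: multipart/mixed; boundary={boundary}."""
    parts = []
    for ep in endpoints:
        parts.append(
            f"--{boundary}\r\n"
            "Content-Type: application/http\r\n"
            "Content-Transfer-Encoding: binary\r\n"
            "\r\n"
            f"GET {site_url}/{ep} HTTP/1.1\r\n"
            "Accept: application/json;odata=nometadata\r\n"
            "\r\n"
        )
    parts.append(f"--{boundary}--\r\n")
    return "".join(parts)
```

Batches are capped per request by the service, so a large list would still be split into chunks, but the number of round trips drops by that chunk factor.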

At present, I do not have a perfect code sample available. However, if you come across any useful code snippets, please share them here and mark them as a solution so that others can benefit as well.

 

Thank you.


Thank you @v-pnaroju-msft for providing all the options. I am still trying to resolve this with OData, breaking the lists into multiple dataflows and working on scheduled refresh. I will keep this post updated with my final approach once I succeed.

 

Thanks,

Pallavi

Hi @v-pnaroju-msft ,

 

Thanks a lot for your reply.

1. It would be great if you have any sample code or ideas on how to address this. Everywhere, I have seen only this as a reference: /_api/web/lists/getbytitle('listname')/items(id)/FieldValuesAsText. Any idea how we can optimize this further?

  1. To accelerate the retrieval of custom columns, queries should be optimised to fetch all necessary fields in a single request, thereby improving efficiency

2. Can PnP PowerShell and Microsoft Graph process the SharePoint list faster than the connectors used in Power BI?

3. Is there any specific reference link on PnP PowerShell and SharePoint lists?

 

Thanks,

Pallavi

 
