The ultimate Fabric, Power BI, SQL, and AI community-led learning event. Save €200 with code FABCOMM.
Get registeredCompete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.
Hello
We have a large SharePoint folder of about 400,000 documents and I've been asked if it's possible to create a report showing all file names and the custom metadata against each file. I've tried everything and I can't find a solution that works as it takes too long or doesn't have the correct information.
Things i've tried:
Are there any options I haven't considered?
Hi @ls784
Dealing with large datasets, Microsoft Fabric might be a good option. It's designed to handle large amounts of data efficiently and could potentially provide a solution for generating your report.
Here's a link for your reference:
What is Microsoft Fabric - Microsoft Fabric | Microsoft Learn
Hope this can help,
Best Regards
Zhengdong Xu
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
Thanks for your response, unfortunately Fabric isn't an option for my company at the moment 😞
OData gets the filenames, and SP Online List gets the metadata - if you put each separately in a dataflow table, you could merge them together on the unique file identifier in a new query. I mention dataflow only because with the size of the list it will require you to ... what's the word... break it down a bit, then merge it together with something that has a fresh resource pool to work with, if that makes sense. I would do the final merge on the PBI desktop side with the dataflows as the source - the "pipe" to the dataflows is much faster than the pipe to SharePoint. None of these are proper terms, but I hope you get the gist of what I mean. 🙂
Make sure when you use the list connector that you select the columns you want, then right click "remove other columns" - particularly to get rid of the complex field types like person fields. I am pretty sure that the remove columns step "folds" and causes it to go much faster if you do it first in the applied steps.
Use the 2.0 connector, its much faster than 1.0.
I've done exactly this for large libraries, but they were not THAT large... 400,000 is a lot for a SharePoint source, and it'll probably throttle you on top of it being slow to start with, so I can't promise it'll get that many but conceptually that's how I'd approach it.
Hi,
What is Custom metadata?
List of files you can get like New Source, Sharepoint Folder. There will be Attributes columns, maybe that will show custom metadata?
We have created meta data fields that are specific to us. Unfortunately any of the SharePoint source options are too slow and/or don't give me the fields I'd need.
Hi @ls784
Is your issue got resolved. Would be great if you can share your approach to this problem.
Thanks,
Pallavi
Hi
Unfortunately it's not resolved, I haven't been able to get any suggestions to work. Fabric would probably be the right solution but we're not in a position to move to Fabric yet.
Thanks
If that is something, what one sees in properties of the file, you can try use python/PowerShell/CommandLine scripts.
Browse Internet or ChatGPT for that. It's pretty much nothing to do with Fabric products.