Publishing Large Dataset of Parquet files in Azure Storage Gen 2
Hi,
I need to create a report that queries a folder containing over 300 Parquet files with the same schema; one file per day is added. The data resides in a Gen 2 Azure Storage account. The total size is 4 GB (equivalent to roughly 20 GB as CSV).
I need to add a filter on serial number and retrieve all data (all columns required) related to that serial number.
The problem is that because the dataset is so large, I get a load exception in Power BI Desktop and cannot publish the whole dataset. The idea is to cache the dataset in the cloud, say once a day, so that users query the cached dataset, hopefully with a fast response.
Can you please guide me on the best approach? I was looking into direct querying, but haven't found a DirectQuery connector for Parquet / CSV files.
Thanks!
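A minimal Power Query (M) sketch of the folder-combine pattern the question describes, assuming hypothetical account, container, folder, and column names (myaccount, mycontainer, telemetry, SerialNumber):

```
let
    // List the files in the ADLS Gen2 folder (hypothetical URL)
    Source = AzureStorage.DataLake("https://myaccount.dfs.core.windows.net/mycontainer/telemetry"),
    // Keep only the Parquet files
    ParquetFiles = Table.SelectRows(Source, each Text.EndsWith([Name], ".parquet")),
    // Parse each file's binary content; all files share one schema
    Parsed = Table.AddColumn(ParquetFiles, "Data", each Parquet.Document([Content])),
    // Stack the per-file tables into a single table
    Combined = Table.Combine(Parsed[Data]),
    // Filter on the (hypothetical) serial number column
    Filtered = Table.SelectRows(Combined, each [SerialNumber] = "SN-12345")
in
    Filtered
```

Note that combining binaries this way downloads every file on each refresh, which is part of why the reply below suggests incremental refresh for the daily additions.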
Supposedly the new Direct Lake mode in Microsoft Fabric will allow you to query Parquet files at rest.
Until then, you can consider loading them into a dataset (with incremental refresh if needed, as sketched below) and then pointing your report at that dataset.
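A hedged sketch of that incremental-refresh route, reusing the hypothetical names from the earlier query: Power BI substitutes the RangeStart and RangeEnd datetime parameters per partition, so the query can prune whole files by their modification date before parsing anything.

```
let
    // RangeStart and RangeEnd are datetime parameters that must be defined
    // in Power BI Desktop before incremental refresh can be configured
    Source = AzureStorage.DataLake("https://myaccount.dfs.core.windows.net/mycontainer/telemetry"),
    ParquetFiles = Table.SelectRows(Source, each Text.EndsWith([Name], ".parquet")),
    // Prune whole files by the connector's modification-date metadata
    // (verify the exact column name in your navigation table) so each
    // refresh only reads files inside the current partition window
    InRange = Table.SelectRows(ParquetFiles,
        each [#"Date modified"] >= RangeStart and [#"Date modified"] < RangeEnd),
    Parsed = Table.AddColumn(InRange, "Data", each Parquet.Document([Content])),
    Combined = Table.Combine(Parsed[Data])
in
    Combined
```

Since one file lands per day, a daily refresh would then read only the newest file instead of all 300, and the serial-number filter can be applied in the report against the published dataset.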
