Sidhant
Advocate V

Managing and Exporting High-Volume Datasets (~1M Records) in Power BI

Hello everyone,
I'm currently working on a use case where I need to manage and export high-volume datasets (~1 million records) using Power BI, and I'm exploring multiple approaches. I’d appreciate your feedback on the current methods I’ve tried and would love to hear if there are better alternatives or optimizations.


Problem Statement:
Effectively manage and export large datasets (~1M records) in Power BI, while allowing users to select/deselect fields dynamically and ensuring smooth integration with Power Automate and Fabric for downstream processes.

 

Current Approaches:

Approach 1: Field Parameters + Power Automate + Paginated Reports

  • Using field parameters in Power BI to allow dynamic column selection.

  • Passing selected fields to Power Automate, which triggers a Paginated Report (RDL).

  • Logic in RDL is set up to show/hide columns based on parameters.

Issue:

The show/hide logic is being overridden — despite user selection, all columns are getting displayed. It seems the parameters are not being passed or consumed correctly within the RDL file.

 

Approach 2: Microsoft Fabric Lakehouse + Semantic Model

 

  • Created a Lakehouse in Fabric to handle large data volumes efficiently.

  • Built a semantic model on top of the Lakehouse.

  • Developed reports using this model in Power BI.

  • Trying to trigger export or automation via Power Automate using Fabric data.

Issue:

Getting a "Bad Request" error while trying to integrate Power Automate with Fabric. Details of the error aren't very descriptive, so it's hard to debug.

 

I have a couple of questions, which are as follows:

 

  • Are these approaches going in the right direction for large dataset export scenarios?

  • Has anyone successfully implemented field-level selection with RDL exports based on Power BI parameters? How did you overcome the column visibility issues?

  • Any known limitations or best practices for using Power Automate with Fabric Lakehouse or Semantic Models?

  • Are there any alternative approaches or workarounds you’d recommend for:

    • Efficiently exporting 1M+ rows

    • Allowing dynamic field selection

    • Maintaining performance and scalability.

Any insights, samples or even partial suggestions would be highly appreciated. I’m open to reworking my approach if there’s a more scalable or reliable pattern others have used successfully.

Thanks in advance,
Sidhant

 

 

32 REPLIES

Hi @Gabry ,
Thanks for the reply. I had shared the external tools as one option (a kind of backup). Earlier you mentioned making use of Notebooks; since I haven't worked on that front, can you please let me know how to achieve the required functionality with them and with Power Automate? If you have any resources that can help with this, please share (I have worked with Power Automate before, but not with such large data).
Also, I didn't follow 'UDF'; what is that?

Regards,
Sidhant.

Thanks for the updates! I understand your point about using external tools like Bravo or DAX Studio for exporting data, but I think it’s worth considering the advantages of using Fabric Notebooks, UDFs, and Power Automate for this kind of task, especially when working with large datasets.

The main benefit of the Fabric + Power Automate approach is that it offers a more streamlined, scalable, and integrated solution.

Here’s how the flow could work:

Power BI Report: Users select fields via slicers or parameters.

Power Automate Flow: The flow captures user selections (through a Power BI button).

Fabric Notebooks: The notebook processes the selected data (filtering, formatting) and exports it as CSV/Excel to OneLake or Blob Storage.

Automated Notification: The flow sends the user an email with the exported file or a link to the file in storage.

Using Power Automate and Notebooks will let you handle much larger datasets efficiently, and it integrates seamlessly into existing Power BI workflows. Additionally, as you've mentioned, bypassing the Power BI export limit (150k rows) is easily achievable with this approach.
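For instance, the filter selections captured by the flow and handed to the notebook could be as simple as a JSON object keyed by column name; all names and values below are purely illustrative, with lists for multi-select slicers and Start/End keys for ranges:

# Illustrative shape of the "filters" parameter the flow could send to the notebook.
filters = {
    "Region": ["East", "West"],        # multi-select slicer -> list of values
    "OrderDateStart": "2022-01-01",    # lower bound of a date range
    "OrderDateEnd": "2022-12-31",      # upper bound of a date range
    "Quantity": 5                      # single-value selection
}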
On the other side, you can also check translytical task flows.

It leverages UDFs (User Data Functions), allowing you to place a button in the Power BI report that captures the filter context and uses it to run a notebook or python code to export the data you need.

Hi @Gabry,
Thanks for giving an idea of how the flow will look. I was trying to create a flow and had a few queries with respect to it:

import json

# 1. Get the JSON string passed in from Power Automate (the "filters" parameter).
#    Note: dbutils.widgets is the Databricks approach; in a Fabric notebook the
#    equivalent is to declare a default (e.g. filters_str = "{}") in a parameters
#    cell and let the job API override it.
filters_str = dbutils.widgets.get("filters")
filters = json.loads(filters_str)

print("Received filters:", filters)

# 2. Load data
df = spark.sql("""
    SELECT OrderID, Name, Profit, Quantity, OrderDate, Region
    FROM SalesTable
""")

# 3. Apply filters dynamically
for col, val in filters.items():
    if isinstance(val, list):
        # Multiple selections from a slicer
        df = df.filter(df[col].isin(val))
    else:
        # Range handling for dates or numbers (keys ending in Start/End)
        if col.endswith("Start"):
            base_col = col[:-len("Start")]
            df = df.filter(df[base_col] >= val)
        elif col.endswith("End"):
            base_col = col[:-len("End")]
            df = df.filter(df[base_col] <= val)
        else:
            # Single-value equality
            df = df.filter(df[col] == val)

# 4. Save to OneLake / Blob (note: Spark writes a folder of part files at this
#    path, not a single FilteredExport.csv file, even with coalesce(1))
output_path = "abfss://exportdata@onelake.dfs.fabric.microsoft.com/SalesExports/FilteredExport.csv"

(df
 .coalesce(1)  # single partition -> single part file
 .write
 .mode("overwrite")
 .option("header", "true")
 .csv(output_path))

print("✅ Export completed:", output_path)

The above code is for the Fabric notebook (it has to accept a dynamic range of filters, meaning n slicers can be added later, so the code should not fail if newer ones are added). Over here I was not sure what should be added as the output_path (should I add the URL of the lakehouse?).

[screenshot]
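On the output_path question, a OneLake path to a lakehouse Files area generally follows the pattern sketched below; the workspace and lakehouse names are placeholders, not your actual items, and the final write continues from the df in the notebook code above:

# Hypothetical workspace/lakehouse names, shown only to illustrate the URI format.
workspace = "MyWorkspace"        # workspace name or its GUID
lakehouse = "SalesLakehouse"     # lakehouse item name (suffixed .Lakehouse) or its GUID

output_path = (
    f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
    f"{lakehouse}.Lakehouse/Files/SalesExports/FilteredExport"
)

# If the notebook has that lakehouse attached as its default, a relative path
# like "Files/SalesExports/FilteredExport" should also work for the Spark write.
df.coalesce(1).write.mode("overwrite").option("header", "true").csv(output_path)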

Then coming to the Power Automate flow:

[screenshot: Power Automate flow]

Then, in the Compose action, I was using JSON to build the request body, which looks like:

{
  "notebookExecution": {
    "parameters": {
      "filters": "@{json(triggerBody()?['filters'])}"
    }
  }
}

Then, to run the Fabric notebook using the HTTP action (premium connector), I came across two URLs that can run a notebook:
1st: https://api.fabric.microsoft.com/v1/workspaces/{{WORKSPACE_ID}}/items/{{ARTIFACT_ID}}/jobs/instances (received from ChatGPT)
2nd (which is being used currently; here we don't need to pass anything in the body, i.e. it's empty):
https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items/{artifact_id}/jobs/instances?job...
But the issue is that I am not sure how to pass the filters (the Compose action output, if I use the 1st URL), and to generate the token do I need to register an app (how do I get the token)?
-> The next step was to poll the notebook until it returns 200 (succeeded) using a GET request that uses the status from the previous HTTP action, but here I needed to get the JobInstanceId and I'm not sure how to get that.

# GET URL:
GET https://api.fabric.microsoft.com/v1/workspaces/{{WORKSPACE_ID}}/items/{{ARTIFACT_ID}}/jobs/instances/{{jobInstanceId}}
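For orientation, here is a rough Python sketch of the same two calls the HTTP actions would make, based on the documented job-instances API (all IDs and the token are placeholders, and the body shape follows the "run a notebook on demand" documentation rather than the Compose output above). The jobInstanceId comes back in the Location header of the 202 response, which is the polling question raised here:

import json
import time
import requests

# Placeholders - substitute real values. How to obtain the bearer token
# (delegated user token or app registration) is discussed later in the thread.
WORKSPACE_ID = "<workspace-guid>"
NOTEBOOK_ID = "<notebook-artifact-guid>"
TOKEN = "<bearer token for https://api.fabric.microsoft.com>"

headers = {"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"}

# 1. Submit the notebook job. Per the documented API, notebook parameters go
#    under executionData.parameters, each with an explicit value and type.
body = {
    "executionData": {
        "parameters": {
            "filters": {
                "value": json.dumps({"Region": ["East"], "OrderDateStart": "2022-01-01"}),
                "type": "string",
            }
        }
    }
}
run_url = (
    f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}"
    f"/items/{NOTEBOOK_ID}/jobs/instances?jobType=RunNotebook"
)
resp = requests.post(run_url, headers=headers, json=body)
resp.raise_for_status()  # a successful submission returns 202 Accepted

# 2. The job instance URL (ending in the jobInstanceId) is returned in the
#    Location header of that 202 response - poll it with GET.
poll_url = resp.headers["Location"]

# 3. Poll until the job reaches a terminal state.
while True:
    status = requests.get(poll_url, headers=headers).json().get("status")
    if status in ("Completed", "Failed", "Cancelled"):
        break
    time.sleep(15)

print("Notebook job finished with status:", status)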

Right now I have the workspace and artifact IDs, which we can find in the URL of the Fabric notebook.

And the next steps were: the notebook should save the filtered data in OneLake, then a GET request retrieves the link (location) of the stored file, and finally a Send email action notifies the user.

So can you help me out here? Is the above flow design correct?

@Poojara_D12 , @v-hashadapu : If you have anything to add, please do share it as well.
Regards,
Sidhant

Hi, sorry for the delay, the last few days have been quite busy and the topic is getting a bit complex, so I needed some time to think it through.

I reviewed your notebook + Power Automate flow and gave it some thought, and I believe that, at least for now, it might be easier to simplify things. Honestly, it felt like we were overcomplicating what should have been a relatively simple task, so I suggest just relying on user data functions.

These are specific artifacts you can create to easily access the report filter context. I recommend checking the documentation, for example:

Overview

Step-by-step tutorial

There are also YouTube videos available, since it would be difficult to explain everything here in detail.

You can use these artifacts almost like notebooks. With some adjustments, you could place the code you wrote inside a UDF, receive the Power BI filter context as parameters, use it to generate the new dataframe, and then write a new file to the lakehouse in a single step. This way, at least for now, you could avoid adding Power Automate.

In my opinion, the UDF approach is the cleanest and least messy: you keep all the code inside the function, both the part that reads the filter context and the part that writes the files.
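For orientation, the skeleton of a user data function is small; a sketch based on the template in the UDF docs could look like the following. The function name, parameter, and the lakehouse write are placeholders to adapt, not working export code:

import json
import logging
import fabric.functions as fn

udf = fn.UserDataFunctions()

@udf.function()
def export_filtered_sales(filters_json: str) -> str:
    # filters_json would carry the report filter context, serialized by the
    # Power BI button that invokes this function.
    filters = json.loads(filters_json)
    logging.info("Received filter context: %s", filters)

    # ...apply the filters and write the result to the lakehouse here,
    # e.g. via a lakehouse connection added with udf.connection(...) as
    # described in the documentation...

    return f"Export started with {len(filters)} filter(s)"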

Take a look at the docs and let me know what you think.

PS. 

Of course, the filter context can also be passed via Power Automate. It’s not that complex, but explaining all the steps here would be difficult. You can follow the official documentation here or find tutorials on YouTube. Additionally, the section "Run a notebook on demand" (linked here) explains how to pass parameters using the REST API, which is one of the steps where you got stuck.

I apologize for bringing up the Power Automate approach; in hindsight, it would probably have been better to focus solely on translytical task flows. My recommendation, as mentioned earlier, is to set this approach aside for now and give UDFs a try first.

v-hashadapu
Community Support

Hi @Sidhant , Hope you're doing okay! May we know if it worked for you, or are you still experiencing difficulties? Let us know — your feedback can really help others in the same situation.

Hi @v-hashadapu,
Not yet. @ShubhaGampa11 is my colleague who is working along with me; we are exploring different ways to achieve the expected output, so I asked her to share her points on this thread.
I came across a post that mentioned making use of email subscriptions, but it didn't explain how to use them, so if you know anything about that do let me know, and if you or @Gabry have anything to add, please do.

Regards,
Sidhant.

ShubhaGampa11
Advocate I

Hi @v-hashadapu , @Gabry ,

This end-to-end solution enables users to dynamically select fields in Power BI and export the filtered data to a well-formatted Excel file via Microsoft Fabric, with automation powered by Power Automate.

The flow begins with a Power BI report where users choose specific columns using slicers or parameters. A Power Automate flow—triggered via HTTP or a Power BI button—captures these selections, along with export metadata like user email, export ID, and record limits. The flow authenticates with Microsoft Fabric using an Azure App Registration and securely triggers a Fabric data pipeline.
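On the authentication piece (the token question raised earlier in the thread), a minimal sketch with MSAL and a client-credentials flow could look like the following; it assumes an Azure App Registration with a client secret and tenant settings that allow service principals to call the Fabric APIs, and every ID is a placeholder:

import msal

TENANT_ID = "<tenant-guid>"
CLIENT_ID = "<app-registration-client-id>"
CLIENT_SECRET = "<client-secret>"

app = msal.ConfidentialClientApplication(
    CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=CLIENT_SECRET,
)

# App-only token for the Fabric REST API; pass it as the Bearer token in the
# Power Automate HTTP action (the result contains "error" instead on failure).
result = app.acquire_token_for_client(scopes=["https://api.fabric.microsoft.com/.default"])
token = result["access_token"]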

Inside Fabric, the pipeline filters and exports the selected data to Excel, formats the output using a notebook (with headers styled, column widths adjusted, and summary metadata added), and stores it in a Lakehouse location. After processing, Power Automate fetches the file and emails it directly to the user, attaching the Excel file with all selected data.

This workflow is scalable, secure, and user-friendly—ideal for automated reporting, scheduled exports, or on-demand sharing. It eliminates manual data exports and delivers polished Excel reports with just one click from Power BI.

This approach is possible, but it's costly to the user and we need a lot of field parameters and slicers. For example, please find the screenshot below.

[screenshot: field parameters and slicers]

So I'm using "Run a query against a dataset" to get the dynamic parameters, but I'm facing an issue where I need to implement the slicers. Right now I'm working with a sample dataset with 3 field parameters and a slicer, and when I use Performance Analyzer I'm seeing the following DAX -

// DAX Query
DEFINE
    VAR __DS0FilterTable = 
        TREATAS({"'Orders'[Order ID]"}, 'OrderParameter'[OrderParameter Fields])

    VAR __DS0FilterTable2 = 
        TREATAS({"'People'[Region]"}, 'PeopleParameter'[PeopleParameter Fields])

    VAR __DS0FilterTable3 = 
        TREATAS({"'Returns2'[Returned]"}, 'ReturnsParameter'[ReturnsParameter Fields])

    VAR __DS0FilterTable4 = 
        FILTER(
            KEEPFILTERS(VALUES('Orders'[Order Date])),
            'Orders'[Order Date] >= DATE(2022, 1, 20)
        )

    VAR __DS0Core = 
        SUMMARIZECOLUMNS(
            'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[Year],
            'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[Quarter],
            'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[QuarterNo],
            'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[Month],
            'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[MonthNo],
            'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[Day],
            __DS0FilterTable,
            __DS0FilterTable2,
            __DS0FilterTable3,
            __DS0FilterTable4,
            "SelectedFieldsOrder", 'FieldParamTextOrder'[SelectedFieldsOrder],
            "SelectedFieldPeople", 'FieldParamTextPeople'[SelectedFieldPeople],
            "SelectedFieldsReturns", 'FieldsParamTextReturns'[SelectedFieldsReturns]
        )

    VAR __DS0BodyLimited = 
        TOPN(
            1002,
            __DS0Core,
            'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[Year],
            1,
            'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[QuarterNo],
            1,
            'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[Quarter],
            1,
            'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[MonthNo],
            1,
            'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[Month],
            1,
            'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[Day],
            1
        )

EVALUATE
    __DS0BodyLimited

ORDER BY
    'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[Year],
    'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[QuarterNo],
    'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[Quarter],
    'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[MonthNo],
    'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[Month],
    'LocalDateTable_646b2db8-ef0e-4dc0-83b9-a2a5218245aa'[Day]

  

Can anyone please help me on how to make the DAX more dynamic and satisfy our requirement of exporting 1M rows too?
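One possible direction, sketched under stated assumptions rather than a confirmed fix: run the query from a Fabric notebook with semantic link (sempy), where the 1,002-row TOPN added by the visual does not apply, and splice the user-selected fields into the DAX string. Dataset, workspace, column, and measure names below are placeholders:

import sempy.fabric as fabric

# Selected fields as captured from the field parameters (placeholder values).
selected_fields = ["'Orders'[Order ID]", "'People'[Region]", "'Returns2'[Returned]"]

dax = f"""
EVALUATE
SUMMARIZECOLUMNS(
    {", ".join(selected_fields)},
    "Total Quantity", SUM('Orders'[Quantity])
)
"""

# Runs against the semantic model and returns a pandas-style dataframe,
# so the full result (1M+ rows, memory permitting) can be written out.
df = fabric.evaluate_dax(dataset="SalesModel", dax_string=dax, workspace="MyWorkspace")
df.to_parquet("/lakehouse/default/Files/SalesExports/export.parquet")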

Thank you

Sidhant
Advocate V

Hi @v-hashadapu , @Gabry ,
I had another query: by any chance, is there any way we can increase the 150K export limit for a specific use case by connecting with the Microsoft team? For example, in AWS, if we want to increase some default limit we can easily request that, and in Azure Blob Storage the default storage limit of 5 PiB can be raised by contacting Azure support. Can we similarly do that here (for a special requirement), considering my org is a Microsoft Partner?
Meanwhile I'm also implementing a few workarounds, which are in progress; I will share those as well (along with the respective blockers associated with them).

Regards,
Sidhant.

Hi @Sidhant , Thank you for reaching out to the Microsoft Community Forum.

 

In Power BI, the 150K row export limit (to Excel/CSV) is a hard service limitation and can’t be raised by contacting Microsoft support, even if your org is a Microsoft Partner. Unlike Azure services where quotas can be increased, Power BI enforces these limits consistently across tenants for performance and governance reasons.

Paginated reports in Power BI: FAQ - Power BI | Microsoft Learn

 

For true large-scale exports (1M+ rows), the recommended approach is to bypass the built-in export and instead leverage Fabric Lakehouse or Dataflows/Notebooks to generate files (CSV/Parquet) that can be stored in OneLake or Blob Storage and then distributed. That way you remove the export bottleneck, support dynamic column selection (via parameters or notebook inputs) and keep the solution scalable for downstream use in Power Automate or other services.
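As a rough illustration of that pattern (table and path names are placeholders), the notebook side can be as small as a Spark write to the lakehouse Files area:

# "spark" is pre-defined in a Fabric notebook session; the relative path below
# should resolve against the notebook's default lakehouse.
df = spark.sql("SELECT * FROM SalesTable")   # or a filtered/column-trimmed projection

(df.write
   .mode("overwrite")
   .parquet("Files/SalesExports/full_export"))   # Parquet scales well past 1M rows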

 

v-hashadapu
Community Support

Hi @Sidhant , hope you are doing great. May we know if your issue is solved or if you are still experiencing difficulties. Please share the details as it will help the community, especially others with similar issues.

Gabry
Super User

I've never tried this myself, but I was thinking, if you're using Fabric, one possible solution could be the following:

You can set up a report with a column filter. Then, using a user data function, you can trigger a notebook that receives the selected columns as input. The notebook can export the filtered data and save it to a folder within the lakehouse. Users can then access the data directly from that folder.

Could this be a viable solution?
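A minimal sketch of the column-selection piece this suggests (table, path, and column names are hypothetical) might be:

# Selected columns as received from the user data function / report button.
selected_columns = ["OrderID", "Region", "Profit"]

df = spark.sql("SELECT * FROM SalesTable")   # spark is pre-defined in Fabric notebooks

# Keep only columns that actually exist, so added or renamed fields don't break the export.
valid = [c for c in selected_columns if c in df.columns]

(df.select(*valid)
   .write.mode("overwrite")
   .option("header", "true")
   .csv("Files/SalesExports/SelectedColumns"))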

Hi @Gabry ,
Thanks for the response. I'll check this approach and see if it resolves my issue. I will keep you posted on this.

Regards,
Sidhant.
