robotbi
Frequent Visitor

Power BI Create Dataflow with Big Size Data

Hello,

 

I am creating a dataflow from a SQL query that returns about 2-7 million rows. The SQL query itself usually takes 1.5 hours to finish. I've tried running it through Power Query with the timeout raised to 200 minutes, but it keeps throwing timeout errors.

Now I'm trying to load the base tables directly from the database with only the necessary columns, then join/combine them in Power Query to get the final output, but the tables are too big and the load can't finish, so the combine step never completes.

 

Initially we need the previous 90 days of data, so I've also tried slicing the dataset into batches of 15 days each, but it still took too long to load and failed.

 

Are there any potential solutions we can try here? I appreciate any ideas and help!

robotbi_0-1724252014077.png

 

2 ACCEPTED SOLUTIONS
lbendlin
Super User

The sql query itself usually takes 1.5h to finish running

Make it run (much, much) faster: indexes, statistics, etc. There is no reason for such a small number of rows to take 1.5 hours.
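As a rough illustration of the indexing point, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the real database. The `orders` table, its columns, and the index name are hypothetical, and the plan text will look different on other engines. The idea it shows is general, though: check the query plan, add an index on the column used in the date filter, update statistics, and check the plan again.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the planner's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
# Hypothetical fact table; the names are illustrative only.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (order_date, amount) VALUES (?, ?)",
    [(f"2024-{m:02d}-{d:02d}", 1.0) for m in range(1, 9) for d in range(1, 29)],
)

sql = "SELECT COUNT(*) FROM orders WHERE order_date >= ?"
plan_before = query_plan(conn, sql, ("2024-06-01",))

# An index on the filter column lets the engine seek instead of scanning.
conn.execute("CREATE INDEX idx_orders_order_date ON orders (order_date)")
conn.execute("ANALYZE")  # refresh the optimizer statistics
plan_after = query_plan(conn, sql, ("2024-06-01",))

print("before:", plan_before)
print("after: ", plan_after)
```

On the real source system the equivalent step would be to inspect the execution plan of the 1.5-hour query and look for full scans on the large tables.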

 

 then in Power Query join/combine the tables to get the final output data,

Do not do that. Merges are extremely expensive. Load the tables independently and then combine them in the semantic model's data model.

 

Initially we need the previous 90 days of data, so I've also tried slicing the dataset into batches of 15 days each, but it still took too long to load and failed.

Read about incremental refresh, and especially about bootstrapping (preparing partitions without filling them right away).
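To make the partition idea concrete, here is a small conceptual sketch in Python. It is not how the Power BI service implements incremental refresh; the function name, the one-day granularity, and the 3-day refresh window are assumptions chosen for illustration. It splits a 90-day window into partitions and flags only the most recent ones for reload, which is why a routine refresh touches far less data than a full 90-day query. With bootstrapping, the empty partitions would be created first and then filled one at a time.

```python
from datetime import date, timedelta

def plan_partitions(today, keep_days=90, refresh_days=3):
    """Split the retention window into one-day partitions.

    Each partition covers the half-open range [start, end). Only the most
    recent `refresh_days` partitions are flagged for reload.
    """
    partitions = []
    for offset in range(keep_days, 0, -1):
        start = today - timedelta(days=offset)
        partitions.append({
            "start": start,
            "end": start + timedelta(days=1),
            "refresh": offset <= refresh_days,
        })
    return partitions

# Hypothetical "today"; any date works.
parts = plan_partitions(date(2024, 8, 21))
to_refresh = [p for p in parts if p["refresh"]]

print(len(parts), "partitions,", len(to_refresh), "reloaded per refresh")
print("oldest:", parts[0]["start"], "newest:", parts[-1]["start"])
```

Compared with manual 15-day batches, each unit of work here is much smaller, and a failed partition can be retried without reloading the rest.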


Anonymous
Not applicable

Hi @robotbi ,

 

I’d like to acknowledge the valuable input provided by lbendlin. Their initial ideas were instrumental in guiding my approach. However, I noticed that further details were needed to fully understand the issue.

 

Incremental Refresh and Real-Time Data for Semantic Models in Power BI provides an effective way to handle dynamic data and improve model refresh performance. By automating partition creation and management, incremental refresh reduces the amount of data that needs to be refreshed and allows the inclusion of real-time data.

 

After applying filters and loading a subset of the data into the model, you can define an incremental refresh policy for the table. After you publish the model to the service, the service uses the policy to create and manage table partitions and to perform refresh operations. To define a policy, specify the required and optional settings in the Incremental refresh and real-time data dialog box.
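The filter mentioned above is applied to the table's date column using two date/time parameters, which the documentation names RangeStart and RangeEnd. In Power BI this is a row filter in Power Query; the Python below only sketches the boundary logic, and the sample timestamps are made up. The point is that the filter should be inclusive on one side and exclusive on the other, so that a row sitting exactly on a boundary lands in exactly one partition.

```python
from datetime import datetime

def in_partition(ts, range_start, range_end):
    """Half-open filter: include the start boundary, exclude the end."""
    return range_start <= ts < range_end

# Made-up row timestamps, including two that sit exactly on a boundary.
rows = [
    datetime(2024, 8, 1, 0, 0),
    datetime(2024, 8, 1, 23, 59),
    datetime(2024, 8, 2, 0, 0),
]
# Two adjacent one-day partitions: [Aug 1, Aug 2) and [Aug 2, Aug 3).
boundaries = [datetime(2024, 8, 1), datetime(2024, 8, 2), datetime(2024, 8, 3)]

# Count how many partitions each row falls into.
hits = [
    sum(in_partition(r, lo, hi) for lo, hi in zip(boundaries, boundaries[1:]))
    for r in rows
]
print(hits)
```

If both sides of the filter were inclusive, the row at midnight on August 2 would match both partitions and be loaded twice.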

vkaiyuemsft_0-1724308694313.png

 

vkaiyuemsft_1-1724308706215.png

 

 

More details can be found in the documentation:

Incremental refresh for semantic models in Power BI - Power BI | Microsoft Learn

Advanced incremental refresh and real-time data with the XMLA endpoint in Power BI - Power BI | Micr...

 

If your Current Period does not refer to this, please clarify in a follow-up reply.

 

Best Regards,

Clara Gong

If any post helps, please consider accepting it as the solution to help other members find it more quickly.

