Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

We've captured the moments from FabCon & SQLCon that everyone is talking about, and we are bringing them to the community, live and on-demand. Starts on April 14th. Register now

Reply
michaelog80
Frequent Visitor

Getting Error when loading Large table to Dataflow

Hi, We are starting to use Power BI recently and need to build a big semantic model containing about 8 tables for our Customer level data for which our end users can build their own reports  by dragging and dropping values. We use Teradata and have created an On Premise Data Gateway with a connection to Teradata. One of the tables is very large - over 100 million rows, 7 years of Customer level data with over 100 columns /attributes which updates daily. The other 7 tables are reference tables which are not updated that often and are smaller tables - max 10,000 rows

From reading online, the suggested method would be to create @DataFlow (Gen1) (waiting on IP to open up ports for Gen2) and then build and publish your Semantic Model off these dataflows to give optimal dashboard performance.

The issue I am having is loading the large Customer table into a Dataflow. 

I know once its loaded, I would be able to set up Incremental refresh to load current month only  but I cant get the table to load. It keeps timing out after about 12 hours loading. If I run a Select * From that table in Teradata, it only takes about 15 minutes to run so why would it be taking 12 + hours to run in a Dataflow. I have tried multiple things with no success.

Loading full table with no additional calculations

Splitting table into years / quarters and trying to load. ( am using calculations here as it will need to be rolling i.e once 1st May comes, I dont want to see April 2018 data)

Any helo would be appreciated.

 

Thanks,

 

1 ACCEPTED SOLUTION
v-nmadadi-msft
Community Support
Community Support

Hi @michaelog80,
Thanks for reaching out to the Microsoft fabric community forum.

As you have mentioned there are 8 tables instead of using one dataflow for all of them you can try splitting them into multiple dataflows for two or even one table.
Reference: Best practices for designing and developing complex dataflows - Power Query | Microsoft Learn

Another recommended step in your scenario is to create staging dataflows, to reduce the load.
Please follow the referenced article to know more about staging dataflow and how to best use them: Best practices for creating a dimensional model using dataflows - Power Query | Microsoft Learn

If you find this post helpful, please mark it as an "Accept as Solution" and consider giving a KUDOS. Feel free to reach out if you need further assistance.
Thanks and Regards

View solution in original post

4 REPLIES 4
v-nmadadi-msft
Community Support
Community Support

Hi @michaelog80,

May I ask if you have resolved this issue? If so, please mark the helpful reply and accept it as the solution. This will be helpful for other community members who have similar problems to solve it faster.

Thank you.

 

v-nmadadi-msft
Community Support
Community Support

Hi @michaelog80,
Thanks for reaching out to the Microsoft fabric community forum.

As you have mentioned there are 8 tables instead of using one dataflow for all of them you can try splitting them into multiple dataflows for two or even one table.
Reference: Best practices for designing and developing complex dataflows - Power Query | Microsoft Learn

Another recommended step in your scenario is to create staging dataflows, to reduce the load.
Please follow the referenced article to know more about staging dataflow and how to best use them: Best practices for creating a dimensional model using dataflows - Power Query | Microsoft Learn

If you find this post helpful, please mark it as an "Accept as Solution" and consider giving a KUDOS. Feel free to reach out if you need further assistance.
Thanks and Regards

michaelog80
Frequent Visitor

Thanks for the reply @Akash_Varuna 

Just to follow up on a couple of helpful suggestions you made.

 

I cannot get the intial load to work as it runs for over 12 hours and then times out. Is there a way to increase this?

Data Gateway is on a good sized server 128GB of RAM. 8 physical cores. 16 VM cores.

Ive tried splitting into smaller dataflows but when I add calculation to split i.e a date column i seemt o get an error stating Event Was cancelled after 10 minutes.

Regarding dataflows. I have turned on Enhanced Compute Engine Settings to allow for Direct query and will see how that works

Thanks,
mike

 

Akash_Varuna
Super User
Super User

You could try optimizing your Teradata query to filter necessary data and enable incremental load to handle only recent updates after the initial load. Configure the Data Gateway with enough resources and adjust batch sizes for better performance. Split the large table by year/quarter into separate Dataflows for smaller, manageable loads. Alternatively, stage the data in Azure or use DirectQuery to reduce load times.

Helpful resources

Announcements
New to Fabric survey Carousel

New to Fabric Survey

If you have recently started exploring Fabric, we'd love to hear how it's going. Your feedback can help with product improvements.

Power BI DataViz World Championships carousel

Power BI DataViz World Championships - June 2026

A new Power BI DataViz World Championship is coming this June! Don't miss out on submitting your entry.

Join our Fabric User Panel

Join our Fabric User Panel

Share feedback directly with Fabric product managers, participate in targeted research studies and influence the Fabric roadmap.

March Power BI Update Carousel

Power BI Community Update - March 2026

Check out the March 2026 Power BI update to learn about new features.

Top Solution Authors