Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
PWRBI
Frequent Visitor

Best practice Reference or Append Query - Load large amount of data in multiple steps

Hello,

 

I have a very large amount of data, so I need to split it into multiple queries to keep the data source from crashing.

But then I have to merge those queries back together to have a single query and transform the fields.
However, when referenced again from the source, they are all loaded together and this leads to chrash in the data source.
Unfortunately, with the reference, the data source is also loaded again, what alternative is there to the reference?

3 REPLIES 3
jbwtp
Memorable Member
Memorable Member

Hi @PWRBI,

 

Theoretically, switching off "Parallel table loading" in the setting may help to avoid concurrent access to the datasource. But if you are having those problems, I guess, the datasource doesnot support anything like pagination or query folding. In this case splitting query into separate queries may do the opposed as each query may be loading the entire datasource and then filter it based on the filters it has. Without knowing exactly what you are doing, it is hard to tell, but this is quite possible scenario. Maybe re-think your strategy and test if you can load the entire dataset from the source at once one more time? Can you split the datasource to a smaller data tables/sources?In this case, splitting can make sense (if you requesting separate files or tables each time), in this case try prohibiting parallel data load in settings.

 

Cheers,

John

PWRBI
Frequent Visitor

Thank you for your feedback.

 

So the individual queries work great and I have already deactivated parallel queries.

 

I have the following example:

query 1
query 2
Query 3
Query 4
Query 5

 

These 5 queries run one after the other and the database has no problems.

 

In the Master Query I refer to queries 1 - 5 and in this query all five queries are started at the same time and the database crashes.

jbwtp
Memorable Member
Memorable Member

Hi @PWRBI,

 

I see your point, yes, you are right since this is a single query referring to several other queries, it triggers them to run all at once regardless the settings.

 

Do you wan tto try playing along these lines:

let
    Source_1 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlSK1YlWMgKTxkqxsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    Source_2= Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlSK1YlWMgKTxkqxsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    Source_3 = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WMlSK1YlWMgKTxkqxsQA=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [Column1 = _t]),
    Combine_1 = Table.Combine({Source_1, Function.InvokeAfter(()=>Source_2, #duration(0,0,0,10))}),
    Combine_2 = Table.Combine({Combine_1, Function.InvokeAfter(()=>Source_3, #duration(0,0,0,10))})
in Combine_2

 

The idea behind it to manually force the delay on gettin the date from each table (using Function.InvokeAfter). You will need to somehow manually measure timing for each query to load and then add 20% on top for the duration parameter. I think this may works, but the performance of course will be quite suboptimal.

 

Kind regards,

John

Helpful resources

Announcements
Europe Fabric Conference

Europe’s largest Microsoft Fabric Community Conference

Join the community in Stockholm for expert Microsoft Fabric learning including a very exciting keynote from Arun Ulag, Corporate Vice President, Azure Data.

AugPowerBI_Carousel

Power BI Monthly Update - August 2024

Check out the August 2024 Power BI update to learn about new features.

August Carousel

Fabric Community Update - August 2024

Find out what's new and trending in the Fabric Community.

Top Solution Authors
Top Kudoed Authors