Fleeups
New Member

How to optimize big data load

So, here's the thing. I have a table with over 289k rows coming from Google BigQuery.

 

So I tried to split this query in two:

 

Sentimentos_old and Sentimentos.

 

Sentimentos_old pulls data from before 2022-11-01, and I've disabled its refresh; it returned 287k rows.

 

Table Sentimentos pulls data from after this date and appends the old table.

 

I expected the 287k rows from Sentimentos_old to load instantly (they should be static now, since that query no longer refreshes), and only the ~2k rows from Sentimentos to be refreshed. But that didn't happen: the refresh still takes just as long as it did as a single query.

 

Is there a way to do something like this? 

1 ACCEPTED SOLUTION
ppm1
Solution Sage

Incremental refresh is the way to do this. Please see this documentation.

Incremental refresh for datasets and real-time data in Power BI - Power BI | Microsoft Learn
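For context, incremental refresh hinges on two reserved datetime parameters, RangeStart and RangeEnd, plus a filter step on the query's date column; Power BI then only refreshes the recent partitions. A minimal sketch (the column name data_post and the source step are hypothetical placeholders for the Sentimentos query):

```m
// Sketch only: RangeStart and RangeEnd must be created as datetime parameters
// in Power Query (Manage Parameters); Power BI substitutes the partition
// boundaries at refresh time. "data_post" is a hypothetical date column.
let
    Source = Sentimentos,   // the existing BigQuery source steps
    Filtered = Table.SelectRows(
        Source,
        each [data_post] >= RangeStart and [data_post] < RangeEnd)
in
    Filtered
```

With this filter in place, the incremental refresh policy is configured on the table in Power BI Desktop (e.g. archive data older than N years, refresh the last N days).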

 

Pat

Microsoft Employee

View solution in original post

4 REPLIES
djurecicK2
Super User

@Fleeups, yes, it is.

djurecicK2
Super User

I agree with incremental refresh for large data, but 289k rows does not seem that big; maybe something else is going on? Can you reduce the number of columns you are importing?
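Trimming columns works best as an early query step, so the projection can fold back to the BigQuery source where folding is supported. A hedged sketch (the step and column names are hypothetical):

```m
// Keep only the columns the report actually needs; doing this early lets
// query folding push the projection down to BigQuery when supported.
// Column names here are illustrative placeholders.
Trimmed = Table.SelectColumns(
    Source,
    {"data_post", "texto_sem_acento", "campaign_id"},
    MissingField.Ignore)
```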

I've read somewhere that some merges can really slow down the refresh, and I have a merge that could be the culprit.

 

Every row in that table refers to a comment on an Instagram post. I have another table with the columns campaign ID and key words, used to map which comments are adherent to each campaign.

 

So I use a merge (to bring only the right keywords to each campaign), which results in a column of nested tables. I then turn those tables into lists, like this:

 

(screenshot: Fleeups_0-1671477196245.png)

 

Then I create a new column that checks whether any word in the list matches the comment column: if there's a match it returns 1, else 0. I can then sum this column to count relevant comments and calculate the adherence rate. Here's the formula:

 

= Table.AddColumn(#"Added Custom", "Comentários Aderentes", each
    if List.AnyTrue(List.Transform([Custom],
        (x) => Text.Contains([texto_sem_acento], x, Comparer.OrdinalIgnoreCase)))
    then 1 else 0)
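As a hedged alternative (not from the thread), List.MatchesAny folds the transform-then-test pattern into a single call and stops at the first matching keyword, which can shave work off wide keyword lists; buffering the keyword table with Table.Buffer before the merge is another commonly cited option:

```m
// Intended to be behavior-equivalent to the step above: List.MatchesAny
// short-circuits as soon as one keyword from [Custom] is found in the
// accent-stripped comment text.
= Table.AddColumn(#"Added Custom", "Comentários Aderentes", each
    if List.MatchesAny([Custom],
        (x) => Text.Contains([texto_sem_acento], x, Comparer.OrdinalIgnoreCase))
    then 1 else 0)
```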

 

Maybe all this processing is slowing down the refresh?

