Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Register now to learn Fabric in free live sessions led by the best Microsoft experts. From Apr 16 to May 9, in English and Spanish.

Reply
marktait
Frequent Visitor

Calculated column in transform data is increasing the data refresh by 4 hours

Hi.

 

I have a Power BI dashboard, which refreshes data from two Excel spreadsheets, stored in a TEAMs space, one named [Devices] and the other named [Defender].

 

In PBI, in Transform Data/Power Query Editor, I have added 4 columns (same as the one below, with slightly different criteria) to the [Devices] table.

 

The [Devices] table has around 1000 entries, the [Defender] table, has around 30,000 rows.

 

The calculated column(s) look up each [Devices]"Device Name" field, and [Devices]"ExtractDate" field within the [Defender] table, and gives a count of how many rows match the two criteria (similar to a join in SQL).

 

However, just adding these 4 calculated columns, increases the refresh time of the dataset into Power BI, by around 4 hours.

 

Is there any way I can speed up the lookup/count between the two tables? (when using SQL Server, I used to add an index to specific columns used for joins - but I'm not sure how to do that in Power BI, or if it's even possible).

 

Thanks for any advice,

 

Mark

 

= Table.AddColumn(#"Add CVE Low Count", "CVE Critical", each Table.RowCount(Table.SelectRows(
#"Defender",
(row) => row[Device Name] = [Device Name]
and
row[ExtractDate] = [ExtractDate]
and
row[Severity] = "Critical"
)))

 

2 REPLIES 2
v-junyant-msft
Community Support
Community Support

Hi @marktait ,

What @lbendlin  said does work, you can try to optimize performance by modifying your M function.
You can also try to try to optimize both the data source and the refresh.
1. Add the four columns you need ahead of time in the data source, as the data source is definitely running under a better load than the performance in Power BI.
2. Based on your description, the database you are using seems to be SQL Server. then you can try to use incremental refresh when doing the refresh. If you're using Power BI Pro or Premium, you might benefit from setting up an incremental refresh policy for your dataset. This way, only data that has changed will be refreshed, which could significantly reduce the overall refresh time. However, this approach requires careful planning around how your data is partitioned and updated. 
Incremental refresh for semantic models and real-time data in Power BI - Power BI | Microsoft Learn

Best Regards,
Dino Tao
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

lbendlin
Super User
Super User

Read about Table.Buffer  and about the performance differences between the various merge/join options.

Read about Table.AddKey  and apply those as needed.

 

May want to read the article series by Chris Webb:  Chris Webb's BI Blog: Optimising The Performance Of Power Query Merges In Power BI, Part 1: Removing...

 

 

Helpful resources

Announcements
Microsoft Fabric Learn Together

Microsoft Fabric Learn Together

Covering the world! 9:00-10:30 AM Sydney, 4:00-5:30 PM CET (Paris/Berlin), 7:00-8:30 PM Mexico City

PBI_APRIL_CAROUSEL1

Power BI Monthly Update - April 2024

Check out the April 2024 Power BI update to learn about new features.

April Fabric Community Update

Fabric Community Update - April 2024

Find out what's new and trending in the Fabric Community.