Using local cache for table union on multiple tabl...

matt-stuart · ‎02-01-2018

We've connected one of our Power BI desktop instances to an ODataV4 feed that provides several entity sets, in this case we're only interested in 4 of them. All 4 share at least a few columns and we'd like to do a union of all 4 tables of the common columns into a new table that can then be used to power a few visualziations. What we're finding is that the merge table query, which I'll include below, goes back to the origin servers to grab _all_ the data from all 4 tables before doing the merge rather than using the data which should already be in Power Bi. I'm wondering if we're missing something here in our configuration or in how PowerBI works..

Technical:

Tables: TA, TB, TC, TD

Common Columns: Date (Date), Type (string)

Merge Table Query:

let
    CA=Table.SelectColumns(TA, {"Date", "Type"}),
    CB=Table.SelectColumns(TB, {"Date", "Type"}),
    CC=Table.SelectColumns(TC, {"Date", "Type"}),
    CD=Table.SelectColumns(TD, {"Date", "Type"}),
    AggregateTable=Table.Combine({CA, CB, CC, CD}),
    #"Grouped"=Table.Group(AggregateTable, {"Date", "Type"], {{"Count", each Table.RowCount(_), type number}}),
    #"Pivoted"=Table.Pivot(#"Grouped", List.Distinct(#"Grouped"[Type]), "Type", "Count", List.Sum)
in
    #"Pivoted"

Basically we want to end up with data that looks like:

Date | TypeValue1 | TypeValue2 | TypeValue3
--------------------------------------------------
2017-01-01 | 10 | 0 | 5
2017-01-01 | 3 | 2 | 7
2017-01-02 | 0 | 10 | 6
2017-01-03 | 1 | 8 | 0

The issue is each of the 4 tables contains millions of rows, we do the refresh often enough on those tables for our purposes but by god it goes and asks for all the data again one it calculates the aggregate table! Is there a behaviour we can use to force the aggregate to just pull the data from the on disk copies of the table?

Notes:

* This data is used primarily to drive visualizations, if there's other ways to get the line graphs we're looking for then by all means poke me in the eye!

* We'd be happy to look at aggregation severs that are responsible for pulling in the raw data and storing on premise / cloud that PowerBI would then pull from (Gateway perhaps?)

Thanks for your help & thoughts!

v-chuncz-msft · ‎02-04-2018

@matt-stuart,

You may try Aggregation Extension first.

Community Support Team _ Sam Zha
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

matt-stuart · ‎02-01-2018

We've connected one of our Power BI desktop instances to an ODataV4 feed that provides several entity sets, in this case we're only interested in 4 of them. All 4 share at least a few columns and we'd like to do a union of all 4 tables of the common columns into a new table that can then be used to power a few visualziations. What we're finding is that the merge table query, which I'll include below, goes back to the origin servers to grab _all_ the data from all 4 tables before doing the merge rather than using the data which should already be in Power Bi. I'm wondering if we're missing something here in our configuration or in how PowerBI works..