Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Compete to become Power BI Data Viz World Champion! First round ends August 18th. Get started.

Reply
jrop
Regular Visitor

Power BI Report Size Increase with Different Queries Despite Having the Same Amount of Data

Hello all,

I am in a data project with several reports. In some of them, I use a 'TableA' stored in a SQL Server Data Warehouse. On the other hand, I have another table 'TableB', which is created by performing some transformations to TableA thanks to an ETL process. I hope to be able to replicate this ETL job of TableB in PowerQuery and thus avoid storing this table that occupies 90% of my DB.

Through PowerQuery in Power BI, I have managed to replicate the ETL process that filled TableB but starting from TableA, that is to say that I have the same number of records and columns without using the TableB, which was the goal.

The problem comes with the storage of this report. Assuming that I have the same amount of data, for 2 months of information, the report that reads from TableB directly occupies 362 Mb, and on the contrary, the one that uses TableA to recreate TableB ends up weighing 599 Mb.

The query of the latter is much longer, is this what could be influencing the report to be heavier, or what could be the cause of the report being much heavier?

Below I post the two queries for better understanding.

With TableB:

 

 

 

 

let
    Source = Sql.Database(Server, Database),
    TableB = Source{[Schema="dbo", Item="TableB"]}[Data],
    #"INCREMENTAL REFRESH" = Table.SelectRows(TableB, each [Column1] > RangeStart and [Column1] <= RangeEnd),
    #"Removed Columns" = Table.RemoveColumns(#"INCREMENTAL REFRESH", {"Column2"}),
    #"Merged Queries" = Table.NestedJoin(#"Removed Columns", {"Key1"}, DimTableA, {"Key2"}, "DimTableA", JoinKind.LeftOuter),
    #"Expanded DimTableA" = Table.ExpandTableColumn(#"Merged Queries", "DimTableA", {"Column3", "Column4"}, {"DimTableA.Column3", "DimTableA.Column4"}),
    #"Inserted Date" = Table.AddColumn(#"Expanded DimTableA", "DateColumn", each DateTime.Date([DateTimeColumn]), type date),
    #"Renamed Columns" = Table.RenameColumns(#"Inserted Date", {{"DimTableA.Column3", "Month"}, {"DimTableA.Column4", "Weekday"}, {"DateColumn", "Date"}})
in
    #"Renamed Columns"

 

 

 

 

 


With TableA:

 

 

 

 

let
    // Connection to the database and loading the filtered table
    Source = Sql.Database(Server, Database),
    TableA = Source{[Schema="dbo", Item="TableA"]}[Data],
    FilteredData = Table.SelectRows(TableA, each [Column1] > RangeStart and [Column1] <= RangeEnd),
    DimTableBReduced = Table.SelectColumns(#"DimTableB", {"Key1", "Column2"}),

    // Join and expansion with DimTableB
    ExpandedDimTableB = Table.ExpandTableColumn(
        Table.NestedJoin(
            FilteredData, {"ForeignKey1"},
            DimTableBReduced, {"Key1"},
            "DimTableB", JoinKind.LeftOuter
        ),
        "DimTableB", {"Column2"}, {"Column2"}
    ),
    // Selection of necessary columns
    SelectedColumns = Table.SelectColumns(ExpandedDimTableB, {
        "Column3",
        "Column4",
        "Column5",
        "Column6",
        "Column7",
        "Column8",
        "Column9",
        "ForeignKey1",
        "Column10",
        "Column11",
        "Column12",
        "Column13",
        "Column14",
        "Column15",
        "Column16",
        "Column17",
        "Column18",
        "Column19",
        "Column20",
        "Column21",
        "Column22",
        "Column23",
        "Column24",
        "Column2"
    }),

    // Self join to get the next stop
    FilteredJoinedTable = Table.SelectRows(
        Table.ExpandTableColumn(
            Table.NestedJoin(
                SelectedColumns, {"ForeignKey2"},
                SelectedColumns, {"ForeignKey2"},
                "JoinedTable", JoinKind.Inner
            ),
            "JoinedTable", {
                "Column9", "ForeignKey1", "Column14", "Column17", "Column18", "Column2"
            }, {
                "Column9_R", "ForeignKey1_R", "Column14_R", "Column17_R", "Column18_R", "Column2_R"
            }
        ),
        each [Column2_R] > [Column2]
    ),
    // Final column selection and renaming
    RenamedResultTable = Table.RenameColumns(
        Table.SelectColumns(FilteredJoinedTable, {
            "Column3",
            "Column4",
            "Column5",
            "Column6",
            "Column7",
            "Column8",
            "Column9",
            "ForeignKey1",
            "Column9_R",
            "ForeignKey1_R",
            "Column10",
            "Column11",
            "Column13",
            "Column14_R",
            "Column15",
            "Column16",
            "Column22",
            "Column17_R",
            "Column18",
            "Column19",
            "Column20",
            "Column1",
            "Column23",
            "Column24"
        }),
        {
            {"Column9", "OriginKey"},
            {"ForeignKey1", "OriginForeignKey"},
            {"Column9_R", "DestinationKey"},
            {"ForeignKey1_R", "DestinationForeignKey"},
            {"Column14_R", "DestinationArrivalTime"},
            {"Column17_R", "DestinationActualArrivalTime"},
            {"Column18", "OriginActualDepartureTime"},
            {"Column15", "OriginDepartureTime"}
        }
    ),
    // Add date column and rename
    FinalRenamedColumns =
        Table.AddColumn(
            Table.ExpandTableColumn(
                Table.NestedJoin(
                    RenamedResultTable, {"Column13"},
                    Table.SelectColumns(
                        DimTableC, {
                            "Key3",
                            "Column25",
                            "Column26"
                        }
                    ), {"Key3"},
                    "DimTableC", JoinKind.LeftOuter
                ),
                "DimTableC", {"Column25", "Column26"},
                {"Month", "Weekday"}
            ), "OriginDepartureTime Date",
            each DateTime.Date([OriginActualDepartureTime]), type date
        )
in
    FinalRenamedColumns 

 

 

 

 

 




6 REPLIES 6
Anonymous
Not applicable

Your tableA query is so much heavier because of all of the joins. Each of those joins are adding a bunch of new data to your  query. Can't you do this without referring at all to tables?

 

--Nate

lbendlin
Super User
Super User

I use a 'TableA' stored in a SQL Server Data Warehouse.

You mean a SQL Server database?

 

I have another table 'TableB', which is created by performing some transformations to TableA thanks to an ETL process. I hope to be able to replicate this ETL job of TableB in PowerQuery and thus avoid storing this table that occupies 90% of my DB.

In general, performance on SQL Server is substantially better (sometimes orders of magnitude) than Power Query.

Storage is cheap.  Get some more storage on your database server, and run as much of the processing there.

 

(You could use a view to avoid materializing the TableB but that comes with the penalty that you can't spool the data any more.)

 

 

Yes, I currently have both table A and table B in SQL Server. These are written daily with a Talend process.
Table B is written using table A in a Talend job. The idea is just to be able to eliminate Table B from the Data Warehouse and the Talend job since they take up 90% of the storage in the Data Warehouse and a part of all the time of the ETL process.

That is why I have tried to replicate this job but in PowerQuery with the long query making use of table A directly. In the end, the resulting number of rows and columns is the same, but the file size is much larger, and I don't know the exact reason for this.

Table B is written using table A in a Talend job. 

Why?  Why not do that directly in SQL Server?

Because the idea is to be able to eliminate this Table B of the SQL Server Data Warehouse since it occupies 90% of the storage and I am trying to recreate it directly from Power Query.

You will find that Power Query will take much longer to create Table B. Go ahead, try it out.

Helpful resources

Announcements
August Power BI Update Carousel

Power BI Monthly Update - August 2025

Check out the August 2025 Power BI update to learn about new features.

August 2025 community update carousel

Fabric Community Update - August 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors