I have 2 queries that run in sequence that read from my Fabric data warehouse. The queries are different and should yield different results but when they run, they actually return the same data. Query #1: DECLARE @StartingDate DATE = DATEADD(YEAR, -10, DATETRUNC(year, GETDATE())); select * from rpt.vw_DimGLAccount where [GL Account Key] in (SELECT F.ExpenseGLAccountKey FROM gold.FactPayrollHistory F JOIN gold.DimDate D ON F.PeriodEndDateKey = D.DateKey WHERE D.CalendarDate >= @StartingDate); Returns 2365 rows as it should Query #2: DECLARE @StartingDate DATE = DATEADD(YEAR, -10, DATETRUNC(year, GETDATE())); select * from rpt.vw_DimGLAccount where [GL Account Key] in (SELECT F.LiabilityGLAccountKey FROM gold.FactPayrollHistory F JOIN gold.DimDate D ON F.PeriodEndDateKey = D.DateKey WHERE D.CalendarDate >= @StartingDate); Should return 67 rows but returns the same 2365 rows as the first query. If I run these in reverse order, when the second query returns the same results as the first one run. It looks like the DW optimizer thinks these 2 queries are the same. It doesn't recognize that the the queries have different column names in the subquery. If I change the "select *" in each query to a "select count(*)", the numbers returned are correct. When I then change the "count(*)" back to "*", it returns the same result for both. Also ... the process with this logic has been running successfully in production since May 2025. It is only in the last few weeks that this has been a problem. I've tested this with other keys that reference the same dimension and can consistently re-create the issue. When the second query runs, it notes that it is using cached results.
... View more