Hello everyone.
I'm currently working with a huge dataset with three important columns: [serial], the identifier of a machine; [status], a status code generated by the machine; and [timestamp], the date the machine generated the status. It looks like this:
Serial | Status | Timestamp |
1 | A | 01/06/2025 |
1 | B | 01/06/2025 |
2 | A | 01/06/2025 |
1 | A | 02/06/2025 |
2 | B | 02/06/2025 |
A machine can output several (different) statuses a day.
Then I created a calculated table with specific serials and specific statuses. Now I need a calculated column that tells how many consecutive days a machine has output the status. For the example above, the new table should look like this:
Serial | Status | Last occurrence | Consecutive days |
1 | A | 02/06/2025 | 2 |
2 | A | 01/06/2025 | 1 |
I managed to build [Last occurrence] properly.
The problem is that I can't create [Consecutive days], because the dataset I'm working with is too large: I always end up with "the resultset of a external query is over 1M" or "OLE or ODBC error" messages.
I need an optimized way to get this calculation done.
Hi @RomuloLima ,
Thanks for using Microsoft Fabric Community to post your query.
Calculated columns on large datasets in DirectQuery mode can often lead to performance issues or limits being hit. Specifically, when the logic involves row-by-row evaluations across millions of records, you might encounter errors such as:
"The resultset of a query to external data source has exceeded the maximum allowed size of '1000000' rows"
These occur because Analysis Services enforces a default limit of 1 million rows per intermediate query result in DirectQuery mode.
Reference: https://learn.microsoft.com/en-us/power-bi/connect-data/desktop-use-directquery#1-million-row-limit
While it's technically possible to increase this threshold in SSAS or Premium capacities by modifying the MaxIntermediateRowsetSize setting (in msmdsrv.ini), it’s important to note:
Even if you raise this threshold, large DirectQuery result sets can still lead to memory and performance issues. So it's usually better to redesign the logic to avoid returning such large rowsets in the first place.
To avoid these limitations and improve scalability, as mentioned by @Ritaf1983:
--Pre-aggregate data using Power Query (Remove Duplicates, Group By, etc.) during the load step.
--Create Import-mode aggregation tables for key columns like [Serial], [Status], or [Timestamp] and configure them to serve DirectQuery models.
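To illustrate the pre-aggregation idea itself (this is a plain-Python sketch of the logic, not Power Query M), the raw rows can be collapsed to one summary row per [Serial]/[Status] pair holding the last occurrence and the count of distinct days. The sample data mirrors the table in the question:

```python
from collections import defaultdict
from datetime import date

# Sample rows mirroring the question's table: (serial, status, timestamp)
rows = [
    (1, "A", date(2025, 6, 1)),
    (1, "B", date(2025, 6, 1)),
    (2, "A", date(2025, 6, 1)),
    (1, "A", date(2025, 6, 2)),
    (2, "B", date(2025, 6, 2)),
]

# Group By (Serial, Status): collect the distinct days each pair occurred
days = defaultdict(set)
for serial, status, ts in rows:
    days[(serial, status)].add(ts)

# One summary row per pair: far fewer rows than the raw data
summary = {key: (max(d), len(d)) for key, d in days.items()}
print(summary[(1, "A")])  # (datetime.date(2025, 6, 2), 2)
```

This is the same shape of reduction a Power Query Group By step (or a source-side SQL GROUP BY) performs before the data ever reaches the model.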
Also, as @Ritaf1983 suggested using measures, I recreated the scenario with a sample table of [Serial], [Status], and [Timestamp] values, as below.
DAX Measures used:
Last Occurrence =
CALCULATE (
    MAX ( 'Table'[Timestamp] ),
    FILTER (
        'Table',
        'Table'[Serial] = SELECTEDVALUE ( SerialStatus[Serial] )
            && 'Table'[Status] = SELECTEDVALUE ( SerialStatus[Status] )
    )
)
Consecutive Days =
VAR SelectedSerial = SELECTEDVALUE ( SerialStatus[Serial] )
VAR SelectedStatus = SELECTEDVALUE ( SerialStatus[Status] )
VAR DatesTable =
    CALCULATETABLE (
        VALUES ( 'Table'[Timestamp] ),
        'Table'[Serial] = SelectedSerial,
        'Table'[Status] = SelectedStatus
    )
VAR SortedDates =
    ADDCOLUMNS (
        DatesTable,
        "Index", RANKX ( DatesTable, 'Table'[Timestamp], , ASC ),
        "GroupKey", 'Table'[Timestamp] - RANKX ( DatesTable, 'Table'[Timestamp], , ASC )
    )
VAR Grouped =
    GROUPBY (
        SortedDates,
        [GroupKey],
        "GroupCount", COUNTX ( CURRENTGROUP (), [Timestamp] ),
        "MaxDate", MAXX ( CURRENTGROUP (), [Timestamp] )
    )
VAR LatestDate = [Last Occurrence]
VAR Result =
    MAXX (
        FILTER ( Grouped, [MaxDate] = LatestDate ),
        [GroupCount]
    )
RETURN
    Result
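The [GroupKey] column above is the classic gaps-and-islands trick: subtracting each date's rank from the date itself produces a value that stays constant within any unbroken run of consecutive days, so the run ending at the last occurrence can be counted. A minimal Python sketch of that same logic (the function name and data are illustrative, not part of the model):

```python
from datetime import date, timedelta

def consecutive_days(dates):
    """Length of the run of consecutive days ending at the latest date.

    Mirrors the DAX gaps-and-islands trick: date minus its rank is
    constant within any unbroken run of consecutive days.
    """
    ordered = sorted(set(dates))
    # GroupKey: each date minus its rank (as a day offset)
    keys = [d - timedelta(days=i) for i, d in enumerate(ordered)]
    last_key = keys[-1]          # the group containing the latest date
    return keys.count(last_key)  # size of that group

# Serial 1 emitted status A on 1 and 2 June -> run of 2 days
print(consecutive_days([date(2025, 6, 1), date(2025, 6, 2)]))  # 2
```

A run broken by a gap resets the key, so for 1, 3, and 4 June only the two-day run ending on the 4th is counted.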
Output (table visual):
Hope this helps. Please reach out for further assistance.
Please consider marking the helpful reply as Accepted Solution to assist others with similar issues.
Thank you.
Please find the attached .pbix for reference.
Hi @RomuloLima ,
Following up to see if your query is resolved and if any of the responses helped.
If you still need assistance, feel free to reach out.
Thank you.
Hi @RomuloLima ,
We wanted to kindly follow up regarding your query. If you need any further assistance, please reach out.
Thank you.
Hi @RomuloLima ,
Just wanted to check if your query is resolved and if the response provided was helpful. If further assistance is needed, please reach out.
Thank you.
Hi @RomuloLima
It’s not entirely clear why you need a calculated table for this.
If your goal is simply to display a result like the one shown in your screenshot (showing the last occurrence and count of consecutive days), you could use measures like:
MAX([Timestamp])
COUNTROWS(...)
These are very efficient and should work well even on large datasets.
However, if you’re trying to build an aggregated table for further logic and the calculated table fails due to performance/resource issues, here are some better alternatives:
1. Pre-aggregate the table at the data source (e.g., SQL) and load only the summary to Power BI.
2. Use a native SQL query during data import in Power BI to push the logic to the source system and reduce the volume upfront.
Docs: Import data using native database query
3. Apply aggregations using the Group By feature in Power Query (PQ).
Docs: Group by in Power Query
4. Use Power BI’s aggregate table capabilities.
Docs: Use aggregate tables in Power BI
If this post helps, then please consider Accepting it as the solution to help the other members find it more quickly.