gp10
Advocate III

Redshift and Gateway issue

Hi community,
I'm facing issues when trying to create a dataflow with Amazon Redshift data. The data that need to be imported are over 1 billion rows.
I have tried to connect both with the Amazon Redshift connector and ODBC.
The evaluation in Power Query is completed.
But the dataflow refreshes only when I select a really low number of days in the "Store rows from the past" field of the incremental refresh settings.
Whenever I try to store the entire date range, I get errors. This appears to be a gateway issue; are there any settings in the gateway that can help?
I have also noticed that when I try to refresh, the CPU of the VM where the gateway is installed instantly goes to 100% and is consumed by multiple Microsoft Mashup Evaluation Container processes.

I'm getting various errors; among others, this one:

;Parameter name: source [GatewayPipelineErrorCode]=DM_GWPipeline_Gateway_MashupDataAccessError [ErrorShortName]=GatewayClientErrorResponseException[GatewayId=1379774]/MashupDataAccessException[ErrorCode=-2147467259 HResult=-2147467259]/Wrapped(InternalMashupException)[ErrorCode=-2147467259 HResult=-2147467259]/Wrapped(ErrorException)[HResult=-2146233088]/Wrapped(ErrorException)[HResult=-2146233088]/Wrapped(ErrorException)[HResult=-2146233088]/Wrapped(ErrorException)[HResult=-2146233088]/Wrapped(ErrorException)[HResult=-2146233088]/Wrapped(ErrorException)[HResult=-2146233088]/Wrapped(ErrorException)[HResult=-2146233088]/Wrapped(ErrorException)[HResult=-2146233088]/Wrapped(ErrorException)[HResult=-2146233088] [ExceptionErrorShortName]=GatewayClientErrorResponseException[GatewayId=.....'. RootActivityId =........Param1 = <ccon>We're sorry an error occurred during evaluation.</ccon> [DM_ErrorDetailNameCode_UnderlyingErrorCode]=-2147467259 [DM_ErrorDetailNameCode_UnderlyingErrorMessage]=<ccon>We're sorry an error occurred during evaluation.</ccon> [DM_ErrorDetailNameCode_UnderlyingHResult]=-2147467259 [InnerType]=ErrorException Value cannot be null.

Error: Encountered user gateway exception: '[GatewayPipelineErrorCode]=DM_GWPipeline_Gateway_TimeoutError [ErrorShortName]=GatewayTimeoutException [ExceptionErrorShortName]=GatewayTimeoutException'. RootActivityId = e6cc17f9-ee3b-4697-b466-e2bfc290bd40.Param1 = [GatewayPipelineErrorCode]=DM_GWPipeline_Gateway_TimeoutError [ErrorShortName]=GatewayTimeoutException [ExceptionErrorShortName]=GatewayTimeoutException

Any ideas?


13 REPLIES
lbendlin
Super User

Data duplication issues are usually caused by incorrect application of the RangeStart and RangeEnd filters. One of them must be inclusive, the other must be exclusive, and they cannot be overlapping.

 

Also keep in mind that incremental refresh handles after-the-fact data changes really badly. If your data is transactional in nature, then you will need to mix the incremental refresh with occasional full refreshes.

Thanks for the info. My data does not change after the fact.
As for the RangeStart and RangeEnd filters I'm using the following:
RangeStart
1/1/0001 12:00:00 AM
RangeEnd
12/31/9999 11:59:59 PM
and I add a between filter. The pbix file has correct data. After publishing a filtered version, following the process with ALM Toolkit and SSMS, and removing the filter, the refreshed partitions, especially those of the last month, return wrong data. If it were a parameter issue, shouldn't it affect the entire date range?

Thanks again for your time, patience and quality support.

 

As for the RangeStart and RangeEnd filters I'm using the following:
RangeStart
1/1/0001 12:00:00 AM
RangeEnd
12/31/9999 11:59:59 PM
and I add a between filter.

Please don't do that. Do not use fake dates, and do not use a between filter: a between filter makes both endpoints inclusive, so a row whose timestamp falls exactly on a partition boundary would be loaded into two adjacent partitions and duplicated.

 

Instead, use a subset of your dates (just enough for development):

 

RangeStart
1/1/2022 12:00:00 AM
RangeEnd
1/1/2023 12:00:00 AM

 

and then in your Power Query use a filter

 

[Date]>=RangeStart and [Date]<RangeEnd
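Put together, the filter step in the query would look something like this (a minimal M sketch; the step name Source and the column name Date are placeholders for your own query):

// RangeStart is inclusive and RangeEnd is exclusive, so adjacent
// partitions line up back-to-back and no row is loaded twice.
FilteredRows = Table.SelectRows(
    Source,
    each [Date] >= RangeStart and [Date] < RangeEnd
)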

That was it.
Thanks so much @lbendlin, amazing support and guidance.

Ashrafsleem2020
Frequent Visitor

Same Issue

The operation is timed out.. The exception was raised by the IDataReader interface. Please review the error message and provider documentation for further information and corrective action. Table: beginning_balance.

Data source error:{"error":{"code":"DM_GWPipeline_Gateway_TimeoutError","pbi.error":{"code":"DM_GWPipeline_Gateway_TimeoutError","parameters":{},"details":[],"exceptionCulprit":1}}} 
Cluster URI: 
Activity ID: 
Request ID: 
Time:2023-03-30 10:37:34Z

lbendlin
Super User

ErrorException Value cannot be null

gaps in primary key columns?

 

What are your incremental refresh settings? Which datetime column are you using for the partitioning?

Hi @lbendlin,
no gaps in any primary key column. I have tried Remove Empty in Power Query but still get the same error.
Also, in other attempts I deleted the primary key column to reduce size, since it's not needed, but still no luck.

I have tried various incremental refresh settings, from trying to pull the entire dataset (for example, store data from the last 5 years and refresh the last 10 days) down to, say, the last 3 months with the last 5 days refreshed.

I'm trying to load a single table, and I'm using its datetime column.

Instead of a dataflow try using a dataset with incremental refresh.

The issue with a dataset, @lbendlin, is that incremental refresh doesn't seem to work with Amazon Redshift. Then the pbix file is so big that it can't be published. And if it is a publishable size, the first refresh again fails with similar errors from the gateway.

Then the pbix file is so big that it can't be published.

You can bootstrap the incremental refresh partitions to keep the original size under 10 GB. Then use XMLA refresh to fill each partition separately. This will then allow you to load datasets that are half the size of your capacity, and supposedly even beyond.
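For reference, once the dataset is published with empty (bootstrapped) partitions, each partition can be refreshed on its own over the XMLA endpoint, e.g. from an SSMS XMLA query window with a TMSL script along these lines (the database, table, and partition names below are placeholders for your own model):

{
  "refresh": {
    "type": "full",
    "objects": [
      {
        "database": "MyDataset",
        "table": "MyLargeTable",
        "partition": "2022Q1"
      }
    ]
  }
}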

Thanks for this suggestion @lbendlin, I have managed to do that with SSMS and ALM Toolkit.

The big issue now is that the data is not accurate. For some reason it's as if it duplicates the data of the last month, and there may be small differences elsewhere.

Any ideas on how to deal with this? I have refreshed several times and the issue persists.

Use DAX Studio or SSMS to check the partitions and their content. Refresh individual partitions as needed.
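For a quick check, the partition metadata (names, state, last refresh time) is exposed through the TMSCHEMA DMVs, which you can query from DAX Studio or an SSMS MDX query window connected to the dataset's XMLA endpoint:

SELECT * FROM $SYSTEM.TMSCHEMA_PARTITIONS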

I did that several times but the data is still wrong, especially in the last month, where it creates daily instead of monthly partitions. It looks like it returns double or more the actual rows.
Could this have to do with the Amazon Redshift Power BI connector?
