Error - Python into PBI

bbbbbiiiii · ‎06-11-2020

Hi all, I am trying to build a word cloud (using n-grams) from a free-text column using Python. I get the error below in Power BI (PBI), but the problem doesn't seem to be in the script we wrote. Can anyone help explain what the issue is?

DataSource.Error: ADO.NET: Python script error.

Traceback (most recent call last):

File "PythonScriptWrapper.PY", line 7, in <module>

dataset = pandas.read_csv('input_df_xxxxxxxxx.csv')

File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 676, in parser_f

return _read(filepath_or_buffer, kwds)

File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 454, in _read

data = parser.read(nrows)

File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 1133, in read

ret = self._engine.read(nrows)

File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 2037, in read

data = self._reader.read(nrows)

File "pandas\_libs\parsers.pyx", line 860, in pandas._libs.parsers.TextReader.read

File "pandas\_libs\parsers.pyx", line 875, in pandas._libs.parsers.TextReader._read_low_memory

File "pandas\_libs\parsers.pyx", line 929, in pandas._libs.parsers.TextReader._read_rows

File "pandas\_libs\parsers.pyx", line 916, in pandas._libs.parsers.TextReader._tokenize_rows

File "pandas\_libs\parsers.pyx", line 2071, in pandas._libs.parsers.raise_parser_error

pandas.errors.ParserError: Error tokenizing data. C error: Buffer overflow caught - possible malformed input file.

Details:

DataSourceKind=Python

DataSourcePath=Python

Message=Python script error.

Traceback (most recent call last):

File "PythonScriptWrapper.PY", line 7, in <module>

dataset = pandas.read_csv('input_df_XXXXXXXX.csv')

File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 676, in parser_f

return _read(filepath_or_buffer, kwds)

File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 454, in _read

data = parser.read(nrows)

File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 1133, in read

ret = self._engine.read(nrows)

File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 2037, in read

data = self._reader.read(nrows)

File "pandas\_libs\parsers.pyx", line 860, in pandas._libs.parsers.TextReader.read

File "pandas\_libs\parsers.pyx", line 875, in pandas._libs.parsers.TextReader._read_low_memory

File...

ErrorCode=-2147467259

ExceptionType=Microsoft.PowerBI.Scripting.Python.Exceptions.PythonScriptRuntimeException

lbendlin · ‎06-13-2020

As the error says, you are consuming too much memory. Try reducing the data set, or check whether your Python code has a runaway recursion somewhere.

bbbbbiiiii · ‎06-14-2020

Thank you lbendlin. I have actually reduced the data to 3 columns and fewer than 3,000 rows, but I guess it's the description column, which contains free text, that is eating up the memory. Thanks for sharing!

lbendlin · ‎06-14-2020

Power Query can tell a line feed inside a quoted field apart from a regular end-of-row line feed in a CSV file. This is especially important when your free-text column may contain line feeds. I'm not sure whether your pandas importer is equally smart.

So maybe import the CSV in Power Query and then run your Python script against the imported rows?
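
To illustrate the point about line feeds, here is a minimal sketch using only Python's standard `csv` module (not the Power BI wrapper and not pandas). The sample strings and the helper name `find_ragged_rows` are made up for illustration. It shows that a quoted line feed stays inside one record when a CSV parser reads it, while an unquoted line feed in free text splits a row and leaves a record with the wrong number of fields, which is the kind of malformed input a tokenizer may complain about.

```python
import csv
import io

# Hypothetical sample: the free-text field contains a line feed, properly quoted.
QUOTED = 'id,description\n1,"first line\nsecond line"\n2,plain text\n'

# Hypothetical sample: the same data, but the line feed is NOT quoted.
UNQUOTED = 'id,description\n1,first line\nsecond line\n2,plain text\n'


def find_ragged_rows(text, expected_fields):
    """Return (record_number, field_count) for every parsed record
    whose number of fields differs from expected_fields."""
    reader = csv.reader(io.StringIO(text, newline=""))
    return [
        (number, len(row))
        for number, row in enumerate(reader, start=1)
        if len(row) != expected_fields
    ]


# Splitting on line feeds sees 4 physical lines in the quoted sample ...
naive_lines = QUOTED.splitlines()

# ... but a CSV parser sees 3 records (header plus 2 rows).
records = list(csv.reader(io.StringIO(QUOTED, newline="")))

print(len(naive_lines), len(records))
print(find_ragged_rows(QUOTED, 2))    # no ragged records expected
print(find_ragged_rows(UNQUOTED, 2))  # the split row shows up here
```

Running a check like this on an exported copy of the data could help confirm whether the description column is the culprit before cleaning the line feeds in Power Query.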
