Hi all, I am trying to build a word cloud (using n-grams) from a free-text column using Python. I get the error below in Power BI, but the problem doesn't seem to be in the script we wrote. Can anyone help explain what the issue is?
DataSource.Error: ADO.NET: Python script error.
Traceback (most recent call last):
File "PythonScriptWrapper.PY", line 7, in <module>
dataset = pandas.read_csv('input_df_xxxxxxxxx.csv')
File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 676, in parser_f
return _read(filepath_or_buffer, kwds)
File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 454, in _read
data = parser.read(nrows)
File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 1133, in read
ret = self._engine.read(nrows)
File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 2037, in read
data = self._reader.read(nrows)
File "pandas\_libs\parsers.pyx", line 860, in pandas._libs.parsers.TextReader.read
File "pandas\_libs\parsers.pyx", line 875, in pandas._libs.parsers.TextReader._read_low_memory
File "pandas\_libs\parsers.pyx", line 929, in pandas._libs.parsers.TextReader._read_rows
File "pandas\_libs\parsers.pyx", line 916, in pandas._libs.parsers.TextReader._tokenize_rows
File "pandas\_libs\parsers.pyx", line 2071, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Buffer overflow caught - possible malformed input file.
Details:
DataSourceKind=Python
DataSourcePath=Python
Message=Python script error.
Traceback (most recent call last):
File "PythonScriptWrapper.PY", line 7, in <module>
dataset = pandas.read_csv('input_df_XXXXXXXX.csv')
File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 676, in parser_f
return _read(filepath_or_buffer, kwds)
File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 454, in _read
data = parser.read(nrows)
File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 1133, in read
ret = self._engine.read(nrows)
File "C:\USERS\XXXXF\APPDATA\LOCAL\CONTINUUM\ANACONDA3\lib\site-packages\pandas\io\parsers.py", line 2037, in read
data = self._reader.read(nrows)
File "pandas\_libs\parsers.pyx", line 860, in pandas._libs.parsers.TextReader.read
File "pandas\_libs\parsers.pyx", line 875, in pandas._libs.parsers.TextReader._read_low_memory
File...
ErrorCode=-2147467259
ExceptionType=Microsoft.PowerBI.Scripting.Python.Exceptions.PythonScriptRuntimeException
As the error says, you are consuming too much memory. Try reducing the data set, or check whether your Python code has a runaway recursion somewhere.
Thank you Ibendlin, I have actually reduced the data to 3 columns and fewer than 3000 rows, but I guess it's the description column, which contains free text, that is eating up the memory. Thanks for sharing!
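One way to test that guess is to ask pandas how much memory each column takes. This is only a minimal sketch: Power BI hands the data to the script as a DataFrame named `dataset`, but the small frame and the column names `id`, `category`, and `description` below are made-up stand-ins, not the real data.

```python
import pandas as pd

# Hypothetical stand-in for the `dataset` DataFrame that Power BI passes in.
dataset = pd.DataFrame({
    "id": [1, 2, 3],
    "category": ["a", "b", "a"],
    "description": ["short note", "a much longer free text entry " * 50, "x"],
})

# deep=True measures the string contents themselves, not just the references.
usage_bytes = dataset.memory_usage(deep=True, index=False)
largest_column = usage_bytes.idxmax()

print(usage_bytes.sort_values(ascending=False))
print("Largest column:", largest_column)
```

If the free-text column is only a few megabytes for under 3000 rows, memory is probably not the real cause and the file format is worth a closer look.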
Power Query can tell line feeds inside quoted fields apart from the line feeds that end a row in a "regular" CSV file. This is especially important when your free-text column may contain line feeds. I'm not sure whether your pandas importer is equally smart.
So maybe import the CSV in Power Query and then run your Python script against the imported rows?
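A rough sketch of what the script could do once the rows are already imported: collapse embedded line feeds in the text column, then count n-grams. The stand-in `dataset` frame, the `description` column name, and the helpers `normalize_text` and `bigram_counts` are all hypothetical, and bigrams are used only as an example of an n-gram size.

```python
import re
from collections import Counter

import pandas as pd

# Hypothetical stand-in for the `dataset` DataFrame that Power BI passes in.
dataset = pd.DataFrame({
    "description": [
        "Screen is broken\nscreen is cracked",
        "Battery drains fast\r\nscreen is broken",
        None,
    ]
})


def normalize_text(value):
    """Collapse line feeds and repeated whitespace; map missing values to ''."""
    if pd.isna(value):
        return ""
    return re.sub(r"\s+", " ", str(value)).strip().lower()


def bigram_counts(texts):
    """Count adjacent word pairs across all texts."""
    counts = Counter()
    for text in texts:
        words = text.split()
        counts.update(" ".join(pair) for pair in zip(words, words[1:]))
    return counts


cleaned = dataset["description"].map(normalize_text)
counts = bigram_counts(cleaned)

# One row per bigram, most frequent first.
ngrams = pd.DataFrame(counts.most_common(), columns=["ngram", "count"])
print(ngrams.head())
```

Cleaning the line feeds before the text reaches any CSV round trip removes the most likely source of malformed rows, whichever importer ends up reading the file.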