Solved: Python script: Error tokenizing data

dbarrera · ‎08-02-2019

After combining several Excel documents using Power Query, I tried to transform the result using a Python script, but received the following error:

"pandas.errors.ParserError: Error tokenizing data. C error: Expected 5 fields in line 3, saw 6"

I did a bit of investigation and found that the problem might be delimiters in the data or in the first row. The solutions I found online all rely on "pandas.read_csv()", either to specify the delimiter or to skip the header in the original data. (https://stackoverflow.com/questions/18039057/python-pandas-error-tokenizing-data) However, in Power BI I do not call "pandas.read_csv()" myself, since the data is already provided in "dataset".
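For reference, the two read_csv() fixes mentioned above look roughly like this when you do control the call yourself (for example, when testing outside Power BI). This is only a sketch: the sample text and the ';' delimiter are made-up assumptions, not the poster's data.

```python
import io

import pandas as pd

# Hypothetical sample: the values contain commas, so the file uses ';'
# as its delimiter instead.
raw = "name;city\nSmith, John;Lima\nDoe, Jane;Quito\n"

# Fix 1: tell pandas which delimiter the file really uses.
with_header = pd.read_csv(io.StringIO(raw), sep=";")

# Fix 2: treat the first row as data rather than as a header.
no_header = pd.read_csv(io.StringIO(raw), sep=";", header=None)

print(with_header.shape)
print(no_header.shape)
```

Neither option is available inside the Power BI Python step, because the read_csv() call there is not part of the script you write.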

I first tried my code in Anaconda and it worked fine. I would appreciate it if anyone could help me solve this issue, since I didn't find any workable solution online. Thanks in advance.

SteveCampbell · ‎08-07-2019

Hard to say without more info on the error.

I would try this: highlight all columns, go to Transform > Format > Clean, then do Transform > Format > Trim.

If this still does not work, try renaming each of the column headers (for a test, just A / B / C etc.) to see if that helps.

If it still errors, it would be useful to see the M code, which you can copy from Home > Advanced Editor.

Did I answer your question? Mark my post as a solution! Proud to be a Super User!

dbarrera · ‎08-08-2019

Thanks for all your help!

I don't understand why but I did all your recommendations together and it worked:

1st. Clean and trim example file before combination as you explained

2nd. Replace (,) for blank in example file before combination

3rd. Clean and trim combined table

4th. Change column name for A / B / C...

5th. Execute Python Script

Thank you very much!
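For anyone who wants to test the same cleanup outside Power BI, the steps above can be approximated in pandas. This is a rough sketch under stated assumptions: the sample table is hypothetical, and the regular expression only strips ASCII control characters, which may not match exactly what Power Query's Clean removes.

```python
import re
import string

import pandas as pd

# Hypothetical table standing in for the combined Excel data.
df = pd.DataFrame({
    "Name": ["  Smith, John ", "Doe\x07, Jane"],
    "City": [" Lima", "Quito\n"],
})

# Assumption: "Clean" is approximated by dropping ASCII control characters.
CONTROL_CHARS = re.compile(r"[\x00-\x1f]")


def clean_trim_no_commas(value):
    """Approximate Clean, then Trim, then replace commas with blank."""
    if not isinstance(value, str):
        return value  # leave numbers, dates, and missing values untouched
    value = CONTROL_CHARS.sub("", value)
    value = value.strip()
    return value.replace(",", "")


# Apply the function to every cell, column by column.
cleaned = df.apply(lambda col: col.map(clean_trim_no_commas))

# Rename the columns to A / B / C ... as suggested in the thread.
cleaned.columns = list(string.ascii_uppercase[: len(cleaned.columns)])

print(cleaned)
```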

SteveCampbell · ‎08-08-2019

Something in the data was causing the issue. Pandas often has problems with certain characters, so it may have been an invisible character or something else unusual.

It quite possibly had more than one issue, which would explain why it only works when all the steps are used together!

Glad it's working!!
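If you want to check whether a value really does contain an invisible character, a small helper like the one below can list them. It is only a sketch: the function name and the sample string are hypothetical, and it flags Unicode control (Cc) and format (Cf) characters only.

```python
import unicodedata


def find_suspicious_chars(text):
    """Return (index, code point, name) for control/format characters."""
    hits = []
    for index, char in enumerate(text):
        # Cc = control characters, Cf = format characters
        # (for example the zero-width space).
        if unicodedata.category(char) in ("Cc", "Cf"):
            name = unicodedata.name(char, "<unnamed>")
            hits.append((index, f"U+{ord(char):04X}", name))
    return hits


# Hypothetical cell value with a zero-width space and a vertical tab.
sample = "Total\u200b,Amount\x0b"
for hit in find_suspicious_chars(sample):
    print(hit)
```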

SteveCampbell · ‎08-02-2019

Could you give a bit more explanation, and maybe post some of the script along with a screenshot of the input data?

dbarrera · ‎08-05-2019

The code fails with something as simple as

df = dataset

The error that Power BI returns is

DataSource.Error: ADO.NET: Python script error...

pandas.errors.ParserError: Error tokenizing data. C error: Expected 5 fields in line 3, saw 6

I created an environment in Anaconda with Python 3.5, since I was having trouble using Python 3.6 or 3.7.

About the data input, I transformed and combined several Excel documents (using power query) that have the following structure:

I hope this gives a bit more information.

SteveCampbell · ‎08-06-2019

If you can post a screenshot of the table before the python step, that could help.

Power BI actually uses pandas.read_csv() to import the data into "dataset", although you cannot see this directly:

Annotation 2019-08-06 195356.png
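To illustrate why even "df = dataset" fails, here is a minimal sketch that reproduces the same message with read_csv() alone. The CSV text is a made-up assumption (the real intermediate file is not shown in this thread); the point is that the error is raised while loading, before any user code runs.

```python
import io

import pandas as pd

# Hypothetical CSV text: the header has 5 fields, but the second data row
# has 6 because one value contains an unquoted comma.
raw = (
    "A,B,C,D,E\n"
    "1,2,3,4,5\n"
    "1,2,3,4,five, with comma\n"
)

try:
    dataset = pd.read_csv(io.StringIO(raw))
    message = ""
except pd.errors.ParserError as err:
    # The load itself fails, so a later "df = dataset" is never reached.
    message = str(err)

print(message)
```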

My guess is that some of your fields contain commas. Because the data is loaded with read_csv, pandas treats each of those commas as a delimiter and splits the column incorrectly.
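A small sketch of that splitting effect, using Python's standard csv module and made-up sample rows: an unquoted comma inside a value produces an extra field, while the same value in quotes stays in one piece. Whether the file Power BI hands to pandas is quoted is not confirmed in this thread.

```python
import csv
import io

# Hypothetical rows: the same comment, once unquoted and once quoted.
unquoted = "id,comment\n1,late, but paid\n"
quoted = 'id,comment\n1,"late, but paid"\n'

unquoted_rows = list(csv.reader(io.StringIO(unquoted)))
quoted_rows = list(csv.reader(io.StringIO(quoted)))

# The header has 2 fields in both cases; compare the data rows.
print(len(unquoted_rows[1]), unquoted_rows[1])
print(len(quoted_rows[1]), quoted_rows[1])
```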

@dbarrera wrote:
About the data input, I transformed and combined several Excel documents (using power query) that have the following structure:

You could try adding a Find and Replace step that replaces all commas with something else (such as ; ). Do this as the first step on your files in Power Query, before you combine them.

dbarrera · ‎08-07-2019

Thanks for your reply. I did as you said.

@SteveCampbell wrote:
You could try adding a step of Find and Replace, and replace all commas with something else (such as ; ). Do this as a first step to your files in Power Query, before you combine them.

I tried replacing (,) with (;) and also with a blank, but neither worked.

Here is a screenshot of the table before the Python step:
