valcat27
Helper III

IndexError: index 1323169739 is out of bounds for axis 0 with size 1322985896

Hello all, 

 

I'm fairly new to Power BI and I'm having trouble running a Python script.

I get an error in Power Query when the script runs, and I haven't found a solution that works.

I have already worked with the latest version of Python (3.9), but at the moment I'm using version 3.6 because I suspect it may be more compatible with Power BI. The packages used in the code have also been installed and imported.

 

My code: 

import pandas as pd

pd.crosstab(index=dataset['clientID'], columns=dataset['productID'])

 

The error:

DataSource.Error: ADO.NET: Python script error.
Traceback (most recent call last):
  File "PythonScriptWrapper.PY", line 15, in <module>
    pd.crosstab(index=dataset['clientID'], columns=dataset['productID'])
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\pivot.py", line 577, in crosstab
    **kwargs
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\frame.py", line 6089, in pivot_table
    observed=observed,
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\pivot.py", line 127, in pivot_table
    table = agged.unstack(to_unstack)
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\frame.py", line 6393, in unstack
    return unstack(self, level, fill_value)
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 412, in unstack
    return _unstack_frame(obj, level, fill_value=fill_value)
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 442, in _unstack_frame
    constructor=obj._constructor,
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 142, in __init__
    self._make_selectors()
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 177, in _make_selectors
    mask.put(selector, True)
IndexError: index 1323169739 is out of bounds for axis 0 with size 1322985896

 

Details:
    DataSourceKind=Python
    DataSourcePath=Python
    Message=Python script error.
Traceback (most recent call last):
  File "PythonScriptWrapper.PY", line 15, in <module>
    pd.crosstab(index=dataset['clientID'], columns=dataset['productID'])
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\pivot.py", line 577, in crosstab
    **kwargs
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\frame.py", line 6089, in pivot_table
    observed=observed,
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\pivot.py", line 127, in pivot_table
    table = agged.unstack(to_unstack)
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\frame.py", line 6393, in unstack
    return unstack(self, level, fill_value)
  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 412, in un...
    ErrorCode=-2147467259
    ExceptionType=Microsoft.PowerBI.Scripting.Python.Exceptions.PythonScriptRuntimeException

 

 

I would appreciate it if someone could help me.

4 REPLIES
valcat27
Helper III

Thank you for your answer. 

If pandas crosstab has a limit, I cannot find it... 

Both columns have about 1 263 000 rows. clientID has almost 40 000 unique values and productID has more than 460 000 unique values.

So you should expect a crosstab with 18.4 billion cells.  A bit rich.
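The 18.4 billion figure can be reproduced with a quick calculation. This is only a rough sketch using the approximate counts quoted above, and it assumes one 8-byte integer per cell for a dense result; the variable names are illustrative. The cell count is also well beyond the signed 32-bit integer range, which may be related to the out-of-bounds index in the traceback, although that is not confirmed here.

```python
# Back-of-the-envelope size of a dense clientID x productID crosstab.
# The counts are the approximate figures quoted in this thread, not exact values.
n_clients = 40_000
n_products = 460_000

BYTES_PER_CELL = 8       # assumption: one int64 count per cell
INT32_MAX = 2**31 - 1    # largest signed 32-bit integer

cells = n_clients * n_products
dense_gib = cells * BYTES_PER_CELL / 1024**3

print(f"cells: {cells:,}")
print(f"dense int64 size: {dense_gib:,.1f} GiB")
print(f"exceeds int32 range: {cells > INT32_MAX}")
```

Under these assumptions a dense table would need well over 100 GiB, so sampling or aggregating the data before building the crosstab is probably unavoidable.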

I have already tried with a sample of the dataset and it seems to work. However, after the Python script runs, a decimal place is appended to the productID values: for example, "12345" becomes "12345.0". I tried to round the column in Python, but it reports that the values are strings, so I can't tell where the ".0" comes from. I think it happens before the crosstab function is applied, because the change is already present in the input dataset (created after running the script).
Also, when applying the Power Query changes it returns the error: Fail to save modifications on the server. Error returned: 'The SUM function only accepts column reference as the argument number 1. The '215965.0' column does not exist in the rowset.'
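One possible workaround for the ".0" suffix is to clean the IDs as text before building the crosstab. The sketch below is a guess at the situation, not a confirmed diagnosis: it assumes the IDs reach pandas as strings that already end in ".0" (for instance because the column was treated as a decimal number somewhere upstream). `strip_float_suffix` is a hypothetical helper, and the small `dataset` frame only stands in for the one Power BI passes to the script.

```python
import pandas as pd

def strip_float_suffix(ids: pd.Series) -> pd.Series:
    """Return the IDs as text with a trailing '.0' removed (hypothetical helper)."""
    return ids.astype(str).str.replace(r"\.0$", "", regex=True)

# Toy stand-in for the `dataset` DataFrame that Power BI provides.
dataset = pd.DataFrame({
    "clientID": ["a", "b", "a"],
    "productID": ["12345.0", "215965.0", "12345"],
})

dataset["productID"] = strip_float_suffix(dataset["productID"])
print(dataset["productID"].tolist())  # ['12345', '215965', '12345']
```

Setting the column type to Text in Power Query before the Python step might avoid the suffix in the first place, but that is untested here. Note also that a crosstab produces one column per product ID, so any later step that refers to a column by name (such as '215965.0') will break if the names change.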

lbendlin
Super User

What is the cardinality of your clientID and productID columns? It looks like you are hitting a pandas crosstab limit. You might want to try with a smaller dataset.
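Cardinality can be checked inside the script before calling crosstab. This is a minimal sketch: `crosstab_shape` is a hypothetical helper, and the toy `dataset` frame stands in for the one Power BI supplies.

```python
import pandas as pd

def crosstab_shape(df, index_col, columns_col):
    """Return (rows, columns, cells) a dense crosstab of the two columns would have."""
    rows = df[index_col].nunique()
    cols = df[columns_col].nunique()
    return rows, cols, rows * cols

# Toy stand-in for the `dataset` DataFrame that Power BI provides.
dataset = pd.DataFrame({
    "clientID": [1, 1, 2, 3],
    "productID": [10, 20, 10, 30],
})

rows, cols, cells = crosstab_shape(dataset, "clientID", "productID")
print(f"{rows} clients x {cols} products = {cells} cells")
```

If the cell count comes out in the billions, it is probably safer to filter or sample the data first than to attempt the full crosstab.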
