Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
valcat27
Helper III
Helper III

IndexError: index 1323169739 is out of bounds for axis 0 with size 1322985896

Hello all, 

 

I'm pretty new to Power BI and I'm facing some difficulties in running python script.

 

In fact, I'm getting an error in Power Query when running python script, for which I couldn't find a solution that works.

 

I have already worked with the latest version of python (3.9), but at this moment I'm working with version 3.6 because I suspect it may be more compatible with Power BI. Moreover, the packages used in the code have also been installed and imported.

 

My code: 

import pandas as pd

pd.crosstab(index=dataset[‘clientID’], columns=dataset[‘productID’])

 

The error:

DataSource.Error: ADO.NET: Python script error.

Traceback (most recent call last):

  File "PythonScriptWrapper.PY", line 15, in <module>

    pd.crosstab(index=dataset['clientID'], columns=dataset['productID'])

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\pivot.py", line 577, in crosstab

    **kwargs

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\frame.py", line 6089, in pivot_table

    observed=observed,

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\pivot.py", line 127, in pivot_table

    table = agged.unstack(to_unstack)

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\frame.py", line 6393, in unstack

    return unstack(self, level, fill_value)

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 412, in unstack

    return _unstack_frame(obj, level, fill_value=fill_value)

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 442, in _unstack_frame

    constructor=obj._constructor,

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 142, in __init__

    self._make_selectors()

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 177, in _make_selectors

    mask.put(selector, True)

IndexError: index 1323169739 is out of bounds for axis 0 with size 1322985896

 

Details:

    DataSourceKind=Python

    DataSourcePath=Python

    Message=Python script error.

Traceback (most recent call last):

  File "PythonScriptWrapper.PY", line 15, in <module>

    pd.crosstab(index=dataset['clientID'], columns=dataset['productID'])

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\pivot.py", line 577, in crosstab

    **kwargs

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\frame.py", line 6089, in pivot_table

    observed=observed,

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\pivot.py", line 127, in pivot_table

    table = agged.unstack(to_unstack)

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\frame.py", line 6393, in unstack

    return unstack(self, level, fill_value)

  File "C:\USERS\...\PYTHON36\lib\site-packages\pandas\core\reshape\reshape.py", line 412, in un...

    ErrorCode=-2147467259

    ExceptionType=Microsoft.PowerBI.Scripting.Python.Exceptions.PythonScriptRuntimeException

 

 

I would appreciate if someone could help me.

4 REPLIES 4
valcat27
Helper III
Helper III

Thank you for your answer. 

If pandas crosstab has a limit, I cannot find it... 

Both columns have about 1 263 000 rows. ClienteID has almost 40 000 unique values and ProductID has more than 460 000 unique values. 

So you should expect a crosstab with 18.4 billion cells.  A bit rich.

I have already tried with a sample of the dataset and I think it works. However, when running python script, a decimal place is added to "ProductID". For example, before I had "12345" and now "12345.0". I tried to round it in python but it says it is a string, so I cannot understand where it came from. I think it happens before applying crosstab function since in the input dataset (created after running the script), this change is already there.
Also, when applying the power query changes it returns the error: Fail to save modifications on the server. Error returned: 'The SUM function only accepts column reference as the argument number 1. The '215965.0' column does not exist in the rowset. ' .

lbendlin
Super User
Super User

What is the cardinality of your clientID and ProductID columns?  Looks like you are hitting a pandas crosstab limit.  Might want to try with a smaller dataset.

Helpful resources

Announcements
July 2024 Power BI Update

Power BI Monthly Update - July 2024

Check out the July 2024 Power BI update to learn about new features.

PBI_Carousel_NL_June

Fabric Community Update - June 2024

Get the latest Fabric updates from Build 2024, key Skills Challenge voucher deadlines, top blogs, forum posts, and product ideas.

Top Solution Authors
Top Kudoed Authors