Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Find everything you need to get certified on Fabric—skills challenges, live sessions, exam prep, role guidance, and more. Get started

Reply
jonoCX
Frequent Visitor

How to investigate Python script error

Hi

 

I am trying to write a simple script in Power Query Editor in PowerBI, and keep getting an error below:

 

DataSource.Error: ADO.NET: Python script error.
<pi>pandas.errors.MergeError: Data columns not unique: Index(['Date', 'Date'], dtype='object')
</pi>
Details:
DataSourceKind=Python
DataSourcePath=Python
Message=Python script error.
<pi>pandas.errors.MergeError: Data columns not unique: Index(['Date', 'Date'], dtype='object')
</pi>
ErrorCode=-2147467259

The python scripts works in Spyder, and I also created a separate PowerBI file to test the scripts on a more simple makeup dataset and it works! So my guess is that the actual data is the problem. How does one start to investigate where the potential data issues could cause above problem?

 

Below is the script if that helps.. with Excel import tables: [dataset=#"Replaced Errors", AcctList=AccountHierachy]

 

 

import pandas as pd
aList = AcctList.loc[AcctList["Type"]=="Account"]
volTbl = pd.DataFrame(columns=["Date","Port_12M_vol", "Bmk_12M_vol", "Port_12M_TE","Portfolio"])
for x in aList["PortfolioName"]:
    data = dataset.loc[dataset["PortfolioName"] == x,["MonthEnd","PortfolioReturn","Benchmark Returns", "Excess"] ]
    vol12M = data.rolling(12, min_periods = 12).std()*(12**0.5)
    vol12M["Date"] = data["MonthEnd"]
    vol12M=vol12M.dropna()
    vol12M.rename(columns = {"PortfolioReturn":"Port_12M_vol", "Benchmark Returns":"Bmk_12M_vol", "Excess":"Port_12M_TE","MonthEnd":"Date"}, inplace=True)
    vol12M["Portfolio"]=x
    vol12M = vol12M[["Date","Port_12M_vol", "Bmk_12M_vol", "Port_12M_TE","Portfolio"]]
    #print(x, vol12M.shape)
    vol36M = data.rolling(36, min_periods = 36).std()*(12**0.5)
    vol36M["Date"] = data["MonthEnd"]
    vol36M=vol36M.dropna()
    vol36M.rename(columns = {"PortfolioReturn":"Port_36M_vol", "Benchmark Returns":"Bmk_36M_vol", "Excess":"Port_36M_TE","MonthEnd":"Date"}, inplace=True)
    volM = pd.merge(vol12M,vol36M, how = 'left', sort=True)
    vol60M = data.rolling(60, min_periods = 60).std()*(12**0.5)
    vol60M["Date"] = data["MonthEnd"]
    vol60M=vol60M.dropna()
    vol60M.rename(columns = {"PortfolioReturn":"Port_60M_vol", "Benchmark Returns":"Bmk_60M_vol", "Excess":"Port_60M_TE","MonthEnd":"Date"}, inplace=True)
    volM = pd.merge(volM,vol60M, how = 'left', sort=True)
    volTbl = volTbl.append(volM, sort=True)
    volTbl = volTbl[["Date","Portfolio","Port_12M_vol", "Bmk_12M_vol", "Port_12M_TE","Port_36M_vol", "Bmk_36M_vol", "Port_36M_TE","Port_60M_vol", "Bmk_60M_vol", "Port_60M_TE"]]
print(volTbl)

 

 

I have been pulling my hair out on this one, so any help would be appreciated!!

 

Many thanks

Jonathan

3 REPLIES 3
jonoCX
Frequent Visitor

Hi

 

Thanks for your help and inputs.. I have no choice at the end, but go thru my data segment and segments and see which part of the data that may be causing the issue.

 

So the problem at the end is that is that a few of the "PortfolioName" (x in the loop) in aList does not match the dataset table "PortfolioName", hence producing some empty dataframe which hence cannot be merged in PowerBI. This name mismatch problem did not cause a problem in Python based application however, it must somehow ignore the empty dataframe, and continue the loop. 

 

Anyway, woud be nice to have a way to debug Python codes in PowerBI.. as it seems to behave slightly differently in PowerBI, and find it hard how to identify the issue - if is a data issue, how powerbi execute the python codes or something else....

 

Many thanks

 

Jonathan

v-yingjl
Community Support
Community Support

Hi @jonoCX ,

As the error message mentioned,

pandas.errors.MergeError: Data columns not unique: 


Please check whether there are multiple duplicated columns names 'Date' in your data source table when you create it by the python script.

 

Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.

YukiK
Impactful Individual
Impactful Individual

You may want to look into this documentation taking about how to use an external python IDE with power bi. You might be able to debug there, which I'm not entirely sure is possible. Use an external Python IDE with Power BI - Power BI | Microsoft Docs 

 

But it looks like it's more of an error that resides in your code regarding pandas merge operation. Since it's complaining about "Data columns not unique", I'm guessing there are more than 1 column in the table where the column name is "Date". So maybe consolidating that would help.

 

Hope that helps!

 

 

Helpful resources

Announcements
July 2024 Power BI Update

Power BI Monthly Update - July 2024

Check out the July 2024 Power BI update to learn about new features.

PBI_Carousel_NL_June

Fabric Community Update - June 2024

Get the latest Fabric updates from Build 2024, key Skills Challenge voucher deadlines, top blogs, forum posts, and product ideas.

Top Solution Authors
Top Kudoed Authors