March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Early bird discount ends December 31.
Register NowBe one of the first to start using Fabric Databases. View on-demand sessions with database experts and the Microsoft product team to learn just how easy it is to get started. Watch now
Hi
I am trying to write a simple script in Power Query Editor in PowerBI, and keep getting an error below:
DataSource.Error: ADO.NET: Python script error.
<pi>pandas.errors.MergeError: Data columns not unique: Index(['Date', 'Date'], dtype='object')
</pi>
Details:
DataSourceKind=Python
DataSourcePath=Python
Message=Python script error.
<pi>pandas.errors.MergeError: Data columns not unique: Index(['Date', 'Date'], dtype='object')
</pi>
ErrorCode=-2147467259
The python scripts works in Spyder, and I also created a separate PowerBI file to test the scripts on a more simple makeup dataset and it works! So my guess is that the actual data is the problem. How does one start to investigate where the potential data issues could cause above problem?
Below is the script if that helps.. with Excel import tables: [dataset=#"Replaced Errors", AcctList=AccountHierachy]
import pandas as pd
aList = AcctList.loc[AcctList["Type"]=="Account"]
volTbl = pd.DataFrame(columns=["Date","Port_12M_vol", "Bmk_12M_vol", "Port_12M_TE","Portfolio"])
for x in aList["PortfolioName"]:
data = dataset.loc[dataset["PortfolioName"] == x,["MonthEnd","PortfolioReturn","Benchmark Returns", "Excess"] ]
vol12M = data.rolling(12, min_periods = 12).std()*(12**0.5)
vol12M["Date"] = data["MonthEnd"]
vol12M=vol12M.dropna()
vol12M.rename(columns = {"PortfolioReturn":"Port_12M_vol", "Benchmark Returns":"Bmk_12M_vol", "Excess":"Port_12M_TE","MonthEnd":"Date"}, inplace=True)
vol12M["Portfolio"]=x
vol12M = vol12M[["Date","Port_12M_vol", "Bmk_12M_vol", "Port_12M_TE","Portfolio"]]
#print(x, vol12M.shape)
vol36M = data.rolling(36, min_periods = 36).std()*(12**0.5)
vol36M["Date"] = data["MonthEnd"]
vol36M=vol36M.dropna()
vol36M.rename(columns = {"PortfolioReturn":"Port_36M_vol", "Benchmark Returns":"Bmk_36M_vol", "Excess":"Port_36M_TE","MonthEnd":"Date"}, inplace=True)
volM = pd.merge(vol12M,vol36M, how = 'left', sort=True)
vol60M = data.rolling(60, min_periods = 60).std()*(12**0.5)
vol60M["Date"] = data["MonthEnd"]
vol60M=vol60M.dropna()
vol60M.rename(columns = {"PortfolioReturn":"Port_60M_vol", "Benchmark Returns":"Bmk_60M_vol", "Excess":"Port_60M_TE","MonthEnd":"Date"}, inplace=True)
volM = pd.merge(volM,vol60M, how = 'left', sort=True)
volTbl = volTbl.append(volM, sort=True)
volTbl = volTbl[["Date","Portfolio","Port_12M_vol", "Bmk_12M_vol", "Port_12M_TE","Port_36M_vol", "Bmk_36M_vol", "Port_36M_TE","Port_60M_vol", "Bmk_60M_vol", "Port_60M_TE"]]
print(volTbl)
I have been pulling my hair out on this one, so any help would be appreciated!!
Many thanks
Jonathan
Hi
Thanks for your help and inputs.. I have no choice at the end, but go thru my data segment and segments and see which part of the data that may be causing the issue.
So the problem at the end is that is that a few of the "PortfolioName" (x in the loop) in aList does not match the dataset table "PortfolioName", hence producing some empty dataframe which hence cannot be merged in PowerBI. This name mismatch problem did not cause a problem in Python based application however, it must somehow ignore the empty dataframe, and continue the loop.
Anyway, woud be nice to have a way to debug Python codes in PowerBI.. as it seems to behave slightly differently in PowerBI, and find it hard how to identify the issue - if is a data issue, how powerbi execute the python codes or something else....
Many thanks
Jonathan
Hi @jonoCX ,
As the error message mentioned,
pandas.errors.MergeError: Data columns not unique:
Please check whether there are multiple duplicated columns names 'Date' in your data source table when you create it by the python script.
Best Regards,
Community Support Team _ Yingjie Li
If this post helps, then please consider Accept it as the solution to help the other members find it more quickly.
You may want to look into this documentation taking about how to use an external python IDE with power bi. You might be able to debug there, which I'm not entirely sure is possible. Use an external Python IDE with Power BI - Power BI | Microsoft Docs
But it looks like it's more of an error that resides in your code regarding pandas merge operation. Since it's complaining about "Data columns not unique", I'm guessing there are more than 1 column in the table where the column name is "Date". So maybe consolidating that would help.
Hope that helps!
March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount!
User | Count |
---|---|
24 | |
13 | |
11 | |
10 | |
7 |
User | Count |
---|---|
43 | |
26 | |
18 | |
16 | |
11 |