Join us at FabCon Atlanta from March 16 - 20, 2026, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.
Register now!The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now! Learn more
Below is the code that I have got that is working but I have been trying to replace the local source excel files with Power Bi query tables which I have already created, called 'Contract Mating' for Contract and 'Contract Mating Invoiced Sales' for Credit.
import pandas as pd
import numpy as np
import os
from openpyxl import load_workbook
from datetime import datetime, date
# Set display options
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
# Change directory to the appropriate path
os.chdir(r'C:\Users\cosullivan\Documents\Contract Mating')
Contract_Mating_Listing = 'Return_list_27_June'
Test = 'Test'
Version = '_V4'
# Bring in the file that includes credit amounts
Credit = pd.read_excel(Test + '.xlsx', header=0, sheet_name='IT_Data')
# Bring in the file that includes contract_returns and mating recommendations
Contract = pd.read_excel(Contract_Mating_Listing + '.xlsx', header=0)
# Keep the herds that are in the credit file for testing purposes
Contracts = Contract[Contract['HERD_Dam'].isin(Credit['HERD_Dam'])].reset_index()
# Calculate the number of dams that received each sire recommendation per herd
Contract_dams_Herd_sire = Contracts.groupby(['HERD_Dam', 'Sire']).agg({'Dam': ['count']}).reset_index()
Contract_dams_Herd_sire.columns = ['HERD_Dam', 'Sire', 'No_of_Dams_Recommended_to_sire']
Contract_dams_Herd_sire['Max_number_of_straws_to_credit_per_sire'] = Contract_dams_Herd_sire['No_of_Dams_Recommended_to_sire'] * 2
# Calculate the number of dams recommended for matings in each herd
Contract_dams_herd = Contracts.groupby('HERD_Dam')['Dam'].nunique().reset_index()
Contract_dams_herd.columns = ['HERD_Dam', 'No_of_Dams_Contracted_to_herd']
Contract_dams_herd['Max_number_of_straws_to_credit_per_herd'] = Contract_dams_herd['No_of_Dams_Contracted_to_herd'] * 2
# Merge summary herd stats
Contract_summary = pd.merge(left=Contract_dams_Herd_sire, right=Contract_dams_herd, how='outer', left_on='HERD_Dam', right_on='HERD_Dam')
# Merge in herd stats with the sales figures from IT
Contract_Credit = pd.merge(left=Credit, right=Contract_summary, how='outer', left_on=['HERD_Dam', 'Sire'], right_on=['HERD_Dam', 'Sire'])
# Replace any sires who were contracted but have not been purchased to 0
Contract_Credit['Total DIY Quantity purchased'] = Contract_Credit['Total DIY Quantity purchased'].fillna(0)
# Replace any sires who were contracted but have no value to 0
Contract_Credit['Average cost per straw Amount'] = Contract_Credit['Average cost per straw Amount'].fillna(0)
# Calculate the maximum number of straws to credit based on purchases
def potential_straw_credit(x):
if x['Max_number_of_straws_to_credit_per_sire'] >= x['Total DIY Quantity purchased']:
return x['Total DIY Quantity purchased']
else:
return x['Max_number_of_straws_to_credit_per_sire']
Contract_Credit['Potential_Straws_to_credit'] = Contract_Credit.apply(potential_straw_credit, axis=1)
# Sort by the average cost of the straw per herd so that you end up crediting for the more expensive ones first
Contract_Credit = Contract_Credit.sort_values(by=['HERD_Dam', 'Average cost per straw Amount'], ascending=False).reset_index(drop=True)
# Cumulative sum the straws that were purchased off what needs to be credited
Contract_Credit['cumsum_herd'] = Contract_Credit.groupby(['HERD_Dam'])['Potential_Straws_to_credit'].cumsum()
Contract_Credit['previous'] = Contract_Credit.groupby(['HERD_Dam'])['cumsum_herd'].shift(1).fillna(0)
# Function to work out should the purchase be credited - any minus will be blanked as enough has been credited
def Cumsum(x):
if x['cumsum_herd'] <= x['Max_number_of_straws_to_credit_per_herd']:
return x['Potential_Straws_to_credit']
elif x['cumsum_herd'] > x['Potential_Straws_to_credit']:
return x['Potential_Straws_to_credit'] - x['previous']
else:
return 0
Contract_Credit['Straws_to_credit'] = Contract_Credit.apply(Cumsum, axis=1)
Contract_Credit['Straws_to_credit'] = np.where(Contract_Credit['Straws_to_credit'] < 0, 0, Contract_Credit['Straws_to_credit'])
# Calculate the amount to credit
Contract_Credit['Amount_to_credit'] = Contract_Credit['Straws_to_credit'] * Contract_Credit['Average cost per straw Amount']
# Summarize the total credit amount per herd
Contract_Credit_Per_Herd = Contract_Credit.groupby(['HERD_Dam']).agg({'Amount_to_credit': ['sum']}).reset_index()
Contract_Credit_Per_Herd.columns = ['HERD_Dam', 'Total_Credit_Amount (€)']
# Output final DataFrame to Power BI
dataset = Contract_Credit_Per_Herd
Solved! Go to Solution.
We ended up getting what we needed by recreating what the script was doing by just implementing the steps in Power BI but thanks anyway for the help!
I am getting the following error: Formula.Firewall: Query 'Contract Mating' (step 'Run Python script') references other queries or steps, so it may not directly access a data source. Please rebuild this data combination.
import pandas as pd
import numpy as np
data = dataset
# Split the combined dataset into 'Credit' and 'Contract' DataFrames based on the 'Source' column
Credit = data[data['Source'] == 'Contract Mating Invoiced Sales'].copy()
Contract = data[data['Source'] == 'Contract Mating'].copy()
# Bring back all the herds in the contract file
Contracts = Contract.reset_index(drop=True)
# Calculate the number of dams that received each sire recommendation per herd
Contract_dams_Herd_sire = Contracts.groupby(['HERD_Dam', 'Sire']).agg({'Dam': ['count']}).reset_index()
Contract_dams_Herd_sire.columns = ['HERD_Dam', 'Sire', 'No_of_Dams_Recommended_to_sire']
Contract_dams_Herd_sire['Max_number_of_straws_to_credit_per_sire'] = Contract_dams_Herd_sire['No_of_Dams_Recommended_to_sire'] * 2
# Calculate the number of dams recommended for matings in each herd
Contract_dams_herd = Contracts.groupby('HERD_Dam')['Dam'].nunique().reset_index()
Contract_dams_herd.columns = ['HERD_Dam', 'No_of_Dams_Contracted_to_herd']
Contract_dams_herd['Max_number_of_straws_to_credit_per_herd'] = Contract_dams_herd['No_of_Dams_Contracted_to_herd'] * 2
# Merge summary herd stats
Contract_summary = pd.merge(left=Contract_dams_Herd_sire, right=Contract_dams_herd, how='outer', left_on='HERD_Dam', right_on='HERD_Dam')
# Merge in herd stats with the sales figures from IT
Contract_Credit = pd.merge(left=Credit, right=Contract_summary, how='outer', left_on=['HERD_Dam', 'Sire'], right_on=['HERD_Dam', 'Sire'])
# Replace any sires who were contracted but have not been purchased to 0
Contract_Credit['Total DIY Quantity purchased'] = Contract_Credit['Total DIY Quantity purchased'].fillna(0)
# Replace any sires who were contracted but have no value to 0
Contract_Credit['Average cost per straw Amount'] = Contract_Credit['Average cost per straw Amount'].fillna(0)
# Calculate the maximum number of straws to credit based on purchases
def potential_straw_credit(x):
if x['Max_number_of_straws_to_credit_per_sire'] >= x['Total DIY Quantity purchased']:
return x['Total DIY Quantity purchased']
else:
return x['Max_number_of_straws_to_credit_per_sire']
Contract_Credit['Potential_Straws_to_credit'] = Contract_Credit.apply(potential_straw_credit, axis=1)
# Sort by the average cost of the straw per herd so that you end up crediting for the more expensive ones first
Contract_Credit = Contract_Credit.sort_values(by=['HERD_Dam', 'Average cost per straw Amount'], ascending=False).reset_index(drop=True)
# Cumulative sum the straws that were purchased off what needs to be credited
Contract_Credit['cumsum_herd'] = Contract_Credit.groupby(['HERD_Dam'])['Potential_Straws_to_credit'].cumsum()
Contract_Credit['previous'] = Contract_Credit.groupby(['HERD_Dam'])['cumsum_herd'].shift(1).fillna(0)
# Function to work out should the purchase be credited - any minus will be blanked as enough has been credited
def Cumsum(x):
if x['cumsum_herd'] <= x['Max_number_of_straws_to_credit_per_herd']:
return x['Potential_Straws_to_credit']
elif x['cumsum_herd'] > x['Potential_Straws_to_credit']:
return x['Potential_Straws_to_credit'] - x['previous']
else:
return 0
Contract_Credit['Straws_to_credit'] = Contract_Credit.apply(Cumsum, axis=1)
Contract_Credit['Straws_to_credit'] = np.where(Contract_Credit['Straws_to_credit'] < 0, 0, Contract_Credit['Straws_to_credit'])
# Calculate the amount to credit
Contract_Credit['Amount_to_credit'] = Contract_Credit['Straws_to_credit'] * Contract_Credit['Average cost per straw Amount']
# Summarize the total credit amount per herd
Contract_Credit_Per_Herd = Contract_Credit.groupby(['HERD_Dam']).agg({'Amount_to_credit': ['sum']}).reset_index()
Contract_Credit_Per_Herd.columns = ['HERD_Dam', 'Total_Credit_Amount (€)']
# Output final DataFrame to Power BI
result = Contract_Credit_Per_Herd
Please show the Power Query steps before and after the Python script step.
CombinedData Table
CombinedData Table
Contract Mating Invoiced Sales Table
Contract Mating Table
CombinedData Table
What's the source of the CombinedData table? You may have to run the script in each of the individual sources.
We ended up getting what we needed by recreating what the script was doing by just implementing the steps in Power BI but thanks anyway for the help!
Hi,@COS019970
We are very glad to know that the issue has been resolved. If you wish, consider accepting your solution as a solution that will also benefit other community members who have the same problem as you and find a solution faster.
Of course, if there is anything else we can do for you, please do not hesitate to contact us.
Looking forward to your reply.
Best Regards,
Leroy Lu
Thanks for your sharing, but I have a question
How can I change your data source of my report with "DirectQuery connection" on service by local source (may be code), due to I want to reuse my report on local.
Thanks your response!
You need to reference a prior step in your Power Query that acts as a sort of dataframe. And you are required to return something at the end, ideally in a similar table format.
Thanks for your reply that is what chatgpt was telling me to do and was trying to follow it but was not working. Could you tell me how to do this please?
how to do what? There is a standard process where Power Query uses one of its connectors to acquire the raw data from an initial source. You seem to be attempting to circumvent that process and use your own data acquisition. What made you choose that approach?
The Power BI Data Visualization World Championships is back! Get ahead of the game and start preparing now!
| User | Count |
|---|---|
| 40 | |
| 37 | |
| 33 | |
| 29 | |
| 27 |
| User | Count |
|---|---|
| 134 | |
| 104 | |
| 63 | |
| 60 | |
| 55 |