Power BI is turning 10! Tune in for a special live episode on July 24 with behind-the-scenes stories, product evolution highlights, and a sneak peek at what’s in store for the future.
Save the dateJoin us for an expert-led overview of the tools and concepts you'll need to become a Certified Power BI Data Analyst and pass exam PL-300. Register now.
Hi Experts,
I am trying the PoC of a project via loading data from excel file(just taking excel as an example.) and transform in python dataframe, then load to Power bi. . It runs well in PyDev. But when running in Powerbi, the columns used for groupby are missing. Please could I have your help?
Thanks!
Environment:
Windows 10
Powerbi desktop Version: 2.108.997.0 64-bit (August 2022)
Python3.7
Matplotlib 3.21
Numpy 1.18.4
Pandas 1.0.1
Python codes:
import pandas as pd
import openpyxl as opxls
#===============Read Workgroup Mapping==========
df_orginal=pd.read_excel(open('C:\local\Porject\Configuration\groupby_output_test.xls', 'rb'), sheet_name='groupby_output_test')
print(df_orginal)
print('====================1================================')
df_groupby_date=df_orginal.groupby(by=['Date']).sum()
print(df_groupby_date)
print('======================2==============================')
df_groupby_date_sales=df_orginal.groupby(by=['Date', 'Sales']).sum()
print(df_groupby_date_sales)
print("---------------end------------")
PyDev (groupby) result:
Power BI result
Before groupby, looks fine.
After Groupby[Date]), Date column missing
After Groupby[Date] and [Sales]), these 2 columns missing
Look at your Python output. It is pivoted which results in the column names for Date and Sales moving down a row. When Power Query then tries to read the data it cannot find the column names.
Feed the raw data to Power Query, not the processed data.
By the way, none of this requires Python. It will be much faster with native functions.
Thanks lbendlin. Yes I noticed it. I think it is due to how Power BI interpretes the dataframe which is pivtol. I did export the groupbyed dataframe to csv. the format is correct. For the above scenario, your are right, no need of python. but the project might have more complex logics, for example , the order of different queries, and the results criterias matching. I had a post before, but didn't seem to be possible just in PQ or PBI domians.
The current approach is to use Python to proscess the data and load to PBI for visualization.
Any other better solution?
Thanks!
Power Query and VBA do not mix well. They have very different philosophies. I assume you have seen this advice
the BackgroundQuery=True must be set. otherwise, if it set to False the afterfresh event will not be triggered
Check out the July 2025 Power BI update to learn about new features.
This is your chance to engage directly with the engineering team behind Fabric and Power BI. Share your experiences and shape the future.
User | Count |
---|---|
63 | |
59 | |
56 | |
38 | |
29 |
User | Count |
---|---|
82 | |
62 | |
45 | |
41 | |
40 |