Hello
I've recently started using Python when building queries for some more advanced data manipulation. However, it seems like Python (more specifically the Pandas library) automatically assumes the text value "NA" should be interpreted as NaN (i.e. a null value). This is problematic, for example, when working with country ISO codes, where "NA" represents Namibia. Example:
Original data:
Python script (empty, i.e. no manipulations taking place):
= Python.Execute("",[dataset=Source])
Resulting data:
From some quick Internet research, this appears to be the default behaviour of Pandas when it reads data in, and it can be avoided by supplying an additional argument (such as keep_default_na=False) when creating the DataFrame (see e.g. https://stackoverflow.com/questions/16596188/pandas-convert-na-to-nan). However, since Power BI creates the DataFrame before the script is executed, I can't prevent this unintended type conversion.
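To illustrate the behaviour outside Power BI, here is a minimal sketch using plain Pandas. It assumes the data reaches Pandas as CSV text (I have not confirmed how Power BI hands the table over), and the column names and sample rows are made up for the example.

```python
import io

import pandas as pd

# Made-up sample data: "NA" is the ISO code for Namibia.
csv_text = "country,iso2\nNamibia,NA\nNorway,NO\n"

# Default parsing: the string "NA" is in Pandas' default list of
# missing-value markers, so it is read as NaN.
default_df = pd.read_csv(io.StringIO(csv_text))

# keep_default_na=False switches that default list off,
# so "NA" is kept as ordinary text.
literal_df = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

print(default_df["iso2"].isna().tolist())
print(literal_df["iso2"].tolist())
```

With the default settings the Namibia row comes back as missing, while the second call keeps the literal text. The catch is that inside Python.Execute the DataFrame already exists when the script starts, so there is no obvious place to pass this argument.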
Has anyone encountered this problem before, or can anyone provide input on how to circumvent it?
Thanks in advance.
Hi @Anonymous
A temporary workaround you could try:
Replace the value "NA" with another specific value before executing the Python script in Power BI.
After importing the data in Power BI, use the Query Editor to replace the value (Query Editor -> Transform -> Replace Values).
Best Regards
Maggie
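If the original code is needed again inside the script, the placeholder could be swapped back once the DataFrame exists. This is only a sketch: the placeholder "__NA__", the column name "iso2", and the helper restore_placeholder are hypothetical, and the small DataFrame stands in for the dataset variable that Power BI provides.

```python
import pandas as pd

# Hypothetical placeholder written by the Replace Values step in the Query Editor.
PLACEHOLDER = "__NA__"


def restore_placeholder(df, column, placeholder=PLACEHOLDER, original="NA"):
    """Return a copy of df with the placeholder in `column` turned back into `original`."""
    out = df.copy()
    # Exact-match replacement; other values in the column are left alone.
    out[column] = out[column].replace(placeholder, original)
    return out


# Stand-in for the DataFrame Power BI passes to the script as `dataset`.
dataset = pd.DataFrame(
    {"country": ["Namibia", "Norway"], "iso2": [PLACEHOLDER, "NO"]}
)

restored = restore_placeholder(dataset, "iso2")
print(restored["iso2"].tolist())
```

Whether the restored "NA" survives the trip back from Python into Power Query is not something this thread confirms, so it may be safer to keep the placeholder in the script's output and reverse the replacement with a second Replace Values step after the Python step.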
Thanks for your response!
This does seem to work, but it is of course a bit of a hack, so I hope a more elegant solution could be found in the future. It does seem like Python/Pandas is the culprit here, but ideally Power BI would provide an option to bypass this strange behaviour.
Should I do anything more to report this, or is this post enough?