Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Don't miss out! 2025 Microsoft Fabric Community Conference, March 31 - April 2, Las Vegas, Nevada. Use code MSCUST for a $150 discount. Prices go up February 11th. Register now.

Reply
travelsandbooks
Frequent Visitor

PowerQuery: python script converts unrelated text columns to int

I have a table with a 19-digit numerical ID column, formatted as text, and another text column.

 

When I run a Python script in PowerQuery which manipulates the data in the text column, I've noticed that the ID column gets auto-converted to numbers - even when I don't reference that column in my script.

 

Initially my ID column looks like this:

travelsandbooks_0-1666352543705.png

 

Then I run the script and it looks like this:

travelsandbooks_3-1666352719191.png

 

I noticed that after running the script, PowerQuery automatically inserts a step that converts the ID column to Int64 - but even when I delete that step and prevent that happening, the Python script itself still converts the column to this:

travelsandbooks_2-1666352679224.png

 

How can I prevent the Python script from doing anything to my ID column, please?

1 ACCEPTED SOLUTION
travelsandbooks
Frequent Visitor

The solution to this was that the ID column, despite being formatted as text in the data model, was not formatted as text properly.

When I tried to make it the key column of the table, PowerBI told me that it had nulls, and therefore it couldn't be used.

I removed the null, and hey presto - the Python script worked as it should.

View solution in original post

3 REPLIES 3
travelsandbooks
Frequent Visitor

The solution to this was that the ID column, despite being formatted as text in the data model, was not formatted as text properly.

When I tried to make it the key column of the table, PowerBI told me that it had nulls, and therefore it couldn't be used.

I removed the null, and hey presto - the Python script worked as it should.

lbendlin
Super User
Super User

Don't include that column in your dataframe?

 

Maybe you can show a sanitized version of your Power Query code.

When you write a Python script, the entire dataset is passed in as "dataset". I only actually do work on the text column - I don't mention the ID column at all. That's why this chewing up of that column is so strange.

 

I could create a new table with only the text column, and do the work on that - but then joining back to the OG table using the long text string as a key doesn't feel like best practice.

 

Here is my code:

import requests 
import pandas as pd

def fix(x):
     try:
          return (requests.head(x,allow_redirects=True,timeout=1)).url
     except:
          return "Not available"

dataset['urls2'] = dataset['urls'].apply(lambda x: fix(x))

 

Helpful resources

Announcements
Las Vegas 2025

Join us at the Microsoft Fabric Community Conference

March 31 - April 2, 2025, in Las Vegas, Nevada. Use code MSCUST for a $150 discount! Prices go up Feb. 11th.

Jan25PBI_Carousel

Power BI Monthly Update - January 2025

Check out the January 2025 Power BI update to learn about new features in Reporting, Modeling, and Data Connectivity.

Jan NL Carousel

Fabric Community Update - January 2025

Find out what's new and trending in the Fabric community.