Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI Data Visualization World Championships is back! It's time to submit your entry. Live now!

Reply
travelsandbooks
Frequent Visitor

PowerQuery: python script converts unrelated text columns to int

I have a table with a 19-digit numerical ID column, formatted as text, and another text column.

 

When I run a Python script in PowerQuery which manipulates the data in the text column, I've noticed that the ID column gets auto-converted to numbers - even when I don't reference that column in my script.

 

Initially my ID column looks like this:

travelsandbooks_0-1666352543705.png

 

Then I run the script and it looks like this:

travelsandbooks_3-1666352719191.png

 

I noticed that after running the script, PowerQuery automatically inserts a step that converts the ID column to Int64 - but even when I delete that step and prevent that happening, the Python script itself still converts the column to this:

travelsandbooks_2-1666352679224.png

 

How can I prevent the Python script from doing anything to my ID column, please?

1 ACCEPTED SOLUTION
travelsandbooks
Frequent Visitor

The solution to this was that the ID column, despite being formatted as text in the data model, was not formatted as text properly.

When I tried to make it the key column of the table, PowerBI told me that it had nulls, and therefore it couldn't be used.

I removed the null, and hey presto - the Python script worked as it should.

View solution in original post

3 REPLIES 3
travelsandbooks
Frequent Visitor

The solution to this was that the ID column, despite being formatted as text in the data model, was not formatted as text properly.

When I tried to make it the key column of the table, PowerBI told me that it had nulls, and therefore it couldn't be used.

I removed the null, and hey presto - the Python script worked as it should.

lbendlin
Super User
Super User

Don't include that column in your dataframe?

 

Maybe you can show a sanitized version of your Power Query code.

When you write a Python script, the entire dataset is passed in as "dataset". I only actually do work on the text column - I don't mention the ID column at all. That's why this chewing up of that column is so strange.

 

I could create a new table with only the text column, and do the work on that - but then joining back to the OG table using the long text string as a key doesn't feel like best practice.

 

Here is my code:

import requests 
import pandas as pd

def fix(x):
     try:
          return (requests.head(x,allow_redirects=True,timeout=1)).url
     except:
          return "Not available"

dataset['urls2'] = dataset['urls'].apply(lambda x: fix(x))

 

Helpful resources

Announcements
FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.