Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

The Power BI DataViz World Championships are on! With four chances to enter, you could win a spot in the LIVE Grand Finale in Las Vegas. Show off your skills.

Reply
travelsandbooks
Frequent Visitor

PowerQuery: python script converts unrelated text columns to int

I have a table with a 19-digit numerical ID column, formatted as text, and another text column.

 

When I run a Python script in PowerQuery which manipulates the data in the text column, I've noticed that the ID column gets auto-converted to numbers - even when I don't reference that column in my script.

 

Initially my ID column looks like this:

travelsandbooks_0-1666352543705.png

 

Then I run the script and it looks like this:

travelsandbooks_3-1666352719191.png

 

I noticed that after running the script, PowerQuery automatically inserts a step that converts the ID column to Int64 - but even when I delete that step and prevent that happening, the Python script itself still converts the column to this:

travelsandbooks_2-1666352679224.png

 

How can I prevent the Python script from doing anything to my ID column, please?

1 ACCEPTED SOLUTION
travelsandbooks
Frequent Visitor

The solution to this was that the ID column, despite being formatted as text in the data model, was not formatted as text properly.

When I tried to make it the key column of the table, PowerBI told me that it had nulls, and therefore it couldn't be used.

I removed the null, and hey presto - the Python script worked as it should.

View solution in original post

3 REPLIES 3
travelsandbooks
Frequent Visitor

The solution to this was that the ID column, despite being formatted as text in the data model, was not formatted as text properly.

When I tried to make it the key column of the table, PowerBI told me that it had nulls, and therefore it couldn't be used.

I removed the null, and hey presto - the Python script worked as it should.

lbendlin
Super User
Super User

Don't include that column in your dataframe?

 

Maybe you can show a sanitized version of your Power Query code.

When you write a Python script, the entire dataset is passed in as "dataset". I only actually do work on the text column - I don't mention the ID column at all. That's why this chewing up of that column is so strange.

 

I could create a new table with only the text column, and do the work on that - but then joining back to the OG table using the long text string as a key doesn't feel like best practice.

 

Here is my code:

import requests 
import pandas as pd

def fix(x):
     try:
          return (requests.head(x,allow_redirects=True,timeout=1)).url
     except:
          return "Not available"

dataset['urls2'] = dataset['urls'].apply(lambda x: fix(x))

 

Helpful resources

Announcements
Feb2025 Sticker Challenge

Join our Community Sticker Challenge 2025

If you love stickers, then you will definitely want to check out our Community Sticker Challenge!

Jan NL Carousel

Fabric Community Update - January 2025

Find out what's new and trending in the Fabric community.

Top Solution Authors
Top Kudoed Authors