pbi_is_ok
New Member

AutoML Data Type Error Preventing All Runs

I am trying to use AutoML for the first time (definitely a beginner with machine learning), but I can't even run the basic commands given to me because I run into the same data type error every single time I try a new AutoML run. 

Here is the cell that always causes the error:

from sklearn.pipeline import Pipeline
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer


time_col = "Time"
ts_col = X.pop(time_col)
X.insert(0, time_col, ts_col.apply(lambda x: np.datetime64(x, "ns")))

# convert object type to nearest dtype
X = X.convert_dtypes()
X = X.dropna(axis=1, how='all')

# select columns for model training
X = X.select_dtypes(include=['number', 'datetime', 'category'])

from sklearn.model_selection import train_test_split

# You may need to update the test_size based on your scenario
X_train, X_test = train_test_split(X, test_size=int(X.shape[0] / 1 * 0.2) * 1, shuffle=False, random_state=41)

mean_features, median_features, mode_features = [], [], []
 
preprocessor, all_features, datetime_features = create_fillna_processor(X_train, mean_features, median_features, mode_features)
X_train = fillna(X_train, preprocessor, all_features, datetime_features)
X_test = fillna(X_test, preprocessor, all_features, datetime_features)
 
y_train = X_train.pop(target_col)
y_test = X_test.pop(target_col)

display(X_train[:10])
The error message every time is:

Cell In[19], line 8, in <lambda>(x)
      6 time_col = "Time"
      7 ts_col = X.pop(time_col)
----> 8 X.insert(0, time_col, ts_col.apply(lambda x: np.datetime64(x, "ns")))
     10 # convert object type to nearest dtype
     11 X = X.convert_dtypes()

TypeError: 'float' object cannot be interpreted as an integer

I am using two columns for my runs, Unit Sales (prediction column) and Time (time column). Time is a date data type, and Unit Sales is a double, or an integer if I change it manually. Neither column is a float data type, and yet every single time I try to run AutoML, it throws the same error at this step and I can't complete the run.
 
Why am I getting this error when neither column is a float data type, and how do I fix it?
1 ACCEPTED SOLUTION

Hi,

 

np.datetime64(nan, 'ns') will raise the exact error you're seeing, because NumPy is trying to interpret nan (a float) as a timestamp and failing.
You can run these debugging steps to check for nulls:

 

print(X[time_col].head())
print(X[time_col].apply(type).value_counts())
print(X[time_col].isnull().sum())

 

This will tell you what data types you actually have in that column.
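
As a rough illustration of what those checks can reveal, here is a minimal sketch on made-up data (the sample frame and the helper names n_float, null_count, and bad_rows are assumptions, not taken from the original notebook). It shows how a column that looks like dates can still hold a float: a missing value in an object column is stored as NaN, which is a Python float.

```python
import datetime as dt

import numpy as np
import pandas as pd

time_col = "Time"

# Hypothetical sample data: two real dates and one missing value.
X = pd.DataFrame({
    time_col: [dt.date(2024, 1, 1), dt.date(2024, 1, 2), np.nan],
    "Unit Sales": [10.0, 12.0, 9.0],
})

# Count the Python types actually stored in the column.
type_counts = X[time_col].apply(type).value_counts()
print(type_counts)

# Count missing values and how many entries are plain floats (NaN is a float).
null_count = int(X[time_col].isnull().sum())
n_float = int(X[time_col].apply(lambda v: isinstance(v, float)).sum())
print("nulls:", null_count, "floats:", n_float)

# Show the offending rows so they can be inspected at the source.
bad_rows = X[X[time_col].apply(lambda v: isinstance(v, float))]
print(bad_rows)
```

If the float count is above zero, those rows are the likely trigger for the error, even when the source column is typed as a date upstream.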

 

Then convert the column with:

 

X[time_col] = pd.to_datetime(X[time_col], errors='coerce')
Make sure NumPy and pandas are imported first:

 

import numpy as np
import pandas as pd
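
One possible placement, sketched under the assumption that the pd.to_datetime call is meant to replace the np.datetime64 lambda in the failing cell (the thread does not confirm this, and the sample frame below is invented for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the AutoML input frame.
X = pd.DataFrame({
    "Time": ["2024-01-01", "2024-01-02", np.nan],
    "Unit Sales": [10, 12, 9],
})

time_col = "Time"
ts_col = X.pop(time_col)

# Assumed replacement for:
#   X.insert(0, time_col, ts_col.apply(lambda x: np.datetime64(x, "ns")))
# pd.to_datetime converts the whole column at once and, with
# errors="coerce", turns values it cannot parse into NaT instead of raising.
X.insert(0, time_col, pd.to_datetime(ts_col, errors="coerce"))

print(X.dtypes)
print(X[time_col].isna().sum(), "value(s) became NaT")
```

Note that if the original cell already failed once, X.pop has removed the Time column from X, so the cell that loads X would need to be rerun before trying this.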

 

Hope this helps!
If the response has addressed your query, please Accept it as a solution and give a 'Kudos' so other members can easily find it.
Thank You!

6 REPLIES 6
v-prasare
Community Support

@pbi_is_ok As we haven’t heard back from you, we wanted to follow up and check whether the solution provided resolved your issue. Please let us know if you need any further assistance.

Thanks,

Prashanth Are

MS Fabric community support

If this post helps, please consider accepting it as the solution so other members can find it more quickly, and give Kudos if it helped resolve your query.

v-prasare
Community Support

Hi @pbi_is_ok,

 

The error you're seeing happens because some entries in your "Time" column are either missing or not in a valid date format. When Python tries to convert these invalid values into a date using np.datetime64(x, "ns"), it fails, because it can't turn something like NaN (a float) into a datetime.

To fix this, use pd.to_datetime() instead, which is a safer way to convert values to datetime in pandas. With errors='coerce', it handles bad values by turning them into NaT (Not a Time). After that, remove any rows where the date is invalid before continuing with your machine learning steps. This ensures you're only working with clean, usable date values.
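
The coerce-then-drop approach described above might look like the following minimal sketch. The sample values are invented, and the names n_invalid and X_clean are illustrative assumptions rather than part of the AutoML-generated notebook.

```python
import pandas as pd

time_col = "Time"

# Hypothetical data with one unparseable string and one missing value.
X = pd.DataFrame({
    time_col: ["2024-01-01", "2024-01-02", "not a date", None],
    "Unit Sales": [10, 12, 9, 11],
})

# Convert to datetime; anything that cannot be parsed becomes NaT.
X[time_col] = pd.to_datetime(X[time_col], errors="coerce")

# Report how many rows are invalid before discarding them.
n_invalid = int(X[time_col].isna().sum())
print(f"{n_invalid} row(s) have no usable date")

# Keep only rows with a valid timestamp.
X_clean = X.dropna(subset=[time_col]).reset_index(drop=True)
print(X_clean)
```

Printing the invalid count first makes it easier to judge whether dropping rows is acceptable or whether the source data needs fixing instead.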

 

 

Thanks,

Prashanth Are

Hello,

 

I can confirm with 100% certainty that all of the values within the column are proper date data types and there are no null or missing values, so I don't think that is the issue. Also, it is ambiguous as to which command/cell the suggested code is supposed to be applied to.
