Hi all,
I'm currently working through this Fabric tutorial on YouTube (https://www.youtube.com/watch?v=HLVkIsZvgzE).
Uploading the parquet files into the Lakehouse was no problem. The next step is to convert them into tables. For some of the files this works fine; however, others have spaces in the column names, which causes an error, so they cannot be converted directly into tables. For these, the video shows a method using Dataflow Gen2 to rename the columns.
After setting up the Dataflow Gen2 and publishing it as shown in the video at 23:00, I get the following error:
Data_Store_1 parquet_WriteToDataDestination: Mashup Exception Data Format Error Couldn't refresh the entity because of an issue with the mashup document MashupException.Error: DataFormat.Error: Error in replacing table's content with new data in a version: #{0}., InnerException: We can't insert values of type '#{0}' into column '#{1}' because it expects values of type '#{2}'., Underlying error: We can't insert values of type 'DateTime' into column 'Open_Date' because it expects values of type 'Date'. Details: Reason = DataFormat.Error;Message = We can't insert values of type 'DateTime' into column 'Open_Date' because it expects values of type 'Date'.;Message.Format = We can't insert values of type '#{0}' into column '#{1}' because it expects values of type '#{2}'.;Message.Parameters = {"DateTime", "Open_Date", "Date"};Microsoft.Data.Mashup.Error.Context = User
In the YouTube tutorial this error does not show up, so does anyone know how I can fix this?
Kind regards
I have the same issue, please help me resolve it.
The data type is the same in both places, the Lakehouse table and the Gen2 dataflow:
We can't insert values of type 'DateTime' into column 'Date' because it expects values of type 'Date'.
What should I do in that case?
Of course type DateTime is not the same as type "date". You need to explicitly convert type "datetime" to type "date", which will drop the time component of type "datetime".
FYI type "time" does not exist in lakehouses. But you can still store time information by using the lakehouse type "timestamp", which will store date and time info. In DFgen2 the equivalent type is called "datetime".
So if your lakehouse column is of type Date, but also needs to hold time information, then you need to change the column type to "timestamp".
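If you instead want to keep the lakehouse column as plain Date, the time portion can be dropped with an explicit cast. Here is a minimal notebook sketch of that direction, using placeholder table and column names:
from pyspark.sql.functions import col
from pyspark.sql.types import DateType

# Placeholder names - point these at your own lakehouse table and column
df = spark.table("your_lakehouse.your_table")

# Casting DateTime/Timestamp values to Date keeps the date and drops the time component
df_dates = df.withColumn("your_date_column", col("your_date_column").cast(DateType()))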
Here is just an example of what I am doing with type mapping when using a Copy data activity in a pipeline. This shows the destination types of a lakehouse that I use as a sink. As you can see, the sink (lakehouse destination) type is set to "timestamp" to store incoming SQL Server data of type "datetime2".
But if a pipeline is not an option, you will have to drop down to a PySpark Notebook and run some code to effect a type change on your lakehouse column, like so:
from pyspark.sql.functions import col
from pyspark.sql.types import TimestampType

# 1. Specify your table
catalog_table = "your_catalog.your_schema.your_table"  # e.g. "LH_name.raw_events"

# 2. Load the Delta table as a DataFrame
df = spark.table(catalog_table)

# 3. Cast the column from Date to Timestamp
df_converted = df.withColumn(
    "your_date_column",
    col("your_date_column").cast(TimestampType())
)

# 4. Overwrite the existing table with the new schema
df_converted.write.format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .saveAsTable(catalog_table)
Best practices
Backup / dev test: Try this first on a development copy of your table to validate downstream queries.
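For example, a dev copy can be taken as a one-off snapshot before touching the real table (the table names below are just placeholders):
# Placeholder names - point these at your own lakehouse and table
source_table = "your_lakehouse.your_table"
dev_copy = "your_lakehouse.your_table_dev_copy"

# Snapshot the current table into a separate Delta table for testing
spark.table(source_table).write.format("delta").mode("overwrite").saveAsTable(dev_copy)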
Partitions: If your table is partitioned, include the same partition columns in your rewrite so the table keeps its partition layout. For example:
df_converted.write \
    .format("delta") \
    .mode("overwrite") \
    .option("overwriteSchema", "true") \
    .partitionBy("year", "month") \
    .saveAsTable(catalog_table)
Zero-downtime pattern: If you need to avoid interrupting readers, write to a temporary table, then swap names:
# Write the converted data to a temporary table first
df_converted.write.format("delta").mode("overwrite").saveAsTable("your_table_tmp")
# Then swap the names
spark.sql("DROP TABLE your_table")
spark.sql("ALTER TABLE your_table_tmp RENAME TO your_table")
That’s all you need to recast a date column to a full timestamp/datetime in Fabric’s Delta Lakehouse via PySpark.
Hope this helps.
If the source query is staged, try disabling staging as I suggested in my comment of 03-21-2024.
Hi @Janica123,
can you please try disabling staging for the query which has the data destination (based on your initial comment it should be Data Store_1 parquet) and then re-publish your dataflow?
As a reminder, you can disable staging by right-clicking on the query, and unchecking the "Enable staging" menu item.
Thanks,
Alessandro
I haven't done the tutorial, but I did get a similar error when trying to extract from a table even though the source and target tables had the same type. The dataset import appears to sometimes guess the wrong type, so you have to manually override what it says - in your case, change the datatype of the "Open_Date" column.
Thanks for your reply, I already tried to override the date type manually in the Dataflow but it didn't change anything...
Hi @Janica123
Thanks for using Microsoft Fabric Community.
Apologies for the inconvenience that you are facing here and for the delay in response from my end.
Please reach out to our support team to gain deeper insights and explore potential solutions. If it's a bug, we would definitely like to know about it and address it properly. Their expertise will be invaluable in suggesting the most appropriate approach.
Please go ahead and raise a support ticket to reach our support team:
https://support.fabric.microsoft.com/support
After creating a support ticket, please provide the ticket number, as it would help us track it and get more information.
Thank you.
To which table does this column 'Open_Date' belong? Have you tried changing its type from 'datetime' to 'date' using the Editor GUI?
Can you go back to the Data Destination settings and show a screenshot of the column and data types mappings that show in there for the table?
Thanks for your reply!
Here is a screenshot of the column and data type mappings for the table.
Remark: The columns in my source originally have spaces. In the Data Destination settings Fabric automatically suggests fixing this, since spaces are not allowed in the destination. That's the reason for the information at the top.
This is strange... I remember having the same issue once, and I fixed it by opening the Advanced Editor and adding {"DATE", type date} to the parameter list of Table.TransformColumnTypes(), like so:
#"Changed column type" = Table.TransformColumnTypes(
dbo_RWS_15M,
{
{"DATE", type date},
...
Go to the workspace list view, find the dataflow, click the 3 dots and get the ".json" file for the dataflow. In there you can check the Mashup script and see if there's any reference made to a datetime - that could be causing the issue.
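For example, after downloading the file, a quick way to scan it is something like this (the file name below is just a placeholder for whatever you downloaded):
from pathlib import Path

# Placeholder file name - use the .json file you exported from the workspace list view
text = Path("dataflow-definition.json").read_text(encoding="utf-8")

# Print every line of the exported definition that mentions a datetime type
for line_no, line in enumerate(text.splitlines(), start=1):
    if "datetime" in line.lower():
        print(line_no, line.strip())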
Hi @Janica123
We haven't heard from you since the last response and were just checking back to see if you have a resolution yet. If you do, please share it with the community, as it can be helpful to others.
Otherwise, please respond back with more details and we will try to help.
Thanks.