Hi,
I maintain several lakehouses and, due to issues with the deployment pipelines, tend to apply schema changes through scripts, using a multi-step process to retain the data in the table.
For Step 1, to be able to base the new table on the existing table I need to identify the existing data types for all existing fields, and it appears there is no reliable way of doing this. The main issue relates to Char and Varchar fields, as there appears to be no current method for determining the defined character length.
I have tried various methods on the lakehouse, but those always show the fields to be String, with no maximum size.
I have also tried querying INFORMATION_SCHEMA.COLUMNS through the SQL endpoint, and the problem here is that the value for CHARACTER_MAXIMUM_LENGTH appears to be 4 times the actual defined maximum number of characters, up to a maximum of 8000. For example, a character length of 100 is shown as 400, 1000 is shown as 4000, and 2000 and higher are always shown as 8000.
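For reference, this is roughly the query involved; a minimal sketch using pyodbc against the SQL analytics endpoint, where the server, database and table names are placeholders and interactive Entra sign-in is assumed:

import pyodbc

# Placeholder connection details for the lakehouse SQL analytics endpoint.
conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<your-sql-endpoint>;"
    "Database=<your-lakehouse>;"
    "Authentication=ActiveDirectoryInteractive;"
)

cursor = conn.cursor()
cursor.execute("""
    SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_NAME = '<your-table>'
""")
for row in cursor.fetchall():
    # CHARACTER_MAXIMUM_LENGTH comes back inflated (4x, capped at 8000),
    # so it cannot be used to recover the defined varchar length.
    print(row.COLUMN_NAME, row.DATA_TYPE, row.CHARACTER_MAXIMUM_LENGTH)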
Does anyone know of a reliable way of generating a create statement for an existing lakehouse table?
Hi @JonBFabric,
What is a lakehouse? - Microsoft Fabric | Microsoft Learn
Table utility commands | Delta Lake
Thanks. Great to get the explanation as to what is happening and why. But going back to the original question...
Is it possible to identify the SQL statement used to originally create a table? I'm getting the impression that the answer is no. And given that the maximum record length that can be handled by the SQL endpoint is 8060 bytes, those character limits are crucial and need to be tightly controlled.
Hi @JonBFabric , Thank you for reaching out to the Microsoft Community Forum.
Fabric lakehouse cannot return the original CREATE TABLE statement because that information is never stored. Delta Lake only keeps a structural schema in its transaction log and all string columns are recorded simply as string without any notion of the original CHAR(n) or VARCHAR(n) definitions. Because the lakehouse storage layer does not preserve fixed length constraints, there is no system level metadata you can query later to recover them.
The SQL analytics endpoint also cannot help because it exposes a compatibility projection rather than the real underlying schema. Its inflated character lengths, including the 4× multiplier and the cap at 8000, are generated by the endpoint itself and do not represent the actual table definition or any original DDL. That surface is designed for querying, not schema reconstruction, and therefore does not retain the information you are looking for.
Given these constraints, there is no reliable way to extract the exact SQL used to create a lakehouse table after the fact. If those character limits matter for downstream SQL workloads, the only viable approach is to rebuild the DDL by measuring actual data lengths in the table and defining controlled column sizes going forward. For future tables, the only dependable method is to version control the DDL at creation time or store it explicitly as metadata, because the platform does not preserve it automatically.
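As a rough illustration of the "measure actual data lengths" approach, a minimal PySpark sketch (the table name is a placeholder) that reports the observed maximum character length per string column, which can then be rounded up to a controlled VARCHAR size:

from pyspark.sql import functions as F

# Placeholder table name; run in a Fabric Spark notebook with the lakehouse attached.
df = spark.table("<your-table>")

# All columns that Spark surfaces as plain strings.
string_cols = [f.name for f in df.schema.fields if f.dataType.simpleString() == "string"]

# Observed maximum character length per string column.
max_lengths = df.select(
    [F.max(F.length(F.col(c))).alias(c) for c in string_cols]
).collect()[0].asDict()

for col_name, observed in max_lengths.items():
    print(f"{col_name}: observed max length {observed}")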
What Is Data Warehousing in Microsoft Fabric? - Microsoft Fabric | Microsoft Learn
What is a lakehouse? - Microsoft Fabric | Microsoft Learn
Data Types in Fabric Data Warehouse - Microsoft Fabric | Microsoft Learn
What is the SQL analytics endpoint for a lakehouse? - Microsoft Fabric | Microsoft Learn
Whilst I understand everything that you are saying, I would like to describe another scenario which suggests that some of the above is not actually correct.
Using only SparkSQL in a notebook I have created a lakehouse table with a single varchar(10) field. If I try to insert any value with more than 10 characters I get the following error:
[DELTA_EXCEED_CHAR_VARCHAR_LIMIT] Exceeds char/varchar type length limitation. Failed check: (isnull('String) OR (length('String) <= 10)).
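For reference, a minimal repro along these lines, run from a Fabric Spark notebook (the table and column names are illustrative):

# Create a Delta table with a declared varchar(10) column.
spark.sql("CREATE TABLE demo_varchar_limit (code VARCHAR(10)) USING DELTA")

# Succeeds: value is within the declared limit.
spark.sql("INSERT INTO demo_varchar_limit VALUES ('short')")

# Fails with DELTA_EXCEED_CHAR_VARCHAR_LIMIT: value is longer than 10 characters.
spark.sql("INSERT INTO demo_varchar_limit VALUES ('this value is too long')")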
Obviously something in the Spark engine or the Delta table metadata is storing the size restriction.
Hi @JonBFabric , Thank you for reaching out to the Microsoft Community Forum.
Yes, Spark/Delta can record and enforce CHAR(n) / VARCHAR(n) when a table is created through Spark or other Delta-aware APIs; the engine stores that constraint in the Delta metadata and will reject writes that exceed the declared width (hence the DELTA_EXCEED_CHAR_VARCHAR_LIMIT error). The authoritative place to get that declaration is the Spark/Delta surface (for example, run SHOW CREATE TABLE or DESCRIBE TABLE EXTENDED in a Spark notebook or read the Delta transaction log/Delta Table API). Those commands return the DDL/metadata that Spark/Delta actually enforces.
Do not rely on the Fabric SQL analytics endpoint or INFORMATION_SCHEMA alone to recover declared widths. Those surfaces present a T-SQL compatibility projection that can inflate, cap or otherwise transform reported lengths (the 4×/8000 behaviour you saw) and therefore are not a trustworthy source of the originally declared Spark sizes. If you cannot run Spark against the table, your fallback is to inspect the _delta_log or compute observed maximum character/byte lengths and reconstruct conservative VARCHAR widths, and for long-term safety you must version-control the DDL or persist it as table metadata at creation time.
Good Morning,
I'm not looking for a way to access the metadata through the SQL endpoint, whether by using INFORMATION_SCHEMA or any other objects/functions; I just used it as an example of the only place that displayed anything other than string. I have already tried both SHOW CREATE TABLE and DESCRIBE TABLE EXTENDED: the first is not supported by Fabric ([DELTA_OPERATION_NOT_ALLOWED] Operation not allowed: `SHOW CREATE TABLE` is not supported for Delta tables) and the second only shows string.
Please could you provide me with an example of how to access the delta metadata responsible for enforcing the DELTA_EXCEED_CHAR_VARCHAR_LIMIT error. It doesn't need to be pretty.
Thanks again
Hi @JonBFabric , Thank you for reaching out to the Microsoft Community Forum.
If Fabric is blocking SHOW CREATE TABLE and DESCRIBE TABLE EXTENDED only shows string, the next step is to read the Delta metadata directly through Spark, because the enforcement you are seeing (DELTA_EXCEED_CHAR_VARCHAR_LIMIT) comes from the schema stored in the Delta transaction log, not from the SQL endpoint. The length constraint is kept in the Delta log under metaData.schemaString and Spark/Delta will surface it correctly when you query the table through the Delta APIs. The simplest approach is to use a Spark notebook and load the table with DeltaTable.forPath(...).toDF(), which will show VarcharType(n) in the schema if the table was created with varchar(n). If that surface is not available, you can read the latest commit JSON in the _delta_log folder and print the metaData.schemaString field; that text contains the exact schema Spark is enforcing, including declared lengths.
Spark example you can run in a Fabric notebook to retrieve the metadata responsible for the enforcement:
from delta.tables import DeltaTable
import json

table_path = "/lakehouses/<your-lakehouse>/Tables/<your-table>"  # update this

dt = DeltaTable.forPath(spark, table_path)
print(dt.toDF().schema)  # shows VarcharType(n) if declared

log_dir = f"{table_path}/_delta_log"
files = [f.path for f in dbutils.fs.ls(log_dir) if f.name.endswith(".json")]
latest = sorted(files)[-1]

content = dbutils.fs.head(latest, 500000)
commit = json.loads(content)
print(commit.get("metaData", {}).get("schemaString"))
This will show you the exact schema stored in Delta and the constraint that triggers the length violation error. If the table is large and uses checkpoint parquet files, the same field appears in the checkpoint’s metaData struct. In short, the SQL endpoint cannot return the declared widths, but Spark and the Delta log always can.
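As a rough sketch of the checkpoint case, reusing table_path from the example above (the checkpoint file name is a placeholder), the same schemaString can be read from the checkpoint parquet:

# Delta checkpoints are parquet files in _delta_log; the metaData struct carries schemaString.
checkpoint = spark.read.parquet(f"{table_path}/_delta_log/<checkpoint-file>.checkpoint.parquet")
meta = checkpoint.filter("metaData IS NOT NULL").select("metaData.schemaString").collect()
if meta:
    print(meta[0]["schemaString"])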
Explore the lakehouse data with a notebook - Microsoft Fabric | Microsoft Learn
Data Types in Fabric Data Warehouse - Microsoft Fabric | Microsoft Learn
Delta Lake Logs in Warehouse - Microsoft Fabric | Microsoft Learn
Thanks.
This didn't quite work out of the box, possibly because it was originally written for Databricks rather than Fabric, but I have got it working. I will explain the differences as I go:
Hi @JonBFabric , Thanks for the update and the insights on how to solve this issue. We really appreciate it.
If you have any queries, please feel free to create a new post; we are always happy to help.
One final update on this.
The logfile containing the schema is not necessarily the most recent, and there is not necessarily only one version of the schema. There is a schema associated with every modification made to the table structure, be that the original creation or a subsequent alteration. Consequently, the logfile we need is the most recent one that contains a schema.
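For anyone landing on this later, a rough sketch of that approach in a Fabric notebook, assuming mssparkutils is available in place of dbutils and that the path is a placeholder: walk the _delta_log JSON commits from newest to oldest and take the first one that carries a metaData action.

import json

# Placeholder path; update for your lakehouse and table.
table_path = "/lakehouses/<your-lakehouse>/Tables/<your-table>"
log_dir = f"{table_path}/_delta_log"

# mssparkutils is assumed here as the Fabric counterpart of dbutils.
json_files = sorted(
    f.path for f in mssparkutils.fs.ls(log_dir) if f.name.endswith(".json")
)

schema_string = None
# Newest commit first; each commit file is JSON lines, one action per line.
for path in reversed(json_files):
    content = mssparkutils.fs.head(path, 1000000)
    for line in content.splitlines():
        if not line.strip():
            continue
        action = json.loads(line)
        if "metaData" in action:
            schema_string = action["metaData"]["schemaString"]
            break
    if schema_string:
        break

print(schema_string)  # carries the declared char/varchar length information, if any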