Hi there - I'm trying to load a table from a semantic model into a lakehouse, but the issue is that the table in the semantic model I want to load from has column names that start with capital letters (for ease of use).
Obviously this breaks the rules for lakehouse tables - how do I fix this, ideally dynamically?
Code and traceback below:
Cell In[8], line 13
10 workspace_Safe_ID = workspace_ID.replace("-","_")
11 df_table = fabric.read_table("PSG_CTD_GDS_OMNI_Production", "_Environments",workspace=workspace_ID)
---> 13 df_table.to_lakehouse_table("Environments", "overwrite")
File ~/cluster-env/trident_env/lib/python3.10/site-packages/sempy/_utils/_log.py:273, in mds_log.<locals>.get_wrapper.<locals>.log_decorator_wrapper(*args, **kwargs)
270 raise
272 try:
--> 273 result = func(*args, **kwargs)
275 # The invocation for get_message_dict moves after the function
276 # so it can access the state after the method call
277 message.update(extractor.get_completion_message_dict(result, arg_dict))
File ~/cluster-env/trident_env/lib/python3.10/site-packages/sempy/fabric/_dataframe/_fabric_dataframe.py:559, in FabricDataFrame.to_lakehouse_table(self, name, mode, spark_schema)
550 with SparkConfigTemporarily(spark, "spark.sql.parquet.int96RebaseModeInWrite", "CORRECTED"):
551 spark_df = spark.createDataFrame(df_converted, schema=converted_schema)
553 (spark_df.write
554 .option("parquet.vorder.enabled", True)
555 .mode(mode)
556 .format("delta")
557 # enable column mapping to support special characters common w/ Power BI (e.g. [])
558 .option("delta.columnMapping.mode", "name")
--> 559 .saveAsTable(name))
File /opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py:1521, in DataFrameWriter.saveAsTable(self, name, format, mode, partitionBy, **options)
1519 if format is not None:
1520 self.format(format)
-> 1521 self._jwrite.saveAsTable(name)
File ~/cluster-env/trident_env/lib/python3.10/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
1316 command = proto.CALL_COMMAND_NAME +\
1317 self.command_header +\
1318 args_command +\
1319 proto.END_COMMAND_PART
1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
1323 answer, self.gateway_client, self.target_id, self.name)
1325 for temp_arg in temp_args:
1326 if hasattr(temp_arg, "_detach"):
File /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:175, in capture_sql_exception.<locals>.deco(*a, **kw)
171 converted = convert_exception(e.java_exception)
172 if not isinstance(converted, UnknownException):
173 # Hide where the exception came from that shows a non-Pythonic
174 # JVM exception message.
--> 175 raise converted from None
176 else:
177 raise
Did you have a chance to try @frithjof_v 's solution? Did it work for you, or did you find another solution? Let us know if you have any questions.
Best Regards,
Jing
ChatGPT helped me with this. The main item in the code is this:
df_lower = df.toDF(*[c.lower() for c in df.columns])
ChatGPT created this example:
# Example DataFrame
data = [("John", 28), ("Alice", 23), ("Bob", 35)]
columns = ["Name", "Age"]
df = spark.createDataFrame(data, columns)
# Convert all column names to lowercase
df_lower = df.toDF(*[c.lower() for c in df.columns])
# Show the resulting DataFrame
df_lower.show()
Explanation:
This will rename all columns in the DataFrame to their lowercase equivalents.
Here is a StackOverflow thread mentioning the same method. Perhaps ChatGPT learned from it:
https://stackoverflow.com/questions/43005744/convert-columns-of-pyspark-data-frame-to-lowercase
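If the lakehouse also rejects other characters besides capitals (spaces, brackets, etc.), the same toDF pattern can apply a more general cleanup. This is only a sketch, assuming a regex-based rename is enough for your column names; sanitize_column_name is a hypothetical helper, not part of sempy or Spark:
import re
def sanitize_column_name(name: str) -> str:
    # Lowercase, then replace anything that is not a letter, digit,
    # or underscore with an underscore (assumption: this covers the
    # characters the lakehouse rejects)
    return re.sub(r"[^a-z0-9_]", "_", name.lower())
df_clean = df.toDF(*[sanitize_column_name(c) for c in df.columns])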
So in your case, the code could be like this:
import sempy.fabric as fabric
#Dev
#workspace_ID = "XXXXXX"
#Stage
#workspace_ID = "XXXXXX"
#Prod
workspace_ID = "XXXXXX"
workspace_Safe_ID = workspace_ID.replace("-","_")
df_table = fabric.read_table("PSG_CTD_GDS_OMNI_Production", "_Environments",workspace=workspace_ID)
df_lower = df_table.toDF(*[c.lower() for c in df_table.columns])
df_lower.to_lakehouse_table("Environments", "overwrite")
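One thing to check: fabric.read_table returns a sempy FabricDataFrame, which is pandas-based rather than a Spark DataFrame, so toDF may not be available on it. If that's the case, a pandas-style rename should achieve the same thing - a minimal sketch, assuming the rest of your code stays as above:
import sempy.fabric as fabric
workspace_ID = "XXXXXX"
df_table = fabric.read_table("PSG_CTD_GDS_OMNI_Production", "_Environments", workspace=workspace_ID)
# Lowercase all column names in place using the pandas columns attribute
df_table.columns = [c.lower() for c in df_table.columns]
df_table.to_lakehouse_table("Environments", "overwrite")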