BryanCarmichael
Advocate I

Load data from Semantic Model to Lakehouse where semantic model column names start with Capitals

Hi there - I'm trying to load a table from a semantic model into a lakehouse, but the table in the semantic model I want to load from has column names that start with capital letters (for ease of use).

Obviously this breaks the lakehouse table naming rules - how do I fix this, ideally dynamically?

Code below:

import sempy.fabric as fabric

# Workspace IDs per environment (redacted)
#Dev
#workspace_ID = "XXXXXX"
#Stage
#workspace_ID = "XXXXXX"
#Prod
workspace_ID = "XXXXXX"

# Make the workspace GUID safe for use in names (hyphens -> underscores)
workspace_Safe_ID = workspace_ID.replace("-", "_")

# Read the _Environments table from the semantic model as a FabricDataFrame
df_table = fabric.read_table("PSG_CTD_GDS_OMNI_Production", "_Environments", workspace=workspace_ID)

# Write it to the lakehouse as a Delta table - this is where the error occurs
df_table.to_lakehouse_table("Environments", "overwrite")

And the error log:

Cell In[8], line 13
     10 workspace_Safe_ID = workspace_ID.replace("-","_")
     11 df_table = fabric.read_table("PSG_CTD_GDS_OMNI_Production", "_Environments",workspace=workspace_ID)
---> 13 df_table.to_lakehouse_table("Environments", "overwrite")

File ~/cluster-env/trident_env/lib/python3.10/site-packages/sempy/_utils/_log.py:273, in mds_log.<locals>.get_wrapper.<locals>.log_decorator_wrapper(*args, **kwargs)
    270         raise
    272 try:
--> 273     result = func(*args, **kwargs)
    275     # The invocation for get_message_dict moves after the function
    276     # so it can access the state after the method call
    277     message.update(extractor.get_completion_message_dict(result, arg_dict))

File ~/cluster-env/trident_env/lib/python3.10/site-packages/sempy/fabric/_dataframe/_fabric_dataframe.py:559, in FabricDataFrame.to_lakehouse_table(self, name, mode, spark_schema)
    550 with SparkConfigTemporarily(spark, "spark.sql.parquet.int96RebaseModeInWrite", "CORRECTED"):
    551     spark_df = spark.createDataFrame(df_converted, schema=converted_schema)
    553 (spark_df.write
    554     .option("parquet.vorder.enabled", True)
    555     .mode(mode)
    556     .format("delta")
    557     # enable column mapping to support special characters common w/ Power BI (e.g. [])
    558     .option("delta.columnMapping.mode", "name")
--> 559     .saveAsTable(name))

File /opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py:1521, in DataFrameWriter.saveAsTable(self, name, format, mode, partitionBy, **options)
   1519 if format is not None:
   1520     self.format(format)
-> 1521 self._jwrite.saveAsTable(name)

File ~/cluster-env/trident_env/lib/python3.10/site-packages/py4j/java_gateway.py:1322, in JavaMember.__call__(self, *args)
   1316 command = proto.CALL_COMMAND_NAME +\
   1317     self.command_header +\
   1318     args_command +\
   1319     proto.END_COMMAND_PART
   1321 answer = self.gateway_client.send_command(command)
-> 1322 return_value = get_return_value(
   1323     answer, self.gateway_client, self.target_id, self.name)
   1325 for temp_arg in temp_args:
   1326     if hasattr(temp_arg, "_detach"):

File /opt/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py:175, in capture_sql_exception.<locals>.deco(*a, **kw)
    171 converted = convert_exception(e.java_exception)
    172 if not isinstance(converted, UnknownException):
    173     # Hide where the exception came from that shows a non-Pythonic
    174     # JVM exception message.
--> 175     raise converted from None
    176 else:
    177     raise

v-jingzhan-msft
Community Support

Hi @BryanCarmichael 

Did you have a chance to try @frithjof_v's solution? Did it work for you, or did you find another solution? Let us know if you have any questions.

Best Regards,
Jing

frithjof_v
Community Champion

ChatGPT helped me with this. The key line in the code is this:

df_lower = df.toDF(*[c.lower() for c in df.columns])

ChatGPT created this example:

# Example DataFrame
data = [("John", 28), ("Alice", 23), ("Bob", 35)]
columns = ["Name", "Age"]
df = spark.createDataFrame(data, columns)

# Convert all column names to lowercase
df_lower = df.toDF(*[c.lower() for c in df.columns])

# Show the resulting DataFrame
df_lower.show()
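
For reference, df_lower.show() should print the lowercased headers (output sketched by hand here, not captured from a run):

+-----+---+
| name|age|
+-----+---+
| John| 28|
|Alice| 23|
|  Bob| 35|
+-----+---+
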
Explanation:

  • df.columns: Retrieves the list of column names.
  • [c.lower() for c in df.columns]: Converts each column name to lowercase.
  • df.toDF(*[c.lower() for c in df.columns]): Creates a new DataFrame with the lowercase column names.

This will rename all columns in the DataFrame to their lowercase equivalents.
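
Lowercasing handles the capitalization, but column names containing spaces or the characters " ,;{}()\n\t=" are also awkward for Delta tables. The sempy writer enables delta.columnMapping.mode=name (visible in the traceback) to tolerate them, but not every downstream engine copes with such names, so a more general dynamic fix is to sanitize every column name rather than only lowercase it. A minimal sketch - the sanitize helper and its regex are my own illustration, not from the thread:

import re

# Replace characters Delta disallows (without column mapping) with "_", then lowercase
def sanitize(name: str) -> str:
    return re.sub(r"[ ,;{}()\n\t=]", "_", name).lower()

df_clean = df.toDF(*[sanitize(c) for c in df.columns])
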
Here is a StackOverflow thread mentioning the same method. Perhaps ChatGPT learned from it:

https://stackoverflow.com/questions/43005744/convert-columns-of-pyspark-data-frame-to-lowercase

So in your case, the code could be like this:

import sempy.fabric as fabric

#Dev
#workspace_ID = "XXXXXX"
#Stage
#workspace_ID = "XXXXXX"
#Prod
workspace_ID = "XXXXXX"

workspace_Safe_ID = workspace_ID.replace("-","_")
df_table = fabric.read_table("PSG_CTD_GDS_OMNI_Production", "_Environments",workspace=workspace_ID)
df_lower = df_table.toDF(*[c.lower() for c in df_table.columns])

df_lower.to_lakehouse_table("Environments", "overwrite")
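
One caveat worth checking: fabric.read_table returns a sempy FabricDataFrame, which is pandas-based, so the Spark-style toDF rename above may raise an AttributeError. If it does, a pandas-style rename achieves the same thing - a sketch, assuming an in-place column assignment preserves the FabricDataFrame and its to_lakehouse_table method:

# Lowercase the column names in place on the pandas-based FabricDataFrame
df_table.columns = [c.lower() for c in df_table.columns]
df_table.to_lakehouse_table("Environments", "overwrite")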