Hello. I am quite new to Spark Notebooks. I am using one to extract JSON data and save it to tables in a Lakehouse. It works, but there are some slight issues. The data, being JSON, has nested objects. I have included a screenshot here to highlight my issues.
I start with a data frame that reads the entire JSON file, but some of its fields contain nested objects. So I have a second data frame that selects elements from the first using something like this:
df2 = df1.select( "Id", "EmployeeNumber",..."PositionData.Manager.Id"..."WorkLocation.Id"...)
But the second and third "Id" columns also come out named "Id", so I now have THREE columns named "Id" in the data frame. I want to rename the second and third ones to "Manager_Id" and "WorkLocation_Id", respectively.
I want to flatten the entire JSON file (there are no nested arrays, just nested objects) such that I have the original Id (for the Employee), the Manager Id, and the Work Location Id.
I tried the data frame's withColumnRenamed, but it renames every column named Id.
If this was SQL I could write it as: select..."PositionData.Manager.Id" AS [Manager_Id]...
Is there a way to rename a column inline in a dataframe select operation? Or is there another/better option?
Thanks in advance
Proud to be a Super User!
Hi @ToddChitt, can you try aliasing the column when using .select?
df_renamed = df.select(col("Name").alias("EmployeeName"), col("Department").alias("Dept"))
df_renamed.show()
Hello @AndyDDC and thank you for the reply.
I tried your suggestion, but it generated an error: ...name 'col' is not defined.
But from this website, PySpark alias() Column & DataFrame Examples - Spark By {Examples} (sparkbyexamples.com) (which I think is about to become my new best friend 🙂), I added this line of code at the top of the block:
Great to hear. And yes, the Spark By {Examples} website is awesome!