Richtpt
Helper I

How to reference a JSON array element with no name using a Notebook or OPENROWSET?

I'm extremely new to notebooks and accessing data within JSON files that have been imported into a Lakehouse.  I have the following JSON schema in a file (from notebook df.printSchema()):

[screenshot: df.printSchema() output]

The elements of the Modifiers array don't have names; the array just looks like this:

 

"Modifiers": [
    "US"
]

 

I can't figure out how to reference that using a notebook or SQL's OPENROWSET.  Any ideas?  Thanks!

1 ACCEPTED SOLUTION
v-nmadadi-msft
Community Support

Hi @Richtpt  ,
Thanks for reaching out to the Microsoft Fabric Community forum.

If you just want to see the data in the same nested format as it appears in the JSON, you can run:

df.select("PRCCollection.Modifiers").show()

Since both PRCCollection and Modifiers are arrays, Spark will display the result as something like [[US]]. That simply means there is an array inside another array.

[screenshot: nested [[US]] result]

 

If you’d prefer to flatten the structure and extract the actual value, you can use explode() to expand each array level:

from pyspark.sql.functions import explode

df2 = df.withColumn("prc", explode("PRCCollection")) \
        .withColumn("Modifier", explode("prc.Modifiers"))

df2.select("Modifier").show()

Here, the first explode() opens up the PRCCollection array, and the second one expands the Modifiers array inside it. The output will then show the value directly (for example US) instead of the nested [[US]] structure.
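As a side illustration (not part of the original reply), the two explode() levels can be mimicked in plain Python on a nested list shaped like the [[US]] result above, with the sample value taken from this thread:

```python
# Stand-in for the nested result of df.select("PRCCollection.Modifiers"):
# one inner list of modifiers per PRCCollection entry.
nested = [["US"]]

# The first explode() iterates the outer (PRCCollection) level;
# the second iterates the Modifiers list inside each entry.
flattened = [modifier for modifiers in nested for modifier in modifiers]

print(flattened)  # ['US']
```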

[screenshot: flattened Modifier values]

 



For reference, this is the output of df.printSchema() for my data:

[screenshot: df.printSchema() output]

 

 


I hope this information helps. Please do let us know if you have any further queries.
Thank you



6 REPLIES


That code helped a lot, but I still wasn't getting what I wanted, probably because I'm extremely new to PySpark (as in, this is my first time using it). I finally asked Grok how to flatten this and got back several responses, including this one that worked perfectly. Wanted to share to close this up. Thanks much for the help!

 

from pyspark.sql.functions import col, explode_outer

df = spark.read.json("Files/test.json")

df_deep = (
    df
    .select("*", explode_outer("PRC0Collection").alias("PR"))
    .select(
        "*",
        col("PR.*"),
        explode_outer(col("PR.Modifiers")).alias("Modifier")
    )
    .drop("PRC0Collection", "PR", "Modifiers")
)

display(df_deep)
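For anyone following along without a Spark session, here is a rough plain-Python sketch of the same flattening, using a trimmed stand-in record shaped like the JSON posted below (field values copied from it; the `or [None]` fallback mirrors explode_outer's behavior of keeping a row even when a list is empty or missing):

```python
# Trimmed stand-in for one row of the posted JSON (most fields omitted).
record = {
    "CSPI_ID": "AMT01234",
    "PRC0Collection": [
        {"Cache_ID": "1462da8b", "Modifiers": ["US"], "Price": 16336.44},
        {"Cache_ID": "156eab34", "Modifiers": ["UQ"], "Price": 24499.76},
    ],
}

rows = []
for prc in record["PRC0Collection"] or [None]:               # outer explode
    for modifier in (prc or {}).get("Modifiers") or [None]:  # inner explode
        # Carry the top-level fields, then the per-PRC fields, then the modifier.
        row = {k: v for k, v in record.items() if k != "PRC0Collection"}
        row.update({k: v for k, v in (prc or {}).items() if k != "Modifiers"})
        row["Modifier"] = modifier
        rows.append(row)

for r in rows:
    print(r)
```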
 

BTW, I forgot to post my JSON earlier. Here it is:

{
    "AGAG_ID": "PHKETCHMC",
    "AGIP_CAP_IND": "",
    "CSPI_HIOS_ID_NVL": "73836AK09999",
    "CSPI_ID": "AMT01234",
    "GRGR_CK": 12345,
    "IPCD_ID": "R9999",
    "PRC0Collection": [
        {
            "Cache_ID": "1462da8b",
            "Exp_Date": "12/31/9999 00:00:00.000",
            "Modifiers": [
                "US"
            ],
            "PSCD_ID": "11",
            "Price": 16336.44,
            "Type": "negotiated"
        },
        {
            "Cache_ID": "156eab34",
            "Exp_Date": "12/31/9999 00:00:00.000",
            "Modifiers": [
                "UQ"
            ],
            "PSCD_ID": "11",
            "Price": 24499.76,
            "Type": "negotiated"
        }
    ],
    "PRPR_ID": "P00000099999",
    "SEDF_CAP_IND": "N"
}
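As a quick sanity check (an aside, not part of the original post), the nameless Modifiers elements in this JSON can be reached by position using Python's json module; the snippet below embeds a trimmed copy of the document rather than reading a file:

```python
import json

# Trimmed copy of the posted JSON (other top-level fields omitted).
raw = """
{
    "CSPI_ID": "AMT01234",
    "PRC0Collection": [
        {"Cache_ID": "1462da8b", "Modifiers": ["US"], "Price": 16336.44},
        {"Cache_ID": "156eab34", "Modifiers": ["UQ"], "Price": 24499.76}
    ]
}
"""

doc = json.loads(raw)

# Nameless array elements are addressed by index, not by key.
modifiers = [prc["Modifiers"][0] for prc in doc["PRC0Collection"]]
print(modifiers)  # ['US', 'UQ']
```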

 

 

Hi @Richtpt ,
We really appreciate your efforts and for letting us know the update on the issue.

Please continue using the Fabric Community forum for further assistance.
Thank you

 

Thanks, that helps a lot and I am able to see the Modifiers values.

trivedisunita
Advocate I

Modifiers is an array, so the value inside it is accessed by index number, not by name. The first value is at index [0].

To get the value, use:

df.select("PRC0Collection.Modifiers[0]").show()

Maybe this can help.

Thanks, but this isn't working for me. I have:

 

df = spark.read.json("file1.json")

df.select("PRC0Collection.Modifiers[0]").show()

 

I get this error:  [FIELD_NOT_FOUND] No such struct field `Modifiers[0]` in `Cache_ID`, `Exp_Date`, `Modifiers`, `PSCD_ID`, `Price`, `Type`.
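The error makes sense if the quoted path is read as a literal field-name lookup rather than an index operation; a plain-Python analogy (using one entry from the posted JSON) shows the difference:

```python
entry = {"Cache_ID": "1462da8b", "Modifiers": ["US"]}

# Looking up a field literally named "Modifiers[0]" fails,
# much like Spark's FIELD_NOT_FOUND error above...
try:
    entry["Modifiers[0]"]
except KeyError as exc:
    print("no such field:", exc)

# ...whereas fetching the Modifiers field and then indexing it works.
print(entry["Modifiers"][0])  # US
```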
