Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Calling all Data Engineers! Fabric Data Engineer (Exam DP-700) live sessions are back! Starting October 16th. Sign up.

Reply
GrahamR
Regular Visitor

XML import problem

I have an XML file delivered to me where I need to display the Location text field.  I have tried various ways but need some help please.

 

Minimal example xml (just 1 object) is here and the text I need in a new column is in bold below:

 

<?xml version="1.0" encoding="utf-8"?>
<Object elementtype="Technology">
<ObjectIdentity>
<Number>ID234</Number>
<ItemMarked />
</ObjectIdentity>
<ObjectLocation elementtype="current location">
<Location>Site A<Shelf>1</Shelf><Box>32</Box></Location>
<Date>
<DateBegin>2022</DateBegin>
<DateEnd />
</Date>
</ObjectLocation>
</Object>

 

I have checked the xml against various validators online and all say it is OK..

 

The XML loads with no problems and when I try to expand the ObjectLocation column, it lists Location,Date,Attribute:elementtype.  If I select Location, it then shows a table with Shelf and Box (which then shows Shelf 1 and Box 32).  I can find no way of getting the text "Site A" displayed in a new 'Location' column.

 

Can anyone suggest how I can solve this please - many thanks for any guidance you can give.

1 ACCEPTED SOLUTION

Hi @GrahamR ,

 

You are correct this is the expected behavior.

The reason you are seeing only null values is that the plain text “Site A” is not retained when Power BI parses the XML. The Xml.Tables function processes only structured elements and attributes, and in your file, “Site A” appears as loose text before the <Shelf> and <Box> elements. As a result, it is considered mixed content and omitted during import.

Since this text does not make it into Power Query, it cannot be retrieved by any M or DAX expressions after the XML is loaded.

 

The best solutions are to either update the XML so that “Site A” is within its own tag or attribute, or to read the file as raw text and extract the relevant section before <Shelf> using text functions prior to converting it to XML.

In summary, your understanding is correct the data loss occurs during the XML parsing stage, and is not due to any issues with your process.

Thank you,

Tejaswi

View solution in original post

9 REPLIES 9
DaleT
Helper I
Helper I

Hi,

The XML isn't a standard XML file. "Site A" should be in a paire of tags or as an attribute.

<Location><site>Site A</site><Shelf>1</Shelf><Box>32</Box></Location>

or <Location site="Site A"><Shelf>1</Shelf><Box>32</Box></Location>

Hi DaleT

This was one of my thoughts and why I put the xml file through various checkers - none of the checkers throw up any fault.  Microsoft XML Notepad handles the file with no problems. However I agree that if the xml was structured as you suggest, there would be no problem.

 

The xml file is being supplied to me by the client and it is exported from a well respected software package used throughout this particular industry.  So my only option is to find some workaround.

 

Appreciate your reply but I need to find a way of grabbing this pesky text!

Hi @GrahamR ,

 

You're correct the XML you received is valid, but it contains mixed content (plain text combined with child elements within the same tag). As a result, Power Query does not automatically display the “Site A” text in the standard columns when expanding the XML.

 

To extract this text, you can use a custom column in Power Query. The following method is compatible with your XML structure:

 

let
    xml = Xml.Tables([ObjectLocation]),
    loc = try xml[Location]{0} otherwise null,
    textNode = try Record.Field(loc, "#text") otherwise null
in
    if textNode = null then null
    else if Value.Is(textNode, Binary.Type) then Text.Trim(Text.FromBinary(textNode))
    else Text.Trim(Text.From(textNode))

 

This approach retrieves the text node preceding the <Shelf> and <Box> elements, providing “Site A” as plain text in a new column.

The “Invalid Identifier” error occurred because [#text] is not a valid identifier in M. To access this field correctly, use Record.Field(loc, "#text") or loc["#text"].

 

I hope this solution helps you achieve the desired outcome.

 

Best regards,
Tejaswi.
Community Support Team

Hi Tejaswi

 

Many thanks for your suggested approach however this is giving me nulls.  I tried simplifying the method but still get nulls all the time. 

My latest thought is whether the XML input process is filtering out the 'extra' text so that the Xml.Tables does not contain the text string at all. 

 

Regards

Hi @GrahamR ,

 

You are correct this is the expected behavior.

The reason you are seeing only null values is that the plain text “Site A” is not retained when Power BI parses the XML. The Xml.Tables function processes only structured elements and attributes, and in your file, “Site A” appears as loose text before the <Shelf> and <Box> elements. As a result, it is considered mixed content and omitted during import.

Since this text does not make it into Power Query, it cannot be retrieved by any M or DAX expressions after the XML is loaded.

 

The best solutions are to either update the XML so that “Site A” is within its own tag or attribute, or to read the file as raw text and extract the relevant section before <Shelf> using text functions prior to converting it to XML.

In summary, your understanding is correct the data loss occurs during the XML parsing stage, and is not due to any issues with your process.

Thank you,

Tejaswi

Hi Tejaswi

 

This is in some ways a relief (!) after 4 days of getting nowhere.  I am going to use a workaround along the lines you suggest so I can get these reports out to the client.

 

My other takeaway is say thank you to yourself and to the wider community for responding with ideas and suggestions.  This has been a life-saver!

 

Very best wishes, Graham

Hi  @GrahamR ,

 

Great to hear that this solution was helpful and that you now have an approach for generating the reports. Thank you for acknowledging.

 

Best Regards,

Tejaswi.

Community Support

 

Nabha-Ahmed
Kudo Commander
Kudo Commander

Hi @GrahamR 

 

Use Power Query and the XML document structure

  1. Load your XML file normally in Power BI.

  2. In Power Query Editor, select the column that contains your XML (e.g. ObjectLocation.Location).

  3. Add a Custom Column with the formula below:= try Text.BeforeDelimiter(Text.FromBinary(Xml.Tables([ObjectLocation])[Location]{0}[#text]), "<") otherwise null

  4. Rename the new column to Location_Text (or “Site Name”).

 

if this not work for you mention me please to see another way 

Accept as Solution and giving it a Like 👍– it helps others in the community too.

Thanks 🌹

 

 

 

 

 

 

Hi

 

A very interesting approach !  However I get 'Invalid Identifier' error against the [#text] part of the try.  I have even been trying to just use the Text.FromBinary part but can't get any text back.

 

Any thoughts?

Helpful resources

Announcements
FabCon Global Hackathon Carousel

FabCon Global Hackathon

Join the Fabric FabCon Global Hackathon—running virtually through Nov 3. Open to all skill levels. $10,000 in prizes!

October Power BI Update Carousel

Power BI Monthly Update - October 2025

Check out the October 2025 Power BI update to learn about new features.

FabCon Atlanta 2026 carousel

FabCon Atlanta 2026

Join us at FabCon Atlanta, March 16-20, for the ultimate Fabric, Power BI, AI and SQL community-led event. Save $200 with code FABCOMM.