Join us at the 2025 Microsoft Fabric Community Conference. March 31 - April 2, Las Vegas, Nevada. Use code FABINSIDER for $400 discount. Register now

adarshthouti141
Frequent Visitor

Loading multiple xer files in Microsoft Fabric

How to Efficiently Process and Load Multiple .xer Files from a Lakehouse into a Tabular Format for Power BI in Microsoft Fabric?

Hello Fabric Community,

I am working on a project where I need to process multiple .xer files stored in a Microsoft Fabric Lakehouse. The goal is to append the data from these files into a single tabular format and subsequently integrate the processed data with Power BI for analysis. Here's a detailed breakdown of my workflow and the challenges I'm facing:

Current Workflow:

  1. File Location:

    • All .xer files are stored in the Files section of a Lakehouse in Fabric.
  2. Required Data Transformation:

    • Read all .xer files from the Lakehouse folder.
    • Combine the files into a single Spark DataFrame.
    • Add a column to identify the source file for each row.
    • Perform transformations, such as filtering specific rows or columns.
  3. Integration with Power BI:

    • Write the processed data to a Delta table or any other format supported by Power BI.
    • Build Power BI reports based on the transformed data.
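For reference, .xer files are plain tab-delimited Primavera P6 exports in which a `%T` line starts a table, `%F` lists its field names, and `%R` holds one data row. A minimal parsing sketch under that assumption (the table and field names in the sample are illustrative, not from a real export):

```python
def parse_xer(text):
    """Parse the %T/%F/%R sections of a Primavera XER export into a
    dict mapping table name -> list of row dicts."""
    tables = {}
    current_table = None
    fields = []
    for line in text.splitlines():
        parts = line.split("\t")
        if parts[0] == "%T":            # start of a new table
            current_table = parts[1]
            tables[current_table] = []
        elif parts[0] == "%F":          # field names for the current table
            fields = parts[1:]
        elif parts[0] == "%R" and current_table is not None:  # one data row
            tables[current_table].append(dict(zip(fields, parts[1:])))
    return tables

# Illustrative fragment of an XER file (table and field names are made up)
sample = (
    "ERMHDR\t19.12\n"
    "%T\tTASK\n"
    "%F\ttask_id\ttask_name\n"
    "%R\t100\tFoundation\n"
    "%R\t101\tFraming\n"
    "%E"
)
print(parse_xer(sample)["TASK"][0]["task_name"])  # Foundation
```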
1 ACCEPTED SOLUTION
v-hashadapu
Community Support

Hi @adarshthouti141 , thank you for reaching out to the Microsoft Fabric Community Forum.

Please consider the steps below:

  1. Use Apache Spark to read all .xer files from the Lakehouse folder and combine them into a single Spark DataFrame. Note that Spark has no built-in "xer" reader; since .xer exports are tab-delimited text, read them as text and parse them afterwards.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ProcessXerFiles").getOrCreate()

# Read every .xer file in the folder as lines of text
df = spark.read.text("path/to/lakehouse/folder/*.xer")

  2. Add a column to the DataFrame to identify the source file for each row, using the built-in input_file_name function.

from pyspark.sql.functions import input_file_name

df = df.withColumn("source_file", input_file_name())

  3. Filter specific rows or columns as needed. Perform any additional transformations required for your analysis.

# Example transformation: filter specific rows
df = df.filter("your_condition_here")

  4. Write the processed data to a Delta table or any other format supported by Power BI.

df.write.format("delta").mode("overwrite").save("path/to/delta/table")
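Because the text lines still need parsing before they form a tabular DataFrame, steps 1 and 2 can also be sketched in plain Python, assuming the standard %T/%F/%R layout of Primavera XER exports (file names and table contents below are made up for illustration; in practice the file texts would come from Lakehouse reads):

```python
def collect_table(files, table_name):
    """Append one XER table's rows across several files, tagging each row
    with a source_file column (mirrors steps 1 and 2 above).
    `files` maps a file name to its text - a stand-in for Lakehouse reads."""
    rows = []
    for name, text in files.items():
        current, fields = None, []
        for line in text.splitlines():
            parts = line.split("\t")
            if parts[0] == "%T":                     # table header
                current = parts[1]
            elif parts[0] == "%F" and current == table_name:
                fields = parts[1:]                   # field names
            elif parts[0] == "%R" and current == table_name:
                row = dict(zip(fields, parts[1:]))   # one data row
                row["source_file"] = name
                rows.append(row)
    return rows

# Two made-up files, each with a tiny TASK table
files = {
    "plan_a.xer": "%T\tTASK\n%F\ttask_id\n%R\t1\n%R\t2\n%E",
    "plan_b.xer": "%T\tTASK\n%F\ttask_id\n%R\t9\n%E",
}
rows = collect_table(files, "TASK")
print(len(rows))  # 3
```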

If this helps, please consider marking it 'Accept as Solution' so others with similar queries may find it more easily. If not, please share the details.
Thank you.


4 REPLIES
v-hashadapu
Community Support

Hi @adarshthouti141 , Hope your issue is solved. If it is, please consider marking the answer 'Accept as solution', so others with similar issues may find it easily. If it isn't, please share the details.
Thank you.

