I have a CSV file in my Lakehouse. When I use the Copy activity to move it and convert it to Parquet, some columns come out as null, even though the CSV file is configured correctly (with escape and quote characters)!
I also tried loading the file into a table in the Lakehouse and reading it as Parquet using Spark, and in both cases it shows the same null columns.
I found the issue: many rows with null values had been intentionally added to the CSV, and after partitioning, those null rows happened to appear first in the output.
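A quick way to confirm this kind of situation from a notebook is to count nulls per column instead of eyeballing the first rows that `df.show()` returns. A minimal stdlib sketch, assuming placeholder data and column names standing in for the real file:

```python
import csv
import io

# Synthetic data standing in for the real file: some rows are
# intentionally all-null, and they happen to sort first.
sample = """id,amount
,
,
1,10.5
2,20.0
"""

with io.StringIO(sample) as f:
    rows = list(csv.DictReader(f))

# Count empty cells per column instead of inspecting only the first rows.
null_counts = {
    col: sum(1 for row in rows if row[col] == "")
    for col in rows[0].keys()
}
print(null_counts)  # {'id': 2, 'amount': 2}
```

If a column's null count matches the number of intentionally null rows in the source, the Parquet conversion is faithful and the nulls are real data.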
Hi @A_monged ,
Thanks for the reply from lbendlin .
I used PySpark statements in a notebook to convert CSV files from the Lakehouse to Parquet files:
# Start a Spark session
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("CSV_to_Parquet").getOrCreate()

# Load the CSV file (first row is the header)
df = spark.read.format("csv").option("header", "true").load("Files/orders/2019.csv")
df.show()

# Write the DataFrame out in Parquet format
parquet_file_path = "Files/test.parquet"
df.write.parquet(parquet_file_path)
This works fine on my side, and the resulting file does not contain null values:
The error may occur because the data types and schemas defined in the CSV file do not match the data types and schemas expected by the Parquet format.
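To illustrate the kind of mismatch meant here: Spark's default CSV read mode is permissive, so a value that cannot be cast to the column's expected type becomes null rather than raising an error. A minimal stdlib sketch of that behavior (the values below are hypothetical, not from the actual file):

```python
def coerce_int(value):
    """Mimic permissive type coercion: return None when the cast fails."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None

# "12" parses cleanly; "12.5" and "abc" do not fit an integer column,
# so they surface as nulls after conversion, much like a schema mismatch.
values = ["12", "12.5", "abc"]
coerced = [coerce_int(v) for v in values]
print(coerced)  # [12, None, None]
```

If whole columns come back null, it is worth checking whether the inferred or declared schema matches what the CSV actually contains.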
Optionally, you can use the same method as I did to convert the CSV file to a Parquet file.
If you want to save it as a table you can use the following syntax:
df.write.mode("overwrite").saveAsTable("parquetTestTable")
If you have any other questions please feel free to contact me.
Best Regards,
Yang
Community Support Team
If any post helps, please consider accepting it as the solution to help other members find it more quickly.
If I misunderstood your needs or you still have problems with it, please feel free to let us know. Thanks a lot!
I tried your approach but still got the same null columns. I've attached the table output and sample rows of the raw CSV file.
Does your CSV file contain quoted row delimiters?
The file uses a comma as the delimiter and " as the quote character.
So commas are quoted. But what about linefeeds in your data? Are they quoted?
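Why this question matters: if a field contains a newline that is properly quoted, a real CSV parser keeps the row together, but naive line-based splitting breaks it into malformed rows, which then surface as nulls or shifted columns. A small stdlib demonstration with synthetic data:

```python
import csv
import io

# One field contains a quoted linefeed.
raw = 'id,comment\n1,"first line\nsecond line"\n2,plain\n'

# A proper CSV parser honors the quotes: header plus two data rows.
with io.StringIO(raw) as f:
    parsed = list(csv.reader(f))
print(len(parsed))  # 3 rows including the header

# Naive splitting on newlines breaks the quoted field apart.
naive = raw.strip().split("\n")
print(len(naive))  # 4 "rows" -- the extra one has the wrong column count
```

Spark's CSV reader has the same concern: embedded newlines in quoted fields require the reader to be configured to treat records as potentially multi-line.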
Hi @A_monged ,
Thanks for the reply from lbendlin .
To deal with null values, an efficient way to prepare the data is Dataflow Gen2. In Dataflow Gen2, you can process the data, for example by removing columns that contain null values, and then set the destination to Lakehouse so that only data without null values is written. This approach helps ensure data integrity and accuracy.
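For reference, the same clean-up (dropping columns that are entirely empty) can be sketched in plain Python before loading. A minimal stdlib version, assuming hypothetical column names:

```python
import csv
import io

# Synthetic sample: the "unused" column is empty in every row.
sample = """id,unused,amount
1,,10.5
2,,20.0
"""

with io.StringIO(sample) as f:
    rows = list(csv.DictReader(f))

# Identify columns where every value is empty, then drop them.
all_null = [c for c in rows[0] if all(r[c] == "" for r in rows)]
cleaned = [{c: r[c] for c in r if c not in all_null} for r in rows]
print(all_null)    # ['unused']
print(cleaned[0])  # {'id': '1', 'amount': '10.5'}
```

Dataflow Gen2 performs the equivalent transformation through its visual editor, with the result written directly to the Lakehouse.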
For more information on using Dataflow Gen2, you can refer to these official documents:
Create your first Microsoft Fabric dataflow - Microsoft Fabric | Microsoft Learn
If you have any other questions please feel free to contact me.
Best Regards,
Yang
Community Support Team
If any post helps, please consider accepting it as the solution to help other members find it more quickly.
If I misunderstood your needs or you still have problems with it, please feel free to let us know. Thanks a lot!
There are no linefeeds in my data.