We have several Power BI reports that use Parquet files stored in ADLS Gen2 as the data source. There are multiple Parquet files per dataset. To load all the Parquet files at once, we added the line below in the Advanced Editor:
Source = AzureStorage.DataLake("https://xxxxxxxxxx.dfs.core.windows.net/Container"),
The reports work fine in Desktop, and they keep working after being published to the Power BI service; the dataset refreshes successfully. So we set up a schedule to refresh them 8 times per day. New data is written to the Parquet files every day, but there is no change to the schema.
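For context, the full query built around that Source line would look roughly like this. This is a sketch, not our exact query: the account/container URL is the same placeholder as above, and the step names after Source are illustrative.

```m
let
    // List the files in the ADLS Gen2 container (placeholder account/container)
    Source = AzureStorage.DataLake("https://xxxxxxxxxx.dfs.core.windows.net/Container"),
    // Keep only the .parquet files
    ParquetFiles = Table.SelectRows(Source, each Text.EndsWith([Name], ".parquet")),
    // Parse each file's binary content into a table
    Parsed = Table.AddColumn(ParquetFiles, "Data", each Parquet.Document([Content])),
    // Combine all per-file tables into one
    Combined = Table.Combine(Parsed[Data])
in
    Combined
```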
However, roughly once per week the dataset fails to refresh. The error is always:
The XXX column does not exist in the rowset
It can be fixed by simply refreshing the report data in Power BI Desktop and publishing the report again. The dataset then refreshes normally for a few days, until it fails again with the same error.
Any idea what is the cause?
Hi @Azure_newbie ,
Based on the information you have provided, the problem you are experiencing may be caused by a change in the original data source, such as a column being moved, renamed, or deleted.
Please check if the original data source has changed.
Best Regards,
Ada Wang
If this post helps, then please consider Accepting it as the solution to help other members find it more quickly.
There is no change to the schema or columns.
However, due to the way Parquet works, the number of Parquet files under https://xxxxxxxxxx.dfs.core.windows.net/Container grows with the overall data size.
Hi @Azure_newbie ,
If you have a Pro license you can open a Pro ticket at https://admin.powerplatform.microsoft.com/newsupportticket/powerbi
Otherwise you can raise an issue at https://community.fabric.microsoft.com/t5/Issues/idb-p/Issues .
Best Regards,
Ada Wang
I would suggest looking at the new Parquet connector in Power Query; that should help you refresh data successfully: Power Query Parquet connector - Power Query | Microsoft Learn
According to my test, the Parquet connector can only read a specific Parquet file, but we need it to read a folder containing several Parquet partitions and files.
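For reference, a single-file read with that connector looks roughly like this (the file path is a placeholder):

```m
let
    // Parquet.Document parses one parquet file's binary content (placeholder path)
    Source = Parquet.Document(File.Contents("C:\data\part-00000.parquet"))
in
    Source
```

It takes a single binary, which is why it does not cover a whole folder by itself.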
In Power BI it is possible to read multiple files in a folder and then connect to the Parquet files.
Sorry for not making myself clear.
Currently I am using an M query (connectors/powerbi/fn_ReadDeltaTable.pq at master · delta-io/connectors (github.com)) to load multiple Parquet files in a folder.
I am still learning, so although the script works, I don't know which connector it is using or why it fails from time to time.
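For reference, the invocation looks roughly like this, based on the linked repository's README. This is a sketch with assumptions: the URL and folder name are placeholders, and fn_ReadDeltaTable is the function from the delta-io repository above, pasted into its own query. Under the hood it still uses AzureStorage.DataLake (the ADLS Gen2 connector) plus Parquet.Document.

```m
let
    // Point at the Delta table's folder in ADLS Gen2 (placeholder URL and folder)
    Source = AzureStorage.DataLake("https://xxxxxxxxxx.dfs.core.windows.net/Container/my_delta_table/"),
    // fn_ReadDeltaTable reads the folder contents and combines the parquet files it references
    DeltaTable = fn_ReadDeltaTable(Source)
in
    DeltaTable
```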