Starting December 3, join live sessions with database experts and the Microsoft product team to learn just how easy it is to get started
Learn moreGet certified in Microsoft Fabric—for free! For a limited time, get a free DP-600 exam voucher to use by the end of 2024. Register now
I encounter a very strange issue/behavior in PBI, which I suspect has something to do with the way merge transformation is used in PowerQuery.
Basically, I extract data from SQL Server database to PBI tables/queries. There are 5 queries in total (1 for Date, 3 "dimensions", and 1 "fact").
The source tables do not have obvious key field in them, I concatenate several fields to make up a key field.
Also, I added Index to the table/query in PowerQuery (via Add Column > Index Column).
I did this for the 3 "dimension" tables.
Then, for the "fact" table, I performed Merged Queries transformations (LEFT Outer with no fuzzy matching) with the 3 "dimension" tables based on the combination of fields which form the key fields.
From this transformation, I can then obtain the Index field (act like a surrogate key now) from each of the "dimension" tables, so that it replicates a proper star-schema / fact-dimension data model, which I believe is the best practice.
This is the approach that I follow https://radacad.com/build-your-first-star-schema-model-in-action-power-bi-modeling-basics
It was all working fine, until I performed data refresh.
Every time I refresh the data, even though the souce has not changed, the data in Power BI desktop keeps changing.
Looking at the Data tab, it seems that the Index keys in the fact table are out of whack, i.e. they dont correspond to the correct Index field value in the dimension tables.
This behaviour is also observed when I refresh data in Power BI Report Server and Power BI Service.
One thing that seems to "fix" it is to disable parallel loading in Power BI Desktop (File >Options and Settings > Options > Current File > Data Load > Parallel loading of tables > untick Enable parallel loading of tables).
It makes the data refresh/loading slow, but it seems that the resulting data does not get out of whack after it completes.
This only applies on Power BI Desktop though, i.e. even if I disable parallel loading in PBI Desktop, once it is published to PBIRS or PBI Service, the issue is still there.
Another thing that seems to "fix" the issue but only in PowerBI Desktop is as suggested by @ImkeF user in https://community.powerbi.com/t5/Power-Query/Inconsistent-Data-After-Refresh-Merge-Query-Issue/m-p/1... adding Table.Buffer at the end of PowerQuery steps.
The workaround:
Since the issue seems to be caused by PowerBI/PowerQuery incorrectly retrieves the Index field values after Merged Queries transformation in the fact table, I removed all the relationships which use these key/index fields.
Instead, I now use the concatenation of multiple fields to form the keys on each tables and join them in the Relationship.
By doing this, no issue with data getting out of whack - PBI Desktop and PBI Service data refresh seem fine.
Has anyone had experienced this before?
Is this a known bug?
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.