andyclap
Advocate II

pbip - Great progress, but where is this headed?

I'm a software developer primarily, and understand how managing change across complex projects is enabled by modern vcs (i.e. git).

As such, I've been following several attempts to enable proper vcs integration across power bi reports via serialization to a text format that allows diff & merge.

Starting with the original python pbit extractor, then Mathias Thierbach's excellent pbi-tools (https://github.com/pbi-tools/pbi-tools) - there have been various attempts, unfortunately complicated by the underlying serialization formats.

 

Now we have the pbip format, this is getting closer to enabling complex change in power bi - i.e. diff & merge - which is great!

The team has done excellent work on stabilizing ids and removing derived elements that just add noise to the diff.

 

However, right now there are a few show-stoppers that if resolved would finish the job and take power-bi report development to the next level:

 

* Single file approach:

  * A complex report likely contains many tabs/pages.

  * Having all of these pages in one file means that file churns.

  * To paraphrase the single-responsibility principle (SRP): "A file should only have one reason to change."

  * Likewise, a complex report likely contains many tables/data sources.

  * (The same applies, to a slightly lesser degree, to the datasetDiagramLayout.)

* I see a big benefit to splitting report.json by section; splitting model.bim by expression
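
To make the split-by-section idea concrete, here is a minimal sketch of what a post-processing step could look like. It assumes report.json has a top-level `sections` array whose entries each carry a unique `name`; the `sections/<name>.json` layout and the `$ref` pointers are purely hypothetical conventions of mine, not anything the pbip format defines.

```python
import json


def split_by_section(report, key="sections", name_field="name"):
    """Return {relative_path: json_text}: one file per section plus a slim root.

    `key`, `name_field` and the "$ref" pointer convention are assumptions
    made for illustration; they are not part of the pbip format.
    """
    files = {}
    # Copy everything except the section array into the slim root document.
    root = {k: v for k, v in report.items() if k != key}
    root[key] = []
    for section in report.get(key, []):
        path = f"{key}/{section[name_field]}.json"
        # Stable key order and indentation keep the per-section diffs small.
        files[path] = json.dumps(section, indent=2, sort_keys=True) + "\n"
        # The root only records where each section lives, in array order.
        root[key].append({"$ref": path})
    files["report.json"] = json.dumps(root, indent=2, sort_keys=True) + "\n"
    return files
```

With this shape, editing one page touches only that page's file, and the root file changes only when pages are added, removed, or reordered. The same approach could in principle be applied to model.bim by expression.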

 

* Serialization of complex json structures into strings

  * It's simply infeasible to diff & merge a single-line string value as complex as these.

  * The visual properties in each visual's config property are around 2K characters long each, and are almost impossible to diff & merge.

  * Bookmarks in report.json's /config property are unmanageable: one of my reports' config lines is 600K characters long, and expands to 61K lines of json when reserialized! This is big enough to warrant splitting into files by bookmark, let alone serializing.

* Ideally these would be serialized as proper subdocument properties.
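
As a rough illustration of the "proper subdocument" idea, the sketch below walks a parsed report.json and replaces any string value that itself holds a JSON object or array with the parsed structure. The function name is hypothetical, and the heuristic (a string that starts with `{` or `[` and parses cleanly) is my own assumption; it is not how the pbip tooling works.

```python
import json


def expand_embedded_json(node):
    """Recursively replace strings that contain a JSON object/array with the parsed value.

    Heuristic and assumption: any string starting with "{" or "[" that parses
    as JSON is treated as an embedded subdocument. Other strings are left alone.
    """
    if isinstance(node, dict):
        return {k: expand_embedded_json(v) for k, v in node.items()}
    if isinstance(node, list):
        return [expand_embedded_json(v) for v in node]
    if isinstance(node, str) and node.lstrip()[:1] in ("{", "["):
        try:
            parsed = json.loads(node)
        except ValueError:
            # Looks like JSON but is not; keep the original string untouched.
            return node
        # Embedded documents can themselves contain embedded strings.
        return expand_embedded_json(parsed)
    return node
```

Note this is one-way: to write the result back as a valid pbip file, a real tool would also have to record which paths were expanded so they can be re-stringified.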

 

* Minor things

  * json arrays are already ordered, so you don't need ordinals: with them, if you reorder anything, every node has a change.

  * why the blank first line in multi-line measure expressions?
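
On the ordinals point, a small sketch of the normalization I have in mind: sort the array by its ordinal once, then drop the field so position in the array is the only source of order. The field name `ordinal` is an assumption for illustration.

```python
def drop_ordinals(items, field="ordinal"):
    """Order items by `field`, then remove it so array position carries the order.

    The field name is an assumption; items without it sort as 0.
    """
    ordered = sorted(items, key=lambda item: item.get(field, 0))
    return [{k: v for k, v in item.items() if k != field} for item in ordered]
```

After this, moving one item shows up in a diff as a single moved block rather than a renumbering of every sibling.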

 

My actual question here is: are these likely to be addressed by the pbi team? Is there an open roadmap for this?

Or will it be worth investing time to build out further transformation tooling ourselves to process the pbip files and re-serialize to a format that is closer to the goal of clean and precise diff & merge?

  

andyclap
Advocate II

Note - there's already a pbi idea for splitting model.bim:

https://ideas.fabric.microsoft.com/ideas/idea/?ideaid=b3477fe6-4e22-ee11-a81c-0022484f371d

I encourage people interested in this subject to vote for it 🙂

 

Also I appreciate the report schema isn't final/published yet so hopefully the report side of things is being thought about.

 
