Solved: How to reproduce branch names in Azure Blob Storage?

jaryszek · ‎06-19-2025

Hi Guys,

Is it possible in Azure Data Lake Storage to set up folders and use them as branch names?
Has anyone tried this approach?

Or what other tools can I use?

Best,
Jacek

v-echaithra · ‎06-23-2025

Hi @jaryszek ,

We wanted to follow up and check whether the solution provided resolved the issue. Let us know if you need any further assistance.
If our response addressed your question, please mark it as Accept as solution and click Yes if you found it helpful.

Regards,
Chaithra.

Nasif_Azam · ‎06-19-2025

Hey @jaryszek ,

In Azure Blob Storage, there isn’t an exact equivalent to Git-style branches, but you can simulate branching using a few different strategies:

Folder Structure (Directory Organization):
You can organize your data into directories that act like branches. For example:
```
/container/main/
/container/feature-branch1/
/container/feature-branch2/
```
This allows you to segregate data based on different stages or versions (like branches).
Naming Conventions for Blob Names:
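Another approach is to make the "branch" name part of the blob’s name itself. In a standard Blob Storage account without a hierarchical namespace, folders are virtual: a path segment such as main/ is just a prefix of the blob name, so this is the same layout seen from the naming side: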
```
/container/main/file1.csv
/container/feature-branch1/file1.csv
/container/feature-branch2/file1.csv
```
Azure Data Lake Gen2 (ADLS Gen2):
If you need real directories rather than name prefixes, ADLS Gen2 adds a hierarchical namespace: directories become actual objects that you can create, rename, and secure with ACLs. This can be useful for managing large datasets across different "branches."
Versioning in Blob Storage:
For versioning purposes, Azure Blob Storage supports blob versioning, which automatically keeps previous versions of a blob when it is overwritten or deleted. This tracks changes to individual blobs over time, but it is not the same as branching in Git.
Other Tools:
If you're looking for full version control (like Git branching), tools like Azure Repos or GitHub might be more suitable for that purpose.

If you're primarily managing data in Azure Blob Storage and want a simple approach to mimicking branches, I’d recommend starting with directory-based organization or using blob naming conventions. However, if you need true version control and collaboration on code or data changes, Azure Repos or GitHub would be better options.
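The directory and naming-convention strategies above both come down to putting the branch name at the front of the blob name. Here is a minimal Python sketch of that idea, working on plain strings only. The helper names `blob_name` and `blobs_in_branch` are hypothetical and not part of any Azure SDK. With the `azure-storage-blob` package, the same filtering can be done server-side by passing the prefix to `ContainerClient.list_blobs(name_starts_with=...)`.

```python
from posixpath import join


def blob_name(branch: str, relative_path: str) -> str:
    """Build a blob name whose leading path segment is the branch (hypothetical helper)."""
    return join(branch.strip("/"), relative_path.lstrip("/"))


def blobs_in_branch(all_names, branch):
    """Return the blob names that belong to one branch.

    The trailing slash in the prefix matters: without it,
    'feature-branch1' would also match 'feature-branch10/...'.
    """
    prefix = branch.strip("/") + "/"
    return [name for name in all_names if name.startswith(prefix)]


# Same layout as the example paths above, relative to the container.
names = [
    blob_name("main", "file1.csv"),
    blob_name("feature-branch1", "file1.csv"),
    blob_name("feature-branch2", "file1.csv"),
]
print(blobs_in_branch(names, "feature-branch1"))  # ['feature-branch1/file1.csv']
```

Keeping the prefix logic in one helper means every reader and writer agrees on where a branch's data lives, whichever tool does the actual upload.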

If you found this solution helpful, please consider accepting it and giving it a kudos (Like). It’s greatly appreciated and helps others find the solution more easily.

Best Regards,
Nasif Azam

Did I answer your question?
If so, mark my post as a solution!
Also consider helping someone else in the forums!

Proud to be a Super User!

jaryszek · ‎06-24-2025

Thank you Nasif.

Question about this solution:

/container/main/

Why use one container instead of a multi-container approach, where each container corresponds to a branch name?

Best,
Jacek

Nasif_Azam · ‎06-24-2025

Hey @jaryszek ,

Both the single-container approach and multi-container approach can work, but each has its pros and cons, depending on your specific needs.

Single Container Approach (Using Directories as Branches): In this approach, all your data resides in a single container with folders (directories) acting as branches.

Pros:

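Centralized Management: There is only one container to configure, so access controls and permissions are managed in one place.
Cost-Efficient: Can be cheaper to operate, since there is only one container to configure and monitor. Storage itself is billed by data volume and operations, not per container.
Data Organization: Simplifies querying and listing data across branches, especially with a hierarchical namespace (ADLS Gen2).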

Cons:

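Scalability Limitations: Listing and scanning can slow down as data grows, particularly in directories holding very large numbers of files.
Access Control Granularity: Harder to enforce strict access controls for different teams within the same container. The narrowest scope for an Azure RBAC role assignment is a container, so per-branch permissions typically need ADLS Gen2 directory ACLs.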

Multi-Container Approach (Each Container is a Branch): In this approach, each branch gets its own container.

Pros:

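Granular Access Control: Permissions (for example, Azure RBAC role assignments) can be set per container, giving each team full control over its own branch.
Scalability: Each branch has its own namespace, so listing one branch never touches another branch’s data. Note that containers in the same storage account still share that account’s throughput and request limits; full performance isolation needs separate storage accounts.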
Clear Separation: Independent data storage for each branch helps with version management and minimizes accidental overwrites.

Cons:

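Complexity: Every new branch needs a new container plus its own configuration (permissions, policies, pipeline parameters), which adds overhead.
Potential Cost Increase: Multiple containers may increase management costs. Storage costs depend on how much data is duplicated across branches rather than on the number of containers.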

Recommendation:

Single Container Approach: Ideal for smaller projects or when aiming to reduce complexity. It uses directories within a single container to represent branches.
Multi-Container Approach: Suitable for projects requiring clear separation, better access control, or scalability with independent teams. Each branch gets its own container for better management and isolation.

It all depends on your workflow, access control needs, and the scale of the data.
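To make the two layouts concrete, here is a small Python sketch that maps a branch and a file path to a (container, blob name) pair under each approach. It is only an illustration under stated assumptions: the function names, the `layout` values, and the shared container name `data` are hypothetical. One practical difference it highlights is that container names are more restricted than folder names (3 to 63 characters, lowercase letters, digits, and hyphens, with no leading, trailing, or consecutive hyphens), so a Git branch such as `feature/Login_UI` has to be sanitized before it can become a container.

```python
import re


def container_name_for(branch: str) -> str:
    """Turn a Git branch name into a valid container name (hypothetical scheme)."""
    # Lowercase, then collapse every run of invalid characters into one hyphen.
    name = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    if not 3 <= len(name) <= 63:
        raise ValueError(f"cannot derive a 3-63 character container name from {branch!r}")
    return name


def locate(branch: str, relative_path: str, layout: str, shared_container: str = "data"):
    """Return (container, blob_name) for a file under the chosen layout."""
    if layout == "single":
        # One shared container; the branch is the leading folder of the blob name.
        return shared_container, f"{branch}/{relative_path}"
    if layout == "multi":
        # One container per branch; the blob name carries no branch prefix.
        return container_name_for(branch), relative_path
    raise ValueError(f"unknown layout: {layout!r}")


print(locate("main", "file1.csv", "single"))             # ('data', 'main/file1.csv')
print(locate("feature/Login_UI", "file1.csv", "multi"))  # ('feature-login-ui', 'file1.csv')
```

Note that this sanitizing scheme can map two different branches (for example `feature/x` and `feature-x`) to the same container name, so a real setup would need a collision check.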

If you found this solution helpful, please consider accepting it and giving it a kudos (Like). It’s greatly appreciated and helps others find the solution more easily.

Best Regards,
Nasif Azam

v-echaithra · ‎06-19-2025

Hi @jaryszek ,

Thank you for reaching out to Microsoft Community.

Azure Data Lake Storage (ADLS), particularly Gen2, supports a hierarchical namespace, which allows you to organize all the objects and files within your storage account into a hierarchy of directories and nested subdirectories, similar to folders in a file system. These folders organize data but don’t function like Git branches: there is no built-in merge, diff, or history between them. You can still simulate branches with folders. In standard Blob Storage, blob names can include slashes (/), which creates a virtual directory structure. For example:

projectA/branch1/data.csv
projectA/branch2/data.csv

This way, each “branch” is just a prefix in the blob name. You can then use tools such as Azure Data Factory to automate data movement between branches (folders), Synapse to query data across folders using wildcard paths, GitHub Actions or Azure DevOps Pipelines to push data from different Git branches into the corresponding folders in Blob Storage or ADLS, or Power BI to dynamically access blobs based on the branch name.
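Because a branch is only a prefix, "creating" a branch from an existing one amounts to copying every blob under one prefix to another. The Python sketch below computes such a copy plan for the projectA/branch1 layout shown above. It only works on names and does not touch storage; the function name `plan_branch_copy` and the target branch `branch3` are hypothetical, and the actual copy would be carried out by whichever tool you choose (for example, a Data Factory pipeline).

```python
def plan_branch_copy(blob_names, project, src_branch, dst_branch):
    """Return (source, destination) blob-name pairs for copying one branch to another."""
    src_prefix = f"{project}/{src_branch}/"
    dst_prefix = f"{project}/{dst_branch}/"
    return [
        # Swap the source prefix for the destination prefix, keep the rest of the path.
        (name, dst_prefix + name[len(src_prefix):])
        for name in blob_names
        if name.startswith(src_prefix)
    ]


existing = [
    "projectA/branch1/data.csv",
    "projectA/branch2/data.csv",
]
for src, dst in plan_branch_copy(existing, "projectA", "branch1", "branch3"):
    print(f"{src} -> {dst}")  # projectA/branch1/data.csv -> projectA/branch3/data.csv
```

Computing the plan separately from executing it makes it easy to review what a branch operation would change before any data is moved.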

If this helped, please mark it as the solution so others can benefit too. And if you found it useful, kudos are always appreciated.

Thanks,
Chaithra E.
