RomainB
New Member

Number of datasets

What is the best practice for a company:

- Create a single data set and base all reports on it

- create a data set based on

1 ACCEPTED SOLUTION
123abc
Community Champion

The best practice for managing datasets within a company can vary depending on the specific needs and requirements of the organization. Both approaches you mentioned have their advantages and disadvantages, and the choice between them should be made considering factors like data volume, data complexity, scalability, and business goals. Let's explore both options:

  1. Create a Single Data Set and Base All Reports on It:

    • Advantages:

      • Consistency: Using a single, centralized dataset can ensure that all reports and analyses are based on the same source of truth, leading to consistency in decision-making.
      • Simplicity: Managing one dataset can be simpler and easier to maintain than multiple datasets, especially for smaller organizations or less complex data needs.
      • Reduced Redundancy: This approach can help reduce data redundancy and potential conflicts.
    • Disadvantages:

      • Limited Flexibility: A single dataset may not accommodate all reporting and analysis needs, leading to limitations in the types of insights that can be generated.
      • Scalability Issues: As the organization grows and data volume increases, a single dataset may become unwieldy and less performant.
      • Data Governance Challenges: Ensuring data quality, security, and compliance may be more challenging with a single, central dataset.
  2. Create Multiple Data Sets Based on Use Cases:

    • Advantages:

      • Flexibility: Creating datasets tailored to specific use cases allows for more flexibility in data analysis and reporting. Different teams can have datasets optimized for their needs.
      • Scalability: This approach can scale more easily as the organization grows, as new datasets can be created as needed.
      • Better Performance: Smaller, specialized datasets may perform better for specific types of analyses.
    • Disadvantages:

      • Potential for Inconsistency: Managing multiple datasets can lead to data inconsistencies and challenges in aligning different analyses.
      • Complexity: Maintaining multiple datasets requires more effort in terms of data governance, quality control, and documentation.
      • Increased Overhead: Managing multiple datasets can lead to increased storage and computational costs.

In practice, many organizations adopt a hybrid approach, combining the two strategies:

  • Centralized Data Warehouse: Maintain a centralized data warehouse or data lake where raw data is stored and transformed into a common dataset. This common dataset can serve as a foundation for important enterprise-level reporting and analysis.

  • Specialized Datasets: Create specialized datasets or data marts for specific departments or use cases. These can be derived from the central dataset and customized to meet specific requirements.

  • Data Governance: Implement strong data governance practices to ensure data quality, security, and compliance across all datasets.
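
As a rough illustration of the hybrid idea above, here is a minimal Python sketch in which department-level datasets are derived from one central dataset. The table contents, the column names, and the `build_mart` helper are all hypothetical and only meant to show the shape of the approach, not how Power BI implements it.

```python
from collections import defaultdict

# Hypothetical central dataset: one row per sale, shared by every report.
CENTRAL_SALES = [
    {"region": "EU", "department": "retail", "product": "A", "amount": 120.0},
    {"region": "EU", "department": "online", "product": "B", "amount": 80.0},
    {"region": "US", "department": "retail", "product": "A", "amount": 200.0},
    {"region": "US", "department": "online", "product": "A", "amount": 50.0},
]


def build_mart(rows, department):
    """Derive a department-specific dataset: filter central rows, total by region."""
    totals = defaultdict(float)
    for row in rows:
        if row["department"] == department:
            totals[row["region"]] += row["amount"]
    return dict(totals)


# Specialized datasets, both derived from the same central source.
retail_mart = build_mart(CENTRAL_SALES, "retail")
online_mart = build_mart(CENTRAL_SALES, "online")

# Simple governance check: the marts together must match the central total.
central_total = sum(row["amount"] for row in CENTRAL_SALES)
marts_total = sum(retail_mart.values()) + sum(online_mart.values())
assert marts_total == central_total

print(retail_mart)
print(online_mart)
```

Because every specialized dataset is computed from the same rows, a reconciliation check like the final assertion is one way to catch the inconsistencies listed among the disadvantages of multiple datasets.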

The choice between these approaches should be driven by the organization's unique needs, resources, and goals. It's important to regularly review and adapt the data management strategy as the company evolves and its data requirements change.

 

If I answered your question, please mark my post as the solution. I appreciate your kudos!

3 REPLIES
Idrissshatila
Super User

Hello @RomainB,

I would recommend importing only the data that matches the requirements you want to achieve.

Also, if you can do the transformations in the data source, do them there; otherwise, do them in Power Query.

Finally, build your data model as a star schema: https://learn.microsoft.com/en-us/power-bi/guidance/star-schema
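
To make the star schema suggestion concrete, here is a small Python sketch with one fact table and two dimension tables. All table names, keys, and values are made up for illustration, and the `revenue_by` helper is a hypothetical stand-in for what a measure sliced by a dimension column would do in a real model.

```python
# Hypothetical dimension tables: one row per entity, keyed by an id.
DIM_PRODUCT = {
    1: {"name": "Laptop", "category": "Hardware"},
    2: {"name": "Mouse", "category": "Hardware"},
    3: {"name": "Antivirus", "category": "Software"},
}
DIM_DATE = {
    20240101: {"year": 2024, "month": 1},
    20240201: {"year": 2024, "month": 2},
}

# Hypothetical fact table: numeric measures plus keys pointing at the dimensions.
FACT_SALES = [
    {"product_id": 1, "date_id": 20240101, "quantity": 2, "revenue": 2000.0},
    {"product_id": 2, "date_id": 20240101, "quantity": 5, "revenue": 100.0},
    {"product_id": 3, "date_id": 20240201, "quantity": 10, "revenue": 300.0},
    {"product_id": 1, "date_id": 20240201, "quantity": 1, "revenue": 1000.0},
]


def revenue_by(fact_rows, dimension, key, attribute):
    """Sum revenue grouped by one attribute looked up in a dimension table."""
    totals = {}
    for row in fact_rows:
        group = dimension[row[key]][attribute]
        totals[group] = totals.get(group, 0.0) + row["revenue"]
    return totals


revenue_by_category = revenue_by(FACT_SALES, DIM_PRODUCT, "product_id", "category")
revenue_by_month = revenue_by(FACT_SALES, DIM_DATE, "date_id", "month")

print(revenue_by_category)
print(revenue_by_month)
```

The point of the layout is that descriptive attributes live once in the dimension tables, while the fact table only holds keys and numbers, so the same facts can be sliced by any dimension without duplicating data.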

 

If I answered your question, please mark my post as the solution. I appreciate your kudos 👍

Thanks for your response, I think I have found my solution.
