Solved: How to design and build data sets of large amount ...

Anonymous · 11-05-2018

I am new to Power BI. Please excuse any terminology not consistent with Power BI. I have been placed in charge of building dashboards for a large amount of data. I am trying to determine how to design the data sets to be used by Power BI.

As background, in the example below I have built some aggregate tables for use in MS SQL reporting. Here is a quick list of those aggregate tables, each of which corresponds to a level in reporting:

1. Region - additional categories - Month for 6 metrics

2. Market - Region - additional categories - Month for 6 metrics

3. Company - Market - Region - additional categories - Month for 6 metrics

4. Then the Contact table and fact tables

This works well in a SQL world. How do these "levels" get translated into a Power BI world?

Here is our data. Our basic facts are in 3 tables of 40 million, 5 million, and 1 million records. The fact records hold information about Contacts and three different types of activities by date for the last 36 months. Our Contacts belong to Companies, which are categorized by Market, and then by Region. The hierarchy is like this:

Region - 10

Market - 55

Company - 6,000

Contacts - 150,000

Our users typically analyze monthly fact counts by region, market, and company. The users want to interact with the dashboards to:

1. Compare 6 metrics based on counts by Region.

2. Then drill to Markets within any one Region.

3. Then compare Companies, usually showing the top 20 Companies for a given month. Here we need a selector or filter to allow us to pick a month, and show the top 20 companies for that month.

4. Then click on a Company and show the Contacts for that Company for that Month.

Note: realising how much data there is, we are considering limiting the dashboards to the last three months only. This would limit the facts to about 9 million records.

What I am not sure of is this:

- Should I build a data set for Region and Market, and then one for each Market and the Companies in each Market?

- And how do I handle Company/Contact data?

Or, in general, when working with large amounts of data, how do others decide at which levels (such as region, market, company) to build data sets?

Thanks for any advice.

This post is intended to give me the starting point to map out the data sets I need to build.

v-juanli-msft · 11-06-2018

Hi @Anonymous

1. Compare 6 metrics based on counts by Region.

I don't understand what "Compare 6 metrics" means. Please give me an example.

Counts by Region can be achieved with a measure:

counts by region = CALCULATE(COUNT('fact'[contacts]),ALLEXCEPT(contact,contact[region]))

Before that, I created a relationship between the "contact" and "fact" tables based on the "contact" column.

2. Then drill to Markets within any one Region

Add the "Region", "Market", and "Company" columns to the row field of the matrix visual, and the measure to the value field.

Measure:

count change with hierarchy = COUNT('fact'[contacts])

Then drill down from the "Region" level to the "Market" level.

References:

Using drill down with the Matrix visual

Drill mode in a visualization in Power BI
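To make the idea in step 2 concrete, here is a minimal sketch in plain Python (not DAX) of how a single count over one fact table yields the numbers for every hierarchy level, which is what the matrix does when you drill down. The rows, names, and helper functions are made up for illustration; they are not part of Power BI.

```python
from collections import Counter

# Hypothetical fact rows: (region, market, company, contact). Values are made up.
FACT_ROWS = [
    ("East", "Retail", "Acme", "c1"),
    ("East", "Retail", "Acme", "c2"),
    ("East", "Retail", "Bolt", "c3"),
    ("East", "Health", "Cura", "c4"),
    ("West", "Retail", "Dyna", "c5"),
]


def count_at_level(rows, depth):
    """Count fact rows grouped by the first `depth` hierarchy columns."""
    return Counter(row[:depth] for row in rows)


def drill_down(rows, parent):
    """Counts one level below `parent`, a tuple such as ("East",)."""
    depth = len(parent) + 1
    children = (row for row in rows if row[:len(parent)] == parent)
    return count_at_level(children, depth)


by_region = count_at_level(FACT_ROWS, 1)         # Region level
east_markets = drill_down(FACT_ROWS, ("East",))  # Markets within one Region
print(dict(by_region))
print(dict(east_markets))
```

The same counting rule is applied at each level; only the grouping columns change, so no separate table per level is needed in this sketch.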

3. Then compare Companies, usually showing the top 20 Companies for a given month. Here we need a selector or filter to allow us to pick a month, and show the top 20 companies for that month.

Create a rank measure, then add this measure to the visual level filter to show items when the value is less than or equal to 20.

counts by companies = CALCULATE(COUNT('fact'[contacts]),ALLEXCEPT(contact,contact[companies]))

rank = RANKX(ALLSELECTED(contact),[counts by companies],,DESC,Dense)
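As a rough illustration of what the rank measure plus the "less than or equal to 20" filter does, here is a small plain-Python sketch (not DAX) of dense ranking companies by count for one selected month. The data and function names are hypothetical.

```python
from collections import Counter

# Hypothetical (month, company) pairs, one per fact record.
FACTS = [
    ("2018-10", "Acme"), ("2018-10", "Acme"), ("2018-10", "Acme"),
    ("2018-10", "Bolt"), ("2018-10", "Bolt"),
    ("2018-10", "Cura"), ("2018-10", "Cura"),
    ("2018-10", "Dyna"),
    ("2018-09", "Dyna"), ("2018-09", "Dyna"),
]


def dense_rank(counts):
    """Map each key to its dense rank by descending count (ties share a rank)."""
    distinct = sorted(set(counts.values()), reverse=True)
    rank_of = {value: i + 1 for i, value in enumerate(distinct)}
    return {key: rank_of[value] for key, value in counts.items()}


def top_companies(facts, month, n):
    """Companies whose dense rank for `month` is <= n, best first."""
    counts = Counter(company for m, company in facts if m == month)
    ranks = dense_rank(counts)
    kept = [c for c in counts if ranks[c] <= n]
    return sorted(kept, key=lambda c: (ranks[c], c))


# Month chosen by the user, as a slicer would do in the report.
october_top2 = top_companies(FACTS, "2018-10", 2)
print(october_top2)
```

Note that with dense ranking, tied companies share a rank, so a "rank <= N" filter can return more than N companies (here three companies pass a top-2 filter).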

To keep the other visuals on the dashboard from being affected by the slicer, please edit the interactions as described in the article below:

Visualization interactions in a Power BI report

4. Then the users want to click on a Company and show the Contacts for that Company for that Month.

Based on step 3, add the "company" column to the slicer, then add [counts by companies] to a card visual.

Best Regards

Maggie

Anonymous · 11-07-2018

Thanks for your post.

What I am trying to figure out is this:

- Do I load all 46 million fact records into a pbix? Or

- Do I break up the data in some fashion for faster-loading dashboards? Maybe by region and market, or by market and company?

v-juanli-msft · 11-11-2018

Hi @Anonymous

Loading all 46 million fact records into a pbix may slow down performance.

If you need all the data, then after importing it you could follow these tips to improve performance.

If you need only some of the data, you could use a parameter when connecting to the data source and use that parameter to filter the data.

For more details, please read through this article.

https://www.red-gate.com/simple-talk/sql/bi/power-bi-introduction-working-with-parameters-in-power-bi-desktop-part-4/
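As a rough sketch of the filtering idea, assuming the parameter is a "number of months to keep" (such as the three months mentioned in the original question), the logic looks like this in plain Python. This is not Power Query code; in Power BI the parameter would be defined in the query editor as the article describes, and the row layout and function names here are hypothetical.

```python
from datetime import date


def months_back_start(today, months):
    """First day of the month that is `months - 1` months before today's month."""
    index = today.year * 12 + (today.month - 1) - (months - 1)
    return date(index // 12, index % 12 + 1, 1)


def filter_recent(rows, today, months):
    """Keep rows whose activity date falls in the last `months` calendar months."""
    cutoff = months_back_start(today, months)
    return [row for row in rows if row[0] >= cutoff]


# Hypothetical fact rows: (activity_date, contact).
rows = [
    (date(2018, 8, 31), "a"),
    (date(2018, 9, 1), "b"),
    (date(2018, 11, 5), "c"),
]
recent = filter_recent(rows, today=date(2018, 11, 11), months=3)
print(recent)
```

Applying the cutoff at the source, before the data is loaded, is what keeps the imported model small; filtering after import would not reduce the load time.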

Best Regards

Maggie

Anonymous · 11-06-2018

My question is about how to split data sets. My original post may have been complicated, but I am looking for advice on how to approach splitting my data to support our dashboarding needs.

How have others approached data set splitting?
