PBI_AMS
Helper I

Steps for controlling volume of data from Dataverse to Datamart?

I have several large MS Dynamics CRM tables that I want to bring into a Datamart for easier report building in Power BI and sharing with other users. The main tables are Contacts, Accounts, and Opportunities, and I'm accessing them through Dataverse. Each of these has far too many rows and columns, and I've succeeded in getting smaller versions into a Datamart by choosing just the columns I want and by filtering the "Created On" date column to the last two years. So it works. BUT this is not what I actually want. I want the last two years of Opportunity data, then any Accounts related to an Opportunity, and any Contacts related to that subset of Accounts. So I'm really trying to understand what technique I need to learn to filter the data coming from Dataverse into my Datamart so I only get the Opportunities from the last two years and their related Accounts and Contacts. My assumption is that I need to pull in my Opportunity data and then add some sort of query (DAX? SQL?) to select ONLY Accounts where the accountid matches the accountid on an Opportunity. Any suggestions?
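A minimal Power Query (M) sketch of the two-year filter part, assuming the Opportunities table is already loaded from Dataverse as a query named Opportunities and that the "Created On" column's logical name is createdon (both names are assumptions):

    let
        // Assumed: the Opportunities table as loaded from Dataverse,
        // with only the needed columns already selected.
        Source = Opportunities,
        // Cutoff: two years before today.
        TwoYearsAgo = Date.AddYears(DateTime.LocalNow(), -2),
        // Keep only recent rows; with the Dataverse connector this filter
        // can usually fold back to the source, so older rows are never downloaded.
        RecentOpportunities = Table.SelectRows(Source, each [createdon] >= TwoYearsAgo)
    in
        RecentOpportunities

The related-Accounts and related-Contacts part of the question is addressed further down the thread.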


6 REPLIES
PBI_AMS
Helper I

Okay, again with my ignorance here. I'm adding steps, but in the cause of controlling the data upstream from my reports, so I'm okay with that. I don't think I really have an alternative if I want to use Dataverse data.

 

When you say disk based, do you mean on my desktop computer? I should be clear that I'm using Power Query in the service. I know there's a disk somewhere, but my experience has been that it's much faster than going directly to Dataverse, and that's confirmed by Performance Analyzer in the reports I build in Desktop.

 

I think, more globally, I'm just having a problem zeroing in on where I should be looking for best practices for this sort of situation. To put it in terms I'm absorbing from the Explicit Measures podcast, I'm sort of an advanced business user, not a data engineer by any means, and I'm not going to get attention from any data engineers anytime soon! So, basically: what's the most efficient way to assemble a data source that includes data from the company warehouse AND data that I supply from business units, which I and other users can reuse for different reports, whether built in the Power BI service or in Desktop?

Performance Analyzer is for DAX; you probably meant Query Diagnostics?

 

There is no best practice for your situation, only more-bad and less-bad options. I fully understand the limitations you have (we have the same), but that still doesn't allow me to recommend either of these approaches to you.

 

The most efficient way, in my opinion, is a (promoted/certified) dataset.

Yes, sorry. I do get those confused. And I wish we had someone who was creating promoted/certified datasets! One thing I am learning from trying to do this myself is that we are woefully under-resourced for our data infrastructure.

lbendlin
Super User

You should not need to use a datamart if you already have a dataverse. That's a step back. 

 

What are your main concerns with using the dataverse as is?

I'm just going to put my ignorance out here, but here's my experience so far. I have very limited read access to Dataverse. If I access it directly from Power BI Desktop or the service, I have to build my data model anew each time. What I like about the Datamart so far is that I can build a reusable data model in it, and I can add other data to it, like a table of individual sales goals or a calendar table. And so far, writing reports that refer to the Datamart is much faster than my reports that go directly to Dataverse. Does that make sense? Is there something I haven't understood about Dataverse?
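As an aside, a calendar table like the one mentioned above can be generated directly in Power Query (M) rather than imported; a minimal sketch with an illustrative date range:

    let
        // Illustrative range; widen or parameterize as needed.
        StartDate = #date(2022, 1, 1),
        EndDate   = Date.From(DateTime.LocalNow()),
        // One row per day, inclusive of both endpoints.
        DayCount  = Duration.Days(EndDate - StartDate) + 1,
        Dates     = List.Dates(StartDate, DayCount, #duration(1, 0, 0, 0)),
        AsTable   = Table.FromList(Dates, Splitter.SplitByNothing(), {"Date"}),
        Typed     = Table.TransformColumnTypes(AsTable, {{"Date", type date}}),
        // Two common attributes; add quarter, week, etc. the same way.
        AddYear   = Table.AddColumn(Typed, "Year", each Date.Year([Date]), Int64.Type),
        AddMonth  = Table.AddColumn(AddYear, "Month", each Date.Month([Date]), Int64.Type)
    in
        AddMonth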

 

I should note that I think I figured out at least the start of my solution by using the Merge Queries feature in Power Query. I started with the Opportunity table and created a left join with the Accounts table. But I'm wondering if there is a more efficient solution.
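For reference, here is the merge just described, flipped from a left join to an inner join so that only Accounts matching at least one recent Opportunity are kept; the query and key names (RecentOpportunities, Accounts, accountid) are assumptions based on the thread:

    let
        // Assumed queries: RecentOpportunities (already filtered to the last
        // two years) and Accounts, both from the same Dataverse source.
        // JoinKind.Inner drops Accounts with no matching Opportunity,
        // unlike the left join described above.
        Joined = Table.NestedJoin(
            Accounts, {"accountid"},
            RecentOpportunities, {"accountid"},
            "Opportunities", JoinKind.Inner),
        // The nested join column isn't needed; only the Account rows are.
        RelatedAccounts = Table.RemoveColumns(Joined, {"Opportunities"})
    in
        RelatedAccounts

The same pattern would then repeat for Contacts against the filtered Accounts.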

What you are doing with a datamart is duplicating your data subject infrastructure (i.e., the company enterprise data warehouse, data lake, etc.), often for the reasons you mention (insufficient or blocked access to the EDW). You are adding steps to a process rather than taking steps away.

 

Merging queries in Power Query is highly discouraged, as you will run into issues with the formula firewall sooner rather than later. Also note that Power Query is disk based, so performance will be poor (Power BI's VertiPaq engine runs in memory).
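If the merge step itself is to be avoided, given the warning above, a key-list filter is one alternative; a minimal sketch under the same naming assumptions, with the caveat that the row-by-row membership test generally will not fold back to Dataverse:

    let
        // Buffer the distinct Opportunity keys once so the list is not
        // re-evaluated for every Accounts row.
        AccountIds = List.Buffer(
            List.Distinct(Table.Column(RecentOpportunities, "accountid"))),
        // Keep only Accounts referenced by a recent Opportunity.
        RelatedAccounts = Table.SelectRows(
            Accounts, each List.Contains(AccountIds, [accountid]))
    in
        RelatedAccounts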
