Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Earn a 50% discount on the DP-600 certification exam by completing the Fabric 30 Days to Learn It challenge.

Reply
Anonymous
Not applicable

Excel 2016 Import Data via Power Pivot (Data Model) vs PowerQuery

Hi, very basic question here.

 

I am very new to the whole PowerBi World, I depend a lot on it for our daily reporting, but I reckon I am NOT using it in the best way.

 

Let's start with a simple question: how to import the data, this is how I am going about it and these are the issues I have.

 

  1. Our ERP software is configured to export 4 report files to a specific location on a network drive, once a day.
  2. I created 4 different queries that fetch data (3 xls files and 1 csv), carry out a lot of cleaning / sorting operations on them
  3. Some of the queries are merged since I need to carry out the equivalent of VLOOKUP
  4. I then open the Data Model, add more columns and define the measures
  5. At this point the file is already VERY slow.
  6. Once I have all my data in Excel, I create the Pivots, but it gets confusing, since I see both the Tables and the Queries in the Pivot Field list. What should I do?

 

  • Can I just remove all the worksheets and create my pivots based ONLY on the Data Model? Would it be quicker?
  • What if I do nOT use the PowerQuery but instead import the data straight from the Data Model? Is it cleaner / simpler?
1 ACCEPTED SOLUTION
edhans
Super User
Super User

Unless you are accessing SSAS tabular data, you should always use Power Query to import your data. It does the transformations. You cannot do any transformations in Power Pivot's data model. You can add columns, but you cannot filter, remove data, or otherwise transform. Any filtering is visual only, and does not limit what is loaded into the model.

 

When building your data, you shouldn't have anything actually in the Excel spreadsheets. If you are loading data, load only to the data model as below. The Table section should be "Connection Only"

 

20191025 14_07_07-Book1 - Excel.png

 

Even if you do have data in a table you load to Power Query, do some transformations, then load directly into the model, that will show your original table plus the query. Just use the tables/queries that have the little yellow database icon next to them. Once you get going, you just use the part of the Pivot Fields window that shows Active connections in this Pivot, vs All.

 

You should not add Calculated Columns as a rule to your model. If you can do the math to add a calculated column, do it in Power Query. Calculated columns will not compress, take up more room, and can slow the model down depending on the number of rows, especially on a refresh. There are some limited reasons to do it in Power BI when you need it for a slicer and cannot create it in Power Query, or there is some more complex calculation that isn't feasable or possible in Power Query (this is pretty rare though).

 

You shouldn't do any work in Excel's spreadsheets either. No vlookups, text concatenations, etc. Do it ALL in Power Query, then load those only into the data model. I have some very Power Query models with over 100 queries that eventually boil down to 10-12 that get loaded to the Data Model, and then I start working on the Pivot Table. There is not a single cell of data in Excel's spreadsheet side until I get to that point. 100% is imported directly into PQ.

 

As for speed, it depends on where your data is. If you are using Sharepoint, you'll usually want to change the initial connection string from SharePoint.Files to SharePoint.Contents, then go to the folder you want. SharePoint.Files loads every file name and other metadata before it will move to the actual file(s) you want. If OneDrive, it just depends on your connection. If it is a lot of data (hundreds of thousands of records) consider DataLake. We throw tons of text files into DataLake and use PQ to transform the data before pulling into Power Pivot or Power BI's data model.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

View solution in original post

1 REPLY 1
edhans
Super User
Super User

Unless you are accessing SSAS tabular data, you should always use Power Query to import your data. It does the transformations. You cannot do any transformations in Power Pivot's data model. You can add columns, but you cannot filter, remove data, or otherwise transform. Any filtering is visual only, and does not limit what is loaded into the model.

 

When building your data, you shouldn't have anything actually in the Excel spreadsheets. If you are loading data, load only to the data model as below. The Table section should be "Connection Only"

 

20191025 14_07_07-Book1 - Excel.png

 

Even if you do have data in a table you load to Power Query, do some transformations, then load directly into the model, that will show your original table plus the query. Just use the tables/queries that have the little yellow database icon next to them. Once you get going, you just use the part of the Pivot Fields window that shows Active connections in this Pivot, vs All.

 

You should not add Calculated Columns as a rule to your model. If you can do the math to add a calculated column, do it in Power Query. Calculated columns will not compress, take up more room, and can slow the model down depending on the number of rows, especially on a refresh. There are some limited reasons to do it in Power BI when you need it for a slicer and cannot create it in Power Query, or there is some more complex calculation that isn't feasable or possible in Power Query (this is pretty rare though).

 

You shouldn't do any work in Excel's spreadsheets either. No vlookups, text concatenations, etc. Do it ALL in Power Query, then load those only into the data model. I have some very Power Query models with over 100 queries that eventually boil down to 10-12 that get loaded to the Data Model, and then I start working on the Pivot Table. There is not a single cell of data in Excel's spreadsheet side until I get to that point. 100% is imported directly into PQ.

 

As for speed, it depends on where your data is. If you are using Sharepoint, you'll usually want to change the initial connection string from SharePoint.Files to SharePoint.Contents, then go to the folder you want. SharePoint.Files loads every file name and other metadata before it will move to the actual file(s) you want. If OneDrive, it just depends on your connection. If it is a lot of data (hundreds of thousands of records) consider DataLake. We throw tons of text files into DataLake and use PQ to transform the data before pulling into Power Pivot or Power BI's data model.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting

Helpful resources

Announcements
RTI Forums Carousel3

New forum boards available in Real-Time Intelligence.

Ask questions in Eventhouse and KQL, Eventstream, and Reflex.

MayPowerBICarousel

Power BI Monthly Update - May 2024

Check out the May 2024 Power BI update to learn about new features.

LearnSurvey

Fabric certifications survey

Certification feedback opportunity for the community.

Top Solution Authors
Top Kudoed Authors