
Arturo24
Frequent Visitor

Columnar database

Sorry if this is a strange question:

Does it make sense to have a history columnar database and a current columnar database and have the current one append data to the history one? 

I know the engine is very efficient at storing data. Right now I use simple text files exported from an Access database, but constantly importing the same text files over and over seems like it will get super slow over time. So I thought having two columnar databases (and appending current to history) could work. But I'm not sure if this is what this system was designed for.

Any thoughts for this newbie would be appreciated.  Thank you

1 ACCEPTED SOLUTION

You should always use the same bitness of Power BI as you use for Office. That said, you might be able to install the 64-bit version of the Microsoft Access 2010 Redistributable engine, which would allow 64-bit Power BI to read your ACCDB files. But be forewarned: I cannot guarantee that installing that 64-bit package will not break your 32-bit version of Office!
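If that 64-bit ACE engine does install cleanly, connecting straight to the .accdb from Power Query looks roughly like this (the file path and table name below are placeholders, not from the original post):

```
let
    // Requires the 64-bit Access Database Engine (ACE) provider
    Source = Access.Database(File.Contents("C:\Data\MyDatabase.accdb")),
    // Navigate to a table by name; Access databases use an empty schema
    Orders = Source{[Schema = "", Item = "Orders"]}[Data]
in
    Orders
```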

 

As for Power BI retaining history and only getting new data, that is exactly what the incremental refresh feature does. However, it requires a backend database that supports query folding (Oracle, SAP, SQL Server, PostgreSQL, etc., and possibly Access); it will not work with flat files of any sort, because there is no engine on the backend to limit the data. Power Query must process all of the data to know what to keep and what to discard, so it is still doing 95% of the work in the query itself; loading takes comparatively little time. So if you must stick with CSV files, just load them each time.
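Concretely, incremental refresh keys off two reserved DateTime parameters named RangeStart and RangeEnd, and the filter step has to fold back to the source. A sketch against a hypothetical SQL Server table (the server, database, and column names here are assumptions):

```
let
    Source = Sql.Database("myserver", "SalesDB"),
    Sales = Source{[Schema = "dbo", Item = "Sales"]}[Data],
    // RangeStart/RangeEnd are the reserved parameters that incremental
    // refresh substitutes at refresh time; this filter folds to a WHERE clause
    Filtered = Table.SelectRows(
        Sales,
        each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd
    )
in
    Filtered
```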

 

Note that there is an interesting workaround in a blog post by Chris Webb, but I am not sure I would use it in production: you will constantly have to keep track of what is loaded and what isn't, and you become invested in using ALM Toolkit to manage the data in the cloud. To me it is more interesting as an exercise than useful as a production tool.



Did I answer your question? Mark my post as a solution!
Did my answers help arrive at a solution? Give it a kudos by clicking the Thumbs Up!

DAX is for Analysis. Power Query is for Data Modeling


Proud to be a Super User!

MCSA: BI Reporting


3 REPLIES
edhans
Super User

I'm not 100% clear on what you are asking @Arturo24 but a few points:

  1. Why not connect directly to Microsoft Access vs the CSV files? Just use the Access connector.
  2. Reimporting the same files will not get slower over time. Each refresh is a wipe-and-reload, so the model won't get bigger and bigger. Power BI only holds what was in the last refresh; it does not add to the existing data. Existing data is removed.
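To illustrate point 2, a typical folder-of-CSVs query in Power Query looks roughly like the sketch below (the folder path, delimiter, and encoding are assumptions); every refresh re-reads every file and replaces whatever was loaded before:

```
let
    Source = Folder.Files("C:\Exports"),
    CsvOnly = Table.SelectRows(Source, each Text.EndsWith([Extension], ".csv")),
    // Parse each file; all files must share the same column layout
    Parsed = Table.AddColumn(
        CsvOnly,
        "Data",
        each Table.PromoteHeaders(
            Csv.Document([Content], [Delimiter = ",", Encoding = 65001])
        )
    ),
    Combined = Table.Combine(Parsed[Data])
in
    Combined
```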

If you want to keep history and only import new data, then Incremental Refresh is what you need. Theoretically you can do an incremental refresh against an Access database, as Power Query will fold some statements to it. You definitely cannot against CSV files. You have to have a database engine (even the Access JET engine) for this to work, but without your database to play with, I cannot confirm it will work.

Incremental Refresh is really made for VERY VERY large datasets found on SQL Server and other relational databases. It wouldn't matter much on Access unless the data was well over a few hundred MB in size, which would be millions of records.




Good evening. I first wrote this question in English and it was translated into Spanish. Interesting! I appreciate your answer. The problem is that the Power BI I use is 64-bit and the MS Access I use is 32-bit, and that prevents a direct connection. I know that having all the CSVs in the same folder lets you import each CSV, as long as they have their data in the same shape (columns with the same characteristics).

But if there are hundreds of text files, each with millions of lines, I'm wasting time reimporting data that is already in Power BI every time there is a new CSV. That's why I thought to ask whether it was possible to have one history database and another with new data, and just append what's new to the history. But from what you're telling me, using two distinct databases in the Power BI model isn't typical.

Again I appreciate your response and continue to investigate.

