elaj
Helper IV

Working together on same datasource

Hi,

We work on our data as a team. Because the Query Editor cannot handle our typical survey data (too many columns), we restructure the data in R (mainly by verticalizing it) and then load it into Power BI. We will get a new wave every month for several years, so we will prepare each wave in R and append it to the existing data. We would like to connect our source to the Power BI service so that we don't have to upload the .pbix every time and can instead just update the data source and refresh.

Since we work together as a team and the data source will be huge, we want to share the source database (or source files). If somebody changes something in the R script and generates a new output, everybody's Power BI Desktop should be able to pull it via refresh (and the Power BI service should as well).

We tried MySQL, and it seems this approach has two drawbacks:

  • It takes a long time to build, and we will have frequent corrections and changes, which makes it very time-consuming.
  • It needs a gateway, because the Power BI service cannot connect to it otherwise.

Then I thought about Parquet files. They are lightweight and easy to replace. But how do we connect to them from the Power BI service?

We have SharePoint, but Parquet files cause problems when loaded from SharePoint. (CSV would work, BUT a 450 MB Parquet file would be a 5.5 GB CSV, which would again take a long time to upload whenever it changes.)

Is anybody here facing the same problem and already has a solution? We don't want to pay for Azure services unless it is not possible otherwise.

Thank you for your help.

1 REPLY
Sahir_Maharaj
Super User

Hello @elaj,

Power BI Dataflows can be a good solution for team collaboration. You can create dataflows and schedule them to refresh data from various sources, including the output of your R scripts. Dataflows also support storing their data in Azure Data Lake Storage Gen2, which is a cost-effective option compared to other Azure services.


For the SharePoint issue with Parquet files, consider automating the conversion of the Parquet files into a more SharePoint-friendly format (such as the CSV you mentioned), if needed.
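One way such a conversion step might look is sketched below in Python, using only the standard library. It is a rough illustration, not a tested pipeline: it assumes the verticalized rows are already available as dictionaries (reading the Parquet file itself would need an extra library and is not shown), and the function name `write_wave_files`, the `wave` column, and the `wave_<id>.csv.gz` file naming are all made up for this example. The idea is to write one gzip-compressed CSV per wave, so that a monthly update or a correction only requires re-uploading the affected wave rather than the full 5.5 GB file.

```python
import csv
import gzip
from collections import defaultdict
from pathlib import Path


def write_wave_files(rows, out_dir, wave_field="wave"):
    """Write one gzip-compressed CSV per survey wave and return the file paths.

    `rows` is an iterable of dicts in long (verticalized) format.
    The column name in `wave_field` is a hypothetical example.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group the rows by wave so each wave ends up in its own file.
    by_wave = defaultdict(list)
    for row in rows:
        by_wave[row[wave_field]].append(row)

    paths = []
    for wave, wave_rows in sorted(by_wave.items()):
        path = out_dir / f"wave_{wave}.csv.gz"
        fieldnames = list(wave_rows[0].keys())
        # Text-mode gzip keeps the file small while staying plain CSV inside.
        with gzip.open(path, "wt", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(wave_rows)
        paths.append(path)
    return paths
```

On the Power BI side, the files in the folder would then need to be decompressed and combined; as far as I know, Power Query can do this with `Binary.Decompress`, but please verify that in your environment.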


Also, if you want to avoid heavy database solutions, consider a lightweight database like SQLite. It is file-based, supports concurrent reads (writes are serialized, one writer at a time), and can be a good intermediate storage solution.
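To illustrate how SQLite could cope with the frequent corrections you describe, here is a minimal Python sketch using the standard `sqlite3` module. The table name `responses`, its columns, and the function `replace_wave` are assumptions made up for this example, not anything from your setup. It deletes and re-inserts one wave inside a single transaction, so re-running a corrected wave does not create duplicates.

```python
import sqlite3


def replace_wave(db_path, wave, rows):
    """Replace all rows of one wave in a single transaction.

    `rows` is a list of (respondent_id, question, answer) tuples.
    Table and column names are hypothetical examples.
    Returns the number of rows stored for that wave afterwards.
    """
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back if an error is raised
            conn.execute(
                """CREATE TABLE IF NOT EXISTS responses (
                       wave TEXT NOT NULL,
                       respondent_id INTEGER NOT NULL,
                       question TEXT NOT NULL,
                       answer TEXT
                   )"""
            )
            # Remove any earlier version of this wave before inserting it again.
            conn.execute("DELETE FROM responses WHERE wave = ?", (wave,))
            conn.executemany(
                "INSERT INTO responses (wave, respondent_id, question, answer) "
                "VALUES (?, ?, ?, ?)",
                [(wave, r, q, a) for r, q, a in rows],
            )
        count = conn.execute(
            "SELECT COUNT(*) FROM responses WHERE wave = ?", (wave,)
        ).fetchone()[0]
        return count
    finally:
        conn.close()
```

Calling `replace_wave("survey.sqlite", "2024-01", corrected_rows)` a second time simply overwrites that wave. Keep in mind that the SQLite documentation advises caution when the database file sits on a network share, so it is worth testing this before relying on it for a whole team.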


Should you have any questions or need further assistance, please do not hesitate to reach out to me.


Kind Regards,
Sahir Maharaj
