
zenisekd

Snapshot data from Power BI Report automatically and feed them back with Fabric

Foreword

For some time I was dreaming of automatically saving calculated data from my report on a regular basis and using it to show the historical trend. My use case was that every quarter, commissions are paid to employees based on their performance. The problem is that the numbers may change over time as some confirmed orders get canceled: at the time of the compensation calculation the sale is included, while a week later, when the sale is canceled, it is excluded. This can change the compensation amount and cause discussions about why the number the commission was based on has changed. That is why we wanted to snapshot the amounts, so we could reliably say what amount was paid out to the employees over time.

Originally such snapshots could be done either with Power Automate or a third-party app. Nowadays it can also be achieved with notebooks in Microsoft Fabric. While searching for a tutorial, I could not find one that worked for me right away, so I decided to write this blog post and summarize what I did so you can set it up yourself quickly.

 

Requirements

  • Power BI Pro license or higher
  • Power BI report published to service
  • Fabric capacity (because we will be using notebooks)

Steps in short

  • Create a Notebook
  • Add the semantic-link library to the environment
  • Add a Lakehouse
  • Use semantic link
  • Select your dataset
  • With DAX, specify the data you want to snapshot
  • Save the data to a Lakehouse table
  • Schedule a refresh
  • Connect to the Lakehouse

More detailed steps (that worked for me)

1. Create a Notebook. Fairly simple, just click on "new item" and select "Notebook".

 

2. Set up the environment. In order to work with semantic link, you need to install the semantic-link library in the environment, either at the workspace level or at the notebook level.

[Screenshot: adding the semantic-link library to the environment]

When you start typing the name "semantic-link", it doesn't show up in the autocomplete helper. Don't worry about it; just type the full library name and it will stick.
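If you prefer the notebook-level option, the library can also be installed inline; a minimal sketch (this installs semantic-link only for the current session):

# Install the semantic-link library for the current notebook session only.
# Not needed if the library is already added to the workspace/notebook environment.
%pip install semantic-link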

 

3. Add a lakehouse to the notebook. Either use an existing one or create a new one.

[Screenshot: adding a lakehouse to the notebook]

 

4. Use semantic link. Next, we import sempy (the semantic link library) and pandas, and initialize a Spark session.

 

 

# Uncomment the next line if semantic-link is not installed in the environment:
# %pip install semantic-link
import sempy.fabric as fabric
import pandas as pd
from pyspark.sql import SparkSession

# Initialize a Spark session
spark = SparkSession.builder.getOrCreate()

 

 

5. Access the dataset you want to use for the snapshot through code.

 

 

mydataset = "Management Report"  # name of the semantic model (dataset) published to the workspace
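If you are not sure about the exact name, you can first list the datasets semantic link can see; a small sketch (assumes the notebook runs in the same workspace as the report):

# List the semantic models (datasets) visible from the current workspace,
# so you can copy the exact name into mydataset.
fabric.list_datasets()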

 

 

6. Write DAX to retrieve the data you want to snapshot.
In my case, I wanted to do a quarterly snapshot that would save the employee, their compensation (commission), quarter, year, and a timestamp of when the snapshot was taken.
Next, I saved the DAX result to a DataFrame and renamed the columns to a more suitable format.

 

 

myDax = """
EVALUATE
	ADDCOLUMNS(
		SUMMARIZECOLUMNS(
			'Compensation User-Role'[User],
			'Compensation Roles'[Category],
			FILTER(
				'Date table',
				'Date table'[Qarter numbr] = QUARTER(TODAY() - 30) && 'Date table'[Year] = YEAR(TODAY() - 30)
			),
			"Compensation", [Total Compensation (All Curr)]
		),
		"Year", YEAR(TODAY() - 30),
		"Quarter", "Q" & QUARTER(TODAY() - 30),
		"Time Stamp", NOW() + TIME(1,0,0)
	)
"""    
# Run the DAX query against the semantic model
result = fabric.evaluate_dax(mydataset, myDax)

# Convert the DAX query result to a DataFrame
df = pd.DataFrame(result)

# Rename columns if needed
df.columns = ['User', 'Category', 'Compensation', 'Year', 'Quarter', 'TimeStamp']
df

 

For some reason, I had trouble using the %% cell magic, so I used a triple-quoted string instead. I also recommend writing and testing the DAX query in Power BI first, to see if it works and how it behaves, and only then copy-pasting it into the notebook.
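Depending on how the result comes back, you may also want to give the columns explicit dtypes before the Spark conversion in the next step, because object-typed columns can make spark.createDataFrame complain. A minimal sketch, assuming the column names from the snippet above:

# Give numeric and datetime columns explicit dtypes before handing the frame to Spark.
df["Compensation"] = pd.to_numeric(df["Compensation"], errors="coerce")
df["Year"] = df["Year"].astype(int)
df["TimeStamp"] = pd.to_datetime(df["TimeStamp"])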

 

7. Save the DataFrame to the lakehouse. I used append mode, so the snapshots stack on top of each other.

 

 

# Append the snapshot to a Delta table in the attached lakehouse
spark.createDataFrame(df).write.mode("append").format("delta").saveAsTable("Appache_spark_test.Compensations_Snapshot_test")

 

 

Appache_spark_test is the name of the lakehouse.

Compensations_Snapshot_test is the name of the table under which you want the data to be saved (there is no need to create the table beforehand).
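To check that the append worked, you can read the table back in the same notebook; a quick sketch, assuming the lakehouse and table names used above:

# Read the snapshot table back and show how many rows each quarterly snapshot added.
snapshot = spark.table("Appache_spark_test.Compensations_Snapshot_test")
snapshot.groupBy("Year", "Quarter").count().orderBy("Year", "Quarter").show()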
 

8. Set a schedule in the notebook.

You can use the native schedule in a notebook.

[Screenshot: the notebook's built-in schedule settings]

Or use a pipeline for a more complex refresh process, e.g. if you want to take the snapshot monthly or quarterly and only under certain conditions (a simple notebook-level alternative is sketched below).
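If you stay with the native schedule, one alternative to a pipeline is a guard cell at the top of the notebook that skips the run outside the intended period; a sketch, assuming a monthly schedule and a quarterly snapshot (not part of my original setup):

from datetime import date

# Guard: only take the snapshot in the first month after a quarter ends
# (assumes the notebook is scheduled to run monthly; adjust the months to your cadence).
if date.today().month not in (1, 4, 7, 10):
    # mssparkutils is pre-loaded in Fabric notebooks; exit() stops the run early.
    mssparkutils.notebook.exit("Not a snapshot month - skipping.")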

 

9. Use the Lakehouse as a source for your report. In my case, I connected to the SQL endpoint. 

[Screenshot: connecting to the lakehouse SQL endpoint from Power BI]

Final Word

And that is it. This worked for me. Let me know if you have any improvements or comments. It took me a while to get it working and it certainly is not perfect 🙂

Sources:

https://learn.microsoft.com/en-us/fabric/data-science/semantic-link-power-bi?tabs=python
https://medium.com/@sdaelemans/store-your-data-in-power-bi-to-onelake-with-semantic-link-in-microsof...
https://www.youtube.com/watch?v=zMiRGZsfQgs&t=2s&ab_channel=GuyinaCube
https://www.youtube.com/watch?v=VAHOl9KqSsk&ab_channel=HoosierBI 

https://community.fabric.microsoft.com/t5/Service/Snapshot-data-from-Power-BI-Report-to-Lakehouse/m-...