Foreword
For some time I had been dreaming of automatically saving calculated data from my report on a regular basis and using it to show the historical trend. My use case was that every quarter, commissions are paid to employees based on their performance. The problem is that the numbers may change over time, as some confirmed orders get canceled: at the time of the compensation calculation the sale is included, while a week later, when the sale is canceled, it is excluded. That can change the compensation amount and cause discussions about why the number the commission was based on has changed. That is why we wanted to snapshot the amounts, so we could reliably say what amount was paid out to the employees over time.
Originally, such snapshots could be done either with Power Automate or a third-party app. Nowadays it can also be achieved with Notebooks in Microsoft Fabric. As I was searching for a tutorial, I could not find any that would work for me right away, so I decided to write this blog post and summarize what I did, so you can set it up yourself quickly.
Requirements
Steps in short
More detailed steps (that worked for me)
1. Create a Notebook. Fairly simple: just click "New item" and select "Notebook".
2. Set up the environment. To work with Semantic Link, you need to install the semantic-link library, either at the workspace level or at the notebook level.
When you start typing the name "semantic-link", it doesn't show up in the autocomplete helper. Don't worry about it, just type the full library name; it will stick.
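If you'd rather install at the notebook level, a plain inline install in the first cell does the job (this is the same command that is left commented out in the code further below):
%pip install semantic-link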
3. Add a lakehouse to the notebook. Either pick an existing one or create a new one.
4. Use Semantic Link. Next, we import Semantic Link and pandas, and initialize a Spark session.
#%pip install semantic-link
import sempy.fabric as fabric
import pandas as pd
from pyspark.sql import SparkSession
# Initialize Spark session
spark = SparkSession.builder.getOrCreate()
5. Access the dataset you want to use for the snapshot through code.
mydataset = "Management Report"
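"Management Report" is just the name of my semantic model. If you're unsure of the exact name in your workspace, Semantic Link can list the models visible to the notebook:
# List the semantic models in the workspace to verify the dataset name
fabric.list_datasets()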
6. Write DAX to retrieve the data you want to snapshot.
In my case, I wanted a quarterly snapshot that would save the employee, their compensation (commission), the quarter, the year, and a timestamp of when the snapshot was taken.
Next, I saved the DAX result to a data frame and renamed the columns to a more suitable format.
myDax = """
-- Pull the previous quarter's compensation per user and category.
-- TODAY() - 30 shifts the reference date back a month, so a snapshot
-- taken shortly after quarter end still lands in the quarter being paid out.
EVALUATE
ADDCOLUMNS(
    SUMMARIZECOLUMNS(
        'Compensation User-Role'[User],
        'Compensation Roles'[Category],
        FILTER(
            'Date table',
            'Date table'[Qarter numbr] = QUARTER(TODAY() - 30)
                && 'Date table'[Year] = YEAR(TODAY() - 30)
        ),
        "Compensation", [Total Compensation (All Curr)]
    ),
    "Year", YEAR(TODAY() - 30),
    "Quarter", "Q" & QUARTER(TODAY() - 30),
    "Time Stamp", NOW() + TIME(1,0,0)  -- shift NOW() by one hour (time-zone adjustment)
)
"""
result = fabric.evaluate_dax(mydataset, myDax)
# Convert the DAX query result to a DataFrame
df = pd.DataFrame(result)
# Rename columns if needed
df.columns = ['User', 'Category', 'Compensation', 'Year', 'Quarter', 'TimeStamp']
df
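Optionally, before writing the frame to Delta in the next step, you can make the column types explicit so the Delta schema stays stable across appends. A small sketch, assuming the renamed columns above:
# Optional: enforce predictable column types before the Delta write
df['TimeStamp'] = pd.to_datetime(df['TimeStamp'])
df['Year'] = df['Year'].astype(int)
df.dtypes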
For some reason, I had trouble using the %% cell magic, so I used a triple-quoted string ("""...""") instead. I also recommend writing and testing the DAX query in Power BI first, to see whether it works and how it behaves, and only then copy-pasting it into the notebook.
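For completeness, the cell-magic route I couldn't get working looks roughly like this, per the semantic-link documentation (treat it as a sketch, not a tested recipe). First load the sempy extension:
%load_ext sempy
Then, in a separate cell, run the DAX against the dataset:
%%dax "Management Report"
EVALUATE VALUES('Compensation Roles'[Category])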
7. Save the data frame to the lakehouse. I used append mode, so the snapshot tables stack on top of each other.
spark.createDataFrame(df).write.mode("append").format("delta").saveAsTable("Appache_spark_test.Compensations_Snapshot_test")
Appache_spark_test is the name of the lakehouse.
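To check that the append worked, you can read the table straight back (the table name matches the saveAsTable call above):
# Read the snapshot table back to inspect the appended rows
spark.read.table("Appache_spark_test.Compensations_Snapshot_test").show()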
8. Set a schedule for the notebook.
You can use the native scheduling built into the notebook.
Or use a pipeline for a more complex refresh process, e.g. if you want to take the snapshot monthly or quarterly, or only under certain conditions (see the sketch below).
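If you schedule the notebook itself but only want quarterly snapshots, one option is to run it monthly and guard the snapshot code with a date check. A minimal sketch of the idea (not part of my original setup):
from datetime import date

# Run the snapshot only in the first month of each quarter
if date.today().month in (1, 4, 7, 10):
    print("New quarter - taking the snapshot")
    # ... run the DAX evaluation and the append from steps 6 and 7 here
else:
    print("Not a snapshot month - skipping")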
9. Use the Lakehouse as a source for your report. In my case, I connected to the SQL endpoint.
Final Word
And that is it. This worked for me. Let me know if you have any improvements or comments. It took me a while to get it working, and it certainly is not perfect 🙂
Sources:
https://learn.microsoft.com/en-us/fabric/data-science/semantic-link-power-bi?tabs=python
https://medium.com/@sdaelemans/store-your-data-in-power-bi-to-onelake-with-semantic-link-in-microsof...
https://www.youtube.com/watch?v=zMiRGZsfQgs&t=2s&ab_channel=GuyinaCube
https://www.youtube.com/watch?v=VAHOl9KqSsk&ab_channel=HoosierBI