Hi,
Cornelia here. I am really excited to participate in this Hackathon!😁
I am trying to use Data Wrangler in order to generate a summary of a dataset.
I can save the CSV or the generated Python code by clicking specific buttons, but I need to automate this code or CSV generation inside a data pipeline. Is there a possibility or a workaround to use Data Wrangler in this way?
Thanks!
Hi Cornelia,
If I understand you correctly, you'd like to describe a dataset automatically/programmatically, the way Data Wrangler does in Python notebooks. I don't know of an API for using Data Wrangler programmatically.
What I can suggest is either using the DataFrame description capabilities in Spark:
https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.sql.DataFrame.describe.html
or generating the description directly in the dataset (assuming we're talking about Power BI datasets, i.e. semantic models) using M code:
https://learn.microsoft.com/en-us/powerquery-m/table-schema
https://learn.microsoft.com/en-us/powerquery-m/table-profile
Hope this helps!
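As a rough illustration of the Spark option above, here is a minimal sketch that wraps `DataFrame.describe` in a helper. The function name `describe_dataset` and its `columns` parameter are my own assumptions for the example, not something Data Wrangler generates.

```python
def describe_dataset(df, columns=None):
    """Return basic statistics for a Spark DataFrame.

    Assumes `df` is a pyspark.sql.DataFrame. `DataFrame.describe` computes
    count, mean, stddev, min and max for numeric and string columns.
    `columns` (hypothetical parameter) optionally restricts the summary
    to the given column names.
    """
    if columns:
        # describe() takes column names as positional arguments
        return df.describe(*columns)
    return df.describe()
```

In a Fabric notebook, where a `spark` session is already available, this could be called as `describe_dataset(spark.read.table("my_table")).show()`; `my_table` is only a placeholder name.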
Thank you for your help! 😁
Even though Data Wrangler seems pretty good at generating Python code for describing data, I ended up creating some custom PySpark functions, which are easier to modify and automate.
Once you have the Python code generated with Data Wrangler, you can put it in a notebook cell and then schedule the notebook to run from a data pipeline. See "Run add to pipeline in Microsoft Fabric notebooks".
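To make that concrete, here is a minimal sketch of what such a notebook cell might look like, assuming the data is loaded as a pandas DataFrame. The function name `summarize_to_csv`, the extra `missing` column, and the output path are illustrative assumptions, not code produced by Data Wrangler.

```python
import pandas as pd


def summarize_to_csv(df: pd.DataFrame, output_path: str) -> pd.DataFrame:
    """Compute per-column summary statistics and save them as a CSV file.

    `output_path` is a hypothetical destination; in a notebook run by a
    pipeline it would point at wherever the summary should be stored.
    """
    # One row per source column, one column per statistic
    summary = df.describe(include="all").transpose()
    # Add the number of missing values per source column
    summary["missing"] = df.isna().sum()
    summary.to_csv(output_path, index_label="column")
    return summary
```

Each pipeline run of the notebook would then regenerate the summary file without anyone clicking the export buttons.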