ccornelia
Frequent Visitor

Automating Data Wrangler output

Hi, 

Cornelia here. I am really excited to participate in this Hackathon!😁

 

I am trying to use Data Wrangler in order to generate a summary of a dataset.

I can save the CSV or the generated Python code by clicking the corresponding buttons, but I need to automate this code or CSV generation inside a data pipeline. Is there such a possibility, or a workaround, to use Data Wrangler this way?

 

Thanks!

1 ACCEPTED SOLUTION
Anonymous
Not applicable

Hi Cornelia, 

If I understand you correctly, you'd like to describe a dataset automatically/programmatically, the way Data Wrangler does in Python notebooks. I'm not aware of an API for using Data Wrangler programmatically.

What I can suggest is either using the DataFrame description capabilities in Spark:

https://spark.apache.org/docs/3.1.1/api/python/reference/api/pyspark.sql.DataFrame.describe.html

or generating the description directly in the dataset (assuming we're talking about Power BI datasets, i.e. semantic models) using M code:

https://learn.microsoft.com/en-us/powerquery-m/table-schema

https://learn.microsoft.com/en-us/powerquery-m/table-profile 
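
The Spark option above could look roughly like the sketch below. It assumes a PySpark DataFrame is already loaded in the notebook; the function name `describe_to_csv` and the `output_path` argument are made up for illustration and are not part of Data Wrangler or Spark.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Optional, Sequence

if TYPE_CHECKING:  # imported for type hints only; a Spark notebook already ships pyspark
    from pyspark.sql import DataFrame


def describe_to_csv(
    df: DataFrame,
    output_path: str,
    columns: Optional[Sequence[str]] = None,
) -> DataFrame:
    """Compute summary statistics with DataFrame.describe() and save them as CSV.

    `output_path` is a hypothetical destination chosen by the caller.
    """
    # describe() returns count, mean, stddev, min and max per column;
    # with no arguments it covers all numeric and string columns.
    summary = df.describe(*columns) if columns else df.describe()

    # The summary has only a handful of rows, so a single output file is enough.
    (
        summary.coalesce(1)
        .write.mode("overwrite")
        .option("header", True)
        .csv(output_path)
    )
    return summary
```

In a notebook cell this might be called as `describe_to_csv(spark.read.table("my_table"), "Files/summaries/my_table")`, where both the table name and the path are placeholders.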

 

Hope this helps!

 


3 REPLIES 3
ccornelia
Frequent Visitor

Thank you for your help! 😁

 

Even if Data Wrangler seems pretty good at generating Python code for describing data, I ended up creating some custom PySpark functions, which are easier to modify and automate.
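
For readers wondering what such a custom function might look like, here is one possible sketch. It is not the code from this thread: the helper names `build_profile_exprs` and `profile`, and the choice of statistics (row count, non-null count, distinct count), are assumptions made for illustration. The expressions are plain Spark SQL strings passed to `DataFrame.selectExpr`.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, Iterable, List

if TYPE_CHECKING:  # imported for type hints only
    from pyspark.sql import DataFrame


def build_profile_exprs(columns: Iterable[str]) -> List[str]:
    """Build Spark SQL aggregate expressions for a simple per-column profile."""
    exprs: List[str] = ["count(*) AS `row_count`"]
    for name in columns:
        # Backticks inside a column name are escaped by doubling them.
        safe = name.replace("`", "``")
        # count(col) skips NULLs, so it gives the non-null count.
        exprs.append(f"count(`{safe}`) AS `{safe}__non_null`")
        exprs.append(f"count(DISTINCT `{safe}`) AS `{safe}__distinct`")
    return exprs


def profile(df: DataFrame) -> DataFrame:
    """Return a one-row DataFrame holding the profile of every column in `df`."""
    return df.selectExpr(*build_profile_exprs(df.columns))
```

Keeping the expression builder separate from the Spark call makes it easy to add or drop statistics without touching the rest of the notebook.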

imejiauseche
Microsoft Employee

Once you have the Python code generated with Data Wrangler, you can paste it into a notebook cell and then, from the notebook, schedule a data pipeline run of that notebook.
Run add to pipeline in Microsoft Fabric notebooks

 
