Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Join the FabCon + SQLCon recap series. Up next: Power BI, Real-Time Intelligence, IQ and AI, and Data Factory take center stage. All sessions are available on-demand after the live show. Register now

Python Jobs

Having the ability to run python code outside of the notebook environment (like we can for pyspark jobs) could be a real win for efficiency and modularity. It would allow users to package robust, unit-tested code and deploy it to the fabric environment where it could run as a cost-effective single-node job. Databricks has an implementation for this, and it would be really nice to see something similar come to Fabric.

Notebooks are great for ad-hoc or exploratory stuff, but building something robust in them feels like shoving a peg into a wrong-shaped hole. They are (nearly) impossible to unit test, so you often end up creating libraries which allow you to package transformations in a way that can be tested, then your notebooks end up being essentially thin wrappers around a bunch of external code.

I think the most obvious example for this is the number of Fabric DBT implementations that essentially involve installing DBT core into a notebook and running it there (I know there is DBT jobs coming, but this is beside the point). This is a symptom of a larger need for this type of hosting/execution of code within the environment. Yes, you could host the code on a vm external to Fabric but that goes against the ethos of a unified data platform. Offering something like this would be a great way to increase the flexibility and extensibility of the platform.
The implementation can be fairly barebones and developer forward, aiming to provide maximum flexibility and extensibility. Basically a way to run python scripts/CLI tools on the Fabric capacity. 

 

Status: New