I have weekly data that I would like to load into a serverless SQL pool. The data arrives as an Excel file. My plan so far:
1. Create a database in the serverless SQL pool with a table for my data.
2. Upload the Excel file to an Azure Storage account.
3. Use an Azure Synapse Spark notebook to read the data, transform it with PySpark, and then write it to the database in the serverless SQL pool.
4. Create a pipeline using the Synapse notebook and schedule it to run once a week.
5. At the end of each week, append the new data to the Azure Storage account.
Since each week adds roughly 70,000 rows, I just want a way for the new data to be read, transformed, and inserted into the database.
My question is:
A) What Activities are best to use to build this pipeline?
Hi @HamidBee, thanks for posting your question in the Microsoft Fabric Community.
You can use a Synapse Notebook activity in the pipeline to achieve this. Write the ETL code in PySpark in the Synapse notebook, then use the Notebook activity to run that code from the pipeline.
The advantage of the Notebook activity is that you can write and test the ETL code in the same environment, and then schedule the pipeline to run the notebook at a specific time.
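Wired together, the pipeline and its weekly trigger look roughly like the following. This is an illustrative sketch of the Synapse pipeline JSON, not a definition exported from a real workspace; all names and the start time are placeholders:

```json
{
  "name": "WeeklyExcelLoad",
  "properties": {
    "activities": [
      {
        "name": "RunEtlNotebook",
        "type": "SynapseNotebook",
        "typeProperties": {
          "notebook": {
            "referenceName": "weekly_etl_notebook",
            "type": "NotebookReference"
          }
        }
      }
    ]
  }
}
```

A schedule trigger then runs the pipeline once a week:

```json
{
  "name": "WeeklyTrigger",
  "properties": {
    "type": "ScheduleTrigger",
    "typeProperties": {
      "recurrence": {
        "frequency": "Week",
        "interval": 1,
        "startTime": "2024-01-01T00:00:00Z"
      }
    },
    "pipelines": [
      {
        "pipelineReference": {
          "referenceName": "WeeklyExcelLoad",
          "type": "PipelineReference"
        }
      }
    ]
  }
}
```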
Regards
Geetha