I have weekly data that I would like to upload to a serverless SQL pool. The data is in the form of an Excel file. My plan so far:
1. Create a db in the Serverless SQL pool with a table for my data.
2. Upload the Excel File to an Azure Storage Account.
3. Use an Azure Synapse Spark Notebook to query the data, transform it using PySpark, and then write it to the db in the serverless SQL pool.
4. Create a Pipeline using the Synapse Notebook and schedule it to run once a week.
5. At the end of the week append the new data into the Azure Storage Account.
Since each week adds around 70,000 rows of data, I just want a way for the new data to be read, transformed, and inserted into the db.
My question is:
A) What Activities are best to use to build this pipeline?
Hi @HamidBee, thanks for posting your question in the Microsoft Fabric Community.
You can use the Synapse Notebook activity in the pipeline to achieve this. Write the ETL code in PySpark in a Synapse notebook, and then use the Notebook activity to run that notebook from the pipeline.
The advantage of the Notebook activity is that you can write and test the ETL code in the same environment, and then schedule the pipeline to run the notebook at a specific time (weekly, in your case).
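For illustration, a minimal sketch of such a notebook is below. All storage account, container, folder, and table names are placeholders, and it assumes the spark-excel package (com.crealytics:spark-excel) has been added to the Spark pool so Spark can read .xlsx files. Because a serverless SQL pool does not store data itself, the sketch appends the transformed rows to the lake as Parquet, which your serverless database can then expose as an external table.

```python
# Minimal sketch of the weekly ETL notebook -- all paths and names are placeholders.
# Assumes the spark-excel package (com.crealytics:spark-excel) is attached to the
# Spark pool; option names vary slightly between versions of that library.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()  # already provided in a Synapse notebook

excel_path = "abfss://weekly-data@<storageaccount>.dfs.core.windows.net/uploads/current_week.xlsx"
output_path = "abfss://weekly-data@<storageaccount>.dfs.core.windows.net/curated/weekly_table/"

# 1. Read this week's Excel file from the storage account.
raw_df = (
    spark.read.format("com.crealytics.spark.excel")
    .option("header", "true")        # "useHeader" in older versions of spark-excel
    .option("inferSchema", "true")
    .load(excel_path)
)

# 2. Example transformations: normalise column names and stamp the load date.
clean_df = raw_df.toDF(*[c.strip().lower().replace(" ", "_") for c in raw_df.columns])
clean_df = clean_df.withColumn("load_date", F.current_date())

# 3. Append the ~70,000 new rows to the curated folder as Parquet.
#    A serverless SQL pool database can query this folder through an external table.
clean_df.write.mode("append").parquet(output_path)
```

The pipeline then only needs a Notebook activity pointing at this notebook plus a weekly schedule trigger. If you land each week's file in a dated folder (for example /uploads/2025-01-06/), you can pass that folder name to the notebook as a parameter so only the new file is processed.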
Regards
Geetha