Source: https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2023/index.html
Destination: my_lakehouse_root/files/ais/2023/
Lakehouse SQL Endpoint: xyz.datawarehouse.fabric.microsoft.com
I have successfully downloaded this in a VM on another cloud with:
$ wget -q -np -r -nH -L --cut-dirs=3 https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2023/
I tried "Copy Data" in a pipeline, but it only grabs the webpage itself (index.html), not the zip files.
I only need to copy the files; I will extract and process the zip files in Spark later.
I would download them via a Python notebook instead:
Create an iterator that generates the needed URLs, from 01_01 through 09_30.
Download each URL, e.g.:
https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2023/AIS_2023_01_01.zip
If you need inspiration on how to download files in Python, use the existing samples that you can install for free. The Machine detection one, for example, downloads its own data.
They all contain a code block that downloads a file and unzips it to the lakehouse. Here is the example from the uplift sample:
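The sample's code block was not preserved in this thread, but a minimal sketch of the steps described above (generate daily URLs, download each zip) might look like the following. The destination path `/lakehouse/default/Files/ais/2023/` is an assumption based on the default lakehouse mount; replace it with the "File API Path" of your own folder.

```python
# Sketch: generate daily AIS zip URLs for 2023-01-01 through 2023-09-30
# and download each one into the lakehouse Files area.
import os
import urllib.request
from datetime import date, timedelta

BASE_URL = "https://coast.noaa.gov/htdata/CMSP/AISDataHandler/2023/"
DEST = "/lakehouse/default/Files/ais/2023/"  # assumed mount path; use your File API path

def daily_urls(start=date(2023, 1, 1), end=date(2023, 9, 30)):
    """Yield one zip URL per day in the inclusive date range."""
    d = start
    while d <= end:
        yield BASE_URL + f"AIS_2023_{d.month:02d}_{d.day:02d}.zip"
        d += timedelta(days=1)

def download_all():
    os.makedirs(DEST, exist_ok=True)
    for url in daily_urls():
        target = os.path.join(DEST, url.rsplit("/", 1)[-1])
        if not os.path.exists(target):  # skip files already copied
            urllib.request.urlretrieve(url, target)
```

Unzipping can then be done in Spark later, as the question intends; the notebook only needs to land the raw zips in the Files folder.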
Thanks,
This worked. My error was using my Lakehouse name as the destination path. I got the correct destination by going to the folder in the Lakehouse navigation sidebar, clicking the "..." menu, and selecting "Copy File API Path".