Here's an example that downloads, in parallel, all 150+ Microsoft eBooks available from this site.
curl http://blogs.msdn.com/b/mssmallbiz/archive/2013/06/28/almost-150-free-microsoft-ebooks-covering-windows-7-windows-8-office-2010-office-2013-office-365-office-web-apps-windows-server-2012-windows-phone-7-windows-phone-8-sql-server-2008-sql-server-2012-sharepoint-server-2010-s.aspx | grep -o -E "http://ligman.me/[0-9A-Za-z]+" | sort | uniq | xargs -P 5 -n 1 wget --content-disposition
So what's it doing?
First, curl downloads the page at the URL specified. Second, grep searches that HTML for every link matching the ligman.me regular expression. Third, sort and uniq reduce the result set to unique URLs. Fourth, xargs hands each URL to a separate wget invocation (-n 1), running up to five at a time (-P 5). Fifth, wget downloads each eBook, with --content-disposition saving it under the file name the server supplies.
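If you'd rather tinker with an individual stage (the regular expression, the degree of parallelism, and so on), here's the same pipeline written out as a small commented script. It's just a sketch of the command above; nothing is new except pulling the page URL into a variable for readability.

#!/bin/bash
# Same pipeline as the one-liner above, split per stage so each piece is easy to adjust.
PAGE="http://blogs.msdn.com/b/mssmallbiz/archive/2013/06/28/almost-150-free-microsoft-ebooks-covering-windows-7-windows-8-office-2010-office-2013-office-365-office-web-apps-windows-server-2012-windows-phone-7-windows-phone-8-sql-server-2008-sql-server-2012-sharepoint-server-2010-s.aspx"

curl "$PAGE" |                                   # 1. fetch the blog post's HTML
  grep -o -E "http://ligman.me/[0-9A-Za-z]+" |   # 2. pull out every ligman.me short link
  sort | uniq |                                  # 3. collapse duplicates to unique URLs
  xargs -P 5 -n 1 \
    wget --content-disposition                   # 4+5. five wgets at a time, one URL each,
                                                 #      named from the Content-Disposition header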
Why use both tools when they appear to do similar things?
http://stackoverflow.com/questions/636339/what-is-better-curl-or-wget
If you're a Windows user and can't install Cygwin, PowerShell can do some of the things that wget can do.
http://superuser.com/questions/25538/what-is-the-windows-equivalent-of-wget
# Downloads a single file via .NET's WebClient; replace the source URL and destination path with your own.
(new-object System.Net.WebClient).DownloadFile('http://www.xyz.net/file.txt','C:\tmp\file.txt')
Here's a ton of GNU manuals, including the one for wget.