Learn how to use GNU Wget, a powerful command-line tool for downloading files from the web using HTTP, HTTPS, and FTP protocols. Wget offers numerous features including the ability to download multiple files, resume downloads, limit bandwidth, perform recursive downloads, download in the background, mirror websites, and much more.
In this article, we provide practical examples and detailed explanations of the most common wget options to help you make the most of this versatile utility.
Installing Wget
Most Linux distributions come with the wget package pre-installed.
To check whether wget is installed on your system, open a terminal, type wget, and press Enter. If the package is installed, the system will display "wget: missing URL". Otherwise, it will print "wget: command not found".
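The same check is easy to script; a minimal sketch using the shell's command -v builtin (the messages printed here are illustrative):

```shell
# Report whether wget is available on this system
if command -v wget >/dev/null 2>&1; then
    echo "wget is installed at $(command -v wget)"
else
    echo "wget: command not found"
fi
```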
If wget is not present on your system, you can install it with your distribution's package manager.
Installing Wget on Ubuntu and Debian
sudo apt install wget
Installing Wget on CentOS and Fedora
sudo dnf install wget
On older CentOS releases that still use yum, run sudo yum install wget instead.
Wget Command Syntax
Before delving into the usage of the wget command, let's review the fundamental syntax. Wget utility expressions take the following form:
wget [options] [url]
options – the Wget options
url – the URL of the file or directory you intend to synchronize or download
How to Download a File with wget
In its most basic form, when used without any options, wget downloads the resource specified by the [url] and saves it to the current directory.
In this example, we download the tar archive of the Linux kernel:
wget https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.17.2.tar.xz
As the output shows, wget starts by resolving the IP address of the domain; once the address is resolved, it connects to the remote server and starts the transfer.
During the download, wget shows a progress bar along with the file name, file size, download speed, and the estimated time to complete the download. Once the download is finished, you can find the file in your current working directory.
To turn off the output, use the -q flag. If the file already exists, wget will append .N (a number) to the end of the file name.
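This numbering is easy to observe without touching the external network. The sketch below serves a file from a local Python web server and downloads it twice; the directory names and port number are arbitrary choices for the demonstration, and both wget and python3 are assumed to be installed:

```shell
# Serve a directory locally, then download the same file twice
mkdir -p /tmp/wget-dup-demo && cd /tmp/wget-dup-demo
echo "hello" > file.txt
python3 -m http.server 8123 >/dev/null 2>&1 &   # throwaway local test server
SERVER_PID=$!
sleep 1
wget -q -P dl http://127.0.0.1:8123/file.txt    # saved as dl/file.txt
wget -q -P dl http://127.0.0.1:8123/file.txt    # saved as dl/file.txt.1
kill "$SERVER_PID"
ls dl
```

The second download does not overwrite the first; wget appends .1 to the new copy's name.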
Saving the Downloaded File Under a Different Name
To save the downloaded file under a different name, pass the -O option followed by the chosen name:
wget -O latest-hugo.zip https://github.com/gohugoio/hugo/archive/master.zip
The command above saves the latest Hugo zip file from GitHub as latest-hugo.zip instead of its original name.
Limiting the Download Speed
To limit the download speed, use the --limit-rate option. By default, the speed is measured in bytes per second; append k, m, or g to specify kilobytes, megabytes, or gigabytes. For example, to download the Go binary and limit the download speed to 1 MB per second, run the following command:
wget --limit-rate=1m https://dl.google.com/go/go1.10.3.linux-amd64.tar.gz
This option is useful when you don't want wget to consume all of the available bandwidth.
Resuming a Download
To resume an interrupted download of a large file, use the -c option instead of starting over. This is useful when your internet connection drops during a download, because it avoids re-downloading the file from the beginning.
In the following example, we resume the download of the Ubuntu 18.04 iso file:
wget -c http://releases.ubuntu.com/18.04/ubuntu-18.04-live-server-amd64.iso
If the remote server does not support resuming downloads, wget will start the download from the beginning and overwrite the existing file.
Downloading in Background
To download in the background, add the -b option. In the example below, we download the OpenSUSE iso file in the background:
wget -b https://download.opensuse.org/tumbleweed/iso/openSUSE-Tumbleweed-DVD-x86_64-Current.iso
By default, the output is redirected to a file named wget-log in the current directory. To watch the status of the download, use the tail command:
tail -f wget-log
Changing the Wget User-Agent
Sometimes when downloading a file, the remote server may block the default Wget User-Agent. In situations like this, use the -U option to emulate a different browser:
wget --user-agent="Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0" http://wget-forbidden.com/
The command above will emulate Firefox 60 when requesting the page from wget-forbidden.com.
Downloading Multiple Files
To download multiple files at once, use the -i option followed by the path to a local or external file containing a list of the URLs to be downloaded. Each URL needs to be on a separate line.
The following example shows how to download the Arch Linux, Debian, and Fedora iso files using the URLs specified in the linux-distros.txt file:
wget -i linux-distros.txt
http://mirrors.edge.kernel.org/archlinux/iso/2018.06.01/archlinux-2018.06.01-x86_64.iso
https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-9.4.0-amd64-netinst.iso
https://download.fedoraproject.org/pub/fedora/linux/releases/28/Server/x86_64/iso/Fedora-Server-dvd-x86_64-28-1.1.iso
If you specify - as a file name, the URLs will be read from the standard input.
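The standard-input form can be sketched with printf piping a short URL list into wget -i -. To keep the example self-contained, the URLs below point at a throwaway local Python web server; the port, directory names, and file names are assumptions for the demonstration, and both wget and python3 must be installed:

```shell
# Pipe a URL list to wget via standard input ("-i -")
mkdir -p /tmp/wget-stdin-demo && cd /tmp/wget-stdin-demo
echo "one" > a.txt
echo "two" > b.txt
python3 -m http.server 8124 >/dev/null 2>&1 &   # throwaway local test server
SERVER_PID=$!
sleep 1
printf '%s\n' \
    "http://127.0.0.1:8124/a.txt" \
    "http://127.0.0.1:8124/b.txt" | wget -q -P dl -i -
kill "$SERVER_PID"
ls dl
```

This is handy when the URL list is produced by another command, so it never needs to be written to a file.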
Downloading via FTP
To download a file from a password-protected FTP server, specify the username and password as shown below:
wget --ftp-user=FTP_USERNAME --ftp-password=FTP_PASSWORD ftp://ftp.example.com/filename.tar.gz
Creating a Mirror of a Website
To create a mirror of a website with wget, use the -m option. This creates a complete local copy of the website by following and downloading all internal links as well as the website resources such as JavaScript, CSS, and images.
wget -m https://example.com
If you want to browse the downloaded website offline, you will need to pass a few extra arguments to the command above.
wget -m -k -p https://example.com
The -k option causes wget to convert the links in the downloaded documents so that they work for local viewing. The -p option tells wget to download all the files necessary to display the HTML pages.
Skipping Certificate Check
To download a file over HTTPS from a host whose SSL certificate is invalid, use the --no-check-certificate option:
wget --no-check-certificate https://domain-with-invalid-ss.com
Downloading to the Standard Output
The following example downloads the latest version of WordPress quietly (the -q flag), writes the archive to standard output (-O -), and pipes it to the tar utility, which extracts it into the /var/www directory:
wget -q -O - "http://wordpress.org/latest.tar.gz" | tar -xzf - -C /var/www
Conclusion
Wget is a powerful tool for downloading files from the internet on Linux systems, with a wide range of options to suit different downloading needs. This article covered the basic syntax of the wget command along with its most common options and practical examples. Exercise caution when downloading files from untrusted sources, and always verify the integrity of downloaded files to avoid security risks.