Wget Command
'Wget' is developed as part of the GNU Project. You can use it to download data and content from web servers. Its name is a combination of "World Wide Web" and the word "get".
It supports downloading over multiple protocols, including HTTP, HTTPS, FTP and FTPS.
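For instance, the same command works across protocols (the URLs below are purely illustrative):

$ wget https://example.com/archive.tar.gz
$ wget ftp://ftp.example.com/pub/archive.tar.gz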
Wget is written in C and can be used on any Unix system. It can also be compiled for macOS, Windows, AmigaOS and other popular operating systems.
Installing Wget
Most Linux distributions today ship with the wget package pre-installed. You can check whether wget is installed on your system by asking for its version (or by simply running wget without any options).
$ wget --version
GNU Wget 1.21.2 built on linux-gnu.
If your Linux machine does not have wget installed yet, run the appropriate command below to install it:
On Ubuntu/Debian distros:
$ sudo apt-get install wget
On CentOS/RHEL/Fedora distros:
$ sudo yum install wget
On Arch Linux distros:
$ sudo pacman -S wget
Basic Command Syntax
To check the syntax of wget, try the '--help' option:
$ wget --help
Running the above command gives us the following result:
GNU Wget 1.21.2, a non-interactive network retriever.
Usage: wget [OPTION]... [URL]...
Basic Usage Examples of the 'wget' Command
Below are some examples of the wget command that you will probably use every day. It's worth noting that these examples can also be incorporated into shell scripts or scheduled as cron jobs, allowing you to automate and streamline your workflow, as the minimal sketch below illustrates.
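As a minimal sketch (the script name here is hypothetical), a wget call can simply be wrapped in a shell script and run unattended:

#!/bin/sh
# nightly-fetch.sh (hypothetical name): quietly fetch the latest wget2 tarball.
# -q suppresses wget's progress output, which is just noise in automated runs.
wget -q http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz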
1. Download single file
In its simplest form, when used without any options, wget downloads the resource at the specified URL into the current directory (type 'pwd' to check your current directory):
$ wget http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz
Running the above command gives us the following result:
--2023-04-21 09:24:27-- http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz
Resolving ftp.gnu.org (ftp.gnu.org)... 209.51.188.20, 2001:470:142:3::b
Connecting to ftp.gnu.org (ftp.gnu.org)|209.51.188.20|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2059922 (2.0M) [application/x-tar]
Saving to: 'wget2-latest.tar.lz'
...
During the download, a progress bar is displayed along with the file name, size, download speed, and estimated time to complete. Once the process is complete, you can find the downloaded file in your current directory.
If a file with the same name already exists, wget automatically appends a numeric suffix (such as '.1') to the new file's name.
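For example, downloading the same file twice keeps the first copy and saves the second with a '.1' suffix:

$ wget http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz
$ wget http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz
$ ls
wget2-latest.tar.lz  wget2-latest.tar.lz.1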
2. Download multiple files
With a single command, you can specify multiple file URLs separated by spaces, and wget will download them all.
$ wget https://download.fedoraproject.org/pub/fedora/linux/releases/38/Workstation/aarch64/images/Fedora-Workstation-38-1.6.aarch64.raw.xz https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.2.tar.xz
Running the above command gives us the following result:
--2023-04-21 13:38:39-- https://download.fedoraproject.org/pub/fedora/linux/releases/38/Workstation/aarch64/images/Fedora-Workstation-38-1.6.aarch64.raw.xz
Resolving download.fedoraproject.org (download.fedoraproject.org)... 13.125.120.8, 38.145.60.21, 13.233.183.170, ...
Connecting to download.fedoraproject.org (download.fedoraproject.org)|13.125.120.8|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://mirrors.tuna.tsinghua.edu.cn/fedora/releases/38/Workstation/aarch64/images/Fedora-Workstation-38-1.6.aarch64.raw.xz [following]
--2023-04-22 23:38:40-- https://mirrors.tuna.tsinghua.edu.cn/fedora/releases/38/Workstation/aarch64/images/Fedora-Workstation-38-1.6.aarch64.raw.xz
Resolving mirrors.tuna.tsinghua.edu.cn (mirrors.tuna.tsinghua.edu.cn)... 101.6.15.130, 2402:f000:1:400::2
Connecting to mirrors.tuna.tsinghua.edu.cn (mirrors.tuna.tsinghua.edu.cn)|101.6.15.130|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4080790616 (3.8G) [application/octet-stream]
Saving to: 'Fedora-Workstation-38-1.6.aarch64.raw.xz'
...
Alternatively, we can create a text file and put the download URLs in it.
The following command creates a file named 'link_URL.txt' and opens it in a text editor:
$ vi link_URL.txt
Then paste the download URLs into it in plain text, one per line:
http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz
https://download.fedoraproject.org/pub/fedora/linux/releases/38/Workstation/aarch64/images/Fedora-Workstation-38-1.6.aarch64.raw.xz
https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.2.tar.xz
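If you prefer not to open an editor, the same file can also be created non-interactively, for example with printf:

$ printf '%s\n' \
    'http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz' \
    'https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-6.2.tar.xz' > link_URL.txt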
You can then use the '-i' option to download all the files listed in that file:
$ wget -i link_URL.txt
Running the above command gives us the following result:
--2023-04-21 15:45:08-- http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz
Resolving ftp.gnu.org (ftp.gnu.org)... 209.51.188.20, 2001:470:142:3::b
Connecting to ftp.gnu.org (ftp.gnu.org)|209.51.188.20|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2059922 (2.0M) [application/x-tar]
....
3. Download file with speed limit
With wget, you can limit the download speed. This is useful when you are downloading a large file but don't want it to consume your entire internet bandwidth.
$ wget --limit-rate=100k http://mirrors.vhost.vn/centos/7.9.2009/isos/x86_64/CentOS-7-x86_64-NetInstall-2009.iso
Running the above command gives us the following result:
--2023-04-21 23:53:52-- http://mirrors.vhost.vn/centos/7.9.2009/isos/x86_64/CentOS-7-x86_64-NetInstall-2009.iso
Resolving mirrors.vhost.vn (mirrors.vhost.vn)... 103.27.60.115
Connecting to mirrors.vhost.vn (mirrors.vhost.vn)|103.27.60.115|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 602931200 (575M) [application/octet-stream]
Saving to: 'CentOS-7-x86_64-NetInstall-2009.iso'
-7-x86_64-NetInstall-2009.iso   0%[ ]  502.97K  100KB/s  eta 98m 4s
As you can see in the example above, the '--limit-rate=100k' option caps the download speed at 100 KB/s. Append 'k' for kilobytes, 'm' for megabytes, or 'g' for gigabytes.
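For example, to cap the same download at 2 megabytes per second instead:

$ wget --limit-rate=2m http://mirrors.vhost.vn/centos/7.9.2009/isos/x86_64/CentOS-7-x86_64-NetInstall-2009.iso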
4. Download files in background
For extremely large files, you can use the '-b' option. It runs the download process in the background and writes progress and status data to a 'wget-log' file in the current directory.
You can watch the status of the download with the tail command.
$ wget -b https://releases.ubuntu.com/22.04.2/ubuntu-22.04.2-desktop-amd64.iso
Running the above command gives us the following result:
Continuing in background, pid 4521.
Output will be written to 'wget-log'.
Next, we check the status of the download in progress.
$ tail -f wget-log
Running the above command gives us the following result:
207300K .......... .......... .......... .......... ..........  4% 57.7M 7m5s
207350K .......... .......... .......... .......... ..........  4% 61.6M 7m5s
207400K .......... .......... .......... .......... ..........  4% 17.5M 7m5s
207450K .......... .......... .......... .......... ..........  4%  113M 7m5s
207500K .......... .......... .......... .......... ..........  4% 27.5M 7m5s
207550K .......... .......... .......... .......... ..........  4% 15.6M 7m5s
207600K .......... .......... .......... .......... ..........  4% 90.2M 7m5s
207650K .......... .......... .......... .......... ..........  4% 19.8M 7m5s
207700K .......... .......... .......... .......... ..........  4%  112M 7m4s
...
5. Pause/Resume download
Wget supports a very useful option that allows you to resume downloads that were interrupted for some reason. Instead of starting the whole download from the beginning, the '-c' option resumes it from where it was interrupted.
For example, while the terminal window is showing the download progress of your file, press the following keyboard shortcut to pause (cancel) the download:
Ctrl + c
Next, resume it with the following command:
$ wget -c http://mirrors.vhost.vn/centos/7.9.2009/isos/x86_64/CentOS-7-x86_64-NetInstall-2009.iso
Running the above command gives us the following result:
--2023-04-24 17:58:42-- http://mirrors.vhost.vn/centos/7.9.2009/isos/x86_64/CentOS-7-x86_64-NetInstall-2009.iso
Resolving mirrors.vhost.vn (mirrors.vhost.vn)... 103.27.60.115
Connecting to mirrors.vhost.vn (mirrors.vhost.vn)|103.27.60.115|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 602931200 (575M) [application/octet-stream]
Saving to: 'CentOS-7-x86_64-NetInstall-2009.iso'
CentOS-7-x86_64-NetInstall-20   6%[==>  ]  34.80M  11.5MB/s  eta 47s
6. Save downloaded file with specific name
By default, wget guesses the name of the file from the download URL, taking the part after the last forward slash '/' as the filename for saving.
Using the '-O' option, we can specify the filename we want to use instead.
For example, we can download from 'http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz' and save it under the name 'wget_01.tar.lz' with the following command:
$ wget -O wget_01.tar.lz http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz
Running the above command gives us the following result:
--2023-04-24 18:46:01-- http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz
Resolving ftp.gnu.org (ftp.gnu.org)... 209.51.188.20, 2001:470:142:3::b
Connecting to ftp.gnu.org (ftp.gnu.org)|209.51.188.20|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2059922 (2.0M) [application/x-tar]
Saving to: 'wget_01.tar.lz'
wget_01.tar.lz  100%[===================================>]  1.96M  1.02MB/s  in 1.9s
2023-04-24 18:46:04 (1.02 MB/s) - 'wget_01.tar.lz' saved [2059922/2059922]
7. Save to compressed file with tar
We can also chain wget with the 'tar' command to download a file and compress it in a single command line. Note that piping wget into tar does not work here, since tar would start before the download finishes; joining the two commands with '&&' runs tar only after wget completes successfully (the commands below are run from the /home/jayce/Downloads directory):
$ wget http://centos-hcm.viettelidc.com.vn/7.9.2009/isos/x86_64/0_README.txt && tar -czvf note_file.tar.gz 0_README.txt
Running the above command gives us the following result:
--2023-04-24 20:48:07-- http://centos-hcm.viettelidc.com.vn/7.9.2009/isos/x86_64/0_README.txt
Resolving centos-hcm.viettelidc.com.vn (centos-hcm.viettelidc.com.vn)... 115.84.182.155
Connecting to centos-hcm.viettelidc.com.vn (centos-hcm.viettelidc.com.vn)|115.84.182.155|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2740 (2.7K) [text/plain]
Saving to: '0_README.txt'
0_README.txt  100%[============================>]  2.68K  --.-KB/s  in 0s
2023-04-24 20:48:07 (342 MB/s) - '0_README.txt' saved [2740/2740]
0_README.txt
Next, we list files to verify the result.
$ ls /home/jayce/Downloads
0_README.txt  note_file.tar.gz
8. Download and save files to specific directory
By default, the downloaded file will be saved in the current working directory. To save a file to a specific location, use the '-P' option:
$ wget -P /home/jayce/Downloads http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz
Running the above command gives us the following result:
--2023-04-24 20:03:46-- http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz
Resolving ftp.gnu.org (ftp.gnu.org)... 209.51.188.20, 2001:470:142:3::b
Connecting to ftp.gnu.org (ftp.gnu.org)|209.51.188.20|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2059922 (2.0M) [application/x-tar]
Saving to: '/home/jayce/Downloads/wget2-latest.tar.lz'
wget2-latest.tar.lz  100%[===================================>]  1.96M  1.10MB/s  in 1.8s
2023-04-24 20:03:49 (1.10 MB/s) - '/home/jayce/Downloads/wget2-latest.tar.lz' saved [2059922/2059922]
9. Change the user agent
When your browser connects to a website, it includes a 'User-Agent' field in its HTTP request header. The contents of this field vary between browsers: each browser has its own User-Agent string. Essentially, the User-Agent is how a client reports its software name and version to the remote web server.
Some websites only accept requests from certain user agents, so you may need to change the user agent to download files from such a site, using the '--user-agent' option.
You can check the current 'User-Agent' field with the '-d' (debug) option:
$ wget -d http://google.com
Running the above command gives us the following result:
DEBUG output created by Wget 1.21.2 on linux-gnu.
Reading HSTS entries from /home/jayce/.wget-hsts
URI encoding = 'UTF-8'
Converted file name 'index.html' (UTF-8) -> 'index.html' (UTF-8)
--2023-04-24 21:17:20-- http://google.com/
Resolving google.com (google.com)... 142.251.130.14, 2404:6800:4005:814::200e
Caching google.com => 142.251.130.14 2404:6800:4005:814::200e
Connecting to google.com (google.com)|142.251.130.14|:80... connected.
Created socket 3.
Releasing 0x0000558851a6d760 (new refcount 1).
---request begin---
GET / HTTP/1.1
Host: google.com
User-Agent: Wget/1.21.2
Accept: */*
Accept-Encoding: identity
Connection: Keep-Alive
---request end---
As you can see, the default 'User-Agent' field is 'Wget/1.21.2'. If you want to change it to another agent, such as a Mozilla-based browser string, try this command:
$ wget -d --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36" "http://google.com"
Running the above command gives us the following result:
Setting --user-agent (useragent) to Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36
DEBUG output created by Wget 1.21.2 on linux-gnu.
Reading HSTS entries from /home/jayce/.wget-hsts
URI encoding = 'UTF-8'
Converted file name 'index.html' (UTF-8) -> 'index.html' (UTF-8)
--2023-04-24 21:53:16-- http://google.com/
Resolving google.com (google.com)... 142.250.204.142, 2404:6800:4005:80f::200e
Caching google.com => 142.250.204.142 2404:6800:4005:80f::200e
Connecting to google.com (google.com)|142.250.204.142|:80... connected.
Created socket 3.
Releasing 0x000055cde2db88b0 (new refcount 1).
---request begin---
GET / HTTP/1.1
Host: google.com
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36
Accept: */*
Accept-Encoding: identity
Connection: Keep-Alive
---request end---
Advanced Usage of the wget Command
Now let's take a look at some more examples of using wget in conjunction with other utilities to perform more complicated tasks.
1. Scheduling downloads at a specific time
Suppose you need to download files at 5:00 AM every day. Because wget itself doesn't have scheduling capabilities, in this example we combine it with crontab, a time-based job scheduler in Linux used to schedule jobs (commands or shell scripts) to run periodically at fixed times, dates, or intervals.
Let's take a look at the crontab command:
crontab -e    Edit your crontab file, or create one if it doesn't already exist.
crontab -l    List your cron jobs (display the crontab file contents).
crontab -r    Remove your crontab file.
To create a crontab entry, follow the file's format, which consists of these fields: minute (M), hour (H), day of month (DOM), month (MON), day of week (DOW), and the command (COMMAND) to execute.
M H DOM MON DOW COMMAND

Field    Description     Allowed values
M        Minute field    0-59, or '*' (for 'any')
H        Hour field      0-23, or '*' (for 'any')
DOM      Day of month    1-31, or '*' (for 'any')
MON      Month field     1-12, or '*' (for 'any')
DOW      Day of week     0-6, or '*' (for 'any')
COMMAND  Command         Any command to be executed
So, based on our example requirement, we add the crontab entry below:
0 5 * * * wget http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz
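In practice, you may also want the scheduled job to run quietly and keep its own log; one possible variant (the log path is illustrative) is:

0 5 * * * wget -q -P /home/jayce/Downloads http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz >> /var/log/wget-cron.log 2>&1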
Once you've added the entry, you can restart 'cron.service' and check its status to ensure that the new job has been picked up by the scheduler.
# systemctl restart cron.service
# systemctl status cron.service
Running the above command gives us the following result:
● cron.service - Regular background program processing daemon
     Loaded: loaded (/lib/systemd/system/cron.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2023-04-21 13:34:32 +07; 3s ago
       Docs: man:cron(8)
   Main PID: 6485 (cron)
      Tasks: 1 (limit: 4568)
     Memory: 408.0K
        CPU: 3ms
     CGroup: /system.slice/cron.service
             └─6485 /usr/sbin/cron -f -P
Apr 21 13:34:32 UBUNTU-SRV01 systemd[1]: Started Regular background program processing daemon.
Apr 21 13:34:33 UBUNTU-SRV01 cron[6485]: (CRON) INFO (pidfile fd = 3)
Apr 21 13:34:33 UBUNTU-SRV01 cron[6485]: (CRON) INFO (Skipping @reboot jobs -- not system startup)
2. Monitoring website changes
You can also use wget to monitor websites for changes, either manually or as part of a script. In the example below, we use wget to download a page of a website and then compare the downloaded files to see whether anything has changed.
First, we download the web data using wget. We are going to watch for changes in the EUR-USD price on the Google Finance website:
# wget https://www.google.com/finance/quote/EUR-USD
Running the above command gives us the following result:
--2023-04-21 14:35:34-- https://www.google.com/finance/quote/EUR-USD
Resolving www.google.com (www.google.com)... 142.251.220.36, 2404:6800:4005:81c::2004
Connecting to www.google.com (www.google.com)|142.251.220.36|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: 'EUR-USD'
EUR-USD  [ ]  1.02M  2.26MB/s  in 0.5s
2023-04-21 14:35:35 (2.26 MB/s) - 'EUR-USD' saved [1070042]
Then, we rename the file with the prefix "Previous" as follows:
# mv EUR-USD Previous_EUR-USD
After downloading the file, we use the shell script below. It uses the 'cmp' command to compare the newly downloaded file with the previous version to check for any changes.
#!/bin/sh
# Download the latest copy of the page
wget https://www.google.com/finance/quote/EUR-USD
Log_File=/root/log-monitoring-web-changes/log.txt
# Compare the new download with the previous copy (cmp -s is silent)
if cmp -s Previous_EUR-USD EUR-USD; then
    echo "`date`" ": EUR-USD Price not changed." >> $Log_File
else
    echo "`date`" ": EUR-USD Price changed." >> $Log_File
fi
# Replace the previous copy with the new download
rm Previous_EUR-USD
mv EUR-USD Previous_EUR-USD
To verify that, we can run a test:
root@UBUNTU-SRV01:~# ./compare.sh
Running the above command gives us the following result:
--2023-04-21 15:11:15-- https://www.google.com/finance/quote/EUR-USD
Resolving www.google.com (www.google.com)... 142.250.66.100, 2404:6800:4005:813::2004
Connecting to www.google.com (www.google.com)|142.250.66.100|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: 'EUR-USD'
EUR-USD  [ ]  1.02M  1.82MB/s  in 0.6s
2023-04-21 15:11:16 (1.82 MB/s) - 'EUR-USD' saved [1071314]
Checking the log is an essential step to ensure the script is working correctly. It records the date/time and the result of each check:
root@UBUNTU-SRV01:~# cat log-monitoring-web-changes/log.txt
Sun Apr 21 03:05:18 PM +07 2023 : EUR-USD Price not changed.
Sun Apr 21 03:11:16 PM +07 2023 : EUR-USD Price changed.
To automate the script to run hourly or daily, you can add it to either the /etc/cron.hourly or /etc/cron.daily directory, depending on your desired frequency.
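For example, assuming the script above is saved at /root/compare.sh, it could be installed as a daily job like this (the destination filename is arbitrary; note that run-parts skips names containing dots):

# cp /root/compare.sh /etc/cron.daily/compare-eurusd
# chmod +x /etc/cron.daily/compare-eurusd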
Troubleshooting with wget
The wget command is not limited to downloading; it can also be used to troubleshoot issues in daily operations.
For instance, network or connection timeout problems can be diagnosed with the wget command. These are among the most common issues, caused by a variety of factors including network congestion, server problems, or site availability.
wget provides several options that help you retry connections refused by the server or adjust timeouts, such as '--retry-connrefused', '--timeout', '--tries', and '--waitretry'.
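A combined invocation might look like the following sketch, where '--tries' caps the number of attempts, '--waitretry' sets the pause between retries, and '--timeout' bounds each network operation (the URL and values are illustrative):

$ wget --retry-connrefused --tries=3 --waitretry=5 --timeout=15 http://example.com/file.iso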
For example, we test the HTTP connection to the website "medium.com":
# wget --retry-connrefused --timeout=15 http://medium.com/
Running the above command gives us the following result:
--2023-04-21 16:11:21-- http://medium.com/
Resolving medium.com (medium.com)... ::1, 127.0.0.1
Connecting to medium.com (medium.com)|::1|:80... failed: Connection refused.
Connecting to medium.com (medium.com)|127.0.0.1|:80... failed: Connection refused.
Retrying.
--2023-04-21 16:11:22-- (try: 2) http://medium.com/
Connecting to medium.com (medium.com)|::1|:80... failed: Connection refused.
Connecting to medium.com (medium.com)|127.0.0.1|:80... failed: Connection refused.
Retrying.
--2023-04-21 16:11:24-- (try: 3) http://medium.com/
Connecting to medium.com (medium.com)|::1|:80... failed: Connection refused.
Connecting to medium.com (medium.com)|127.0.0.1|:80... failed: Connection refused.
Retrying.
Another example:
# wget --retry-connrefused --timeout=15 http://google.com/
Running the above command gives us the following result:
--2023-04-21 16:13:10-- http://google.com/
Resolving google.com (google.com)... 142.250.199.78, 2404:6800:4005:804::200e
Connecting to google.com (google.com)|142.250.199.78|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.google.com/ [following]
--2023-04-21 16:13:11-- http://www.google.com/
Resolving www.google.com (www.google.com)... 142.250.207.68, 2404:6800:4005:80d::2004
Connecting to www.google.com (www.google.com)|142.250.207.68|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: 'index.html'
index.html  [ ]  15.98K  --.-KB/s  in 0.06s
2023-04-21 16:13:11 (290 KB/s) - 'index.html' saved [16363]
Conclusion
The wget command is a powerful tool for downloading files from the web, whether it's a single file or an entire website. Moreover, its ability to automate tasks and integrate with scripts makes it an indispensable tool for Linux users and administrators.
There are also other commands, like curl, that can be used to download web content on the command line.
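For instance, a roughly equivalent curl invocation for the single-file example above would be ('-O' tells curl to keep the remote filename):

$ curl -O http://ftp.gnu.org/gnu/wget/wget2-latest.tar.lz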
Let us know your comments below!