Top 15 wget commands you need to know!


wget is a powerful command-line utility for downloading files from the web. It supports various protocols such as HTTP, HTTPS, FTP, and FTPS.

  1. Basic File Download

This command downloads the file named file.zip from the specified URL.

wget https://example.com/file.zip

Example:

wget https://sample-videos.com/zip/10mb.zip
  2. Download to a specific directory
wget -P /path/to/directory https://example.com/file.zip

Example:

wget -P Downloads https://sample-videos.com/zip/10mb.zip
  3. Download with a different name
wget -O newname.zip https://example.com/file.zip

Downloads the file and saves it as newname.zip.

Example:

wget -O newname.zip https://sample-videos.com/zip/10mb.zip
  4. Download multiple files
wget https://example.com/file1.zip https://example.com/file2.zip

Example:

wget https://sample-videos.com/zip/10mb.zip https://sample-videos.com/zip/20mb.zip
  5. Download in the background
wget -b https://example.com/largefile.zip

Example:

wget -b -O 20mnbfile.zip https://sample-videos.com/zip/20mb.zip

Because the download runs in the background, wget writes its output to a log file named wget-log in the current directory.
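A quick sketch of checking on a backgrounded download. The -o flag (lowercase), which writes the log to a file of your choosing instead of wget-log, is standard wget; the log and output file names here are just examples:

```shell
# -b detaches the download from the terminal; -o names the log file
# (without -o, wget logs to wget-log in the current directory)
wget -b -o download.log -O 20mb.zip https://sample-videos.com/zip/20mb.zip

sleep 1   # give the background process a moment to start logging

# Inspect progress so far; `tail -f download.log` would follow it live
tail -n 5 download.log
```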

  6. Limit the download rate
wget --limit-rate=200k https://example.com/largefile.zip

This limits the download rate to 200 KB/s (the suffix k means kilobytes; m means megabytes).

Example:

wget --limit-rate=200k https://sample-videos.com/zip/20mb.zip
  7. Resume an interrupted download
wget -c https://example.com/largefile.zip

Example:

wget -c https://sample-videos.com/zip/20mb.zip

If a partial 20mb.zip already exists on disk, -c continues it from where it stopped instead of downloading it again from the beginning.
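One way to see -c in action is to deliberately cut a download short and then resume it. This sketch uses timeout from GNU coreutils to simulate the interruption; the second timeout only keeps the demo brief and is not part of the technique:

```shell
# Kill the download after 2 seconds to leave a partial file behind
timeout 2 wget -q -O big.zip https://sample-videos.com/zip/20mb.zip

# -c resumes from the bytes already on disk instead of starting over
# (timeout here just bounds how long the demo runs)
timeout 10 wget -c -q -O big.zip https://sample-videos.com/zip/20mb.zip
```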
  8. Download an entire website
wget --recursive --no-clobber --page-requisites --html-extension --convert-links --domains example.com --no-parent https://example.com
  • --recursive: Enables recursive retrieval, meaning wget will download not only the specified URL but also follow and download links within that page, continuing recursively.

  • --no-clobber: This option prevents wget from overwriting existing files. If a file with the same name already exists in the local directory, wget will not download it again.

  • --page-requisites: Downloads all the elements needed to properly display the page offline. This includes inline images, stylesheets, and other resources referenced by the HTML.

  • --html-extension: Appends the .html extension to HTML files downloaded. This is useful when saving a complete website for offline browsing, as it helps maintain proper file extensions.

  • --convert-links: After downloading, converts the links in the downloaded documents to point to the local files, enabling offline browsing. This is important when you want to view the downloaded content without an internet connection.

  • --domains example.com: Restricts the download to files under the specified domain (example.com). This ensures that wget doesn't follow links to external domains, focusing only on the specified domain.

  • --no-parent: Prevents wget from ascending to the parent directory while recursively downloading. It ensures that only content within the specified URL and its subdirectories is downloaded.

  • https://example.com: The URL from which wget starts the recursive download.

Example:

wget --recursive --no-clobber --page-requisites --html-extension --convert-links --domains hashnode.dev --no-parent https://redterminal.hashnode.dev
  9. Mirror an entire website
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://example.com
  • --mirror: Enables mirroring, which includes recursion to download the entire website.

  • --convert-links: Converts the links in the downloaded documents to point to the local files for proper offline browsing.

  • --adjust-extension: Adds proper file extensions to downloaded files.

  • --page-requisites: Downloads all the elements needed to properly display the page offline, such as inline images and stylesheets.

  • --no-parent: Prevents wget from ascending to the parent directory while recursively downloading.

Example:

wget --mirror --convert-links --adjust-extension --page-requisites --no-parent https://example.com
  10. Download with a user-agent

Some websites block requests that do not appear to come from a browser. In those cases you can set a browser-like User-Agent header.

wget --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3" https://example.com/file.zip

Example:

wget --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3" https://sample-videos.com/zip/20mb.zip
  11. Download via a proxy

wget has no --proxy=URL option; it reads proxy settings from the standard http_proxy/https_proxy environment variables or from ~/.wgetrc. The same settings can be passed inline with -e:

wget -e use_proxy=yes -e http_proxy=proxy.example.com:8080 https://example.com/file.zip
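As a sketch, the proxy can also be configured once via environment variables, which every subsequent wget call picks up automatically (the proxy host and port here are placeholders):

```shell
# wget honours the conventional proxy environment variables
export http_proxy=http://proxy.example.com:8080
export https_proxy=http://proxy.example.com:8080

# This request is now routed through the proxy
# (-t 1 and -T 5 just keep the demo from retrying for long)
wget -t 1 -T 5 https://example.com/file.zip
```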
  12. Download files matching a pattern
wget -r -l1 -np -nd -A "*.jpg" https://example.com/images/
  • -r: Download recursively.

  • -l1: Limit the recursion depth to one level.

  • -np: Do not ascend to the parent directory.

  • -nd: Do not recreate the remote directory structure; save all files into the current directory.

  • -A "*.jpg": Accept only files whose names match the pattern.

  13. Test that a URL exists before downloading

wget --spider https://example.com

Example:

wget --spider https://sample-videos.com/zip/10mb.zip
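Because --spider sets wget's exit status, the check is easy to script. A minimal sketch (the URL is illustrative; -t 1 and -T 5 just keep the probe quick):

```shell
url=https://sample-videos.com/zip/10mb.zip

# --spider asks the server about the resource without downloading it;
# exit status 0 means the URL resolved and the server reported it exists
if wget --spider -q -t 1 -T 5 "$url"; then
  result="exists"
else
  result="missing or unreachable"
fi
echo "$result"
```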

  14. Quit downloading when a size quota is exceeded

wget -Q5m -i FILE-WHICH-HAS-URLS

Here -Q5m sets a 5-megabyte download quota, and -i reads the list of URLs to fetch from the given file.

Example:

-i expects a file of URLs (one per line), so create one first:

printf '%s\n' https://sample-videos.com/zip/10mb.zip https://sample-videos.com/zip/20mb.zip > urls.txt
wget -Q5m -i urls.txt

Note: the quota has no effect when downloading a single URL; that file is downloaded in full regardless of the quota size. The quota applies only to recursive downloads and to downloads from a URL list.

Let's try it with a recursive download:

wget --recursive -Q5m --no-clobber --page-requisites --html-extension --convert-links --domains hashnode.dev --no-parent https://redterminal.hashnode.dev
  15. Increase the total number of retries
wget --tries=75 DOWNLOAD-URL
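By default wget does not retry after "connection refused"; --retry-connrefused makes that error retryable, and --waitretry caps the pause between attempts. A quick local sketch, assuming nothing is listening on port 9 of localhost so every attempt fails immediately:

```shell
# Three quick attempts against a dead local port; each try is refused
# at once, so the retry loop finishes in a few seconds
wget --tries=3 --retry-connrefused --waitretry=1 -T 5 \
     -O /dev/null http://127.0.0.1:9/file.zip
rc=$?
echo "wget gave up with exit status $rc"
```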