Module: github.com/cavaliercoder/grab
Version: 2.0.0+incompatible
Repository: https://github.com/cavaliercoder/grab.git
Documentation: pkg.go.dev

# README

grab

Downloading the internet, one goroutine at a time!

$ go get github.com/cavaliercoder/grab

Grab is a Go package for downloading files from the internet with the following rad features:

  • Monitor download progress concurrently
  • Auto-resume incomplete downloads
  • Guess filename from content header or URL path
  • Safely cancel downloads using context.Context
  • Validate downloads using checksums
  • Download batches of files concurrently
  • Apply rate limiters

Requires Go v1.7+
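
As a rough illustration of the filename-guessing feature listed above, the sketch below prefers the filename parameter of a Content-Disposition header and falls back to the last element of the URL path. It uses only the standard library and is not Grab's implementation; `guessFilename`, `baseName` and `errNoFilename` are hypothetical names chosen for this example.

```go
package main

import (
	"errors"
	"fmt"
	"mime"
	"net/url"
	"path"
)

// errNoFilename is a hypothetical sentinel error for this sketch.
var errNoFilename = errors.New("no filename could be determined")

// baseName returns the last path element, or "" if it is not a usable filename.
func baseName(p string) string {
	name := path.Base(p)
	if name == "." || name == ".." || name == "/" {
		return ""
	}
	return name
}

// guessFilename prefers the filename parameter of a Content-Disposition
// header value and falls back to the last element of the URL path.
func guessFilename(contentDisposition, rawURL string) (string, error) {
	if contentDisposition != "" {
		if _, params, err := mime.ParseMediaType(contentDisposition); err == nil {
			// baseName drops any directory components a server might send.
			if name := baseName(params["filename"]); name != "" {
				return name, nil
			}
		}
	}
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	if name := baseName(u.Path); name != "" {
		return name, nil
	}
	return "", errNoFilename
}

func main() {
	name, err := guessFilename("", "http://www.golang-book.com/public/pdf/gobook.pdf")
	fmt.Println(name, err) // gobook.pdf <nil>

	name, err = guessFilename(`attachment; filename="report.pdf"`, "http://example.com/download")
	fmt.Println(name, err) // report.pdf <nil>
}
```

A URL such as `http://example.com/` has no usable last path element, so this sketch returns `errNoFilename` for it unless the header supplies a name.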

Example

The following example downloads a PDF copy of the free eBook, "An Introduction to Programming in Go" into the current working directory.

resp, err := grab.Get(".", "http://www.golang-book.com/public/pdf/gobook.pdf")
if err != nil {
	log.Fatal(err)
}

fmt.Println("Download saved to", resp.Filename)

The following, more complete example allows for more granular control and periodically prints the download progress until it is complete.

The second time you run the example, it will auto-resume the previous download and exit sooner.

package main

import (
	"fmt"
	"os"
	"time"

	"github.com/cavaliercoder/grab"
)

func main() {
	// create client
	client := grab.NewClient()
	req, err := grab.NewRequest(".", "http://www.golang-book.com/public/pdf/gobook.pdf")
	if err != nil {
		fmt.Fprintf(os.Stderr, "Invalid request: %v\n", err)
		os.Exit(1)
	}

	// start download
	fmt.Printf("Downloading %v...\n", req.URL())
	resp := client.Do(req)
	fmt.Printf("  %v\n", resp.HTTPResponse.Status)

	// start UI loop
	t := time.NewTicker(500 * time.Millisecond)
	defer t.Stop()

Loop:
	for {
		select {
		case <-t.C:
			fmt.Printf("  transferred %v / %v bytes (%.2f%%)\n",
				resp.BytesComplete(),
				resp.Size,
				100*resp.Progress())

		case <-resp.Done:
			// download is complete
			break Loop
		}
	}

	// check for errors
	if err := resp.Err(); err != nil {
		fmt.Fprintf(os.Stderr, "Download failed: %v\n", err)
		os.Exit(1)
	}

	fmt.Printf("Download saved to ./%v \n", resp.Filename)

	// Output:
	// Downloading http://www.golang-book.com/public/pdf/gobook.pdf...
	//   200 OK
	//   transferred 42970 / 2893557 bytes (1.49%)
	//   transferred 1207474 / 2893557 bytes (41.73%)
	//   transferred 2758210 / 2893557 bytes (95.32%)
	// Download saved to ./gobook.pdf
}

Design trade-offs

The primary use case for Grab is concurrently downloading thousands of large files from remote file repositories where the remote files are immutable. Examples include operating system package repositories or ISO libraries.

Grab aims to provide robust, sane defaults. These are usually determined using the HTTP specifications, or by mimicking the behavior of common web clients like cURL, wget and common web browsers.

Grab aims to be stateless. The only state that exists is the remote files you wish to download and the local copy, which may be completed, partially completed or not yet created. The advantage of this is that the local file system is not cluttered unnecessarily with additional state files (like a .crdownload file). The disadvantage of this approach is that grab must make assumptions about the local and remote state; specifically, that they have not been modified by another program.

If the local or remote file is modified outside of grab, and you download the file again with resuming enabled, the local file will likely become corrupted. In this case, you might consider making remote files immutable, or disabling resume.

Grab aims to enable best-in-class functionality for more complex features through extensible interfaces, rather than reimplementation. For example, you can provide your own Hash algorithm to compute file checksums, or your own rate limiter implementation (with all the associated trade-offs) to rate limit downloads.

# Functions

Get sends an HTTP request and downloads the content of the requested URL to the given destination file path.
GetBatch sends multiple HTTP requests and downloads the content of the requested URLs to the given destination directory using the given number of concurrent worker goroutines.
IsStatusCodeError returns true if the given error is of type StatusCodeError.
NewClient returns a new file download Client, using default configuration.
NewRequest returns a new file transfer Request suitable for use with Client.Do.

# Variables

DefaultClient is the default client and is used by all Get convenience functions.
ErrBadChecksum indicates that a downloaded file failed to pass checksum validation.
ErrBadLength indicates that the server response or an existing file does not match the expected content length.
ErrFileExists indicates that the destination path already exists.
ErrNoFilename indicates that a reasonable filename could not be automatically determined using the URL or response headers from a server.
ErrNoTimestamp indicates that a timestamp could not be automatically determined using the response headers from the remote server.

# Structs

A Client is a file download client.
A Request represents an HTTP file transfer request to be sent by a Client.
Response represents the response to a completed or in-progress download request.

# Interfaces

RateLimiter is an interface that must be satisfied by any third-party rate limiters that may be used to limit download transfer speeds.

# Type aliases

A Hook is a user-provided callback function that can be called by grab at various stages of a request's lifecycle.
StatusCodeError indicates that the server response had a status code that was not in the 200-299 range (after following any redirects).