github.com/mongodb/amboy
0.0.0-20250303191223-d8329aeac5c7
Repository: https://github.com/mongodb/amboy.git
Documentation: pkg.go.dev

# README

================================================
amboy -- Job and Worker Pool Infrastructure
================================================

Overview
--------

Amboy is a collection of interfaces and tools for running and managing asynchronous background work queues in the context of Go programs, and provides a number of interchangeable and robust methods for running jobs.

Features
--------

Queues
~~~~~~

Queue implementations impose ordering and dispatching behavior, and
describe the storage of jobs before and after work is
complete. Current queue implementations include:

- a limited size queue that keeps a fixed number of completed jobs in
  memory, which is ideal for long-running background processes.

- remote queues that store all jobs in an external storage system
  (e.g. a database) to support architectures where multiple processes
  can service the same underlying queue.
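
To make the "fixed number of completed jobs" idea concrete, here is a minimal sketch that uses only the standard library. The ``completedRing`` type and its methods are hypothetical names invented for this example; they are not part of amboy and say nothing about how its limited size queue is implemented.

```go
package main

import "fmt"

// completedRing retains the IDs of at most `capacity` completed jobs,
// evicting the oldest entry once the limit is reached.
// Hypothetical type for illustration only; not part of amboy.
type completedRing struct {
	capacity int
	ids      []string
}

func newCompletedRing(capacity int) *completedRing {
	return &completedRing{capacity: capacity}
}

// Add records a completed job ID, dropping the oldest one if the ring is full.
func (r *completedRing) Add(id string) {
	if r.capacity <= 0 {
		return
	}
	if len(r.ids) == r.capacity {
		r.ids = r.ids[1:]
	}
	r.ids = append(r.ids, id)
}

// IDs returns a copy of the retained IDs, oldest first.
func (r *completedRing) IDs() []string {
	out := make([]string, len(r.ids))
	copy(out, r.ids)
	return out
}

func main() {
	ring := newCompletedRing(2)
	for _, id := range []string{"job-1", "job-2", "job-3"} {
		ring.Add(id)
	}
	fmt.Println(ring.IDs()) // prints [job-2 job-3]
}
```

Bounding the retained results keeps memory use flat in a long-running process, at the cost of losing the results of older jobs.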

Queue Groups
~~~~~~~~~~~~

The `QueueGroup <https://godoc.org/github.com/mongodb/amboy#QueueGroup>`_
interface provides a mechanism to manage collections of queues. Both remote and
local versions of the queue group are available. These groups make it possible
to create new queues at runtime and improve the isolation of queues from each
other.

Retryable Queues
~~~~~~~~~~~~~~~~

The `RetryableQueue
<https://godoc.org/github.com/mongodb/amboy#RetryableQueue>`_ interface provides
a superset of the queue functionality. Along with regular queue operations, it
also supports jobs that can retry. When a job finishes executing and needs to
retry (e.g. due to a transient error), the retryable queue will automatically
re-run the job.
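
The retry behavior described above can be sketched as a plain loop. This standard-library example is an illustration under stated assumptions: ``errTransient`` and ``runWithRetry`` are hypothetical names, and amboy's actual retry handling (see ``JobRetryOptions`` and ``RetryHandler``) is configured differently.

```go
package main

import (
	"errors"
	"fmt"
)

// errTransient marks a failure that is worth retrying.
// Hypothetical sentinel for this example; not an amboy error.
var errTransient = errors.New("transient failure")

// runWithRetry calls fn until it succeeds, fails with a non-transient error,
// or maxAttempts is exhausted. It returns the number of attempts made and
// the final error.
func runWithRetry(maxAttempts int, fn func(attempt int) error) (int, error) {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = fn(attempt)
		if err == nil || !errors.Is(err, errTransient) {
			return attempt, err
		}
	}
	return maxAttempts, err
}

func main() {
	attempts, err := runWithRetry(5, func(attempt int) error {
		if attempt < 3 {
			return fmt.Errorf("attempt %d: %w", attempt, errTransient)
		}
		return nil
	})
	fmt.Println(attempts, err) // prints 3 <nil>
}
```

A real retry mechanism would usually also wait between attempts; that is omitted here to keep the sketch short.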

Runners
~~~~~~~

Runners are the execution component of the worker pool. They are
embedded within queues and can be injected at run time before
starting the queue. The `LocalWorkers
<https://godoc.org/github.com/mongodb/amboy/pool#LocalWorkers>`_
implementation executes jobs in a fixed-size worker pool and is
the default for most queue implementations.

Additional implementations provide rate limiting. It would also be possible to
implement runners that use the REST interface to distribute workers to a
larger pool of processes; existing runners simply use goroutines.
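
As a rough illustration of a fixed-size pool built on goroutines, the sketch below dispatches jobs over a channel to a set number of workers. It uses only the standard library; ``runPool`` is a hypothetical helper and does not reflect how ``LocalWorkers`` is written.

```go
package main

import (
	"fmt"
	"sync"
)

// runPool executes every job using a fixed number of worker goroutines and
// returns how many jobs completed. Hypothetical helper for illustration.
func runPool(workers int, jobs []func()) int {
	if workers < 1 {
		workers = 1
	}
	work := make(chan func())
	var (
		wg        sync.WaitGroup
		mu        sync.Mutex
		completed int
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Each worker pulls jobs until the channel is closed.
			for job := range work {
				job()
				mu.Lock()
				completed++
				mu.Unlock()
			}
		}()
	}
	for _, job := range jobs {
		work <- job
	}
	close(work)
	wg.Wait()
	return completed
}

func main() {
	var mu sync.Mutex
	sum := 0
	jobs := make([]func(), 0, 10)
	for i := 1; i <= 10; i++ {
		n := i
		jobs = append(jobs, func() {
			mu.Lock()
			sum += n
			mu.Unlock()
		})
	}
	done := runPool(3, jobs)
	fmt.Println(done, sum) // prints 10 55
}
```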

Dependencies
~~~~~~~~~~~~

The `DependencyManager
<https://godoc.org/github.com/mongodb/amboy/dependency#Manager>`_
interface makes it possible for jobs to express relationships to each
other and to their environment so that Job operations can noop or
block if their requirements are not satisfied. The data about
relationships between jobs can inform job ordering.

The handling of dependency information is the responsibility of the
queue implementation. Unless explicitly stated, queue implementations
do not support dependencies.
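
One way to picture how dependency data can drive a no-op, a block, or a dispatch is the sketch below. The ``state`` values, the ``job`` struct, and the ``resolve`` and ``runAll`` functions are hypothetical and exist only for this example; the real contract is the ``DependencyManager`` interface in the ``dependency`` package.

```go
package main

import "fmt"

// state is a hypothetical dependency outcome used only in this sketch.
type state int

const (
	ready   state = iota // requirements are met: the job should run
	passed               // the work is already done: running is a no-op
	blocked              // a requirement is unmet: the job must wait
)

// job names the other jobs that must finish before it may run.
type job struct {
	ID       string
	Requires []string
}

// resolve reports whether a job should run, be skipped, or wait.
func resolve(j job, done map[string]bool) state {
	if done[j.ID] {
		return passed
	}
	for _, dep := range j.Requires {
		if !done[dep] {
			return blocked
		}
	}
	return ready
}

// runAll repeatedly dispatches ready jobs until no job can make progress,
// and returns the order in which jobs ran.
func runAll(jobs []job) []string {
	done := map[string]bool{}
	var order []string
	for progressed := true; progressed; {
		progressed = false
		for _, j := range jobs {
			if resolve(j, done) == ready {
				done[j.ID] = true
				order = append(order, j.ID)
				progressed = true
			}
		}
	}
	return order
}

func main() {
	jobs := []job{
		{ID: "deploy", Requires: []string{"build", "test"}},
		{ID: "test", Requires: []string{"build"}},
		{ID: "build"},
	}
	fmt.Println(runAll(jobs)) // prints [build test deploy]
}
```

Jobs whose requirements can never be met simply stay blocked, so the loop ends without running them.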

Management
~~~~~~~~~~

The `management package
<https://godoc.org/github.com/mongodb/amboy/management>`_ centers around a
`management interface
<https://godoc.org/github.com/mongodb/amboy/management#Manager>`_ that provides
methods for reporting and safely interacting with the state of jobs.

REST Interface
~~~~~~~~~~~~~~

The REST interface provides tools to manage jobs in an Amboy queue provided as a
service. The rest package in Amboy provides the tools to build clients and
services, although any client that can construct a JSON-formatted Job object can
use the REST API.

Additionally, the REST package provides remote implementations of the `management
interface <https://godoc.org/github.com/mongodb/amboy/rest#ManagementService>`_,
which makes it possible to manage and report on the jobs in an existing queue,
and the `abortable pool
<https://godoc.org/github.com/mongodb/amboy/rest#AbortablePoolManagementService>`_
interface, which makes it possible to abort running jobs. These management tools
can help administrators of larger amboy systems gain insight into the current
behavior of the system, and promote safe and gentle operational interventions.

See the documentation of the `REST package
<https://godoc.org/github.com/mongodb/amboy/rest>`_ for details.

Logger
~~~~~~

The Logger package provides an amboy.Queue-backed implementation of the grip
logging system's sender interface for asynchronous log message delivery. These
jobs do not support remote-backed queues.

Patterns
--------

The following patterns have emerged during our use of Amboy.

Base Job
~~~~~~~~

Embed the `job.Base <https://godoc.org/github.com/mongodb/amboy/job/#Base>`_
type in your Job implementations. This provides a number of helpers for
basic job definition, in addition to implementations of all general methods in
the interface. With the Base, you only need to implement a ``Run()`` method and
whatever application logic is required for the job.

The only case where embedding the Base type *may* be contraindicated is in
conjunction with the REST interface, as the Base type may require more
complicated initialization processes.
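
The embedding pattern can be sketched with plain Go struct embedding. Everything in this example is a hypothetical stand-in: ``base``, ``job``, and ``greetJob`` are invented names, and the real ``job.Base`` type and ``amboy.Job`` interface have more methods and different signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// base is a hypothetical stand-in for a shared job base type. It carries the
// bookkeeping every job needs so concrete jobs only have to define Run.
type base struct {
	id   string
	done bool
	err  error
}

func (b *base) ID() string      { return b.id }
func (b *base) MarkComplete()   { b.done = true }
func (b *base) Completed() bool { return b.done }
func (b *base) Error() error    { return b.err }

// AddError keeps the first non-nil error reported while the job runs.
func (b *base) AddError(err error) {
	if err != nil && b.err == nil {
		b.err = err
	}
}

// job is the hypothetical interface a queue would work with.
type job interface {
	ID() string
	Run()
	Completed() bool
	Error() error
}

// greetJob embeds base and supplies only its own fields and Run method.
type greetJob struct {
	base
	Name   string
	Output string
}

func (j *greetJob) Run() {
	defer j.MarkComplete()
	if j.Name == "" {
		j.AddError(errors.New("name is required"))
		return
	}
	j.Output = "hello, " + j.Name
}

func main() {
	var j job = &greetJob{base: base{id: "greet-1"}, Name: "amboy"}
	j.Run()
	fmt.Println(j.ID(), j.Completed(), j.Error()) // prints greet-1 true <nil>
}
```

Because the embedded type's methods are promoted, ``*greetJob`` satisfies the whole interface while defining only ``Run``.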

Change Queue Implementations for Different Deployment Architectures
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If your core application operations are implemented in terms of
Jobs, then you can: execute them independently of queues by calling
the ``Run()`` method, use a locally backed queue for synchronous
operation for short-running queues, and use a limited size queue or
remote-backed queue as part of a long-running service.

Please submit pull requests or `issues <https://github.com/mongodb/amboy>`_ with
additional examples of amboy use.

API and Documentation
---------------------

See the `API documentation <https://godoc.org/github.com/mongodb/amboy>`_ for
more information about amboy interfaces and internals.

Notice for External Users
-------------------------

Amboy is being continuously developed for `Evergreen
<https://github.com/evergreen-ci/evergreen>`_. This is not a stable library, and
upgrades are at your own risk: it may be changed to add, remove, or modify
functionality in a way that breaks backward compatibility.

Development
-----------

Getting Started
~~~~~~~~~~~~~~~

Amboy uses Go modules. To download the modules ::

    make mod-tidy

All project automation is managed by a makefile, with all output captured in the
`build` directory. Consider the following operations: ::

   make compile                 # runs a test compile
   make test                    # tests all packages
   make test-<package>          # runs the tests only for a specific package
   make lint                    # lints all packages
   make lint-<package>          # lints a specific package
   make html-coverage           # generates the HTML coverage report for all packages
   make html-coverage-<package> # generates the HTML coverage report for a specific package

The build system also has a number of flags, which may be useful for more
iterative development workflows: ::

  RUN_TEST=<TestName>   # specify a test name or regex to run a subset of tests
  RUN_COUNT=<num>       # run a test more than once to isolate an intermittent failure
  RACE_DETECTOR=true    # run specified tests with the race detector enabled

Issues
~~~~~~

Please file all issues in the `EVG project
<https://jira.mongodb.org/browse/EVG>`_ in the `MongoDB Jira
<https://jira.mongodb.org/>`_ instance.

# Packages

Edges: Dependencies have methods to add or access Edges of the job.
Package job provides tools and generic implementations of jobs for amboy Queues.
Package logger is a set of implementations to support amboy.Queue backed grip/send.Senders for asynchronous and (generally) non-blocking log message delivery.
Package management provides increased observability and control of the state of amboy queues.
Local Workers Pool: LocalWorkers is a simple worker pool implementation that spawns a collection of (n) workers and dispatches jobs to worker threads, which consume work items from the Queue's Next() method.
Package queue provides several implementations of the amboy.Queue and amboy.RemoteQueue interfaces capable of processing amboy.Job implementations.
Package registry contains infrastructure to support the persistence of Job definitions.

# Functions

CollateWriteErrors collates errors into a [writeErrors].
EnqueueManyUniqueJobs is a generic wrapper for adding jobs to a queue (using the PutMany() method) ignoring duplicate job errors.
EnqueueUniqueJob is a generic wrapper for adding jobs to queues (using the Put() method), but that ignores duplicate job errors.
GroupQueueOperationFactory produces a QueueOperation that aggregates and runs one or more QueueOperations.
IntervalGroupQueueOperation schedules jobs on a queue group with similar semantics as IntervalQueueOperation.
IntervalQueueOperation runs a queue scheduling operation on a regular interval, starting at specific time.
IsDuplicateJobError checks if an error is due to a duplicate job in the queue.
IsDuplicateJobScopeError checks if an error is due to a duplicate job scope in the queue.
IsJobNotFound checks if an error was due to not being able to find the job in the queue.
MakeDuplicateJobError constructs a duplicate job error from an existing error of any type, for use by queue implementations.
MakeDuplicateJobScopeError constructs a duplicate job scope error from an existing error of any type, for use by queue implementations.
MakeJobNotFoundError constructs an error from an existing one, indicating that a job could not be found in the queue.
NewDuplicateJobError creates a new error to represent a duplicate job error, for use by queue implementations.
NewDuplicateJobErrorf creates a new error to represent a duplicate job error with a formatted message, for use by queue implementations.
NewDuplicateJobScopeError creates a new error object to represent a duplicate job scope error, for use by queue implementations.
NewDuplicateJobScopeErrorf creates a new error object to represent a duplicate job scope error with a formatted message, for use by queue implementations.
NewJobInfo creates a JobInfo from a Job.
NewJobNotFoundError creates a new error indicating that a job could not be found in the queue.
NewJobNotFoundErrorf creates a new error with a formatted message, indicating that a job could not be found in the queue.
PopulateQueue adds Jobs from a channel to a Queue and returns an error with the aggregated results of these operations.
Report returns a QueueReport status for the state of a Queue.
ResolveErrors takes a Queue and iterates over the completed Jobs' results, returning a single aggregated error for all of the Queue's Jobs.
RunJob executes a single job directly, without a Queue, with similar semantics as it would execute in a Queue: MaxTime is respected, and it uses similar logging as is present in the queue, with errors propagated functionally.
ScheduleJobFactory produces a QueueOperation that calls a single function which returns a Job and puts that job into the queue.
ScheduleJobsFromGeneratorFactory produces a queue operation that calls a single generator function which returns a channel of Jobs and puts those jobs into the queue.
ScheduleManyJobsFactory produces a queue operation that calls a single function which returns a slice of jobs and puts those jobs into the queue.
Wait takes a queue and blocks until all jobs are completed or the context is canceled.
WaitInterval provides the Wait operation and accepts a context for cancellation while also waiting for an interval between stats calls.
WaitIntervalNum waits for a certain number of jobs to complete.
WaitJob blocks until the job, based on its ID, is marked complete in the queue, or the context is canceled.
WaitJobInterval takes a job and queue object and waits for the job to be marked complete.
WithRetryableQueue is a convenience function to perform an operation if the Queue is a RetryableQueue; otherwise, it is a no-op.
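
Several of the functions above revolve around one idea: a queue signals a duplicate job with a recognizable error, and a wrapper may choose to ignore it. The sketch below shows that idea with the standard library only. ``memQueue``, ``errDuplicate``, ``isDuplicate``, and ``putUnique`` are hypothetical names for this example and do not reproduce the signatures of ``EnqueueUniqueJob`` or ``IsDuplicateJobError``.

```go
package main

import (
	"errors"
	"fmt"
)

// errDuplicate is a hypothetical sentinel standing in for a duplicate job error.
var errDuplicate = errors.New("duplicate job")

// memQueue is a toy in-memory queue that tracks job IDs only.
type memQueue struct {
	jobs map[string]bool
}

// Put adds a job ID and reports a wrapped errDuplicate if it already exists.
func (q *memQueue) Put(id string) error {
	if q.jobs[id] {
		return fmt.Errorf("job %q: %w", id, errDuplicate)
	}
	q.jobs[id] = true
	return nil
}

// isDuplicate reports whether err was caused by a duplicate job.
func isDuplicate(err error) bool {
	return errors.Is(err, errDuplicate)
}

// putUnique adds a job and treats "already present" as success, while still
// returning any other error.
func putUnique(q *memQueue, id string) error {
	if err := q.Put(id); err != nil && !isDuplicate(err) {
		return err
	}
	return nil
}

func main() {
	q := &memQueue{jobs: map[string]bool{}}
	fmt.Println(putUnique(q, "report-1")) // prints <nil>
	fmt.Println(putUnique(q, "report-1")) // duplicate is ignored: prints <nil>
	fmt.Println(len(q.jobs))              // prints 1
}
```

Wrapping the sentinel with ``%w`` lets callers add context to the message without losing the ability to detect the duplicate case.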

# Constants

Supported values of the Format type, which represent different supported serialization methods.
LockTimeout describes the default period of time that a queue will respect a stale lock from another queue before beginning work on a job.
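
The stale-lock rule that LockTimeout describes can be sketched as a time comparison. This is an assumption-laden illustration: the ``lock`` struct, the ``canTakeOver`` function, and the ten-minute value of ``lockTimeout`` are all hypothetical and chosen for the example, not taken from amboy.

```go
package main

import (
	"fmt"
	"time"
)

// lockTimeout is a hypothetical value chosen for this example.
const lockTimeout = 10 * time.Minute

// lock records which queue instance holds a job and when it last refreshed
// its claim. Hypothetical type for illustration.
type lock struct {
	Owner      string
	ModifiedAt time.Time
}

// canTakeOver reports whether the caller may begin work on the job: either
// nobody else holds the lock, or the holder's claim has gone stale.
func canTakeOver(l lock, self string, now time.Time) bool {
	if l.Owner == "" || l.Owner == self {
		return true
	}
	return now.Sub(l.ModifiedAt) > lockTimeout
}

func main() {
	start := time.Date(2024, 1, 1, 12, 0, 0, 0, time.UTC)
	held := lock{Owner: "queue-a", ModifiedAt: start}

	fmt.Println(canTakeOver(held, "queue-b", start.Add(5*time.Minute)))  // prints false
	fmt.Println(canTakeOver(held, "queue-b", start.Add(11*time.Minute))) // prints true
}
```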

# Structs

GroupQueueOperation describes a single queue population operation for a group queue.
JobInfo provides a view of information for a Job.
JobNotFoundError represents an error indicating that a job could not be found in a queue.
JobRetryInfo stores configuration and information for a job that can retry.
JobRetryOptions represents configuration options for a job that can retry.
JobStatusInfo contains information about the current status of a job and is reported by the Status and set by the SetStatus methods in the Job interface.
JobTimeInfo stores timing information for a job and is used by both the Runner and Job implementations to track how long jobs take to execute.
JobType contains information about the type of a job, which queues can use to serialize objects.
QueueInfo describes runtime information associated with a Queue.
QueueOperationConfig describes the behavior of the periodic interval schedulers.
QueueReport holds the IDs of Jobs in a Queue based on their current state.
QueueStats is a simple structure that the Stats() method in the Queue interface returns and tracks the state of the queue, and provides a common format for different Queue implementations to report on their state.
RetryHandlerOptions configures the behavior of a RetryHandler.

# Interfaces

AbortableRunner provides a superset of the Runner interface but allows callers to abort jobs by ID.
Job describes a unit of work.
Queue describes a very simple Job queue interface that allows users to define Job objects, add them to a worker queue, and execute jobs from that queue.
QueueGroup describes a group of queues.
QueueOptions represent options for an individual queue in a queue group.
RetryableQueue is the same as a Queue but supports additional operations for retryable jobs.
RetryHandler provides a means to retry jobs (see JobRetryOptions) within a RetryableQueue.
Runner describes a simple worker interface for executing jobs in the context of a Queue.

# Type aliases

Format defines a sequence of constants used to distinguish between different serialization formats for job objects used in the amboy.ConvertTo and amboy.ConvertFrom functions, which support the functionality of the Export and Import methods in the job interface.
QueueOperation is a named function literal for use in the PeriodicQueueOperation function.