github.com/c-bata/goptuna
Version: 0.9.0
Repository: https://github.com/c-bata/goptuna.git
Documentation: pkg.go.dev

# README

Goptuna


Decentralized hyperparameter optimization framework, inspired by Optuna [1]. This library is designed primarily for machine learning, but anything can be optimized as long as you can define an objective function (e.g. optimizing the number of goroutines of your server or the memory buffer size of a caching system).

Supported algorithms:

Goptuna supports various state-of-the-art Bayesian optimization, evolution strategy, and multi-armed bandit algorithms. All algorithms are implemented in pure Go and continuously benchmarked on GitHub Actions.

  • Random search
  • TPE: Tree-structured Parzen Estimators [2]
  • CMA-ES: Covariance Matrix Adaptation Evolution Strategy [3]
  • IPOP-CMA-ES: CMA-ES with increasing population size [4]
  • BIPOP-CMA-ES: BI-population CMA-ES [5]
  • Median Stopping Rule [6]
  • ASHA: Asynchronous Successive Halving Algorithm (Optuna flavored version) [1,7,8]
  • Quasi-Monte Carlo sampling based on the Sobol sequence [10, 11]

Projects using Goptuna:

Installation

Because Goptuna is implemented in pure Go, it can be integrated into a wide variety of Go projects.

$ go get -u github.com/c-bata/goptuna

Usage

Goptuna supports a Define-by-Run style API like Optuna, which lets you dynamically construct the search space.

Basic usage

package main

import (
    "log"
    "math"

    "github.com/c-bata/goptuna"
    "github.com/c-bata/goptuna/tpe"
)

// ① Define an objective function which returns a value you want to minimize.
func objective(trial goptuna.Trial) (float64, error) {
    // ② Define the search space via Suggest APIs.
    x1, _ := trial.SuggestFloat("x1", -10, 10)
    x2, _ := trial.SuggestFloat("x2", -10, 10)
    return math.Pow(x1-2, 2) + math.Pow(x2+5, 2), nil
}

func main() {
    // ③ Create a study which manages each experiment.
    study, err := goptuna.CreateStudy(
        "goptuna-example",
        goptuna.StudyOptionSampler(tpe.NewSampler()))
    if err != nil { log.Fatal(err) }

    // ④ Evaluate your objective function.
    err = study.Optimize(objective, 100)
    if err != nil { log.Fatal(err) }

    // ⑤ Print the best evaluation parameters.
    v, _ := study.GetBestValue()
    p, _ := study.GetBestParams()
    log.Printf("Best value=%f (x1=%f, x2=%f)",
        v, p["x1"].(float64), p["x2"].(float64))
}

Link: Go Playground

Furthermore, I recommend using the RDB storage backend for the following purposes:

  • Continue from where you stopped in a previous optimization.
  • Scale a study to tens of workers connecting to the same RDB storage.
  • Check optimization results via the built-in dashboard.

Built-in Web Dashboard

You can check optimization results with the built-in web dashboard.

  • SQLite3: $ goptuna dashboard --storage sqlite:///example.db (See here for details).
  • MySQL: $ goptuna dashboard --storage mysql://goptuna:[email protected]:3306/yourdb (See here for details)

Advanced Usage

Parallel optimization with multiple goroutine workers

The Optimize method of the goptuna.Study object is designed to be goroutine safe, so you can easily optimize your objective function with multiple goroutine workers.

package main

import ...

func main() {
    study, _ := goptuna.CreateStudy(...)

    eg, ctx := errgroup.WithContext(context.Background())
    study.WithContext(ctx)
    for i := 0; i < 5; i++ {
        eg.Go(func() error {
            return study.Optimize(objective, 100)
        })
    }
    if err := eg.Wait(); err != nil { ... }
    ...
}

full source code

Distributed optimization using MySQL

No complicated setup is needed to use the RDB storage backend. First, set up a MySQL server as follows to share the optimization results.

$ docker pull mysql:8.0
$ docker run \
  -d \
  --rm \
  -p 3306:3306 \
  -e MYSQL_USER=goptuna \
  -e MYSQL_DATABASE=goptuna \
  -e MYSQL_PASSWORD=password \
  -e MYSQL_ALLOW_EMPTY_PASSWORD=yes \
  --name goptuna-mysql \
  mysql:8.0

Then, create a study object using Goptuna CLI.

$ goptuna create-study --storage mysql://goptuna:password@localhost:3306/yourdb --study yourstudy
yourstudy
$ mysql --host 127.0.0.1 --port 3306 --user goptuna -ppassword -e "SELECT * FROM studies;"
+----------+------------+-----------+
| study_id | study_name | direction |
+----------+------------+-----------+
|        1 | yourstudy  | MINIMIZE  |
+----------+------------+-----------+
1 row in set (0.00 sec)

Finally, run Goptuna workers containing the following code. You can run distributed optimization simply by executing this program on multiple server instances.

package main

import ...

func main() {
    db, _ := gorm.Open(mysql.Open("goptuna:password@tcp(localhost:3306)/yourdb?parseTime=true"), &gorm.Config{
        Logger: logger.Default.LogMode(logger.Silent),
    })
    storage := rdb.NewStorage(db)
    sqlDB, _ := db.DB() // gorm v2: obtain *sql.DB to close the connection
    defer sqlDB.Close()

    study, _ := goptuna.LoadStudy(
        "yourstudy",
        goptuna.StudyOptionStorage(storage),
        ...,
    )
    _ = study.Optimize(objective, 50)
    ...
}

Full source code is available here.

Receive notifications for each trial

You can receive a notification for each trial via a channel. This can be used for logging or integration with notification systems.

package main

import ...

func main() {
    trialchan := make(chan goptuna.FrozenTrial, 8)
    study, _ := goptuna.CreateStudy(
        ...
        goptuna.StudyOptionIgnoreObjectiveErr(true),
        goptuna.StudyOptionSetTrialNotifyChannel(trialchan),
    )

    var err error
    var wg sync.WaitGroup
    wg.Add(2)
    go func() {
        defer wg.Done()
        err = study.Optimize(objective, 100)
        close(trialchan)
    }()
    go func() {
        defer wg.Done()
        for t := range trialchan {
            log.Println("trial", t)
        }
    }()
    wg.Wait()
    if err != nil { ... }
    ...
}

full source code


License

This software is licensed under the MIT license, see LICENSE for more information.

# Packages

No description provided by the author

# Functions

CreateStudy creates a new Study object.
DeleteStudy deletes a study object.
DistributionIsSingle returns whether the distribution contains just a single value.
DistributionToJSON serializes a distribution to JSON format.
IntersectionSearchSpace returns the intersection search space of the Study.
JSONToDistribution deserializes a distribution in JSON format.
LoadStudy loads an existing study.
NewBlackHoleStorage returns a BlackHoleStorage.
NewInMemoryStorage returns a new in-memory storage.
NewRandomSampler implements the random search algorithm.
RandomSamplerOptionSeed sets the seed number.
StudyOptionDefineSearchSpace to use RelativeSampler from the first trial.
StudyOptionDirection changes the direction of optimization.
StudyOptionIgnoreError is an option to continue even if an error is received while running the Optimize method.
StudyOptionLoadIfExists to load the study if it exists.
StudyOptionLogger sets the Logger.
StudyOptionPruner sets the pruner object.
StudyOptionRelativeSampler sets the relative sampler object.
StudyOptionSampler sets the sampler object.
StudyOptionStorage sets the storage object.
StudyOptionTrialNotifyChannel to subscribe to finished trials.
ToExternalRepresentation converts to the external representation.

# Constants

CategoricalDistributionName is the identifier name of CategoricalDistribution.
DiscreteUniformDistributionName is the identifier name of DiscreteUniformDistribution.
InMemoryStorageStudyID is a study id for in memory storage backend.
InMemoryStorageStudyUUID is a UUID for in memory storage backend.
IntUniformDistributionName is the identifier name of IntUniformDistribution.
LoggerLevelDebug logs are typically voluminous, and are usually disabled in production.
LoggerLevelError logs are high-priority.
LoggerLevelInfo is the default logging priority.
LoggerLevelWarn logs are more important than Info, but don't need individual human review.
LogUniformDistributionName is the identifier name of LogUniformDistribution.
StepIntUniformDistributionName is the identifier name of StepIntUniformDistribution.
StudyDirectionMaximize maximizes objective function value.
StudyDirectionMinimize minimizes objective function value.
TrialStateComplete means Trial has been finished without any error.
TrialStateFail means Trial has failed due to an uncaught error.
TrialStatePruned means Trial has been pruned.
TrialStateRunning means Trial is running.
TrialStateWaiting means Trial has been stopped, but may be resumed.
UniformDistributionName is the identifier name of UniformDistribution.

# Variables

DefaultStudyNamePrefix is a prefix of the default study name.
ErrDeleteNonFinishedTrial means that a non-finished trial cannot be deleted.
ErrInvalidStudyID represents an invalid study id.
ErrInvalidTrialID represents an invalid trial id.
ErrNoCompletedTrials means that no trials are completed yet.
ErrTrialAlreadyDeleted means that the trial is already deleted.
ErrTrialCannotBeUpdated means that the trial cannot be updated.
ErrTrialPruned represents a pruned trial.
ErrTrialsPartiallyDeleted means that trials are partially deleted.
ErrUnknownDistribution means that the distribution is unknown.
ErrUnsupportedSearchSpace means that the sampler does not support a given search space.
NewRandomSearchSampler implements the random search algorithm.
RandomSearchSamplerOptionSeed sets the seed number.
StudyOptionSetDirection changes the direction of optimization. Deprecated: use StudyOptionDirection instead.
StudyOptionSetLogger sets the Logger.
StudyOptionSetTrialNotifyChannel to subscribe to finished trials.

# Structs

BlackHoleStorage is an in-memory storage, but designed for over 100k trials.
CategoricalDistribution is a distribution for categorical parameters.
DiscreteUniformDistribution is a discretized uniform distribution in the linear domain.
FrozenTrial holds the status and results of a Trial.
InMemoryStorage stores data in memory of the Go process.
IntUniformDistribution is a uniform distribution on integers.
LogUniformDistribution is a uniform distribution in the log domain.
RandomSampler for random search.
StdLogger wraps 'log' standard library.
StepIntUniformDistribution is a uniform distribution on integers.
Study corresponds to an optimization task, i.e., a set of trials.
StudySummary holds basic attributes and aggregated results of Study.
Trial is a process of evaluating an objective function.
UniformDistribution is a uniform distribution in the linear domain.

# Interfaces

Distribution represents a parameter that can be optimized.
Logger is the interface for logging messages.
Pruner is an interface for early stopping algorithms.
RelativeSampler is the interface for sampling algorithms that use relationship between parameters such as Gaussian Process and CMA-ES.
Sampler is the interface for sampling algorithms that do not use relationship between parameters such as random sampling and TPE.
Storage is an interface that abstracts a backend database and provides internal interfaces to read/write the history of studies and trials.

# Type aliases

FuncObjective is a type of objective function.
LoggerLevel represents a logging priority.
RandomSamplerOption is a type of function to set options.
RandomSearchSampler for random search. Deprecated: renamed to RandomSampler.
RandomSearchSamplerOption is a type of function to set an option.
StudyDirection represents the direction of the optimization.
StudyOption to pass the custom option.
TrialState is a state of Trial.