github.com/aromatt/cachenv
Version: 0.0.0-20241217190405-305832e8c696
Repository: https://github.com/aromatt/cachenv.git
Documentation: pkg.go.dev

# README

cachenv

Versatile memoizing cache for program invocations, with a virtualenv-like interface.

Note: this repo is still in early development; the README is aspirational.

Overview

cachenv is a lightweight tool that provides caching for your commands, scripts, and pipelines. In fact, any program that calls exec() can use cachenv.

It's like function memoization, but at the process boundary, which makes it useful for testing, rapid iteration, and ad hoc data processing.

The workflow mirrors that of virtualenv: you create an environment, activate it, and work within it.

Use Cases

  • Creating consistent, dependency-free testing environments
  • Rapidly iterating on scripts or notebooks that rely on external services or data processing
  • Efficiently constructing CLI pipelines that involve large inputs or expensive filters/aggregations
  • Eliminating redundant calls to metered APIs

How It Works

Just like virtualenv, cachenv prepends a directory of symlinks to your PATH, so the symlinks are found before the real programs.

Calls to cached programs are intercepted, and the cache is checked (the cache key is a hash of the program name and arguments). On cache hits, the output is returned immediately. On misses, the original program is executed with the provided arguments, and the cache is updated.
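
The hit/miss flow described above can be sketched in Go as follows. This is a simplified illustration under stated assumptions, not cachenv's implementation: the names cacheKey, memo, runner, and execRunner are hypothetical, SHA-256 stands in for whichever hash is actually used, and the cache is an in-memory map rather than persistent storage.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"os/exec"
)

// result is what this sketch stores for one invocation.
type result struct {
	Stdout   []byte
	ExitCode int
}

// cacheKey hashes the program name and its arguments. A NUL byte follows
// each field so that ("a", "bc") and ("ab", "c") produce different keys.
func cacheKey(program string, args []string) string {
	h := sha256.New()
	h.Write([]byte(program))
	h.Write([]byte{0})
	for _, a := range args {
		h.Write([]byte(a))
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}

// runner executes a program; it is injectable so the cache logic can be
// exercised without spawning processes.
type runner func(program string, args []string) (result, error)

type memo struct {
	store map[string]result
	run   runner
}

// call returns the cached result on a hit; on a miss it runs the program
// and stores the result. The bool reports whether the cache was hit.
func (m *memo) call(program string, args ...string) (result, bool, error) {
	key := cacheKey(program, args)
	if r, ok := m.store[key]; ok {
		return r, true, nil
	}
	r, err := m.run(program, args)
	if err != nil {
		return result{}, false, err
	}
	m.store[key] = r
	return r, false, nil
}

// execRunner runs the real program and captures stdout and the exit code.
func execRunner(program string, args []string) (result, error) {
	var out bytes.Buffer
	cmd := exec.Command(program, args...)
	cmd.Stdout = &out
	err := cmd.Run()
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		// A non-zero exit is a valid outcome to cache, not a failure to run.
		return result{Stdout: out.Bytes(), ExitCode: exitErr.ExitCode()}, nil
	}
	if err != nil {
		return result{}, err
	}
	return result{Stdout: out.Bytes()}, nil
}

func main() {
	m := &memo{store: map[string]result{}, run: execRunner}
	for i := 0; i < 2; i++ {
		r, hit, err := m.call("echo", "hello")
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("hit=%v exit=%d stdout=%q\n", hit, r.ExitCode, r.Stdout)
	}
}
```

Because the key in this sketch depends only on the program name and arguments, a second identical call is served from the map even if the program's real output would have changed, which is the behavior the ls example under Usage demonstrates.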

Usage

Initialize and activate a new cachenv:

$ cachenv init .cachenv
Created activate script at .cachenv/activate

$ source .cachenv/activate

Enable memoization for ls:

(.cachenv) $ cachenv add ls
Command 'ls' added to memoized commands.
Refreshed symlink for ls

Enjoy memoization for ls:

(.cachenv) $ ls
foo

(.cachenv) $ touch bar
(.cachenv) $ ls
foo

Try diff mode:

(.cachenv) $ cachenv diff ls
0a1
> bar

Features

| Implemented | Feature | Description |
|---|---|---|
|  | Comprehensive Caching | Captures stdout, stderr, and exit codes, providing a complete snapshot of a program's behavior given a particular input. |
|  | Selective Memoization | Supports precise configuration to selectively enable caching based on program name, arguments, and/or input patterns. |
|  | Streaming Mode | Supports caching at the line level, keyed by stdin. |
|  | File Awareness | Can optionally distinguish cache entries based on the contents of files provided as arguments (e.g., for grep foo bar.txt, refresh the cache when the content of bar.txt changes). |
|  | Pipeline Compatibility | Naturally integrates with command pipelines, allowing a mix of cached and live executions within complex command chains. Also includes optimizations which can effectively cache entire pipelines. |
|  | Diff Mode | Can show changes in a program's behavior against a cached snapshot. |
|  | Cross-Environment Portability | Enables cache sharing and reuse across different machines and operating systems. |