# README
cachenv
Versatile memoizing cache for program invocations, with a virtualenv-like interface.
Note: this repo is still in early development; the README is aspirational.
## Overview
cachenv is a lightweight tool that provides caching for your commands, scripts, and pipelines. In fact, any program that calls exec() can use cachenv.
It's like function memoization, but at the process boundary. This is especially useful for testing, rapid iteration, and ad hoc data processing.
The workflow mirrors that of virtualenv: you create an environment, activate it, and work within it.
## Use Cases
- Creating consistent, dependency-free testing environments
- Rapidly iterating on scripts or notebooks that rely on external services or data processing
- Efficiently constructing CLI pipelines that involve large inputs or expensive filters/aggregations
- Eliminating redundant calls to metered APIs
## How It Works
Just like virtualenv, cachenv inserts symlinks at the front of your PATH.
Calls to cached programs are intercepted, and the cache is checked (the cache key is a hash of the program name and arguments). On cache hits, the output is returned immediately. On misses, the original program is executed with the provided arguments, and the cache is updated.
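The hit/miss flow described above can be sketched in Python. This is an illustration under stated assumptions, not cachenv's actual implementation: the function names (`find_real_program`, `cache_key`, `run_memoized`), the `shim_dir` parameter, the SHA-256-over-JSON key encoding, and the in-memory dict used as the cache are all hypothetical.

```python
import hashlib
import json
import os
import shutil
import subprocess


def find_real_program(name, shim_dir, path=None):
    """Return the first executable called `name` on PATH that is outside `shim_dir`."""
    if path is None:
        path = os.environ.get("PATH", "")
    shim_dir = os.path.realpath(shim_dir)
    # Drop the (assumed) symlink directory so the lookup cannot find the shim itself.
    remaining = [
        d for d in path.split(os.pathsep)
        if d and os.path.realpath(d) != shim_dir
    ]
    return shutil.which(name, path=os.pathsep.join(remaining))


def cache_key(program, args):
    """Hash the program name and arguments (JSON keeps argument boundaries intact)."""
    payload = json.dumps([program, *args]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_memoized(program, args, cache, shim_dir, path=None):
    """Return the cached result on a hit; otherwise run the real program and store it."""
    key = cache_key(program, args)
    if key in cache:
        return cache[key]  # cache hit: the program is not executed

    real = find_real_program(program, shim_dir, path)
    if real is None:
        raise FileNotFoundError(f"{program} not found outside {shim_dir}")

    proc = subprocess.run([real, *args], capture_output=True)
    entry = {
        "stdout": proc.stdout,
        "stderr": proc.stderr,
        "returncode": proc.returncode,
    }
    cache[key] = entry  # cache miss: remember the result for next time
    return entry
```

Encoding the argument list as JSON before hashing is one way to keep `["a", "b"]` and `["a b"]` from colliding; a persistent store would replace the dict in any real tool.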
## Usage
Initialize and activate a new cachenv:
$ cachenv init .cachenv
Created activate script at .cachenv/activate
$ source .cachenv/activate
Enable memoization for ls:
(.cachenv) $ cachenv add ls
Command 'ls' added to memoized commands.
Refreshed symlink for ls
Enjoy memoization for ls:
(.cachenv) $ ls
foo
(.cachenv) $ touch bar
(.cachenv) $ ls
foo
Try diff mode:
(.cachenv) $ cachenv diff ls
0a1
> bar
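A comparison between a cached snapshot and a live run could be computed along these lines. The sketch uses Python's `difflib`, which produces unified diff format rather than the classic format shown above. The name `diff_against_snapshot` is hypothetical and not part of cachenv.

```python
import difflib


def diff_against_snapshot(cached_stdout, live_stdout):
    """Return unified-diff lines describing how live output differs from the snapshot."""
    return list(
        difflib.unified_diff(
            cached_stdout.splitlines(),
            live_stdout.splitlines(),
            fromfile="cached",
            tofile="live",
            lineterm="",  # inputs were split without line endings
        )
    )


# Mirrors the session above: the snapshot listed only foo, the live run also lists bar.
for line in diff_against_snapshot("foo\n", "bar\nfoo\n"):
    print(line)
```

An empty result means the live output matches the snapshot.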
## Features
| Implemented | Feature | Description |
|---|---|---|
| ✅ | Comprehensive Caching | Captures stdout, stderr, and exit codes, providing a complete snapshot of a program's behavior given a particular input. |
| | Selective Memoization | Supports precise configuration to selectively enable caching based on program name, arguments, and/or input patterns. |
| | Streaming Mode | Supports caching at the line level, keyed by stdin. |
| | File Awareness | Can optionally distinguish cache entries based on the contents of files provided as arguments (e.g., for grep foo bar.txt, refresh the cache when the content of bar.txt changes). |
| | Pipeline Compatibility | Naturally integrates with command pipelines, allowing a mix of cached and live executions within complex command chains. Also includes optimizations which can effectively cache entire pipelines. |
| ✅ | Diff Mode | Can show changes in a program's behavior against a cached snapshot. |
| ✅ | Cross-Environment Portability | Enables cache sharing and reuse across different machines and operating systems. |
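File Awareness is listed above as not yet implemented, so the following is only a sketch of one way such a cache key might be built. It assumes that any argument naming an existing regular file contributes a digest of that file's contents to the key. The function name `file_aware_key` and the choice of SHA-256 are hypothetical.

```python
import hashlib
import json
import os


def file_aware_key(program, args):
    """Build a cache key that changes when a file passed as an argument changes."""
    parts = []
    for arg in args:
        digest = None
        if os.path.isfile(arg):
            # Assumption: an argument that names an existing file is an input file.
            with open(arg, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
        parts.append([arg, digest])
    payload = json.dumps([program, parts]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

With this scheme, the key for grep foo bar.txt stays stable while bar.txt is unchanged and differs once its contents are edited, which forces a cache miss. Reading whole files is simple but costly for large inputs; a real tool might use size and modification time instead.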