# README
Dynamic Batching for Deep Learning Serving
Ventu already implements this protocol, so it can be used as a worker for deep learning inference.
## Features
- dynamic batching bounded by max batch size and max latency (see the sketch below)
- an invalid request won't affect other requests in the same batch
- communicate with workers through Unix domain socket or TCP
- load balancing
If you are interested in the design, see my blog post Deep Learning Serving Framework.
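As a hedged illustration of the batching feature, here is a minimal Go sketch of the batch-or-timeout policy: a batch is flushed once it reaches the maximum size or once the latency window expires, whichever comes first. The channel setup and names are mine for illustration, not the service's actual implementation.

```go
package main

import (
	"fmt"
	"time"
)

// batcher flushes a batch when it reaches maxBatch jobs or when
// maxLatency has elapsed since the first job arrived.
func batcher(jobs <-chan string, maxBatch int, maxLatency time.Duration) {
	for {
		first, ok := <-jobs // block until at least one job arrives
		if !ok {
			return
		}
		batch := []string{first}
		timer := time.NewTimer(maxLatency)
	collect:
		for len(batch) < maxBatch {
			select {
			case job, ok := <-jobs:
				if !ok {
					break collect
				}
				batch = append(batch, job)
			case <-timer.C:
				break collect // latency budget exhausted
			}
		}
		timer.Stop()
		fmt.Printf("flushing a batch of %d jobs\n", len(batch))
	}
}

func main() {
	jobs := make(chan string)
	go batcher(jobs, 32, 10*time.Millisecond) // defaults: -batch 32, -latency 10
	for i := 0; i < 100; i++ {
		jobs <- fmt.Sprintf("job-%d", i)
	}
	close(jobs)
	time.Sleep(50 * time.Millisecond) // let the final flush print
}
```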
## Configs
Running `go run service/app.go --help` prints:

```
Usage of app:
  -address string
        socket file or host:port (default "batch.socket")
  -batch int
        max batch size (default 32)
  -capacity int
        max jobs in the queue (default 1024)
  -host string
        host address (default "0.0.0.0")
  -latency int
        max latency (millisecond) (default 10)
  -port int
        service port (default 8080)
  -protocol string
        unix or tcp (default "unix")
  -timeout int
        timeout for a job (millisecond) (default 5000)
```
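For example, to communicate with workers over TCP and allow a larger batch with a longer latency budget (the flag values here are illustrative):

```
go run service/app.go -protocol tcp -address 127.0.0.1:9999 -batch 64 -latency 20
```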
## Demo

```
go run service/app.go
python examples/app.py
python examples/client.py
```
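For comparison with `examples/client.py`, a minimal Go client might look like the sketch below. It assumes the service accepts HTTP POST on the default port; the path and payload format are placeholders, not the documented API.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

func main() {
	// "/" and the JSON body are placeholders; see examples/client.py
	// for the real request format.
	resp, err := http.Post("http://127.0.0.1:8080/", "application/json",
		bytes.NewBufferString(`{"data": "placeholder"}`))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```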
# Functions
NewBatching creates a Batching instance.
# Constants
ErrorIDsKey defines the key for error IDs.
IntByteLength defines the number of bytes used to encode the `length` of a data payload.
UUIDLength defines the length of the UUID string.
# Type aliases
String2Bytes is the structure used in the socket communication protocol.
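Taken together, ErrorIDsKey, IntByteLength, UUIDLength, and String2Bytes hint at a length-prefixed, UUID-keyed wire format. The following is a worker-side framing sketch only; the 4-byte big-endian prefix and the payload layout are my assumptions, not the documented protocol.

```go
package main

import (
	"encoding/binary"
	"io"
	"net"
)

// readFrame reads one length-prefixed message from the connection.
func readFrame(conn net.Conn) ([]byte, error) {
	head := make([]byte, 4) // assumes IntByteLength == 4
	if _, err := io.ReadFull(conn, head); err != nil {
		return nil, err
	}
	size := binary.BigEndian.Uint32(head)
	body := make([]byte, size)
	if _, err := io.ReadFull(conn, body); err != nil {
		return nil, err
	}
	// body presumably decodes into a String2Bytes map keyed by request
	// UUIDs (an assumption based on the names above); decoding is omitted.
	return body, nil
}

func main() {
	conn, err := net.Dial("unix", "batch.socket") // default -address
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	if _, err := readFrame(conn); err != nil {
		panic(err)
	}
}
```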