# README
Chaos test for dsync
This directory contains various sorts of 'chaos' tests for dsync. For instance it simulates the locking process is uneffected while servers that participate in the locking process go down and come back up again.
Tests
This is a list of cases being tested:
testNotEnoughServersForQuorum
: verifies that when quorum cannot be achieved that locking will blocktestServerGoingDown
: tests that a lock is granted when all servers are up, after too many servers die that a new lock will block and once servers are up again, the lock is grantedtestSingleServerOverQuorumDownDuringLock
: verifies that if a server goes down while a lock is held, and comes back later another lock on the same name is not granted too earlytestSingleStaleLock
: verifies that, despite a single stale lock, a new lock can still be acquired on same resourcetestMultipleStaleLocks
: verifies that (before maintenance kicks in) multiple stale locks will prevent a new lock from being granted; and (after maintenance has happened) multiple stale locks not will prevent a new lock from being grantedtestClientThatHasLockCrashes
: verifies that (after a lock maintenance loop) multiple stale locks will not prevent a new lock on same resourcetestTwoClientsThatHaveReadLocksCrash
: like testClientThatHasLockCrashes but with two clients having read lockstestWriterStarvation
: tests that a separate implementation using a pair of two DRWMutexes can prevent writer starvation (due to too many read locks)
Known error cases
testMultipleServersOverQuorumDownDuringLockKnownError
: verifies that if multiple servers go down while a lock is held, and come back later another lock on the same name is granted too early
Building
$ cd chaos
$ go build
Running
$ ./chaos
If it warns about the following
Found more than one chaos process. Killing all and exiting
it has found more than one chaos process (most likely a left over from a previous run of the program), simply repeat the ./chaos
command to try again.
# Functions
No description provided by the author
# Constants
No description provided by the author
For this test framework we have short lock timings Production should use eg following values
const LockMaintenanceLoop = 1 * time.Minute const LockCheckValidityInterval = 2 * time.Minute
.
# Structs
No description provided by the author
ReconnectRPCClient is a wrapper type for rpc.Client which provides reconnect on first failure.
# Interfaces
No description provided by the author