# README
DB Design
Logical Sharding in MongoDB
Overview
WARNING If you plan to change the logical-shard-count
to a higher number,
consult the team first!
Logical sharding distributes data across multiple logical partitions or shards to enhance performance and scalability by reducing write contention. This method is especially beneficial in environments with high write throughput, where many operations update the same data points concurrently.
Implementation
Sharding is achieved by randomly select a shard number to represent the document's ID
(or by appending a shard number to each document's ID, such as {{docId}}:{{shardNumber}}
)
This approach effectively spreads writes across multiple documents, reducing bottlenecks.
Example
func (db *Database) generateOverallStatsId() string {
return fmt.Sprint(rand.Intn(int(db.cfg.LogicalShardCount)))
}
or
func (db *Database) generateXXXStatsId(docId string) string {
randomShardNum := uint64(rand.Intn(int(db.cfg.LogicalShardCount)))
return fmt.Sprintf("%s:%d", docId, randomShardNum)
}
Considerations
Query Complexity
Logical sharding increases query complexity.
Operations like GetOverallStats
must now access multiple shards,
leading to costlier queries as shard count increases.
Configuration Sensitivity
Increasing the LogicalShardCount can further complicate queries. Crucially, once increased, the shard count should never be decreased to avoid disrupting the sharding logic and causing data inconsistencies.
Stats Locking
Overview
Stats Locking refers to the use of a dedicated collection, stats_lock
, in our service.
This collection helps manage state and prevent duplication in statistical calculations,
which is critical in a system designed to handle potential duplicate event messages.
Implementation
The primary key for each document in the stats_lock
collection is
formatted as {{staking-tx-hash-hex}}:{{state}},
where we currently support two states: active and unbonded. Each document tracks
whether specific calculations have been performed,
preventing repetitive operations in case of message duplication.
This mechanism ensures that each stats calculation, whether adding (+) or subtracting (-), is performed only once per transaction, leveraging MongoDB transactions for consistency and reliability.
Future Extension
To accommodate additional calculations in the future,
the stats_lock
collection may need to be extended with new boolean fields
corresponding to each new calculation type.
This approach will maintain the integrity and scalability of the stats locking system.
Example
Consider xyz
as the staking transaction hash hex.
-
Service Reception:
- The service receives an
ActiveStakingEvent
which should add values to the Total Value Locked (TVL).
- The service receives an
-
Breakdown of Stats Calculation:
- When processing overall stats (e.g., total TVL), our
stats_lock
collection will record item with primary key ofxyz:active
withoverall_stats=true
,finality_provider=false
. - When processing finality provider related stats (e.g., per finality provider TVL),
our
stats_lock
collection will update the record with primary key ofxyz:active
withoverall_stats=true
,finality_provider=true
.
- When processing overall stats (e.g., total TVL), our
-
Handling Duplicates:
- If our system receives the same
ActiveStakingEvent
again, the stats calculation won't be reprocessed. This is because the system checks the boolean values foroverall_stats
andfinality_provider
individually, ensuring that each calculation is performed only once.
- If our system receives the same