# README
MongoDB Dump Scheduler
Overview
This Go application automates the process of creating MongoDB dumps, compressing them, and uploading them to an AWS S3 bucket. It runs on a schedule (default: every 30 minutes) and is designed to be deployed locally, on an AWS EC2 instance, or in a containerized environment (e.g., Docker).
Features
- MongoDB Dump: Creates MongoDB dumps using the mongodump utility.
- Compression: Compresses dump files into .tar.gz format, with the option to use pigz for parallel compression.
- AWS S3 Upload: Uploads compressed dump files to an AWS S3 bucket.
- Scheduling: Configures the dump process to run at regular intervals (default: every 30 minutes).
- Automatic Cleanup: Deletes local dump files after successful upload to S3.
Prerequisites
To run this application, you need to have the following libraries and tools installed:
1. Go (Golang)
Make sure Go is installed on your system (version 1.16+ recommended). You can check this with:
go version
If you need to install Go, follow the official Go installation guide.
2. MongoDB Database Tools (mongodump)
Install mongodump from the MongoDB database tools package:
- Ubuntu/Debian:
wget -qO - https://www.mongodb.org/static/pgp/server-6.0.asc | sudo apt-key add -
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/6.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-6.0.list
sudo apt-get update
sudo apt-get install -y mongodb-database-tools
Verify the installation:
mongodump --version
3. AWS CLI (for local development)
If you're running the application locally, install and configure the AWS CLI to manage AWS credentials:
sudo apt-get install awscli -y
aws configure
4. pigz (Optional)
For faster, parallel compression of dump files, install pigz:
sudo apt-get install pigz -y
5. AWS IAM Role (for EC2 deployments)
If you're deploying the application on an EC2 instance, you'll need to create an IAM role with S3 permissions and attach it to your instance. The role should have access to upload files to the S3 bucket where the dumps will be stored.
Environment Variables
You must configure the application using environment variables. These variables can be set in a .env file (for local development) or as system environment variables (for production).
Example .env File
MONGO_URL="mongodb+srv://<username>:<password>@<cluster-url>/<db-name>?tls=true&authSource=admin&replicaSet=<replica-set>"
AWS_REGION="us-east-1"
S3_BUCKET="your-s3-bucket-name"
DUMP_DIR="/tmp/cyclix-dumps"
Required Environment Variables
- MONGO_URL: MongoDB connection string for accessing your database.
- AWS_REGION: The AWS region where your S3 bucket is located.
- S3_BUCKET: The name of your S3 bucket where dump files will be uploaded.
- DUMP_DIR: Local directory for temporarily storing MongoDB dumps before uploading them to S3.
Setup and Installation
1. Clone the Repository
git clone https://github.com/yourusername/mongo-dump-scheduler.git
cd mongo-dump-scheduler
2. Install Dependencies
Ensure all Go dependencies are installed:
go mod tidy
3. Set Up AWS Credentials
- For Local Development: Set up AWS credentials using aws configure or environment variables:
export AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"
export AWS_REGION="your-aws-region"
export DUMP_DIR="your-folder-dir"
- For EC2 Deployment: Ensure your EC2 instance is assigned an IAM role with appropriate S3 permissions.
4. Build the Application
make build
5. Run the Application
make run
Usage
Running Locally
- Ensure AWS credentials are set up (via aws configure or environment variables).
- Start the application:
make run
This will start the scheduler, which will create MongoDB dumps and upload them to S3 every 30 minutes by default.
Running on AWS EC2
- Attach the IAM role to the EC2 instance.
- SSH into your EC2 instance, navigate to the project directory, and run:
make build
make run
To build a Docker image for a containerized deployment:
docker build -t mongo-dump-scheduler .
Logs and Monitoring
The application logs key events (e.g., dump creation, S3 upload success/failure) to the console.
- Local Logs: View directly in the terminal.
- EC2 Logs: Use CloudWatch Logs or SSH into the instance to view logs.
Testing
Local Testing
- Ensure MongoDB is running with some sample data.
- Run the application and verify that dump files are created in the DUMP_DIR and uploaded to S3.
- Check S3 to confirm the dumps are uploaded successfully.
Testing with Large Dumps
- Populate your MongoDB with a large dataset.
- Monitor the compression and upload processes for performance.
Error Handling
- MongoDB Dump Errors: Ensure mongodump is installed and accessible in your PATH.
- S3 Upload Errors: Ensure AWS credentials are set and have necessary permissions for the S3 bucket.
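The PATH check for external tools can be done up front, before the first scheduled run, so a missing binary is reported once rather than on every dump attempt. The following is a sketch under that assumption; `missingTools` is a hypothetical helper, not a function from the project.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// missingTools returns the names from tools that cannot be found in PATH.
func missingTools(tools ...string) []string {
	var missing []string
	for _, t := range tools {
		if _, err := exec.LookPath(t); err != nil {
			missing = append(missing, t)
		}
	}
	return missing
}

func main() {
	// mongodump and tar are required; pigz is optional and not checked here.
	if m := missingTools("mongodump", "tar"); len(m) > 0 {
		fmt.Println("not found in PATH:", strings.Join(m, ", "))
		return
	}
	fmt.Println("all required tools found")
}
```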
Libraries and Tools
This project uses several Go libraries and external utilities:
Go Libraries
- AWS SDK for Go v2: Used to interact with AWS services like S3.
- robfig/cron: Used for scheduling tasks like creating MongoDB dumps.
- Logrus: Provides structured logging.
- godotenv: Loads environment variables from a .env file.
External Utilities
- mongodump: From MongoDB Database Tools, used to create MongoDB backups.
- tar and pigz: Used to compress dump files into .tar.gz format, with pigz providing parallel compression.