
Database Backup Strategies for Small Teams

Author
Maksim P.
DevOps Engineer / SRE

TL;DR

  • Start with managed database backups if using RDS/Cloud SQL—they’re automatic and battle-tested
  • For self-hosted databases, use native tools (pg_dump, mysqldump) + cron + S3 lifecycle policies
  • Test restores quarterly. A backup you can’t restore from is just wasted storage
  • 3-2-1 rule still applies: 3 copies, 2 different media types, 1 offsite
  • Don’t overthink it—a simple working backup beats a complex one that fails silently

Who this is for

Teams of 3-10 engineers running production databases without a dedicated DBA. You know backups matter but aren’t sure if you’re overdoing it or setting yourself up for data loss. This covers both managed cloud databases and self-hosted setups.

The backup pyramid: match complexity to risk

Not all data is created equal. Your backup strategy should match your actual recovery needs:

Tier 1: “We’d be annoyed but fine”

  • Development databases
  • Analytics data you can regenerate
  • Solution: Daily snapshots, 7-day retention

Tier 2: “This would hurt but we’d recover”

  • Production data with paper trails elsewhere
  • User-generated content with recent backups
  • Solution: Hourly snapshots, 30-day retention, tested quarterly

Tier 3: “Company-ending if lost”

  • Financial records
  • Core user data
  • Compliance-regulated data
  • Solution: Continuous replication, point-in-time recovery, 90+ day retention, monthly restore drills

Most small teams treat everything as Tier 3. This wastes time and money. Be honest about what actually matters.

Option 1: Managed database backups (boring is good)

If you’re on RDS, Cloud SQL, or Azure Database, use their built-in backups. Yes, it costs more than self-hosting. No, you shouldn’t care at your scale.

AWS RDS example:

  • Automated backups: enabled by default, 7-day retention
  • Manual snapshots: unlimited retention, ~$0.095/GB/month
  • Point-in-time recovery: restore to any second within retention window
  • Cross-region backup: one click in console
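Both the manual snapshot and the cross-region copy are scriptable with the AWS CLI. A sketch, assuming hypothetical instance, account, and region names:

```
# Create a manual snapshot (instance/identifier names are placeholders)
aws rds create-db-snapshot \
  --db-instance-identifier prod-db \
  --db-snapshot-identifier "prod-db-manual-$(date +%Y%m%d)"

# Copy it to another region for offsite retention (run against the target region)
aws rds copy-db-snapshot \
  --source-db-snapshot-identifier arn:aws:rds:us-east-1:123456789012:snapshot:prod-db-manual-20240101 \
  --target-db-snapshot-identifier prod-db-offsite-20240101 \
  --region us-west-2
```

Dropping either command into a scheduled job gets you the "one monthly snapshot kept for a year" line item from the cost table above.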

Cost for 100GB production database:

  • Automated backups: free (within retention period)
  • One monthly snapshot kept for a year: ~$9.50/month
  • Cross-region replication: ~$20/month

That’s $30/month for peace of mind. Your engineering time costs more.

Option 2: Self-hosted database backups

Running databases on EC2/VMs? You’ll need to roll your own. Here’s a production-ready setup that won’t wake you at 3 AM.

The simple approach: cron + native tools + S3

#!/bin/bash
# /opt/scripts/backup-postgres.sh

set -euo pipefail

# Configuration
DB_NAME="production"
S3_BUCKET="mycompany-db-backups"
BACKUP_PREFIX="postgres"
export PGPASSWORD="your-password-here"  # Must be exported for pg_dump to see it. Better: use ~/.pgpass or IAM auth

# Generate backup filename with timestamp
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="${BACKUP_PREFIX}_${DB_NAME}_${TIMESTAMP}.sql.gz"
BACKUP_PATH="/tmp/${BACKUP_FILE}"

# Create backup
echo "Starting backup of ${DB_NAME}..."
pg_dump -h localhost -U postgres -d "${DB_NAME}" | gzip > "${BACKUP_PATH}"

# Upload to S3
echo "Uploading to S3..."
aws s3 cp "${BACKUP_PATH}" "s3://${S3_BUCKET}/daily/${BACKUP_FILE}"

# Cleanup local file
rm "${BACKUP_PATH}"

# Prune old backups (keep 30 days)
echo "Pruning old backups..."
aws s3 ls "s3://${S3_BUCKET}/daily/" | \
  awk '{print $4}' | \
  sort -r | \
  tail -n +31 | \
  xargs -r -I {} aws s3 rm "s3://${S3_BUCKET}/daily/{}"

echo "Backup complete: ${BACKUP_FILE}"

Add to crontab for a daily 3 AM backup:

0 3 * * * /opt/scripts/backup-postgres.sh >> /var/log/db-backup.log 2>&1

This gives you daily backups with 30-day retention. Total cost for a 100GB database with a 10% daily change rate: ~$5/month in S3 storage.

Level up: Add monitoring

The scariest backup failure is the silent one. Add simple monitoring:

  1. Backup freshness check: Alert if latest backup is >25 hours old
  2. Backup size check: Alert if backup size drops >20% (corruption indicator)
  3. Test restore: Monthly cron job that restores to a test instance

Most monitoring tools (Datadog, New Relic, even CloudWatch) can check S3 object age. Use them.
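If you'd rather not stand up a monitoring tool yet, the freshness check is a few lines of shell. A minimal sketch, where the bucket name and 25-hour threshold are assumptions:

```shell
#!/bin/bash
set -euo pipefail

MAX_AGE_SECONDS=$((25 * 3600))  # alert threshold: 25 hours

# Pure check: is a backup taken at epoch $1 stale relative to "now" ($2)?
is_stale() {
  local newest_epoch=$1 now_epoch=$2
  [ $(( now_epoch - newest_epoch )) -gt "$MAX_AGE_SECONDS" ]
}

# Hypothetical wiring (bucket name is a placeholder):
# NEWEST=$(aws s3api list-objects-v2 --bucket mycompany-db-backups --prefix daily/ \
#   --query 'sort_by(Contents, &LastModified)[-1].LastModified' --output text)
# if is_stale "$(date -d "$NEWEST" +%s)" "$(date +%s)"; then
#   echo "ALERT: newest backup is older than 25h" >&2
#   exit 1
# fi
```

Wire the alert line into whatever already pages you (SNS, Slack webhook, PagerDuty) so a silent failure becomes a loud one.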

Option 3: Continuous replication (when you need point-in-time)

Need to recover to a specific transaction? You want continuous archiving:

PostgreSQL: WAL archiving to S3

  • Set archive_mode = on and wal_level = replica
  • Use archive_command to ship WAL files to S3
  • Combine with daily base backups
  • Recovery: restore base backup, replay WAL files
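The PostgreSQL side of this is a handful of config lines. A rough sketch (the bucket name is a placeholder; dedicated tools like pgBackRest or WAL-G handle retries and compression more robustly):

```
# postgresql.conf — illustrative values
wal_level = replica
archive_mode = on    # requires a server restart
archive_command = 'aws s3 cp %p s3://mycompany-db-backups/wal/%f'
# %p expands to the WAL file's path, %f to its filename
```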

MySQL: Binary log shipping

  • Enable binary logging
  • Ship logs to S3 with mysqlbinlog
  • Similar recovery process
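The MySQL equivalent, again as a sketch with illustrative paths and retention:

```
# my.cnf — illustrative values
[mysqld]
server_id = 1
log_bin = /var/log/mysql/mysql-bin
binlog_expire_logs_seconds = 604800   # keep 7 days of binlogs locally
```

Shipping then becomes a scheduled job around `mysqlbinlog --read-from-remote-server --raw` that copies closed binlogs to S3.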

This is more complex but gives you point-in-time recovery. Only worth it for Tier 3 data.

The 3-2-1 rule for small teams

The classic backup rule: 3 copies, 2 different storage types, 1 offsite. Here’s how it maps to modern infrastructure:

  1. Primary: Your production database
  2. Secondary: S3 in same region (different storage type)
  3. Tertiary: S3 in different region or Glacier (offsite)

For most small teams, this translates to:

  • Daily backups to S3 (automated lifecycle to Glacier after 30 days)
  • Monthly snapshot to different region
  • Quarterly backup to different cloud provider (if paranoid)
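The "lifecycle to Glacier" step is one rule on the bucket. A sketch that assumes the `daily/` prefix from the backup script above (names and retention numbers are placeholders):

```
{
  "Rules": [
    {
      "ID": "backups-to-glacier",
      "Status": "Enabled",
      "Filter": { "Prefix": "daily/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
```

Apply it once with `aws s3api put-bucket-lifecycle-configuration --bucket mycompany-db-backups --lifecycle-configuration file://lifecycle.json` and S3 handles the tiering from then on.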

Testing restores: the part everyone skips

A backup you’ve never restored is Schrödinger’s backup—simultaneously working and broken until observed.

Quarterly restore checklist:

  1. Pick a random daily backup from last month
  2. Restore to test instance
  3. Run basic smoke tests (row counts, recent data present)
  4. Document time to restore (your RTO)
  5. Delete test instance
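Steps 2 and 3 for the PostgreSQL setup above can be sketched like this (instance, database, and table names are placeholders; the expected row count would come from your own metrics):

```shell
#!/bin/bash
set -euo pipefail

# Smoke test: is the restored row count within 5% of what we expect?
row_count_ok() {
  local expected=$1 actual=$2
  [ "$actual" -ge $(( expected * 95 / 100 )) ]
}

# Hypothetical restore flow (names are placeholders):
# aws s3 cp "s3://mycompany-db-backups/daily/${BACKUP_FILE}" /tmp/restore.sql.gz
# gunzip -c /tmp/restore.sql.gz | psql -h restore-test -U postgres -d restore_test
# ACTUAL=$(psql -h restore-test -U postgres -d restore_test -tAc 'SELECT count(*) FROM users')
# row_count_ok 100000 "$ACTUAL" || { echo "ALERT: restore looks incomplete" >&2; exit 1; }
```

Time the whole run from `aws s3 cp` to a passing smoke test: that number is your real RTO, not the one on the wiki.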

Put it in the calendar. Make it someone’s OKR. Track it like deploys. Whatever it takes to actually do it.

When to upgrade your approach

Your simple backup strategy stops being simple when:

  • Restores take >4 hours (business can’t wait)
  • You’re backing up >1TB (restore time becomes painful)
  • Compliance requires specific retention/encryption
  • You need cross-region HA (not just DR)
  • Multiple databases need coordinated backups

At that point, look at:

  • Dedicated backup tools (Percona XtraBackup, pgBackRest)
  • Managed services (AWS Backup, Veeam)
  • Database-native solutions (RDS Multi-AZ, Cloud SQL HA)

But don’t jump there prematurely. Most teams under 10 engineers don’t need enterprise backup solutions.

Common mistakes to avoid

Backing up the replica instead of the primary: Replicas can lag or diverge. Always back up from the source of truth.

Not testing encryption: That encrypted backup is useless if you lose the key. Store keys separately from backups.

Forgetting the schema: mysqldump --no-data for schema-only backups. Version control these.

Ignoring backup windows: That 3 AM backup might coincide with batch jobs. Check your backup impact.

Over-retaining: 7 years of daily backups for a startup MVP is hoarding, not strategy.
