Deployment Strategies

Proven deployment patterns and strategies for different environments and use cases

deployment ci-cd devops strategies

Deployment Strategies

Comprehensive guide to deployment strategies and patterns for OSpec projects, from simple single-server deployments to complex multi-region orchestrations.

Deployment Strategy Selection

Strategy Comparison Matrix

Strategy	Downtime	Risk	Complexity	Rollback Speed	Cost
Recreate	High	Low	Low	Fast	Low
Rolling	Zero	Medium	Medium	Medium	Medium
Blue-Green	Zero	Low	Medium	Instant	High
Canary	Zero	Low	High	Fast	Medium
A/B Testing	Zero	Low	High	Fast	High

When to Use Each Strategy

# Strategy selection guide
deployment_strategy_selection:
  recreate:
    use_when:
      - "development_environments"
      - "cost_is_primary_concern"
      - "simple_applications"
      - "downtime_acceptable"
    
  rolling:
    use_when:
      - "stateless_applications"
      - "resource_constraints"
      - "gradual_rollouts_needed"
      
  blue_green:
    use_when:
      - "zero_downtime_required"
      - "instant_rollback_needed"
      - "testing_in_production_like_environment"
      
  canary:
    use_when:
      - "high_risk_deployments"
      - "user_impact_testing_required"
      - "gradual_feature_rollout"
      
  ab_testing:
    use_when:
      - "feature_validation_needed"
      - "business_metrics_important"
      - "user_experience_optimization"

Basic Deployment Patterns

Recreate Deployment

# Simple recreate deployment
deployment:
  strategy: "recreate"
  
  steps:
    - name: "stop_application"
      action: "service_stop"
      timeout_seconds: 30
      
    - name: "deploy_new_version"
      action: "deploy"
      source: ""
      
    - name: "start_application"
      action: "service_start"
      health_check: true
      
  configuration:
    maintenance_page: true
    notification:
      start: "Deployment started - service unavailable"
      complete: "Deployment complete - service restored"
      
  rollback:
    automatic: true
    conditions:
      - "health_check_failed"
      - "startup_timeout"

Rolling Deployment

# Rolling deployment configuration
deployment:
  strategy: "rolling"
  
  configuration:
    batch_size: 25  # Percentage of instances to update at once
    max_unavailable: 1
    health_check_grace_period: 60
    
  phases:
    preparation:
      - "validate_artifact"
      - "run_pre_deployment_tests"
      - "backup_current_version"
      
    rolling_update:
      - name: "update_batch"
        batch_size: "%"
        wait_for_health: true
        rollback_on_failure: true
        
    verification:
      - "run_smoke_tests"
      - "validate_metrics"
      - "user_acceptance_validation"
      
  health_checks:
    http_endpoint: "/health"
    expected_status: 200
    timeout_seconds: 30
    retry_attempts: 3
    
  rollback:
    automatic: true
    conditions:
      - "health_check_failure_rate > 10%"
      - "error_rate > 5%"
      - "response_time_p95 > 2000ms"

Advanced Deployment Patterns

Blue-Green Deployment

# Blue-Green deployment setup
deployment:
  strategy: "blue_green"
  
  environments:
    blue:
      status: "active"
      instances: 3
      load_balancer_weight: 100
      
    green:
      status: "staging"
      instances: 3
      load_balancer_weight: 0
      
  deployment_process:
    - name: "deploy_to_green"
      target: "green"
      parallel_deployment: true
      
    - name: "warm_up_green"
      actions:
        - "initialize_caches"
        - "warm_up_connections"
        - "run_health_checks"
        
    - name: "smoke_test_green"
      tests:
        - "api_functionality"
        - "database_connectivity"
        - "third_party_integrations"
        
    - name: "switch_traffic"
      method: "load_balancer_switch"
      duration_seconds: 10
      
    - name: "monitor_green"
      duration_seconds: 300
      metrics:
        - "error_rate"
        - "response_time"
        - "throughput"
        
  rollback:
    method: "instant_switch"
    conditions:
      - "error_rate > 1%"
      - "response_time_degradation > 50%"
      - "manual_trigger"
      
  cleanup:
    keep_blue_duration: "1 hour"
    automated_cleanup: true

Canary Deployment

# Canary deployment configuration
deployment:
  strategy: "canary"
  
  canary_configuration:
    initial_traffic: 5    # Start with 5% of traffic
    increment: 10         # Increase by 10% each step
    max_traffic: 100      # Full deployment
    step_duration: 300    # 5 minutes per step
    
  traffic_routing:
    method: "weighted_routing"
    load_balancer: "aws_alb"
    
    rules:
      - condition: "user_segment == 'beta'"
        traffic_percentage: 100
        
      - condition: "request_header['canary'] == 'true'"
        traffic_percentage: 100
        
      - condition: "random_percentage"
        traffic_percentage: ""
        
  monitoring:
    success_criteria:
      error_rate_threshold: 0.5    # 0.5%
      latency_p95_threshold: 500   # ms
      business_metric_threshold: 0.95  # conversion rate
      
    failure_criteria:
      error_rate_threshold: 2      # 2%
      latency_p95_threshold: 1000  # ms
      alert_count_threshold: 5
      
  progression_rules:
    automatic_progression: true
    progression_conditions:
      - "error_rate < 0.5%"
      - "latency_p95 < 500ms"
      - "no_critical_alerts"
      
    hold_conditions:
      - "business_hours_only"
      - "no_other_deployments"
      
  rollback:
    automatic: true
    rollback_conditions:
      - "error_rate > 2%"
      - "latency_p95 > 1000ms"
      - "business_metric_drop > 10%"

Feature Flag Deployment

# Feature flag based deployment
deployment:
  strategy: "feature_flags"
  
  feature_flag_service: "launchdarkly"  # or split, optimizely
  
  feature_rollout:
    - name: "new_checkout_flow"
      type: "boolean"
      default_value: false
      
      rollout_strategy:
        - stage: "internal_testing"
          percentage: 0
          targeting:
            - "user.email endsWith '@company.com'"
            
        - stage: "beta_users"
          percentage: 10
          targeting:
            - "user.segment == 'beta'"
            
        - stage: "gradual_rollout"
          percentage: 25
          increment: 25
          schedule: "daily"
          
        - stage: "full_rollout"
          percentage: 100
          
  monitoring_integration:
    metrics_tracking: true
    experiment_analysis: true
    
    kill_switch:
      enabled: true
      conditions:
        - "error_rate > 2%"
        - "conversion_rate_drop > 15%"
        - "manual_intervention"

Environment-Specific Strategies

Development Environment

# Development deployment
development:
  deployment:
    strategy: "recreate"
    automation_level: "full"
    
  triggers:
    - event: "git_push"
      branch: "develop"
      auto_deploy: true
      
  features:
    hot_reload: true
    debug_mode: true
    seed_data: true
    
  notification:
    channels: ["slack_dev_channel"]
    events: ["deploy_start", "deploy_complete", "deploy_failed"]

Staging Environment

# Staging deployment
staging:
  deployment:
    strategy: "rolling"
    automation_level: "semi_automated"
    
  triggers:
    - event: "git_push"
      branch: "release/*"
      requires_approval: false
      
    - event: "pull_request_merge"
      target_branch: "main"
      requires_approval: true
      
  testing:
    automated_tests: true
    performance_tests: true
    security_scans: true
    
  data_management:
    production_like_data: true
    data_anonymization: true
    regular_refresh: "weekly"

Production Environment

# Production deployment
production:
  deployment:
    strategy: "blue_green"
    automation_level: "supervised"
    
  approval_process:
    required_approvers: 2
    approver_roles: ["tech_lead", "product_owner"]
    approval_timeout: "4 hours"
    
  safety_measures:
    deployment_window: "business_hours"
    blackout_periods:
      - "friday_afternoons"
      - "holiday_periods"
      - "high_traffic_events"
      
  monitoring:
    enhanced_monitoring: true
    real_user_monitoring: true
    business_metrics: true
    
  rollback_plan:
    automated_rollback: true
    rollback_testing: "monthly"
    communication_plan: true

Database Deployment Patterns

Schema Migrations

# Database deployment strategies
database_deployment:
  migration_strategy: "expand_contract"
  
  phases:
    expand:
      - "add_new_columns"
      - "create_new_tables"
      - "add_indexes"
      - "update_views"
      
    migrate:
      - "dual_writes"
      - "backfill_data"
      - "validate_data_consistency"
      
    contract:
      - "remove_old_columns"
      - "drop_unused_tables"
      - "cleanup_indexes"
      
  rollback_strategy:
    expand_phase: "drop_new_objects"
    migrate_phase: "stop_dual_writes"
    contract_phase: "recreate_dropped_objects"
    
  safety_measures:
    backup_before_migration: true
    migration_testing: "staging_environment"
    data_validation: "automated"

Zero-Downtime Database Updates

# Zero-downtime database deployment
database_zero_downtime:
  techniques:
    shadow_tables:
      enabled: true
      sync_mechanism: "triggers"
      validation: "checksum_comparison"
      
    online_schema_change:
      tool: "gh-ost"  # or pt-online-schema-change
      chunk_size: 1000
      throttling: "replication_lag_based"
      
    read_replicas:
      promotion_strategy: "automated"
      data_consistency_check: true
      
  application_compatibility:
    backward_compatible: true
    feature_flags: "database_changes"
    gradual_cutover: true

Multi-Region Deployments

Global Deployment Strategy

# Multi-region deployment
multi_region:
  regions:
    primary: "us-east-1"
    secondary: ["us-west-2", "eu-west-1", "ap-southeast-1"]
    
  deployment_order:
    - "deploy_to_staging_all_regions"
    - "validate_staging_deployment"
    - "deploy_to_primary_production"
    - "validate_primary_deployment"
    - "deploy_to_secondary_regions"
    - "validate_global_deployment"
    
  traffic_management:
    dns_failover: true
    health_check_endpoints: ["/health", "/ready"]
    failover_threshold: 3
    
  data_consistency:
    replication_strategy: "master_slave"
    consistency_model: "eventual"
    conflict_resolution: "last_write_wins"
    
  disaster_recovery:
    rpo_target: "15 minutes"
    rto_target: "30 minutes"
    automated_failover: true

CI/CD Pipeline Integration

GitHub Actions Pipeline

# .github/workflows/deploy.yml
name: Deployment Pipeline

on:
  push:
    branches: [main]
  workflow_dispatch:
    inputs:
      environment:
        description: 'Target environment'
        required: true
        default: 'staging'
        type: choice
        options: ['staging', 'production']

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run tests
        run: |
          npm ci
          npm run test
          npm run test:e2e
  
  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build application
        run: |
          npm ci
          npm run build
      - name: Build Docker image
        run: |
          docker build -t app:$ .
          docker tag app:$ app:latest
  
  deploy-staging:
    needs: build
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - name: Deploy to staging
        run: |
          kubectl set image deployment/app app=app:$
          kubectl rollout status deployment/app
  
  deploy-production:
    needs: [build, deploy-staging]
    if: github.event_name == 'workflow_dispatch' && github.event.inputs.environment == 'production'
    runs-on: ubuntu-latest
    environment: production
    steps:
      - name: Deploy to production
        run: |
          # Blue-green deployment
          kubectl apply -f k8s/green-deployment.yaml
          kubectl set image deployment/app-green app=app:$
          kubectl rollout status deployment/app-green
          
          # Switch traffic
          kubectl patch service app -p '{"spec":{"selector":{"version":"green"}}}'
          
          # Monitor for 5 minutes
          sleep 300
          
          # Cleanup old blue deployment
          kubectl delete deployment app-blue || true

GitLab CI Pipeline

# .gitlab-ci.yml
stages:
  - test
  - build
  - deploy-staging
  - deploy-production

variables:
  DOCKER_IMAGE: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

test:
  stage: test
  script:
    - npm ci
    - npm run test
    - npm run test:integration

build:
  stage: build
  script:
    - docker build -t $DOCKER_IMAGE .
    - docker push $DOCKER_IMAGE
  only:
    - main

deploy-staging:
  stage: deploy-staging
  script:
    - helm upgrade --install app-staging ./helm-chart 
      --set image.tag=$CI_COMMIT_SHA
      --set environment=staging
  environment:
    name: staging
    url: https://staging.example.com
  only:
    - main

deploy-production:
  stage: deploy-production
  script:
    - |
      # Canary deployment
      helm upgrade --install app-production ./helm-chart \
        --set image.tag=$CI_COMMIT_SHA \
        --set environment=production \
        --set canary.enabled=true \
        --set canary.weight=10
      
      # Monitor canary
      ./scripts/monitor-canary.sh
      
      # Full rollout if successful
      helm upgrade app-production ./helm-chart \
        --set canary.enabled=false
  environment:
    name: production
    url: https://example.com
  when: manual
  only:
    - main

Monitoring and Observability

Deployment Monitoring

# Deployment monitoring configuration
deployment_monitoring:
  metrics:
    deployment_frequency: true
    deployment_success_rate: true
    lead_time: true
    recovery_time: true
    
  alerts:
    deployment_failure:
      severity: "critical"
      channels: ["pagerduty", "slack"]
      
    slow_deployment:
      threshold: "30 minutes"
      severity: "warning"
      
    rollback_triggered:
      severity: "high"
      escalation: true
      
  dashboards:
    deployment_pipeline: true
    environment_health: true
    release_metrics: true
    
  post_deployment:
    health_check_duration: "15 minutes"
    performance_baseline: "24 hours"
    user_feedback_collection: true

Rollback Monitoring

# Rollback and recovery monitoring
rollback_monitoring:
  automatic_rollback:
    health_check_failures: 3
    error_rate_threshold: 5  # percent
    response_time_threshold: 2000  # ms
    
  manual_rollback:
    approval_required: false
    notification_required: true
    incident_creation: true
    
  post_rollback:
    root_cause_analysis: true
    incident_review: "within_24_hours"
    process_improvement: true

Best Practices

Deployment Checklist

Pre-Deployment

Code review completed and approved
All tests passing (unit, integration, e2e)
Security scans completed with no critical issues
Performance tests show no regression
Database migrations tested in staging
Rollback plan documented and tested
Monitoring and alerts configured
Team notification sent

During Deployment

Deployment started within planned window
Health checks passing at each step
Metrics monitoring active
Error rates within acceptable limits
Performance metrics stable
User experience validation

Post-Deployment

Smoke tests completed successfully
Business metrics validated
Performance baseline established
Monitoring alerts verified
Documentation updated
Team notification of completion
Post-deployment review scheduled

Common Anti-Patterns to Avoid

Big Bang Deployments
- Deploy everything at once
- No gradual rollout
- High risk of total failure
No Rollback Plan
- Assume deployment will always succeed
- No tested rollback procedure
- Extended downtime during issues
Manual Deployment Steps
- Prone to human error
- Not reproducible
- Difficult to scale
Insufficient Testing
- Skip testing in production-like environment
- No performance validation
- Missing edge case testing
Poor Monitoring
- No deployment-specific monitoring
- Late detection of issues
- No business impact visibility

Success Metrics

# Deployment success metrics
deployment_metrics:
  dora_metrics:
    deployment_frequency: "daily"
    lead_time_for_changes: "< 1 hour"
    change_failure_rate: "< 15%"
    time_to_restore_service: "< 1 hour"
    
  business_metrics:
    deployment_success_rate: "> 95%"
    rollback_rate: "< 5%"
    mean_time_to_deployment: "< 30 minutes"
    
  quality_metrics:
    post_deployment_incidents: "< 2%"
    customer_satisfaction: "> 4.5/5"
    team_confidence: "> 8/10"

Remember: The best deployment strategy depends on your specific requirements, risk tolerance, and technical constraints. Start simple and evolve your deployment practices as your application and team mature.