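AWS CI/CD in Practice 09: Advanced Techniques for Multi-Environment Management, Blue-Green Deployment, Cross-Region Artifact Replication, and Parallel Builds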

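Series guide: In the previous eight parts we built a CI/CD pipeline from scratch, worked through its pitfalls, locked down security, and set up monitoring. This final part focuses on four advanced scenarios: multi-environment parameter management, zero-downtime blue-green deployment, cross-region artifact replication, and faster builds through parallelism. The goal is to take your CI/CD from "it works" to "it works well".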


Advanced Capabilities Overview


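Technique 1: Multi-Environment Parameter Management

Problem

The dev, staging, and prod environments each have their own database address, port, and secrets. Hardcoding these values in the buildspec means editing code every time you switch environments.

Solution: Hierarchical Parameters in Parameter Store

Parameter hierarchy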

/mfmsapp/
  ├── dev/
  │   ├── DB_HOST        = mfmsapp-dev.xxxx.rds.amazonaws.com
  │   ├── DB_PORT        = 5432
  │   ├── DB_NAME        = mfmsapp_dev
  │   ├── LOG_LEVEL      = debug
  │   └── FEATURE_FLAGS  = {"new_ui": true, "beta_api": true}
  ├── staging/
  │   ├── DB_HOST        = mfmsapp-staging.xxxx.rds.amazonaws.com
  │   ├── DB_PORT        = 5432
  │   ├── DB_NAME        = mfmsapp_staging
  │   ├── LOG_LEVEL      = info
  │   └── FEATURE_FLAGS  = {"new_ui": true, "beta_api": false}
  └── prod/
      ├── DB_HOST        = mfmsapp-prod.xxxx.rds.amazonaws.com
      ├── DB_PORT        = 5432
      ├── DB_NAME        = mfmsapp_prod
      ├── LOG_LEVEL      = warn
      └── FEATURE_FLAGS  = {"new_ui": false, "beta_api": false}

Creating parameters in bulk

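#!/bin/bash
# scripts/setup-parameters.sh
# Usage: ./setup-parameters.sh <dev|staging|prod>
set -euo pipefail

ENV="${1:-}"  # dev / staging / prod

# Pick the log level shown for each environment in the hierarchy above
case "$ENV" in
  dev)     LOG_LEVEL="debug" ;;
  staging) LOG_LEVEL="info" ;;
  prod)    LOG_LEVEL="warn" ;;
  *) echo "Usage: $0 <dev|staging|prod>" >&2; exit 1 ;;
esac

# Each entry is NAME:VALUE:TYPE; with this parsing, VALUE must not contain ':'
params=(
  "DB_HOST:mfmsapp-${ENV}.xxxx.rds.amazonaws.com:String"
  "DB_PORT:5432:String"
  "DB_NAME:mfmsapp_${ENV}:String"
  "LOG_LEVEL:${LOG_LEVEL}:String"
)

for entry in "${params[@]}"; do
  IFS=':' read -r name value type <<< "$entry"
  aws ssm put-parameter \
    --name "/mfmsapp/${ENV}/${name}" \
    --value "$value" \
    --type "$type" \
    --overwrite
  echo "✅ Set /mfmsapp/${ENV}/${name}"
done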
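The script splits each entry on every colon, so a value that itself contains colons (such as the FEATURE_FLAGS JSON in the hierarchy) would be cut short. A minimal sketch of a more tolerant split follows; the helper names are hypothetical and not part of the original script. It splits only on the first and last colon and also validates the environment name.

```shell
#!/bin/sh
# Hypothetical helpers, shown only to illustrate the idea.

# Return 0 only for the three environments used in this series.
validate_env() {
  case "$1" in
    dev|staging|prod) return 0 ;;
    *) return 1 ;;
  esac
}

# Split "NAME:VALUE:TYPE" on the first and last colon, so VALUE may
# contain colons itself (for example a JSON string).
entry_name() { printf '%s\n' "${1%%:*}"; }   # text before the first ':'
entry_type() { printf '%s\n' "${1##*:}"; }   # text after the last ':'
entry_value() {
  rest=${1#*:}                  # drop "NAME:"
  printf '%s\n' "${rest%:*}"    # drop ":TYPE"
}
```

With these helpers, an entry like `FEATURE_FLAGS:{"new_ui": true}:String` keeps its JSON value intact, which the `IFS=':' read` form would not.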

Loading parameters dynamically in the buildspec

version: 0.2

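env:
  # Note: this relies on ${ENVIRONMENT} being expanded inside parameter-store
  # names. Verify that in your CodeBuild setup; if the names are taken
  # literally, fetch the values with `aws ssm` in the build commands instead.
  parameter-store:
    DB_HOST: /mfmsapp/${ENVIRONMENT}/DB_HOST
    DB_PORT: /mfmsapp/${ENVIRONMENT}/DB_PORT
    DB_NAME: /mfmsapp/${ENVIRONMENT}/DB_NAME
    LOG_LEVEL: /mfmsapp/${ENVIRONMENT}/LOG_LEVEL
  variables:
    ENVIRONMENT: prod  # default value; overridden by the pipeline action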

phases:
  build:
    commands:
      - echo "Deploying to $ENVIRONMENT environment"
      - echo "Database: $DB_HOST:$DB_PORT/$DB_NAME"
      - go build -ldflags "-X main.env=$ENVIRONMENT -X main.dbHost=$DB_HOST" -o mfmsapp
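If variable expansion inside parameter-store names turns out not to work in your setup, one fallback is to fetch the whole environment subtree during the build. The sketch below is an assumption rather than the series' method: `to_exports` is a hypothetical helper that converts the tab-separated text output of `aws ssm get-parameters-by-path` into `export` statements.

```shell
#!/bin/sh
# Sketch: turn "name<TAB>value" lines into shell export statements.
# Expected input is the text output of an invocation such as:
#   aws ssm get-parameters-by-path --path "/mfmsapp/$ENVIRONMENT" \
#     --query 'Parameters[].[Name,Value]' --output text

to_exports() {
  tab=$(printf '\t')
  while IFS="$tab" read -r name value; do
    [ -n "$name" ] || continue
    key=${name##*/}    # /mfmsapp/prod/DB_HOST -> DB_HOST
    # Escape single quotes so the value survives a later eval.
    escaped=$(printf '%s' "$value" | sed "s/'/'\\\\''/g")
    printf "export %s='%s'\n" "$key" "$escaped"
  done
}
```

In a buildspec command this could be used as `eval "$(aws ssm get-parameters-by-path ... | to_exports)"`. SecureString parameters would additionally need `--with-decryption`.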

Passing the environment variable from the pipeline

{
  "name": "Build",
  "actions": [
    {
      "name": "Build",
      "actionTypeId": {
        "category": "Build",
        "owner": "AWS",
        "provider": "CodeBuild",
        "version": "1"
      },
      "configuration": {
        "ProjectName": "mfmsapp-build",
        "EnvironmentVariables": "[{\"name\":\"ENVIRONMENT\",\"value\":\"prod\",\"type\":\"PLAINTEXT\"}]"
      }
    }
  ]
}
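The environment-variable setting in the action is a JSON array encoded as a string, which is easy to mistype by hand. A small sketch for generating it per environment follows; `env_vars_json` is a hypothetical helper name, and it accepts only the three environments used in this series.

```shell
#!/bin/sh
# Sketch: build the JSON array string that sets ENVIRONMENT for the build action.
# Prints nothing and returns 1 for an unknown environment.
env_vars_json() {
  case "$1" in
    dev|staging|prod)
      printf '[{"name":"ENVIRONMENT","value":"%s","type":"PLAINTEXT"}]\n' "$1"
      ;;
    *)
      return 1
      ;;
  esac
}
```

For example, `env_vars_json staging` prints the array for the staging pipeline; the output still has to be embedded as an escaped string when it is pasted into the pipeline JSON.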

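Technique 2: Zero-Downtime Upgrades with Blue-Green Deployment

Problem

The current CodeDeployDefault.OneAtATime configuration updates instances in place, so each instance is out of service while it is being replaced, which can cause brief service interruptions. Production needs zero downtime.

Solution: Blue-Green Deployment

Step 1: Create a blue-green deployment group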

{
  "applicationName": "mfmsapp",
  "deploymentGroupName": "mfmsapp-prod-bluegreen",
  "deploymentConfigName": "CodeDeployDefault.AllAtOnce",
  "ec2TagFilters": [
    {
      "Key": "Environment",
      "Value": "prod",
      "Type": "KEY_AND_VALUE"
    }
  ],
  "autoScalingGroups": ["mfmsapp-asg-blue", "mfmsapp-asg-green"],
  "deploymentStyle": {
    "deploymentType": "BLUE_GREEN",
    "deploymentOption": "WITH_TRAFFIC_CONTROL"
  },
  "blueGreenDeploymentConfiguration": {
    "terminateBlueInstancesOnDeploymentSuccess": {
      "action": "TERMINATE",
      "terminationWaitTimeInMinutes": 30
    },
    "deploymentReadyOption": {
      "actionOnTimeout": "CONTINUE_DEPLOYMENT",
      "waitTimeInMinutes": 0
    }
  },
  "loadBalancerInfo": {
    "targetGroupInfoList": [
      {"name": "mfmsapp-tg-blue"},
      {"name": "mfmsapp-tg-green"}
    ]
  }
}

Step 2: Blue-green lifecycle hooks in appspec.yml

version: 0.0
os: linux
files:
  - source: /mfmsapp
    destination: /opt/mfmsapp
hooks:
  BeforeInstall:
    - location: scripts/before_install.sh
      timeout: 60
  AfterInstall:
    - location: scripts/after_install.sh
      timeout: 60
      runas: root
  ApplicationStart:
    - location: scripts/start.sh
      timeout: 60
      runas: root
  ValidateService:
    - location: scripts/health_check.sh
      timeout: 120
      runas: root
  BeforeAllowTraffic:
    - location: scripts/before_traffic.sh
      timeout: 60
  AfterAllowTraffic:
    - location: scripts/after_traffic.sh
      timeout: 60

Step 3: Pre-traffic validation script

#!/bin/bash
# scripts/before_traffic.sh
# Checks that run before traffic is shifted to the new instances

echo "[TRAFFIC] Pre-traffic validation starting..."

# Check that the application process is running
if ! pgrep -x "mfmsapp" > /dev/null; then
    echo "[TRAFFIC] ❌ mfmsapp process not running"
    exit 1
fi

# Check that the port is listening
if ! ss -tlnp | grep ":8080" > /dev/null; then
    echo "[TRAFFIC] ❌ Port 8080 not listening"
    exit 1
fi

# Check the health endpoint (--max-time keeps the hook from hanging)
HTTP_CODE=$(curl -s -o /dev/null --max-time 5 -w "%{http_code}" http://localhost:8080/health)
if [ "$HTTP_CODE" != "200" ]; then
    echo "[TRAFFIC] ❌ Health check failed: HTTP $HTTP_CODE"
    exit 1
fi

echo "[TRAFFIC] ✅ All pre-traffic checks passed"
exit 0
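A freshly started process may need a few seconds before its health endpoint returns 200, so a single probe can fail a deployment that would otherwise succeed. The sketch below shows one way to retry; `retry` and `check_health` are hypothetical helpers, and the attempt count and delay are assumptions to tune against the 60-second hook timeout.

```shell
#!/bin/sh
# Sketch: retry a command a fixed number of times with a pause in between.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while :; do
    if "$@"; then
      return 0            # command succeeded
    fi
    if [ "$n" -ge "$attempts" ]; then
      return 1            # out of attempts
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Assumed health probe, matching the endpoint used in the script above.
check_health() {
  code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' http://localhost:8080/health)
  [ "$code" = "200" ]
}
```

In the hook this might read `retry 10 3 check_health || exit 1`, which gives the application roughly 30 seconds of sleep time plus probe time to become healthy.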

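Step 4: Advantages of blue-green deployment

| Characteristic | In-Place Deployment | Blue-Green Deployment |
|---|---|---|
| Downtime | Seconds to minutes | Zero downtime |
| Rollback speed | Requires redeploying the previous version | Switch traffic back, within seconds, while the blue instances still exist |
| Resource cost | 1x | 2x (two environments required) |
| Risk | Live instances are updated directly | Validate first, then shift traffic |
| Typical use | Dev/test environments | Production |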


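Technique 3: Cross-Region Artifact Replication

Problem

The primary region is ap-northeast-1 (Tokyo) and the disaster recovery (DR) region is us-west-2 (Oregon). We need to be able to deploy quickly in the DR region as well.

Solution: S3 Cross-Region Replication (CRR)

Step 1: Create the artifact bucket in the target region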

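# Create the destination bucket in us-west-2
aws s3api create-bucket \
  --bucket mfmsapp-pipeline-artifacts-123456789012-us-west-2 \
  --region us-west-2 \
  --create-bucket-configuration LocationConstraint=us-west-2

# Replication requires versioning on both the source and the destination bucket
aws s3api put-bucket-versioning --region us-west-2 \
  --bucket mfmsapp-pipeline-artifacts-123456789012-us-west-2 \
  --versioning-configuration Status=Enabled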
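One detail worth knowing when adapting the command to other regions: us-east-1 rejects an explicit LocationConstraint, while every other region requires one. The sketch below prints the region-dependent arguments; both helper names are hypothetical, and the bucket naming scheme is the one used in this article.

```shell
#!/bin/sh
# Sketch: print the arguments for `aws s3api create-bucket` for a given region.
create_bucket_args() {
  bucket=$1
  region=$2
  if [ "$region" = "us-east-1" ]; then
    # us-east-1 is the default location and must not be passed as a constraint
    printf '%s\n' "--bucket $bucket --region $region"
  else
    printf '%s\n' "--bucket $bucket --region $region --create-bucket-configuration LocationConstraint=$region"
  fi
}

# Naming scheme used in this article: mfmsapp-pipeline-artifacts-<account>-<region>
artifact_bucket() {
  printf 'mfmsapp-pipeline-artifacts-%s-%s\n' "$1" "$2"
}
```

The arguments can then be passed on unquoted, for example `aws s3api create-bucket $(create_bucket_args "$(artifact_bucket 123456789012 us-west-2)" us-west-2)`, which is safe here because bucket and region names contain no spaces.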

Step 2: Configure cross-region replication

{
  "Role": "arn:aws:iam::123456789012:role/S3ReplicationRole",
  "Rules": [
    {
      "ID": "ReplicateArtifacts",
      "Status": "Enabled",
      "Prefix": "mfmsapp/",
      "Destination": {
        "Bucket": "arn:aws:s3:::mfmsapp-pipeline-artifacts-123456789012-us-west-2",
        "StorageClass": "STANDARD_IA",
        "EncryptionConfiguration": {
          "ReplicaKmsKeyID": "arn:aws:kms:us-west-2:123456789012:key/replica-kms-key-id"
        }
      },
      "SourceSelectionCriteria": {
        "SseKmsEncryptedObjects": {
          "Status": "Enabled"
        }
      }
    }
  ]
}

# Apply the configuration above, saved as replication-config.json, to the source bucket
aws s3api put-bucket-replication \
  --bucket mfmsapp-pipeline-artifacts-123456789012 \
  --replication-configuration file://replication-config.json
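Replication is asynchronous, so it can be useful to check whether a given artifact has reached the DR region before relying on it. S3 reports a replication status on each source object. The sketch below only interprets that status; the helper name is hypothetical, and the lookup command in the comment is an assumed invocation that is not executed here.

```shell
#!/bin/sh
# Sketch: interpret the ReplicationStatus that S3 reports for an object.
# Assumed lookup:
#   aws s3api head-object --bucket <source-bucket> --key <key> \
#     --query ReplicationStatus --output text
replication_state() {
  case "$1" in
    COMPLETED|COMPLETE) echo "done" ;;
    PENDING)            echo "waiting" ;;
    FAILED)             echo "failed" ;;
    REPLICA)            echo "replica" ;;   # the object is itself a replicated copy
    *)                  echo "unknown" ;;   # e.g. "None" when no rule matched the key
  esac
}
```

A status of "unknown" on a key that should replicate usually means the key does not match the rule's prefix or the object was written before the rule existed.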

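Step 3: CodePipeline in the DR region

Create a separate CodePipeline in us-west-2 that uses the replicated artifact. This assumes the primary pipeline publishes its build output to a stable key (mfmsapp/latest/BuildOutput.zip here), since the keys CodePipeline generates for its own artifacts change with every execution: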

{
  "pipeline": {
    "name": "mfmsapp-pipeline-dr",
    "roleArn": "arn:aws:iam::123456789012:role/mfmsapp-codepipeline-role-us-west-2",
    "artifactStore": {
      "type": "S3",
      "location": "mfmsapp-pipeline-artifacts-123456789012-us-west-2",
      "encryptionKey": {
        "id": "arn:aws:kms:us-west-2:123456789012:key/replica-kms-key-id",
        "type": "KMS"
      }
    },
    "stages": [
      {
        "name": "Source",
        "actions": [
          {
            "name": "ArtifactFromPrimary",
            "actionTypeId": {
              "category": "Source",
              "owner": "AWS",
              "provider": "S3",
              "version": "1"
            },
            "configuration": {
              "S3Bucket": "mfmsapp-pipeline-artifacts-123456789012-us-west-2",
              "S3ObjectKey": "mfmsapp/latest/BuildOutput.zip",
              "PollForSourceChanges": "true"
            },
            "outputArtifacts": [{"name": "BuildOutput"}]
          }
        ]
      },
      {
        "name": "Deploy",
        "actions": [
          {
            "name": "DeployToDR",
            "actionTypeId": {
              "category": "Deploy",
              "owner": "AWS",
              "provider": "CodeDeploy",
              "version": "1"
            },
            "configuration": {
              "ApplicationName": "mfmsapp-dr",
              "DeploymentGroupName": "mfmsapp-dr-prod"
            },
            "inputArtifacts": [{"name": "BuildOutput"}]
          }
        ]
      }
    ]
  }
}

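Technique 4: Faster Builds Through Parallelism

Problem

The build stage runs unit tests, lint, and a security scan one after another, taking 8 minutes in total. The steps are independent of each other, so they can run in parallel.

Solution: Parallel Actions in CodePipeline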

{
  "name": "Quality",
  "actions": [
    {
      "name": "UnitTest",
      "runOrder": 1,
      "actionTypeId": {
        "category": "Build",
        "owner": "AWS",
        "provider": "CodeBuild",
        "version": "1"
      },
      "configuration": {
        "ProjectName": "mfmsapp-unittest"
      }
    },
    {
      "name": "Lint",
      "runOrder": 1,
      "actionTypeId": {
        "category": "Build",
        "owner": "AWS",
        "provider": "CodeBuild",
        "version": "1"
      },
      "configuration": {
        "ProjectName": "mfmsapp-lint"
      }
    },
    {
      "name": "SecurityScan",
      "runOrder": 1,
      "actionTypeId": {
        "category": "Build",
        "owner": "AWS",
        "provider": "CodeBuild",
        "version": "1"
      },
      "configuration": {
        "ProjectName": "mfmsapp-security-scan"
      }
    }
  ]
}

Key point: all three actions have runOrder 1, so CodePipeline runs them in parallel.

Buildspecs for the parallel actions

# buildspec-unittest.yml
version: 0.2
phases:
  build:
    commands:
      - go test -v -race -coverprofile=coverage.out ./...
      - go tool cover -func=coverage.out
artifacts:
  files:
    - coverage.out

# buildspec-lint.yml
version: 0.2
phases:
  install:
    commands:
      - go install github.com/golangci/golangci-lint/cmd/golangci-lint@latest
  build:
    commands:
      - golangci-lint run ./...

# buildspec-security.yml
version: 0.2
phases:
  install:
    commands:
      - go install github.com/securego/gosec/v2/cmd/gosec@latest
  build:
    commands:
      - gosec -fmt json -out results.json ./...
artifacts:
  files:
    - results.json

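Parallel vs. serial comparison

| Mode | Unit tests | Lint | Security scan | Total |
|---|---|---|---|---|
| Serial (runOrder: 1, 2, 3) | 3 min | 2 min | 3 min | 8 min |
| Parallel (runOrder: 1, 1, 1) | 3 min | 2 min | 3 min | 3 min |

Wall-clock time drops by 62.5%, from 8 minutes to 3 minutes, because the parallel stage takes only as long as its slowest action.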
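The arithmetic behind this comparison can be sketched as follows: actions that share a runOrder cost as much as the slowest of them, and the runOrder groups are summed. The helper names and the input format (one "action runOrder minutes" line per action) are assumptions for illustration, and the estimate ignores per-build provisioning overhead.

```shell
#!/bin/sh
# Sketch: estimate a stage's wall-clock minutes from "action runOrder minutes" lines.
stage_minutes() {
  awk '
    # Track the slowest action within each runOrder group.
    { m = $3 + 0; if (!($2 in max) || m > max[$2]) max[$2] = m }
    # Groups run one after another, so their maxima are summed.
    END { total = 0; for (k in max) total += max[k]; print total }
  '
}

# Percentage reduction in wall-clock time going from serial to parallel.
reduction_pct() {
  awk -v s="$1" -v p="$2" 'BEGIN { printf "%.1f\n", (s - p) / s * 100 }'
}
```

Feeding the serial layout (runOrder 1, 2, 3) into `stage_minutes` gives 8, the parallel layout (runOrder 1, 1, 1) gives 3, and `reduction_pct 8 3` gives 62.5.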


Complete Pipeline Architecture

Putting these techniques together, the production-grade pipeline for mfmsapp looks like this:

graph LR
    subgraph "Stage 1: Source"
        A[CodeCommit] --> B[Webhook / Poll]
    end
    
    subgraph "Stage 2: Build"
        B --> C[Build Binary]
    end
    
    subgraph "Stage 3: Quality (parallel)"
        C --> D[Unit Test]
        C --> E[Lint]
        C --> F[Security Scan]
    end
    
    subgraph "Stage 4: Deploy Dev"
        D --> G[Deploy to Dev]
        E --> G
        F --> G
    end
    
    subgraph "Stage 5: Approve"
        G --> H[Manual Approval]
    end
    
    subgraph "Stage 6: Deploy Prod"
        H --> I[Blue-Green Deploy]
    end
    
    subgraph "Stage 7: DR"
        I --> J[S3 Cross-Region Replicate]
        J --> K[Deploy to DR Region]
    end
    
    style D fill:#3F8624,stroke:#232F3E,color:#fff
    style E fill:#3F8624,stroke:#232F3E,color:#fff
    style F fill:#3F8624,stroke:#232F3E,color:#fff
    style I fill:#1A73E8,stroke:#232F3E,color:#fff

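Series Summary

Nine articles covering the full journey from zero to production-grade CI/CD:

| Part | Topic | Key takeaway |
|---|---|---|
| 01 | Architecture overview | Choosing between the S3 and CodeCommit models |
| 02 | Building the S3-triggered pipeline | Hands-on with the complete pipeline |
| 03 | S3 pitfalls | Permissions, duplicate events, artifact paths |
| 04 | Building with CodeCommit | Git workflow integration |
| 05 | CodeCommit pitfalls | Trigger latency, caching, multiple branches |
| 06 | Version evolution in practice | End-to-end walkthrough of v1→v2→v3 |
| 07 | Permissions and security | IAM least privilege, KMS encryption |
| 08 | Monitoring and alerting | CloudWatch + SNS + rollback |
| 09 | Advanced techniques | Multi-environment, blue-green, cross-region, parallel |

mfmsapp has grown from a simple in-memory REST API into a production-grade application with multi-environment support, blue-green deployment, cross-region disaster recovery, and full monitoring. This CI/CD path fits the AWS deployment needs of most small and medium-sized projects.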




End of the series. If it helped you, feel free to share it with colleagues who are building CI/CD on AWS. Questions are welcome in the comments!