There is nothing more stressful than watching a production deployment fail while your users are actively clicking buttons. In my experience, the biggest risk isn’t the bug itself—it’s the time it takes to fix it while the site is down. That’s why I switched to blue-green deployments.
In this tutorial on blue-green deployments with GitHub Actions, I’ll show you how to set up two identical environments so you can test your new version in isolation before flipping a switch to send 100% of your traffic to the new build. If something goes wrong, you just flip the switch back.
Prerequisites
- A GitHub repository with your application code.
- A hosting provider that supports multiple environments or slots (e.g., AWS, DigitalOcean, or Azure).
- A load balancer or DNS provider that you can update from a script (for example, Cloudflare through its API, or Nginx through a config reload).
- Basic knowledge of GitHub Actions for Docker deployment to handle the containerization part of the process.
Step 1: Defining Your Environments
First, you need two identical environments. I call them ‘Blue’ and ‘Green’. At any given time, one is ‘Live’ (Production) and the other is ‘Idle’ (Staging/Preview).
In my current setup, I track which environment is active with a GitHub Secret called ACTIVE_ENVIRONMENT, whose value is either blue or green.
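The value only helps if it stays accurate: after every successful flip, something has to write the new active environment back, or the next run will pick the wrong slot. Here is a minimal sketch of that write-back using the GitHub CLI. It assumes gh is installed and authenticated with a token that is allowed to write repository secrets (which, as far as I know, the default workflow token is not). The function name record_active_environment is my own invention.

```shell
#!/usr/bin/env bash
# Sketch: persist which slot is live after a successful flip.
# Assumes the GitHub CLI (gh) is installed and authenticated with a token
# that may write repository secrets. The function name is illustrative.
record_active_environment() {
  local env_name="$1"
  case "$env_name" in
    blue|green) ;;
    *)
      echo "refusing to record unknown environment: '$env_name'" >&2
      return 1
      ;;
  esac
  # "gh secret set NAME --body VALUE" creates or updates a repository secret.
  gh secret set ACTIVE_ENVIRONMENT --body "$env_name"
}

# Example (after traffic has moved to Green):
#   record_active_environment green
```

Rejecting anything other than blue or green here means a typo can never end up stored as the "active" environment.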
Step 2: Building the Workflow
The core of this strategy is a GitHub Actions workflow that identifies the idle environment and deploys the code there first. Here is the YAML configuration I use:
name: Blue-Green Deployment

on:
  push:
    branches: [ main ]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Determine Idle Environment
        id: env_check
        env:
          # Pass the secret in as an environment variable instead of
          # interpolating it directly into the script
          ACTIVE: ${{ secrets.ACTIVE_ENVIRONMENT }}
        run: |
          # The idle slot is whichever one is not currently live
          if [ "$ACTIVE" = "blue" ]; then
            echo "IDLE=green" >> "$GITHUB_OUTPUT"
          else
            echo "IDLE=blue" >> "$GITHUB_OUTPUT"
          fi

      - name: Deploy to Idle Environment
        run: |
          echo "Deploying to ${{ steps.env_check.outputs.IDLE }}..."
          # Replace this with your actual deploy command
          # Example: ./deploy.sh --env ${{ steps.env_check.outputs.IDLE }}

      - name: Run Smoke Tests
        run: |
          echo "Testing ${{ steps.env_check.outputs.IDLE }} environment..."
          # -f makes curl exit non-zero on HTTP errors, which fails the job
          curl -f "https://${{ steps.env_check.outputs.IDLE }}.myapp.com/health"

      - name: Switch Traffic
        run: |
          echo "Switching traffic to ${{ steps.env_check.outputs.IDLE }}..."
          # API call to your Load Balancer to update the target
          # (the URL is quoted so the shell does not treat "?" as a glob)
          curl -f -X POST "https://api.loadbalancer.com/switch?target=${{ steps.env_check.outputs.IDLE }}"
          # After a successful switch, update ACTIVE_ENVIRONMENT so the
          # next run deploys to the other slot
As shown in the image below, the logic depends on a strict check of the active slot before any code is pushed. This prevents you from accidentally overwriting the live site.
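One caveat: in the workflow above, any value other than blue (including an empty or misspelled secret) falls through to the else branch and selects blue as the idle slot. A stricter version of the check might look like the sketch below. The name idle_for is something I made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: derive the idle slot from the active one, and fail loudly on
# anything unexpected instead of guessing.
idle_for() {
  case "$1" in
    blue)  echo "green" ;;
    green) echo "blue" ;;
    *)
      echo "ACTIVE_ENVIRONMENT must be 'blue' or 'green', got '$1'" >&2
      return 1
      ;;
  esac
}

# Example use inside the "Determine Idle Environment" step:
#   idle="$(idle_for "$ACTIVE")" || exit 1
#   echo "IDLE=$idle" >> "$GITHUB_OUTPUT"
```

Assigning the result to a variable first matters: the assignment carries the function's exit status, so a bad value stops the job before anything is deployed.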
Step 3: Implementing the Traffic Switch
The ‘Magic’ happens in the traffic switch. Depending on your stack, this could be a DNS record update, an Nginx config reload, or a Kubernetes ingress change. If you are scaling a Node.js app on DigitalOcean, you might use a floating IP or a Load Balancer target group update.
How the switch works in practice:
- Blue is Live: Traffic flows to Blue. GitHub Actions deploys to Green.
- Verification: You visit green.myapp.com to verify the build.
- The Flip: The Load Balancer is told: “Now send all traffic to Green.”
- Post-Flip: Blue becomes the ‘Idle’ environment, ready for the next release.
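The sequence above, plus the "flip the switch back" escape hatch, can be sketched as a small script. The two helper functions are hypothetical hooks that you would replace with calls to your own load balancer. The hosts are the same placeholder domains used in the workflow, not real endpoints.

```shell
#!/usr/bin/env bash
# Sketch: flip traffic to the new slot and flip back if it looks unhealthy.
# point_traffic_at and is_healthy are hypothetical hooks; swap their bodies
# for whatever your load balancer and app actually expose.

point_traffic_at() {
  # Placeholder load balancer call (same made-up endpoint as the workflow)
  curl -fsS -X POST "https://api.loadbalancer.com/switch?target=$1" > /dev/null
}

is_healthy() {
  # Placeholder health endpoint on the given slot
  curl -fsS "https://$1.myapp.com/health" > /dev/null
}

flip_with_rollback() {
  local target="$1" previous="$2"
  # If the switch itself fails, traffic has not moved, so just report it
  point_traffic_at "$target" || return 1
  if is_healthy "$target"; then
    echo "live: $target (idle: $previous)"
    return 0
  fi
  echo "health check failed on $target, rolling back to $previous" >&2
  point_traffic_at "$previous"
  return 1
}

# Example: Blue is live and Green has just been deployed and verified
#   flip_with_rollback green blue
```

The function returns non-zero whenever the flip does not stick, so a workflow step that calls it fails visibly rather than reporting a successful release.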
Pro Tips for Stable Deployments
After implementing this in several projects, I’ve found a few optimizations that save a lot of headaches:
- Database Migrations: This is the hardest part of blue-green. Ensure your database changes are backward compatible. Your DB must support both the Blue and Green versions of the app simultaneously.
- Session Persistence: Use a Redis store for sessions so users aren’t logged out when the traffic flips.
- Canary Integration: Instead of a 100% flip, start by sending 5% of traffic to the Green environment to monitor for errors.
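To make the canary idea concrete, here is a rough sketch of a staged rollout. It assumes your load balancer supports weighted routing. set_green_weight and green_looks_healthy are hypothetical hooks that you would wire up to your own load balancer and monitoring, and the stages in the example are only an illustration.

```shell
#!/usr/bin/env bash
# Sketch: shift traffic to Green in stages and back out on the first bad
# reading. Both hooks below are hypothetical placeholders.

set_green_weight() {
  # Replace with your load balancer's weighted-routing call
  echo "green weight -> $1%"
}

green_looks_healthy() {
  # Placeholder check; an error-rate query against your monitoring
  # would be a better signal than a single health request
  curl -fsS "https://green.myapp.com/health" > /dev/null
}

# canary_ramp PAUSE_SECONDS WEIGHT...
canary_ramp() {
  local pause_seconds="$1"
  shift
  local weight
  for weight in "$@"; do
    set_green_weight "$weight"
    # Let real traffic reach Green before judging it
    sleep "$pause_seconds"
    if ! green_looks_healthy; then
      echo "problems at ${weight}%, sending all traffic back to Blue" >&2
      set_green_weight 0
      return 1
    fi
  done
  return 0
}

# Example: 5%, then 25%, 50%, and finally 100%, one minute apart
#   canary_ramp 60 5 25 50 100
```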
Troubleshooting Common Issues
If your traffic switch isn’t working, check these three things first:
- DNS Caching: If you’re switching via DNS, remember that TTL (Time to Live) can cause some users to stay on the old version for minutes. Use a Load Balancer instead for near-instant shifts.
- Environment Variable Mismatch: Ensure both environments share the same production secrets but have unique ENV_NAME tags.
- Health Check Failures: If the workflow fails at the “Run Smoke Tests” step, your app most likely crashed on startup in the idle environment or was not ready yet. Because the job stops there, traffic never switches, so you can check the idle environment’s logs while users stay on the live one.
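On that last point, a single curl fired right after a deploy can fail simply because the app is still booting. Below is a small sketch of a more patient smoke test that polls the health endpoint a few times before giving up. The function name, the defaults, and the example URL are my own assumptions, so tune them to your app's startup time.

```shell
#!/usr/bin/env bash
# Sketch: poll a health endpoint several times before declaring the
# deploy broken. Name and default values are illustrative.
# Usage: wait_for_healthy URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_healthy() {
  local url="$1" attempts="${2:-10}" delay="${3:-5}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: fail on HTTP errors, -sS: quiet but still report errors,
    # --max-time: do not hang forever on a stuck connection
    if curl -fsS --max-time 5 "$url" > /dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    # No point sleeping after the final attempt
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "still unhealthy after $attempts attempts: $url" >&2
  return 1
}

# Example:
#   wait_for_healthy "https://green.myapp.com/health" 10 5
```

If every attempt fails, the function returns non-zero, which fails the workflow step the same way the plain curl -f would.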
What’s Next?
Once you’ve mastered this, you can explore more complex orchestration. If you’re managing a larger codebase, you might want to look into deploying monorepos to Vercel which handles some of this abstraction for you.
Ready to automate your pipeline? Start by implementing the environment check step in your next GitHub Actions workflow!