API keys are the core access tokens of the modern web ecosystem. Stripe secret keys, Twilio auth tokens, AWS access keys, Firebase service accounts. Every one of them is a serious risk if it leaks.
But most teams don’t rotate keys. “If it works, don’t touch it.” A key leaks, nobody notices for a while, and unauthorized usage piles up.
This post walks through API key rotation strategies in four different scenarios.
Why rotation matters
Key leak scenarios:
- Accidental commit to a Git repository
- Developer laptop stolen
- Keys pasted into Slack or Discord
- Plaintext in a log file
- Employee left, their access is still active
- Environment variable exposed in a build system
- Third-party SDK compromised
Any of these can happen. Rotation limits the compromise window.
Scenario 1: Compromised key (emergency rotation)
A key leak surfaces. Immediate action required.
Process:
- Revoke the old key. Disable immediately in the provider dashboard.
- Generate a new key. Same privilege level.
- Deploy the new key. Update production secrets.
- Verify functionality. Smoke tests pass.
- Audit usage logs. Any suspicious activity on the old key’s history?
- Post-mortem. How did the leak happen, and what do we do to prevent it?
Total: 15 minutes to 2 hours. Speed is critical.
Tools needed:
- Secrets management system (Vault, AWS Secrets Manager)
- CI/CD pipeline (for deploying the new secret)
- Monitoring dashboard (to verify functionality)
An unprepared team doing this manually will burn hours. Automation is essential.
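The audit step is the easiest one to script ahead of time. A minimal sketch, assuming the provider's usage log can be exported as records with a key identifier, source IP, and endpoint (the `UsageEvent` shape and the field names are hypothetical, not any provider's real export format):

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class UsageEvent:
    """One row of an exported usage log (hypothetical shape)."""
    key_id: str
    source_ip: str
    endpoint: str


def suspicious_events(
    events: Iterable[UsageEvent],
    compromised_key_id: str,
    known_ips: Set[str],
) -> List[UsageEvent]:
    """Return calls made with the compromised key from IPs we don't recognise."""
    return [
        event
        for event in events
        if event.key_id == compromised_key_id and event.source_ip not in known_ips
    ]
```

Anything this returns goes straight into the post-mortem.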
Scenario 2: Scheduled rotation (proactive)
Regular rotation on a schedule. Even without a leak.
Typical intervals:
- High-privilege keys: 30 to 60 days
- Standard keys: 90 days
- Low-risk keys: 180 days
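These intervals can be turned into a due-date check. A minimal sketch, assuming three tier names ("high", "standard", "low") and taking the upper bound of the high-privilege range; both choices are illustrative:

```python
from datetime import date, timedelta

# Rotation interval per privilege tier, in days (tier names are assumptions).
ROTATION_INTERVAL_DAYS = {
    "high": 60,
    "standard": 90,
    "low": 180,
}


def next_rotation_due(last_rotated: date, tier: str) -> date:
    """Return the date by which a key of the given tier should be rotated."""
    return last_rotated + timedelta(days=ROTATION_INTERVAL_DAYS[tier])


def is_due(last_rotated: date, tier: str, today: date) -> bool:
    """True once the rotation interval has fully elapsed."""
    return today >= next_rotation_due(last_rotated, tier)
```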
Process:
- Generate a new key. Old key still active.
- Deploy the new key to a secondary slot. Code supports primary plus secondary key.
- Verify the secondary works. Send test requests with the new key.
- Swap primary and secondary. New key is now primary.
- Wait a safety period. 24 to 48 hours for the change to propagate to dependent systems.
- Revoke the old key.
Zero downtime. Both keys active during the transition.
Code pattern:
    import os

    def get_api_key():
        primary = os.getenv('API_KEY_PRIMARY')
        secondary = os.getenv('API_KEY_SECONDARY')
        # Prefer primary, fall back to secondary
        return primary or secondary

Some APIs support multiple keys natively (AWS IAM allows up to two access keys per user). If they don’t, you orchestrate manually.
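The pattern above only falls back when the primary variable is unset. During a rotation window it can also help to fall back when the provider rejects a key. A rough sketch, assuming your request function raises a dedicated exception on an auth failure (`AuthError` and `call_with_key_fallback` are hypothetical names for illustration):

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class AuthError(Exception):
    """Hypothetical: raised by the request function when a key is rejected."""


def call_with_key_fallback(
    request: Callable[[str], T],
    keys: Sequence[Optional[str]],
) -> T:
    """Try each configured key in order; move to the next only on AuthError."""
    last_error: Optional[AuthError] = None
    for key in keys:
        if not key:
            continue  # slot not configured
        try:
            return request(key)
        except AuthError as exc:
            last_error = exc
    if last_error is not None:
        raise last_error
    raise RuntimeError("no API key configured")
```

Other errors (timeouts, 5xx responses) propagate unchanged, so a provider outage is not mistaken for a bad key.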
Scenario 3: Employee turnover
A developer leaves the team. They have keys they created.
Problem:
- Which keys belonged to the former employee?
- Have those keys been deprecated?
- Are automated jobs depending on them?
Best practice:
- Shared service accounts. Use a team-owned service account instead of an individual developer key. When a developer leaves, no key is orphaned, though any key they could read should still be rotated.
- Ownership tracking. Every key has an “owner” metadata field. When a developer leaves, you have the list of their keys ready.
- Automatic rotation on offboarding. “Rotate all keys owned by X” is a checklist item on offboarding.
- Inventory maintenance. Quarterly key audit. Investigate orphan keys with no owner.
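Ownership tracking and the quarterly audit both reduce to simple queries over a key inventory. A minimal sketch, assuming the inventory is a list of records with a name and an owner field (the `KeyRecord` shape is hypothetical):

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class KeyRecord:
    name: str
    owner: Optional[str]  # None means no owner was ever recorded


def keys_owned_by(inventory: Iterable[KeyRecord], owner: str) -> List[KeyRecord]:
    """Keys to rotate when `owner` is offboarded."""
    return [key for key in inventory if key.owner == owner]


def orphan_keys(inventory: Iterable[KeyRecord], active_staff: Set[str]) -> List[KeyRecord]:
    """Keys with no owner, or whose owner is no longer on the team."""
    return [
        key for key in inventory
        if key.owner is None or key.owner not in active_staff
    ]
```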
Scenario 4: Multi-environment rotation
Dev, staging, production. Each has a different key. Rotation requires environment isolation.
Pattern:
- Dev environment first. Rotate, test.
- Staging next. Broader verification.
- Production last. After confidence is built.
24 to 48 hours between environments. If there’s a problem, you catch it before moving on.
Prevent cross-environment confusion:
- Key naming: stripe_prod_key, stripe_staging_key
- Visual distinction in dashboards (green production, blue staging)
- Production key access limited to 2 or 3 people
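A startup check can catch a key loaded into the wrong environment. The sketch below relies on Stripe's key prefixes (`sk_live_` / `sk_test_`, and `rk_` for restricted keys); the environment names and the function itself are assumptions for illustration:

```python
LIVE_PREFIXES = ("sk_live_", "rk_live_")


def assert_key_matches_environment(key: str, environment: str) -> None:
    """Raise ValueError if a live key is used outside production, or vice versa."""
    is_live = key.startswith(LIVE_PREFIXES)
    if environment == "production" and not is_live:
        raise ValueError("production is configured with a non-live key")
    if environment != "production" and is_live:
        raise ValueError(f"live key configured in {environment}")
```

Calling this once at application startup fails fast, before any request is sent with the wrong key.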
Secrets management tools
Manual rotation is tedious and error-prone. Tools:
HashiCorp Vault:
- Dynamic secret generation
- Automatic rotation policies
- Audit logs
- Self-hosted or managed
AWS Secrets Manager:
- AWS-native
- Automatic rotation for supported services (RDS, DocumentDB)
- IAM integration
Doppler, Infisical:
- Developer-friendly SaaS
- Environment variable management
- CI/CD integrations
Google Secret Manager:
- GCP-native
- Version management
- Access control
Pick based on scale. Doppler is fine for a 10-developer startup. Enterprise goes to Vault.
Key hygiene practices
1. Never commit to Git.
.gitignore:
    .env
    .env.local
    secrets.yml
    *.pem
Pre-commit hook to detect sensitive content. Tools: git-secrets, truffleHog.
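To show what such a hook checks for, here is a toy scanner. It is a sketch only: the three patterns are a small, illustrative subset, and the dedicated tools above cover far more cases.

```python
import re
from typing import List, Tuple

# Illustrative patterns only; real scanners ship many more.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "Stripe live secret key": re.compile(r"\bsk_live_[0-9a-zA-Z]{10,}\b"),
    "Private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, pattern_label) for every suspicious line."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, label))
    return findings
```

A pre-commit hook would run this over the staged diff and exit non-zero when the list is not empty.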
2. Environment variables only.
Never literal keys in code. Read from the environment:
    import os

    API_KEY = os.getenv('API_KEY')
    if not API_KEY:
        # Fail at startup rather than on the first API call
        raise RuntimeError('API_KEY not set')
3. Limited scope.
If read-only access is enough, don’t use a write key. Stripe offers “restricted API keys” tied to specific endpoints.
4. Keys never appear in logs.
Don’t let API keys show up in log statements:
    logger.info("Calling API...")          # OK
    logger.info(f"Using key: {api_key}")   # BAD: writes the secret to the log
5. Audit logs on.
API key usage logs in the provider dashboard. Anomaly detection for unexpected IP, unusual time, or unusual volume.
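As a safety net for practice 4, a logging filter can redact anything that looks like a key before it is written. A minimal sketch using the standard `logging` module; the regex only covers Stripe-style prefixes and is an assumption to adapt per provider:

```python
import logging
import re

# Matches Stripe-style secret and restricted keys (illustrative pattern).
KEY_PATTERN = re.compile(r"\b(?:sk|rk)_(?:live|test)_[0-9a-zA-Z]+\b")


class RedactKeysFilter(logging.Filter):
    """Replace key-like substrings in log messages with a placeholder."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # applies %-style args first
        record.msg = KEY_PATTERN.sub("[REDACTED]", message)
        record.args = ()  # message is already fully formatted
        return True  # never drop the record, only rewrite it
```

Attach it to a handler (`handler.addFilter(RedactKeysFilter())`) rather than to a single logger, so records propagated from child loggers are covered too.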
Rotation automation
Automated scheduled rotation:
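    # Rotation job (triggered on a schedule, e.g. EventBridge, quarterly).
    # create_new_key(), revoke_key() and trigger_deployment() are placeholders
    # for your provider's key-management mechanism and your deploy pipeline.
    import json
    import time

    import boto3

    secrets_manager = boto3.client('secretsmanager')

    def get_secret(secret_id):
        response = secrets_manager.get_secret_value(SecretId=secret_id)
        return json.loads(response['SecretString'])

    def put_keys(primary, secondary):
        secrets_manager.update_secret(
            SecretId='stripe_key',
            SecretString=json.dumps({'primary': primary, 'secondary': secondary})
        )

    def rotate_stripe_key():
        # Remember the current key so it can be revoked at the end
        old_key = get_secret('stripe_key')['primary']
        # 1. Generate a new key
        new_key = create_new_key()
        # 2. Store in Secrets Manager (secondary slot)
        put_keys(primary=old_key, secondary=new_key)
        # 3. Trigger deployment (update running instances)
        trigger_deployment()
        # 4. Wait for propagation
        time.sleep(3600)
        # 5. Swap primary and secondary
        put_keys(primary=new_key, secondary=old_key)
        # 6. Revoke the old key (after grace period), then clear its slot
        time.sleep(86400)
        revoke_key(old_key)
        put_keys(primary=new_key, secondary=None)

This pipeline runs quarterly. No team intervention. The sleeps are shown inline for readability; a Lambda invocation is capped at 15 minutes, so in practice each wait becomes a separate scheduled step (for example a Step Functions Wait state).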
Challenges
Problems you’ll hit during rotation:
1. Cached keys. The old key is still in cache. Propagation is delayed.
Fix: short cache TTL. Explicit invalidation.
2. Third-party integrations. The key is embedded in customer webhooks. You have to notify every customer.
Fix: advance notice (30 days). A clear migration guide.
3. Long-running batch jobs. A job has been running for 2 hours when the key is rotated. Job fails.
Fix: grace period (overlap window). Detect long-running jobs.
4. Documentation. Docs, code samples and examples still reference the old key.
Fix: variable substitution. No hard-coded keys in docs.
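For challenge 1, the fix of a short TTL plus explicit invalidation can be sketched as a small wrapper around whatever function fetches the secret. `CachedSecret` and its parameters are hypothetical names; the clock is injectable so the behaviour can be tested without waiting:

```python
import time
from typing import Callable, Optional


class CachedSecret:
    """Cache a secret for a short TTL, with explicit invalidation."""

    def __init__(
        self,
        fetch: Callable[[], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached value, refetching once the TTL has elapsed."""
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value

    def invalidate(self) -> None:
        """Force the next get() to refetch, e.g. after an auth failure."""
        self._value = None
```

Reading the key through `get()` on each request also helps with challenge 3: a long-running job picks up the new key at the next refresh instead of holding the old one for its whole lifetime.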
Monitoring rotation health
Track:
- Last rotation date per key
- Keys overdue for rotation
- Failed rotation attempts
- Anomalous key usage patterns
- Leaked keys detected (GitHub secret scanning)
Dashboard: Datadog, Grafana. Alert when rotation is overdue by 30 or more days.
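The overdue alert is a one-line query once each key has a due date. A minimal sketch, assuming due dates are already computed and stored per key name (the function and its inputs are illustrative):

```python
from datetime import date
from typing import Dict, List


def overdue_keys(
    due_dates: Dict[str, date],
    today: date,
    alert_after_days: int = 30,
) -> List[str]:
    """Names of keys whose rotation is overdue by at least `alert_after_days`."""
    return sorted(
        name
        for name, due in due_dates.items()
        if (today - due).days >= alert_after_days
    )
```

The returned list can be pushed to the dashboard as a gauge or sent as the alert body.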
Takeaway
API key rotation is a discipline. Compromise mitigation, proactive security, employee turnover hygiene.
Four scenarios: emergency, scheduled, employee transition, multi-environment. Each has different urgency and process.
Secrets management tools are essential at scale. Manual rotation is manageable up to 3 to 5 keys; beyond that, automation is mandatory.
Key rotation is invisible work. But when a leak does happen, the difference is life-saving.