Secrets in Code: How to Build a Detection Pipeline That Catches Leaks
The Scale of the Problem
Leaked secrets in source code repositories continue to be a leading cause of data breaches. API keys, database credentials, cloud access tokens, and private keys regularly end up committed to both private and public repositories. Studies consistently show that thousands of new secrets are leaked on GitHub alone every day.
The root cause isn't carelessness - it's workflow. Developers working quickly may hardcode a credential for testing, forget to remove it, and commit it alongside their feature code. Once committed, the secret persists in git history even after it is deleted from the current codebase, and without automated detection it can sit there unnoticed.
Building Your Detection Pipeline
Layer 1: Pre-Commit Hooks
The earliest and cheapest place to catch secrets is before they're committed. Tools like Gitleaks or TruffleHog can run as pre-commit hooks, scanning staged changes for patterns that match known secret formats.
This catches obvious cases - AWS access keys, private RSA keys, common API key formats - before they enter the repository history.
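To make the idea concrete, here is a minimal sketch of what such a hook does under the hood. It is not how Gitleaks or TruffleHog are implemented; the two patterns and the function names find_secrets and main are illustrative assumptions, and real scanners ship far larger rule sets.

```python
import re
import subprocess
import sys

# Illustrative patterns only (an assumption for this sketch).
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private key block": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def find_secrets(diff_text):
    """Return (rule name, line) pairs for added lines that match a pattern."""
    findings = []
    for line in diff_text.splitlines():
        # Only added lines matter; skip the "+++ b/file" header lines.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((name, line[1:].strip()))
    return findings


def main():
    """Scan staged changes; return 1 if anything suspicious is found."""
    # "git diff --cached" shows what is staged for the next commit.
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    findings = find_secrets(diff)
    for name, _line in findings:
        # Name the rule but do not echo the secret itself.
        print(f"Possible {name} in staged changes", file=sys.stderr)
    return 1 if findings else 0
```

Saved as an executable .git/hooks/pre-commit script that ends with sys.exit(main()), a non-zero return value makes git abort the commit.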
Layer 2: CI/CD Pipeline Scanning
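Pre-commit hooks can be bypassed: they run on the developer's machine, so a developer can skip them (for example with git commit --no-verify) or never install them in the first place. Your CI/CD pipeline should therefore include a mandatory secrets scanning step that blocks merges when secrets are detected.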
Integrate scanning into your pull request workflow. When a PR contains a detected secret, it should fail the check and notify the developer with clear instructions on how to remediate.
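As a rough illustration of the pull request check, the sketch below parses a unified diff, records the file and line number of each suspicious added line, and builds a failure message that tells the developer what to do without echoing the secret into CI logs. The single pattern and the names scan_pr_diff and build_report are assumptions made for this example; a real pipeline would normally call a dedicated scanner.

```python
import re

# Assumed, illustrative rule; a dedicated scanner would use many more.
AWS_KEY = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def scan_pr_diff(diff_text):
    """Return (path, line number) for each added line containing a match."""
    findings = []
    path, new_line_no = None, 0
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # "+++ b/src/app.py" names the file on the new side of the diff.
            target = line[4:]
            path = target[2:] if target.startswith("b/") else target
            continue
        if line.startswith("--- "):
            continue
        hunk = HUNK_HEADER.match(line)
        if hunk:
            # The hunk header gives the first line number in the new file.
            new_line_no = int(hunk.group(1))
            continue
        if line.startswith("+"):
            if AWS_KEY.search(line):
                findings.append((path, new_line_no))
            new_line_no += 1
        elif line.startswith(" "):
            new_line_no += 1  # context lines exist in the new file too
        # Removed lines ("-") do not advance the new-file line counter.
    return findings


def build_report(findings):
    """Format a failure message without echoing the secret into CI logs."""
    lines = [f"{path}:{line_no}: possible AWS access key" for path, line_no in findings]
    lines.append("Rotate the credential, remove it from the branch, and push again.")
    return "\n".join(lines)
```

In a CI job, the script would exit with a non-zero status whenever scan_pr_diff returns any findings. Marking that job as a required check is what actually blocks the merge.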
Layer 3: Repository Monitoring
For existing repositories, run a full historical scan to identify secrets that were committed in the past. Remember: deleting a file from the current branch doesn't remove it from git history. You need tools that scan the entire commit history.
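The point about history can be sketched in a few lines: ask git for every commit on every ref as a patch, then look at added lines only, so the commit that introduced a secret is reported even if a later commit deleted it. The patterns and the names scan_history and read_full_history are assumptions for illustration; dedicated scanners do the same walk much faster and with better rules.

```python
import re
import subprocess

# Illustrative patterns only (an assumption for this sketch).
PATTERNS = [
    re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
]


def scan_history(log_text):
    """Map each commit hash to its count of added lines that look like secrets."""
    hits = {}
    commit = None
    for line in log_text.splitlines():
        if line.startswith("commit "):
            commit = line.split()[1]
        elif line.startswith("+") and not line.startswith("+++"):
            if any(pattern.search(line) for pattern in PATTERNS):
                hits[commit] = hits.get(commit, 0) + 1
    return hits


def read_full_history(repo_path="."):
    """Dump every commit reachable from any ref as a patch."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--all", "-p", "--format=commit %H"],
        capture_output=True, text=True, errors="replace", check=True,
    )
    return result.stdout
```

Calling scan_history(read_full_history("path/to/repo")) returns the commits that introduced a match. A commit that only deletes a secret contributes nothing, which is exactly why the introducing commit still has to be found and cleaned up.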
Layer 4: Runtime Detection
Monitor your production environment for signs of credential misuse. On AWS, for example, CloudTrail records API activity, and GuardDuty analyzes that activity to flag unusual credential usage patterns, such as a key suddenly being used from an unfamiliar location.
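To show the kind of signal involved, here is a deliberately simple sketch that flags an access key being used from an IP address it has not been seen on before. It is not how GuardDuty works internally. The event dictionaries follow the shape of CloudTrail records (userIdentity.accessKeyId, sourceIPAddress), while the known_ips baseline and the function name find_unfamiliar_usage are hypothetical.

```python
from collections import defaultdict


def find_unfamiliar_usage(events, known_ips):
    """Return events where an access key is used from an IP outside its baseline.

    events: CloudTrail-style records (dicts), oldest first.
    known_ips: hypothetical baseline mapping access key ID -> set of usual IPs.
    """
    seen = defaultdict(set)
    for key_id, ips in known_ips.items():
        seen[key_id].update(ips)

    alerts = []
    for event in events:
        key_id = event.get("userIdentity", {}).get("accessKeyId")
        source_ip = event.get("sourceIPAddress")
        if not key_id or not source_ip:
            continue  # skip records that carry no access key or source IP
        if source_ip not in seen[key_id]:
            alerts.append(event)
            seen[key_id].add(source_ip)  # alert once per new (key, IP) pair
    return alerts
```

A single rule like this produces false positives (for example, developers on changing networks), so in practice it would be one input among several rather than a standalone alarm.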
Remediation Process
When a secret is detected:
- Rotate immediately - The exposed credential must be revoked and replaced, regardless of whether it was in a public or private repository
- Assess exposure - Determine whether the secret was accessed or used by unauthorized parties
- Clean git history - Use tools like git filter-branch or BFG Repo-Cleaner to remove the secret from commit history
- Root cause - Why was the secret hardcoded? Was it a process failure? Lack of a secrets manager? Address the underlying cause.
Prevention
The best detection pipeline catches secrets early, but prevention is even better:
- Use a secrets manager - HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault
- Environment variables - Never hardcode credentials; use environment-specific configuration
- Documentation - Make it easy for developers to do the right thing with clear guides on how to use the secrets manager
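As a small example of the environment-variable approach, the sketch below reads a credential at startup and fails fast with a clear error if it is missing, instead of falling back to a hardcoded default. The helper name require_secret, the exception MissingSecretError, and the variable name DB_PASSWORD are all hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is absent from the environment."""


def require_secret(name, environ=os.environ):
    """Return the named secret, failing fast if it is unset or empty."""
    value = environ.get(name, "")
    if not value:
        # Report the variable name only; never log secret values.
        raise MissingSecretError(
            f"Required environment variable {name} is not set"
        )
    return value
```

Application code would then call require_secret("DB_PASSWORD") once at startup. The value itself can be injected by the deployment platform or fetched from a secrets manager, so it never appears in the repository.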