ca-marketplace-scraper/.ruler/05-TESTING-DOCTRINE.md
Dmytro Stanchiev 7cf21546e2 chore: ai agent config
Signed-off-by: Dmytro Stanchiev <git@dmytros.dev>
2026-04-21 20:37:55 -04:00

# Production Testing Doctrine
_Project-Agnostic Engineering Standard_
---
# 1. Purpose of Testing
Testing exists to:
- Prevent regressions
- Protect critical business behavior
- Enforce invariants
- Guard boundaries
- Provide safe refactoring
- Reduce production incidents
Testing does not exist to:
- Increase coverage numbers
- Satisfy tooling requirements
- Mirror implementation line by line
- Create a false sense of security
If a test does not reduce real-world risk, it should not exist.
---
# 2. Core Principles
---
## 2.1 Determinism Is Non-Negotiable
A test must:
- Produce the same result every run
- Not depend on execution order
- Not depend on global state
- Not depend on wall-clock time
- Not depend on external networks
- Not depend on randomness (unless seeded)
A flaky test is worse than no test.
If a test fails intermittently:
- Fix it immediately
- Or delete it
There is no third option.
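
One way to meet the wall-clock and randomness rules is to pass time and the random generator in as arguments. The sketch below assumes two hypothetical helpers, `is_expired` and `sample_ids`, which are not part of any real project; the tests are plain functions with `assert`, so they need no particular test runner.

```python
import random
from datetime import datetime, timedelta, timezone


def is_expired(created_at: datetime, ttl: timedelta, now: datetime) -> bool:
    """Return True once `ttl` has fully elapsed since `created_at`."""
    # `now` is a parameter, so the function never reads the wall clock itself.
    return now - created_at >= ttl


def sample_ids(ids: list[int], k: int, rng: random.Random) -> list[int]:
    """Pick `k` ids using the supplied generator, never the global one."""
    return rng.sample(ids, k)


def test_is_expired_uses_injected_time() -> None:
    created = datetime(2024, 1, 1, tzinfo=timezone.utc)
    ttl = timedelta(hours=1)
    assert not is_expired(created, ttl, now=created + timedelta(minutes=59))
    assert is_expired(created, ttl, now=created + timedelta(hours=1))


def test_sampling_is_reproducible_with_a_seed() -> None:
    ids = list(range(100))
    first = sample_ids(ids, 5, random.Random(1234))
    second = sample_ids(ids, 5, random.Random(1234))
    # Two generators built from the same seed yield the same sample.
    assert first == second
```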
---
## 2.2 Isolation of Behavior
Tests should verify behavior in isolation from unrelated systems.
The smaller a test's scope, the faster and more reliable it is.
We separate:
- Pure logic
- System interactions
- External integrations
- Full-system behavior
Confusing these layers results in slow, fragile suites.
---
## 2.3 Risk-Based Testing
Testing effort should scale with risk.
High-risk areas:
- Financial logic
- Security and access control
- Data mutation
- Distributed coordination
- Concurrency
- Migration and transformation logic
Low-risk areas:
- Static rendering
- Formatting helpers
- Simple data mapping
Testing must prioritize business-critical systems.
---
## 2.4 Tests Are Part of the System
Tests must follow the same standards as production code:
- Clean structure
- Clear naming
- Maintainable
- Reviewed in PRs
- Refactored when necessary
Test code quality reflects engineering quality.
---
# 3. Testing Layers (Architecture-Neutral)
These layers apply regardless of language, framework, or architecture.
---
## 3.1 Unit Tests (Logic Layer)
Definition:
Tests that validate pure behavior without system dependencies.
Must:
- Run fast
- Avoid I/O
- Avoid network
- Avoid persistent state
- Avoid framework bootstrapping
Should test:
- Business rules
- Domain invariants
- Edge cases
- Validation
- Transformation logic
Reasoning:
If logic cannot be tested without infrastructure, it is coupled too tightly.
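
As an illustration, a unit test at this layer might look like the sketch below. `Listing` and `cheapest_per_listing` are hypothetical names invented for the example; the point is that the rule runs with no I/O, network, or framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    listing_id: str
    price_cents: int


def cheapest_per_listing(listings: list[Listing]) -> dict[str, Listing]:
    """Keep one entry per listing_id: the one with the lowest price."""
    best: dict[str, Listing] = {}
    for item in listings:
        current = best.get(item.listing_id)
        if current is None or item.price_cents < current.price_cents:
            best[item.listing_id] = item
    return best


def test_keeps_lowest_price_for_each_id() -> None:
    result = cheapest_per_listing(
        [Listing("a", 500), Listing("b", 300), Listing("a", 450)]
    )
    assert result == {"a": Listing("a", 450), "b": Listing("b", 300)}


def test_empty_input_yields_empty_result() -> None:
    assert cheapest_per_listing([]) == {}
```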
---
## 3.2 Integration Tests (System Boundary Layer)
Definition:
Tests that validate interactions between internal components.
May include:
- Datastores
- Filesystems
- Queues
- Caches
- Framework wiring
- Service boundaries
Must:
- Use real internal components
- Reset state between runs
- Avoid real external services
Reasoning:
Most production bugs occur at boundaries, not in pure functions.
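
A minimal sketch of this layer, assuming SQLite stands in for the project's datastore. `ListingStore` and `fresh_store` are hypothetical names. Each test opens its own in-memory database, so a real component is exercised and state is reset between runs.

```python
import sqlite3
from typing import Optional


class ListingStore:
    """Hypothetical repository backed by a real SQLite database."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS listings ("
            "url TEXT PRIMARY KEY, price_cents INTEGER NOT NULL)"
        )

    def save(self, url: str, price_cents: int) -> None:
        # INSERT OR REPLACE keeps a single row per URL.
        self._conn.execute(
            "INSERT OR REPLACE INTO listings (url, price_cents) VALUES (?, ?)",
            (url, price_cents),
        )

    def price_of(self, url: str) -> Optional[int]:
        row = self._conn.execute(
            "SELECT price_cents FROM listings WHERE url = ?", (url,)
        ).fetchone()
        return None if row is None else row[0]

    def count(self) -> int:
        return self._conn.execute("SELECT COUNT(*) FROM listings").fetchone()[0]


def fresh_store() -> ListingStore:
    """Build a store on a brand-new in-memory database."""
    return ListingStore(sqlite3.connect(":memory:"))


def test_saving_same_url_twice_keeps_one_row() -> None:
    store = fresh_store()
    store.save("https://example.test/a", 1000)
    store.save("https://example.test/a", 900)
    assert store.count() == 1
    assert store.price_of("https://example.test/a") == 900


def test_unknown_url_has_no_price() -> None:
    store = fresh_store()
    assert store.count() == 0
    assert store.price_of("https://example.test/missing") is None
```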
---
## 3.3 External Integration Tests
Definition:
Tests that validate interaction with third-party systems.
Policy:
- Prefer mocking or simulation
- Use sandbox environments only when necessary
- Never depend on live production services
Reasoning:
External systems are outside your control and introduce nondeterminism.
---
## 3.4 End-to-End Tests (System-Level)
Definition:
Tests that validate complete workflows from entry to outcome.
Must:
- Cover only critical flows
- Be minimal in number
- Run in isolated environments
- Avoid unnecessary duplication of lower-level tests
End-to-end tests are expensive and fragile. Use them surgically.
---
# 4. State Management Policy
---
## 4.1 No Shared State Between Tests
Every test must assume a blank environment.
Options:
- Fresh environment per test
- Transaction rollback
- Full reset between runs
- Isolated test containers
No test may depend on side effects from another test.
---
## 4.2 Reproducible Environments
Tests must run consistently:
- Locally
- In CI
- In parallel
- Across operating systems (if supported)
Environment drift is unacceptable.
---
# 5. Mocking Policy
---
## 5.1 Mock External Systems
Mock:
- Third-party APIs
- Payment providers
- Email systems
- External storage
- Network services outside system boundary
Reasoning:
You do not control them.
---
## 5.2 Do Not Mock Core Logic
Never mock:
- Business rules
- Authorization checks
- Data validation
- Domain logic
Mocking internal logic invalidates the test.
---
## 5.3 Avoid Over-Mocking
Over-mocking:
- Couples tests to implementation
- Breaks refactoring
- Creates fragile tests
Mock only what crosses system boundaries.
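
A sketch of this policy, using a hypothetical exchange-rate provider as the external system. `ExchangeRates`, `FakeExchangeRates`, and `convert_cents` are invented for illustration. Only the boundary is replaced by a fake; the conversion and rounding rule run unmocked.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Protocol


class ExchangeRates(Protocol):
    """The boundary: in production this would call a third-party service."""

    def rate(self, base: str, quote: str) -> Decimal: ...


class FakeExchangeRates:
    """In-memory stand-in for the external provider."""

    def __init__(self, rates: dict[tuple[str, str], Decimal]) -> None:
        self._rates = rates

    def rate(self, base: str, quote: str) -> Decimal:
        return self._rates[(base, quote)]


def convert_cents(amount_cents: int, base: str, quote: str, rates: ExchangeRates) -> int:
    """Core logic under test: convert and round half up to whole cents."""
    if amount_cents < 0:
        raise ValueError("amount_cents must be non-negative")
    converted = Decimal(amount_cents) * rates.rate(base, quote)
    return int(converted.to_integral_value(rounding=ROUND_HALF_UP))


def test_conversion_uses_real_rounding_rule() -> None:
    rates = FakeExchangeRates({("CAD", "USD"): Decimal("0.75")})
    assert convert_cents(1000, "CAD", "USD", rates) == 750
    assert convert_cents(2, "CAD", "USD", rates) == 2  # 1.50 rounds up to 2
    assert convert_cents(0, "CAD", "USD", rates) == 0


def test_negative_amount_is_rejected() -> None:
    rates = FakeExchangeRates({("CAD", "USD"): Decimal("0.75")})
    try:
        convert_cents(-1, "CAD", "USD", rates)
    except ValueError:
        return
    raise AssertionError("expected ValueError for a negative amount")
```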
---
# 6. Error & Edge Case Policy
Every public interface must have tests for:
- Valid input
- Invalid input
- Unauthorized or restricted access (if applicable)
- Boundary values
- Failure paths
- Concurrency conflicts (if applicable)
Most real-world failures happen outside happy paths.
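
For a small public interface, the checklist above might translate into tests like these. `parse_page_size` and its 1 to 100 range are assumptions made for the example.

```python
def parse_page_size(raw: str, maximum: int = 100) -> int:
    """Parse a page-size parameter, accepting only integers in 1..maximum."""
    try:
        value = int(raw)
    except ValueError:
        raise ValueError(f"page size is not an integer: {raw!r}") from None
    if not 1 <= value <= maximum:
        raise ValueError(f"page size out of range 1..{maximum}: {value}")
    return value


def rejects(raw: str) -> bool:
    """Return True when parse_page_size refuses the input."""
    try:
        parse_page_size(raw)
    except ValueError:
        return True
    return False


def test_valid_and_boundary_values() -> None:
    assert parse_page_size("1") == 1      # lower boundary
    assert parse_page_size("25") == 25    # ordinary valid input
    assert parse_page_size("100") == 100  # upper boundary


def test_invalid_and_out_of_range_values() -> None:
    # Just outside each boundary, plus malformed input.
    for raw in ["0", "101", "-5", "", "abc", "2.5"]:
        assert rejects(raw), raw
```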
---
# 7. Security Testing Doctrine
All systems must test:
- Access control enforcement
- Privilege boundaries
- Input validation
- Injection resistance (where applicable)
- Role escalation prevention
Security-sensitive logic must have near-complete coverage.
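
A sketch of access-control and role-escalation tests, under an assumed three-role model. `Role`, `User`, `Forbidden`, and `set_role` are hypothetical. Each denial test also checks that no state changed.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    VIEWER = 1
    EDITOR = 2
    ADMIN = 3


@dataclass
class User:
    name: str
    role: Role


class Forbidden(Exception):
    """Raised when an actor lacks the privilege for an operation."""


def set_role(actor: User, target: User, new_role: Role) -> None:
    """Only admins may change roles, including their own."""
    if actor.role is not Role.ADMIN:
        raise Forbidden(f"{actor.name} may not change roles")
    target.role = new_role


def test_editor_cannot_promote_self() -> None:
    editor = User("eve", Role.EDITOR)
    try:
        set_role(editor, editor, Role.ADMIN)
    except Forbidden:
        pass
    else:
        raise AssertionError("self-promotion must be denied")
    assert editor.role is Role.EDITOR  # the denied call changed nothing


def test_viewer_cannot_change_another_user() -> None:
    viewer, other = User("vic", Role.VIEWER), User("bob", Role.VIEWER)
    try:
        set_role(viewer, other, Role.EDITOR)
    except Forbidden:
        pass
    else:
        raise AssertionError("viewers must not change roles")
    assert other.role is Role.VIEWER


def test_admin_can_change_roles() -> None:
    admin, user = User("ann", Role.ADMIN), User("bob", Role.VIEWER)
    set_role(admin, user, Role.EDITOR)
    assert user.role is Role.EDITOR
```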
---
# 8. Concurrency & Race Conditions
If the system involves:
- Multi-threading
- Distributed nodes
- Async processing
- Queues
- Parallel writes
Then tests must include:
- Concurrent execution scenarios
- Conflict handling
- Idempotency verification
- Retry logic behavior
These bugs rarely appear in simple test cases.
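
A concurrent idempotency check could be sketched like this. `Ledger` is a hypothetical in-memory component. A `threading.Barrier` releases all workers together, so the retries genuinely overlap instead of running one after another.

```python
import threading


class Ledger:
    """Hypothetical ledger that applies each idempotency key at most once."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._applied: dict[str, int] = {}

    def apply(self, idempotency_key: str, amount_cents: int) -> bool:
        """Return True only for the call that actually applied the charge."""
        with self._lock:
            if idempotency_key in self._applied:
                return False
            self._applied[idempotency_key] = amount_cents
            return True

    def total(self) -> int:
        with self._lock:
            return sum(self._applied.values())


def test_concurrent_retries_apply_once() -> None:
    ledger = Ledger()
    workers = 8
    barrier = threading.Barrier(workers)
    outcomes: list[bool] = []
    outcomes_lock = threading.Lock()

    def attempt() -> None:
        barrier.wait(timeout=5)  # hold every thread until all are ready
        applied = ledger.apply("order-42", 500)
        with outcomes_lock:
            outcomes.append(applied)

    threads = [threading.Thread(target=attempt) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    assert len(outcomes) == workers
    assert outcomes.count(True) == 1  # exactly one retry won
    assert ledger.total() == 500      # the charge was not duplicated
```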
---
# 9. Migration & Data Evolution
If the system stores data over time:
- Schema migrations must be tested
- Data transformation must be verified
- Backward compatibility must be validated
- Downgrade scenarios (if supported) must be considered
Silent data corruption is catastrophic.
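
A migration test can load rows in the old shape, run the migration, and assert on the result. The sketch below assumes SQLite and a hypothetical change from `price_dollars` to an added `price_cents` column; the table and function names are illustrative only.

```python
import sqlite3


def create_v1_schema(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE listings (id INTEGER PRIMARY KEY, price_dollars REAL NOT NULL)"
    )


def migrate_v1_to_v2(conn: sqlite3.Connection) -> None:
    """Add price_cents and backfill it; the old column stays readable."""
    conn.execute("ALTER TABLE listings ADD COLUMN price_cents INTEGER")
    conn.execute(
        "UPDATE listings SET price_cents = CAST(ROUND(price_dollars * 100) AS INTEGER)"
    )
    conn.commit()


def test_migration_preserves_and_converts_rows() -> None:
    conn = sqlite3.connect(":memory:")
    create_v1_schema(conn)
    conn.executemany(
        "INSERT INTO listings (id, price_dollars) VALUES (?, ?)",
        [(1, 19.99), (2, 0.0), (3, 1234.5)],
    )
    conn.commit()

    migrate_v1_to_v2(conn)

    rows = conn.execute(
        "SELECT id, price_dollars, price_cents FROM listings ORDER BY id"
    ).fetchall()
    # No row lost, old values intact, new values rounded rather than truncated.
    assert rows == [(1, 19.99, 1999), (2, 0.0, 0), (3, 1234.5, 123450)]
```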
---
# 10. CI Enforcement
Tests must run automatically:
- On every pull request
- On main branch
- Before release
CI must:
- Fail fast
- Prevent merges on failure
- Run in clean environments
- Be reproducible
If tests only run locally, they are not part of the system.
---
# 11. Coverage Philosophy
Coverage is a diagnostic tool, not a goal.
Required:
- High coverage on business-critical modules
- Full coverage on security boundaries
- Full coverage on financial logic
Optional:
- High coverage on trivial UI or formatting
100% coverage does not imply correctness.
Low coverage in critical areas is unacceptable.
---
# 12. Performance of the Test Suite
The test suite must:
- Run quickly enough to encourage frequent execution
- Support parallelization
- Avoid arbitrary sleeps
- Avoid unnecessary bootstrapping
Slow tests reduce engineering velocity and discourage use.
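
One common alternative to an arbitrary sleep is to wait on an explicit completion signal with a generous upper bound. In this sketch `Doubler` is a hypothetical background worker. The test resumes as soon as the work finishes; the timeout only caps how long a genuine failure takes to surface.

```python
import threading
from typing import Optional


class Doubler:
    """Hypothetical worker that computes a result on a background thread."""

    def __init__(self) -> None:
        self.done = threading.Event()
        self.result: Optional[int] = None

    def start(self, value: int) -> None:
        def run() -> None:
            self.result = value * 2
            self.done.set()  # signal completion instead of relying on timing

        threading.Thread(target=run).start()


def test_waits_on_signal_not_on_sleep() -> None:
    worker = Doubler()
    worker.start(21)
    # Event.wait returns True as soon as the flag is set, False on timeout.
    assert worker.done.wait(timeout=5)
    assert worker.result == 42
```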
---
# 13. Red Flags (Immediate Rejection)
- Tests that sometimes fail
- Tests that depend on execution order
- Snapshot abuse
- Arbitrary timeouts to “fix” flakiness
- Global mutable state
- Randomized data without seed
- Testing implementation details instead of behavior
- Excessive E2E replacing proper layering
- Mocking core domain logic
- Tests that assert only truthy values
---
# 14. Refactoring Policy
Tests must enable refactoring.
If changing internal structure breaks many tests without changing behavior, the tests are coupled to implementation rather than to behavior.
Behavioral contracts should remain stable under refactor.
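
To illustrate, the sketch below runs one behavioral contract against two interchangeable implementations of a hypothetical `unique` helper. The contract asserts only on inputs and outputs, so replacing one implementation with the other breaks nothing.

```python
from typing import Callable, Iterable


def unique_with_loop(items: Iterable[int]) -> list[int]:
    """First implementation: explicit loop with a seen-set."""
    seen: set[int] = set()
    result: list[int] = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def unique_with_dict(items: Iterable[int]) -> list[int]:
    """Refactored implementation: relies on dict insertion order."""
    return list(dict.fromkeys(items))


def check_unique_contract(unique: Callable[[Iterable[int]], list[int]]) -> None:
    """Behavioral contract: drop duplicates, keep first-seen order."""
    assert unique([]) == []
    assert unique([3, 1, 3, 2, 1]) == [3, 1, 2]
    assert unique([7, 7, 7]) == [7]


def test_both_implementations_satisfy_the_contract() -> None:
    check_unique_contract(unique_with_loop)
    check_unique_contract(unique_with_dict)
```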
---
# 15. Production Observability Complements Testing
Testing does not replace:
- Logging
- Monitoring
- Alerting
- Metrics
- Tracing
Tests prevent known failures.
Observability detects unknown ones.
Both are required.
---
# 16. The Engineering Mindset
Before writing any test, ask:
1. What failure would hurt the business most?
2. What invariant must never break?
3. What boundary is being crossed?
4. What assumptions are being made?
5. Can this test fail nondeterministically?
6. Is this testing behavior or implementation?
If the test does not meaningfully reduce risk, reconsider it.
---
# 17. Definition of Production-Grade Testing
A system with production-grade testing:
- Can be refactored safely
- Rarely ships regressions
- Catches security violations before release
- Detects data integrity failures early
- Has a stable, trusted CI pipeline
- Has a fast feedback loop
- Is boringly reliable
Engineers trust the test suite.
They do not ignore it.
They do not fear it.
They rely on it.
That is the standard.