docs: align cookie setup with env-only auth

2026-04-21 21:53:42 -04:00
parent d65d81dbd1
commit b6e9501448
6 changed files with 716 additions and 100 deletions

@@ -0,0 +1,131 @@
# Cookie Env-Only Design
## Summary
Remove all file-based and request-provided cookie inputs across the repo.
The only supported authentication input becomes a raw `Cookie` header string supplied through scraper-specific environment variables such as `FACEBOOK_COOKIE` and `EBAY_COOKIE`.
## Goals
- Remove cookie file fallback from shared and marketplace-specific code.
- Remove request-level cookie overrides from public scraper entrypoints.
- Remove deprecated cookie-path parameters from Facebook APIs.
- Keep cookie parsing deterministic and limited to raw header-string input.
- Update tests and docs so the public contract matches the implementation.
## Non-Goals
- Changing scraper behavior unrelated to authentication input.
- Adding new cookie formats or migration helpers.
- Preserving backward compatibility for cookie files, JSON cookie arrays, or request overrides.
## Current State
The current shared cookie utilities support three sources in priority order:
1. Request parameter
2. Environment variable
3. Cookie file
`packages/core/src/utils/cookies.ts` includes file loading, JSON array parsing, and auto-detection between JSON and header-string formats.
Facebook also exposes deprecated `cookiePath` arguments that still reach shared loading logic.
Docs in `cookies/AGENTS.md` still describe file-based setup and request-level overrides.
## Chosen Approach
Use the hard-reset approach: delete the shared multi-source cookie-loading model and reduce the cookie surface to env-header parsing only.
This is a larger diff than a surgical removal, but it avoids leaving behind abstractions that imply unsupported inputs still exist.
## Design
### Shared Cookie Utilities
`packages/core/src/utils/cookies.ts` will keep only the pieces needed for env-header-based auth:
- `Cookie` type
- A reduced cookie config shape containing only `name`, `domain`, and `envVar`
- `parseCookieString()` for raw `Cookie` header strings
- `formatCookiesForHeader()` for domain filtering and request formatting
- An env-only loader that reads `process.env[config.envVar]`, parses it, and throws a targeted error when the value is missing or contains no valid cookie pairs
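As a rough illustration of the reduced surface, the sketch below shows one way the retained types and `parseCookieString()` could look. The field types, the `domain` argument to the parser, and the rule for skipping malformed segments are assumptions made for this example; the actual code in `packages/core/src/utils/cookies.ts` may differ.

```typescript
// Assumed shapes: the design names these fields but not their exact types.
interface Cookie {
  name: string;
  value: string;
  domain: string;
}

interface CookieConfig {
  name: string;   // human-readable scraper name
  domain: string; // domain stamped onto parsed cookies
  envVar: string; // e.g. "FACEBOOK_COOKIE"
}

// Parse a raw `Cookie` header string such as "a=1; b=2".
// Only the first "=" in a segment separates name from value, so values
// that themselves contain "=" are preserved. Segments with no "=" or an
// empty name are skipped (an assumption of this sketch).
function parseCookieString(header: string, domain: string): Cookie[] {
  const cookies: Cookie[] = [];
  for (const segment of header.split(";")) {
    const eq = segment.indexOf("=");
    if (eq <= 0) continue;
    const name = segment.slice(0, eq).trim();
    const value = segment.slice(eq + 1).trim();
    if (name.length === 0) continue;
    cookies.push({ name, value, domain });
  }
  return cookies;
}
```

Because the parser accepts only the header-string format, there is no format detection step left to get wrong: input either yields `name=value` pairs or yields an empty array.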
The following shared utilities will be removed:
- JSON cookie-array parsing
- Auto-detection between JSON and header-string formats
- File loading helpers
- Optional loaders whose behavior depends on file fallback or request input
### Marketplace Scrapers
Marketplace scrapers that require auth will read cookies only from their env vars.
For Facebook this means:
- Remove `_cookiePath` / `cookiePath` parameters from helper and public functions
- Remove any docs/comments that mention parameter > env > file precedence
- Update auth failure messaging to name only `FACEBOOK_COOKIE`
For eBay this means:
- Remove any remaining fallback/file-oriented behavior from shared calls and error strings
- Keep the existing env-var auth path, but make it the only path
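A minimal sketch of how per-marketplace configs and `formatCookiesForHeader()` might fit together once env vars are the only source. The env var names come from this design; the constant names, the domain values, the `host` parameter, and the suffix-matching rule are hypothetical and shown only for illustration.

```typescript
interface Cookie {
  name: string;
  value: string;
  domain: string;
}

interface CookieConfig {
  name: string;
  domain: string;
  envVar: string;
}

// Env var names are from the design; domain values are assumptions.
const FACEBOOK_COOKIES: CookieConfig = {
  name: "Facebook",
  domain: ".facebook.com",
  envVar: "FACEBOOK_COOKIE",
};
const EBAY_COOKIES: CookieConfig = {
  name: "eBay",
  domain: ".ebay.com",
  envVar: "EBAY_COOKIE",
};

// True when `host` equals the cookie domain or is a subdomain of it.
function domainMatches(cookieDomain: string, host: string): boolean {
  const bare = cookieDomain.replace(/^\./, "").toLowerCase();
  const h = host.toLowerCase();
  return h === bare || h.endsWith("." + bare);
}

// Keep only cookies valid for `host`, then join them as a Cookie header value.
function formatCookiesForHeader(cookies: Cookie[], host: string): string {
  return cookies
    .filter((c) => domainMatches(c.domain, host))
    .map((c) => `${c.name}=${c.value}`)
    .join("; ");
}

// Placeholder cookie names and values, for illustration only.
const exampleCookies: Cookie[] = [
  { name: "session", value: "123", domain: FACEBOOK_COOKIES.domain },
  { name: "token", value: "abc", domain: EBAY_COOKIES.domain },
];
const facebookHeader = formatCookiesForHeader(exampleCookies, "www.facebook.com");
```

Note that neither config carries a file path or a request-level override field, so a scraper has nothing to consult other than its own env var.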
### Public API Surface
Exports from `packages/core/src/index.ts` should reflect the new contract.
If exported functions currently advertise cookie-source or cookie-path arguments, their signatures will be tightened so callers cannot pass unsupported inputs.
Downstream adapter packages should continue calling core through the simplified signatures without adding their own cookie-loading behavior.
### Error Handling
After this change there are only two auth failure modes:
1. The required env var is missing or empty.
2. The env var does not contain any valid `name=value` cookie pairs.
Errors should be blunt and specific:
- identify the missing env var by name
- state that the value must be a raw `Cookie` header string
- stop mentioning request parameters, cookie paths, JSON arrays, or `./cookies/*.json`
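The two failure modes could be sketched as follows. The function names and the exact message wording are hypothetical. The environment is passed in as an argument only to keep the example self-contained; the real loader is described above as reading `process.env[config.envVar]` directly.

```typescript
type Env = Record<string, string | undefined>;

// Failure mode 1: the env var is missing or empty.
function missingEnvError(envVar: string): Error {
  return new Error(
    `${envVar} is not set or is empty. ` +
      `Set it to a raw Cookie header string, e.g. "name1=value1; name2=value2".`
  );
}

// Failure mode 2: the env var holds no valid name=value pairs.
function invalidEnvError(envVar: string): Error {
  return new Error(
    `${envVar} does not contain any valid name=value pairs. ` +
      `It must be a raw Cookie header string, e.g. "name1=value1; name2=value2".`
  );
}

// True when at least one ";"-separated segment has a non-empty name before "=".
function hasValidPair(raw: string): boolean {
  return raw.split(";").some((segment) => {
    const eq = segment.indexOf("=");
    return eq > 0 && segment.slice(0, eq).trim().length > 0;
  });
}

// Return the raw header string, or throw one of the two errors above.
// There is deliberately no fallback to files or request data.
function requireCookieHeader(envVar: string, env: Env): string {
  const raw = env[envVar]?.trim();
  if (!raw) throw missingEnvError(envVar);
  if (!hasValidPair(raw)) throw invalidEnvError(envVar);
  return raw;
}
```

A caller would invoke this as `requireCookieHeader("FACEBOOK_COOKIE", process.env)`. Both messages name the env var and the expected format and mention nothing else, in line with the rules listed above.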
### Testing Strategy
Follow TDD.
Start by changing or adding core tests so the old file/request behavior is no longer accepted.
Coverage targets:
1. Valid env header strings still parse into cookies correctly.
2. Missing env vars fail with the new env-only error.
3. Invalid env strings fail without falling back to files or request data.
4. Facebook APIs no longer expose or honor cookie-path/request-cookie behavior.
5. Existing tests that depended on missing files or JSON cookie arrays are rewritten to the env-only contract.
Verification targets after implementation:
- `bun test packages/core/test`
- `bun run ci`
- `bun run build` if any cross-package signature changes require downstream verification
## Documentation Changes
Update cookie-related docs to match the new contract:
- remove file-based setup instructions
- remove request-parameter cookie examples
- document env vars as the only supported auth input
- show raw `Cookie` header-string examples only
## Risks
- External callers using request cookie overrides will break at compile time or runtime, depending on how they consume the package.
- Recent work added support for custom Facebook cookie paths, so removing that path intentionally reverses a newly introduced behavior.
- Tests that currently model missing-file behavior must be rewritten rather than preserved.
## Rollout Notes
This is an intentional contract break.
The code, tests, and docs should all land together so there is no mixed messaging about supported cookie sources.