Files
ca-marketplace-scraper/docs/superpowers/specs/2026-04-21-cookie-env-only-design.md

5.1 KiB

Cookie Env-Only Design

Summary

Remove all file-based and request-provided cookie inputs across the repo. The only supported authentication input becomes a raw Cookie header string supplied through scraper-specific environment variables such as FACEBOOK_COOKIE and EBAY_COOKIE.

Goals

  • Remove cookie file fallback from shared and marketplace-specific code.
  • Remove request-level cookie overrides from public scraper entrypoints.
  • Remove deprecated cookie-path parameters from Facebook APIs.
  • Keep cookie parsing deterministic and limited to raw header-string input.
  • Update tests and docs so the public contract matches the implementation.

Non-Goals

  • Changing scraper behavior unrelated to authentication input.
  • Adding new cookie formats or migration helpers.
  • Preserving backward compatibility for cookie files, JSON cookie arrays, or request overrides.

Current State

The current shared cookie utilities support three sources in priority order:

  1. Request parameter
  2. Environment variable
  3. Cookie file

packages/core/src/utils/cookies.ts includes file loading, JSON array parsing, and auto-detection between JSON and header-string formats. Facebook also exposes deprecated cookiePath arguments that still reach shared loading logic. Docs in cookies/AGENTS.md still describe file-based setup and request-level overrides.

Chosen Approach

Use the hard-reset approach. Delete the shared multi-source cookie-loading model and reduce the cookie surface to env-header parsing only. This is a larger diff than a surgical removal, but it avoids leaving behind abstractions that imply unsupported inputs still exist.

Design

packages/core/src/utils/cookies.ts will keep only the pieces needed for env-header-based auth:

  • Cookie type
  • A reduced cookie config shape containing only name, domain, and envVar
  • parseCookieString() for raw Cookie header strings
  • formatCookiesForHeader() for domain filtering and request formatting
  • An env-only loader that reads process.env[config.envVar], parses it, and throws a targeted error when missing or invalid

The following shared utilities will be removed:

  • JSON cookie-array parsing
  • Auto-detection between JSON and header-string formats
  • File loading helpers
  • Optional loaders whose behavior depends on file fallback or request input

Marketplace Scrapers

Marketplace scrapers that require auth will read cookies only from their env vars.

For Facebook this means:

  • Remove _cookiePath / cookiePath parameters from helper and public functions
  • Remove any docs/comments that mention parameter > env > file precedence
  • Update auth failure messaging to name only FACEBOOK_COOKIE

For eBay this means:

  • Remove any remaining fallback/file-oriented behavior from shared calls and error strings
  • Keep the existing env-var auth path, but make it the only path

Public API Surface

Exports from packages/core/src/index.ts should reflect the new contract. If exported functions currently advertise cookie-source or cookie-path arguments, their signatures will be tightened so callers cannot pass unsupported inputs.

Downstream adapter packages should continue calling core through the simplified signatures without adding their own cookie-loading behavior.

Error Handling

There are now only two auth failure modes:

  1. The required env var is missing or empty.
  2. The env var does not contain any valid name=value cookie pairs.

Errors should be blunt and specific:

  • identify the missing env var by name
  • state that the value must be a raw Cookie header string
  • stop mentioning request parameters, cookie paths, JSON arrays, or ./cookies/*.json

Testing Strategy

Follow TDD. Start by changing or adding core tests so the old file/request behavior is no longer accepted.

Coverage targets:

  1. Valid env header strings still parse into cookies correctly.
  2. Missing env vars fail with the new env-only error.
  3. Invalid env strings fail without falling back to files or request data.
  4. Facebook APIs no longer expose or honor cookie-path/request-cookie behavior.
  5. Existing tests that depended on missing files or JSON cookie arrays are rewritten to the env-only contract.

Verification target after implementation:

  • bun test packages/core/test
  • bun run ci
  • bun run build if any cross-package signature changes require downstream verification

Documentation Changes

Update cookie-related docs to match the new contract:

  • remove file-based setup instructions
  • remove request-parameter cookie examples
  • document env vars as the only supported auth input
  • show raw Cookie header-string examples only

Risks

  • External callers using request cookie overrides will break at compile time or runtime, depending on how they consume the package.
  • Recent work added support for custom Facebook cookie paths, so removing that path intentionally reverses a newly introduced behavior.
  • Tests that currently model missing-file behavior must be rewritten rather than preserved.

Rollout Notes

This is an intentional contract break. The code, tests, and docs should all land together so there is no mixed messaging about supported cookie sources.