chore: format markdown

Signed-off-by: Dmytro Stanchiev <git@dmytros.dev>
2026-05-01 11:42:54 -04:00
parent d2c3c07e7d
commit 7ab33d0b02
15 changed files with 925 additions and 417 deletions

@@ -1,12 +1,13 @@
# Design: Adopt opencode Monorepo Config
**Date:** 2025-07-14
**Status:** Approved
**Date:** 2025-07-14\
**Status:** Approved\
**Approach:** Full adoption (A)
## Context
Current repo (`marketplace-scrapers-monorepo`) has basic bun workspaces with 3 packages (`core`, `api-server`, `mcp-server`). Reference: `anomalyco/opencode` monorepo patterns.
Current repo (`marketplace-scrapers-monorepo`) has basic bun workspaces with 3 packages
(`core`, `api-server`, `mcp-server`). Reference: `anomalyco/opencode` monorepo patterns.
**Gaps vs opencode:**
- No Turbo (task orchestration, caching, dep graph)
@@ -20,7 +21,8 @@ Current repo (`marketplace-scrapers-monorepo`) has basic bun workspaces with 3 p
### 1. Root `package.json`
- Add `workspaces.catalog` block with shared deps:
- `@typescript/native-preview`, `@types/bun`, `@types/unidecode`, `@types/cli-progress`
- `@typescript/native-preview`, `@types/bun`, `@types/unidecode`,
`@types/cli-progress`
- Add `turbo` to `devDependencies`
- Add `@tsconfig/bun` to `devDependencies` + catalog
- Update root scripts: `typecheck` and `build` delegate to `turbo run`
@@ -93,7 +95,8 @@ exact = true
root = "./do-not-run-tests-from-root"
```
Exact installs = reproducible. Root test guard prevents accidental root-level test runs.
Exact installs = reproducible.
Root test guard prevents accidental root-level test runs.
### 6. Package `exports` field
@@ -102,7 +105,8 @@ Replace `main`/`module` with `exports` in all 3 packages:
"exports": { ".": "./src/index.ts" }
```
Remove `main` and `module` fields. Bun resolves `.ts` directly.
Remove `main` and `module` fields.
Bun resolves `.ts` directly.
### 7. Catalog references in per-package `package.json`
@@ -115,7 +119,7 @@ Replace pinned versions with `"catalog:"` for shared deps:
## Files Changed
| File | Action |
|---|---|
| --- | --- |
| `package.json` | Update (catalog, turbo dep, scripts) |
| `turbo.json` | Create |
| `tsconfig.json` | Create |

@@ -3,7 +3,9 @@
## Summary
Remove all file-based and request-provided cookie inputs across the repo.
The only supported authentication input becomes a raw `Cookie` header string supplied through scraper-specific environment variables such as `FACEBOOK_COOKIE` and `EBAY_COOKIE`.
The only supported authentication input becomes a raw `Cookie` header string supplied
through scraper-specific environment variables such as `FACEBOOK_COOKIE` and
`EBAY_COOKIE`.
## Goals
@@ -17,7 +19,8 @@ The only supported authentication input becomes a raw `Cookie` header string sup
- Changing scraper behavior unrelated to authentication input.
- Adding new cookie formats or migration helpers.
- Preserving backward compatibility for cookie files, JSON cookie arrays, or request overrides.
- Preserving backward compatibility for cookie files, JSON cookie arrays, or request
overrides.
## Current State
@@ -27,27 +30,33 @@ The current shared cookie utilities support three sources in priority order:
2. Environment variable
3. Cookie file
`packages/core/src/utils/cookies.ts` includes file loading, JSON array parsing, and auto-detection between JSON and header-string formats.
Facebook also exposes deprecated `cookiePath` arguments that still reach shared loading logic.
Docs in `cookies/AGENTS.md` still describe file-based setup and request-level overrides.
`packages/core/src/utils/cookies.ts` includes file loading, JSON array parsing, and
auto-detection between JSON and header-string formats.
Facebook also exposes deprecated `cookiePath` arguments that still reach shared loading
logic. Docs in `cookies/AGENTS.md` still describe file-based setup and request-level
overrides.
## Chosen Approach
Use the hard-reset approach.
Delete the shared multi-source cookie-loading model and reduce the cookie surface to env-header parsing only.
This is a larger diff than a surgical removal, but it avoids leaving behind abstractions that imply unsupported inputs still exist.
Delete the shared multi-source cookie-loading model and reduce the cookie surface to
env-header parsing only.
This is a larger diff than a surgical removal, but it avoids leaving behind abstractions
that imply unsupported inputs still exist.
## Design
### Shared Cookie Utilities
`packages/core/src/utils/cookies.ts` will keep only the pieces needed for env-header-based auth:
`packages/core/src/utils/cookies.ts` will keep only the pieces needed for
env-header-based auth:
- `Cookie` type
- A reduced cookie config shape containing only `name`, `domain`, and `envVar`
- `parseCookieString()` for raw `Cookie` header strings
- `formatCookiesForHeader()` for domain filtering and request formatting
- An env-only loader that reads `process.env[config.envVar]`, parses it, and throws a targeted error when missing or invalid
- An env-only loader that reads `process.env[config.envVar]`, parses it, and throws a
targeted error when missing or invalid
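A minimal sketch of the env-only loader described in the list above. The field layout of `Cookie`, the `CookieConfig` name, the `loadCookiesFromEnv` name, and the error wording are all assumptions made for illustration; only `parseCookieString()` and the `name`/`domain`/`envVar` fields come from the design. The environment is passed in explicitly so the sketch stays self-contained; real code would presumably pass `process.env`.

```typescript
// Assumed shapes: the design names these concepts but not their exact fields.
interface Cookie {
  name: string;
  value: string;
}

interface CookieConfig {
  name: string; // scraper name, used in error messages
  domain: string; // cookie domain, e.g. ".facebook.com"
  envVar: string; // e.g. "FACEBOOK_COOKIE"
}

type Env = Record<string, string | undefined>;

// Parse a raw `Cookie` header string ("a=1; b=2") into name/value pairs.
function parseCookieString(header: string): Cookie[] {
  const cookies: Cookie[] = [];
  for (const part of header.split(";")) {
    const trimmed = part.trim();
    if (trimmed === "") continue;
    const eq = trimmed.indexOf("=");
    if (eq <= 0) continue; // skip fragments that have no cookie name
    cookies.push({
      name: trimmed.slice(0, eq).trim(),
      value: trimmed.slice(eq + 1).trim(), // keeps any "=" inside the value
    });
  }
  return cookies;
}

// Env-only loader: no file fallback and no request override.
// Error messages are placeholders, not the final wording.
function loadCookiesFromEnv(config: CookieConfig, env: Env): Cookie[] {
  const raw = env[config.envVar];
  if (raw === undefined || raw.trim() === "") {
    throw new Error(`${config.name}: ${config.envVar} is not set`);
  }
  const cookies = parseCookieString(raw);
  if (cookies.length === 0) {
    throw new Error(
      `${config.name}: ${config.envVar} is not a valid Cookie header string`,
    );
  }
  return cookies;
}
```

Because there is a single input source, both failure modes can name the exact environment variable the caller has to fix.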
The following shared utilities will be removed:
@@ -68,15 +77,18 @@ For Facebook this means:
For eBay this means:
- Remove any remaining fallback/file-oriented behavior from shared calls and error strings
- Remove any remaining fallback/file-oriented behavior from shared calls and error
strings
- Keep the existing env-var auth path, but make it the only path
### Public API Surface
Exports from `packages/core/src/index.ts` should reflect the new contract.
If exported functions currently advertise cookie-source or cookie-path arguments, their signatures will be tightened so callers cannot pass unsupported inputs.
If exported functions currently advertise cookie-source or cookie-path arguments, their
signatures will be tightened so callers cannot pass unsupported inputs.
Downstream adapter packages should continue calling core through the simplified signatures without adding their own cookie-loading behavior.
Downstream adapter packages should continue calling core through the simplified
signatures without adding their own cookie-loading behavior.
### Error Handling
@@ -93,8 +105,8 @@ Errors should be blunt and specific:
### Testing Strategy
Follow TDD.
Start by changing or adding core tests so the old file/request behavior is no longer accepted.
Follow TDD. Start by changing or adding core tests so the old file/request behavior is
no longer accepted.
Coverage targets:
@@ -102,7 +114,8 @@ Coverage targets:
2. Missing env vars fail with the new env-only error.
3. Invalid env strings fail without falling back to files or request data.
4. Facebook APIs no longer expose or honor cookie-path/request-cookie behavior.
5. Existing tests that depended on missing files or JSON cookie arrays are rewritten to the env-only contract.
5. Existing tests that depended on missing files or JSON cookie arrays are rewritten to
the env-only contract.
Verification target after implementation:
@@ -121,11 +134,15 @@ Update cookie-related docs to match the new contract:
## Risks
- External callers using request cookie overrides will break at compile time or runtime, depending on how they consume the package.
- Recent work added support for custom Facebook cookie paths, so removing that path intentionally reverses a newly introduced behavior.
- Tests that currently model missing-file behavior must be rewritten rather than preserved.
- External callers using request cookie overrides will break at compile time or runtime,
depending on how they consume the package.
- Recent work added support for custom Facebook cookie paths, so removing that path
intentionally reverses a newly introduced behavior.
- Tests that currently model missing-file behavior must be rewritten rather than
preserved.
## Rollout Notes
This is an intentional contract break.
The code, tests, and docs should all land together so there is no mixed messaging about supported cookie sources.
The code, tests, and docs should all land together so there is no mixed messaging about
supported cookie sources.

@@ -2,35 +2,46 @@
## Summary
Replace the legacy Facebook Marketplace scraper with a route-aware implementation built around current Comet bootstrap markers and route-specific extraction.
The new scraper will keep authenticated direct HTTP fetches as the primary transport, but it will stop treating legacy `require`, `__bbox`, and `marketplace_product_details_page` structures as the main parsing contract.
Replace the legacy Facebook Marketplace scraper with a route-aware implementation built
around current Comet bootstrap markers and route-specific extraction.
The new scraper will keep authenticated direct HTTP fetches as the primary transport,
but it will stop treating legacy `require`, `__bbox`, and
`marketplace_product_details_page` structures as the main parsing contract.
## Goals
- Replace both Facebook search and item-detail extraction with a current-shape parser.
- Keep authenticated direct HTTP requests as the primary fetch strategy.
- Parse route-specific Comet bootstrap/state payloads before falling back to rendered-HTML extraction.
- Parse route-specific Comet bootstrap/state payloads before falling back to
rendered-HTML extraction.
- Detect auth-gated, unavailable, and unknown responses explicitly.
- Update tests so they model current route markers and failure modes instead of legacy page objects.
- Update tests so they model current route markers and failure modes instead of legacy
page objects.
## Non-Goals
- Reworking non-Facebook scrapers.
- Converting the scraper to browser-only automation.
- Preserving old parser behavior for `marketplace_product_details_page` or `__bbox`-driven item extraction.
- Reverse-engineering every internal Facebook bootstrap payload shape exhaustively before implementation.
- Preserving old parser behavior for `marketplace_product_details_page` or
`__bbox`-driven item extraction.
- Reverse-engineering every internal Facebook bootstrap payload shape exhaustively
before implementation.
## Current State
The current implementation in `packages/core/src/scrapers/facebook.ts` still uses authenticated HTTP requests, which remains correct.
The search path parses embedded script JSON and looks for `marketplace_search.feed_units.edges`.
The item-detail path is centered on legacy extraction paths such as:
The current implementation in `packages/core/src/scrapers/facebook.ts` still uses
authenticated HTTP requests, which remains correct.
The search path parses embedded script JSON and looks for
`marketplace_search.feed_units.edges`. The item-detail path is centered on legacy
extraction paths such as:
- `parsed.require[0][3].__bbox.result.data.viewer.marketplace_product_details_page.target`
- nested `__bbox.require[...]` variations
- recursive search through `parsed.require`
Live evidence gathered earlier in this session and by the isolated research subagent shows that current Facebook Marketplace pages are Comet route-driven and expose markers such as:
Live evidence gathered earlier in this session and by the isolated research subagent
shows that current Facebook Marketplace pages are Comet route-driven and expose markers
such as:
- `XCometMarketplaceSearchController`
- `XCometMarketplacePermalinkController`
@@ -41,7 +52,9 @@ Live evidence gathered earlier in this session and by the isolated research suba
- `data-sjs`
- `data-btmanifest`
The same live investigation also showed that authenticated item pages no longer expose the old `marketplace_product_details_page` marker reliably, while live search still returns usable results.
The same live investigation also showed that authenticated item pages no longer expose
the old `marketplace_product_details_page` marker reliably, while live search still
returns usable results.
## Chosen Approach
@@ -52,9 +65,11 @@ The scraper will:
1. Fetch authenticated HTML directly.
2. Classify the response using current route and auth markers.
3. Parse inline bootstrap/state payloads using route-specific probes.
4. Fall back to rendered-HTML extraction only when bootstrap markers are present but the payload cannot be decoded into the expected search or item shape.
4. Fall back to rendered-HTML extraction only when bootstrap markers are present but the
payload cannot be decoded into the expected search or item shape.
This keeps the cheaper direct-HTTP transport while shifting the parser contract from legacy page-object names to current Comet route structure.
This keeps the cheaper direct-HTTP transport while shifting the parser contract from
legacy page-object names to current Comet route structure.
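The classification step can be sketched roughly as below. Only the `search` class name and the two controller markers appear in this design; the other class names, the `MarkerSets` shape, and the login/unavailable marker strings used in the usage lines are hypothetical placeholders, not observed Facebook markers.

```typescript
// Only "search" is named in the design; the other labels are assumptions.
type ResponseClass = "search" | "item" | "login" | "unavailable" | "unknown";

// Marker lists are supplied by the caller so no marker is hard-coded here.
interface MarkerSets {
  login: string[];
  unavailable: string[];
  search: string[];
  item: string[];
}

function containsAny(html: string, markers: string[]): boolean {
  return markers.some((marker) => html.indexOf(marker) !== -1);
}

// Auth-gated and unavailable pages are checked first so they are
// never handed to the route-specific parsers.
function classifyResponse(html: string, markers: MarkerSets): ResponseClass {
  if (containsAny(html, markers.login)) return "login";
  if (containsAny(html, markers.unavailable)) return "unavailable";
  if (containsAny(html, markers.search)) return "search";
  if (containsAny(html, markers.item)) return "item";
  return "unknown";
}

// Usage: the controller names come from the design; the first two
// marker lists are placeholders for whatever live fixtures show.
const exampleMarkers: MarkerSets = {
  login: ["placeholder-login-marker"],
  unavailable: ["placeholder-unavailable-marker"],
  search: ["XCometMarketplaceSearchController"],
  item: ["XCometMarketplacePermalinkController"],
};
const exampleClass = classifyResponse(
  '<script>{"route":"XCometMarketplaceSearchController"}</script>',
  exampleMarkers,
);
```

Keeping the marker lists as data rather than code should make it cheaper to update them when fixtures drift.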
## Design
@@ -88,7 +103,8 @@ Primary behavior:
- fetch the Marketplace search HTML with auth cookies
- confirm the response class is `search`
- extract inline bootstrap/state blobs from script tags and page attributes
- probe for route-specific search payloads associated with `XCometMarketplaceSearchController`
- probe for route-specific search payloads associated with
`XCometMarketplaceSearchController`
- map decoded search results into summary listing records
Search summary fields should remain aligned with the current public output shape:
@@ -102,7 +118,8 @@ Search summary fields should remain aligned with the current public output shape
Fallback behavior:
- if search route markers are present but structured payload decoding fails, extract listing summaries from rendered HTML anchors and text patterns
- if search route markers are present but structured payload decoding fails, extract
listing summaries from rendered HTML anchors and text patterns
- use item links matching `/marketplace/item/<id>` as the anchor for fallback extraction
- treat fallback results as summary-only data, not rich detail data
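As a rough illustration of the link-anchored fallback, the sketch below collects unique listing IDs from rendered HTML. The function name is hypothetical, and it assumes item IDs are purely numeric, which this design does not state.

```typescript
// Collect unique listing IDs by matching `/marketplace/item/<id>` links.
// Assumption: IDs consist of digits only.
function extractItemIdsFromHtml(html: string): string[] {
  const pattern = /\/marketplace\/item\/(\d+)/g;
  const seen = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    const id = match[1];
    if (id !== undefined) seen.add(id);
  }
  // Set preserves insertion order, so IDs come back in page order.
  return Array.from(seen);
}
```

Each ID would then serve as the anchor for locating nearby title, price, and location text, yielding summary-only records.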
@@ -132,9 +149,12 @@ Priority item fields:
Fallback behavior:
- if permalink route markers are present but no stable payload object is decodable, extract data from rendered HTML text structure
- prioritize title, price, condition, description, location text, and seller module content
- return partial item data when core user-facing fields are present rather than failing solely because deeper commerce metadata is missing
- if permalink route markers are present but no stable payload object is decodable,
extract data from rendered HTML text structure
- prioritize title, price, condition, description, location text, and seller module
content
- return partial item data when core user-facing fields are present rather than failing
solely because deeper commerce metadata is missing
### Bootstrap Parsing Strategy
@@ -151,11 +171,14 @@ Candidate discovery inputs:
- `ServerJS` / `Bootloader` inline blobs
- route controller names
Candidate scoring for search should favor objects that contain repeated result-card semantics, item IDs, listing links, titles, prices, or location summaries.
Candidate scoring for item pages should favor objects that contain singular listing semantics, title, price, condition, description, location, seller, or permalink context.
Candidate scoring for search should favor objects that contain repeated result-card
semantics, item IDs, listing links, titles, prices, or location summaries.
Candidate scoring for item pages should favor objects that contain singular listing
semantics, title, price, condition, description, location, seller, or permalink context.
The parser should not depend on one hard-coded object name surviving forever.
Instead, it should look for route-specific semantic clusters and choose the strongest candidate.
Instead, it should look for route-specific semantic clusters and choose the strongest
candidate.
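One way to picture the "strongest candidate" selection is a key-hint score, sketched below. Every name here (`scoreCandidate`, `pickStrongest`, the hint lists, the depth limit of 6) is an assumption for illustration; the real scoring may weigh repeated structures or value shapes rather than key names alone.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Gather lower-cased object keys from a decoded payload, up to a depth limit.
function collectKeys(value: Json, keys: Set<string>, depth: number): void {
  if (depth <= 0 || value === null || typeof value !== "object") return;
  if (Array.isArray(value)) {
    for (const item of value) collectKeys(item, keys, depth - 1);
    return;
  }
  for (const key of Object.keys(value)) {
    keys.add(key.toLowerCase());
    collectKeys(value[key] as Json, keys, depth - 1);
  }
}

// Score = number of semantic hints that appear as keys somewhere in the candidate.
function scoreCandidate(candidate: Json, hints: string[]): number {
  const keys = new Set<string>();
  collectKeys(candidate, keys, 6);
  let score = 0;
  for (const hint of hints) {
    if (keys.has(hint.toLowerCase())) score += 1;
  }
  return score;
}

// Return the highest-scoring candidate, or null when nothing matches any hint.
function pickStrongest(candidates: Json[], hints: string[]): Json | null {
  let best: Json | null = null;
  let bestScore = 0;
  for (const candidate of candidates) {
    const score = scoreCandidate(candidate, hints);
    if (score > bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}
```

Search and item routes would pass different hint lists, so the same ranking code serves both without naming any specific payload object.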
### Legacy Removal
@@ -166,7 +189,9 @@ Specifically:
- delete legacy-first `require` / `__bbox` navigation tables
- delete tests whose only purpose is to preserve those legacy paths
If a minimal legacy compatibility branch remains, it must be a last-resort fallback behind the new route-aware parser and should not shape test fixtures or design decisions.
If a minimal legacy compatibility branch remains, it must be a last-resort fallback
behind the new route-aware parser and should not shape test fixtures or design
decisions.
### Error Handling
@@ -178,7 +203,8 @@ Facebook responses should now fail with explicit route-aware outcomes:
4. Search or item route detected, but no decodable data found.
5. Unknown response shape.
Error messages should name the actual class of failure instead of implying that every parse miss is caused by expired cookies.
Error messages should name the actual class of failure instead of implying that every
parse miss is caused by expired cookies.
### Testing Strategy
@@ -190,11 +216,15 @@ Coverage targets:
1. Search responses classify correctly from current Comet controller markers.
2. Item responses classify correctly from current Comet controller markers.
3. Login-gated and unavailable responses are detected before parsing.
4. Search bootstrap parsing produces summary listing results from current-shape fixtures.
4. Search bootstrap parsing produces summary listing results from current-shape
fixtures.
5. Item bootstrap parsing produces rich listing details from current-shape fixtures.
6. Search fallback extraction works when route markers exist but structured payload decoding fails.
7. Item fallback extraction works when route markers exist but structured payload decoding fails.
8. Old legacy-only item fixtures are removed or rewritten so they no longer define the contract.
6. Search fallback extraction works when route markers exist but structured payload
decoding fails.
7. Item fallback extraction works when route markers exist but structured payload
decoding fails.
8. Old legacy-only item fixtures are removed or rewritten so they no longer define the
contract.
Verification target after implementation:
@@ -204,23 +234,30 @@ Verification target after implementation:
## Public API Surface
Keep the current public function names unless the rewrite proves that a signature change is required:
Keep the current public function names unless the rewrite proves that a signature change
is required:
- `fetchFacebookItems(...)`
- `fetchFacebookItem(...)`
- `extractFacebookMarketplaceData(...)`
- `extractFacebookItemData(...)`
The internals should change substantially, but callers should not need a new integration surface for this rewrite.
The internals should change substantially, but callers should not need a new integration
surface for this rewrite.
## Risks
- Facebook may change bootstrap payload naming again, so route/controller markers are more stable than exact nested object paths but still not guaranteed.
- Search and item pages may each contain multiple partial payloads, making candidate ranking important.
- Fallback rendered-HTML extraction may be noisier than bootstrap decoding and needs clear precedence rules.
- Live fixtures can drift from production quickly, so tests must model route semantics rather than exact one-off payloads where possible.
- Facebook may change bootstrap payload naming again, so route/controller markers are
more stable than exact nested object paths but still not guaranteed.
- Search and item pages may each contain multiple partial payloads, making candidate
ranking important.
- Fallback rendered-HTML extraction may be noisier than bootstrap decoding and needs
clear precedence rules.
- Live fixtures can drift from production quickly, so tests must model route semantics
rather than exact one-off payloads where possible.
## Rollout Notes
The code, fixtures, and tests should change together.
There should be no mixed state where the implementation is Comet-aware but the tests still encode `marketplace_product_details_page` as the primary contract.
There should be no mixed state where the implementation is Comet-aware but the tests
still encode `marketplace_product_details_page` as the primary contract.

@@ -2,15 +2,18 @@
## Summary
Add an optional shared result mode across Facebook, eBay, and Kijiji that moves suspiciously cheap listings out of the main results into a separate `unstableResults` bucket.
Listings are considered unstable when their price is more than 20% below the median price of the scraper's priced search results.
Add an optional shared result mode across Facebook, eBay, and Kijiji that moves
suspiciously cheap listings out of the main results into a separate `unstableResults`
bucket. Listings are considered unstable when their price is more than 20% below the
median price of the scraper's priced search results.
## Goals
- Support the same optional unstable-listing mode across all scrapers.
- Keep current default scraper and route behavior unchanged unless the mode is enabled.
- Hide unstable listings from the main results while still returning them separately.
- Implement the rule once in shared core code instead of duplicating marketplace-specific logic.
- Implement the rule once in shared core code instead of duplicating
marketplace-specific logic.
- Document the option in MCP tool descriptions so callers can discover it.
## Non-Goals
@@ -24,7 +27,8 @@ Listings are considered unstable when their price is more than 20% below the med
`packages/core` currently returns plain arrays from scraper search functions.
`packages/api-server` forwards those scraper results directly from marketplace routes.
`packages/mcp-server` documents search tools per marketplace, but does not expose or describe any result-stability mode.
`packages/mcp-server` documents search tools per marketplace, but does not expose or
describe any result-stability mode.
There is no shared result-classification utility today.
Price filtering exists in some scrapers, but not a cross-marketplace median-based split.
@@ -33,11 +37,14 @@ Price filtering exists in some scrapers, but not a cross-marketplace median-base
Use a shared core utility plus per-route and per-tool opt-in.
The shared utility will accept parsed listings, compute the median from valid positive prices, and split the data into `results` and `unstableResults`.
Each scraper will opt into that utility when the caller enables unstable-listing mode.
API routes and MCP tools will expose the same optional mode so the feature is consistently available everywhere scraper search is surfaced.
The shared utility will accept parsed listings, compute the median from valid positive
prices, and split the data into `results` and `unstableResults`. Each scraper will opt
into that utility when the caller enables unstable-listing mode.
API routes and MCP tools will expose the same optional mode so the feature is
consistently available everywhere scraper search is surfaced.
This keeps the heuristic centralized, minimizes duplicated logic, and preserves existing consumers by leaving the default path unchanged.
This keeps the heuristic centralized, minimizes duplicated logic, and preserves existing
consumers by leaving the default path unchanged.
## Design
@@ -48,14 +55,16 @@ Add a shared utility in `packages/core` for listing stability classification.
Responsibilities:
- accept parsed listing arrays with `listingPrice.cents`
- ignore listings whose price is missing, non-numeric, or non-positive when computing the median
- ignore listings whose price is missing, non-numeric, or non-positive when computing
the median
- compute the median price from valid priced listings
- classify listings as unstable when `listingPrice.cents < median * 0.8`
- return an object with:
- `results`: listings that remain in the main bucket
- `unstableResults`: listings moved out of the main bucket
Listings excluded from median computation because their price is missing or non-positive remain in `results` unchanged.
Listings excluded from median computation because their price is missing or non-positive
remain in `results` unchanged.
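The responsibilities above could be sketched as follows. The `PricedListing` shape beyond `listingPrice.cents`, and the names `splitUnstableListings`, `validCents`, and `median`, are assumptions; the `0.8` factor and the strict `<` comparison come from the rule stated above.

```typescript
// Assumed minimal listing shape: the design only specifies `listingPrice.cents`.
interface PricedListing {
  listingPrice?: { cents?: number | null } | null;
}

interface StabilitySplit<T> {
  results: T[];
  unstableResults: T[];
}

// Return the price in cents when it is a finite positive number, else null.
function validCents(listing: PricedListing): number | null {
  const cents = listing.listingPrice?.cents;
  return typeof cents === "number" && isFinite(cents) && cents > 0 ? cents : null;
}

// Median of a non-empty, ascending-sorted array.
function median(sorted: number[]): number {
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]!
    : (sorted[mid - 1]! + sorted[mid]!) / 2;
}

function splitUnstableListings<T extends PricedListing>(
  listings: T[],
): StabilitySplit<T> {
  const prices = listings
    .map(validCents)
    .filter((cents): cents is number => cents !== null)
    .sort((a, b) => a - b);

  // No valid prices: nothing can be classified, so everything stays in results.
  if (prices.length === 0) {
    return { results: listings.slice(), unstableResults: [] };
  }

  const cutoff = median(prices) * 0.8;
  const results: T[] = [];
  const unstableResults: T[] = [];
  for (const listing of listings) {
    const cents = validCents(listing);
    // Unpriced or non-positive listings always remain in the main bucket.
    if (cents !== null && cents < cutoff) unstableResults.push(listing);
    else results.push(listing);
  }
  return { results, unstableResults };
}
```

With a single valid price the median equals that price, so it can never fall below 80% of itself and the lone listing stays in `results` without needing a special case.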
### Scraper Integration
@@ -68,7 +77,8 @@ Default behavior:
Opt-in behavior:
- run the shared classification utility after parsing search results
- classify before final result limiting so unstable items do not consume main-result slots
- classify before final result limiting so unstable items do not consume main-result
slots
- return an object shaped like:
```ts
@@ -82,7 +92,8 @@ Each scraper will use its existing concrete listing subtype for these arrays.
### API Surface
Marketplace API routes will expose an optional query parameter for unstable-listing mode.
Marketplace API routes will expose an optional query parameter for unstable-listing
mode.
Requirements:
@@ -90,7 +101,8 @@ Requirements:
- when enabled, return the object payload with `results` and `unstableResults`
- use the same semantics across Facebook, eBay, and Kijiji routes
The exact parameter name should be consistent across routes and intentionally describe the behavior, for example `unstableFilter=true`.
The exact parameter name should be consistent across routes and intentionally describe
the behavior, for example `unstableFilter=true`.
### MCP Surface
@@ -100,34 +112,43 @@ Tool descriptions should explicitly document:
- that the option is optional
- that it moves listings priced more than 20% below the median into `unstableResults`
- that enabling it changes the response shape from a plain list to an object with `results` and `unstableResults`
- that enabling it changes the response shape from a plain list to an object with
`results` and `unstableResults`
- that the behavior is available for Facebook, eBay, and Kijiji search tools
The wording should be aligned across all three tools so the feature reads as one shared capability.
The wording should be aligned across all three tools so the feature reads as one shared
capability.
### Error Handling
The unstable-listing mode should be best-effort and non-failing.
- If there are no valid positive prices, return all listings in `results` and an empty `unstableResults` array.
- If there are no valid positive prices, return all listings in `results` and an empty
`unstableResults` array.
- If there is only one valid priced listing, do not classify it as unstable.
- Parsing failures remain governed by existing scraper behavior; the classification layer should not introduce new scraper-specific errors.
- Parsing failures remain governed by existing scraper behavior; the classification
layer should not introduce new scraper-specific errors.
### Testing Strategy
Follow TDD.
Start with shared utility tests, then wire the option through scraper and route tests.
Follow TDD. Start with shared utility tests, then wire the option through scraper and
route tests.
Coverage targets:
1. Median calculation for odd-sized valid price sets.
2. Median calculation for even-sized valid price sets.
3. Strict cutoff behavior where only listings with `price < median * 0.8` move to `unstableResults`.
4. Missing, invalid, zero, or negative prices are excluded from median computation and remain in `results`.
3. Strict cutoff behavior where only listings with `price < median * 0.8` move to
`unstableResults`.
4. Missing, invalid, zero, or negative prices are excluded from median computation and
remain in `results`.
5. Default scraper behavior still returns plain arrays when the option is disabled.
6. Enabled scraper behavior returns `{ results, unstableResults }` for Facebook, eBay, and Kijiji.
7. API routes preserve existing response shapes by default and switch to the object payload only when enabled.
8. MCP tool metadata documents the new optional mode for all three marketplace search tools.
6. Enabled scraper behavior returns `{ results, unstableResults }` for Facebook, eBay,
and Kijiji.
7. API routes preserve existing response shapes by default and switch to the object
payload only when enabled.
8. MCP tool metadata documents the new optional mode for all three marketplace search
tools.
Verification target after implementation:
@@ -138,11 +159,15 @@ Verification target after implementation:
## Risks
- The optional mode introduces a union return shape for scraper callers, which can ripple into downstream TypeScript signatures.
- Applying classification before final limiting changes which items appear in the main bucket compared with a naive post-limit split.
- Kijiji and eBay may have different mixes of priced and unpriced results, so excluding non-positive prices from the median must remain explicit and tested.
- The optional mode introduces a union return shape for scraper callers, which can
ripple into downstream TypeScript signatures.
- Applying classification before final limiting changes which items appear in the main
bucket compared with a naive post-limit split.
- Kijiji and eBay may have different mixes of priced and unpriced results, so excluding
non-positive prices from the median must remain explicit and tested.
## Rollout Notes
Land the shared classifier, scraper wiring, route wiring, tests, and MCP description updates together.
That avoids a partial rollout where the feature exists in one surface but is undocumented or inconsistent elsewhere.
Land the shared classifier, scraper wiring, route wiring, tests, and MCP description
updates together. That avoids a partial rollout where the feature exists in one surface
but is undocumented or inconsistent elsewhere.

@@ -2,25 +2,32 @@
## Summary
Add explicit live endpoint tests for each core scraper parser path. These tests are excluded from normal deterministic test commands and run only through a dedicated package script.
Add explicit live endpoint tests for each core scraper parser path.
These tests are excluded from normal deterministic test commands and run only through a
dedicated package script.
## Scope
- Add one live suite per parser: eBay, Kijiji, Facebook.
- Place suites under `packages/core/test/live/` so normal `bun test packages/core/test/*.test.ts` patterns do not include them accidentally.
- Place suites under `packages/core/test/live/` so normal
`bun test packages/core/test/*.test.ts` patterns do not include them accidentally.
- Add a root `test:live` script that runs all live suites together.
- Keep existing mocked tests unchanged.
## Behavior
- Each suite calls the public scraper entry point for that marketplace with a narrow query and low max item count.
- Assertions verify scrape output shape and parser viability, not exact listing identity.
- Each suite calls the public scraper entry point for that marketplace with a narrow
query and low max item count.
- Assertions verify scrape output shape and parser viability, not exact listing
identity.
- eBay and Kijiji require live network access and fail on endpoint/parser breakage.
- Facebook is strict: missing or expired `FACEBOOK_COOKIE` fails the live suite instead of skipping.
- Facebook is strict: missing or expired `FACEBOOK_COOKIE` fails the live suite instead
of skipping.
## Test Data
- Use stable broad Canadian queries such as `iphone` or `laptop` to reduce empty-result risk.
- Use stable broad Canadian queries such as `iphone` or `laptop` to reduce empty-result
risk.
- Use low limits to avoid unnecessary load and rate-limit pressure.
- Avoid exact prices, titles, listing IDs, or ordering assumptions.