Compare commits

29 Commits

Author SHA1 Message Date
49e90d45f8 docs: expose unstable mode in mcp tools 2026-04-28 19:03:42 -04:00
b6456047a6 feat: add maxItems support to ebay scraper 2026-04-27 10:56:23 -04:00
02b3f805b2 fix: use explicit conditional calls and validate negative params 2026-04-27 10:46:06 -04:00
a1af5d2630 fix: align ebay route with spec and validate params 2026-04-27 09:56:39 -04:00
77b9fc9934 fix: validate route params and reduce duplication 2026-04-27 09:45:47 -04:00
a802035ca4 fix: correct empty-result and maxItems handling in routes 2026-04-27 09:34:08 -04:00
974190de6b fix: preserve maxItems limit in unstable mode 2026-04-27 08:57:48 -04:00
3c38232cd5 feat: expose unstable mode in api routes 2026-04-27 02:49:35 -04:00
224e83ac4c fix: correct ebay title filtering and type contracts 2026-04-27 02:04:48 -04:00
b73faa35da fix: respect scraper pacing details 2026-04-27 00:13:42 -04:00
0f77155c8d fix: align marketplace price filter parsing 2026-04-23 11:14:57 -04:00
10c2856bf6 fix: tighten item price and pacing behavior 2026-04-23 10:59:33 -04:00
9c8643086a fix: refine scraper output behavior 2026-04-23 10:43:38 -04:00
244a88e63c fix: harden scraper price parsing 2026-04-23 10:31:08 -04:00
807849e257 fix: expose ebay unstable mode typing 2026-04-23 05:47:50 -04:00
eb37e8814e fix: preserve free results and request pacing 2026-04-23 05:40:42 -04:00
13c0fec305 fix: tighten scraper type contracts 2026-04-23 05:28:46 -04:00
08d59ab497 fix: tighten ebay result parsing 2026-04-23 05:13:40 -04:00
0a0723a560 fix: respect filtered result sets in unstable mode 2026-04-23 05:03:26 -04:00
881c2ddf8c fix: finalize scraper unstable mode integration 2026-04-23 00:20:21 -04:00
55faee7dd5 fix: cover scraper pricing edge cases 2026-04-22 23:54:07 -04:00
b5e14e686a fix: tighten scraper edge case handling 2026-04-22 23:46:52 -04:00
6f9d4db419 fix: tighten scraper parsing behavior 2026-04-22 23:41:08 -04:00
08edfa8097 fix: align scraper unstable mode behavior 2026-04-22 23:36:00 -04:00
c7fc8352ac fix: preserve default scraper result contracts 2026-04-22 23:30:17 -04:00
1ee41fb346 feat: add unstable mode to scraper results 2026-04-22 23:23:31 -04:00
8141de5b4b feat: add shared unstable listing classifier 2026-04-22 17:56:26 -04:00
f8975fa91d docs: add unstable listing mode plan 2026-04-22 17:53:45 -04:00
cb5e1e62d2 docs: add unstable listing mode design 2026-04-22 17:51:07 -04:00
19 changed files with 4183 additions and 140 deletions

@@ -0,0 +1,672 @@
# Unstable Listing Mode Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Add an optional shared mode across Facebook, eBay, and Kijiji that moves listings priced below 80% of the median into `unstableResults`, while preserving current default response shapes.

**Architecture:** Introduce a shared generic classifier in `packages/core` that splits any listing array into `results` and `unstableResults` using the same median-based rule. Then thread one opt-in flag through the scraper entrypoints, API routes, and MCP tool definitions so all surfaces expose the same behavior without changing existing defaults.

**Tech Stack:** Bun, TypeScript, Bun test, workspace packages, JSON-RPC MCP server

---
## File Map
- Create: `packages/core/src/utils/unstable.ts`
Purpose: shared generic median/cutoff classifier for listing arrays.
- Modify: `packages/core/src/types/common.ts`
Purpose: add shared mode types used by scrapers and adapters.
- Modify: `packages/core/src/index.ts`
Purpose: export the new shared classifier/types.
- Modify: `packages/core/src/scrapers/facebook.ts`
Purpose: add the optional mode flag and return bucketed results when enabled.
- Modify: `packages/core/src/scrapers/ebay.ts`
Purpose: add the optional mode flag and return bucketed results when enabled.
- Modify: `packages/core/src/scrapers/kijiji.ts`
Purpose: add the optional mode flag and return bucketed results when enabled.
- Create: `packages/core/test/unstable-listing-mode.test.ts`
Purpose: lock the shared classifier behavior with direct unit tests.
- Modify: `packages/core/test/facebook-core.test.ts`
Purpose: prove Facebook preserves default arrays and returns buckets when enabled.
- Modify: `packages/core/test/ebay-core.test.ts`
Purpose: prove eBay preserves default arrays and returns buckets when enabled.
- Modify: `packages/core/test/kijiji-core.test.ts`
Purpose: prove Kijiji preserves default arrays and returns buckets when enabled.
- Modify: `packages/api-server/src/routes/facebook.ts`
Purpose: expose a shared opt-in query parameter and preserve default response shape.
- Modify: `packages/api-server/src/routes/ebay.ts`
Purpose: expose the same query parameter and preserve default response shape.
- Modify: `packages/api-server/src/routes/kijiji.ts`
Purpose: expose the same query parameter and preserve default response shape.
- Modify: `packages/api-server/test/routes.test.ts`
Purpose: verify route forwarding and route response-shape switching.
- Modify: `packages/mcp-server/src/protocol/tools.ts`
Purpose: document the optional unstable mode in all search tools.
- Modify: `packages/mcp-server/src/protocol/handler.ts`
Purpose: forward the optional mode to API routes for all search tools.
- Modify: `packages/mcp-server/test/protocol.test.ts`
Purpose: verify MCP tool metadata and forwarded URLs include the new option.
### Task 1: Add the shared unstable-listing classifier
**Files:**
- Create: `packages/core/src/utils/unstable.ts`
- Modify: `packages/core/src/types/common.ts`
- Modify: `packages/core/src/index.ts`
- Test: `packages/core/test/unstable-listing-mode.test.ts`
- [ ] **Step 1: Write the failing test**
Create `packages/core/test/unstable-listing-mode.test.ts` with focused shared-behavior coverage:
```ts
import { describe, expect, test } from "bun:test";
import {
  classifyUnstableListings,
  type ListingDetails,
} from "../src/index";

function makeListing(title: string, cents?: number): ListingDetails {
  return {
    url: `https://example.com/${title}`,
    title,
    listingPrice: {
      amountFormatted: cents ? `$${(cents / 100).toFixed(2)}` : "$0.00",
      cents: cents ?? 0,
      currency: "CAD",
    },
    listingType: "item",
    listingStatus: "ACTIVE",
  };
}

describe("classifyUnstableListings", () => {
  test("moves listings below 80% of the median into unstableResults", () => {
    const output = classifyUnstableListings([
      makeListing("cheap", 1000),
      makeListing("mid", 2000),
      makeListing("high", 3000),
    ]);
    expect(output.results.map((item) => item.title)).toEqual(["mid", "high"]);
    expect(output.unstableResults.map((item) => item.title)).toEqual(["cheap"]);
  });

  test("uses the midpoint median for even-sized priced inputs", () => {
    const output = classifyUnstableListings([
      makeListing("a", 1000),
      makeListing("b", 2000),
      makeListing("c", 3000),
      makeListing("d", 4000),
    ]);
    expect(output.results.map((item) => item.title)).toEqual(["b", "c", "d"]);
    expect(output.unstableResults.map((item) => item.title)).toEqual(["a"]);
  });

  test("keeps non-positive prices in results while excluding them from median input", () => {
    const output = classifyUnstableListings([
      makeListing("free", 0),
      makeListing("cheap", 1000),
      makeListing("mid", 2000),
      makeListing("high", 3000),
    ]);
    expect(output.results.map((item) => item.title)).toEqual(["free", "mid", "high"]);
    expect(output.unstableResults.map((item) => item.title)).toEqual(["cheap"]);
  });

  test("returns all listings as results when fewer than two valid prices exist", () => {
    const output = classifyUnstableListings([makeListing("only", 2500)]);
    expect(output.results.map((item) => item.title)).toEqual(["only"]);
    expect(output.unstableResults).toEqual([]);
  });
});
```
- [ ] **Step 2: Run test to verify it fails**
Run: `bun test packages/core/test/unstable-listing-mode.test.ts`
Expected: FAIL because `classifyUnstableListings` and the shared mode types do not exist yet.
- [ ] **Step 3: Write minimal implementation**
Add shared types in `packages/core/src/types/common.ts`:
```ts
export interface UnstableListingBuckets<T> {
  results: T[];
  unstableResults: T[];
}

export interface UnstableListingModeOptions {
  hideUnstableResults?: boolean;
}
```
Create `packages/core/src/utils/unstable.ts` with the shared classifier:
```ts
import type { ListingDetails, UnstableListingBuckets } from "../types/common";

// A price takes part in classification only if it is a finite, positive number.
// Written as a type guard so `cents` narrows to `number` under strict null checks.
function isValidPrice(cents: unknown): cents is number {
  return typeof cents === "number" && Number.isFinite(cents) && cents > 0;
}

// Median of the given values, or null when fewer than two values exist.
function getMedian(values: number[]): number | null {
  if (values.length < 2) return null;
  const sorted = [...values].sort((a, b) => a - b);
  const middle = Math.floor(sorted.length / 2);
  if (sorted.length % 2 === 0) {
    return (sorted[middle - 1] + sorted[middle]) / 2;
  }
  return sorted[middle];
}

export function classifyUnstableListings<T extends ListingDetails>(
  listings: T[],
): UnstableListingBuckets<T> {
  const pricedValues = listings
    .map((listing) => listing.listingPrice?.cents)
    .filter(isValidPrice);
  const median = getMedian(pricedValues);
  if (median == null) {
    return { results: listings, unstableResults: [] };
  }

  // Strict cutoff: only prices below 80% of the median count as unstable.
  const threshold = median * 0.8;
  const results: T[] = [];
  const unstableResults: T[] = [];
  for (const listing of listings) {
    const cents = listing.listingPrice?.cents;
    if (isValidPrice(cents) && cents < threshold) {
      unstableResults.push(listing);
      continue;
    }
    results.push(listing);
  }
  return { results, unstableResults };
}
```
Export the new symbols from `packages/core/src/index.ts`:
```ts
export * from "./types/common";
export { classifyUnstableListings } from "./utils/unstable";
```
- [ ] **Step 4: Run test to verify it passes**
Run: `bun test packages/core/test/unstable-listing-mode.test.ts`
Expected: PASS with 4 passing tests.
- [ ] **Step 5: Commit**
```bash
git add packages/core/src/utils/unstable.ts packages/core/src/types/common.ts packages/core/src/index.ts packages/core/test/unstable-listing-mode.test.ts
git commit -m "feat: add shared unstable listing classifier"
```
### Task 2: Thread the optional mode through all core scrapers
**Files:**
- Modify: `packages/core/src/scrapers/facebook.ts`
- Modify: `packages/core/src/scrapers/ebay.ts`
- Modify: `packages/core/src/scrapers/kijiji.ts`
- Modify: `packages/core/test/facebook-core.test.ts`
- Modify: `packages/core/test/ebay-core.test.ts`
- Modify: `packages/core/test/kijiji-core.test.ts`
- [ ] **Step 1: Write the failing tests**
Add one focused opt-in test per scraper. Use the new shared classifier through the public scraper entrypoints instead of testing internal helpers.
In `packages/core/test/facebook-core.test.ts`, add:
```ts
test("fetchFacebookItems returns stable and unstable buckets when unstable mode is enabled", async () => {
  process.env.FACEBOOK_COOKIE = "c_user=123; xs=abc";
  global.fetch = mock(() =>
    Promise.resolve({
      ok: true,
      text: () => Promise.resolve(facebookSearchHtmlFixture),
      headers: { get: () => null },
    }),
  );
  const result = await fetchFacebookItems("bike", 1, "toronto", 25, {
    hideUnstableResults: true,
  });
  expect(result).toHaveProperty("results");
  expect(result).toHaveProperty("unstableResults");
});
```
In `packages/core/test/ebay-core.test.ts`, add:
```ts
test("fetchEbayItems returns stable and unstable buckets when unstable mode is enabled", async () => {
  const result = await fetchEbayItems("bike", 1, {
    keywords: ["bike"],
    exclusions: [],
    strictMode: false,
    buyItNowOnly: true,
    canadaOnly: true,
  }, {
    hideUnstableResults: true,
  });
  expect(result).toHaveProperty("results");
  expect(result).toHaveProperty("unstableResults");
});
```
In `packages/core/test/kijiji-core.test.ts`, add:
```ts
test("fetchKijijiItems returns stable and unstable buckets when unstable mode is enabled", async () => {
  const result = await fetchKijijiItems(
    "bike",
    1,
    "https://www.kijiji.ca",
    { maxPages: 1 },
    {},
    { hideUnstableResults: true },
  );
  expect(result).toHaveProperty("results");
  expect(result).toHaveProperty("unstableResults");
});
```
Also add one default-mode assertion in one existing scraper test file, for example in `packages/core/test/facebook-core.test.ts`:
```ts
test("fetchFacebookItems keeps returning an array by default", async () => {
  process.env.FACEBOOK_COOKIE = "c_user=123; xs=abc";
  global.fetch = mock(() =>
    Promise.resolve({
      ok: true,
      text: () => Promise.resolve(facebookSearchHtmlFixture),
      headers: { get: () => null },
    }),
  );
  const result = await fetchFacebookItems("bike");
  expect(Array.isArray(result)).toBe(true);
});
```
- [ ] **Step 2: Run tests to verify they fail**
Run: `bun test packages/core/test/facebook-core.test.ts packages/core/test/ebay-core.test.ts packages/core/test/kijiji-core.test.ts`
Expected: FAIL because the scraper signatures do not yet accept the new option and still always return arrays.
- [ ] **Step 3: Write minimal implementation**
Add a small shared helper type import to each scraper:
```ts
import {
  classifyUnstableListings,
  type UnstableListingBuckets,
  type UnstableListingModeOptions,
} from "../index";
```
In `packages/core/src/scrapers/facebook.ts`, extend the default export signature and branch at the end:
```ts
export default async function fetchFacebookItems(
  SEARCH_QUERY: string,
  REQUESTS_PER_SECOND = 1,
  LOCATION = "toronto",
  MAX_ITEMS = 25,
  unstableOptions: UnstableListingModeOptions = {},
): Promise<FacebookListingDetails[] | UnstableListingBuckets<FacebookListingDetails>> {
  // existing fetch/parsing logic
  const limitedItems = pricedItems.slice(0, MAX_ITEMS);
  if (!unstableOptions.hideUnstableResults) {
    return limitedItems;
  }
  const classified = classifyUnstableListings(pricedItems);
  return {
    results: classified.results.slice(0, MAX_ITEMS),
    unstableResults: classified.unstableResults,
  };
}
```
In `packages/core/src/scrapers/ebay.ts`, extend the entrypoint the same way:
```ts
export default async function fetchEbayItems(
  SEARCH_QUERY: string,
  REQUESTS_PER_SECOND = 1,
  options: EbaySearchOptions = {},
  unstableOptions: UnstableListingModeOptions = {},
): Promise<EbayListingDetails[] | UnstableListingBuckets<EbayListingDetails>> {
  // existing fetch/parsing logic
  const limitedResults = maxItems ? listings.slice(0, maxItems) : listings;
  if (!unstableOptions.hideUnstableResults) {
    return limitedResults;
  }
  const classified = classifyUnstableListings(listings);
  return {
    results: maxItems ? classified.results.slice(0, maxItems) : classified.results,
    unstableResults: classified.unstableResults,
  };
}
```
In `packages/core/src/scrapers/kijiji.ts`, add the same final argument after `listingOptions`:
```ts
export default async function fetchKijijiItems(
  SEARCH_QUERY: string,
  REQUESTS_PER_SECOND = 1,
  BASE_URL = "https://www.kijiji.ca",
  searchOptions: SearchOptions = {},
  listingOptions: ListingFetchOptions = {},
  unstableOptions: UnstableListingModeOptions = {},
): Promise<DetailedListing[] | UnstableListingBuckets<DetailedListing>> {
  // existing fetch/parsing logic
  if (!unstableOptions.hideUnstableResults) {
    return allListings;
  }
  return classifyUnstableListings(allListings);
}
```
Keep the default branch untouched in all three files so existing callers still receive arrays.
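The union return type above means every caller has to narrow the result before using it. One possible way to soften that, shown here only as a sketch and not as what the plan prescribes, is to declare overloads so the literal `true` flag selects the bucket shape. `Listing`, `Buckets`, and `searchSketch` are hypothetical stand-ins for the real listing types and scraper entrypoints, and the function is synchronous purely to keep the example short.

```typescript
// Hypothetical stand-ins for the real listing and bucket types.
interface Listing {
  title: string;
  cents: number;
}
interface Buckets<T> {
  results: T[];
  unstableResults: T[];
}

// Overloads: a literal `true` flag yields buckets, anything else yields an array.
function searchSketch(query: string, options: { hideUnstableResults: true }): Buckets<Listing>;
function searchSketch(query: string, options?: { hideUnstableResults?: false }): Listing[];
function searchSketch(
  query: string,
  options: { hideUnstableResults?: boolean } = {},
): Listing[] | Buckets<Listing> {
  // Stubbed data; a real scraper would fetch and parse listings here.
  const listings: Listing[] = [
    { title: `${query} A`, cents: 1000 },
    { title: `${query} B`, cents: 2000 },
  ];
  if (!options.hideUnstableResults) return listings;
  // Stubbed split; a real scraper would call the shared classifier instead.
  return { results: listings, unstableResults: [] };
}

const plainResult = searchSketch("bike"); // typed as Listing[]
const bucketedResult = searchSketch("bike", { hideUnstableResults: true }); // typed as Buckets<Listing>
```

With overloads like these, a plain `boolean` variable matches neither signature, so a caller would need to branch on the flag and make two explicit calls.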
- [ ] **Step 4: Run tests to verify they pass**
Run: `bun test packages/core/test/unstable-listing-mode.test.ts packages/core/test/facebook-core.test.ts packages/core/test/ebay-core.test.ts packages/core/test/kijiji-core.test.ts`
Expected: PASS, including the new opt-in bucket assertions and the default-array regression assertion.
- [ ] **Step 5: Commit**
```bash
git add packages/core/src/scrapers/facebook.ts packages/core/src/scrapers/ebay.ts packages/core/src/scrapers/kijiji.ts packages/core/test/facebook-core.test.ts packages/core/test/ebay-core.test.ts packages/core/test/kijiji-core.test.ts
git commit -m "feat: add unstable mode to scraper results"
```
### Task 3: Expose unstable mode in API routes
**Files:**
- Modify: `packages/api-server/src/routes/facebook.ts`
- Modify: `packages/api-server/src/routes/ebay.ts`
- Modify: `packages/api-server/src/routes/kijiji.ts`
- Modify: `packages/api-server/test/routes.test.ts`
- [ ] **Step 1: Write the failing tests**
Extend `packages/api-server/test/routes.test.ts` with route-forwarding coverage for the new query parameter:
```ts
test("facebookRoute forwards unstableFilter=true to core", async () => {
  const { facebookRoute } = await import("../src/routes/facebook");
  await facebookRoute(
    new Request(
      "http://localhost/api/facebook?q=laptop&location=toronto&maxItems=3&unstableFilter=true",
    ),
  );
  expect(fetchFacebookItems).toHaveBeenCalledWith(
    "laptop",
    1,
    "toronto",
    3,
    { hideUnstableResults: true },
  );
});

test("ebayRoute forwards unstableFilter=true to core", async () => {
  const { ebayRoute } = await import("../src/routes/ebay");
  await ebayRoute(
    new Request("http://localhost/api/ebay?q=laptop&unstableFilter=true"),
  );
  expect(fetchEbayItems).toHaveBeenCalledWith(
    "laptop",
    1,
    {
      minPrice: undefined,
      maxPrice: undefined,
      strictMode: false,
      exclusions: [],
      keywords: ["laptop"],
      buyItNowOnly: true,
      canadaOnly: true,
    },
    { hideUnstableResults: true },
  );
});

test("kijijiRoute forwards unstableFilter=true to core", async () => {
  const { kijijiRoute } = await import("../src/routes/kijiji");
  await kijijiRoute(
    new Request("http://localhost/api/kijiji?q=laptop&unstableFilter=true"),
  );
  expect(fetchKijijiItems).toHaveBeenCalledWith(
    "laptop",
    4,
    "https://www.kijiji.ca",
    expect.any(Object),
    {},
    { hideUnstableResults: true },
  );
});
```
- [ ] **Step 2: Run tests to verify they fail**
Run: `bun test packages/api-server/test/routes.test.ts`
Expected: FAIL because the routes do not yet parse or forward `unstableFilter`.
- [ ] **Step 3: Write minimal implementation**
In each route, parse the shared boolean once:
```ts
const hideUnstableResults = reqUrl.searchParams.get("unstableFilter") === "true";
```
Update the core calls to forward the shared option.
In `packages/api-server/src/routes/facebook.ts`:
```ts
const items = await fetchFacebookItems(SEARCH_QUERY, 1, LOCATION, maxItems, {
  hideUnstableResults,
});
```
In `packages/api-server/src/routes/ebay.ts`:
```ts
const items = await fetchEbayItems(
  SEARCH_QUERY,
  1,
  {
    minPrice,
    maxPrice,
    strictMode,
    exclusions,
    keywords,
    buyItNowOnly,
    canadaOnly,
  },
  { hideUnstableResults },
);
```
In `packages/api-server/src/routes/kijiji.ts`:
```ts
const items = await fetchKijijiItems(
  SEARCH_QUERY,
  4,
  "https://www.kijiji.ca",
  searchOptions,
  {},
  { hideUnstableResults },
);
```
Do not add any response wrapper logic in the routes; simply return whatever the core scraper returns so the default array path remains unchanged.
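Passing the scraper result straight through still leaves the routes' existing "no results" check to consider, because that check reads `.length` and the bucket object has none. A shape-aware emptiness test is one way to keep the 404 path working in both modes. The sketch below assumes hypothetical names (`Buckets`, `SearchPayload`, `isEmptyPayload`) and is not taken from the route code.

```typescript
// Hypothetical bucket shape mirroring { results, unstableResults }.
interface Buckets<T> {
  results: T[];
  unstableResults: T[];
}
type SearchPayload<T> = T[] | Buckets<T>;

// True when the payload carries no listings in either response shape.
function isEmptyPayload<T>(payload: SearchPayload<T> | null | undefined): boolean {
  if (!payload) return true;
  if (Array.isArray(payload)) return payload.length === 0;
  return payload.results.length === 0 && payload.unstableResults.length === 0;
}

const emptyArrayIsEmpty = isEmptyPayload<string>([]); // true
const emptyBucketsAreEmpty = isEmptyPayload<string>({ results: [], unstableResults: [] }); // true
const onlyUnstableIsEmpty = isEmptyPayload({ results: [], unstableResults: ["cheap"] }); // false
```

Treating a payload that holds only unstable listings as non-empty means the caller still receives those listings rather than a 404.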
- [ ] **Step 4: Run tests to verify they pass**
Run: `bun test packages/api-server/test/routes.test.ts`
Expected: PASS, including existing cookie-parameter regression tests and the new unstable-mode forwarding assertions.
- [ ] **Step 5: Commit**
```bash
git add packages/api-server/src/routes/facebook.ts packages/api-server/src/routes/ebay.ts packages/api-server/src/routes/kijiji.ts packages/api-server/test/routes.test.ts
git commit -m "feat: expose unstable mode in api routes"
```
### Task 4: Document and forward unstable mode in MCP tools
**Files:**
- Modify: `packages/mcp-server/src/protocol/tools.ts`
- Modify: `packages/mcp-server/src/protocol/handler.ts`
- Modify: `packages/mcp-server/test/protocol.test.ts`
- [ ] **Step 1: Write the failing tests**
Extend `packages/mcp-server/test/protocol.test.ts` with metadata and forwarding coverage:
```ts
test("search tools document unstable listing mode", () => {
  for (const toolName of ["search_kijiji", "search_facebook", "search_ebay"]) {
    const tool = tools.find((entry) => entry.name === toolName);
    expect(tool?.inputSchema.properties).toHaveProperty("unstableFilter");
    expect(tool?.inputSchema.properties.unstableFilter.description).toContain(
      "20% below the median",
    );
    expect(tool?.inputSchema.properties.unstableFilter.description).toContain(
      "unstableResults",
    );
  }
});

test("search_facebook forwards unstableFilter to the API", async () => {
  await handleMcpRequest(
    new Request("http://localhost", {
      method: "POST",
      body: JSON.stringify({
        jsonrpc: "2.0",
        id: 1,
        method: "tools/call",
        params: {
          name: "search_facebook",
          arguments: {
            query: "laptop",
            unstableFilter: true,
          },
        },
      }),
    }),
  );
  const calledUrl = (global.fetch as ReturnType<typeof mock>).mock.calls[0]?.[0];
  expect(String(calledUrl)).toContain("unstableFilter=true");
});
```
Mirror the forwarding assertion for `search_kijiji` and `search_ebay` in the same file.
- [ ] **Step 2: Run tests to verify they fail**
Run: `bun test packages/mcp-server/test/protocol.test.ts`
Expected: FAIL because the tools do not yet describe `unstableFilter` and the handler does not append it to API URLs.
- [ ] **Step 3: Write minimal implementation**
In `packages/mcp-server/src/protocol/tools.ts`, add the same optional property to all three tools:
```ts
unstableFilter: {
  type: "boolean",
  description:
    "Optional: move listings priced more than 20% below the median into unstableResults instead of the main results. When enabled, the response shape changes from a plain list to an object with results and unstableResults.",
  default: false,
},
```
In `packages/mcp-server/src/protocol/handler.ts`, append the shared flag in each search branch:
```ts
if (args.unstableFilter !== undefined) {
  params.append("unstableFilter", args.unstableFilter.toString());
}
```
Add that snippet to the `search_kijiji`, `search_facebook`, and `search_ebay` branches.
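To make the forwarding concrete, here is a small sketch of how the appended flag could end up in the API URL. `SearchArgs` and `buildSearchUrl` are hypothetical names and the base URL is only an example; the `q` and `unstableFilter` parameter names follow the route tests in Task 3.

```typescript
// Hypothetical argument shape for a search tool call.
interface SearchArgs {
  query: string;
  unstableFilter?: boolean;
}

// Build the API URL, appending unstableFilter only when the caller supplied it.
function buildSearchUrl(baseUrl: string, args: SearchArgs): string {
  const params = new URLSearchParams({ q: args.query });
  if (args.unstableFilter !== undefined) {
    params.append("unstableFilter", args.unstableFilter.toString());
  }
  return `${baseUrl}?${params.toString()}`;
}

const enabledUrl = buildSearchUrl("http://localhost/api/facebook", {
  query: "laptop",
  unstableFilter: true,
});
// enabledUrl === "http://localhost/api/facebook?q=laptop&unstableFilter=true"

const defaultUrl = buildSearchUrl("http://localhost/api/facebook", { query: "laptop" });
// defaultUrl === "http://localhost/api/facebook?q=laptop"
```

An explicit `false` is forwarded as `unstableFilter=false`, which the routes treat the same as an absent parameter because they compare against the string `"true"`.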
- [ ] **Step 4: Run tests to verify they pass**
Run: `bun test packages/mcp-server/test/protocol.test.ts`
Expected: PASS, including the new tool-schema assertions and URL-forwarding assertions.
- [ ] **Step 5: Commit**
```bash
git add packages/mcp-server/src/protocol/tools.ts packages/mcp-server/src/protocol/handler.ts packages/mcp-server/test/protocol.test.ts
git commit -m "docs: expose unstable mode in mcp tools"
```
### Task 5: Verify the full cross-package feature end to end
**Files:**
- No code changes expected.
- [ ] **Step 1: Run the focused package tests**
Run: `bun test packages/core/test/unstable-listing-mode.test.ts packages/core/test/facebook-core.test.ts packages/core/test/ebay-core.test.ts packages/core/test/kijiji-core.test.ts packages/api-server/test/routes.test.ts packages/mcp-server/test/protocol.test.ts`
Expected: PASS with zero failing tests.
- [ ] **Step 2: Run the broader workspace verification**
Run: `bun run ci`
Expected: PASS with clean workspace validation.
- [ ] **Step 3: Commit verification-only follow-ups if needed**
If verification forced any tiny fixes, commit them immediately after the fix with a focused message, for example:
```bash
git add <exact files changed>
git commit -m "fix: align unstable mode verification"
```
If no files changed during verification, skip this commit step.
## Self-Review
- Spec coverage: shared classifier, all three scrapers, API exposure, MCP documentation, and tests are each mapped to a task.
- Placeholder scan: no `TODO`, `TBD`, or "write tests later" placeholders remain.
- Type consistency: the plan uses one shared flag name, `unstableFilter`, and one shared core option, `hideUnstableResults`, across all tasks.

@@ -0,0 +1,148 @@
# Unstable Listing Mode Design
## Summary
Add an optional shared result mode across Facebook, eBay, and Kijiji that moves suspiciously cheap listings out of the main results into a separate `unstableResults` bucket.
Listings are considered unstable when their price is more than 20% below the median price of the scraper's priced search results.
## Goals
- Support the same optional unstable-listing mode across all scrapers.
- Keep current default scraper and route behavior unchanged unless the mode is enabled.
- Hide unstable listings from the main results while still returning them separately.
- Implement the rule once in shared core code instead of duplicating marketplace-specific logic.
- Document the option in MCP tool descriptions so callers can discover it.
## Non-Goals
- Adding marketplace-specific thresholds or heuristics.
- Re-ranking results beyond splitting stable and unstable buckets.
- Classifying free, missing-price, or invalid-price listings as unstable.
- Changing unrelated scraper parsing behavior.
## Current State
`packages/core` currently returns plain arrays from scraper search functions.
`packages/api-server` forwards those scraper results directly from marketplace routes.
`packages/mcp-server` documents search tools per marketplace, but does not expose or describe any result-stability mode.
There is no shared result-classification utility today.
Price filtering exists in some scrapers, but not a cross-marketplace median-based split.
## Chosen Approach
Use a shared core utility plus per-route and per-tool opt-in.
The shared utility will accept parsed listings, compute the median from valid positive prices, and split the data into `results` and `unstableResults`.
Each scraper will opt into that utility when the caller enables unstable-listing mode.
API routes and MCP tools will expose the same optional mode so the feature is consistently available everywhere scraper search is surfaced.
This keeps the heuristic centralized, minimizes duplicated logic, and preserves existing consumers by leaving the default path unchanged.
## Design
### Shared Core Classification
Add a shared utility in `packages/core` for listing stability classification.
Responsibilities:
- accept parsed listing arrays with `listingPrice.cents`
- ignore listings whose price is missing, non-numeric, or non-positive when computing the median
- compute the median price from valid priced listings
- classify listings as unstable when `listingPrice.cents < median * 0.8`
- return an object with:
- `results`: listings that remain in the main bucket
- `unstableResults`: listings moved out of the main bucket
Listings excluded from median computation because their price is missing or non-positive remain in `results` unchanged.
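A worked example may help pin down the rule. The sketch below uses a hypothetical, stripped-down `PricedListing` shape (the real listing type has more fields) and walks through one input: a free listing plus four priced ones at 1000, 2000, 3000, and 4000 cents.

```typescript
// Hypothetical minimal listing shape for illustration only.
interface PricedListing {
  title: string;
  listingPrice?: { cents?: number };
}

// Median of the valid positive prices, or null when fewer than two exist.
function medianOfValidPrices(listings: PricedListing[]): number | null {
  const prices = listings
    .map((listing) => listing.listingPrice?.cents)
    .filter((cents): cents is number =>
      typeof cents === "number" && Number.isFinite(cents) && cents > 0)
    .sort((a, b) => a - b);
  if (prices.length < 2) return null;
  const middle = Math.floor(prices.length / 2);
  const upper = prices[middle] ?? 0;
  const lower = prices[middle - 1] ?? upper;
  return prices.length % 2 === 0 ? (lower + upper) / 2 : upper;
}

const sampleListings: PricedListing[] = [
  { title: "free", listingPrice: { cents: 0 } },
  { title: "a", listingPrice: { cents: 1000 } },
  { title: "b", listingPrice: { cents: 2000 } },
  { title: "c", listingPrice: { cents: 3000 } },
  { title: "d", listingPrice: { cents: 4000 } },
];

// The free listing is ignored, so the median is (2000 + 3000) / 2 = 2500.
const sampleMedian = medianOfValidPrices(sampleListings);
// The cutoff is 2500 * 0.8 = 2000, compared strictly.
const sampleThreshold = sampleMedian === null ? null : sampleMedian * 0.8;

const unstableTitles = sampleListings
  .filter((listing) => {
    const cents = listing.listingPrice?.cents;
    return sampleThreshold !== null && typeof cents === "number" && cents > 0 && cents < sampleThreshold;
  })
  .map((listing) => listing.title);
// unstableTitles is ["a"]: "b" sits exactly on the cutoff and stays in results, as does "free".
```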
### Scraper Integration
Facebook, eBay, and Kijiji search entrypoints will gain the same optional mode flag.
Default behavior:
- return the current plain array result shape
Opt-in behavior:
- run the shared classification utility after parsing search results
- classify before final result limiting so unstable items do not consume main-result slots
- return an object shaped like:
```ts
{
  results: ListingDetails[];
  unstableResults: ListingDetails[];
}
```
Each scraper will use its existing concrete listing subtype for these arrays.
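The ordering requirement (classify first, then limit) is easier to see with numbers. The following sketch compares both orders on the same input. `Item` and `splitByMedian` are hypothetical simplifications of the listing type and the shared classifier, written inline so the example is self-contained.

```typescript
// Hypothetical simplified listing and classifier for illustration.
interface Item {
  title: string;
  cents: number;
}

function splitByMedian(items: Item[]): { results: Item[]; unstableResults: Item[] } {
  const prices = items
    .map((item) => item.cents)
    .filter((cents) => Number.isFinite(cents) && cents > 0)
    .sort((a, b) => a - b);
  if (prices.length < 2) return { results: items, unstableResults: [] };
  const middle = Math.floor(prices.length / 2);
  const upper = prices[middle] ?? 0;
  const lower = prices[middle - 1] ?? upper;
  const median = prices.length % 2 === 0 ? (lower + upper) / 2 : upper;
  const threshold = median * 0.8;
  const isUnstable = (item: Item): boolean => item.cents > 0 && item.cents < threshold;
  return {
    results: items.filter((item) => !isUnstable(item)),
    unstableResults: items.filter(isUnstable),
  };
}

const found: Item[] = [
  { title: "cheap-1", cents: 500 },
  { title: "cheap-2", cents: 600 },
  { title: "mid-1", cents: 2000 },
  { title: "mid-2", cents: 2100 },
  { title: "high", cents: 3000 },
];
const maxItems = 3;

// Classify first: median 2000, cutoff 1600, both cheap items leave the main bucket.
const classifyFirst = splitByMedian(found).results.slice(0, maxItems).map((item) => item.title);
// classifyFirst is ["mid-1", "mid-2", "high"]

// Limit first: only three items remain, median 600, cutoff 480, nothing is flagged.
const limitFirst = splitByMedian(found.slice(0, maxItems)).results.map((item) => item.title);
// limitFirst is ["cheap-1", "cheap-2", "mid-1"]
```

Limiting first both shrinks the sample the median is computed from and lets the cheap listings occupy main-result slots, which is the outcome the design is trying to avoid.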
### API Surface
Marketplace API routes will expose an optional query parameter for unstable-listing mode.
Requirements:
- keep existing route responses unchanged when the parameter is absent or false
- when enabled, return the object payload with `results` and `unstableResults`
- use the same semantics across Facebook, eBay, and Kijiji routes
The exact parameter name should be consistent across routes and intentionally describe the behavior, for example `unstableFilter=true`.
### MCP Surface
Marketplace MCP tools will expose the same optional mode as an input field.
Tool descriptions should explicitly document:
- that the option is optional
- that it moves listings priced more than 20% below the median into `unstableResults`
- that enabling it changes the response shape from a plain list to an object with `results` and `unstableResults`
- that the behavior is available for Facebook, eBay, and Kijiji search tools
The wording should be aligned across all three tools so the feature reads as one shared capability.
### Error Handling
The unstable-listing mode should be best-effort and non-failing.
- If there are no valid positive prices, return all listings in `results` and an empty `unstableResults` array.
- If there is only one valid priced listing, do not classify it as unstable.
- Parsing failures remain governed by existing scraper behavior; the classification layer should not introduce new scraper-specific errors.
### Testing Strategy
Follow TDD.
Start with shared utility tests, then wire the option through scraper and route tests.
Coverage targets:
1. Median calculation for odd-sized valid price sets.
2. Median calculation for even-sized valid price sets.
3. Strict cutoff behavior where only listings with `price < median * 0.8` move to `unstableResults`.
4. Missing, invalid, zero, or negative prices are excluded from median computation and remain in `results`.
5. Default scraper behavior still returns plain arrays when the option is disabled.
6. Enabled scraper behavior returns `{ results, unstableResults }` for Facebook, eBay, and Kijiji.
7. API routes preserve existing response shapes by default and switch to the object payload only when enabled.
8. MCP tool metadata documents the new optional mode for all three marketplace search tools.
Verification target after implementation:
- `bun test packages/core/test`
- `bun test packages/api-server/test`
- `bun test packages/mcp-server/test` if MCP metadata tests exist or are added
- `bun run ci`
## Risks
- The optional mode introduces a union return shape for scraper callers, which can ripple into downstream TypeScript signatures.
- Applying classification before final limiting changes which items appear in the main bucket compared with a naive post-limit split.
- Kijiji and eBay may have different mixes of priced and unpriced results, so excluding non-positive prices from the median must remain explicit and tested.
## Rollout Notes
Land the shared classifier, scraper wiring, route wiring, tests, and MCP description updates together.
That avoids a partial rollout where the feature exists in one surface but is undocumented or inconsistent elsewhere.

@@ -21,8 +21,20 @@ export async function ebayRoute(req: Request): Promise<Response> {
const minPriceParam = reqUrl.searchParams.get("minPrice"); const minPriceParam = reqUrl.searchParams.get("minPrice");
const minPrice = minPriceParam ? parseInt(minPriceParam, 10) : undefined; const minPrice = minPriceParam ? parseInt(minPriceParam, 10) : undefined;
if (minPriceParam && (Number.isNaN(minPrice) || minPrice < 0)) {
return Response.json(
{ message: "Invalid minPrice parameter" },
{ status: 400 },
);
}
const maxPriceParam = reqUrl.searchParams.get("maxPrice"); const maxPriceParam = reqUrl.searchParams.get("maxPrice");
const maxPrice = maxPriceParam ? parseInt(maxPriceParam, 10) : undefined; const maxPrice = maxPriceParam ? parseInt(maxPriceParam, 10) : undefined;
if (maxPriceParam && (Number.isNaN(maxPrice) || maxPrice < 0)) {
return Response.json(
{ message: "Invalid maxPrice parameter" },
{ status: 400 },
);
}
const strictMode = reqUrl.searchParams.get("strictMode") === "true"; const strictMode = reqUrl.searchParams.get("strictMode") === "true";
const buyItNowOnly = reqUrl.searchParams.get("buyItNowOnly") !== "false"; const buyItNowOnly = reqUrl.searchParams.get("buyItNowOnly") !== "false";
const canadaOnly = reqUrl.searchParams.get("canadaOnly") !== "false"; const canadaOnly = reqUrl.searchParams.get("canadaOnly") !== "false";
@@ -37,7 +49,15 @@ export async function ebayRoute(req: Request): Promise<Response> {
const maxItemsParam = reqUrl.searchParams.get("maxItems"); const maxItemsParam = reqUrl.searchParams.get("maxItems");
const maxItems = maxItemsParam ? parseInt(maxItemsParam, 10) : undefined; const maxItems = maxItemsParam ? parseInt(maxItemsParam, 10) : undefined;
const items = await fetchEbayItems(SEARCH_QUERY, 1, { if (maxItemsParam && (Number.isNaN(maxItems) || maxItems < 0)) {
return Response.json(
{ message: "Invalid maxItems parameter" },
{ status: 400 },
);
}
const hideUnstableResults =
reqUrl.searchParams.get("unstableFilter") === "true";
const opts = {
minPrice, minPrice,
maxPrice, maxPrice,
strictMode, strictMode,
@@ -45,16 +65,25 @@ export async function ebayRoute(req: Request): Promise<Response> {
   keywords,
   buyItNowOnly,
   canadaOnly,
-});
+  maxItems,
+};
+const items = hideUnstableResults
+  ? await fetchEbayItems(SEARCH_QUERY, 1, opts, {
+      hideUnstableResults: true,
+    })
+  : await fetchEbayItems(SEARCH_QUERY, 1, opts);
-const results = maxItems ? items.slice(0, maxItems) : items;
+const isEmpty = hideUnstableResults
+  ? items.results.length === 0 && items.unstableResults.length === 0
+  : !items || items.length === 0;
-if (!results || results.length === 0)
+if (isEmpty)
   return Response.json(
     { message: "Search didn't return any results!" },
     { status: 404 },
   );
-return Response.json(results, { status: 200 });
+return Response.json(items, { status: 200 });
 } catch (error) {
   console.error("eBay scraping error:", error);
   const errorMessage =
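
Review note: the three routes repeat the same "parse with `parseInt`, then reject `NaN` or negative values" pattern for every numeric query parameter. The sketch below shows one way that pattern could be folded into a single helper. `parseNonNegativeInt` and `ParsedParam` are hypothetical names that do not appear in this diff. The helper is also deliberately stricter than the routes: `parseInt("12abc", 10)` returns `12`, so the checks in the diff accept such input, whereas the sketch rejects it.

```typescript
// Hypothetical helper, not part of the diff: parses one non-negative
// integer query parameter and reports the same error message the routes use.
type ParsedParam =
  | { ok: true; value: number | undefined }
  | { ok: false; message: string };

function parseNonNegativeInt(
  params: URLSearchParams,
  name: string,
): ParsedParam {
  const raw = params.get(name);
  // Mirror the routes: an absent or empty parameter counts as "not provided".
  if (!raw) return { ok: true, value: undefined };
  // Assumption: only plain digit strings are valid. This rejects "-1", "abc"
  // and also "12abc", which parseInt alone would accept as 12.
  if (!/^\d+$/.test(raw)) {
    return { ok: false, message: `Invalid ${name} parameter` };
  }
  return { ok: true, value: parseInt(raw, 10) };
}
```

A route could then return a 400 response whenever `ok` is `false` and otherwise use `value`, applying its own default (for example 25 for the Facebook `maxItems`) when `value` is `undefined`.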


@@ -20,10 +20,27 @@ export async function facebookRoute(req: Request): Promise<Response> {
 const LOCATION = reqUrl.searchParams.get("location") || "toronto";
 const maxItemsParam = reqUrl.searchParams.get("maxItems");
 const maxItems = maxItemsParam ? parseInt(maxItemsParam, 10) : 25;
+if (maxItemsParam && (Number.isNaN(maxItems) || maxItems < 0)) {
+  return Response.json(
+    { message: "Invalid maxItems parameter" },
+    { status: 400 },
+  );
+}
+const hideUnstableResults =
+  reqUrl.searchParams.get("unstableFilter") === "true";
 try {
-  const items = await fetchFacebookItems(SEARCH_QUERY, 1, LOCATION, maxItems);
-  if (!items || items.length === 0)
+  const items = hideUnstableResults
+    ? await fetchFacebookItems(SEARCH_QUERY, 1, LOCATION, maxItems, {
+        hideUnstableResults: true,
+      })
+    : await fetchFacebookItems(SEARCH_QUERY, 1, LOCATION, maxItems);
+  const isEmpty = hideUnstableResults
+    ? items.results.length === 0 && items.unstableResults.length === 0
+    : !items || items.length === 0;
+  if (isEmpty)
     return Response.json(
       { message: "Search didn't return any results!" },
       { status: 404 },


@@ -19,25 +19,45 @@ export async function kijijiRoute(req: Request): Promise<Response> {
 const maxPagesParam = reqUrl.searchParams.get("maxPages");
 const maxPages = maxPagesParam ? parseInt(maxPagesParam, 10) : 5;
+if (maxPagesParam && (Number.isNaN(maxPages) || maxPages < 0)) {
+  return Response.json(
+    { message: "Invalid maxPages parameter" },
+    { status: 400 },
+  );
+}
 const priceMinParam = reqUrl.searchParams.get("priceMin");
 const priceMin = priceMinParam ? parseInt(priceMinParam, 10) : undefined;
+if (priceMinParam && (Number.isNaN(priceMin) || priceMin < 0)) {
+  return Response.json(
+    { message: "Invalid priceMin parameter" },
+    { status: 400 },
+  );
+}
 const priceMaxParam = reqUrl.searchParams.get("priceMax");
 const priceMax = priceMaxParam ? parseInt(priceMaxParam, 10) : undefined;
+if (priceMaxParam && (Number.isNaN(priceMax) || priceMax < 0)) {
+  return Response.json(
+    { message: "Invalid priceMax parameter" },
+    { status: 400 },
+  );
+}
+const hideUnstableResults =
+  reqUrl.searchParams.get("unstableFilter") === "true";
 const searchOptions = {
   location: reqUrl.searchParams.get("location") || undefined,
   category: reqUrl.searchParams.get("category") || undefined,
   keywords: reqUrl.searchParams.get("keywords") || undefined,
-  sortBy: reqUrl.searchParams.get("sortBy") as
+  sortBy: (reqUrl.searchParams.get("sortBy") as
     | "relevancy"
     | "date"
     | "price"
     | "distance"
-    | undefined,
+    | undefined) || undefined,
-  sortOrder: reqUrl.searchParams.get("sortOrder") as
+  sortOrder: (reqUrl.searchParams.get("sortOrder") as
     | "desc"
     | "asc"
-    | undefined,
+    | undefined) || undefined,
   maxPages,
   priceMin,
   priceMax,
@@ -45,14 +65,28 @@ export async function kijijiRoute(req: Request): Promise<Response> {
 };
 try {
-  const items = await fetchKijijiItems(
-    SEARCH_QUERY,
-    4, // 4 requests per second for faster scraping
-    "https://www.kijiji.ca",
-    searchOptions,
-    {},
-  );
-  if (!items)
+  const items = hideUnstableResults
+    ? await fetchKijijiItems(
+        SEARCH_QUERY,
+        4, // 4 requests per second for faster scraping
+        "https://www.kijiji.ca",
+        searchOptions,
+        {},
+        { hideUnstableResults: true },
+      )
+    : await fetchKijijiItems(
+        SEARCH_QUERY,
+        4, // 4 requests per second for faster scraping
+        "https://www.kijiji.ca",
+        searchOptions,
+        {},
+      );
+  const isEmpty = hideUnstableResults
+    ? items.results.length === 0 && items.unstableResults.length === 0
+    : !items || items.length === 0;
+  if (isEmpty)
     return Response.json(
       { message: "Search didn't return any results!" },
       { status: 404 },
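
Review note: all three routes now decide emptiness with the same ternary over two payload shapes (a plain array in default mode, a `{ results, unstableResults }` object when `unstableFilter=true`). The sketch below shows how that check could be expressed once, keyed on the payload's shape rather than on the flag. `Buckets` and `isEmptyPayload` are hypothetical names; `Buckets` only mirrors the two fields the routes and tests read, and is an assumption about `UnstableListingBuckets`, whose definition is not shown in this part of the diff.

```typescript
// Assumed shape, inferred from the fields used in the routes and tests.
interface Buckets<T> {
  results: T[];
  unstableResults: T[];
}

// Hypothetical helper: true when there is nothing to return in either mode.
function isEmptyPayload<T>(
  payload: T[] | Buckets<T> | null | undefined,
): boolean {
  // Default mode may yield a missing value; treat it as empty.
  if (!payload) return true;
  // Default mode: a flat array of listings.
  if (Array.isArray(payload)) return payload.length === 0;
  // Unstable mode: empty only when both buckets are empty.
  return payload.results.length === 0 && payload.unstableResults.length === 0;
}
```

With a helper like this, each route's 404 branch would reduce to `if (isEmptyPayload(items))`, and a bucket payload with only `unstableResults` populated would still be returned with status 200, matching the behavior in the diff.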


@@ -2,10 +2,12 @@ import { afterEach, beforeEach, describe, expect, mock, test } from "bun:test";
 const fetchFacebookItems = mock(() => Promise.resolve([{ title: "item" }]));
 const fetchEbayItems = mock(() => Promise.resolve([{ title: "item" }]));
+const fetchKijijiItems = mock(() => Promise.resolve([{ title: "item" }]));
 mock.module("@marketplace-scrapers/core", () => ({
   fetchFacebookItems,
   fetchEbayItems,
+  fetchKijijiItems,
 }));
 describe("API routes", () => {
@@ -18,11 +20,10 @@ describe("API routes", () => {
 fetchEbayItems.mockImplementation(() =>
   Promise.resolve([{ title: "item" }]),
 );
-});
-afterEach(() => {
-  fetchFacebookItems.mockClear();
-  fetchEbayItems.mockClear();
+fetchKijijiItems.mockReset();
+fetchKijijiItems.mockImplementation(() =>
+  Promise.resolve([{ title: "item" }]),
+);
 });
 test("facebookRoute ignores cookies query parameter", async () => {
@@ -56,4 +57,608 @@ describe("API routes", () => {
     canadaOnly: true,
   });
 });
test("kijijiRoute passes cookies query parameter", async () => {
const { kijijiRoute } = await import("../src/routes/kijiji");
await kijijiRoute(
new Request(
"http://localhost/api/kijiji?q=laptop&cookies=s%3D1&maxPages=3",
),
);
expect(fetchKijijiItems).toHaveBeenCalledWith(
"laptop",
4,
"https://www.kijiji.ca",
{
location: undefined,
category: undefined,
keywords: undefined,
sortBy: undefined,
sortOrder: undefined,
maxPages: 3,
priceMin: undefined,
priceMax: undefined,
cookies: "s=1",
},
{},
);
});
test("facebookRoute forwards unstableFilter=true to core", async () => {
const { facebookRoute } = await import("../src/routes/facebook");
fetchFacebookItems.mockImplementation(() =>
Promise.resolve({
results: [{ title: "item" }],
unstableResults: [],
}),
);
await facebookRoute(
new Request(
"http://localhost/api/facebook?q=laptop&location=toronto&maxItems=3&unstableFilter=true",
),
);
expect(fetchFacebookItems).toHaveBeenCalledWith("laptop", 1, "toronto", 3, {
hideUnstableResults: true,
});
});
test("ebayRoute forwards unstableFilter=true to core", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
fetchEbayItems.mockImplementation(() =>
Promise.resolve({
results: [{ title: "item" }],
unstableResults: [],
}),
);
await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&buyItNowOnly=true&unstableFilter=true",
),
);
expect(fetchEbayItems).toHaveBeenCalledWith("laptop", 1, {
minPrice: undefined,
maxPrice: undefined,
strictMode: false,
exclusions: [],
keywords: ["laptop"],
buyItNowOnly: true,
canadaOnly: true,
}, {
hideUnstableResults: true,
});
});
test("kijijiRoute forwards unstableFilter=true to core", async () => {
const { kijijiRoute } = await import("../src/routes/kijiji");
fetchKijijiItems.mockImplementation(() =>
Promise.resolve({
results: [{ title: "item" }],
unstableResults: [],
}),
);
await kijijiRoute(
new Request(
"http://localhost/api/kijiji?q=laptop&maxPages=5&unstableFilter=true",
),
);
expect(fetchKijijiItems).toHaveBeenCalledWith(
"laptop",
4,
"https://www.kijiji.ca",
{
location: undefined,
category: undefined,
keywords: undefined,
sortBy: undefined,
sortOrder: undefined,
maxPages: 5,
priceMin: undefined,
priceMax: undefined,
cookies: undefined,
},
{},
{
hideUnstableResults: true,
},
);
});
test("facebookRoute does not forward unstableFilter when absent", async () => {
const { facebookRoute } = await import("../src/routes/facebook");
await facebookRoute(
new Request(
"http://localhost/api/facebook?q=laptop&location=toronto&maxItems=3",
),
);
expect(fetchFacebookItems).toHaveBeenCalledWith("laptop", 1, "toronto", 3);
});
test("facebookRoute does not forward unstableFilter when false", async () => {
const { facebookRoute } = await import("../src/routes/facebook");
await facebookRoute(
new Request(
"http://localhost/api/facebook?q=laptop&location=toronto&maxItems=3&unstableFilter=false",
),
);
expect(fetchFacebookItems).toHaveBeenCalledWith("laptop", 1, "toronto", 3);
});
test("ebayRoute does not forward unstableFilter when absent", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&buyItNowOnly=true",
),
);
expect(fetchEbayItems).toHaveBeenCalledWith("laptop", 1, {
minPrice: undefined,
maxPrice: undefined,
strictMode: false,
exclusions: [],
keywords: ["laptop"],
buyItNowOnly: true,
canadaOnly: true,
});
});
test("ebayRoute does not forward unstableFilter when false", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&buyItNowOnly=true&unstableFilter=false",
),
);
expect(fetchEbayItems).toHaveBeenCalledWith("laptop", 1, {
minPrice: undefined,
maxPrice: undefined,
strictMode: false,
exclusions: [],
keywords: ["laptop"],
buyItNowOnly: true,
canadaOnly: true,
});
});
test("kijijiRoute does not forward unstableFilter when absent", async () => {
const { kijijiRoute } = await import("../src/routes/kijiji");
await kijijiRoute(
new Request(
"http://localhost/api/kijiji?q=laptop&maxPages=5",
),
);
expect(fetchKijijiItems).toHaveBeenCalledWith(
"laptop",
4,
"https://www.kijiji.ca",
{
location: undefined,
category: undefined,
keywords: undefined,
sortBy: undefined,
sortOrder: undefined,
maxPages: 5,
priceMin: undefined,
priceMax: undefined,
cookies: undefined,
},
{},
);
});
test("kijijiRoute does not forward unstableFilter when false", async () => {
const { kijijiRoute } = await import("../src/routes/kijiji");
await kijijiRoute(
new Request(
"http://localhost/api/kijiji?q=laptop&maxPages=5&unstableFilter=false",
),
);
expect(fetchKijijiItems).toHaveBeenCalledWith(
"laptop",
4,
"https://www.kijiji.ca",
{
location: undefined,
category: undefined,
keywords: undefined,
sortBy: undefined,
sortOrder: undefined,
maxPages: 5,
priceMin: undefined,
priceMax: undefined,
cookies: undefined,
},
{},
);
});
test("facebookRoute returns bucket shape when unstableFilter is enabled", async () => {
const { facebookRoute } = await import("../src/routes/facebook");
fetchFacebookItems.mockImplementation(() =>
Promise.resolve({
results: [{ title: "a" }],
unstableResults: [{ title: "b" }],
}),
);
const response = await facebookRoute(
new Request(
"http://localhost/api/facebook?q=laptop&location=toronto&maxItems=3&unstableFilter=true",
),
);
const body = await response.json();
expect(body.results).toHaveLength(1);
expect(body.unstableResults).toHaveLength(1);
expect(body.results[0].title).toBe("a");
expect(body.unstableResults[0].title).toBe("b");
});
test("kijijiRoute returns bucket shape when unstableFilter is enabled", async () => {
const { kijijiRoute } = await import("../src/routes/kijiji");
fetchKijijiItems.mockImplementation(() =>
Promise.resolve({
results: [{ title: "a" }],
unstableResults: [{ title: "b" }],
}),
);
const response = await kijijiRoute(
new Request(
"http://localhost/api/kijiji?q=laptop&maxPages=5&unstableFilter=true",
),
);
const body = await response.json();
expect(body.results).toHaveLength(1);
expect(body.unstableResults).toHaveLength(1);
expect(body.results[0].title).toBe("a");
expect(body.unstableResults[0].title).toBe("b");
});
test("facebookRoute returns 404 when unstable results are empty", async () => {
const { facebookRoute } = await import("../src/routes/facebook");
fetchFacebookItems.mockImplementation(() =>
Promise.resolve({
results: [],
unstableResults: [],
}),
);
const response = await facebookRoute(
new Request(
"http://localhost/api/facebook?q=laptop&location=toronto&maxItems=3&unstableFilter=true",
),
);
expect(response.status).toBe(404);
const body = await response.json();
expect(body.message).toBe("Search didn't return any results!");
});
test("kijijiRoute returns 404 when unstable results are empty", async () => {
const { kijijiRoute } = await import("../src/routes/kijiji");
fetchKijijiItems.mockImplementation(() =>
Promise.resolve({
results: [],
unstableResults: [],
}),
);
const response = await kijijiRoute(
new Request(
"http://localhost/api/kijiji?q=laptop&maxPages=5&unstableFilter=true",
),
);
expect(response.status).toBe(404);
const body = await response.json();
expect(body.message).toBe("Search didn't return any results!");
});
test("ebayRoute forwards maxItems to core in default mode", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
fetchEbayItems.mockImplementation(() =>
Promise.resolve([{ title: "a" }]),
);
await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&maxItems=2",
),
);
expect(fetchEbayItems).toHaveBeenCalledWith("laptop", 1, expect.objectContaining({ maxItems: 2 }));
});
test("ebayRoute passes through scraper payload unchanged in unstable mode", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
fetchEbayItems.mockImplementation(() =>
Promise.resolve({
results: [{ title: "a" }, { title: "b" }, { title: "c" }],
unstableResults: [{ title: "d" }, { title: "e" }],
}),
);
const response = await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&unstableFilter=true&maxItems=4",
),
);
const body = await response.json();
expect(body.results).toHaveLength(3);
expect(body.unstableResults).toHaveLength(2);
expect(body.results[0].title).toBe("a");
expect(body.unstableResults[0].title).toBe("d");
expect(fetchEbayItems).toHaveBeenCalledWith("laptop", 1, expect.objectContaining({ maxItems: 4 }), {
hideUnstableResults: true,
});
});
test("ebayRoute forwards maxItems to core in unstable mode", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
fetchEbayItems.mockImplementation(() =>
Promise.resolve({
results: [{ title: "a" }],
unstableResults: [{ title: "b" }],
}),
);
await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&unstableFilter=true&maxItems=2",
),
);
expect(fetchEbayItems).toHaveBeenCalledWith("laptop", 1, expect.objectContaining({ maxItems: 2 }), {
hideUnstableResults: true,
});
});
test("ebayRoute returns 404 when unstable results are empty", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
fetchEbayItems.mockImplementation(() =>
Promise.resolve({
results: [],
unstableResults: [],
}),
);
const response = await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&unstableFilter=true",
),
);
expect(response.status).toBe(404);
const body = await response.json();
expect(body.message).toBe("Search didn't return any results!");
});
test("ebayRoute returns 400 for invalid maxItems", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
const response = await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&maxItems=abc",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid maxItems parameter");
});
test("facebookRoute returns 400 for invalid maxItems", async () => {
const { facebookRoute } = await import("../src/routes/facebook");
const response = await facebookRoute(
new Request(
"http://localhost/api/facebook?q=laptop&maxItems=abc",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid maxItems parameter");
});
test("ebayRoute returns 400 for invalid minPrice", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
const response = await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&minPrice=abc",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid minPrice parameter");
});
test("ebayRoute returns 400 for invalid maxPrice", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
const response = await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&maxPrice=abc",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid maxPrice parameter");
});
test("kijijiRoute returns 400 for invalid maxPages", async () => {
const { kijijiRoute } = await import("../src/routes/kijiji");
const response = await kijijiRoute(
new Request(
"http://localhost/api/kijiji?q=laptop&maxPages=abc",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid maxPages parameter");
});
test("kijijiRoute returns 400 for invalid priceMin", async () => {
const { kijijiRoute } = await import("../src/routes/kijiji");
const response = await kijijiRoute(
new Request(
"http://localhost/api/kijiji?q=laptop&priceMin=abc",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid priceMin parameter");
});
test("kijijiRoute returns 400 for invalid priceMax", async () => {
const { kijijiRoute } = await import("../src/routes/kijiji");
const response = await kijijiRoute(
new Request(
"http://localhost/api/kijiji?q=laptop&priceMax=abc",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid priceMax parameter");
});
test("facebookRoute returns 400 for negative maxItems", async () => {
const { facebookRoute } = await import("../src/routes/facebook");
const response = await facebookRoute(
new Request(
"http://localhost/api/facebook?q=laptop&maxItems=-1",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid maxItems parameter");
});
test("ebayRoute returns 400 for negative maxItems", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
const response = await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&maxItems=-1",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid maxItems parameter");
});
test("ebayRoute returns 400 for negative minPrice", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
const response = await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&minPrice=-5",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid minPrice parameter");
});
test("ebayRoute returns 400 for negative maxPrice", async () => {
const { ebayRoute } = await import("../src/routes/ebay");
const response = await ebayRoute(
new Request(
"http://localhost/api/ebay?q=laptop&maxPrice=-10",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid maxPrice parameter");
});
test("kijijiRoute returns 400 for negative maxPages", async () => {
const { kijijiRoute } = await import("../src/routes/kijiji");
const response = await kijijiRoute(
new Request(
"http://localhost/api/kijiji?q=laptop&maxPages=-2",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid maxPages parameter");
});
test("kijijiRoute returns 400 for negative priceMin", async () => {
const { kijijiRoute } = await import("../src/routes/kijiji");
const response = await kijijiRoute(
new Request(
"http://localhost/api/kijiji?q=laptop&priceMin=-5",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid priceMin parameter");
});
test("kijijiRoute returns 400 for negative priceMax", async () => {
const { kijijiRoute } = await import("../src/routes/kijiji");
const response = await kijijiRoute(
new Request(
"http://localhost/api/kijiji?q=laptop&priceMax=-10",
),
);
expect(response.status).toBe(400);
const body = await response.json();
expect(body.message).toBe("Invalid priceMax parameter");
});
 });


@@ -41,3 +41,4 @@ export * from "./utils/cookies";
 export * from "./utils/delay";
 export * from "./utils/format";
 export * from "./utils/http";
+export * from "./utils/unstable";


@@ -1,4 +1,10 @@
 import { parseHTML } from "linkedom";
+import type {
+  HTMLString,
+  UnstableListingBuckets,
+  UnstableListingModeOptions,
+} from "../types/common";
+import { classifyUnstableListings } from "../utils/unstable";
 import {
   type CookieConfig,
   ensureCookies,
@@ -32,6 +38,18 @@ export interface EbayListingDetails {
   address?: string | null;
 }
+const EBAY_PRICE_TEXT_RE = /^(?:\s*(?:CA|C|US)\s*\$|\s*[$£¥])/u;
+function canonicalizeEbayItemUrl(url: string): string {
+  try {
+    const parsed = new URL(url, "https://www.ebay.ca");
+    const match = parsed.pathname.match(/\/itm\/(?:[^/?#]+\/)?\d+/);
+    return match ? `${parsed.origin}${match[0]}` : `${parsed.origin}${parsed.pathname}`;
+  } catch {
+    return url;
+  }
+}
 // ----------------------------- Utilities -----------------------------
 /**
@@ -56,7 +74,7 @@ function parseEbayPrice(
 const cents = Math.round(dollars * 100);
 // Extract currency - look for common formats like "CAD", "USD", "C $", "$CA", etc.
-let currency = "USD"; // Default
+let currency = "CAD"; // Default for ebay.ca
 if (
   cleaned.toUpperCase().includes("CAD") ||
@@ -64,8 +82,18 @@ function parseEbayPrice(
   cleaned.includes("C $")
 ) {
   currency = "CAD";
-} else if (cleaned.toUpperCase().includes("USD") || cleaned.includes("$")) {
+} else if (
+  cleaned.toUpperCase().includes("USD") ||
+  cleaned.toUpperCase().includes("US $") ||
+  cleaned.toUpperCase().includes("US$")
+) {
   currency = "USD";
+} else if (cleaned.includes("£")) {
+  currency = "GBP";
+} else if (cleaned.includes("€")) {
+  currency = "EUR";
+} else if (cleaned.includes("¥")) {
+  currency = "JPY";
 }
 return { cents, currency };
@@ -95,6 +123,7 @@ function parseEbayListings(
 ): EbayListingDetails[] {
   const { document } = parseHTML(htmlString);
   const results: EbayListingDetails[] = [];
+  const seenUrls = new Set<string>();
   // Find all listing links by looking for eBay item URLs (/itm/)
   const linkElements = document.querySelectorAll('a[href*="itm/"]');
@@ -109,9 +138,12 @@ function parseEbayListings(
 if (!href.startsWith("http")) {
   href = href.startsWith("//")
     ? `https:${href}`
-    : `https://www.ebay.com${href}`;
+    : `https://www.ebay.ca${href}`;
 }
+const canonicalUrl = canonicalizeEbayItemUrl(href);
+if (seenUrls.has(canonicalUrl)) continue;
 // Find the container - go up several levels to find the item container
 // Modern eBay uses complex nested structures (often 5-10 levels deep)
 let container: Element | null = linkElement;
@@ -173,16 +205,18 @@ function parseEbayListings(
   "opens in a new window or tab",
 ];
+let shortened = false;
 for (const uiString of uiStrings) {
   const uiIndex = title.indexOf(uiString);
   if (uiIndex !== -1) {
     title = title.substring(0, uiIndex).trim();
+    shortened = true;
     break; // Only remove one UI string per title
   }
 }
-// If the title became empty or too short after cleaning, skip this item
-if (title.length < 10) {
+// If the title was shortened by UI cleaning and became too short, skip this item
+if (shortened && title.length < 10) {
   continue;
 }
 }
@@ -215,7 +249,6 @@ function parseEbayListings(
 !text.includes("core") &&
 !text.includes("ram") &&
 !text.includes("ssd") &&
-!/\d{4}/.test(text) && // Avoid years like "2024"
 !text.includes('"') // Avoid measurements
 ) {
   priceElement = el;
@@ -244,9 +277,8 @@ function parseEbayListings(
 const text = el.textContent?.trim();
 if (
   text &&
-  /^\s*[$£¥]/u.test(text) &&
-  text.length < 50 &&
-  !/\d{4}/.test(text)
+  EBAY_PRICE_TEXT_RE.test(text) &&
+  text.length < 50
 ) {
   actualPrices.push(el);
 }
@@ -323,6 +355,7 @@ function parseEbayListings(
 };
 results.push(listing);
+seenUrls.add(canonicalUrl);
 } catch (err) {
   console.warn(`Error parsing eBay listing: ${err}`);
 }
@@ -350,6 +383,36 @@ async function loadEbayCookies(): Promise<string | undefined> {
 // ----------------------------- Main -----------------------------
+export default async function fetchEbayItems(
+  SEARCH_QUERY: string,
+  REQUESTS_PER_SECOND: number | undefined,
+  opts: {
+    minPrice?: number;
+    maxPrice?: number;
+    strictMode?: boolean;
+    exclusions?: string[];
+    keywords?: string[];
+    buyItNowOnly?: boolean;
+    canadaOnly?: boolean;
+    maxItems?: number;
+  } | undefined,
+  unstableMode: { hideUnstableResults: true },
+): Promise<UnstableListingBuckets<EbayListingDetails>>;
+export default async function fetchEbayItems(
+  SEARCH_QUERY: string,
+  REQUESTS_PER_SECOND?: number,
+  opts?: {
+    minPrice?: number;
+    maxPrice?: number;
+    strictMode?: boolean;
+    exclusions?: string[];
+    keywords?: string[];
+    buyItNowOnly?: boolean;
+    canadaOnly?: boolean;
+    maxItems?: number;
+  },
+  unstableMode?: UnstableListingModeOptions,
+): Promise<EbayListingDetails[]>;
 export default async function fetchEbayItems(
   SEARCH_QUERY: string,
   REQUESTS_PER_SECOND = 1,
@@ -361,8 +424,12 @@ export default async function fetchEbayItems(
     keywords?: string[];
     buyItNowOnly?: boolean;
     canadaOnly?: boolean;
+    maxItems?: number;
   } = {},
+  unstableMode: UnstableListingModeOptions = {},
 ) {
+  const requestsPerSecond = REQUESTS_PER_SECOND > 0 ? REQUESTS_PER_SECOND : 1;
   const {
     minPrice = 0,
     maxPrice = Number.MAX_SAFE_INTEGER,
@@ -371,8 +438,22 @@ export default async function fetchEbayItems(
   keywords = [SEARCH_QUERY], // Default to search query if no keywords provided
   buyItNowOnly = true,
   canadaOnly = true,
+  maxItems,
 } = opts;
+const finalizeResults = (
+  listings: EbayListingDetails[],
+): EbayListingDetails[] | UnstableListingBuckets<EbayListingDetails> => {
+  const limitedListings =
+    maxItems !== undefined ? listings.slice(0, maxItems) : listings;
+  if (!unstableMode.hideUnstableResults) {
+    return limitedListings;
+  }
+  return classifyUnstableListings(limitedListings);
+};
 const cookies = await loadEbayCookies();
 // Build eBay search URL - use Canadian site, Buy It Now filter, and Canada-only preference
@@ -392,7 +473,7 @@ export default async function fetchEbayItems(
 const searchUrl = `https://www.ebay.ca/sch/i.html?${urlParams.toString()}`;
-const DELAY_MS = Math.max(1, Math.floor(1000 / REQUESTS_PER_SECOND));
+const DELAY_MS = Math.max(1, Math.floor(1000 / requestsPerSecond));
 console.log(`Fetching eBay search: ${searchUrl}`);
@@ -448,17 +529,17 @@ export default async function fetchEbayItems(
 // Filter by price range (additional safety check)
 const filteredListings = listings.filter((listing) => {
   const cents = listing.listingPrice?.cents;
-  return cents && cents >= minPrice && cents <= maxPrice;
+  return typeof cents === "number" && cents >= minPrice && cents <= maxPrice;
 });
 console.log(`Parsed ${filteredListings.length} eBay listings.`);
-return filteredListings;
+return finalizeResults(filteredListings);
 } catch (err) {
   if (err instanceof HttpError) {
     console.error(
       `Failed to fetch eBay search (${err.status}): ${err.message}`,
     );
-    return [];
+    return finalizeResults([]);
   }
   throw err;
 }
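
Review note: the parser now skips listings whose canonical URL has already been seen. The sketch below copies `canonicalizeEbayItemUrl` from this diff and wraps it in a small, hypothetical `dedupeByCanonicalUrl` helper to show what the canonical form looks like. The sample URLs are made up for illustration. It also highlights a possible gap: the same item ID reached with and without a title slug in the path yields two different canonical URLs, so those two forms would not be treated as duplicates.

```typescript
// Copied from the diff: strips query/hash and anything after the item ID.
function canonicalizeEbayItemUrl(url: string): string {
  try {
    const parsed = new URL(url, "https://www.ebay.ca");
    const match = parsed.pathname.match(/\/itm\/(?:[^/?#]+\/)?\d+/);
    return match
      ? `${parsed.origin}${match[0]}`
      : `${parsed.origin}${parsed.pathname}`;
  } catch {
    return url;
  }
}

// Hypothetical helper (not in the diff): keeps the first occurrence of each
// canonical URL, mirroring the seenUrls check in parseEbayListings.
function dedupeByCanonicalUrl(urls: string[]): string[] {
  const seen = new Set<string>();
  const unique: string[] = [];
  for (const url of urls) {
    const canonical = canonicalizeEbayItemUrl(url);
    if (seen.has(canonical)) continue;
    seen.add(canonical);
    unique.push(canonical);
  }
  return unique;
}

// Illustrative input: the first two point at the same canonical URL,
// the third carries a slug segment and is kept as a separate entry.
const sampleUrls = [
  "https://www.ebay.ca/itm/123456789012?hash=item1",
  "/itm/123456789012?_trkparms=abc",
  "https://www.ebay.ca/itm/Some-Laptop/123456789012",
];
const uniqueUrls = dedupeByCanonicalUrl(sampleUrls);
```

If slugged and unslugged links to the same item do occur on a results page, keying the set on the numeric item ID alone would be one way to close that gap; whether that is needed depends on the markup eBay actually serves.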


@@ -1,6 +1,11 @@
 import cliProgress from "cli-progress";
 import { parseHTML } from "linkedom";
-import type { HTMLString } from "../types/common";
+import type {
+  HTMLString,
+  UnstableListingBuckets,
+  UnstableListingModeOptions,
+} from "../types/common";
+import { classifyUnstableListings } from "../utils/unstable";
 import {
   type Cookie,
   type CookieConfig,
@@ -286,6 +291,7 @@ async function fetchHtml(
 ): Promise<{ html: HTMLString; responseUrl: string }> {
   const maxRetries = opts?.maxRetries ?? 3;
   const retryBaseMs = opts?.retryBaseMs ?? 500;
+  let lastRateLimitError: HttpError | null = null;
   for (let attempt = 0; attempt <= maxRetries; attempt++) {
     try {
@@ -321,12 +327,20 @@ async function fetchHtml(
 if (!res.ok) {
   // Respect 429 reset if provided
   if (res.status === 429) {
+    lastRateLimitError = new HttpError(
+      `Request failed with status ${res.status}`,
+      res.status,
+      url,
+    );
     const resetSeconds = rateLimitReset
       ? Number(rateLimitReset)
       : Number.NaN;
     const waitMs = Number.isFinite(resetSeconds)
       ? Math.max(0, resetSeconds * 1000)
       : (attempt + 1) * retryBaseMs;
+    if (attempt >= maxRetries) {
+      throw lastRateLimitError;
+    }
     await delay(waitMs);
     continue;
   }
@@ -356,12 +370,15 @@ async function fetchHtml(
await delay(DELAY_MS); await delay(DELAY_MS);
return { html, responseUrl: res.url || url }; return { html, responseUrl: res.url || url };
} catch (err) { } catch (err) {
if (err instanceof HttpError) {
throw err;
}
if (attempt >= maxRetries) throw err; if (attempt >= maxRetries) throw err;
await delay((attempt + 1) * retryBaseMs); await delay((attempt + 1) * retryBaseMs);
} }
} }
throw new Error("Exhausted retries without response"); throw lastRateLimitError ?? new Error("Exhausted retries without response");
} }
// ----------------------------- Parsing ----------------------------- // ----------------------------- Parsing -----------------------------
@@ -873,35 +890,25 @@ export function parseFacebookAds(
: priceObj.amount; : priceObj.amount;
cents = Math.round(dollars * 100); cents = Math.round(dollars * 100);
} else if (priceObj.amount_with_offset_in_currency != null) { } else if (priceObj.amount_with_offset_in_currency != null) {
// Fallback: try to extract cents from amount_with_offset_in_currency if (!priceObj.formatted_amount) continue;
// This appears to use some exchange rate/multiplier format
const encodedAmount = Number(priceObj.amount_with_offset_in_currency); const match = priceObj.formatted_amount.match(/[\d,]+\.?\d*/);
if (!Number.isNaN(encodedAmount) && encodedAmount > 0) { if (!match) continue;
// Estimate roughly - this field doesn't contain real cents
// Use formatted_amount to get the actual dollar amount const dollars = Number.parseFloat(match[0].replace(/,/g, ""));
if (priceObj.formatted_amount) { if (Number.isNaN(dollars)) continue;
const match = priceObj.formatted_amount.match(/[\d,]+\.?\d*/);
if (match) { cents = Math.round(dollars * 100);
const dollars = Number.parseFloat(match[0].replace(",", "")); } else if (
if (!Number.isNaN(dollars)) { typeof priceObj.formatted_amount === "string" &&
cents = Math.round(dollars * 100); priceObj.formatted_amount.toUpperCase() === "FREE"
} else { ) {
cents = encodedAmount; // fallback cents = 0;
}
} else {
cents = encodedAmount; // fallback
}
} else {
cents = encodedAmount; // fallback
}
} else {
continue; // Invalid price
}
} else { } else {
continue; // No price available continue; // No price available
} }
if (!Number.isFinite(cents) || cents <= 0) continue; if (!Number.isFinite(cents) || cents < 0) continue;
// Extract address from location data if available // Extract address from location data if available
const cityName = const cityName =
@@ -960,7 +967,9 @@ export function parseFacebookAds(
}; };
results.push(listingDetails); results.push(listingDetails);
} catch {} } catch (error) {
console.warn("Failed to parse Facebook ad:", error);
}
} }
return results; return results;
@@ -980,13 +989,13 @@ export function parseFacebookItem(
const url = `https://www.facebook.com/marketplace/item/${item.id}`; const url = `https://www.facebook.com/marketplace/item/${item.id}`;
// Extract price information // Extract price information
let cents = 0; let cents: number | undefined;
let currency = "CAD"; // Default let currency = "CAD"; // Default
let amountFormatted = item.formatted_price?.text || "FREE"; let amountFormatted = item.formatted_price?.text;
if (item.listing_price) { if (item.listing_price) {
currency = item.listing_price.currency || "CAD"; currency = item.listing_price.currency || "CAD";
if (item.listing_price.amount && item.listing_price.amount !== "0.00") { if (item.listing_price.amount != null) {
const amount = Number.parseFloat(item.listing_price.amount); const amount = Number.parseFloat(item.listing_price.amount);
if (!Number.isNaN(amount)) { if (!Number.isNaN(amount)) {
cents = Math.round(amount * 100); cents = Math.round(amount * 100);
@@ -1033,6 +1042,13 @@ export function parseFacebookItem(
listingType = "vehicle"; listingType = "vehicle";
} }
if (cents == null || !amountFormatted) {
if (!listingStatus || listingStatus === "ACTIVE") return null;
cents = 0;
amountFormatted = item.formatted_price?.text || "PRICE_UNAVAILABLE";
}
const listingDetails: FacebookListingDetails = { const listingDetails: FacebookListingDetails = {
url, url,
title, title,
@@ -1060,12 +1076,43 @@ export function parseFacebookItem(
// ----------------------------- Main ----------------------------- // ----------------------------- Main -----------------------------
export default async function fetchFacebookItems(
SEARCH_QUERY: string,
REQUESTS_PER_SECOND: number | undefined,
LOCATION: string | undefined,
MAX_ITEMS: number | undefined,
unstableMode: { hideUnstableResults: true },
): Promise<UnstableListingBuckets<FacebookListingDetails>>;
export default async function fetchFacebookItems(
SEARCH_QUERY: string,
REQUESTS_PER_SECOND?: number,
LOCATION?: string,
MAX_ITEMS?: number,
unstableMode?: UnstableListingModeOptions,
): Promise<FacebookListingDetails[]>;
export default async function fetchFacebookItems( export default async function fetchFacebookItems(
SEARCH_QUERY: string, SEARCH_QUERY: string,
REQUESTS_PER_SECOND = 1, REQUESTS_PER_SECOND = 1,
LOCATION = "toronto", LOCATION = "toronto",
MAX_ITEMS = 25, MAX_ITEMS = 25,
unstableMode: UnstableListingModeOptions = {},
) { ) {
const requestsPerSecond = REQUESTS_PER_SECOND > 0 ? REQUESTS_PER_SECOND : 1;
const finalizeResults = (
listings: FacebookListingDetails[],
): FacebookListingDetails[] | UnstableListingBuckets<FacebookListingDetails> => {
if (!unstableMode.hideUnstableResults) {
return listings.slice(0, MAX_ITEMS);
}
const classified = classifyUnstableListings(listings);
return {
results: classified.results.slice(0, MAX_ITEMS),
unstableResults: classified.unstableResults,
};
};
const cookies = await ensureFacebookCookies(); const cookies = await ensureFacebookCookies();
// Format cookies for HTTP header // Format cookies for HTTP header
@@ -1077,7 +1124,7 @@ export default async function fetchFacebookItems(
); );
} }
const DELAY_MS = Math.max(1, Math.floor(1000 / REQUESTS_PER_SECOND)); const DELAY_MS = Math.max(1, Math.floor(1000 / requestsPerSecond));
// Encode search query for URL // Encode search query for URL
const encodedQuery = encodeURIComponent(SEARCH_QUERY); const encodedQuery = encodeURIComponent(SEARCH_QUERY);
@@ -1114,7 +1161,7 @@ export default async function fetchFacebookItems(
"This might indicate invalid or expired cookies. Update FACEBOOK_COOKIE with a fresh raw Cookie header string.", "This might indicate invalid or expired cookies. Update FACEBOOK_COOKIE with a fresh raw Cookie header string.",
); );
} }
return []; return finalizeResults([]);
} }
throw err; throw err;
} }
@@ -1122,49 +1169,49 @@ export default async function fetchFacebookItems(
const classification = classifyFacebookResponse(searchHtml, searchResponseUrl); const classification = classifyFacebookResponse(searchHtml, searchResponseUrl);
if (classification.authGated) { if (classification.authGated) {
console.warn("Facebook marketplace search redirected to login. Cookies may be expired."); console.warn("Facebook marketplace search redirected to login. Cookies may be expired.");
return []; return finalizeResults([]);
} }
if (classification.unavailable) { if (classification.unavailable) {
console.warn("Facebook marketplace search returned an unavailable route."); console.warn("Facebook marketplace search returned an unavailable route.");
return []; return finalizeResults([]);
} }
if (classification.kind !== "search") { if (classification.kind !== "search") {
console.warn( console.warn(
`Facebook marketplace search returned unexpected route kind: ${classification.kind}.`, `Facebook marketplace search returned unexpected route kind: ${classification.kind}.`,
); );
return []; return finalizeResults([]);
} }
const ads = extractFacebookMarketplaceData(searchHtml); const ads = extractFacebookMarketplaceData(searchHtml);
if (!ads || ads.length === 0) { if (!ads || ads.length === 0) {
console.warn("No ads parsed from Facebook marketplace page."); console.warn("No ads parsed from Facebook marketplace page.");
return []; return finalizeResults([]);
} }
console.log(`\nFound ${ads.length} raw ads. Processing...`); console.log(`\nFound ${ads.length} raw ads. Processing...`);
const progressBar = new cliProgress.SingleBar( const isTTY = process.stdout?.isTTY ?? false;
{}, const progressBar = isTTY
cliProgress.Presets.shades_classic, ? new cliProgress.SingleBar({}, cliProgress.Presets.shades_classic)
); : null;
const totalProgress = ads.length; const totalProgress = ads.length;
const currentProgress = 0; progressBar?.start(totalProgress, 0);
progressBar.start(totalProgress, currentProgress);
const items = parseFacebookAds(ads); const items = parseFacebookAds(ads);
// Filter to only priced items (already done in parseFacebookAds) // Filter to only priced items (already done in parseFacebookAds)
const pricedItems = items.filter( const pricedItems = items.filter(
(item) => item.listingPrice?.cents && item.listingPrice.cents > 0, (item) =>
typeof item.listingPrice?.cents === "number" && item.listingPrice.cents >= 0,
); );
progressBar.update(totalProgress); progressBar?.update(totalProgress);
progressBar.stop(); progressBar?.stop();
console.log(`\nParsed ${pricedItems.length} Facebook marketplace listings.`); console.log(`\nParsed ${pricedItems.length} Facebook marketplace listings.`);
return pricedItems.slice(0, MAX_ITEMS); // Limit results return finalizeResults(pricedItems);
} }
/** /**
@@ -1250,13 +1297,15 @@ export async function fetchFacebookItem(
return null; return null;
} }
if (classification.unavailable || itemHtml.includes("This item has been sold")) { const itemData = extractFacebookItemData(itemHtml);
if (classification.unavailable && !itemData) {
logExtractionMetrics(false, itemId); logExtractionMetrics(false, itemId);
console.warn(`Item ${itemId} appears to be sold or removed from marketplace.`); console.warn(`Item ${itemId} appears to be sold or removed from marketplace.`);
return null; return null;
} }
if (classification.kind !== "item") { if (classification.kind !== "item" && !itemData) {
logExtractionMetrics(false, itemId); logExtractionMetrics(false, itemId);
console.warn( console.warn(
`Item ${itemId} returned unexpected route kind: ${classification.kind}.`, `Item ${itemId} returned unexpected route kind: ${classification.kind}.`,
@@ -1264,10 +1313,14 @@ export async function fetchFacebookItem(
return null; return null;
} }
const itemData = extractFacebookItemData(itemHtml);
if (!itemData) { if (!itemData) {
logExtractionMetrics(false, itemId); logExtractionMetrics(false, itemId);
if (itemHtml.includes("This item has been sold")) {
console.warn(`Item ${itemId} appears to be sold or removed from marketplace.`);
return null;
}
console.warn( console.warn(
`No item data found in Facebook marketplace page for item ${itemId}. This may indicate:`, `No item data found in Facebook marketplace page for item ${itemId}. This may indicate:`,
); );
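
Reviewer note on the `formatted_amount` change in `parseFacebookAds`: `replace(",", "")` with a string pattern removes only the first comma, so amounts of a million or more were truncated by `parseFloat`. The regex form with the `g` flag strips every separator. The sketch below folds the diff's parsing steps into one function; `formattedAmountToCents` is a hypothetical helper written for illustration, and it simplifies the branch order used in the scraper.

```typescript
// Hypothetical helper mirroring the parsing steps in the diff; not part of the scraper's API.
function formattedAmountToCents(formatted: string): number | undefined {
  // A literal "FREE" label maps to zero cents.
  if (formatted.toUpperCase() === "FREE") return 0;

  // Grab the first run of digits and thousands separators, with optional decimals.
  const match = formatted.match(/[\d,]+\.?\d*/);
  if (!match) return undefined;

  // Strip every comma, not just the first one.
  const dollars = Number.parseFloat(match[0].replace(/,/g, ""));
  return Number.isNaN(dollars) ? undefined : Math.round(dollars * 100);
}

// Old behaviour for comparison: only the first comma is removed,
// and parseFloat stops at the second one.
const firstCommaOnly = Number.parseFloat("1,234,567.89".replace(",", ""));

console.log(firstCommaOnly); // 1234
console.log(formattedAmountToCents("CA$1,234,567.89")); // 123456789
console.log(formattedAmountToCents("FREE")); // 0
console.log(formattedAmountToCents("Contact seller")); // undefined
```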


@@ -1,7 +1,12 @@
import cliProgress from "cli-progress"; import cliProgress from "cli-progress";
import { parseHTML } from "linkedom"; import { parseHTML } from "linkedom";
import unidecode from "unidecode"; import unidecode from "unidecode";
import type { HTMLString } from "../types/common"; import type {
HTMLString,
UnstableListingBuckets,
UnstableListingModeOptions,
} from "../types/common";
import { classifyUnstableListings } from "../utils/unstable";
import { import {
type CookieConfig, type CookieConfig,
formatCookiesForHeader, formatCookiesForHeader,
@@ -197,18 +202,37 @@ const SORT_MAPPINGS: Record<string, string> = {
distance: "DISTANCE", distance: "DISTANCE",
}; };
const LOCATION_SLUGS = Object.fromEntries(
Object.entries(LOCATION_MAPPINGS).map(([slug, id]) => [id, slug.replace(/\s+/g, "-")]),
) as Record<number, string>;
const CATEGORY_SLUGS = Object.fromEntries(
Object.entries(CATEGORY_MAPPINGS).map(([slug, id]) => [id, slug.replace(/\s+/g, "-")]),
) as Record<number, string>;
// ----------------------------- Utilities ----------------------------- // ----------------------------- Utilities -----------------------------
const SEPS = new Set([" ", "", "—", "/", ":", ";", ",", ".", "-"]); const SEPS = new Set([" ", "", "—", "/", ":", ";", ",", ".", "-"]);
function normalizeLookupKey(value: string): string {
return value.toLowerCase().replace(/[\s-]+/g, "-");
}
function centsToKijijiPriceParam(cents: number): number {
return Math.floor(cents / 100);
}
/** /**
* Resolve location ID from name or return numeric ID * Resolve location ID from name or return numeric ID
*/ */
export function resolveLocationId(location?: number | string): number { export function resolveLocationId(location?: number | string): number {
if (typeof location === "number") return location; if (typeof location === "number") return location;
if (typeof location === "string") { if (typeof location === "string") {
const normalized = location.toLowerCase().replace(/\s+/g, "-"); const normalized = normalizeLookupKey(location);
return LOCATION_MAPPINGS[normalized] ?? 0; // Default to Canada (0) const mapping = Object.entries(LOCATION_MAPPINGS).find(
([key]) => normalizeLookupKey(key) === normalized,
);
return mapping?.[1] ?? 0; // Default to Canada (0)
} }
return 0; // Default to Canada return 0; // Default to Canada
} }
@@ -219,12 +243,38 @@ export function resolveLocationId(location?: number | string): number {
export function resolveCategoryId(category?: number | string): number { export function resolveCategoryId(category?: number | string): number {
if (typeof category === "number") return category; if (typeof category === "number") return category;
if (typeof category === "string") { if (typeof category === "string") {
const normalized = category.toLowerCase().replace(/\s+/g, "-"); const normalized = normalizeLookupKey(category);
return CATEGORY_MAPPINGS[normalized] ?? 0; // Default to all categories const mapping = Object.entries(CATEGORY_MAPPINGS).find(
([key]) => normalizeLookupKey(key) === normalized,
);
return mapping?.[1] ?? 0; // Default to all categories
} }
return 0; // Default to all categories return 0; // Default to all categories
} }
function matchesPriceFilters(
listing: DetailedListing,
searchOptions: SearchOptions,
): boolean {
const cents = listing.listingPrice?.cents;
if (typeof cents !== "number") return false;
if (
typeof searchOptions.priceMin === "number" &&
cents < searchOptions.priceMin
) {
return false;
}
if (
typeof searchOptions.priceMax === "number" &&
cents > searchOptions.priceMax
) {
return false;
}
return true;
}
/** /**
* Build search URL with enhanced parameters * Build search URL with enhanced parameters
*/ */
@@ -236,23 +286,44 @@ export function buildSearchUrl(
const locationId = resolveLocationId(options.location); const locationId = resolveLocationId(options.location);
const categoryId = resolveCategoryId(options.category); const categoryId = resolveCategoryId(options.category);
const categorySlug = categoryId === 0 ? "buy-sell" : "buy-sell"; const categorySlug = CATEGORY_SLUGS[categoryId] ?? "buy-sell";
const locationSlug = locationId === 0 ? "canada" : "canada"; const locationSlug = LOCATION_SLUGS[locationId] ?? "canada";
let url = `${BASE_URL}/b-${categorySlug}/${locationSlug}/${slugify(keywords)}/k0c${categoryId}l${locationId}`; let url = `${BASE_URL}/b-${categorySlug}/${locationSlug}/${slugify(keywords)}/k0c${categoryId}l${locationId}`;
const sortParam = options.sortBy const sortValue =
? `&sort=${SORT_MAPPINGS[options.sortBy]}` options.sortBy && options.sortBy !== "relevancy"
: ""; ? SORT_MAPPINGS[options.sortBy]
: "relevancyDesc";
const sortOrder = options.sortOrder === "asc" ? "ASC" : "DESC"; const sortOrder = options.sortOrder === "asc" ? "ASC" : "DESC";
const priceMinParam =
typeof options.priceMin === "number"
? `&priceMin=${centsToKijijiPriceParam(options.priceMin)}`
: "";
const priceMaxParam =
typeof options.priceMax === "number"
? `&priceMax=${centsToKijijiPriceParam(options.priceMax)}`
: "";
const pageParam = const pageParam =
options.page && options.page > 1 ? `&page=${options.page}` : ""; options.page && options.page > 1 ? `&page=${options.page}` : "";
url += `?sort=relevancyDesc&view=list${sortParam}&order=${sortOrder}${pageParam}`; url += `?sort=${sortValue}&view=list&order=${sortOrder}${priceMinParam}${priceMaxParam}${pageParam}`;
return url; return url;
} }
function findApolloListingKey(
apolloState: ApolloRecord,
predicate: (value: Record<string, unknown>) => boolean,
): string | undefined {
return Object.keys(apolloState).find((key) => {
if (!key.startsWith("Listing:")) return false;
const value = apolloState[key];
return isRecord(value) && predicate(value);
});
}
/** /**
* Slugifies a string for Kijiji search URLs * Slugifies a string for Kijiji search URLs
*/ */
@@ -391,18 +462,16 @@ async function fetchSellerDetails(
accountType?: string; accountType?: string;
}> { }> {
try { try {
const [reviewData, profileData] = await Promise.all([ const reviewData = await fetchGraphQLData(
fetchGraphQLData( GRAPHQL_QUERIES.getReviewSummary,
GRAPHQL_QUERIES.getReviewSummary, { userId: posterId },
{ userId: posterId }, BASE_URL,
BASE_URL, );
), const profileData = await fetchGraphQLData(
fetchGraphQLData( GRAPHQL_QUERIES.getProfileMetrics,
GRAPHQL_QUERIES.getProfileMetrics, { profileId: posterId },
{ profileId: posterId }, BASE_URL,
BASE_URL, );
),
]);
const reviewResponse = reviewData as GraphQLReviewResponse; const reviewResponse = reviewData as GraphQLReviewResponse;
const profileResponse = profileData as GraphQLProfileResponse; const profileResponse = profileData as GraphQLProfileResponse;
@@ -457,8 +526,7 @@ export function parseSearch(
const results: SearchListing[] = []; const results: SearchListing[] = [];
for (const [key, value] of Object.entries(apolloState)) { for (const [key, value] of Object.entries(apolloState)) {
// Heuristic: Kijiji listing keys usually contain "Listing" if (!key.startsWith("Listing:")) continue;
if (!key.includes("Listing")) continue;
if (!isRecord(value)) continue; if (!isRecord(value)) continue;
const item = value as ApolloSearchItem; const item = value as ApolloSearchItem;
@@ -484,9 +552,9 @@ function _parseListing(
const apolloState = extractApolloState(htmlString); const apolloState = extractApolloState(htmlString);
if (!apolloState) return null; if (!apolloState) return null;
// Find the listing root key const listingKey = findApolloListingKey(
const listingKey = Object.keys(apolloState).find((k) => apolloState,
k.includes("Listing"), (value) => typeof value.url === "string" && typeof value.title === "string",
); );
if (!listingKey) return null; if (!listingKey) return null;
@@ -557,9 +625,12 @@ export async function parseDetailedListing(
const apolloState = extractApolloState(htmlString); const apolloState = extractApolloState(htmlString);
if (!apolloState) return null; if (!apolloState) return null;
// Find the listing root key const listingKey = findApolloListingKey(
const listingKey = Object.keys(apolloState).find((k) => apolloState,
k.includes("Listing"), (value) =>
typeof value.url === "string" &&
typeof value.title === "string" &&
isRecord(value.price),
); );
if (!listingKey) return null; if (!listingKey) return null;
@@ -696,14 +767,43 @@ export async function parseDetailedListing(
// ----------------------------- Main ----------------------------- // ----------------------------- Main -----------------------------
export default async function fetchKijijiItems(
SEARCH_QUERY: string,
REQUESTS_PER_SECOND: number | undefined,
BASE_URL: string | undefined,
searchOptions: SearchOptions | undefined,
listingOptions: ListingFetchOptions | undefined,
unstableMode: { hideUnstableResults: true },
): Promise<UnstableListingBuckets<DetailedListing>>;
export default async function fetchKijijiItems(
SEARCH_QUERY: string,
REQUESTS_PER_SECOND?: number,
BASE_URL?: string,
searchOptions?: SearchOptions,
listingOptions?: ListingFetchOptions,
unstableMode?: UnstableListingModeOptions,
): Promise<DetailedListing[]>;
export default async function fetchKijijiItems( export default async function fetchKijijiItems(
SEARCH_QUERY: string, SEARCH_QUERY: string,
REQUESTS_PER_SECOND = 1, REQUESTS_PER_SECOND = 1,
BASE_URL = "https://www.kijiji.ca", BASE_URL = "https://www.kijiji.ca",
searchOptions: SearchOptions = {}, searchOptions: SearchOptions = {},
listingOptions: ListingFetchOptions = {}, listingOptions: ListingFetchOptions = {},
unstableMode: UnstableListingModeOptions = {},
) { ) {
const DELAY_MS = Math.max(1, Math.floor(1000 / REQUESTS_PER_SECOND)); const requestsPerSecond = REQUESTS_PER_SECOND > 0 ? REQUESTS_PER_SECOND : 1;
const finalizeResults = (
listings: DetailedListing[],
): DetailedListing[] | UnstableListingBuckets<DetailedListing> => {
if (!unstableMode.hideUnstableResults) {
return listings;
}
return classifyUnstableListings(listings);
};
const DELAY_MS = Math.max(1, Math.floor(1000 / requestsPerSecond));
// Load Kijiji cookies (optional - helps bypass bot detection) // Load Kijiji cookies (optional - helps bypass bot detection)
const cookies = await loadCookiesOptional( const cookies = await loadCookiesOptional(
@@ -716,15 +816,18 @@ export default async function fetchKijijiItems(
: undefined; : undefined;
// Set defaults for configuration // Set defaults for configuration
const finalSearchOptions: Required<SearchOptions> = { const finalSearchOptions: Omit<Required<SearchOptions>, "priceMin" | "priceMax"> & {
priceMin?: number;
priceMax?: number;
} = {
location: searchOptions.location ?? 1700272, // Default to GTA location: searchOptions.location ?? 1700272, // Default to GTA
category: searchOptions.category ?? 0, // Default to all categories category: searchOptions.category ?? 0, // Default to all categories
keywords: searchOptions.keywords ?? SEARCH_QUERY, keywords: searchOptions.keywords ?? SEARCH_QUERY,
sortBy: searchOptions.sortBy ?? "relevancy", sortBy: searchOptions.sortBy ?? "relevancy",
sortOrder: searchOptions.sortOrder ?? "desc", sortOrder: searchOptions.sortOrder ?? "desc",
maxPages: searchOptions.maxPages ?? 5, // Default to 5 pages maxPages: searchOptions.maxPages ?? 5, // Default to 5 pages
priceMin: searchOptions.priceMin as number, priceMin: searchOptions.priceMin,
priceMax: searchOptions.priceMax as number, priceMax: searchOptions.priceMax,
cookies: searchOptions.cookies ?? "", cookies: searchOptions.cookies ?? "",
}; };
@@ -792,15 +895,19 @@ export default async function fetchKijijiItems(
progressBar?.start(totalProgress, currentProgress); progressBar?.start(totalProgress, currentProgress);
// Process in batches for controlled concurrency // Process in batches for controlled concurrency
const CONCURRENT_REQUESTS = REQUESTS_PER_SECOND * 2; // 2x rate for faster processing const CONCURRENT_REQUESTS = Math.max(1, Math.floor(requestsPerSecond));
const results: (DetailedListing | null)[] = []; const results: (DetailedListing | null)[] = [];
for (let i = 0; i < newListingLinks.length; i += CONCURRENT_REQUESTS) { for (let i = 0; i < newListingLinks.length; i += CONCURRENT_REQUESTS) {
const batch = newListingLinks.slice(i, i + CONCURRENT_REQUESTS); const batch = newListingLinks.slice(i, i + CONCURRENT_REQUESTS);
const batchPromises = batch.map(async (link) => { const batchPromises = batch.map(async (link, batchIndex) => {
try { try {
if (batchIndex > 0) {
await new Promise((resolve) => setTimeout(resolve, DELAY_MS * batchIndex));
}
const html = await fetchHtml(link, 0, { const html = await fetchHtml(link, 0, {
// No per-request delay, batch handles rate limit // Staggered starts keep request pacing within REQUESTS_PER_SECOND.
onRateInfo: (remaining, reset) => { onRateInfo: (remaining, reset) => {
if (remaining && reset) { if (remaining && reset) {
console.log( console.log(
@@ -839,12 +946,10 @@ export default async function fetchKijijiItems(
const batchResults = await Promise.all(batchPromises); const batchResults = await Promise.all(batchPromises);
results.push(...batchResults); results.push(...batchResults);
// Wait between batches to respect rate limit
if (i + CONCURRENT_REQUESTS < newListingLinks.length) { if (i + CONCURRENT_REQUESTS < newListingLinks.length) {
await new Promise((resolve) => await new Promise((resolve) => setTimeout(resolve, DELAY_MS));
setTimeout(resolve, DELAY_MS * batch.length),
);
} }
} }
allListings.push( allListings.push(
@@ -859,8 +964,14 @@ export default async function fetchKijijiItems(
} }
} }
console.log(`\nParsed ${allListings.length} detailed listings.`); const filteredListings = allListings.filter((listing) =>
return allListings; matchesPriceFilters(listing, finalSearchOptions),
);
console.log(
`\nParsed ${filteredListings.length} detailed listings.`,
);
return finalizeResults(filteredListings);
} }
// Re-export error classes for convenience // Re-export error classes for convenience
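
Reviewer note on the lookup changes in this file: `normalizeLookupKey` collapses any run of whitespace or hyphens into a single hyphen on both the input and the table keys, so `"Nova Scotia"`, `"nova-scotia"` and `"NOVA - SCOTIA"` all resolve to the same entry, and the reversed `LOCATION_SLUGS` table lets `buildSearchUrl` emit a real slug instead of the hard-coded `canada`. A minimal sketch, assuming a two-entry table (the real `LOCATION_MAPPINGS` is not shown in this diff, and the ID `9002` is made up for illustration):

```typescript
// Hypothetical subset of the name-to-ID table; 9002 is an invented ID.
const LOCATION_MAPPINGS: Record<string, number> = {
  gta: 1700272,
  "nova scotia": 9002,
};

function normalizeLookupKey(value: string): string {
  return value.toLowerCase().replace(/[\s-]+/g, "-");
}

function resolveLocationId(location?: number | string): number {
  if (typeof location === "number") return location;
  if (typeof location === "string") {
    const normalized = normalizeLookupKey(location);
    // Normalize table keys too, so spaced and hyphenated spellings both match.
    const mapping = Object.entries(LOCATION_MAPPINGS).find(
      ([key]) => normalizeLookupKey(key) === normalized,
    );
    return mapping?.[1] ?? 0; // 0 is the "all of Canada" default in the diff
  }
  return 0;
}

// Reverse table: ID -> URL slug, as built in the diff.
const LOCATION_SLUGS = Object.fromEntries(
  Object.entries(LOCATION_MAPPINGS).map(([name, id]) => [
    id,
    name.replace(/\s+/g, "-"),
  ]),
) as Record<number, string>;

console.log(resolveLocationId("Nova Scotia")); // 9002
console.log(resolveLocationId("NOVA - SCOTIA")); // 9002
console.log(resolveLocationId("atlantis")); // 0
console.log(LOCATION_SLUGS[9002] ?? "canada"); // "nova-scotia"
```

If two names ever share an ID, the reversed table keeps whichever entry comes last, which may be worth checking against the full mapping.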


@@ -18,3 +18,12 @@ export interface ListingDetails {
address?: string | null; address?: string | null;
creationDate?: string; creationDate?: string;
} }
export interface UnstableListingBuckets<T> {
results: T[];
unstableResults: T[];
}
export interface UnstableListingModeOptions {
hideUnstableResults?: boolean;
}
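
Reviewer note on how these two types are used by the scraper overloads: passing the literal `{ hideUnstableResults: true }` selects the overload that returns `UnstableListingBuckets<T>`, and every other call falls through to the plain array overload. A minimal sketch of the pattern, assuming a stand-in function (`finalize` is hypothetical and its classification step is a placeholder):

```typescript
interface UnstableListingBuckets<T> {
  results: T[];
  unstableResults: T[];
}

interface UnstableListingModeOptions {
  hideUnstableResults?: boolean;
}

// Hypothetical stand-in for a scraper entry point; overloads follow the diff's pattern.
function finalize<T>(
  listings: T[],
  mode: { hideUnstableResults: true },
): UnstableListingBuckets<T>;
function finalize<T>(listings: T[], mode?: UnstableListingModeOptions): T[];
function finalize<T>(
  listings: T[],
  mode: UnstableListingModeOptions = {},
): T[] | UnstableListingBuckets<T> {
  if (!mode.hideUnstableResults) return listings;
  // Placeholder classification: everything is treated as stable here.
  return { results: listings, unstableResults: [] };
}

const plain = finalize([1, 2, 3]); // typed as number[]
const bucketed = finalize([1, 2, 3], { hideUnstableResults: true }); // typed as UnstableListingBuckets<number>

console.log(Array.isArray(plain)); // true
console.log(bucketed.results.length, bucketed.unstableResults.length); // 3 0
```

One consequence worth checking in callers: an options object whose flag is typed as plain `boolean` (for example one built from a request parameter) matches the second overload, so the static type says `T[]` even when the runtime value is a bucket object. The routes in this PR appear to address this with explicit conditional calls.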


@@ -0,0 +1,46 @@
import type { ListingDetails, UnstableListingBuckets } from "../types/common";
function getMedian(values: number[]): number {
const middleIndex = Math.floor(values.length / 2);
if (values.length % 2 === 0) {
return (values[middleIndex - 1] + values[middleIndex]) / 2;
}
return values[middleIndex];
}
export function classifyUnstableListings<T extends ListingDetails>(
listings: T[],
): UnstableListingBuckets<T> {
const validPrices = listings
.map((listing) => listing.listingPrice.cents)
.filter((price) => Number.isFinite(price) && price > 0)
.sort((left, right) => left - right);
if (validPrices.length < 2) {
return {
results: [...listings],
unstableResults: [],
};
}
const threshold = getMedian(validPrices) * 0.8;
const buckets: UnstableListingBuckets<T> = {
results: [],
unstableResults: [],
};
for (const listing of listings) {
const price = listing.listingPrice.cents;
if (Number.isFinite(price) && price > 0 && price < threshold) {
buckets.unstableResults.push(listing);
continue;
}
buckets.results.push(listing);
}
return buckets;
}
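
Reviewer note with a worked example of the classifier above: the threshold is 80% of the median of the positive prices, and only positive prices below it are moved to `unstableResults`. Free listings (0 cents) are excluded from the median and always stay in `results`. The sketch restates the diff's logic against a reduced listing shape; `MiniListing` and the sample prices are hypothetical.

```typescript
// Hypothetical reduced listing shape; the real ListingDetails has more fields.
type MiniListing = { title: string; listingPrice: { cents: number } };

function getMedian(sorted: number[]): number {
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

function classify(listings: MiniListing[]) {
  // Only finite, positive prices take part in the median.
  const validPrices = listings
    .map((l) => l.listingPrice.cents)
    .filter((p) => Number.isFinite(p) && p > 0)
    .sort((a, b) => a - b);

  // With fewer than two usable prices there is nothing to compare against.
  if (validPrices.length < 2) {
    return { results: [...listings], unstableResults: [] as MiniListing[] };
  }

  const threshold = getMedian(validPrices) * 0.8;
  const results: MiniListing[] = [];
  const unstableResults: MiniListing[] = [];
  for (const listing of listings) {
    const price = listing.listingPrice.cents;
    const isUnstable = Number.isFinite(price) && price > 0 && price < threshold;
    (isUnstable ? unstableResults : results).push(listing);
  }
  return { results, unstableResults };
}

const sample: MiniListing[] = [
  { title: "A", listingPrice: { cents: 10000 } },
  { title: "B", listingPrice: { cents: 12000 } },
  { title: "C", listingPrice: { cents: 11000 } },
  { title: "D", listingPrice: { cents: 5000 } },
  { title: "E (free)", listingPrice: { cents: 0 } },
];

// Positive prices sorted: 5000, 10000, 11000, 12000 -> median 10500 -> threshold 8400.
const buckets = classify(sample);
console.log(buckets.results.map((l) => l.title)); // ["A", "B", "C", "E (free)"]
console.log(buckets.unstableResults.map((l) => l.title)); // ["D"]
```

Because the Facebook scraper applies `MAX_ITEMS` only to `results` after classification, `unstableResults` can be longer than the requested limit; the Kijiji wrapper shown earlier applies no limit to either bucket.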


@@ -1,5 +1,28 @@
import { afterEach, beforeEach, describe, expect, mock, test } from "bun:test"; import { afterEach, beforeEach, describe, expect, mock, test } from "bun:test";
import type { EbayListingDetails } from "../src/scrapers/ebay";
import fetchEbayItems from "../src/scrapers/ebay"; import fetchEbayItems from "../src/scrapers/ebay";
import type { UnstableListingBuckets } from "../src/types/common";
type Assert<T extends true> = T;
type IsExact<T, U> =
(<G>() => G extends T ? 1 : 2) extends <G>() => G extends U ? 1 : 2
? (<G>() => G extends U ? 1 : 2) extends <G>() => G extends T ? 1 : 2
? true
: false
: false;
const getDefaultEbayItems = async () => fetchEbayItems("laptop");
const getUnstableEbayItems = async () =>
fetchEbayItems("laptop", 1000, {}, { hideUnstableResults: true });
type _EbayDefaultReturn = Assert<
IsExact<Awaited<ReturnType<typeof getDefaultEbayItems>>, EbayListingDetails[]>
>;
type _EbayUnstableReturn = Assert<
IsExact<
Awaited<ReturnType<typeof getUnstableEbayItems>>,
UnstableListingBuckets<EbayListingDetails>
>
>;
const originalFetch = global.fetch; const originalFetch = global.fetch;
const originalWarn = console.warn; const originalWarn = console.warn;
@@ -24,9 +47,7 @@ describe("eBay Scraper Cookie Handling", () => {
const warnMock = mock(() => {}); const warnMock = mock(() => {});
console.warn = warnMock; console.warn = warnMock;
await fetchEbayItems("laptop", 1000, { await fetchEbayItems("laptop", 1000);
cookies: "s=from-request",
});
expect(global.fetch).toHaveBeenCalledTimes(1); expect(global.fetch).toHaveBeenCalledTimes(1);
@@ -38,4 +59,579 @@ describe("eBay Scraper Cookie Handling", () => {
"No valid eBay cookies found in EBAY_COOKIE. eBay may block requests without a raw Cookie header string.", "No valid eBay cookies found in EBAY_COOKIE. eBay may block requests without a raw Cookie header string.",
); );
}); });
test("keeps relative item links on the ebay.ca host", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">CA $100.00</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000);
expect(results).toEqual([
expect.objectContaining({ url: "https://www.ebay.ca/itm/123" }),
]);
});
test("deduplicates repeated item links from the same card", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"><span>Open</span></a>
<a href="/itm/123"><span>Image</span></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">CA $100.00</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000);
expect(results).toHaveLength(1);
expect(results[0]).toEqual(
expect.objectContaining({ url: "https://www.ebay.ca/itm/123" }),
);
});
test("deduplicates tracking variants of the same item URL", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123?_trkparms=foo"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">CA $100.00</span>
</li>
<li class="s-item">
<a href="https://www.ebay.ca/itm/123?hash=item123"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">CA $100.00</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000);
expect(results).toHaveLength(1);
expect(results[0]).toEqual(
expect.objectContaining({ url: "https://www.ebay.ca/itm/123?_trkparms=foo" }),
);
});
test("deduplicates tracking variants of SEO-style item URLs", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/title-slug/1234567890?_trkparms=foo"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">CA $100.00</span>
</li>
<li class="s-item">
<a href="https://www.ebay.ca/itm/title-slug/1234567890?hash=item123"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">CA $100.00</span>
</li>
<li class="s-item">
<a href="https://www.ebay.ca/itm/title-slug/9999999999?hash=item999"></a>
<h3>Another Laptop Bundle</h3>
<span class="s-item__price">CA $110.00</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000);
expect(results).toHaveLength(2);
expect(results[0]).toEqual(
expect.objectContaining({
url: "https://www.ebay.ca/itm/title-slug/1234567890?_trkparms=foo",
}),
);
expect(results[1]).toEqual(
expect.objectContaining({
url: "https://www.ebay.ca/itm/title-slug/9999999999?hash=item999",
}),
);
});
test("treats bare dollar prices as CAD on ebay.ca", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">$100.00</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000);
expect(results).toEqual([
expect.objectContaining({
listingPrice: expect.objectContaining({ currency: "CAD" }),
}),
]);
});
test("treats US dollar prices as USD", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">US $123.45</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000);
expect(results).toEqual([
expect.objectContaining({
listingPrice: expect.objectContaining({ currency: "USD", cents: 12345 }),
}),
]);
});
test("treats US dollar prices without space as USD", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">US$123.45</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000);
expect(results).toEqual([
expect.objectContaining({
listingPrice: expect.objectContaining({ currency: "USD", cents: 12345 }),
}),
]);
});
test("maps pound prices to GBP", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">£123.45</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000);
expect(results).toEqual([
expect.objectContaining({
listingPrice: expect.objectContaining({ currency: "GBP", cents: 12345 }),
}),
]);
});
test("maps euro and yen prices to the matching currency labels", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"></a>
<h3>Euro Bundle</h3>
<span class="s-item__price">€123.45</span>
</li>
<li class="s-item">
<a href="/itm/456"></a>
<h3>Yen Bundle</h3>
<span class="s-item__price">¥123</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("bundle", 1000, {
keywords: ["bundle"],
});
expect(results).toEqual([
expect.objectContaining({
listingPrice: expect.objectContaining({ currency: "EUR", cents: 12345 }),
}),
expect.objectContaining({
listingPrice: expect.objectContaining({ currency: "JPY", cents: 12300 }),
}),
]);
});
test("prefers the discounted Canadian-formatted price", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">
<s>CA $150.00</s>
<span>CA $100.00</span>
</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000);
expect(results).toEqual([
expect.objectContaining({
listingPrice: expect.objectContaining({
amountFormatted: "CA $100.00",
cents: 10000,
}),
}),
]);
});
test("prefers discounted Canadian prices that contain four consecutive digits", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">
<s>CA $1500.00</s>
<span>CA $1000.00</span>
</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000);
expect(results).toEqual([
expect.objectContaining({
listingPrice: expect.objectContaining({
amountFormatted: "CA $1000.00",
cents: 100000,
}),
}),
]);
});
test("prefers discounted US dollar prices over original prices", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">
<s>US $150.00</s>
<span>US $100.00</span>
</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000);
expect(results).toEqual([
expect.objectContaining({
listingPrice: expect.objectContaining({
amountFormatted: "US $100.00",
cents: 10000,
currency: "USD",
}),
}),
]);
});
test("keeps short titles that were not shortened by UI cleaning", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"></a>
<h3>Free Bike</h3>
<span class="s-item__price">CA $0.00</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("bike", 1000);
expect(results).toEqual([
expect.objectContaining({
title: "Free Bike",
listingPrice: expect.objectContaining({ cents: 0, currency: "CAD" }),
}),
]);
});
test("accepts higher fallback prices without price classes", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"></a>
<h3>Studio Microphone Bundle</h3>
<div>CA $2500.00</div>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("microphone", 1000, {
keywords: ["microphone"],
});
expect(results).toEqual([
expect.objectContaining({
title: "Studio Microphone Bundle",
listingPrice: expect.objectContaining({
amountFormatted: "CA $2500.00",
cents: 250000,
}),
}),
]);
});
test("retains free items when the requested price range includes zero", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="/itm/123"></a>
<h3>Free Laptop Bundle</h3>
<span class="s-item__price">$0.00</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000, {
minPrice: 0,
maxPrice: 0,
});
expect(results).toEqual([
expect.objectContaining({
title: "Free Laptop Bundle",
listingPrice: expect.objectContaining({ cents: 0 }),
}),
]);
});
test("returns results and unstableResults when unstable mode is enabled", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="https://www.ebay.ca/itm/1"></a>
<h3>Stable Laptop Bundle</h3>
<span class="s-item__price">CA $100.00</span>
</li>
<li class="s-item">
<a href="https://www.ebay.ca/itm/2"></a>
<h3>Another Laptop Bundle</h3>
<span class="s-item__price">CA $110.00</span>
</li>
<li class="s-item">
<a href="https://www.ebay.ca/itm/3"></a>
<h3>Cheap Laptop Bundle</h3>
<span class="s-item__price">CA $70.00</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems(
"laptop",
1000,
{},
{ hideUnstableResults: true },
);
expect(results).toEqual({
results: [
expect.objectContaining({ title: "Stable Laptop Bundle" }),
expect.objectContaining({ title: "Another Laptop Bundle" }),
],
unstableResults: [
expect.objectContaining({ title: "Cheap Laptop Bundle" }),
],
});
});
test("respects maxItems in default mode", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="https://www.ebay.ca/itm/1"></a>
<h3>First Bundle</h3>
<span class="s-item__price">CA $100.00</span>
</li>
<li class="s-item">
<a href="https://www.ebay.ca/itm/2"></a>
<h3>Second Bundle</h3>
<span class="s-item__price">CA $110.00</span>
</li>
<li class="s-item">
<a href="https://www.ebay.ca/itm/3"></a>
<h3>Third Bundle</h3>
<span class="s-item__price">CA $70.00</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems("laptop", 1000, { maxItems: 2 });
expect(results).toHaveLength(2);
expect(results[0]).toEqual(
expect.objectContaining({ title: "First Bundle" }),
);
expect(results[1]).toEqual(
expect.objectContaining({ title: "Second Bundle" }),
);
});
test("respects maxItems in unstable mode", async () => {
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () =>
Promise.resolve(`
<html><body>
<li class="s-item">
<a href="https://www.ebay.ca/itm/1"></a>
<h3>First Bundle</h3>
<span class="s-item__price">CA $100.00</span>
</li>
<li class="s-item">
<a href="https://www.ebay.ca/itm/2"></a>
<h3>Second Bundle</h3>
<span class="s-item__price">CA $110.00</span>
</li>
<li class="s-item">
<a href="https://www.ebay.ca/itm/3"></a>
<h3>Third Bundle</h3>
<span class="s-item__price">CA $70.00</span>
</li>
</body></html>
`),
}),
) as typeof fetch;
const results = await fetchEbayItems(
"laptop",
1000,
{ maxItems: 2 },
{ hideUnstableResults: true },
);
expect(results.results).toHaveLength(2);
expect(results.unstableResults).toHaveLength(0);
expect(results.results[0]).toEqual(
expect.objectContaining({ title: "First Bundle" }),
);
expect(results.results[1]).toEqual(
expect.objectContaining({ title: "Second Bundle" }),
);
});
}); });
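Review note: the currency tests above pin down a mapping from price prefixes to currency codes. Bare `$` falls back to the site default (CAD on ebay.ca), `US $` and `US$` map to USD, and `£`, `€`, `¥` map to GBP, EUR, JPY. The sketch below shows one way such a mapping could work. `parseEbayPrice`, `ParsedPrice`, and `CurrencyCode` are hypothetical names for illustration, not the scraper's actual API, and storing yen as `amount * 100` is an assumption drawn only from the `cents: 12300` expectation.

```typescript
// Hypothetical sketch; the real scraper's parsing logic is not shown in this diff.
type CurrencyCode = "CAD" | "USD" | "GBP" | "EUR" | "JPY";

interface ParsedPrice {
  currency: CurrencyCode;
  cents: number;
}

function parseEbayPrice(
  text: string,
  siteDefault: CurrencyCode = "CAD", // assumption: ebay.ca is the default site
): ParsedPrice | null {
  const trimmed = text.trim();

  // Grab the first numeric run, allowing thousands separators and decimals.
  const amountMatch = trimmed.match(/\d[\d,]*(?:\.\d+)?/);
  if (!amountMatch) return null;
  const amount = Number(amountMatch[0].replace(/,/g, ""));
  if (!Number.isFinite(amount)) return null;

  let currency: CurrencyCode;
  if (/^US\s*\$/i.test(trimmed)) currency = "USD"; // "US $1" and "US$1"
  else if (/^CA\s*\$/i.test(trimmed)) currency = "CAD";
  else if (trimmed.startsWith("£")) currency = "GBP";
  else if (trimmed.startsWith("€")) currency = "EUR";
  else if (trimmed.startsWith("¥")) currency = "JPY";
  else currency = siteDefault; // bare "$" or no recognised prefix

  // Math.round guards against floating-point drift such as 123.45 * 100.
  return { currency, cents: Math.round(amount * 100) };
}
```

Checking the explicit prefixes before the bare `$` fallback keeps `US$123.45` from being read as CAD.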


@@ -1,18 +1,46 @@
import { afterEach, beforeEach, describe, expect, mock, test } from "bun:test"; import { afterEach, beforeEach, describe, expect, mock, test } from "bun:test";
import cliProgress from "cli-progress";
import { import {
classifyFacebookResponse, classifyFacebookResponse,
type FacebookListingDetails,
ensureFacebookCookies, ensureFacebookCookies,
extractFacebookBootstrapCandidates, extractFacebookBootstrapCandidates,
extractFacebookItemData, extractFacebookItemData,
extractFacebookMarketplaceData, extractFacebookMarketplaceData,
default as fetchFacebookItems,
fetchFacebookItem, fetchFacebookItem,
parseFacebookAds, parseFacebookAds,
parseFacebookCookieString, parseFacebookCookieString,
parseFacebookItem, parseFacebookItem,
} from "../src/scrapers/facebook"; } from "../src/scrapers/facebook";
import type { UnstableListingBuckets } from "../src/types/common";
import { formatCookiesForHeader } from "../src/utils/cookies"; import { formatCookiesForHeader } from "../src/utils/cookies";
import { formatCentsToCurrency } from "../src/utils/format"; import { formatCentsToCurrency } from "../src/utils/format";
const originalStdoutIsTTY = process.stdout.isTTY;
type Assert<T extends true> = T;
type IsExact<T, U> =
(<G>() => G extends T ? 1 : 2) extends <G>() => G extends U ? 1 : 2
? (<G>() => G extends U ? 1 : 2) extends <G>() => G extends T ? 1 : 2
? true
: false
: false;
const getDefaultFacebookItems = async () => fetchFacebookItems("chair");
const getUnstableFacebookItems = async (): Promise<
UnstableListingBuckets<FacebookListingDetails>
> => fetchFacebookItems("chair", 1, "toronto", 25, { hideUnstableResults: true });
type _FacebookDefaultReturn = Assert<
IsExact<Awaited<ReturnType<typeof getDefaultFacebookItems>>, FacebookListingDetails[]>
>;
type _FacebookUnstableReturn = Assert<
IsExact<
Awaited<ReturnType<typeof getUnstableFacebookItems>>,
UnstableListingBuckets<FacebookListingDetails>
>
>;
// Mock fetch globally // Mock fetch globally
const originalFetch = global.fetch; const originalFetch = global.fetch;
@@ -25,6 +53,7 @@ describe("Facebook Marketplace Scraper Core Tests", () => {
afterEach(() => { afterEach(() => {
global.fetch = originalFetch; global.fetch = originalFetch;
process.stdout.isTTY = originalStdoutIsTTY;
}); });
describe("Cookie Parsing", () => { describe("Cookie Parsing", () => {
@@ -162,6 +191,7 @@ describe("Facebook Marketplace Scraper Core Tests", () => {
try { try {
const result = await fetchFacebookItem("123"); const result = await fetchFacebookItem("123");
expect(result).toBeNull(); expect(result).toBeNull();
expect(global.fetch).toHaveBeenCalledTimes(1);
expect(warnMock).toHaveBeenCalledWith( expect(warnMock).toHaveBeenCalledWith(
"Authentication error: Invalid or expired cookies. Update FACEBOOK_COOKIE with a fresh raw Cookie header string.", "Authentication error: Invalid or expired cookies. Update FACEBOOK_COOKIE with a fresh raw Cookie header string.",
); );
@@ -247,6 +277,30 @@ describe("Facebook Marketplace Scraper Core Tests", () => {
// Should eventually succeed after retry // Should eventually succeed after retry
}); });
test("should handle exhausted rate limiting retries as a 429", async () => {
let attempts = 0;
global.fetch = mock(() => {
attempts++;
return Promise.resolve({
ok: false,
status: 429,
headers: {
get: (header: string) => {
if (header === "X-RateLimit-Reset") return "0";
return null;
},
},
text: () => Promise.resolve("Rate limited"),
});
});
const result = await fetchFacebookItem("429-loop");
expect(result).toBeNull();
expect(attempts).toBe(4);
});
test("should handle sold items", async () => { test("should handle sold items", async () => {
const mockData = { const mockData = {
require: [ require: [
@@ -294,6 +348,101 @@ describe("Facebook Marketplace Scraper Core Tests", () => {
expect(result?.listingStatus).toBe("SOLD"); expect(result?.listingStatus).toBe("SOLD");
}); });
test("should still parse sold items when structured data exists", async () => {
const soldStructuredHtml = `
<html><body>
<div>This item has been sold</div>
<script>"XCometMarketplacePermalinkController"</script>
<script>
${JSON.stringify({
payload: {
listing: {
id: "457",
__typename: "GroupCommerceProductItem",
marketplace_listing_title: "Structured Sold Item",
formatted_price: { text: "CA$90" },
listing_price: {
amount: "90.00",
currency: "CAD",
amount_with_offset: "90.00",
},
is_sold: true,
is_live: false,
},
},
})}
</script>
</body></html>
`;
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () => Promise.resolve(soldStructuredHtml),
url: "https://www.facebook.com/marketplace/item/457/",
headers: {
get: () => null,
},
}),
);
const result = await fetchFacebookItem("457");
expect(result).toEqual(
expect.objectContaining({
title: "Structured Sold Item",
listingStatus: "SOLD",
}),
);
});
test("should parse structured data even when an unavailable banner is present", async () => {
const unavailableStructuredHtml = `
<html><body>
<div>This listing is no longer available.</div>
<script>"XCometMarketplacePermalinkController"</script>
<script>
${JSON.stringify({
payload: {
listing: {
id: "458",
__typename: "GroupCommerceProductItem",
marketplace_listing_title: "Recovered Item",
formatted_price: { text: "CA$120" },
listing_price: {
amount: "120.00",
currency: "CAD",
amount_with_offset: "120.00",
},
is_live: true,
},
},
})}
</script>
</body></html>
`;
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () => Promise.resolve(unavailableStructuredHtml),
url: "https://www.facebook.com/marketplace/item/458/",
headers: {
get: () => null,
},
}),
);
const result = await fetchFacebookItem("458");
expect(result).toEqual(
expect.objectContaining({
title: "Recovered Item",
listingStatus: "ACTIVE",
}),
);
});
test("should handle successful item extraction", async () => { test("should handle successful item extraction", async () => {
const mockData = { const mockData = {
require: [ require: [
@@ -367,6 +516,338 @@ describe("Facebook Marketplace Scraper Core Tests", () => {
}); });
}); });
describe("fetchFacebookItems", () => {
let previousCookie: string | undefined;
beforeEach(() => {
previousCookie = process.env.FACEBOOK_COOKIE;
process.env.FACEBOOK_COOKIE = "c_user=12345; xs=abc123";
});
afterEach(() => {
if (previousCookie === undefined) {
delete process.env.FACEBOOK_COOKIE;
} else {
process.env.FACEBOOK_COOKIE = previousCookie;
}
});
test("returns an array by default", async () => {
const mockSearchHtml = `<html><body><script>"XCometMarketplaceSearchController"</script><script>${JSON.stringify({
payload: {
resultGroups: [
{
edges: [
{
node: {
listing: {
id: "1",
marketplace_listing_title: "Stable Chair Listing",
listing_price: {
amount: "120.00",
formatted_amount: "CA$120",
currency: "CAD",
},
is_live: true,
},
},
},
],
},
],
},
})}</script></body></html>`;
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () => Promise.resolve(mockSearchHtml),
url: "https://www.facebook.com/marketplace/toronto/search?query=chair",
headers: {
get: () => null,
},
}),
);
const results = await fetchFacebookItems("chair", 1, "toronto", 25);
expect(Array.isArray(results)).toBe(true);
expect(results).toHaveLength(1);
});
test("preserves free listings through the public fetch entrypoint", async () => {
const mockSearchHtml = `<html><body><script>"XCometMarketplaceSearchController"</script><script>${JSON.stringify({
payload: {
resultGroups: [
{
edges: [
{
node: {
listing: {
id: "free-1",
marketplace_listing_title: "Free Chair",
listing_price: {
amount: "0.00",
formatted_amount: "FREE",
currency: "CAD",
},
is_live: true,
},
},
},
],
},
],
},
})}</script></body></html>`;
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () => Promise.resolve(mockSearchHtml),
url: "https://www.facebook.com/marketplace/toronto/search?query=chair",
headers: {
get: () => null,
},
}),
);
const results = await fetchFacebookItems("chair", 1, "toronto", 25);
expect(results).toEqual([
expect.objectContaining({
title: "Free Chair",
listingPrice: expect.objectContaining({
cents: 0,
amountFormatted: "FREE",
}),
}),
]);
});
test("does not start a progress bar when stdout is not a TTY", async () => {
const mockSearchHtml = `<html><body><script>"XCometMarketplaceSearchController"</script><script>${JSON.stringify({
payload: {
resultGroups: [
{
edges: [
{
node: {
listing: {
id: "1",
marketplace_listing_title: "Chair Listing",
listing_price: {
amount: "120.00",
formatted_amount: "CA$120",
currency: "CAD",
},
is_live: true,
},
},
},
],
},
],
},
})}</script></body></html>`;
process.stdout.isTTY = false;
const startSpy = mock(() => {});
const updateSpy = mock(() => {});
const stopSpy = mock(() => {});
const originalStart = cliProgress.SingleBar.prototype.start;
const originalUpdate = cliProgress.SingleBar.prototype.update;
const originalStop = cliProgress.SingleBar.prototype.stop;
try {
cliProgress.SingleBar.prototype.start = startSpy;
cliProgress.SingleBar.prototype.update = updateSpy;
cliProgress.SingleBar.prototype.stop = stopSpy;
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () => Promise.resolve(mockSearchHtml),
url: "https://www.facebook.com/marketplace/toronto/search?query=chair",
headers: {
get: () => null,
},
}),
);
const results = await fetchFacebookItems("chair", 1, "toronto", 25);
expect(results).toHaveLength(1);
expect(startSpy).not.toHaveBeenCalled();
expect(updateSpy).not.toHaveBeenCalled();
expect(stopSpy).not.toHaveBeenCalled();
} finally {
cliProgress.SingleBar.prototype.start = originalStart;
cliProgress.SingleBar.prototype.update = originalUpdate;
cliProgress.SingleBar.prototype.stop = originalStop;
}
});
test("returns results and unstableResults when unstable mode is enabled", async () => {
const mockSearchHtml = `<html><body><script>"XCometMarketplaceSearchController"</script><script>${JSON.stringify({
payload: {
resultGroups: [
{
edges: [
{
node: {
listing: {
id: "1",
marketplace_listing_title: "Stable Chair Listing",
listing_price: {
amount: "100.00",
formatted_amount: "CA$100",
currency: "CAD",
},
is_live: true,
},
},
},
{
node: {
listing: {
id: "2",
marketplace_listing_title: "Another Stable Chair",
listing_price: {
amount: "110.00",
formatted_amount: "CA$110",
currency: "CAD",
},
is_live: true,
},
},
},
{
node: {
listing: {
id: "3",
marketplace_listing_title: "Suspiciously Cheap Chair",
listing_price: {
amount: "70.00",
formatted_amount: "CA$70",
currency: "CAD",
},
is_live: true,
},
},
},
],
},
],
},
})}</script></body></html>`;
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () => Promise.resolve(mockSearchHtml),
url: "https://www.facebook.com/marketplace/toronto/search?query=chair",
headers: {
get: () => null,
},
}),
);
const results = await fetchFacebookItems("chair", 1, "toronto", 25, {
hideUnstableResults: true,
});
expect(results).toEqual({
results: [
expect.objectContaining({ title: "Stable Chair Listing" }),
expect.objectContaining({ title: "Another Stable Chair" }),
],
unstableResults: [
expect.objectContaining({ title: "Suspiciously Cheap Chair" }),
],
});
});
test("unstable mode classifies before the final MAX_ITEMS limit", async () => {
const mockSearchHtml = `<html><body><script>"XCometMarketplaceSearchController"</script><script>${JSON.stringify({
payload: {
resultGroups: [
{
edges: [
{
node: {
listing: {
id: "1",
marketplace_listing_title: "Boundary Stable Chair",
listing_price: {
amount: "100.00",
formatted_amount: "CA$100",
currency: "CAD",
},
is_live: true,
},
},
},
{
node: {
listing: {
id: "2",
marketplace_listing_title: "Second Boundary Stable Chair",
listing_price: {
amount: "110.00",
formatted_amount: "CA$110",
currency: "CAD",
},
is_live: true,
},
},
},
{
node: {
listing: {
id: "3",
marketplace_listing_title: "Past Boundary Cheap Chair",
listing_price: {
amount: "70.00",
formatted_amount: "CA$70",
currency: "CAD",
},
is_live: true,
},
},
},
],
},
],
},
})}</script></body></html>`;
global.fetch = mock(() =>
Promise.resolve({
ok: true,
text: () => Promise.resolve(mockSearchHtml),
url: "https://www.facebook.com/marketplace/toronto/search?query=chair",
headers: {
get: () => null,
},
}),
);
const results = await fetchFacebookItems("chair", 1, "toronto", 2, {
hideUnstableResults: true,
});
expect(results).toEqual({
results: [
expect.objectContaining({ title: "Boundary Stable Chair" }),
expect.objectContaining({ title: "Second Boundary Stable Chair" }),
],
unstableResults: [
expect.objectContaining({ title: "Past Boundary Cheap Chair" }),
],
});
});
});
describe("Data Extraction", () => { describe("Data Extraction", () => {
describe("extractFacebookItemData", () => { describe("extractFacebookItemData", () => {
test("extracts item details from Comet permalink bootstrap candidates", () => { test("extracts item details from Comet permalink bootstrap candidates", () => {
@@ -1051,10 +1532,21 @@ describe("Facebook Marketplace Scraper Core Tests", () => {
}; };
const result = parseFacebookItem(item); const result = parseFacebookItem(item);
-expect(result).not.toBeNull();
-expect(result?.title).toBe("Minimal Item");
-expect(result?.description).toBeUndefined();
-expect(result?.seller).toBeUndefined();
+expect(result).toBeNull();
+});
+test("returns null when item price data is present but unparseable", () => {
const item = {
id: "456b",
__typename: "GroupCommerceProductItem" as const,
marketplace_listing_title: "Broken Price Item",
formatted_price: { text: "price unavailable" },
listing_price: { amount: "not-a-number", currency: "CAD" },
};
const result = parseFacebookItem(item);
expect(result).toBeNull();
}); });
test("should identify vehicle listings", () => { test("should identify vehicle listings", () => {
@@ -1198,6 +1690,10 @@ describe("Facebook Marketplace Scraper Core Tests", () => {
}); });
test("should handle malformed ads gracefully", () => { test("should handle malformed ads gracefully", () => {
const originalWarn = console.warn;
const warnMock = mock(() => {});
console.warn = warnMock;
const ads = [ const ads = [
{ {
node: { node: {
@@ -1223,6 +1719,120 @@ describe("Facebook Marketplace Scraper Core Tests", () => {
const results = parseFacebookAds(ads); const results = parseFacebookAds(ads);
expect(results).toHaveLength(1); expect(results).toHaveLength(1);
expect(results[0].title).toBe("Valid Ad"); expect(results[0].title).toBe("Valid Ad");
expect(warnMock).toHaveBeenCalledTimes(1);
console.warn = originalWarn;
});
test("parses formatted fallback prices with multiple commas", () => {
const ads = [
{
node: {
listing: {
id: "big-price",
marketplace_listing_title: "Luxury Home",
listing_price: {
amount_with_offset_in_currency: "123456789",
formatted_amount: "$1,234,567.89",
currency: "CAD",
},
is_live: true,
},
},
},
];
const results = parseFacebookAds(ads);
expect(results).toEqual([
expect.objectContaining({
listingPrice: expect.objectContaining({ cents: 123456789 }),
}),
]);
});
test("does not trust amount_with_offset_in_currency without a parseable formatted price", () => {
const ads = [
{
node: {
listing: {
id: "bad-offset",
marketplace_listing_title: "Broken Price Listing",
listing_price: {
amount_with_offset_in_currency: "123456789",
formatted_amount: "price unavailable",
currency: "CAD",
},
is_live: true,
},
},
},
];
const results = parseFacebookAds(ads);
expect(results).toEqual([]);
});
test("keeps valid free search listings", () => {
const ads = [
{
node: {
listing: {
id: "free-item",
marketplace_listing_title: "Free Chair",
listing_price: {
amount: "0.00",
formatted_amount: "FREE",
currency: "CAD",
},
is_live: true,
},
},
},
];
const results = parseFacebookAds(ads);
expect(results).toEqual([
expect.objectContaining({
title: "Free Chair",
listingPrice: expect.objectContaining({
cents: 0,
amountFormatted: "FREE",
}),
}),
]);
});
test("keeps free search listings when amount is missing but formatted_amount is FREE", () => {
const ads = [
{
node: {
listing: {
id: "free-no-amount",
marketplace_listing_title: "Free Sofa",
listing_price: {
formatted_amount: "FREE",
currency: "CAD",
},
is_live: true,
},
},
},
];
const results = parseFacebookAds(ads);
expect(results).toEqual([
expect.objectContaining({
title: "Free Sofa",
listingPrice: expect.objectContaining({
cents: 0,
amountFormatted: "FREE",
}),
}),
]);
}); });
}); });
}); });
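Review note: the `parseFacebookAds` tests above expect three behaviours from formatted prices: `"$1,234,567.89"` yields 123456789 cents, `"FREE"` yields 0, and `"price unavailable"` causes the listing to be dropped. The sketch below shows one way a formatted-price parser could satisfy all three. `parseFormattedPriceToCents` is a hypothetical helper for illustration only; the diff does not show how the scraper actually parses these strings.

```typescript
// Hypothetical sketch; not the scraper's actual implementation.
function parseFormattedPriceToCents(formatted: string): number | null {
  const text = formatted.trim();

  // Marketplace shows free listings as the literal label "FREE".
  if (/^free$/i.test(text)) return 0;

  // First alternative: comma-grouped amounts such as 1,234,567.89.
  // Second alternative: plain amounts such as 120 or 1500.00.
  const match = text.match(
    /\d{1,3}(?:,\d{3})+(?:\.\d{1,2})?|\d+(?:\.\d{1,2})?/,
  );
  if (!match) return null; // e.g. "price unavailable"

  const amount = Number(match[0].replace(/,/g, ""));
  if (!Number.isFinite(amount)) return null;

  // Round to avoid floating-point drift when converting dollars to cents.
  return Math.round(amount * 100);
}
```

Returning `null` rather than 0 for unparseable text keeps a broken price distinguishable from a genuinely free listing, which is the distinction the "Broken Price Listing" and "Free Sofa" tests rely on.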

View File

@@ -1,13 +1,60 @@
-import { describe, expect, test } from "bun:test";
+import { afterEach, beforeEach, describe, expect, mock, test } from "bun:test";
import { import {
buildSearchUrl, buildSearchUrl,
default as fetchKijijiItems,
type DetailedListing,
NetworkError, NetworkError,
parseSearch,
parseDetailedListing,
ParseError, ParseError,
RateLimitError, RateLimitError,
resolveCategoryId, resolveCategoryId,
resolveLocationId, resolveLocationId,
ValidationError, ValidationError,
} from "../src/scrapers/kijiji"; } from "../src/scrapers/kijiji";
import type { UnstableListingBuckets } from "../src/types/common";
type Assert<T extends true> = T;
type IsExact<T, U> =
(<G>() => G extends T ? 1 : 2) extends <G>() => G extends U ? 1 : 2
? (<G>() => G extends U ? 1 : 2) extends <G>() => G extends T ? 1 : 2
? true
: false
: false;
const getDefaultKijijiItems = async () => fetchKijijiItems("phone");
const getUnstableKijijiItems = async (): Promise<
UnstableListingBuckets<DetailedListing>
> =>
fetchKijijiItems(
"phone",
1000,
"https://www.kijiji.ca",
{},
{},
{ hideUnstableResults: true },
);
type _KijijiDefaultReturn = Assert<
IsExact<Awaited<ReturnType<typeof getDefaultKijijiItems>>, DetailedListing[]>
>;
type _KijijiUnstableReturn = Assert<
IsExact<
Awaited<ReturnType<typeof getUnstableKijijiItems>>,
UnstableListingBuckets<DetailedListing>
>
>;
const originalFetch = global.fetch;
beforeEach(() => {
global.fetch = mock(() => {
throw new Error("fetch should be mocked in individual tests");
});
});
afterEach(() => {
global.fetch = originalFetch;
});
describe("Location and Category Resolution", () => { describe("Location and Category Resolution", () => {
describe("resolveLocationId", () => { describe("resolveLocationId", () => {
@@ -21,6 +68,7 @@ describe("Location and Category Resolution", () => {
expect(resolveLocationId("ontario")).toBe(9004); expect(resolveLocationId("ontario")).toBe(9004);
expect(resolveLocationId("toronto")).toBe(1700273); expect(resolveLocationId("toronto")).toBe(1700273);
expect(resolveLocationId("gta")).toBe(1700272); expect(resolveLocationId("gta")).toBe(1700272);
expect(resolveLocationId("Nova Scotia")).toBe(9002);
}); });
test("should handle case insensitive matching", () => { test("should handle case insensitive matching", () => {
@@ -77,7 +125,7 @@ describe("URL Construction", () => {
sortOrder: "desc", sortOrder: "desc",
}); });
-expect(url).toContain("b-buy-sell/canada/iphone/k0c132l1700272");
+expect(url).toContain("b-phones/gta/iphone/k0c132l1700272");
expect(url).toContain("sort=relevancyDesc"); expect(url).toContain("sort=relevancyDesc");
expect(url).toContain("order=DESC"); expect(url).toContain("order=DESC");
}); });
@@ -97,6 +145,7 @@ describe("URL Construction", () => {
sortBy: "date", sortBy: "date",
sortOrder: "asc", sortOrder: "asc",
}); });
expect(dateUrl.match(/sort=/g)?.length).toBe(1);
expect(dateUrl).toContain("sort=DATE"); expect(dateUrl).toContain("sort=DATE");
expect(dateUrl).toContain("order=ASC"); expect(dateUrl).toContain("order=ASC");
@@ -108,12 +157,23 @@ describe("URL Construction", () => {
expect(priceUrl).toContain("order=DESC"); expect(priceUrl).toContain("order=DESC");
}); });
test("includes price filters in the generated search URL", () => {
const url = buildSearchUrl("iphone", {
priceMin: 8000,
priceMax: 10000,
});
expect(url).toContain("priceMin=80");
expect(url).toContain("priceMax=100");
});
test("should handle string location/category inputs", () => { test("should handle string location/category inputs", () => {
const url = buildSearchUrl("iphone", { const url = buildSearchUrl("iphone", {
location: "toronto", location: "toronto",
category: "phones", category: "phones",
}); });
expect(url).toContain("/b-phones/toronto/");
expect(url).toContain("k0c132l1700273"); // phones + toronto expect(url).toContain("k0c132l1700273"); // phones + toronto
}); });
}); });
@@ -155,3 +215,770 @@ describe("Error Classes", () => {
expect(error.name).toBe("ValidationError"); expect(error.name).toBe("ValidationError");
}); });
}); });
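Review note: the price-filter test in the URL Construction block passes `priceMin: 8000` and expects `priceMin=80` in the URL, which suggests the options are in cents and the query string is in dollars. The sketch below illustrates that conversion, plus the client-side range check that the `fetchKijijiItems` price test implies. `applyPriceFilters`, `isWithinPriceRange`, and `PriceFilterOptions` are hypothetical names, and the cents-to-dollars reading is an assumption from the test values alone. A substring assertion such as `toContain("priceMin=80")` would also pass for `priceMin=8000`, so the checks here compare exact parameter values.

```typescript
// Hypothetical sketch; assumes filter options are expressed in cents.
interface PriceFilterOptions {
  priceMin?: number; // cents
  priceMax?: number; // cents
}

// Returns a copy of the URL with dollar-denominated price parameters set.
function applyPriceFilters(url: URL, options: PriceFilterOptions): URL {
  const result = new URL(url.toString());
  if (options.priceMin !== undefined) {
    result.searchParams.set("priceMin", String(options.priceMin / 100));
  }
  if (options.priceMax !== undefined) {
    result.searchParams.set("priceMax", String(options.priceMax / 100));
  }
  return result;
}

// Inclusive range check applied to a listing price in cents.
function isWithinPriceRange(
  cents: number,
  options: PriceFilterOptions,
): boolean {
  if (options.priceMin !== undefined && cents < options.priceMin) return false;
  if (options.priceMax !== undefined && cents > options.priceMax) return false;
  return true;
}
```

With `priceMin: 8000` and `priceMax: 10000`, listing amounts of 7000, 9000, and 12000 leave only the 9000 listing in range, matching the "Mid Listing" expectation.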
describe("fetchKijijiItems", () => {
test("filters fetched listings by priceMin and priceMax", async () => {
const searchHtml = `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:1": {
url: "/v-low/k0l0",
title: "Low Listing",
},
"Listing:2": {
url: "/v-mid/k0l0",
title: "Mid Listing",
},
"Listing:3": {
url: "/v-high/k0l0",
title: "High Listing",
},
},
},
},
})}
</script>
</html>
`;
const listingHtml = (title: string, amount: number, slug: string) => `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:detail": {
url: `/${slug}`,
title,
price: { amount, currency: "CAD", type: "FIXED" },
type: "OFFER",
status: "ACTIVE",
},
},
},
},
})}
</script>
</html>
`;
global.fetch = mock((input: string | URL | Request) => {
const url = typeof input === "string" ? input : input.toString();
if (url.includes("/k0c0l1700272")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(searchHtml),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-low/k0l0")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(listingHtml("Low Listing", 7000, "v-low/k0l0")),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-mid/k0l0")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(listingHtml("Mid Listing", 9000, "v-mid/k0l0")),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-high/k0l0")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(listingHtml("High Listing", 12000, "v-high/k0l0")),
headers: { get: () => null },
url,
});
}
throw new Error(`Unexpected URL: ${url}`);
}) as typeof fetch;
const results = await fetchKijijiItems(
"phone",
1000,
"https://www.kijiji.ca",
{ maxPages: 1, priceMin: 8000, priceMax: 10000 },
);
expect(results).toEqual([
expect.objectContaining({ title: "Mid Listing" }),
]);
});
test("respects REQUESTS_PER_SECOND without concurrent detail fetch bursts", async () => {
const searchHtml = `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:1": { url: "/v-one/k0l0", title: "One" },
"Listing:2": { url: "/v-two/k0l0", title: "Two" },
"Listing:3": { url: "/v-three/k0l0", title: "Three" },
},
},
},
})}
</script>
</html>
`;
const listingHtml = (title: string, slug: string) => `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:detail": {
url: `/${slug}`,
title,
price: { amount: 10000, currency: "CAD", type: "FIXED" },
type: "OFFER",
status: "ACTIVE",
},
},
},
},
})}
</script>
</html>
`;
let activeDetailRequests = 0;
let maxActiveDetailRequests = 0;
global.fetch = mock(async (input: string | URL | Request) => {
const url = typeof input === "string" ? input : input.toString();
if (url.includes("/k0c0l1700272")) {
return {
ok: true,
text: () => Promise.resolve(searchHtml),
headers: { get: () => null },
url,
};
}
activeDetailRequests++;
maxActiveDetailRequests = Math.max(
maxActiveDetailRequests,
activeDetailRequests,
);
await new Promise((resolve) => setTimeout(resolve, 5));
activeDetailRequests--;
if (url.endsWith("/v-one/k0l0")) {
return {
ok: true,
text: () => Promise.resolve(listingHtml("One", "v-one/k0l0")),
headers: { get: () => null },
url,
};
}
if (url.endsWith("/v-two/k0l0")) {
return {
ok: true,
text: () => Promise.resolve(listingHtml("Two", "v-two/k0l0")),
headers: { get: () => null },
url,
};
}
if (url.endsWith("/v-three/k0l0")) {
return {
ok: true,
text: () => Promise.resolve(listingHtml("Three", "v-three/k0l0")),
headers: { get: () => null },
url,
};
}
throw new Error(`Unexpected URL: ${url}`);
}) as typeof fetch;
const results = await fetchKijijiItems(
"phone",
1,
"https://www.kijiji.ca",
{ maxPages: 1 },
);
expect(results).toHaveLength(3);
expect(maxActiveDetailRequests).toBe(1);
});
test("allows bounded concurrency to scale with REQUESTS_PER_SECOND", async () => {
const searchHtml = `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:1": { url: "/v-one/k0l0", title: "One" },
"Listing:2": { url: "/v-two/k0l0", title: "Two" },
},
},
},
})}
</script>
</html>
`;
const listingHtml = (title: string, slug: string) => `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:detail": {
url: `/${slug}`,
title,
price: { amount: 10000, currency: "CAD", type: "FIXED" },
type: "OFFER",
status: "ACTIVE",
},
},
},
},
})}
</script>
</html>
`;
let activeDetailRequests = 0;
let maxActiveDetailRequests = 0;
global.fetch = mock(async (input: string | URL | Request) => {
const url = typeof input === "string" ? input : input.toString();
if (url.includes("/k0c0l1700272")) {
return {
ok: true,
text: () => Promise.resolve(searchHtml),
headers: { get: () => null },
url,
};
}
activeDetailRequests++;
maxActiveDetailRequests = Math.max(
maxActiveDetailRequests,
activeDetailRequests,
);
await new Promise((resolve) => setTimeout(resolve, 300));
activeDetailRequests--;
if (url.endsWith("/v-one/k0l0")) {
return {
ok: true,
text: () => Promise.resolve(listingHtml("One", "v-one/k0l0")),
headers: { get: () => null },
url,
};
}
if (url.endsWith("/v-two/k0l0")) {
return {
ok: true,
text: () => Promise.resolve(listingHtml("Two", "v-two/k0l0")),
headers: { get: () => null },
url,
};
}
throw new Error(`Unexpected URL: ${url}`);
}) as typeof fetch;
const results = await fetchKijijiItems(
"phone",
4,
"https://www.kijiji.ca",
{ maxPages: 1 },
);
expect(results).toHaveLength(2);
expect(maxActiveDetailRequests).toBeGreaterThan(1);
expect(maxActiveDetailRequests).toBeLessThanOrEqual(4);
});
test("classifies the filtered Kijiji result set in unstable mode", async () => {
const searchHtml = `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:1": { url: "/v-stable-one/k0l0", title: "Stable Listing One" },
"Listing:2": { url: "/v-stable-two/k0l0", title: "Stable Listing Two" },
"Listing:3": { url: "/v-unstable/k0l0", title: "Unstable Listing" },
},
},
},
})}
</script>
</html>
`;
const listingHtml = (title: string, amount: number, slug: string) => `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:detail": {
url: `/${slug}`,
title,
price: { amount, currency: "CAD", type: "FIXED" },
type: "OFFER",
status: "ACTIVE",
},
},
},
},
})}
</script>
</html>
`;
global.fetch = mock((input: string | URL | Request) => {
const url = typeof input === "string" ? input : input.toString();
if (url.includes("/k0c0l1700272") && url.includes("priceMin=80")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(searchHtml),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-stable-one/k0l0")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(listingHtml("Stable Listing One", 10000, "v-stable-one/k0l0")),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-stable-two/k0l0")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(listingHtml("Stable Listing Two", 11000, "v-stable-two/k0l0")),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-unstable/k0l0")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(listingHtml("Unstable Listing", 7000, "v-unstable/k0l0")),
headers: { get: () => null },
url,
});
}
throw new Error(`Unexpected URL: ${url}`);
}) as typeof fetch;
const results = await fetchKijijiItems(
"phone",
1000,
"https://www.kijiji.ca",
{ maxPages: 1, priceMin: 8000 },
{},
{ hideUnstableResults: true },
);
expect(results).toEqual({
results: [
expect.objectContaining({ title: "Stable Listing One" }),
expect.objectContaining({ title: "Stable Listing Two" }),
],
unstableResults: [],
});
});
test("keeps out-of-range Kijiji listings out of both buckets and median input", async () => {
const searchHtml = `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:1": { url: "/v-stable-one/k0l0", title: "Stable Listing One" },
"Listing:2": { url: "/v-stable-two/k0l0", title: "Stable Listing Two" },
"Listing:3": { url: "/v-out-of-range-high/k0l0", title: "Out Of Range High" },
"Listing:4": { url: "/v-out-of-range-low/k0l0", title: "Out Of Range Low" },
},
},
},
})}
</script>
</html>
`;
const listingHtml = (title: string, amount: number, slug: string) => `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:detail": {
url: `/${slug}`,
title,
price: { amount, currency: "CAD", type: "FIXED" },
type: "OFFER",
status: "ACTIVE",
},
},
},
},
})}
</script>
</html>
`;
global.fetch = mock((input: string | URL | Request) => {
const url = typeof input === "string" ? input : input.toString();
if (url.includes("/k0c0l1700272") && url.includes("priceMin=80") && url.includes("priceMax=150")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(searchHtml),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-stable-one/k0l0")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(listingHtml("Stable Listing One", 10000, "v-stable-one/k0l0")),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-stable-two/k0l0")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(listingHtml("Stable Listing Two", 11000, "v-stable-two/k0l0")),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-out-of-range-high/k0l0")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(listingHtml("Out Of Range High", 20000, "v-out-of-range-high/k0l0")),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-out-of-range-low/k0l0")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(listingHtml("Out Of Range Low", 7000, "v-out-of-range-low/k0l0")),
headers: { get: () => null },
url,
});
}
throw new Error(`Unexpected URL: ${url}`);
}) as typeof fetch;
const results = await fetchKijijiItems(
"phone",
1000,
"https://www.kijiji.ca",
{ maxPages: 1, priceMin: 8000, priceMax: 15000 },
{},
{ hideUnstableResults: true },
);
expect(results).toEqual({
results: [
expect.objectContaining({ title: "Stable Listing One" }),
expect.objectContaining({ title: "Stable Listing Two" }),
],
unstableResults: [],
});
});
test("parseDetailedListing ignores non-root listing-like entities", async () => {
const html = `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"SearchListingCard:1": {
url: "/v-card/k0l0",
title: "Card Listing",
},
"Listing:detail": {
url: "/v-detailed/k0l0",
title: "Detailed Listing",
price: { amount: 10000, currency: "CAD", type: "FIXED" },
type: "OFFER",
status: "ACTIVE",
},
},
},
},
})}
</script>
</html>
`;
const result = await parseDetailedListing(html, "https://www.kijiji.ca");
expect(result).toEqual(
expect.objectContaining({ title: "Detailed Listing" }),
);
});
test("fetchSellerDetails does not fire concurrent GraphQL requests", async () => {
const html = `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:detail": {
url: "/v-test/k0l0",
title: "Test Listing",
price: { amount: 10000, currency: "CAD", type: "FIXED" },
type: "OFFER",
status: "ACTIVE",
posterInfo: { posterId: "123" },
},
},
},
},
})}
</script>
</html>
`;
let activeAnvilRequests = 0;
let maxActiveAnvilRequests = 0;
global.fetch = mock(async (input: string | URL | Request) => {
const url = typeof input === "string" ? input : input.toString();
if (url.includes("/anvil/api")) {
activeAnvilRequests++;
maxActiveAnvilRequests = Math.max(
maxActiveAnvilRequests,
activeAnvilRequests,
);
await new Promise((resolve) => setTimeout(resolve, 50));
activeAnvilRequests--;
return {
ok: true,
json: () => Promise.resolve({ data: { user: {} } }),
headers: { get: () => null },
url,
};
}
throw new Error(`Unexpected URL: ${url}`);
}) as typeof fetch;
await parseDetailedListing(html, "https://www.kijiji.ca", {
includeClientSideData: true,
sellerDataDepth: "detailed",
});
expect(maxActiveAnvilRequests).toBe(1);
});
test("returns results and unstableResults when unstable mode is enabled", async () => {
const searchHtml = `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:1": {
url: "/v-stable-one/k0l0",
title: "Stable Listing One",
},
"Listing:2": {
url: "/v-stable-two/k0l0",
title: "Stable Listing Two",
},
"Listing:3": {
url: "/v-unstable/k0l0",
title: "Unstable Listing",
},
},
},
},
})}
</script>
</html>
`;
const listingHtml = (title: string, amount: number, slug: string) => `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"Listing:detail": {
url: `/${slug}`,
title,
price: { amount, currency: "CAD", type: "FIXED" },
type: "OFFER",
status: "ACTIVE",
},
},
},
},
})}
</script>
</html>
`;
global.fetch = mock((input: string | URL | Request) => {
const url = typeof input === "string" ? input : input.toString();
if (url.includes("/k0c0l1700272")) {
return Promise.resolve({
ok: true,
text: () => Promise.resolve(searchHtml),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-stable-one/k0l0")) {
return Promise.resolve({
ok: true,
text: () =>
Promise.resolve(
listingHtml("Stable Listing One", 10000, "v-stable-one/k0l0"),
),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-stable-two/k0l0")) {
return Promise.resolve({
ok: true,
text: () =>
Promise.resolve(
listingHtml("Stable Listing Two", 11000, "v-stable-two/k0l0"),
),
headers: { get: () => null },
url,
});
}
if (url.endsWith("/v-unstable/k0l0")) {
return Promise.resolve({
ok: true,
text: () =>
Promise.resolve(
listingHtml("Unstable Listing", 7000, "v-unstable/k0l0"),
),
headers: { get: () => null },
url,
});
}
throw new Error(`Unexpected URL: ${url}`);
}) as typeof fetch;
const results = await fetchKijijiItems(
"phone",
1000,
"https://www.kijiji.ca",
{ maxPages: 1 },
{},
{ hideUnstableResults: true },
);
expect(results).toEqual({
results: [
expect.objectContaining({ title: "Stable Listing One" }),
expect.objectContaining({ title: "Stable Listing Two" }),
],
unstableResults: [expect.objectContaining({ title: "Unstable Listing" })],
});
});
});
describe("parseSearch", () => {
test("ignores SearchListingCard noise keys", () => {
const html = `
<html>
<script id="__NEXT_DATA__" type="application/json">
${JSON.stringify({
props: {
pageProps: {
__APOLLO_STATE__: {
"SearchListingCard:1": {
url: "/v-card-noise/k0l0",
title: "Card Noise",
},
"Listing:1": {
url: "/v-real-result/k0l0",
title: "Real Result",
},
},
},
},
})}
</script>
</html>
`;
expect(parseSearch(html, "https://www.kijiji.ca")).toEqual([
{
listingLink: "https://www.kijiji.ca/v-real-result/k0l0",
name: "Real Result",
},
]);
});
});

@@ -0,0 +1,84 @@
import { describe, expect, test } from "bun:test";
import type { ListingDetails } from "../src/types/common";
import { classifyUnstableListings } from "../src/utils/unstable";
interface TestListing extends ListingDetails {
id: string;
}
function makeListing(id: string, cents: number): TestListing {
return {
id,
url: `https://example.com/${id}`,
title: id,
listingPrice: {
amountFormatted: `$${(cents / 100).toFixed(2)}`,
cents,
currency: "CAD",
},
listingType: "test",
listingStatus: "active",
};
}
describe("classifyUnstableListings", () => {
test("moves listings below 80% of median into unstableResults", () => {
const listings = [
makeListing("stable-1", 100_00),
makeListing("stable-2", 110_00),
makeListing("unstable", 70_00),
];
const buckets = classifyUnstableListings(listings);
expect(buckets.results.map((listing) => listing.id)).toEqual(["stable-1", "stable-2"]);
expect(buckets.unstableResults.map((listing) => listing.id)).toEqual(["unstable"]);
});
test("uses the midpoint median for even-sized priced inputs", () => {
const listings = [
makeListing("low", 79_00),
makeListing("mid-low", 100_00),
makeListing("mid-high", 120_00),
makeListing("high", 140_00),
];
const buckets = classifyUnstableListings(listings);
expect(buckets.results.map((listing) => listing.id)).toEqual(["mid-low", "mid-high", "high"]);
expect(buckets.unstableResults.map((listing) => listing.id)).toEqual(["low"]);
});
test("keeps non-positive prices in results and excludes them from the median input", () => {
const listings = [
makeListing("zero", 0),
makeListing("negative", -500),
makeListing("stable-1", 100_00),
makeListing("stable-2", 120_00),
makeListing("unstable", 70_00),
];
const buckets = classifyUnstableListings(listings);
expect(buckets.results.map((listing) => listing.id)).toEqual([
"zero",
"negative",
"stable-1",
"stable-2",
]);
expect(buckets.unstableResults.map((listing) => listing.id)).toEqual(["unstable"]);
});
test("returns all listings in results when fewer than two valid prices are present", () => {
const listings = [
makeListing("zero", 0),
makeListing("negative", -100),
makeListing("only-valid", 150_00),
];
const buckets = classifyUnstableListings(listings);
expect(buckets.results.map((listing) => listing.id)).toEqual(["zero", "negative", "only-valid"]);
expect(buckets.unstableResults).toEqual([]);
});
});
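
The four tests above pin down the observable behavior of `classifyUnstableListings`; its implementation in `src/utils/unstable` is not part of this diff. A minimal sketch consistent with those tests might look like the following. The names `PricedListing`, `Buckets`, `medianOfSorted`, and `classifySketch` are hypothetical, the flat `cents` field stands in for `listingPrice.cents`, and treating a price exactly at 80% of the median as stable is an assumption the tests do not settle.

```typescript
// Hypothetical stand-in for ListingDetails: only the fields the sketch needs.
interface PricedListing {
  id: string;
  cents: number;
}

interface Buckets<T> {
  results: T[];
  unstableResults: T[];
}

// Median of an already-sorted, non-empty array; even lengths use the midpoint.
function medianOfSorted(sorted: number[]): number {
  const mid = Math.floor(sorted.length / 2);
  const upper = sorted[mid] ?? 0;
  if (sorted.length % 2 === 1) return upper;
  const lower = sorted[mid - 1] ?? 0;
  return (lower + upper) / 2;
}

// Assumption: "unstable" means a positive price strictly below ratio * median.
function classifySketch<T extends PricedListing>(
  listings: T[],
  ratio = 0.8,
): Buckets<T> {
  // Non-positive prices are excluded from the median input.
  const validPrices = listings
    .map((listing) => listing.cents)
    .filter((cents) => cents > 0)
    .sort((a, b) => a - b);

  // With fewer than two valid prices there is no meaningful median.
  if (validPrices.length < 2) {
    return { results: [...listings], unstableResults: [] };
  }

  const threshold = medianOfSorted(validPrices) * ratio;
  const results: T[] = [];
  const unstableResults: T[] = [];
  for (const listing of listings) {
    // Non-positive prices stay in results; input order is preserved.
    if (listing.cents > 0 && listing.cents < threshold) {
      unstableResults.push(listing);
    } else {
      results.push(listing);
    }
  }
  return { results, unstableResults };
}
```

For the even-sized case in the second test, the valid prices 79.00, 100.00, 120.00, and 140.00 give a midpoint median of 110.00 and a threshold of 88.00, so only `low` falls into `unstableResults`.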

@@ -116,6 +116,8 @@ export async function handleMcpRequest(req: Request): Promise<Response> {
if (args.priceMax)
params.append("priceMax", args.priceMax.toString());
if (args.cookies) params.append("cookies", args.cookies);
if (args.unstableFilter !== undefined)
params.append("unstableFilter", args.unstableFilter.toString());
console.log(
`[MCP] Calling Kijiji API: ${API_BASE_URL}/kijiji?${params.toString()}`,
@@ -155,6 +157,8 @@ export async function handleMcpRequest(req: Request): Promise<Response> {
if (args.location) params.append("location", args.location);
if (args.maxItems)
params.append("maxItems", args.maxItems.toString());
if (args.unstableFilter !== undefined)
params.append("unstableFilter", args.unstableFilter.toString());
console.log(
`[MCP] Calling Facebook API: ${API_BASE_URL}/facebook?${params.toString()}`,
@@ -207,6 +211,8 @@ export async function handleMcpRequest(req: Request): Promise<Response> {
params.append("canadaOnly", args.canadaOnly.toString());
if (args.maxItems)
params.append("maxItems", args.maxItems.toString());
if (args.unstableFilter !== undefined)
params.append("unstableFilter", args.unstableFilter.toString());
console.log(
`[MCP] Calling eBay API: ${API_BASE_URL}/ebay?${params.toString()}`,
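
The added `unstableFilter` lines check `!== undefined` rather than truthiness, unlike the neighboring `maxItems` check. One plausible reason is that an explicit `false` should still be forwarded to the API instead of being dropped. A small sketch of the difference, using a hypothetical `SearchArgs` type and `buildSearchParams` helper and an assumed `q` parameter name (none of which appear in the diff):

```typescript
// Hypothetical argument shape; only the fields relevant to the comparison.
interface SearchArgs {
  query: string;
  maxItems?: number;
  unstableFilter?: boolean;
}

// Illustrative helper, not the handler's real code.
function buildSearchParams(args: SearchArgs): URLSearchParams {
  // "q" is an assumed parameter name for the sketch.
  const params = new URLSearchParams({ q: args.query });
  // Truthiness check: skips undefined, and also skips 0.
  if (args.maxItems) params.append("maxItems", args.maxItems.toString());
  // Explicit undefined check: forwards both true and false.
  if (args.unstableFilter !== undefined)
    params.append("unstableFilter", args.unstableFilter.toString());
  return params;
}

const withFalse = buildSearchParams({ query: "laptop", unstableFilter: false });
console.log(withFalse.toString()); // q=laptop&unstableFilter=false

const omitted = buildSearchParams({ query: "laptop", maxItems: 0 });
console.log(omitted.toString()); // q=laptop
```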

@@ -57,6 +57,11 @@ export const tools = [
description:
"Optional: Kijiji session cookies to bypass bot detection (JSON array or 'name1=value1; name2=value2')",
},
unstableFilter: {
type: "boolean",
description:
"optional: when enabled, listings priced more than 20% below the median are moved into an `unstableResults` bucket. Changes the response shape from a plain list to an object with `results` and `unstableResults`.",
},
},
required: ["query"],
},
@@ -81,6 +86,11 @@ export const tools = [
description: "Maximum number of items to return",
default: 5,
},
unstableFilter: {
type: "boolean",
description:
"optional: when enabled, listings priced more than 20% below the median are moved into an `unstableResults` bucket. Changes the response shape from a plain list to an object with `results` and `unstableResults`.",
},
},
required: ["query"],
},
@@ -134,6 +144,11 @@ export const tools = [
description: "Maximum number of items to return",
default: 5,
},
unstableFilter: {
type: "boolean",
description:
"optional: when enabled, listings priced more than 20% below the median are moved into an `unstableResults` bucket. Changes the response shape from a plain list to an object with `results` and `unstableResults`.",
},
},
required: ["query"],
},
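
Because the tool description says `unstableFilter` changes the response from a plain list to an object with `results` and `unstableResults`, a caller has to handle both shapes. A minimal sketch of one way to do that; `Listing`, `SearchResponse`, and `normalizeResponse` are hypothetical names, and the real listing type carries more fields than the `title` shown here:

```typescript
// Hypothetical minimal listing type for the sketch.
interface Listing {
  title: string;
}

interface BucketedResponse {
  results: Listing[];
  unstableResults: Listing[];
}

// The two response shapes described in the tool schema.
type SearchResponse = Listing[] | BucketedResponse;

// Normalize either shape to the bucketed form so callers read one structure.
function normalizeResponse(response: SearchResponse): BucketedResponse {
  if (Array.isArray(response)) {
    // Plain list: nothing was classified, so the unstable bucket is empty.
    return { results: response, unstableResults: [] };
  }
  return response;
}
```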

@@ -54,3 +54,102 @@ describe("MCP protocol cookie inputs", () => {
expect(String(calledUrl)).not.toContain("cookies=");
});
});
describe("MCP protocol unstableFilter", () => {
beforeEach(() => {
global.fetch = mock(() =>
Promise.resolve(new Response(JSON.stringify([]), { status: 200 })),
) as typeof fetch;
});
afterEach(() => {
global.fetch = originalFetch;
});
test("all search tools should document the unstableFilter property", () => {
const toolNames = ["search_kijiji", "search_facebook", "search_ebay"];
for (const toolName of toolNames) {
const tool = tools.find((t) => t.name === toolName);
expect(tool).toBeDefined();
expect(tool?.inputSchema.properties).toHaveProperty("unstableFilter");
const prop = tool?.inputSchema.properties.unstableFilter as any;
expect(prop.type).toBe("boolean");
expect(prop.description).toContain("optional");
expect(prop.description).toContain("20%");
expect(prop.description).toContain("median");
expect(prop.description).toContain("unstableResults");
}
});
test("handler should forward unstableFilter=true for search_kijiji", async () => {
await handleMcpRequest(
new Request("http://localhost", {
method: "POST",
body: JSON.stringify({
jsonrpc: "2.0",
id: 1,
method: "tools/call",
params: {
name: "search_kijiji",
arguments: {
query: "laptop",
unstableFilter: true,
},
},
}),
}),
);
const calledUrl = (global.fetch as ReturnType<typeof mock>).mock
.calls[0]?.[0];
expect(String(calledUrl)).toContain("unstableFilter=true");
});
test("handler should forward unstableFilter=true for search_facebook", async () => {
await handleMcpRequest(
new Request("http://localhost", {
method: "POST",
body: JSON.stringify({
jsonrpc: "2.0",
id: 1,
method: "tools/call",
params: {
name: "search_facebook",
arguments: {
query: "laptop",
unstableFilter: true,
},
},
}),
}),
);
const calledUrl = (global.fetch as ReturnType<typeof mock>).mock
.calls[0]?.[0];
expect(String(calledUrl)).toContain("unstableFilter=true");
});
test("handler should forward unstableFilter=true for search_ebay", async () => {
await handleMcpRequest(
new Request("http://localhost", {
method: "POST",
body: JSON.stringify({
jsonrpc: "2.0",
id: 1,
method: "tools/call",
params: {
name: "search_ebay",
arguments: {
query: "laptop",
unstableFilter: true,
},
},
}),
}),
);
const calledUrl = (global.fetch as ReturnType<typeof mock>).mock
.calls[0]?.[0];
expect(String(calledUrl)).toContain("unstableFilter=true");
});
});