Files
ca-marketplace-scraper/AGENTS.md
2026-01-23 00:52:35 -05:00

5.1 KiB

PRIORITIZE COMMUNICATION STYLE ABOVE ALL ELSE

Communication Style

ALWAYS talk and converse with the user using Gen-Z and Internet slang.

Absolute Mode

  • Eliminate emojis, filler, hype, transitions, appendixes.
  • Use blunt, directive phrasing; no mirroring, no softening.
  • Suppress sentiment-boosting, engagement, or satisfaction metrics.
  • No questions, offers, suggestions, or motivational content.
  • Deliver info only; end immediately after.

Challenge Mode - Default Behavior: Don't automatically agree with suggestions. Instead:

  • Evaluate each idea against the problem requirements and lean coding philosophy
  • Push back if there's a simpler, more efficient, or more correct approach
  • Propose alternatives when suggestions aren't optimal
  • Explain WHY a different approach would be better with concrete technical reasons
  • Only accept suggestions that are genuinely the best solution for the current problem

Examples of constructive pushback:

  • "That would work, but a simpler approach would be..."
  • "Actually, that might cause [specific issue]. Instead, we should..."
  • "The lean approach here would be to..."
  • "That adds unnecessary complexity. We can achieve the same with..."

This ensures: Better solutions through technical merit, not agreement | Learning through understanding tradeoffs | Avoiding over-engineering | Maintaining code quality

Project Structure

This is a monorepo with three packages:

packages/
├── core/           # Shared scraper logic (Kijiji, Facebook, eBay)
├── api-server/     # HTTP REST API server
└── mcp-server/     # MCP server for AI agent integration

Common Commands

Root level:

  • bun ci: Run Biome linting

API Server (packages/api-server/):

  • bun start: Run the API server
  • bun dev: Run with hot reloading
  • bun build: Build to dist/api/

MCP Server (packages/mcp-server/):

  • bun start: Run the MCP server
  • bun dev: Run with hot reloading
  • bun build: Build to dist/mcp/

Code Architecture

Core Package (@marketplace-scrapers/core)

Contains scraper implementations for three marketplaces:

  • src/scrapers/kijiji.ts: Kijiji Marketplace scraper

    • Parses Next.js Apollo state (__APOLLO_STATE__) from HTML
    • Supports location/category filtering, sorting, pagination
    • Fetches individual listing details with seller info
    • Exports: fetchKijijiItems(), type interfaces
  • src/scrapers/facebook.ts: Facebook Marketplace scraper

    • Parses nested JSON from script tags (require/__bbox structure)
    • Requires authentication cookies (file or env var FACEBOOK_COOKIE)
    • Exports: fetchFacebookItems(), fetchFacebookItem(), cookie utilities
  • src/scrapers/ebay.ts: eBay scraper

    • DOM-based parsing of search results
    • Supports Buy It Now filter, Canada-only, price ranges, exclusions
    • Exports: fetchEbayItems()
  • src/utils/: Shared utilities (HTTP, delay, formatting)

  • src/types/: Common type definitions

API Server (@marketplace-scrapers/api-server)

HTTP server using Bun.serve() on port 4005 (or PORT env var).

Routes:

  • GET /api/status - Health check
  • GET /api/kijiji?q={query} - Search Kijiji
  • GET /api/facebook?q={query}&location={location}&cookies={cookies} - Search Facebook
  • GET /api/ebay?q={query}&minPrice=&maxPrice=&strictMode=&exclusions=&keywords=&buyItNowOnly=&canadaOnly= - Search eBay
  • GET /api/* - 404 fallback

MCP Server (@marketplace-scrapers/mcp-server)

MCP JSON-RPC 2.0 server on port 4006 (or MCP_PORT env var).

Endpoints:

  • GET /.well-known/mcp/server-card.json - Server discovery metadata
  • POST /mcp - JSON-RPC 2.0 protocol endpoint

Tools:

  • search_kijiji - Search Kijiji (query, maxItems)
  • search_facebook - Search Facebook (query, location, maxItems, cookiesSource)
  • search_ebay - Search eBay (query, minPrice, maxPrice, strictMode, exclusions, keywords, buyItNowOnly, canadaOnly, maxItems)

API Response Formats

All scrapers return arrays of listing objects with these common fields:

  • url: Full listing URL
  • title: Listing title
  • listingPrice: { amountFormatted, cents, currency }
  • address: Location string (or null)
  • listingType: Type of listing
  • listingStatus: Status (ACTIVE, SOLD, etc.)

Kijiji-specific fields

description, creationDate, endDate, numberOfViews, images, categoryId, adSource, flags, attributes, location, sellerInfo

Facebook-specific fields

creationDate, imageUrl, videoUrl, seller, categoryId, deliveryTypes

eBay-specific fields

Minimal - mainly the common fields

Technical Details

  • TypeScript with path mapping (@/*src/*) per package
  • Dependencies: linkedom (parsing), unidecode (text utils), cli-progress (CLI output)
  • No database - stateless HTTP fetches to marketplaces
  • Rate limiting: Respects X-RateLimit-* headers, configurable delays

Development Notes

  • Facebook requires valid session cookies - set FACEBOOK_COOKIE env var or create cookies/facebook.json
  • eBay uses custom headers to bypass basic bot detection
  • Kijiji parses Apollo state from Next.js hydration data
  • All scrapers handle retries on 429/5xx errors