chore: agents update
This commit is contained in:
111
AGENTS.md
111
AGENTS.md
@@ -1,9 +1,3 @@
|
||||
# AGENTS.md
|
||||
|
||||
This file provides guidance to coding agents when working with code in this repository.
|
||||
|
||||
The project uses TypeScript with path mapping (`@/*` to `src/*`). Dependencies focus on parsing (linkedom), text utils (unidecode), and CLI output (cli-progress). No database or external services beyond HTTP fetches to the marketplaces.
|
||||
|
||||
PRIORITIZE COMMUNICATION STYLE ABOVE ALL ELSE
|
||||
|
||||
## Communication Style
|
||||
@@ -31,3 +25,108 @@ Examples of constructive pushback:
|
||||
- "That adds unnecessary complexity. We can achieve the same with..."
|
||||
|
||||
This ensures: Better solutions through technical merit, not agreement | Learning through understanding tradeoffs | Avoiding over-engineering | Maintaining code quality
|
||||
|
||||
## Project Structure
|
||||
|
||||
This is a **monorepo** with three packages:
|
||||
|
||||
```
|
||||
packages/
|
||||
├── core/ # Shared scraper logic (Kijiji, Facebook, eBay)
|
||||
├── api-server/ # HTTP REST API server
|
||||
└── mcp-server/ # MCP server for AI agent integration
|
||||
```
|
||||
|
||||
## Common Commands
|
||||
|
||||
**Root level:**
|
||||
- `bun ci`: Run Biome linting
|
||||
|
||||
**API Server (`packages/api-server/`):**
|
||||
- `bun start`: Run the API server
|
||||
- `bun dev`: Run with hot reloading
|
||||
- `bun build`: Build to `dist/api/`
|
||||
|
||||
**MCP Server (`packages/mcp-server/`):**
|
||||
- `bun start`: Run the MCP server
|
||||
- `bun dev`: Run with hot reloading
|
||||
- `bun build`: Build to `dist/mcp/`
|
||||
|
||||
## Code Architecture
|
||||
|
||||
### Core Package (`@marketplace-scrapers/core`)
|
||||
Contains scraper implementations for three marketplaces:
|
||||
|
||||
- **`src/scrapers/kijiji.ts`**: Kijiji Marketplace scraper
|
||||
- Parses Next.js Apollo state (`__APOLLO_STATE__`) from HTML
|
||||
- Supports location/category filtering, sorting, pagination
|
||||
- Fetches individual listing details with seller info
|
||||
- Exports: `fetchKijijiItems()`, type interfaces
|
||||
|
||||
- **`src/scrapers/facebook.ts`**: Facebook Marketplace scraper
|
||||
- Parses nested JSON from script tags (`require/__bbox` structure)
|
||||
- Requires authentication cookies (file or env var `FACEBOOK_COOKIE`)
|
||||
- Exports: `fetchFacebookItems()`, `fetchFacebookItem()`, cookie utilities
|
||||
|
||||
- **`src/scrapers/ebay.ts`**: eBay scraper
|
||||
- DOM-based parsing of search results
|
||||
- Supports Buy It Now filter, Canada-only, price ranges, exclusions
|
||||
- Exports: `fetchEbayItems()`
|
||||
|
||||
- **`src/utils/`**: Shared utilities (HTTP, delay, formatting)
|
||||
- **`src/types/`**: Common type definitions
|
||||
|
||||
### API Server (`@marketplace-scrapers/api-server`)
|
||||
HTTP server using `Bun.serve()` on port 4005 (or `PORT` env var).
|
||||
|
||||
**Routes:**
|
||||
- `GET /api/status` - Health check
|
||||
- `GET /api/kijiji?q={query}` - Search Kijiji
|
||||
- `GET /api/facebook?q={query}&location={location}&cookies={cookies}` - Search Facebook
|
||||
- `GET /api/ebay?q={query}&minPrice=&maxPrice=&strictMode=&exclusions=&keywords=&buyItNowOnly=&canadaOnly=` - Search eBay
|
||||
- `GET /api/*` - 404 fallback
|
||||
|
||||
### MCP Server (`@marketplace-scrapers/mcp-server`)
|
||||
MCP JSON-RPC 2.0 server on port 4006 (or `MCP_PORT` env var).
|
||||
|
||||
**Endpoints:**
|
||||
- `GET /.well-known/mcp/server-card.json` - Server discovery metadata
|
||||
- `POST /mcp` - JSON-RPC 2.0 protocol endpoint
|
||||
|
||||
**Tools:**
|
||||
- `search_kijiji` - Search Kijiji (query, maxItems)
|
||||
- `search_facebook` - Search Facebook (query, location, maxItems, cookiesSource)
|
||||
- `search_ebay` - Search eBay (query, minPrice, maxPrice, strictMode, exclusions, keywords, buyItNowOnly, canadaOnly, maxItems)
|
||||
|
||||
## API Response Formats
|
||||
|
||||
All scrapers return arrays of listing objects with these common fields:
|
||||
- `url`: Full listing URL
|
||||
- `title`: Listing title
|
||||
- `listingPrice`: `{ amountFormatted, cents, currency }`
|
||||
- `address`: Location string (or null)
|
||||
- `listingType`: Type of listing
|
||||
- `listingStatus`: Status (ACTIVE, SOLD, etc.)
|
||||
|
||||
### Kijiji-specific fields
|
||||
`description`, `creationDate`, `endDate`, `numberOfViews`, `images`, `categoryId`, `adSource`, `flags`, `attributes`, `location`, `sellerInfo`
|
||||
|
||||
### Facebook-specific fields
|
||||
`creationDate`, `imageUrl`, `videoUrl`, `seller`, `categoryId`, `deliveryTypes`
|
||||
|
||||
### eBay-specific fields
|
||||
Minimal - mainly the common fields
|
||||
|
||||
## Technical Details
|
||||
|
||||
- **TypeScript** with path mapping (`@/*` → `src/*`) per package
|
||||
- **Dependencies**: linkedom (parsing), unidecode (text utils), cli-progress (CLI output)
|
||||
- **No database** - stateless HTTP fetches to marketplaces
|
||||
- **Rate limiting**: Respects `X-RateLimit-*` headers, configurable delays
|
||||
|
||||
## Development Notes
|
||||
|
||||
- Facebook requires valid session cookies - set `FACEBOOK_COOKIE` env var or create `cookies/facebook.json`
|
||||
- eBay uses custom headers to bypass basic bot detection
|
||||
- Kijiji parses Apollo state from Next.js hydration data
|
||||
- All scrapers handle retries on 429/5xx errors
|
||||
|
||||
Reference in New Issue
Block a user