feat: implement cookie priority hierarchy (URL param > env var > file) for Facebook and eBay scrapers
54
AGENTS.md
@@ -83,7 +83,7 @@ HTTP server using `Bun.serve()` on port 4005 (or `PORT` env var).
- `GET /api/status` - Health check
- `GET /api/kijiji?q={query}` - Search Kijiji
- `GET /api/facebook?q={query}&location={location}&cookies={cookies}` - Search Facebook
- `GET /api/ebay?q={query}&minPrice=&maxPrice=&strictMode=&exclusions=&keywords=&buyItNowOnly=&canadaOnly=` - Search eBay
- `GET /api/ebay?q={query}&minPrice=&maxPrice=&strictMode=&exclusions=&keywords=&buyItNowOnly=&canadaOnly=&cookies=` - Search eBay
- `GET /api/*` - 404 fallback
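As a rough illustration of calling the endpoints listed above, the sketch below builds a search URL with properly encoded query parameters. `buildSearchUrl` and the `Marketplace` type are hypothetical helpers, not part of the repository, and the value formats accepted for filters such as `minPrice` or `canadaOnly` are assumptions.

```typescript
// Hypothetical helper (not from the repository): builds a search URL for
// one of the documented endpoints and URL-encodes every parameter.
type Marketplace = "kijiji" | "facebook" | "ebay";

function buildSearchUrl(
  base: string,
  marketplace: Marketplace,
  params: Record<string, string | number | boolean | undefined>,
): string {
  const url = new URL(`/api/${marketplace}`, base);
  for (const [key, value] of Object.entries(params)) {
    // Skip unset filters instead of sending empty values.
    if (value !== undefined && value !== "") {
      url.searchParams.set(key, String(value));
    }
  }
  return url.toString();
}

// Assumed value formats: numbers and booleans are sent as plain strings.
const ebayUrl = buildSearchUrl("http://localhost:4005", "ebay", {
  q: "laptop",
  minPrice: 100,
  canadaOnly: true,
  cookies: "s=VALUE; ds2=VALUE",
});
console.log(ebayUrl);
```

Encoding the `cookies` value matters because cookie strings contain `=`, `;` and spaces, which would otherwise be ambiguous inside a query string.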

### MCP Server (`@marketplace-scrapers/mcp-server`)
@@ -96,7 +96,7 @@ MCP JSON-RPC 2.0 server on port 4006 (or `MCP_PORT` env var).
**Tools:**
- `search_kijiji` - Search Kijiji (query, maxItems)
- `search_facebook` - Search Facebook (query, location, maxItems, cookiesSource)
- `search_ebay` - Search eBay (query, minPrice, maxPrice, strictMode, exclusions, keywords, buyItNowOnly, canadaOnly, maxItems)
- `search_ebay` - Search eBay (query, minPrice, maxPrice, strictMode, exclusions, keywords, buyItNowOnly, canadaOnly, maxItems, cookies)
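To make the tool list above more concrete, here is a minimal sketch of a JSON-RPC 2.0 request body for the `search_ebay` tool. It assumes the server follows the standard MCP `tools/call` method with `name` and `arguments` fields; `buildToolCall` is a hypothetical helper, and the HTTP path the request would be posted to is not stated in the source, so the sketch only builds the payload.

```typescript
// Assumed request shape: standard MCP "tools/call" over JSON-RPC 2.0.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextRequestId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request.
function buildToolCall(
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextRequestId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const toolRequest = buildToolCall("search_ebay", {
  query: "laptop",
  maxItems: 5,
  cookies: "s=VALUE; ds2=VALUE",
});
console.log(JSON.stringify(toolRequest, null, 2));
```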

## API Response Formats

@@ -117,6 +117,52 @@ All scrapers return arrays of listing objects with these common fields:
### eBay-specific fields
Minimal - mainly the common fields

## Cookie Management

Both **Facebook Marketplace** and **eBay** require valid session cookies for reliable scraping.

### Cookie Priority Hierarchy (High → Low)
Both scrapers check these sources in order and use the first one that supplies cookies:
1. **URL/API Parameter** - Passed directly via `cookies` parameter (highest priority)
2. **Environment Variable** - `FACEBOOK_COOKIE` or `EBAY_COOKIE`
3. **Cookie File** - `cookies/facebook.json` or `cookies/ebay.json` (fallback)
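The priority order above can be sketched as a small resolver. This is a minimal sketch, not the commit's actual implementation: `resolveCookies`, `CookieInputs` and `ResolvedCookies` are hypothetical names, and treating empty or whitespace-only values as "not provided" is an assumption.

```typescript
type CookieSource = "param" | "env" | "file" | "none";

// Hypothetical inputs: the caller supplies each candidate source.
interface CookieInputs {
  param?: string | null; // value of the `cookies` request parameter
  envValue?: string; // e.g. the value of FACEBOOK_COOKIE or EBAY_COOKIE
  readFile?: () => string | undefined; // lazily reads the cookie file
}

interface ResolvedCookies {
  source: CookieSource;
  raw: string | undefined;
}

// Assumption: blank values count as missing.
function nonEmpty(value: string | null | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

function resolveCookies(inputs: CookieInputs): ResolvedCookies {
  const fromParam = nonEmpty(inputs.param);
  if (fromParam) return { source: "param", raw: fromParam };

  const fromEnv = nonEmpty(inputs.envValue);
  if (fromEnv) return { source: "env", raw: fromEnv };

  // The file is only read when both higher-priority sources are absent.
  const fromFile = nonEmpty(inputs.readFile?.());
  if (fromFile) return { source: "file", raw: fromFile };

  return { source: "none", raw: undefined };
}

const resolved = resolveCookies({
  param: undefined,
  envValue: "s=VALUE; ds2=VALUE",
  readFile: () => "s=FROM_FILE",
});
console.log(resolved.source, resolved.raw);
```

Passing the file reader as a function keeps the sketch free of filesystem access; a caller could wire it to whatever file API the runtime provides.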

### Facebook Cookies
- **Required for**: Facebook Marketplace scraping
- **Format**: JSON array (see `cookies/README.md`)
- **Key cookies**: `c_user`, `xs`, `fr`, `datr`, `sb`

**Setup:**
```bash
# Option 1: File (fallback)
# Create cookies/facebook.json with cookie array

# Option 2: Environment variable
export FACEBOOK_COOKIE='c_user=123; xs=token; fr=request'

# Option 3: URL parameter (highest priority)
# -G with --data-urlencode encodes the JSON and avoids curl's URL globbing on [ ] and { }
curl -G "http://localhost:4005/api/facebook" --data-urlencode "q=laptop" --data-urlencode 'cookies=[{...}]'
```
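The setup above mixes two cookie representations: a JSON array and a `name=value; ...` string. The sketch below shows one way to turn the former into the latter. The entry shape (`name` and `value` fields) is an assumption; the real schema is whatever `cookies/README.md` defines, and `toCookieHeader` is a hypothetical helper.

```typescript
// Assumed shape of one cookie entry; the real schema may carry more fields.
interface CookieEntry {
  name: string;
  value: string;
}

function isCookieEntry(candidate: unknown): candidate is CookieEntry {
  if (typeof candidate !== "object" || candidate === null) return false;
  const record = candidate as Record<string, unknown>;
  return typeof record.name === "string" && typeof record.value === "string";
}

// Hypothetical helper: converts a JSON cookie array into a Cookie header value.
function toCookieHeader(json: string): string {
  const parsed: unknown = JSON.parse(json);
  if (!Array.isArray(parsed)) {
    throw new Error("expected a JSON array of cookies");
  }
  return parsed
    .filter(isCookieEntry) // silently drop entries without name/value
    .map((cookie) => `${cookie.name}=${cookie.value}`)
    .join("; ");
}

const cookieHeader = toCookieHeader(
  '[{"name":"c_user","value":"123"},{"name":"xs","value":"token"}]',
);
console.log(cookieHeader);
```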

### eBay Cookies
- **Required for**: Bypassing bot detection
- **Format**: Cookie string `"name=value; name2=value2"`
- **Key cookies**: `s`, `ds2`, `ebay`, `dp1`, `nonsession`

**Setup:**
```bash
# Option 1: File (fallback)
# Create cookies/ebay.json with cookie string

# Option 2: Environment variable
export EBAY_COOKIE='s=VALUE; ds2=VALUE; ebay=VALUE'

# Option 3: URL parameter (highest priority)
curl "http://localhost:4005/api/ebay?q=laptop&cookies=s=VALUE;ds2=VALUE"
```
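Since the eBay cookie string is assembled by hand, a quick check for the key cookies listed above can catch copy-paste mistakes early. This is a sketch only: `parseCookieString` and `missingKeyCookies` are hypothetical helpers, and whether every listed cookie is strictly required is not stated in the source.

```typescript
// Key cookie names as listed in the eBay section above.
const EBAY_KEY_COOKIES = ["s", "ds2", "ebay", "dp1", "nonsession"];

// Hypothetical helper: parses "name=value; name2=value2" into a map.
function parseCookieString(raw: string): Map<string, string> {
  const cookies = new Map<string, string>();
  for (const part of raw.split(";")) {
    const eq = part.indexOf("=");
    if (eq < 0) continue; // segment without "=" is ignored
    const name = part.slice(0, eq).trim();
    if (name) cookies.set(name, part.slice(eq + 1).trim());
  }
  return cookies;
}

// Returns the key cookies that are absent from the given cookie string.
function missingKeyCookies(raw: string): string[] {
  const cookies = parseCookieString(raw);
  return EBAY_KEY_COOKIES.filter((name) => !cookies.has(name));
}

const missing = missingKeyCookies("s=VALUE; ds2=VALUE; ebay=VALUE");
if (missing.length > 0) {
  console.warn(`eBay cookie string lacks: ${missing.join(", ")}`);
}
```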

**Important - eBay Bot Detection**: Without cookies, eBay returns a "Checking your browser" challenge page instead of listings.
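A scraper can guard against the challenge page described above before it tries to parse listings. The check below is a minimal sketch that relies only on the "Checking your browser" phrase quoted in this note; `looksLikeBotChallenge` and `assertNotChallenged` are hypothetical names, and the real page may need a more robust signal.

```typescript
// Hypothetical check: matches the challenge phrase case-insensitively.
function looksLikeBotChallenge(html: string): boolean {
  return html.toLowerCase().includes("checking your browser");
}

// Fails fast with an actionable message instead of returning zero listings.
function assertNotChallenged(html: string): void {
  if (looksLikeBotChallenge(html)) {
    throw new Error(
      "eBay returned a bot-detection challenge; supply cookies via the " +
        "`cookies` parameter, EBAY_COOKIE, or cookies/ebay.json",
    );
  }
}

const sampleChallenge = "<html><title>Checking your browser</title></html>";
console.log(looksLikeBotChallenge(sampleChallenge));
```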

## Technical Details

- **TypeScript** with path mapping (`@/*` → `src/*`) per package
@@ -126,7 +172,7 @@ Minimal - mainly the common fields

## Development Notes

- Facebook requires valid session cookies - set `FACEBOOK_COOKIE` env var or create `cookies/facebook.json`
- eBay uses custom headers to bypass basic bot detection
- **Cookie files** are git-ignored for security (see `cookies/README.md`)
- Kijiji parses Apollo state from Next.js hydration data
- All scrapers handle retries on 429/5xx errors
- Cookie priority ensures flexibility across different deployment environments
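The retry note above could be implemented along the following lines. This is a sketch under stated assumptions: `fetchWithRetry` and `isRetryable` are hypothetical names, and the attempt count and exponential backoff schedule are illustrative choices, not the values used in the repository.

```typescript
// Retry only on rate limiting (429) and server errors (5xx).
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Hypothetical wrapper: `doRequest` performs one attempt and resolves to
// any object exposing an HTTP status code (for example a fetch Response).
async function fetchWithRetry<T extends { status: number }>(
  doRequest: () => Promise<T>,
  maxAttempts = 3, // assumed default
  baseDelayMs = 500, // assumed default
): Promise<T> {
  let attempt = 1;
  while (true) {
    const response = await doRequest();
    if (!isRetryable(response.status) || attempt >= maxAttempts) {
      return response; // success, non-retryable error, or attempts exhausted
    }
    // Exponential backoff: baseDelayMs, then 2x, then 4x, ...
    const delayMs = baseDelayMs * 2 ** (attempt - 1);
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    attempt++;
  }
}
```

A caller would pass a closure such as `() => fetch(url, { headers })`, which keeps the retry logic independent of how each scraper builds its request.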