Compare commits
12 Commits
b657ea594a...main
| SHA1 |
|---|
| ec545723bb |
| 0a246a29bf |
| 7ab33d0b02 |
| d2c3c07e7d |
| 0470a7bec7 |
| 89ad1c521f |
| 5c732287c5 |
| 20fb46190a |
| e791fc5478 |
| c1fa5168dc |
| ec2a26cedf |
| 5d99e984e0 |
104
FMARKETPLACE.md
@@ -1,44 +1,56 @@
|
|||||||
# Facebook Marketplace API Reverse Engineering
|
# Facebook Marketplace API Reverse Engineering
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
This document tracks findings from reverse-engineering Facebook Marketplace APIs for listing details.
|
|
||||||
|
This document tracks findings from reverse-engineering Facebook Marketplace APIs for
|
||||||
|
listing details.
|
||||||
|
|
||||||
## Current Implementation Status
|
## Current Implementation Status
|
||||||
|
|
||||||
- Search functionality: Implemented in `src/facebook.ts`
|
- Search functionality: Implemented in `src/facebook.ts`
|
||||||
- Individual listing details: Not yet implemented
|
- Individual listing details: Not yet implemented
|
||||||
|
|
||||||
## Findings
|
## Findings
|
||||||
|
|
||||||
### Step 1: Initial Setup
|
### Step 1: Initial Setup
|
||||||
|
|
||||||
- Using Chrome DevTools to inspect Facebook Marketplace
|
- Using Chrome DevTools to inspect Facebook Marketplace
|
||||||
- Need to authenticate with Facebook account to access marketplace data
|
- Need to authenticate with Facebook account to access marketplace data
|
||||||
- Cookies required for full access
|
- Cookies required for full access
|
||||||
- Current status: Successfully logged in and accessed marketplace data
|
- Current status: Successfully logged in and accessed marketplace data
|
||||||
|
|
||||||
### Step 2: Individual Listing Details Analysis - COMPLETED
|
### Step 2: Individual Listing Details Analysis - COMPLETED
|
||||||
|
|
||||||
- **Data Location**: Embedded in HTML script tags within `require` array structure
|
- **Data Location**: Embedded in HTML script tags within `require` array structure
|
||||||
- **Path**: `require[0][3].__bbox.result.data.viewer.marketplace_product_details_page.target`
|
- **Path**:
|
||||||
|
`require[0][3].__bbox.result.data.viewer.marketplace_product_details_page.target`
|
||||||
- **Authentication**: Required for full data access
|
- **Authentication**: Required for full data access
|
||||||
- **Current Status**: Successfully reverse-engineered the API structure and data extraction method
|
- **Current Status**: Successfully reverse-engineered the API structure and data
|
||||||
|
extraction method
|
||||||
|
|
||||||
### API Endpoints Discovered
|
### API Endpoints Discovered
|
||||||
|
|
||||||
#### Search Endpoint
|
#### Search Endpoint
|
||||||
|
|
||||||
- URL: `https://www.facebook.com/marketplace/{location}/search`
|
- URL: `https://www.facebook.com/marketplace/{location}/search`
|
||||||
- Parameters: `query`, `sortBy`, `exact`
|
- Parameters: `query`, `sortBy`, `exact`
|
||||||
- Data embedded in HTML script tags with `require` structure
|
- Data embedded in HTML script tags with `require` structure
|
||||||
- Authentication: Required (cookies)
|
- Authentication: Required (cookies)
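
A minimal sketch of how the search URL above might be assembled. Only the URL shape and the parameter names `query`, `sortBy`, and `exact` come from these notes. The helper name `buildSearchUrl`, the `true`/`false` serialization of `exact`, and the example location slug are assumptions for illustration.

```typescript
// Hypothetical helper: assembles a Marketplace search URL from the documented
// URL shape and parameter names. Accepted values for sortBy are not documented
// here, so the caller's value is passed through unchanged.
interface SearchOptions {
  location: string; // location slug, e.g. "toronto" (illustrative)
  query: string;
  sortBy?: string;
  exact?: boolean;
}

function buildSearchUrl(opts: SearchOptions): string {
  const params: string[] = [`query=${encodeURIComponent(opts.query)}`];
  if (opts.sortBy !== undefined) {
    params.push(`sortBy=${encodeURIComponent(opts.sortBy)}`);
  }
  if (opts.exact !== undefined) {
    // Assumption: booleans are serialized as "true" / "false".
    params.push(`exact=${opts.exact ? "true" : "false"}`);
  }
  const base = `https://www.facebook.com/marketplace/${encodeURIComponent(opts.location)}/search`;
  return `${base}?${params.join("&")}`;
}
```

The request still needs the session cookies noted above; this helper only builds the address.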
|
||||||
|
|
||||||
#### Listing Details Endpoint
|
#### Listing Details Endpoint
|
||||||
|
|
||||||
- **URL Structure**: `https://www.facebook.com/marketplace/item/{listing_id}/`
|
- **URL Structure**: `https://www.facebook.com/marketplace/item/{listing_id}/`
|
||||||
- **Data Source**: Server-side rendered HTML with embedded JSON data in script tags
|
- **Data Source**: Server-side rendered HTML with embedded JSON data in script tags
|
||||||
- **Data Structure**: Relay/GraphQL style data structure under `require[0][3].__bbox.require[...].__bbox.result.data.viewer.marketplace_product_details_page.target`
|
- **Data Structure**: Relay/GraphQL style data structure under
|
||||||
- **Extraction Method**: Parse JSON from script tags containing marketplace data, navigate to the target object
|
`require[0][3].__bbox.require[...].__bbox.result.data.viewer.marketplace_product_details_page.target`
|
||||||
|
- **Extraction Method**: Parse JSON from script tags containing marketplace data,
|
||||||
|
navigate to the target object
|
||||||
- **Authentication**: Required (cookies)
|
- **Authentication**: Required (cookies)
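
The listing URL structure above can be sketched as a pair of small helpers. The names `listingUrl` and `listingIdFromUrl` are hypothetical, and the numeric-only ID check is an assumption based on the example IDs in these notes. IDs are kept as strings so they are never rounded or reformatted.

```typescript
// Hypothetical helpers for the documented listing URL shape:
//   https://www.facebook.com/marketplace/item/{listing_id}/
// Assumption: listing IDs consist of digits only, as in the examples in these notes.
const LISTING_URL_RE =
  /^https:\/\/www\.facebook\.com\/marketplace\/item\/(\d+)\/?(?:[?#].*)?$/;

function listingUrl(listingId: string): string {
  if (!/^\d+$/.test(listingId)) {
    throw new Error(`Unexpected listing ID: ${listingId}`);
  }
  return `https://www.facebook.com/marketplace/item/${listingId}/`;
}

// Returns the ID embedded in a listing URL, or null if the URL has another shape.
function listingIdFromUrl(url: string): string | null {
  const match = LISTING_URL_RE.exec(url);
  return match ? match[1] : null;
}
```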
|
||||||
|
|
||||||
### Listing Data Structure Discovered (Current - 2026)
|
### Listing Data Structure Discovered (Current - 2026)
|
||||||
|
|
||||||
The current Facebook Marketplace API returns a comprehensive `GroupCommerceProductItem` object with the following key properties:
|
The current Facebook Marketplace API returns a comprehensive `GroupCommerceProductItem`
|
||||||
|
object with the following key properties:
|
||||||
|
|
||||||
```typescript
|
```typescript
|
||||||
interface FacebookMarketplaceItem {
|
interface FacebookMarketplaceItem {
|
||||||
@@ -151,6 +163,7 @@ interface FacebookMarketplaceItem {
|
|||||||
```
|
```
|
||||||
|
|
||||||
### Example Data Extracted (Current Structure)
|
### Example Data Extracted (Current Structure)
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"__typename": "GroupCommerceProductItem",
|
"__typename": "GroupCommerceProductItem",
|
||||||
@@ -228,36 +241,47 @@ interface FacebookMarketplaceItem {
|
|||||||
## Data Extraction Method
|
## Data Extraction Method
|
||||||
|
|
||||||
### Current Method (2026)
|
### Current Method (2026)
|
||||||
Facebook Marketplace listing data is embedded in JSON within `<script>` tags in the HTML response. The extraction process:
|
|
||||||
|
|
||||||
1. **Find the Correct Script**: Look for script tags containing marketplace listing data by searching for key fields like `marketplace_listing_title`, `redacted_description`, and `formatted_price`.
|
Facebook Marketplace listing data is embedded in JSON within `<script>` tags in the HTML
|
||||||
|
response. The extraction process:
|
||||||
|
|
||||||
|
1. **Find the Correct Script**: Look for script tags containing marketplace listing data
|
||||||
|
by searching for key fields like `marketplace_listing_title`, `redacted_description`,
|
||||||
|
and `formatted_price`.
|
||||||
|
|
||||||
2. **Parse JSON Structure**: The data is nested within a `require` array structure:
|
2. **Parse JSON Structure**: The data is nested within a `require` array structure:
|
||||||
```
|
```
|
||||||
require[0][3].__bbox.require[3][3][1].__bbox.result.data.viewer.marketplace_product_details_page.target
|
require[0][3].__bbox.require[3][3][1].__bbox.result.data.viewer.marketplace_product_details_page.target
|
||||||
```
|
```
|
||||||
|
|
||||||
3. **Navigate to Target Object**: The actual listing data is a `GroupCommerceProductItem` object containing comprehensive information about the listing, seller, and vehicle details.
|
3. **Navigate to Target Object**: The actual listing data is a
|
||||||
|
`GroupCommerceProductItem` object containing comprehensive information about the
|
||||||
|
listing, seller, and vehicle details.
|
||||||
|
|
||||||
4. **Handle Dynamic Structure**: Facebook may change the exact path, so robust extraction should search for the target object recursively within the parsed JSON.
|
4. **Handle Dynamic Structure**: Facebook may change the exact path, so robust
|
||||||
|
extraction should search for the target object recursively within the parsed JSON.
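
The four steps above can be sketched as follows. This is an illustration under stated assumptions, not the project's `extractFacebookItemData()`: the names `findByTypename` and `findListingInHtml` are hypothetical, the script body is assumed to be pure JSON, and the first `GroupCommerceProductItem` found is assumed to be the listing itself.

```typescript
// Illustrative sketch only; not the project's extractFacebookItemData().
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Fields these notes use to recognise the right <script> tag.
const MARKERS = ["marketplace_listing_title", "redacted_description", "formatted_price"];

// Depth-first search for the first object whose __typename matches.
function findByTypename(node: Json, typename: string): { [key: string]: Json } | null {
  if (node === null || typeof node !== "object") return null;
  if (Array.isArray(node)) {
    for (const child of node) {
      const hit = findByTypename(child, typename);
      if (hit) return hit;
    }
    return null;
  }
  if (node["__typename"] === typename) return node;
  for (const key of Object.keys(node)) {
    const hit = findByTypename(node[key], typename);
    if (hit) return hit;
  }
  return null;
}

function findListingInHtml(html: string): { [key: string]: Json } | null {
  const scriptRe = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  let match: RegExpExecArray | null;
  while ((match = scriptRe.exec(html)) !== null) {
    const body = match[1];
    // Step 1: skip scripts that mention none of the marker fields.
    if (!MARKERS.some((marker) => body.indexOf(marker) !== -1)) continue;
    let parsed: Json;
    try {
      parsed = JSON.parse(body) as Json; // assumption: the body is pure JSON
    } catch {
      continue; // not JSON; try the next script
    }
    // Steps 2-4: search recursively instead of hard-coding the require[...] path.
    const item = findByTypename(parsed, "GroupCommerceProductItem");
    if (item) return item;
  }
  return null;
}
```

Searching by `__typename` trades precision for resilience: it keeps working if the index path shifts, but a page that embeds several `GroupCommerceProductItem` objects would need an extra check, for example on `marketplace_listing_title`.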
|
||||||
|
|
||||||
### Authentication Requirements
|
### Authentication Requirements
|
||||||
|
|
||||||
- Valid Facebook session cookies are required
|
- Valid Facebook session cookies are required
|
||||||
- User must be logged in to Facebook
|
- User must be logged in to Facebook
|
||||||
- Marketplace access may be location-restricted
|
- Marketplace access may be location-restricted
|
||||||
|
|
||||||
## Tools Used
|
## Tools Used
|
||||||
|
|
||||||
- Chrome DevTools Protocol
|
- Chrome DevTools Protocol
|
||||||
- Network monitoring
|
- Network monitoring
|
||||||
- HTML/script parsing
|
- HTML/script parsing
|
||||||
- JSON structure analysis
|
- JSON structure analysis
|
||||||
|
|
||||||
## Implementation Status
|
## Implementation Status
|
||||||
|
|
||||||
- ✅ Successfully reverse-engineered Facebook Marketplace API for listing details
|
- ✅ Successfully reverse-engineered Facebook Marketplace API for listing details
|
||||||
- ✅ Identified current data structure and extraction method (2026)
|
- ✅ Identified current data structure and extraction method (2026)
|
||||||
- ✅ Documented comprehensive GroupCommerceProductItem interface
|
- ✅ Documented comprehensive GroupCommerceProductItem interface
|
||||||
- ✅ Implemented `extractFacebookItemData()` function with script parsing logic
|
- ✅ Implemented `extractFacebookItemData()` function with script parsing logic
|
||||||
- ✅ Implemented `parseFacebookItem()` function to convert GroupCommerceProductItem to ListingDetails
|
- ✅ Implemented `parseFacebookItem()` function to convert GroupCommerceProductItem to
|
||||||
|
ListingDetails
|
||||||
- ✅ Implemented `fetchFacebookItem()` function with authentication and error handling
|
- ✅ Implemented `fetchFacebookItem()` function with authentication and error handling
|
||||||
- ✅ Updated TypeScript interfaces to match current API structure
|
- ✅ Updated TypeScript interfaces to match current API structure
|
||||||
- ✅ Added robust extraction with fallback methods for changing API paths
|
- ✅ Added robust extraction with fallback methods for changing API paths
|
||||||
@@ -266,12 +290,15 @@ Facebook Marketplace listing data is embedded in JSON within `<script>` tags in
|
|||||||
|
|
||||||
### Core Functions Implemented
|
### Core Functions Implemented
|
||||||
|
|
||||||
1. **`extractFacebookItemData(htmlString)`**: Extracts marketplace item data from HTML-embedded JSON in script tags
|
1. **`extractFacebookItemData(htmlString)`**: Extracts marketplace item data from
|
||||||
|
HTML-embedded JSON in script tags
|
||||||
- Searches for scripts containing marketplace listing data
|
- Searches for scripts containing marketplace listing data
|
||||||
- Uses primary path: `require[0][3][0].__bbox.require[3][3][1].__bbox.result.data.viewer.marketplace_product_details_page.target`
|
- Uses primary path:
|
||||||
|
`require[0][3][0].__bbox.require[3][3][1].__bbox.result.data.viewer.marketplace_product_details_page.target`
|
||||||
- Falls back to recursive search for GroupCommerceProductItem objects
|
- Falls back to recursive search for GroupCommerceProductItem objects
|
||||||
|
|
||||||
2. **`parseFacebookItem(item)`**: Converts Facebook's GroupCommerceProductItem to unified ListingDetails format
|
2. **`parseFacebookItem(item)`**: Converts Facebook’s GroupCommerceProductItem to
|
||||||
|
unified ListingDetails format
|
||||||
- Handles pricing (FREE listings, CAD currency)
|
- Handles pricing (FREE listings, CAD currency)
|
||||||
- Extracts seller information, location, and status
|
- Extracts seller information, location, and status
|
||||||
- Supports vehicle-specific metadata
|
- Supports vehicle-specific metadata
|
||||||
@@ -284,25 +311,31 @@ Facebook Marketplace listing data is embedded in JSON within `<script>` tags in
|
|||||||
- Returns parsed ListingDetails or null on failure
|
- Returns parsed ListingDetails or null on failure
|
||||||
|
|
||||||
### Authentication Requirements
|
### Authentication Requirements
|
||||||
- Facebook session cookies required in `./cookies/facebook.json` or provided as parameter
|
|
||||||
|
- Facebook session cookies required in `./cookies/facebook.json` or provided as
|
||||||
|
parameter
|
||||||
- Cookies must include valid authentication tokens for marketplace access
|
- Cookies must include valid authentication tokens for marketplace access
|
||||||
- Handles cookie expiration and domain validation
|
- Handles cookie expiration and domain validation
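
As a rough sketch of the expiration and domain checks listed above, the following builds a `Cookie` header from stored cookie records. The record shape is an assumption, since the layout of `./cookies/facebook.json` is not documented in these notes, and the names `StoredCookie`, `domainMatches`, and `cookieHeader` are hypothetical.

```typescript
// Assumed cookie record shape; the real layout of ./cookies/facebook.json
// is not documented in these notes.
interface StoredCookie {
  name: string;
  value: string;
  domain: string;   // e.g. ".facebook.com"
  expires?: number; // Unix seconds; absent for session cookies (assumption)
}

// True when the host equals the cookie domain or is a subdomain of it.
function domainMatches(host: string, cookieDomain: string): boolean {
  const d = cookieDomain.replace(/^\./, "").toLowerCase();
  const h = host.toLowerCase();
  return h === d || h.slice(-(d.length + 1)) === `.${d}`;
}

// Keeps cookies that match the host and have not expired, then joins them
// into a single Cookie header value.
function cookieHeader(cookies: StoredCookie[], host: string, nowSeconds: number): string {
  return cookies
    .filter((c) => domainMatches(host, c.domain))
    .filter((c) => c.expires === undefined || c.expires > nowSeconds)
    .map((c) => `${c.name}=${c.value}`)
    .join("; ");
}
```

Passing the current time in as `nowSeconds` keeps the function deterministic, which makes the expiry behaviour easy to test.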
|
||||||
|
|
||||||
## Current Implementation Status - 2026 Verification
|
## Current Implementation Status - 2026 Verification
|
||||||
|
|
||||||
### Step 3: API Verification and Current Structure Analysis (January 2026)
|
### Step 3: API Verification and Current Structure Analysis (January 2026)
|
||||||
|
|
||||||
- **Verification Date**: January 22, 2026
|
- **Verification Date**: January 22, 2026
|
||||||
- **Status**: Successfully verified current Facebook Marketplace API structure
|
- **Status**: Successfully verified current Facebook Marketplace API structure
|
||||||
- **Data Source**: Embedded JSON in HTML script tags (server-side rendered)
|
- **Data Source**: Embedded JSON in HTML script tags (server-side rendered)
|
||||||
- **Extraction Path**: `require[0][3].__bbox.require[3][3][1].__bbox.result.data.viewer.marketplace_product_details_page.target`
|
- **Extraction Path**:
|
||||||
|
`require[0][3].__bbox.require[3][3][1].__bbox.result.data.viewer.marketplace_product_details_page.target`
|
||||||
|
|
||||||
#### Verified Listing Structure (Real Example - 2006 Hyundai Tiburon)
|
#### Verified Listing Structure (Real Example - 2006 Hyundai Tiburon)
|
||||||
|
|
||||||
- **Listing ID**: 1226468515995685
|
- **Listing ID**: 1226468515995685
|
||||||
- **Title**: "2006 Hyundai Tiburon"
|
- **Title**: “2006 Hyundai Tiburon”
|
||||||
- **Price**: CA$3,000 (formatted_price.text)
|
- **Price**: CA$3,000 (formatted_price.text)
|
||||||
- **Raw Price Data**: {"amount_with_offset": "300000", "currency": "CAD", "amount": "3000.00"}
|
- **Raw Price Data**: {"amount_with_offset": "300000", "currency": "CAD", "amount": "3000.00"}
|
||||||
- **Location**: Hamilton, ON (with coordinates: 43.250427246094, -79.963989257812)
|
- **Location**: Hamilton, ON (with coordinates: 43.250427246094, -79.963989257812)
|
||||||
- **Description**: "As is" (redacted_description.text)
|
- **Description**: “As is” (redacted_description.text)
|
||||||
- **Vehicle Details**:
|
- **Vehicle Details**:
|
||||||
- Make: Hyundai
|
- Make: Hyundai
|
||||||
- Model: Tiburon
|
- Model: Tiburon
|
||||||
@@ -323,41 +356,54 @@ Facebook Marketplace listing data is embedded in JSON within `<script>` tags in
|
|||||||
- **Messaging**: Enabled
|
- **Messaging**: Enabled
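
The raw price fields in the verified example can be normalised along these lines. This is a sketch, not the project's `parseFacebookItem()`: the names `RawPrice`, `ParsedPrice`, and `parsePrice` are hypothetical, the offset of 100 is inferred from the single example (`"300000"` alongside `"3000.00"`), and treating a zero amount as a free listing is an assumption.

```typescript
// Raw price fields as they appear in the verified example in these notes.
interface RawPrice {
  amount_with_offset?: string;
  amount?: string;
  currency: string;
}

interface ParsedPrice {
  cents: number;    // price in minor units
  currency: string; // e.g. "CAD"
  isFree: boolean;  // assumption: a zero amount means a free listing
}

function parsePrice(raw: RawPrice): ParsedPrice | null {
  let cents: number | null = null;
  if (raw.amount !== undefined) {
    const major = parseFloat(raw.amount);
    if (isFinite(major)) cents = Math.round(major * 100);
  }
  if (cents === null && raw.amount_with_offset !== undefined) {
    // Assumption: the offset is 100 (minor units), inferred from
    // "300000" pairing with "3000.00" in the one documented example.
    const minor = parseInt(raw.amount_with_offset, 10);
    if (isFinite(minor)) cents = minor;
  }
  if (cents === null) return null;
  return { cents, currency: raw.currency, isFree: cents === 0 };
}
```

Working in integer minor units avoids floating-point drift when prices are later compared or summed.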
|
||||||
|
|
||||||
#### Current API Characteristics
|
#### Current API Characteristics
|
||||||
|
|
||||||
- **Authentication**: Still requires valid Facebook session cookies
|
- **Authentication**: Still requires valid Facebook session cookies
|
||||||
- **Data Format**: Server-side rendered HTML with embedded GraphQL/Relay JSON
|
- **Data Format**: Server-side rendered HTML with embedded GraphQL/Relay JSON
|
||||||
- **Structure Stability**: Primary extraction path remains functional
|
- **Structure Stability**: Primary extraction path remains functional
|
||||||
- **Additional Features**: Includes marketplace ratings, seller verification badges, cross-posting info
|
- **Additional Features**: Includes marketplace ratings, seller verification badges,
|
||||||
|
cross-posting info
|
||||||
|
|
||||||
### API Changes Observed Since 2024 Documentation
|
### API Changes Observed Since 2024 Documentation
|
||||||
|
|
||||||
- **Minimal Changes**: Core data structure largely unchanged
|
- **Minimal Changes**: Core data structure largely unchanged
|
||||||
- **Enhanced Fields**: Added more detailed vehicle specifications and seller profile information
|
- **Enhanced Fields**: Added more detailed vehicle specifications and seller profile
|
||||||
- **GraphQL Integration**: Deeper integration with Facebook's GraphQL infrastructure
|
information
|
||||||
|
- **GraphQL Integration**: Deeper integration with Facebook’s GraphQL infrastructure
|
||||||
- **Security Features**: Additional integrity checks and reporting mechanisms
|
- **Security Features**: Additional integrity checks and reporting mechanisms
|
||||||
|
|
||||||
### Multi-Category Testing Results (January 2026)
|
### Multi-Category Testing Results (January 2026)
|
||||||
|
|
||||||
Successfully tested extraction across different listing categories:
|
Successfully tested extraction across different listing categories:
|
||||||
|
|
||||||
#### 1. Vehicle Listings (Automotive)
|
#### 1. Vehicle Listings (Automotive)
|
||||||
|
|
||||||
- **Example**: 2006 Hyundai Tiburon (ID: 1226468515995685)
|
- **Example**: 2006 Hyundai Tiburon (ID: 1226468515995685)
|
||||||
- **Status**: ✅ Fully functional
|
- **Status**: ✅ Fully functional
|
||||||
- **Data Extracted**: Complete vehicle specs, pricing, seller info, location coordinates
|
- **Data Extracted**: Complete vehicle specs, pricing, seller info, location coordinates
|
||||||
- **Unique Fields**: vehicle_make_display_name, vehicle_odometer_data, vehicle_transmission_type, vehicle_exterior_color, vehicle_interior_color, vehicle_fuel_type
|
- **Unique Fields**: vehicle_make_display_name, vehicle_odometer_data,
|
||||||
|
vehicle_transmission_type, vehicle_exterior_color, vehicle_interior_color,
|
||||||
|
vehicle_fuel_type
|
||||||
|
|
||||||
#### 2. Electronics Listings
|
#### 2. Electronics Listings
|
||||||
|
|
||||||
- **Example**: Nintendo Switch (ID: 3903865769914262)
|
- **Example**: Nintendo Switch (ID: 3903865769914262)
|
||||||
- **Status**: ✅ Fully functional
|
- **Status**: ✅ Fully functional
|
||||||
- **Data Extracted**: Title, price (CA$140), location (Toronto, ON), condition (Used - like new), seller (Yitao Hou)
|
- **Data Extracted**: Title, price (CA$140), location (Toronto, ON), condition (Used -
|
||||||
|
like new), seller (Yitao Hou)
|
||||||
- **Category**: Electronics (category_id: 479353692612078)
|
- **Category**: Electronics (category_id: 479353692612078)
|
||||||
- **Notes**: Standard GroupCommerceProductItem structure applies
|
- **Notes**: Standard GroupCommerceProductItem structure applies
|
||||||
|
|
||||||
#### 3. Home Goods/Furniture Listings
|
#### 3. Home Goods/Furniture Listings
|
||||||
|
|
||||||
- **Example**: Tabletop Mirror (cat not included) (ID: 1082389057290709)
|
- **Example**: Tabletop Mirror (cat not included) (ID: 1082389057290709)
|
||||||
- **Status**: ✅ Fully functional
|
- **Status**: ✅ Fully functional
|
||||||
- **Data Extracted**: Title, price (CA$5), location (Mississauga, ON), condition (Used - like new), seller (Rohit Rehan)
|
- **Data Extracted**: Title, price (CA$5), location (Mississauga, ON), condition (Used -
|
||||||
|
like new), seller (Rohit Rehan)
|
||||||
- **Category**: Home Goods (category_id: 1569171756675761)
|
- **Category**: Home Goods (category_id: 1569171756675761)
|
||||||
- **Notes**: Includes detailed description and delivery options
|
- **Notes**: Includes detailed description and delivery options
|
||||||
|
|
||||||
#### Testing Summary
|
#### Testing Summary
|
||||||
|
|
||||||
- **Extraction Method**: Consistent across all categories
|
- **Extraction Method**: Consistent across all categories
|
||||||
- **Data Structure**: GroupCommerceProductItem interface works for all listing types
|
- **Data Structure**: GroupCommerceProductItem interface works for all listing types
|
||||||
- **Authentication**: Required for all categories
|
- **Authentication**: Required for all categories
|
||||||
@@ -365,16 +411,20 @@ Successfully tested extraction across different listing categories:
|
|||||||
- **Edge Cases**: All tested listings were active/in-person pickup
|
- **Edge Cases**: All tested listings were active/in-person pickup
|
||||||
|
|
||||||
## Implementation Status - COMPLETED (January 2026)
|
## Implementation Status - COMPLETED (January 2026)
|
||||||
|
|
||||||
- ✅ Successfully reverse-engineered Facebook Marketplace API for listing details
|
- ✅ Successfully reverse-engineered Facebook Marketplace API for listing details
|
||||||
- ✅ Verified current API structure and extraction method (January 2026)
|
- ✅ Verified current API structure and extraction method (January 2026)
|
||||||
- ✅ Tested extraction across multiple listing categories (vehicles, electronics, home goods)
|
- ✅ Tested extraction across multiple listing categories (vehicles, electronics, home
|
||||||
- ✅ Implemented comprehensive error handling for sold/removed listings and authentication failures
|
goods)
|
||||||
|
- ✅ Implemented comprehensive error handling for sold/removed listings and
|
||||||
|
authentication failures
|
||||||
- ✅ Enhanced rate limiting and retry logic (already robust)
|
- ✅ Enhanced rate limiting and retry logic (already robust)
|
||||||
- ✅ Added monitoring and metrics for API stability detection
|
- ✅ Added monitoring and metrics for API stability detection
|
||||||
- ✅ Updated all scraper functions to use verified extraction methods
|
- ✅ Updated all scraper functions to use verified extraction methods
|
||||||
- ✅ Documented comprehensive GroupCommerceProductItem interface with real examples
|
- ✅ Documented comprehensive GroupCommerceProductItem interface with real examples
|
||||||
|
|
||||||
## Next Steps (Future Maintenance)
|
## Next Steps (Future Maintenance)
|
||||||
|
|
||||||
1. Monitor extraction success rates for API change detection
|
1. Monitor extraction success rates for API change detection
|
||||||
2. Update extraction paths if Facebook changes their API structure
|
2. Update extraction paths if Facebook changes their API structure
|
||||||
3. Add support for additional marketplace features as they become available
|
3. Add support for additional marketplace features as they become available
|
||||||
|
|||||||
145
KIJIJI.md
@@ -1,9 +1,13 @@
|
|||||||
# Kijiji API Findings
|
# Kijiji API Findings
|
||||||
|
|
||||||
## Overview
|
## Overview
|
||||||
Kijiji is a Canadian classifieds marketplace that uses a modern web application built with Next.js and Apollo GraphQL. The search results are powered by a GraphQL API with client-side state management.
|
|
||||||
|
Kijiji is a Canadian classifieds marketplace that uses a modern web application built
|
||||||
|
with Next.js and Apollo GraphQL. The search results are powered by a GraphQL API with
|
||||||
|
client-side state management.
|
||||||
|
|
||||||
## Initial Page Load (Homepage)
|
## Initial Page Load (Homepage)
|
||||||
|
|
||||||
- **URL**: https://www.kijiji.ca/
|
- **URL**: https://www.kijiji.ca/
|
||||||
- **Architecture**: Server-side rendered React application with Next.js
|
- **Architecture**: Server-side rendered React application with Next.js
|
||||||
- **Data Sources**:
|
- **Data Sources**:
|
||||||
@@ -12,18 +16,27 @@ Kijiji is a Canadian classifieds marketplace that uses a modern web application
|
|||||||
- No initial API calls for listings - data appears to be embedded in HTML
|
- No initial API calls for listings - data appears to be embedded in HTML
|
||||||
|
|
||||||
## Search Results Page
|
## Search Results Page
|
||||||
|
|
||||||
- **URL Pattern**: `https://www.kijiji.ca/b-[location]/[keywords]/k0l0`
|
- **URL Pattern**: `https://www.kijiji.ca/b-[location]/[keywords]/k0l0`
|
||||||
- **Example**: `https://www.kijiji.ca/b-canada/iphone/k0l0`
|
- **Example**: `https://www.kijiji.ca/b-canada/iphone/k0l0`
|
||||||
- **Technology Stack**: Next.js with Apollo GraphQL client
|
- **Technology Stack**: Next.js with Apollo GraphQL client
|
||||||
- **Data Structure**: Uses `__APOLLO_STATE__` global object containing normalized GraphQL cache
|
- **Data Structure**: Uses `__APOLLO_STATE__` global object containing normalized
|
||||||
|
GraphQL cache
|
||||||
|
|
||||||
### GraphQL Data Structure
|
### GraphQL Data Structure
|
||||||
|
|
||||||
#### Data Location
|
#### Data Location
|
||||||
Search results data is embedded in the Next.js page props under `__NEXT_DATA__.props.pageProps.__APOLLO_STATE__`. The data is pre-rendered on the server and sent to the client. Each page (including pagination) has its own pre-rendered data.
|
|
||||||
|
Search results data is embedded in the Next.js page props under
|
||||||
|
`__NEXT_DATA__.props.pageProps.__APOLLO_STATE__`. The data is pre-rendered on the server
|
||||||
|
and sent to the client.
|
||||||
|
Each page (including pagination) has its own pre-rendered data.
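
A minimal sketch of reading that embedded state from a fetched page. It relies on Next.js serialising page props into a `<script id="__NEXT_DATA__">` tag; the function name `extractApolloState` and the regex-based parsing are illustrative choices, not the project's implementation.

```typescript
// Illustrative sketch: read the Apollo cache out of a Kijiji HTML page.
type ApolloState = { [cacheKey: string]: unknown };

function extractApolloState(html: string): ApolloState | null {
  // Next.js serialises page props into <script id="__NEXT_DATA__">.
  const re = /<script\b[^>]*\bid="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/i;
  const match = re.exec(html);
  if (!match) return null;
  let data: any;
  try {
    data = JSON.parse(match[1]);
  } catch {
    return null; // tag present but body is not valid JSON
  }
  // Path documented in these notes: props.pageProps.__APOLLO_STATE__
  const state = data?.props?.pageProps?.__APOLLO_STATE__;
  return state !== null && typeof state === "object" ? (state as ApolloState) : null;
}
```

Because every page carries its own pre-rendered state, the same function applies unchanged to paginated URLs.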
|
||||||
|
|
||||||
#### Search Results Container
|
#### Search Results Container
|
||||||
The search results are stored directly in the Apollo ROOT_QUERY with keys following the pattern `searchResultsPageByUrl:{url_path}` where `url_path` includes pagination parameters.
|
|
||||||
|
The search results are stored directly in the Apollo ROOT_QUERY with keys following the
|
||||||
|
pattern `searchResultsPageByUrl:{url_path}` where `url_path` includes pagination
|
||||||
|
parameters.
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
@@ -33,17 +46,20 @@ The search results are stored directly in the Apollo ROOT_QUERY with keys follow
|
|||||||
```
|
```
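
Given an Apollo state object shaped like the one above, the search-result entries can be picked out of `ROOT_QUERY` by key prefix. The helper name `searchResultEntries` is hypothetical, and matching on the prefix alone (instead of an exact key) is a deliberate assumption so that the `url_path` part can vary.

```typescript
// Hypothetical helper: collect search-result entries from the Apollo cache.
type ApolloState = { [cacheKey: string]: unknown };

function searchResultEntries(state: ApolloState): Array<[string, unknown]> {
  const root = state["ROOT_QUERY"];
  if (root === null || typeof root !== "object") return [];
  const rootObj = root as { [key: string]: unknown };
  return Object.keys(rootObj)
    // Keys follow the documented pattern searchResultsPageByUrl:{url_path}.
    .filter((key) => key.indexOf("searchResultsPageByUrl") === 0)
    .map((key): [string, unknown] => [key, rootObj[key]]);
}
```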
|
||||||
|
|
||||||
#### Pagination Handling
|
#### Pagination Handling
|
||||||
|
|
||||||
- Each page is server-side rendered with its own embedded data
|
- Each page is server-side rendered with its own embedded data
|
||||||
- No client-side GraphQL requests for pagination
|
- No client-side GraphQL requests for pagination
|
||||||
- URL parameter `?page=N` controls which page data is embedded
|
- URL parameter `?page=N` controls which page data is embedded
|
||||||
- Offset in searchString corresponds to `(page-1) * limit`
|
- Offset in searchString corresponds to `(page-1) * limit`
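
The offset rule above works out as a short calculation. The helper names `offsetForPage` and `pageUrl` are hypothetical; the default limit of 40 is the page size recorded elsewhere in these findings.

```typescript
const RESULTS_PER_PAGE = 40; // page size recorded in these findings

// offset = (page - 1) * limit, with 1-based page numbers.
function offsetForPage(page: number, limit: number = RESULTS_PER_PAGE): number {
  if (page < 1 || Math.floor(page) !== page) {
    throw new RangeError(`page must be a positive integer, got ${page}`);
  }
  return (page - 1) * limit;
}

// Appends ?page=N (or &page=N if the URL already has a query string).
function pageUrl(baseUrl: string, page: number): string {
  const separator = baseUrl.indexOf("?") === -1 ? "?" : "&";
  return `${baseUrl}${separator}page=${page}`;
}
```

For example, page 3 with the default limit starts at offset 80, so it covers results 81 through 120.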
|
||||||
|
|
||||||
#### Search Parameters in URL
|
#### Search Parameters in URL
|
||||||
|
|
||||||
- `k0c{CATEGORY}l{LOCATION}` - Category and location IDs
|
- `k0c{CATEGORY}l{LOCATION}` - Category and location IDs
|
||||||
- `?page=N` - Page number (1-based)
|
- `?page=N` - Page number (1-based)
|
||||||
- Data contains `offset` and `limit` for API-style pagination
|
- Data contains `offset` and `limit` for API-style pagination
|
||||||
|
|
||||||
#### Individual Listing Structure
|
#### Individual Listing Structure
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
"id": "1732061412",
|
"id": "1732061412",
|
||||||
@@ -90,6 +106,7 @@ The search results are stored directly in the Apollo ROOT_QUERY with keys follow
|
|||||||
```
|
```
|
||||||
|
|
||||||
### URL Parameters
|
### URL Parameters
|
||||||
|
|
||||||
- `sort=MATCH` - Sort by relevance
|
- `sort=MATCH` - Sort by relevance
|
||||||
- `order=DESC` - Descending order
|
- `order=DESC` - Descending order
|
||||||
- `type=OFFER` - Show offerings (not wanted ads)
|
- `type=OFFER` - Show offerings (not wanted ads)
|
||||||
@@ -102,6 +119,7 @@ The search results are stored directly in the Apollo ROOT_QUERY with keys follow
|
|||||||
- `eaTopAdPosition=1` - ?
|
- `eaTopAdPosition=1` - ?
|
||||||
|
|
||||||
### Image API
|
### Image API
|
||||||
|
|
||||||
- **Endpoint**: `https://media.kijiji.ca/api/v1/`
|
- **Endpoint**: `https://media.kijiji.ca/api/v1/`
|
||||||
- **Pattern**: `/ca-prod-fsbo-ads/images/{uuid}?rule=kijijica-{size}-jpg`
|
- **Pattern**: `/ca-prod-fsbo-ads/images/{uuid}?rule=kijijica-{size}-jpg`
|
||||||
- **Sizes**: 200, 300, 400, 500 pixels
|
- **Sizes**: 200, 300, 400, 500 pixels
|
||||||
@@ -109,10 +127,12 @@ The search results are stored directly in the Apollo ROOT_QUERY with keys follow
|
|||||||
### Categories and Locations
|
### Categories and Locations
|
||||||
|
|
||||||
#### Category Structure
|
#### Category Structure
|
||||||
Categories are hierarchical with parent-child relationships. The main categories under "Buy & Sell" include:
|
|
||||||
|
Categories are hierarchical with parent-child relationships.
|
||||||
|
The main categories under “Buy & Sell” include:
|
||||||
|
|
||||||
| ID | Name | Total Results (iPhone search) |
|
| ID | Name | Total Results (iPhone search) |
|
||||||
|----|------|------------------------------|
|
| --- | --- | --- |
|
||||||
| 10 | Buy & Sell | 19956 |
|
| 10 | Buy & Sell | 19956 |
|
||||||
| 12 | Arts & Collectibles | 149 |
|
| 12 | Arts & Collectibles | 149 |
|
||||||
| 767 | Audio | 481 |
|
| 767 | Audio | 481 |
|
||||||
@@ -145,10 +165,11 @@ Categories are hierarchical with parent-child relationships. The main categories
|
|||||||
| 26 | Other | 286 |
|
| 26 | Other | 286 |
|
||||||
|
|
||||||
#### Location Structure
|
#### Location Structure
|
||||||
Locations are also hierarchical, with provinces/states under the main "Canada" location:
|
|
||||||
|
Locations are also hierarchical, with provinces/states under the main “Canada” location:
|
||||||
|
|
||||||
| ID | Name | Total Results (iPhone search) |
|
| ID | Name | Total Results (iPhone search) |
|
||||||
|----|------|------------------------------|
|
| --- | --- | --- |
|
||||||
| 0 | Canada | - |
|
| 0 | Canada | - |
|
||||||
| 9001 | Québec | 2516 |
|
| 9001 | Québec | 2516 |
|
||||||
| 9002 | Nova Scotia | 875 |
|
| 9002 | Nova Scotia | 875 |
|
||||||
@@ -163,16 +184,20 @@ Locations are also hierarchical, with provinces/states under the main "Canada" l
|
|||||||
| 9011 | Prince Edward Island | 31 |
|
| 9011 | Prince Edward Island | 31 |
|
||||||
|
|
||||||
#### URL Patterns
|
#### URL Patterns
|
||||||
|
|
||||||
- Categories: `/b-{category-slug}/canada/{keywords}/k0c{CATEGORY_ID}l0`
|
- Categories: `/b-{category-slug}/canada/{keywords}/k0c{CATEGORY_ID}l0`
|
||||||
- Locations: `/b-buy-sell/{location-slug}/iphone/k0c10l{LOCATION_ID}`
|
- Locations: `/b-buy-sell/{location-slug}/iphone/k0c10l{LOCATION_ID}`
|
||||||
- Combined: `/b-{category-slug}/{location-slug}/{keywords}/k0c{CATEGORY_ID}l{LOCATION_ID}`
|
- Combined:
|
||||||
|
`/b-{category-slug}/{location-slug}/{keywords}/k0c{CATEGORY_ID}l{LOCATION_ID}`
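
The combined pattern can be sketched as a path builder. The name `buildSearchPath` is hypothetical, and percent-encoding the keywords is an assumption: how Kijiji itself slugs multi-word keywords is not covered in these notes.

```typescript
interface SearchPathParts {
  categorySlug: string; // e.g. "buy-sell"
  locationSlug: string; // e.g. "canada"
  keywords: string;
  categoryId: number;   // e.g. 10 for Buy & Sell
  locationId: number;   // e.g. 0 for Canada
}

// Builds /b-{category-slug}/{location-slug}/{keywords}/k0c{CATEGORY_ID}l{LOCATION_ID}
function buildSearchPath(parts: SearchPathParts): string {
  // Assumption: keywords are percent-encoded as a single path segment.
  const keywords = encodeURIComponent(parts.keywords.trim());
  return `/b-${parts.categorySlug}/${parts.locationSlug}/${keywords}` +
    `/k0c${parts.categoryId}l${parts.locationId}`;
}
```

With the IDs from the tables above, Buy & Sell (10) across Canada (0) for `iphone` yields `/b-buy-sell/canada/iphone/k0c10l0`.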
|
||||||
|
|
||||||
### Pagination
|
### Pagination
|
||||||
|
|
||||||
- Uses offset-based pagination
|
- Uses offset-based pagination
|
||||||
- 40 results per page
|
- 40 results per page
|
||||||
- Total count provided in pagination metadata
|
- Total count provided in pagination metadata
|
||||||
|
|
||||||
## Authentication & User Management
|
## Authentication & User Management
|
||||||
|
|
||||||
- **Authentication System**: OAuth2-based using CIS (Customer Identity Service)
|
- **Authentication System**: OAuth2-based using CIS (Customer Identity Service)
|
||||||
- **Identity Provider**: `id.kijiji.ca`
|
- **Identity Provider**: `id.kijiji.ca`
|
||||||
- **OAuth2 Flow**:
|
- **OAuth2 Flow**:
|
||||||
@@ -184,24 +209,30 @@ Locations are also hierarchical, with provinces/states under the main "Canada" l
|
|||||||
- **User Features**: Saved searches, messaging, flagging require authentication
|
- **User Features**: Saved searches, messaging, flagging require authentication
|
||||||
|
|
||||||
## Posting API
|
## Posting API
|
||||||
|
|
||||||
- **Posting Flow**: Requires authentication, redirects to login if not authenticated
|
- **Posting Flow**: Requires authentication, redirects to login if not authenticated
|
||||||
- **Posting URL**: `https://www.kijiji.ca/p-post-ad.html`
|
- **Posting URL**: `https://www.kijiji.ca/p-post-ad.html`
|
||||||
- **Authentication Required**: Yes, redirects to `/consumer/login` for unauthenticated users
|
- **Authentication Required**: Yes, redirects to `/consumer/login` for unauthenticated
|
||||||
- **Post-Creation**: Likely uses authenticated GraphQL mutations (not observed in anonymous browsing)
|
users
|
||||||
|
- **Post-Creation**: Likely uses authenticated GraphQL mutations (not observed in
|
||||||
|
anonymous browsing)
|
||||||
|
|
||||||
## GraphQL API Endpoint
|
## GraphQL API Endpoint
|
||||||
|
|
||||||
- **URL**: `https://www.kijiji.ca/anvil/api`
|
- **URL**: `https://www.kijiji.ca/anvil/api`
|
||||||
- **Method**: POST
|
- **Method**: POST
|
||||||
- **Content-Type**: application/json
|
- **Content-Type**: application/json
|
||||||
- **Headers**:
|
- **Headers**:
|
||||||
- `apollo-require-preflight: true`
|
- `apollo-require-preflight: true`
|
||||||
- Standard CORS headers
|
- Standard CORS headers
|
||||||
- **Authentication**: No authentication required for basic queries (uses cookies for session tracking)
|
- **Authentication**: No authentication required for basic queries (uses cookies for
|
||||||
|
session tracking)
|
||||||
- **Technology**: Apollo GraphQL server
|
- **Technology**: Apollo GraphQL server
|
||||||
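
A minimal sketch of how the endpoint details above could be packaged into a request, assuming plain JSON-over-POST GraphQL. `buildGraphQLRequest` and `GraphQLRequest` are hypothetical names, not part of `src/kijiji.ts`; only the URL, method, and headers come from the findings in this document.

```typescript
// URL, method, and headers are the values recorded in the findings above.
const KIJIJI_GRAPHQL_URL = "https://www.kijiji.ca/anvil/api";

// Hypothetical shape, used only by this sketch.
interface GraphQLRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: wraps an operation name, query text, and variables
// in the JSON body that a GraphQL-over-HTTP server expects.
function buildGraphQLRequest(
  operationName: string,
  query: string,
  variables: Record<string, unknown> = {},
): GraphQLRequest {
  return {
    url: KIJIJI_GRAPHQL_URL,
    method: "POST",
    headers: {
      "content-type": "application/json",
      "apollo-require-preflight": "true",
    },
    body: JSON.stringify({ operationName, query, variables }),
  };
}
```

The result can be handed to `fetch(request.url, { method: request.method, headers: request.headers, body: request.body })`. Whether the server accepts a given query still depends on the selection set, which this sketch deliberately leaves to the caller.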
|
|
||||||
### Sample GraphQL Queries Discovered
|
### Sample GraphQL Queries Discovered
|
||||||
|
|
||||||
#### Get Search Categories
|
#### Get Search Categories
|
||||||
|
|
||||||
```graphql
|
```graphql
|
||||||
query getSearchCategories($locale: String!) {
|
query getSearchCategories($locale: String!) {
|
||||||
searchCategories {
|
searchCategories {
|
||||||
@@ -218,6 +249,7 @@ Variables: `{"locale": "en-CA"}`
|
|||||||
Response includes hierarchical category structure with IDs and localized names.
|
Response includes hierarchical category structure with IDs and localized names.
|
||||||
|
|
||||||
#### Get Geocode from IP (fails for current IP)
|
#### Get Geocode from IP (fails for current IP)
|
||||||
|
|
||||||
```graphql
|
```graphql
|
||||||
query GetGeocodeReverseFromIp {
|
query GetGeocodeReverseFromIp {
|
||||||
geocodeReverseFromIp {
|
geocodeReverseFromIp {
|
||||||
@@ -229,9 +261,11 @@ query GetGeocodeReverseFromIp {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
This query fails for the current IP address, suggesting geolocation-based features may not work or require different IP ranges.
|
This query fails for the current IP address, suggesting geolocation-based features may
|
||||||
|
not work or require different IP ranges.
|
||||||
|
|
||||||
#### Get Category Path
|
#### Get Category Path
|
||||||
|
|
||||||
```graphql
|
```graphql
|
||||||
query GetCategoryPath($categoryId: Int!, $locale: String, $locationId: Int) {
|
query GetCategoryPath($categoryId: Int!, $locale: String, $locationId: Int) {
|
||||||
category(id: $categoryId) {
|
category(id: $categoryId) {
|
||||||
@@ -256,25 +290,33 @@ Variables: `{"categoryId": 10, "locationId": 0, "locale": "en-CA"}`
|
|||||||
## Latest Findings (2026-01-21)
|
## Latest Findings (2026-01-21)
|
||||||
|
|
||||||
### Client-Side GraphQL Queries Observed
|
### Client-Side GraphQL Queries Observed
|
||||||
|
|
||||||
- **getSearchCategories**: Retrieves category hierarchy for search filters
|
- **getSearchCategories**: Retrieves category hierarchy for search filters
|
||||||
- **GetGeocodeReverseFromIp**: Attempts to geolocate user (fails for current IP)
|
- **GetGeocodeReverseFromIp**: Attempts to geolocate user (fails for current IP)
|
||||||
|
|
||||||
### GraphQL Schema Insights
|
### GraphQL Schema Insights
|
||||||
Testing direct GraphQL queries revealed:
|
|
||||||
- Field "searchResults" does not exist on Query type
|
|
||||||
- Suggested alternatives: "searchResultsPage" or "searchUrl"
|
|
||||||
- This suggests the search functionality may use different GraphQL operations than direct queries
|
|
||||||
|
|
||||||
The embedded Apollo state approach appears to be the primary method for accessing search data, with GraphQL used for auxiliary operations like categories and geolocation.
|
Testing direct GraphQL queries revealed:
|
||||||
|
- Field “searchResults” does not exist on Query type
|
||||||
|
- Suggested alternatives: “searchResultsPage” or “searchUrl”
|
||||||
|
- This suggests the search functionality may use different GraphQL operations than
|
||||||
|
direct queries
|
||||||
|
|
||||||
|
The embedded Apollo state approach appears to be the primary method for accessing search
|
||||||
|
data, with GraphQL used for auxiliary operations like categories and geolocation.
|
||||||
|
|
||||||
### Server-Side Rendering Architecture
|
### Server-Side Rendering Architecture
|
||||||
Search results are fully server-side rendered with data embedded in HTML. Each page (including pagination) contains its own pre-rendered data. No client-side GraphQL requests are made for:
|
|
||||||
|
Search results are fully server-side rendered with data embedded in HTML. Each page
|
||||||
|
(including pagination) contains its own pre-rendered data.
|
||||||
|
No client-side GraphQL requests are made for:
|
||||||
|
|
||||||
- Initial search results
|
- Initial search results
|
||||||
- Pagination navigation
|
- Pagination navigation
|
||||||
- Search result data
|
- Search result data
|
||||||
|
|
||||||
### Network Analysis Findings
|
### Network Analysis Findings
|
||||||
|
|
||||||
- GraphQL endpoint: `https://www.kijiji.ca/anvil/api`
|
- GraphQL endpoint: `https://www.kijiji.ca/anvil/api`
|
||||||
- Method: POST
|
- Method: POST
|
||||||
- Content-Type: application/json
|
- Content-Type: application/json
|
||||||
@@ -282,7 +324,10 @@ Search results are fully server-side rendered with data embedded in HTML. Each p
|
|||||||
- Cookies required for session tracking
|
- Cookies required for session tracking
|
||||||
|
|
||||||
### Embedded Data Structure
|
### Embedded Data Structure
|
||||||
Search results data is embedded in the HTML within Next.js `__NEXT_DATA__.props.pageProps.__APOLLO_STATE__` object. The data includes:
|
|
||||||
|
Search results data is embedded in the HTML within Next.js
|
||||||
|
`__NEXT_DATA__.props.pageProps.__APOLLO_STATE__` object.
|
||||||
|
The data includes:
|
||||||
|
|
||||||
- Individual ad listings with complete metadata
|
- Individual ad listings with complete metadata
|
||||||
- Pagination information
|
- Pagination information
|
||||||
@@ -290,20 +335,24 @@ Search results data is embedded in the HTML within Next.js `__NEXT_DATA__.props.
|
|||||||
- Category/location hierarchies
|
- Category/location hierarchies
|
||||||
|
|
||||||
### Current Scraper Implementation
|
### Current Scraper Implementation
|
||||||
|
|
||||||
The existing `src/kijiji.ts` implementation correctly parses the embedded Apollo state:
|
The existing `src/kijiji.ts` implementation correctly parses the embedded Apollo state:
|
||||||
|
|
||||||
- Uses `extractApolloState()` to parse `__NEXT_DATA__` from HTML
|
- Uses `extractApolloState()` to parse `__NEXT_DATA__` from HTML
|
||||||
- Filters Apollo keys containing "Listing" to find ad data
|
- Filters Apollo keys containing “Listing” to find ad data
|
||||||
- Extracts `url`, `title`, and other metadata from each listing
|
- Extracts `url`, `title`, and other metadata from each listing
|
||||||
- Successfully scrapes listings without needing API authentication
|
- Successfully scrapes listings without needing API authentication
|
||||||
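
The parsing steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code in `src/kijiji.ts`: `extractApolloStateSketch` and `listingEntries` are hypothetical names, and the regex assumes the usual Next.js `<script id="__NEXT_DATA__">` tag.

```typescript
type ApolloState = Record<string, unknown>;

// Hypothetical sketch; the real extractApolloState() may differ.
function extractApolloStateSketch(html: string): ApolloState | null {
  // Next.js serializes page data as JSON inside <script id="__NEXT_DATA__">.
  const match = html.match(
    /<script[^>]*id="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/,
  );
  const json = match?.[1];
  if (!json) return null;
  try {
    const nextData = JSON.parse(json);
    return nextData?.props?.pageProps?.__APOLLO_STATE__ ?? null;
  } catch {
    // Malformed JSON is treated the same as a missing script tag.
    return null;
  }
}

// Keeps only cache entries whose key contains "Listing", as described above.
function listingEntries(state: ApolloState): unknown[] {
  return Object.entries(state)
    .filter(([key]) => key.includes("Listing"))
    .map(([, value]) => value);
}
```

Returning `null` instead of throwing lets a caller treat a layout change or a blocked response as "no data" and decide separately whether to retry.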
|
|
||||||
### Authentication Status
|
### Authentication Status
|
||||||
- **Search functionality**: No authentication required - all search and listing data accessible anonymously
|
|
||||||
|
- **Search functionality**: No authentication required - all search and listing data
|
||||||
|
accessible anonymously
|
||||||
- **Posting functionality**: Requires authentication (redirects to login)
|
- **Posting functionality**: Requires authentication (redirects to login)
|
||||||
- **User features**: Saved searches, messaging require authentication
|
- **User features**: Saved searches, messaging require authentication
|
||||||
- **Rate limiting**: May apply but not observed in anonymous browsing
|
- **Rate limiting**: May apply but not observed in anonymous browsing
|
||||||
|
|
||||||
### Pagination Implementation
|
### Pagination Implementation
|
||||||
|
|
||||||
- Each page is a separate server-rendered route
|
- Each page is a separate server-rendered route
|
||||||
- URL pattern: `/b-{category_slug}/{location_slug}/{keywords}/page-{number}/k0c{category_id}l{location_id}`
|
- URL pattern: `/b-{category_slug}/{location_slug}/{keywords}/page-{number}/k0c{category_id}l{location_id}`
|
||||||
- No client-side pagination API calls
|
- No client-side pagination API calls
|
||||||
@@ -313,20 +362,24 @@ The existing `src/kijiji.ts` implementation correctly parses the embedded Apollo
|
|||||||
## URL Pattern Analysis
|
## URL Pattern Analysis
|
||||||
|
|
||||||
### Search URL Structure
|
### Search URL Structure
|
||||||
|
|
||||||
`https://www.kijiji.ca/b-{category_slug}/{location_slug}/{keywords}/k0c{category_id}l{location_id}`
|
`https://www.kijiji.ca/b-{category_slug}/{location_slug}/{keywords}/k0c{category_id}l{location_id}`
|
||||||
|
|
||||||
#### Examples Observed:
|
#### Examples Observed:
|
||||||
|
|
||||||
- All categories, Canada: `/b-canada/iphone/k0l0` (no `c` segment = All Categories, l0 = Canada)
|
- All categories, Canada: `/b-canada/iphone/k0l0` (no `c` segment = All Categories, l0 = Canada)
|
||||||
- Cell phones category: `/b-cell-phones/canada/iphone/k0c132l0` (c132 = Cell Phones)
|
- Cell phones category: `/b-cell-phones/canada/iphone/k0c132l0` (c132 = Cell Phones)
|
||||||
- With pagination: `/b-canada/iphone/page-2/k0l0`
|
- With pagination: `/b-canada/iphone/page-2/k0l0`
|
||||||
|
|
||||||
#### URL Components:
|
#### URL Components:
|
||||||
|
|
||||||
- `c{CATEGORY_ID}`: Category ID (0 = All Categories, 132 = Cell Phones, etc.)
|
- `c{CATEGORY_ID}`: Category ID (0 = All Categories, 132 = Cell Phones, etc.)
|
||||||
- `l{LOCATION_ID}`: Location ID (0 = Canada, 1700272 = GTA, etc.)
|
- `l{LOCATION_ID}`: Location ID (0 = Canada, 1700272 = GTA, etc.)
|
||||||
- `page-{N}`: Pagination (1-based, optional)
|
- `page-{N}`: Pagination (1-based, optional)
|
||||||
- Keywords are slugified in URL path
|
- Keywords are slugified in URL path
|
||||||
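
The URL components above can be combined as in the following sketch. `buildSearchPath`, `SearchUrlParts`, and `slugify` are hypothetical names. The slug rule (lowercase, non-alphanumerics collapsed to hyphens) and the omission of `page-1` are assumptions; the segment layout follows the observed examples.

```typescript
// Hypothetical input shape for this sketch.
interface SearchUrlParts {
  keywords: string;
  locationSlug: string; // e.g. "canada"
  locationId: number; // e.g. 0 = Canada
  categorySlug?: string; // e.g. "cell-phones"; omit for All Categories
  categoryId?: number; // e.g. 132 = Cell Phones
  page?: number; // 1-based
}

// Assumed slug rule: lowercase, runs of non-alphanumerics become one hyphen.
function slugify(text: string): string {
  return text
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

function buildSearchPath(parts: SearchUrlParts): string {
  const hasCategory =
    parts.categorySlug !== undefined &&
    parts.categoryId !== undefined &&
    parts.categoryId !== 0;

  const segments: string[] = [];
  if (hasCategory) {
    segments.push(`b-${parts.categorySlug}`, parts.locationSlug);
  } else {
    // Observed all-categories URLs start directly with the location slug.
    segments.push(`b-${parts.locationSlug}`);
  }
  segments.push(slugify(parts.keywords));
  if (parts.page !== undefined && parts.page > 1) {
    segments.push(`page-${parts.page}`);
  }
  const categoryPart = hasCategory ? `c${parts.categoryId}` : "";
  segments.push(`k0${categoryPart}l${parts.locationId}`);
  return "/" + segments.join("/");
}
```

With `{ keywords: "iphone", locationSlug: "canada", locationId: 0 }` this yields `/b-canada/iphone/k0l0`, and adding `categorySlug: "cell-phones", categoryId: 132` yields `/b-cell-phones/canada/iphone/k0c132l0`, matching the observed examples.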
|
|
||||||
### Current Implementation Status
|
### Current Implementation Status
|
||||||
|
|
||||||
The existing scraper in `src/kijiji.ts` successfully implements the approach:
|
The existing scraper in `src/kijiji.ts` successfully implements the approach:
|
||||||
- Parses embedded Apollo state from HTML responses
|
- Parses embedded Apollo state from HTML responses
|
||||||
- Handles rate limiting and retries
|
- Handles rate limiting and retries
|
||||||
@@ -336,14 +389,22 @@ The existing scraper in `src/kijiji.ts` successfully implements the approach:
|
|||||||
## Listing Details Page
|
## Listing Details Page
|
||||||
|
|
||||||
### Overview
|
### Overview
|
||||||
Similar to search results, listing details pages use server-side rendering with embedded Apollo GraphQL state in the HTML. No dedicated API endpoint serves individual listing data - all information is pre-rendered on the server.
|
|
||||||
|
Similar to search results, listing details pages use server-side rendering with embedded
|
||||||
|
Apollo GraphQL state in the HTML. No dedicated API endpoint serves individual listing
|
||||||
|
data - all information is pre-rendered on the server.
|
||||||
|
|
||||||
### Data Architecture
|
### Data Architecture
|
||||||
- **Server-Side Rendering**: Each listing page is fully server-rendered with data embedded in HTML
|
|
||||||
- **Embedded Apollo State**: Listing data is stored in `__NEXT_DATA__.props.pageProps.__APOLLO_STATE__`
|
- **Server-Side Rendering**: Each listing page is fully server-rendered with data
|
||||||
- **Client-Side GraphQL**: Additional data (categories, campaigns, similar listings, user profiles) fetched via GraphQL API
|
embedded in HTML
|
||||||
|
- **Embedded Apollo State**: Listing data is stored in
|
||||||
|
`__NEXT_DATA__.props.pageProps.__APOLLO_STATE__`
|
||||||
|
- **Client-Side GraphQL**: Additional data (categories, campaigns, similar listings,
|
||||||
|
user profiles) fetched via GraphQL API
|
||||||
|
|
||||||
### Listing Data Structure
|
### Listing Data Structure
|
||||||
|
|
||||||
The main listing data follows the same pattern as search results:
|
The main listing data follows the same pattern as search results:
|
||||||
|
|
||||||
```json
|
```json
|
||||||
@@ -385,40 +446,50 @@ The main listing data follows the same pattern as search results:
|
|||||||
```
|
```
|
||||||
|
|
||||||
### Client-Side GraphQL Queries
|
### Client-Side GraphQL Queries
|
||||||
|
|
||||||
When loading a listing details page, the following GraphQL queries are executed:
|
When loading a listing details page, the following GraphQL queries are executed:
|
||||||
|
|
||||||
#### 1. getSearchCategories
|
#### 1. getSearchCategories
|
||||||
|
|
||||||
- **Purpose**: Category hierarchy for navigation
|
- **Purpose**: Category hierarchy for navigation
|
||||||
- **Variables**: `{"locale": "en-CA"}`
|
- **Variables**: `{"locale": "en-CA"}`
|
||||||
- **Response**: Hierarchical category structure
|
- **Response**: Hierarchical category structure
|
||||||
|
|
||||||
#### 2. getCampaignsForVip
|
#### 2. getCampaignsForVip
|
||||||
|
|
||||||
- **Purpose**: Advertisement targeting data
|
- **Purpose**: Advertisement targeting data
|
||||||
- **Variables**: `{"placement": "vip", "locationId": 1700275, "categoryId": 760, "platform": "desktop"}`
|
- **Variables**:
|
||||||
|
`{"placement": "vip", "locationId": 1700275, "categoryId": 760, "platform": "desktop"}`
|
||||||
- **Response**: Campaign/ads data (usually null)
|
- **Response**: Campaign/ads data (usually null)
|
||||||
|
|
||||||
#### 3. GetReviewSummary
|
#### 3. GetReviewSummary
|
||||||
|
|
||||||
- **Purpose**: Seller review statistics
|
- **Purpose**: Seller review statistics
|
||||||
- **Variables**: `{"userId": "1044934581"}`
|
- **Variables**: `{"userId": "1044934581"}`
|
||||||
- **Response**: Review count and score (usually 0 for new sellers)
|
- **Response**: Review count and score (usually 0 for new sellers)
|
||||||
|
|
||||||
#### 4. GetProfileMetrics
|
#### 4. GetProfileMetrics
|
||||||
|
|
||||||
- **Purpose**: Seller profile information
|
- **Purpose**: Seller profile information
|
||||||
- **Variables**: `{"profileId": "1044934581"}`
|
- **Variables**: `{"profileId": "1044934581"}`
|
||||||
- **Response**: Member since date, account type
|
- **Response**: Member since date, account type
|
||||||
|
|
||||||
#### 5. GetListingsSimilar
|
#### 5. GetListingsSimilar
|
||||||
|
|
||||||
- **Purpose**: Similar listings for cross-selling
|
- **Purpose**: Similar listings for cross-selling
|
||||||
- **Variables**: `{"listingId": "1705585530", "limit": 10, "isExternalId": false}`
|
- **Variables**: `{"listingId": "1705585530", "limit": 10, "isExternalId": false}`
|
||||||
- **Response**: Array of similar listings with basic metadata
|
- **Response**: Array of similar listings with basic metadata
|
||||||
|
|
||||||
#### 6. GetGeocodeReverseFromIp
|
#### 6. GetGeocodeReverseFromIp
|
||||||
|
|
||||||
- **Purpose**: Geolocation-based features
|
- **Purpose**: Geolocation-based features
|
||||||
- **Variables**: `{}`
|
- **Variables**: `{}`
|
||||||
- **Response**: Fails with 404 for most IPs
|
- **Response**: Fails with 404 for most IPs
|
||||||
|
|
||||||
### Implementation Status
|
### Implementation Status
|
||||||
The existing `parseListing()` function in `src/kijiji.ts` successfully extracts listing details from embedded Apollo state:
|
|
||||||
|
The existing `parseListing()` function in `src/kijiji.ts` successfully extracts listing
|
||||||
|
details from embedded Apollo state:
|
||||||
|
|
||||||
- ✅ Extracts title, description, price, location
|
- ✅ Extracts title, description, price, location
|
||||||
- ✅ Handles contact-based pricing ("Please Contact")
|
- ✅ Handles contact-based pricing ("Please Contact")
|
||||||
@@ -427,22 +498,30 @@ The existing `parseListing()` function in `src/kijiji.ts` successfully extracts
|
|||||||
- ✅ Works without authentication or API keys
|
- ✅ Works without authentication or API keys
|
||||||
|
|
||||||
### Key Findings
|
### Key Findings
|
||||||
1. **No Dedicated Listing API**: Unlike search results, there's no separate GraphQL query for individual listing data
|
|
||||||
2. **Complete Data Available**: All listing information is embedded in the initial HTML response
|
1. **No Dedicated Listing API**: Unlike search results, there’s no separate GraphQL
|
||||||
3. **Additional Context Fetched**: Secondary GraphQL queries provide complementary data (reviews, similar listings)
|
query for individual listing data
|
||||||
|
2. **Complete Data Available**: All listing information is embedded in the initial HTML
|
||||||
|
response
|
||||||
|
3. **Additional Context Fetched**: Secondary GraphQL queries provide complementary data
|
||||||
|
(reviews, similar listings)
|
||||||
4. **Consistent Architecture**: Same Apollo state embedding pattern as search pages
|
4. **Consistent Architecture**: Same Apollo state embedding pattern as search pages
|
||||||
|
|
||||||
### Current Scraper Implementation
|
### Current Scraper Implementation
|
||||||
|
|
||||||
The scraper successfully extracts listing details by:
|
The scraper successfully extracts listing details by:
|
||||||
1. Fetching the listing URL HTML
|
1. Fetching the listing URL HTML
|
||||||
2. Parsing embedded `__NEXT_DATA__` Apollo state
|
2. Parsing embedded `__NEXT_DATA__` Apollo state
|
||||||
3. Extracting the `Listing:{id}` object from Apollo cache
|
3. Extracting the `Listing:{id}` object from Apollo cache
|
||||||
4. Mapping fields to typed `ListingDetails` interface
|
4. Mapping fields to typed `ListingDetails` interface
|
||||||
|
|
||||||
This approach works reliably without requiring authentication or dealing with rate limiting on individual listing fetches.
|
This approach works reliably without requiring authentication or dealing with rate
|
||||||
|
limiting on individual listing fetches.
|
||||||
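
Steps 3 and 4 above might look like the sketch below. `pickListing` and `ListingDetailsSketch` are hypothetical, and the raw field names (`url`, `title`, `description`) are assumed from the fields this document says are extracted; the real `ListingDetails` interface in `src/kijiji.ts` has more fields and may name them differently.

```typescript
// Hypothetical, reduced stand-in for the real ListingDetails interface.
interface ListingDetailsSketch {
  id: string;
  url: string;
  title: string;
  description?: string;
}

// Looks up the `Listing:{id}` cache entry and maps it to a typed object.
// Returns null when the entry is missing or lacks the required fields.
function pickListing(
  state: Record<string, unknown>,
  listingId: string,
): ListingDetailsSketch | null {
  const raw = state[`Listing:${listingId}`];
  if (typeof raw !== "object" || raw === null) return null;

  const record = raw as Record<string, unknown>;
  const url = record["url"];
  const title = record["title"];
  const description = record["description"];
  if (typeof url !== "string" || typeof title !== "string") return null;

  const details: ListingDetailsSketch = { id: listingId, url, title };
  if (typeof description === "string") details.description = description;
  return details;
}
```

Validating each field before copying it keeps a silent schema change on the site from producing half-filled objects downstream.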
|
|
||||||
## Next Steps
|
## Next Steps
|
||||||
|
|
||||||
- Explore posting/authentication APIs (requires user login)
|
- Explore posting/authentication APIs (requires user login)
|
||||||
- Investigate if GraphQL API can be used for programmatic access with proper authentication
|
- Investigate if GraphQL API can be used for programmatic access with proper
|
||||||
|
authentication
|
||||||
- Test rate limiting patterns and optimal scraping strategies
|
- Test rate limiting patterns and optimal scraping strategies
|
||||||
- Document additional category and location ID mappings
|
- Document additional category and location ID mappings
|
||||||
|
|||||||
@@ -1,19 +1,26 @@
|
|||||||
# opencode Monorepo Config Adoption Implementation Plan
|
# opencode Monorepo Config Adoption Implementation Plan
|
||||||
|
|
||||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
> **For agentic workers:** REQUIRED SUB-SKILL: Use
|
||||||
|
> superpowers:subagent-driven-development (recommended) or superpowers:executing-plans
|
||||||
|
> to implement this plan task-by-task.
|
||||||
|
> Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||||
|
|
||||||
**Goal:** Adopt opencode-style monorepo config: Turbo task orchestration, workspace dep catalog, shared root tsconfig, bunfig.toml, and `exports` field in all packages.
|
**Goal:** Adopt opencode-style monorepo config: Turbo task orchestration, workspace dep
|
||||||
|
catalog, shared root tsconfig, bunfig.toml, and `exports` field in all packages.
|
||||||
|
|
||||||
**Architecture:** Pure config changes across 10 files — no source code touched. Root config files are added/updated first, then per-package files updated to reference them. Changes are independent within each task and safe to commit atomically.
|
**Architecture:** Pure config changes across 10 files — no source code touched.
|
||||||
|
Root config files are added/updated first, then per-package files updated to reference
|
||||||
|
them. Changes are independent within each task and safe to commit atomically.
|
||||||
|
|
||||||
**Tech Stack:** Bun workspaces, Turbo 2.x, @tsconfig/bun, TypeScript (tsgo / @typescript/native-preview)
|
**Tech Stack:** Bun workspaces, Turbo 2.x, @tsconfig/bun, TypeScript (tsgo /
|
||||||
|
@typescript/native-preview)
|
||||||
|
|
||||||
---
|
* * *
|
||||||
|
|
||||||
## File Map
|
## File Map
|
||||||
|
|
||||||
| File | Action | Responsible for |
|
| File | Action | Responsible for |
|
||||||
|---|---|---|
|
| --- | --- | --- |
|
||||||
| `package.json` | Modify | Workspace catalog, turbo devDep, @tsconfig/bun devDep, updated scripts |
|
| `package.json` | Modify | Workspace catalog, turbo devDep, @tsconfig/bun devDep, updated scripts |
|
||||||
| `turbo.json` | Create | Task graph: typecheck, build, test |
|
| `turbo.json` | Create | Task graph: typecheck, build, test |
|
||||||
| `tsconfig.json` | Create | Shared TS compiler options for all packages |
|
| `tsconfig.json` | Create | Shared TS compiler options for all packages |
|
||||||
@@ -25,14 +32,16 @@
|
|||||||
| `packages/api-server/tsconfig.json` | Modify | Slim — extends root, paths only |
|
| `packages/api-server/tsconfig.json` | Modify | Slim — extends root, paths only |
|
||||||
| `packages/mcp-server/tsconfig.json` | Modify | Slim — extends root, paths only |
|
| `packages/mcp-server/tsconfig.json` | Modify | Slim — extends root, paths only |
|
||||||
|
|
||||||
---
|
* * *
|
||||||
|
|
||||||
### Task 1: Add `bunfig.toml` and `turbo.json`
|
### Task 1: Add `bunfig.toml` and `turbo.json`
|
||||||
|
|
||||||
Two new root config files with no dependencies on other tasks.
|
Two new root config files with no dependencies on other tasks.
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Create: `bunfig.toml`
|
- Create: `bunfig.toml`
|
||||||
|
|
||||||
- Create: `turbo.json`
|
- Create: `turbo.json`
|
||||||
|
|
||||||
- [ ] **Step 1: Create `bunfig.toml`**
|
- [ ] **Step 1: Create `bunfig.toml`**
|
||||||
@@ -83,13 +92,15 @@ git add bunfig.toml turbo.json
|
|||||||
git commit -m "chore: add bunfig.toml and turbo.json"
|
git commit -m "chore: add bunfig.toml and turbo.json"
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
* * *
|
||||||
|
|
||||||
### Task 2: Create root `tsconfig.json`
|
### Task 2: Create root `tsconfig.json`
|
||||||
|
|
||||||
Shared base tsconfig all packages will extend. Extracts the common options currently duplicated in all 3 per-package tsconfigs.
|
Shared base tsconfig all packages will extend.
|
||||||
|
Extracts the common options currently duplicated in all 3 per-package tsconfigs.
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Create: `tsconfig.json`
|
- Create: `tsconfig.json`
|
||||||
|
|
||||||
- [ ] **Step 1: Create root `tsconfig.json`**
|
- [ ] **Step 1: Create root `tsconfig.json`**
|
||||||
@@ -130,13 +141,15 @@ git add tsconfig.json
|
|||||||
git commit -m "chore: add shared root tsconfig.json"
|
git commit -m "chore: add shared root tsconfig.json"
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
* * *
|
||||||
|
|
||||||
### Task 3: Update root `package.json`
|
### Task 3: Update root `package.json`
|
||||||
|
|
||||||
Add workspace catalog, `turbo` + `@tsconfig/bun` devDependencies, and update scripts to use `turbo run`.
|
Add workspace catalog, `turbo` + `@tsconfig/bun` devDependencies, and update scripts to
|
||||||
|
use `turbo run`.
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `package.json`
|
- Modify: `package.json`
|
||||||
|
|
||||||
- [ ] **Step 1: Replace root `package.json`**
|
- [ ] **Step 1: Replace root `package.json`**
|
||||||
@@ -180,7 +193,11 @@ Write this complete file:
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
> **Note on catalog versions:** The catalog pins exact versions. The values above are taken from the current package installs. If `@types/bun` was `latest`, check `node_modules/@types/bun/package.json` for the actual installed version and use that. Same for `@typescript/native-preview`.
|
> **Note on catalog versions:** The catalog pins exact versions.
|
||||||
|
> The values above are taken from the current package installs.
|
||||||
|
> If `@types/bun` was `latest`, check `node_modules/@types/bun/package.json` for the
|
||||||
|
> actual installed version and use that.
|
||||||
|
> Same for `@typescript/native-preview`.
|
||||||
|
|
||||||
- [ ] **Step 2: Check actual installed versions**
|
- [ ] **Step 2: Check actual installed versions**
|
||||||
|
|
||||||
@@ -208,7 +225,8 @@ Expected: lock file updated, `turbo` and `@tsconfig/bun` appear in `node_modules
|
|||||||
bunx turbo run typecheck --dry
|
bunx turbo run typecheck --dry
|
||||||
```
|
```
|
||||||
|
|
||||||
Expected: output lists the `typecheck` task for each package (even if no `typecheck` script exists yet — turbo will note them as skipped/missing).
|
Expected: output lists the `typecheck` task for each package (even if no `typecheck`
|
||||||
|
script exists yet — turbo will note them as skipped/missing).
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
@@ -217,15 +235,19 @@ git add package.json bun.lock
|
|||||||
git commit -m "chore: add workspace catalog and turbo to root package.json"
|
git commit -m "chore: add workspace catalog and turbo to root package.json"
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
* * *
|
||||||
|
|
||||||
### Task 4: Update per-package `package.json` files
|
### Task 4: Update per-package `package.json` files
|
||||||
|
|
||||||
Rename `type:check` → `typecheck`, replace `main`/`module` with `exports`, swap pinned dep versions for `catalog:` references.
|
Rename `type:check` → `typecheck`, replace `main`/`module` with `exports`, swap pinned
|
||||||
|
dep versions for `catalog:` references.
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/package.json`
|
- Modify: `packages/core/package.json`
|
||||||
|
|
||||||
- Modify: `packages/api-server/package.json`
|
- Modify: `packages/api-server/package.json`
|
||||||
|
|
||||||
- Modify: `packages/mcp-server/package.json`
|
- Modify: `packages/mcp-server/package.json`
|
||||||
|
|
||||||
- [ ] **Step 1: Replace `packages/core/package.json`**
|
- [ ] **Step 1: Replace `packages/core/package.json`**
|
||||||
@@ -325,7 +347,9 @@ Rename `type:check` → `typecheck`, replace `main`/`module` with `exports`, swa
|
|||||||
bun install
|
bun install
|
||||||
```
|
```
|
||||||
|
|
||||||
Expected: no errors. Catalog refs resolved. `bun.lock` updated.
|
Expected: no errors.
|
||||||
|
Catalog refs resolved.
|
||||||
|
`bun.lock` updated.
|
||||||
|
|
||||||
- [ ] **Step 5: Verify typecheck still works per-package**
|
- [ ] **Step 5: Verify typecheck still works per-package**
|
||||||
|
|
||||||
@@ -345,15 +369,19 @@ git add packages/core/package.json packages/api-server/package.json packages/mcp
|
|||||||
git commit -m "chore: use exports field and catalog refs in all packages"
|
git commit -m "chore: use exports field and catalog refs in all packages"
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
* * *
|
||||||
|
|
||||||
### Task 5: Slim per-package `tsconfig.json` files
|
### Task 5: Slim per-package `tsconfig.json` files
|
||||||
|
|
||||||
Replace the duplicated full tsconfig in each package with a slim `extends`-based one pointing to root.
|
Replace the duplicated full tsconfig in each package with a slim `extends`-based one
|
||||||
|
pointing to root.
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/tsconfig.json`
|
- Modify: `packages/core/tsconfig.json`
|
||||||
|
|
||||||
- Modify: `packages/api-server/tsconfig.json`
|
- Modify: `packages/api-server/tsconfig.json`
|
||||||
|
|
||||||
- Modify: `packages/mcp-server/tsconfig.json`
|
- Modify: `packages/mcp-server/tsconfig.json`
|
||||||
|
|
||||||
- [ ] **Step 1: Replace `packages/core/tsconfig.json`**
|
- [ ] **Step 1: Replace `packages/core/tsconfig.json`**
|
||||||
@@ -400,7 +428,8 @@ Replace the duplicated full tsconfig in each package with a slim `extends`-based
|
|||||||
|
|
||||||
- [ ] **Step 4: Verify `@tsconfig/bun` is resolvable**
|
- [ ] **Step 4: Verify `@tsconfig/bun` is resolvable**
|
||||||
|
|
||||||
The root tsconfig extends `@tsconfig/bun/tsconfig.json`. Confirm the package is installed:
|
The root tsconfig extends `@tsconfig/bun/tsconfig.json`. Confirm the package is
|
||||||
|
installed:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
ls node_modules/@tsconfig/bun/tsconfig.json
|
ls node_modules/@tsconfig/bun/tsconfig.json
|
||||||
@@ -414,7 +443,8 @@ Expected: file exists.
|
|||||||
bun run typecheck
|
bun run typecheck
|
||||||
```
|
```
|
||||||
|
|
||||||
Expected: Turbo runs `typecheck` for all 3 packages in parallel, all pass (or same pre-existing errors — no new ones).
|
Expected: Turbo runs `typecheck` for all 3 packages in parallel, all pass (or same
|
||||||
|
pre-existing errors — no new ones).
|
||||||
|
|
||||||
- [ ] **Step 6: Commit**
|
- [ ] **Step 6: Commit**
|
||||||
|
|
||||||
@@ -423,7 +453,7 @@ git add packages/core/tsconfig.json packages/api-server/tsconfig.json packages/m
|
|||||||
git commit -m "chore: slim per-package tsconfigs to extend root"
|
git commit -m "chore: slim per-package tsconfigs to extend root"
|
||||||
```
|
```
|
||||||
|
|
||||||
---
|
* * *
|
||||||
|
|
||||||
### Task 6: Smoke test full build pipeline
|
### Task 6: Smoke test full build pipeline
|
||||||
|
|
||||||
@@ -437,7 +467,8 @@ Verify everything works end-to-end.
|
|||||||
bun run typecheck
|
bun run typecheck
|
||||||
```
|
```
|
||||||
|
|
||||||
Expected: Turbo runs `typecheck` across all packages. Exit 0.
|
Expected: Turbo runs `typecheck` across all packages.
|
||||||
|
Exit 0.
|
||||||
|
|
||||||
- [ ] **Step 2: Run full build**
|
- [ ] **Step 2: Run full build**
|
||||||
|
|
||||||
@@ -445,7 +476,8 @@ Expected: Turbo runs `typecheck` across all packages. Exit 0.
|
|||||||
bun run build
|
bun run build
|
||||||
```
|
```
|
||||||
|
|
||||||
Expected: `dist/` cleaned, Turbo runs `build` (core first, then api-server and mcp-server in parallel), build artifacts appear in `dist/api/` and `dist/mcp/`.
|
Expected: `dist/` cleaned, Turbo runs `build` (core first, then api-server and
|
||||||
|
mcp-server in parallel), build artifacts appear in `dist/api/` and `dist/mcp/`.
|
||||||
|
|
||||||
- [ ] **Step 3: Verify dist artifacts**
|
- [ ] **Step 3: Verify dist artifacts**
|
||||||
|
|
||||||
@@ -461,7 +493,9 @@ Expected: compiled output files in both directories.
|
|||||||
grep -c '\^' bun.lock
|
grep -c '\^' bun.lock
|
||||||
```
|
```
|
||||||
|
|
||||||
With `exact = true` in bunfig.toml, new installs won't add `^` ranges. Existing `^` ranges in `bun.lock` from before are fine — they'll be resolved to exact on next fresh install.
|
With `exact = true` in bunfig.toml, new installs won’t add `^` ranges.
|
||||||
|
Existing `^` ranges in `bun.lock` from before are fine — they’ll be resolved to exact on
|
||||||
|
next fresh install.
|
||||||
|
|
||||||
- [ ] **Step 5: Final commit if any loose files**
|
- [ ] **Step 5: Final commit if any loose files**
|
||||||
|
|
||||||
|
|||||||
@@ -1,53 +1,64 @@
|
|||||||
# Cookie Env-Only Implementation Plan
|
# Cookie Env-Only Implementation Plan
|
||||||
|
|
||||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
> **For agentic workers:** REQUIRED SUB-SKILL: Use
|
||||||
|
> superpowers:subagent-driven-development (recommended) or superpowers:executing-plans
|
||||||
|
> to implement this plan task-by-task.
|
||||||
|
> Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||||
|
|
||||||
**Goal:** Remove cookie files and request-provided cookie overrides so all authenticated marketplace scraping reads raw `Cookie` header strings only from environment variables.
|
**Goal:** Remove cookie files and request-provided cookie overrides so all authenticated
|
||||||
|
marketplace scraping reads raw `Cookie` header strings only from environment variables.
|
||||||
|
|
||||||
**Architecture:** Collapse shared cookie loading to a single env-var reader in `packages/core/src/utils/cookies.ts`, then tighten Facebook and eBay core signatures to stop accepting request/file cookie inputs. Update the API and MCP adapters so they no longer advertise or forward cookie parameters, and rewrite docs/tests to match the env-only contract.
|
**Architecture:** Collapse shared cookie loading to a single env-var reader in
|
||||||
|
`packages/core/src/utils/cookies.ts`, then tighten Facebook and eBay core signatures to
|
||||||
|
stop accepting request/file cookie inputs.
|
||||||
|
Update the API and MCP adapters so they no longer advertise or forward cookie
|
||||||
|
parameters, and rewrite docs/tests to match the env-only contract.
|
||||||
|
|
||||||
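The env-only contract described above can be sketched as follows. This is a minimal illustration, not the project's actual loader: `EnvCookieConfig`, `normalizeCookieHeader`, and `readCookieHeaderFromEnv` are hypothetical names, and the environment is passed in as a plain object so the sketch stays testable without touching global state.

```ts
// Hypothetical sketch: names and shapes here are assumptions for illustration,
// not the real contents of packages/core/src/utils/cookies.ts.
interface EnvCookieConfig {
  /** Environment variable that holds the raw `Cookie` header string. */
  envVar: string;
  /** Human-readable service name used in error messages. */
  service: string;
}

/** Normalize a raw `Cookie` header ("a=1;b=2 ; ") into "a=1; b=2". */
function normalizeCookieHeader(header: string): string {
  const pairs: string[] = [];
  for (const part of header.split(";")) {
    const trimmed = part.trim();
    const eq = trimmed.indexOf("=");
    if (eq <= 0) continue; // skip empty segments and pairs without a name
    const name = trimmed.slice(0, eq).trim();
    const value = trimmed.slice(eq + 1).trim();
    pairs.push(`${name}=${value}`);
  }
  return pairs.join("; ");
}

/** Read cookies from the environment only; no file or request fallback. */
function readCookieHeaderFromEnv(
  config: EnvCookieConfig,
  env: Record<string, string | undefined>,
): string {
  const raw = (env[config.envVar] ?? "").trim();
  if (raw === "") {
    throw new Error(`${config.service} cookies are not configured: set ${config.envVar}`);
  }
  const header = normalizeCookieHeader(raw);
  if (header === "") {
    throw new Error(`${config.envVar} does not contain any name=value cookie pairs`);
  }
  return header;
}
```

A caller would typically pass `process.env` as the second argument. The plan's eBay path warns and continues when `EBAY_COOKIE` is unset, so a caller on that path could catch the error and proceed without cookies instead of failing.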
**Tech Stack:** Bun, TypeScript, Bun test, Biome, workspace package exports
|
**Tech Stack:** Bun, TypeScript, Bun test, Biome, workspace package exports
|
||||||
|
|
||||||
---
|
* * *
|
||||||
|
|
||||||
## File Map
|
## File Map
|
||||||
|
|
||||||
- Modify: `packages/core/src/utils/cookies.ts`
|
- Modify: `packages/core/src/utils/cookies.ts` Purpose: remove JSON/file/request-source
|
||||||
Purpose: remove JSON/file/request-source loading and keep env-only cookie parsing/formatting.
|
loading and keep env-only cookie parsing/formatting.
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts`
|
- Modify: `packages/core/src/scrapers/facebook.ts` Purpose: drop `cookiesSource` /
|
||||||
Purpose: drop `cookiesSource` / `cookiePath` arguments and switch to env-only error text.
|
`cookiePath` arguments and switch to env-only error text.
|
||||||
- Modify: `packages/core/src/scrapers/ebay.ts`
|
- Modify: `packages/core/src/scrapers/ebay.ts` Purpose: remove `opts.cookies` request
|
||||||
Purpose: remove `opts.cookies` request override and use env-only cookie loading.
|
override and use env-only cookie loading.
|
||||||
- Modify: `packages/core/src/index.ts`
|
- Modify: `packages/core/src/index.ts` Purpose: keep exports aligned with tightened core
|
||||||
Purpose: keep exports aligned with tightened core signatures.
|
signatures.
|
||||||
- Modify: `packages/core/test/facebook-core.test.ts`
|
- Modify: `packages/core/test/facebook-core.test.ts` Purpose: replace missing-file
|
||||||
Purpose: replace missing-file coverage with env-only auth tests.
|
coverage with env-only auth tests.
|
||||||
- Create: `packages/core/test/ebay-core.test.ts`
|
- Create: `packages/core/test/ebay-core.test.ts` Purpose: add dedicated eBay auth
|
||||||
Purpose: add dedicated eBay auth regression coverage instead of mixing it into Facebook tests.
|
regression coverage instead of mixing it into Facebook tests.
|
||||||
- Modify: `packages/api-server/src/routes/facebook.ts`
|
- Modify: `packages/api-server/src/routes/facebook.ts` Purpose: stop parsing/forwarding
|
||||||
Purpose: stop parsing/forwarding `cookies` query params.
|
`cookies` query params.
|
||||||
- Modify: `packages/api-server/src/routes/ebay.ts`
|
- Modify: `packages/api-server/src/routes/ebay.ts` Purpose: stop parsing/forwarding
|
||||||
Purpose: stop parsing/forwarding `cookies` query params.
|
`cookies` query params.
|
||||||
- Create: `packages/api-server/test/routes.test.ts`
|
- Create: `packages/api-server/test/routes.test.ts` Purpose: verify Facebook/eBay routes
|
||||||
Purpose: verify Facebook/eBay routes ignore cookie query params and still call core correctly.
|
ignore cookie query params and still call core correctly.
|
||||||
- Modify: `packages/mcp-server/src/protocol/tools.ts`
|
- Modify: `packages/mcp-server/src/protocol/tools.ts` Purpose: remove Facebook/eBay
|
||||||
Purpose: remove Facebook/eBay cookie tool inputs and descriptions.
|
cookie tool inputs and descriptions.
|
||||||
- Modify: `packages/mcp-server/src/protocol/handler.ts`
|
- Modify: `packages/mcp-server/src/protocol/handler.ts` Purpose: stop mapping removed
|
||||||
Purpose: stop mapping removed cookie tool inputs into API URLs.
|
cookie tool inputs into API URLs.
|
||||||
- Create: `packages/mcp-server/test/protocol.test.ts`
|
- Create: `packages/mcp-server/test/protocol.test.ts` Purpose: verify tool schemas and
|
||||||
Purpose: verify tool schemas and handler URL building no longer include Facebook/eBay cookie fields.
|
handler URL building no longer include Facebook/eBay cookie fields.
|
||||||
- Modify: `cookies/AGENTS.md`
|
- Modify: `cookies/AGENTS.md` Purpose: document env vars as the only supported cookie
|
||||||
Purpose: document env vars as the only supported cookie input.
|
input.
|
||||||
|
|
||||||
### Task 1: Lock core cookie utilities to env-only loading
|
### Task 1: Lock core cookie utilities to env-only loading
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/src/utils/cookies.ts:19-227`
|
- Modify: `packages/core/src/utils/cookies.ts:19-227`
|
||||||
|
|
||||||
- Test: `packages/core/test/facebook-core.test.ts`
|
- Test: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing test**
|
- [ ] **Step 1: Write the failing test**
|
||||||
|
|
||||||
Add or replace the auth-source test block in `packages/core/test/facebook-core.test.ts` with env-only expectations:
|
Add or replace the auth-source test block in `packages/core/test/facebook-core.test.ts`
|
||||||
|
with env-only expectations:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
test("should load Facebook cookies from FACEBOOK_COOKIE env var", async () => {
|
test("should load Facebook cookies from FACEBOOK_COOKIE env var", async () => {
|
||||||
@@ -85,12 +96,14 @@ test("should reject missing Facebook auth env var", async () => {
|
|||||||
|
|
||||||
- [ ] **Step 2: Run test to verify it fails**
|
- [ ] **Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts`
|
Run: `bun test packages/core/test/facebook-core.test.ts` Expected: FAIL because the
|
||||||
Expected: FAIL because the current implementation still allows missing env values to fall through to file/request-based behavior and does not emit the new env-only error.
|
current implementation still allows missing env values to fall through to
|
||||||
|
file/request-based behavior and does not emit the new env-only error.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
Replace the multi-source loader in `packages/core/src/utils/cookies.ts` with an env-only loader. The target shape is:
|
Replace the multi-source loader in `packages/core/src/utils/cookies.ts` with an env-only
|
||||||
|
loader. The target shape is:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
export interface CookieConfig {
|
export interface CookieConfig {
|
||||||
@@ -129,8 +142,8 @@ Delete the now-dead helpers and types that exist only for JSON/file/request load
|
|||||||
|
|
||||||
- [ ] **Step 4: Run test to verify it passes**
|
- [ ] **Step 4: Run test to verify it passes**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts`
|
Run: `bun test packages/core/test/facebook-core.test.ts` Expected: PASS for the new
|
||||||
Expected: PASS for the new env-only tests.
|
env-only tests.
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
@@ -142,10 +155,15 @@ git commit -m "refactor: make cookie loading env-only"
|
|||||||
### Task 2: Tighten Facebook core APIs to the new contract
|
### Task 2: Tighten Facebook core APIs to the new contract
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts:23-29`
|
- Modify: `packages/core/src/scrapers/facebook.ts:23-29`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts:214-228`
|
- Modify: `packages/core/src/scrapers/facebook.ts:214-228`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts:823-929`
|
- Modify: `packages/core/src/scrapers/facebook.ts:823-929`
|
||||||
|
|
||||||
- Modify: `packages/core/src/index.ts:5-15`
|
- Modify: `packages/core/src/index.ts:5-15`
|
||||||
|
|
||||||
- Test: `packages/core/test/facebook-core.test.ts`
|
- Test: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing test**
|
- [ ] **Step 1: Write the failing test**
|
||||||
@@ -171,8 +189,9 @@ test("should fail Facebook item fetch when FACEBOOK_COOKIE is unset", async () =
|
|||||||
|
|
||||||
- [ ] **Step 2: Run test to verify it fails**
|
- [ ] **Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts`
|
Run: `bun test packages/core/test/facebook-core.test.ts` Expected: FAIL because the
|
||||||
Expected: FAIL because the current function signatures and error text still mention parameter/file-based auth paths.
|
current function signatures and error text still mention parameter/file-based auth
|
||||||
|
paths.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
@@ -206,12 +225,14 @@ console.warn(
|
|||||||
);
|
);
|
||||||
```
|
```
|
||||||
|
|
||||||
Remove the extra cookie arguments from `fetchFacebookItem(...)` and keep `packages/core/src/index.ts` exporting the tightened functions without the old parameter contract.
|
Remove the extra cookie arguments from `fetchFacebookItem(...)` and keep
|
||||||
|
`packages/core/src/index.ts` exporting the tightened functions without the old parameter
|
||||||
|
contract.
|
||||||
|
|
||||||
- [ ] **Step 4: Run test to verify it passes**
|
- [ ] **Step 4: Run test to verify it passes**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts`
|
Run: `bun test packages/core/test/facebook-core.test.ts` Expected: PASS with the new
|
||||||
Expected: PASS with the new env-only Facebook API surface.
|
env-only Facebook API surface.
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
@@ -223,8 +244,11 @@ git commit -m "refactor: remove facebook cookie overrides"
|
|||||||
### Task 3: Tighten eBay core APIs to env-only auth
|
### Task 3: Tighten eBay core APIs to env-only auth
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/ebay.ts:9-15`
|
- Modify: `packages/core/src/scrapers/ebay.ts:9-15`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/ebay.ts:337-389`
|
- Modify: `packages/core/src/scrapers/ebay.ts:337-389`
|
||||||
|
|
||||||
- Create: `packages/core/test/ebay-core.test.ts`
|
- Create: `packages/core/test/ebay-core.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing test**
|
- [ ] **Step 1: Write the failing test**
|
||||||
@@ -249,8 +273,8 @@ test("should warn and continue without eBay cookies when EBAY_COOKIE is unset",
|
|||||||
|
|
||||||
- [ ] **Step 2: Run test to verify it fails**
|
- [ ] **Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/ebay-core.test.ts`
|
Run: `bun test packages/core/test/ebay-core.test.ts` Expected: FAIL because
|
||||||
Expected: FAIL because `loadEbayCookies` still accepts request overrides and mentions file/json sources.
|
`loadEbayCookies` still accepts request overrides and mentions file/json sources.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
@@ -276,12 +300,13 @@ async function loadEbayCookies(): Promise<string | undefined> {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Then remove `cookies` from `fetchEbayItems(..., opts)` and the destructuring that feeds it into `loadEbayCookies()`.
|
Then remove `cookies` from `fetchEbayItems(..., opts)` and the destructuring that feeds
|
||||||
|
it into `loadEbayCookies()`.
|
||||||
|
|
||||||
- [ ] **Step 4: Run test to verify it passes**
|
- [ ] **Step 4: Run test to verify it passes**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/ebay-core.test.ts`
|
Run: `bun test packages/core/test/ebay-core.test.ts` Expected: PASS for the eBay
|
||||||
Expected: PASS for the eBay env-only regression coverage.
|
env-only regression coverage.
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
@@ -293,13 +318,17 @@ git commit -m "refactor: make ebay auth env-only"
|
|||||||
### Task 4: Remove cookie query parameters from the API adapter
|
### Task 4: Remove cookie query parameters from the API adapter
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/api-server/src/routes/facebook.ts:3-33`
|
- Modify: `packages/api-server/src/routes/facebook.ts:3-33`
|
||||||
|
|
||||||
- Modify: `packages/api-server/src/routes/ebay.ts:3-52`
|
- Modify: `packages/api-server/src/routes/ebay.ts:3-52`
|
||||||
|
|
||||||
- Create: `packages/api-server/test/routes.test.ts`
|
- Create: `packages/api-server/test/routes.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing test**
|
- [ ] **Step 1: Write the failing test**
|
||||||
|
|
||||||
Create `packages/api-server/test/routes.test.ts` and mock `@marketplace-scrapers/core` so the route contract is explicit:
|
Create `packages/api-server/test/routes.test.ts` and mock `@marketplace-scrapers/core`
|
||||||
|
so the route contract is explicit:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
import { afterEach, describe, expect, mock, test } from "bun:test";
|
import { afterEach, describe, expect, mock, test } from "bun:test";
|
||||||
@@ -347,8 +376,9 @@ test("ebayRoute ignores cookies query parameter", async () => {
|
|||||||
|
|
||||||
- [ ] **Step 2: Run test to verify it fails**
|
- [ ] **Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
Run: `bun test packages/api-server/test/routes.test.ts`
|
Run: `bun test packages/api-server/test/routes.test.ts` Expected: FAIL because the
|
||||||
Expected: FAIL because the current routes still parse `reqUrl.searchParams.get("cookies")` and forward it downstream.
|
current routes still parse `reqUrl.searchParams.get("cookies")` and forward it
|
||||||
|
downstream.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
@@ -383,8 +413,8 @@ const items = await fetchEbayItems(SEARCH_QUERY, 1, {
|
|||||||
|
|
||||||
- [ ] **Step 4: Run test to verify it passes**
|
- [ ] **Step 4: Run test to verify it passes**
|
||||||
|
|
||||||
Run: `bun test packages/api-server/test/routes.test.ts`
|
Run: `bun test packages/api-server/test/routes.test.ts` Expected: PASS for route
|
||||||
Expected: PASS for route coverage and no remaining adapter references to `cookies` for Facebook/eBay.
|
coverage and no remaining adapter references to `cookies` for Facebook/eBay.
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
@@ -396,13 +426,17 @@ git commit -m "refactor: remove api cookie query overrides"
|
|||||||
### Task 5: Remove cookie inputs from MCP tool schemas and request mapping
|
### Task 5: Remove cookie inputs from MCP tool schemas and request mapping
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/mcp-server/src/protocol/tools.ts:65-148`
|
- Modify: `packages/mcp-server/src/protocol/tools.ts:65-148`
|
||||||
|
|
||||||
- Modify: `packages/mcp-server/src/protocol/handler.ts:154-211`
|
- Modify: `packages/mcp-server/src/protocol/handler.ts:154-211`
|
||||||
|
|
||||||
- Create: `packages/mcp-server/test/protocol.test.ts`
|
- Create: `packages/mcp-server/test/protocol.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing test**
|
- [ ] **Step 1: Write the failing test**
|
||||||
|
|
||||||
Create `packages/mcp-server/test/protocol.test.ts` with schema and URL-building assertions:
|
Create `packages/mcp-server/test/protocol.test.ts` with schema and URL-building
|
||||||
|
assertions:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
import { expect, mock, test } from "bun:test";
|
import { expect, mock, test } from "bun:test";
|
||||||
@@ -445,8 +479,8 @@ expect(calledUrl).not.toContain("cookies=");
|
|||||||
|
|
||||||
- [ ] **Step 2: Run test to verify it fails**
|
- [ ] **Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
Run: `bun test packages/mcp-server/test/protocol.test.ts`
|
Run: `bun test packages/mcp-server/test/protocol.test.ts` Expected: FAIL because the
|
||||||
Expected: FAIL because the current MCP schema and handler still expose and forward those inputs.
|
current MCP schema and handler still expose and forward those inputs.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
@@ -465,12 +499,13 @@ Delete the Facebook/eBay cookie tool properties and handler mapping:
|
|||||||
// if (args.cookies) params.append("cookies", args.cookies);
|
// if (args.cookies) params.append("cookies", args.cookies);
|
||||||
```
|
```
|
||||||
|
|
||||||
Leave Kijiji alone; this plan only changes Facebook/eBay env-only auth paths defined by the approved spec.
|
Leave Kijiji alone; this plan only changes Facebook/eBay env-only auth paths defined by
|
||||||
|
the approved spec.
|
||||||
|
|
||||||
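One way to keep removed inputs from reaching the API URL is to build the query string from an explicit allowlist, so a stray `cookies` argument is dropped rather than forwarded. The sketch below is illustrative only: `ALLOWED_QUERY_KEYS`, `buildToolUrl`, and the example argument names are assumptions, not the handler's real code, and it encodes values with `encodeURIComponent` rather than the `params.append(...)` style the handler uses.

```ts
// Hypothetical sketch: the allowlist and argument names are assumptions.
const ALLOWED_QUERY_KEYS: readonly string[] = ["q", "location", "maxItems"];

/** Build an API URL from tool arguments, forwarding only allowlisted keys. */
function buildToolUrl(baseUrl: string, args: Record<string, unknown>): string {
  const pairs: string[] = [];
  for (const key of ALLOWED_QUERY_KEYS) {
    const value = args[key];
    if (value === undefined || value === null || value === "") continue;
    pairs.push(`${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`);
  }
  // Anything not in the allowlist (including `cookies`) never reaches the URL.
  return pairs.length > 0 ? `${baseUrl}?${pairs.join("&")}` : baseUrl;
}
```

With an allowlist, the `not.toContain("cookies=")` assertion holds by construction instead of depending on every call site remembering to omit the parameter.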
- [ ] **Step 4: Run test to verify it passes**
|
- [ ] **Step 4: Run test to verify it passes**
|
||||||
|
|
||||||
Run: `bun test packages/mcp-server/test/protocol.test.ts`
|
Run: `bun test packages/mcp-server/test/protocol.test.ts` Expected: PASS with MCP
|
||||||
Expected: PASS with MCP definitions and handler mapping in sync.
|
definitions and handler mapping in sync.
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
@@ -482,12 +517,16 @@ git commit -m "refactor: remove mcp cookie parameters"
|
|||||||
### Task 6: Rewrite cookie documentation and run full verification
|
### Task 6: Rewrite cookie documentation and run full verification
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `cookies/AGENTS.md:9-85`
|
- Modify: `cookies/AGENTS.md:9-85`
|
||||||
- Modify: `docs/superpowers/specs/2026-04-21-cookie-env-only-design.md` only if implementation reveals a spec mismatch
|
|
||||||
|
- Modify: `docs/superpowers/specs/2026-04-21-cookie-env-only-design.md` only if
|
||||||
|
implementation reveals a spec mismatch
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing test**
|
- [ ] **Step 1: Write the failing test**
|
||||||
|
|
||||||
Treat docs drift as a contract failure. Capture the required state before editing:
|
Treat docs drift as a contract failure.
|
||||||
|
Capture the required state before editing:
|
||||||
|
|
||||||
```md
|
```md
|
||||||
- Cookie setup docs mention env vars only for Facebook and eBay
|
- Cookie setup docs mention env vars only for Facebook and eBay
|
||||||
@@ -497,14 +536,14 @@ Treat docs drift as a contract failure. Capture the required state before editin
|
|||||||
|
|
||||||
- [ ] **Step 2: Run verification to prove current docs are stale**
|
- [ ] **Step 2: Run verification to prove current docs are stale**
|
||||||
|
|
||||||
Run: `rg -n "facebook\.json|ebay\.json|cookies=" cookies/AGENTS.md`
|
Run: `rg -n "facebook\.json|ebay\.json|cookies=" cookies/AGENTS.md` Expected: matches
|
||||||
Expected: matches found
|
found
|
||||||
|
|
||||||
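The same staleness check can be expressed in code, for example inside a docs regression test. The sketch below reuses the pattern from the `rg` command above; `findStaleCookieDocLines` is a hypothetical helper, not something the plan defines.

```ts
// Same pattern as the `rg` command: file-based cookie names or a `cookies=` query param.
const STALE_COOKIE_DOC_PATTERN = /facebook\.json|ebay\.json|cookies=/;

interface StaleDocHit {
  line: number; // 1-based line number, matching `rg -n` output
  text: string;
}

/** Return every line of a Markdown document that still mentions a removed cookie source. */
function findStaleCookieDocLines(markdown: string): StaleDocHit[] {
  const hits: StaleDocHit[] = [];
  const lines = markdown.split("\n");
  for (let i = 0; i < lines.length; i++) {
    if (STALE_COOKIE_DOC_PATTERN.test(lines[i])) {
      hits.push({ line: i + 1, text: lines[i] });
    }
  }
  return hits;
}
```

Before the rewrite the helper should report hits; after Step 3 it should return an empty array for `cookies/AGENTS.md`.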
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
Rewrite the cookie setup doc so Facebook and eBay each show only env-var setup:
|
Rewrite the cookie setup doc so Facebook and eBay each show only env-var setup:
|
||||||
|
|
||||||
```md
|
````md
|
||||||
## Cookie Configuration
|
## Cookie Configuration
|
||||||
|
|
||||||
All supported authenticated scrapers read cookies only from environment variables.
|
All supported authenticated scrapers read cookies only from environment variables.
|
||||||
@@ -513,14 +552,14 @@ All supported authenticated scrapers read cookies only from environment variable
|
|||||||
|
|
||||||
```bash
|
```bash
|
||||||
export FACEBOOK_COOKIE='c_user=123; xs=token; fr=request'
|
export FACEBOOK_COOKIE='c_user=123; xs=token; fr=request'
|
||||||
```
|
````
|
||||||
|
|
||||||
### eBay
|
### eBay
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
export EBAY_COOKIE='s=VALUE; ds2=VALUE; ebay=VALUE'
|
export EBAY_COOKIE='s=VALUE; ds2=VALUE; ebay=VALUE'
|
||||||
```
|
```
|
||||||
```
|
````
|
||||||
|
|
||||||
Remove the file-based and request-parameter sections entirely.
|
Remove the file-based and request-parameter sections entirely.
|
||||||
|
|
||||||
@@ -534,10 +573,14 @@ Expected: all commands pass
|
|||||||
```bash
|
```bash
|
||||||
git add cookies/AGENTS.md docs/superpowers/specs/2026-04-21-cookie-env-only-design.md
|
git add cookies/AGENTS.md docs/superpowers/specs/2026-04-21-cookie-env-only-design.md
|
||||||
git commit -m "docs: align cookie setup with env-only auth"
|
git commit -m "docs: align cookie setup with env-only auth"
|
||||||
```
|
````
|
||||||
|
|
||||||
## Self-Review
|
## Self-Review
|
||||||
|
|
||||||
- Spec coverage check: shared cookie utils, Facebook, eBay, API adapter, MCP adapter, tests, and docs each have explicit tasks.
|
- Spec coverage check: shared cookie utils, Facebook, eBay, API adapter, MCP adapter,
|
||||||
- Placeholder scan: concrete test files are now named for eBay core, API routes, and MCP protocol coverage.
|
tests, and docs each have explicit tasks.
|
||||||
- Type consistency check: `ensureCookies(config)` is the single shared loader name used across Tasks 1-3, and Facebook/eBay route signatures stay aligned with the core changes.
|
- Placeholder scan: concrete test files are now named for eBay core, API routes, and MCP
|
||||||
|
protocol coverage.
|
||||||
|
- Type consistency check: `ensureCookies(config)` is the single shared loader name used
|
||||||
|
across Tasks 1-3, and Facebook/eBay route signatures stay aligned with the core
|
||||||
|
changes.
|
||||||
|
|||||||
@@ -1,34 +1,49 @@
|
|||||||
# Facebook Comet Rewrite Implementation Plan
|
# Facebook Comet Rewrite Implementation Plan
|
||||||
|
|
||||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
> **For agentic workers:** REQUIRED SUB-SKILL: Use
|
||||||
|
> superpowers:subagent-driven-development (recommended) or superpowers:executing-plans
|
||||||
|
> to implement this plan task-by-task.
|
||||||
|
> Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||||
|
|
||||||
**Goal:** Replace the legacy Facebook Marketplace scraper with a route-aware hybrid Comet-bootstrap parser for both search and item routes.
|
**Goal:** Replace the legacy Facebook Marketplace scraper with a route-aware hybrid
|
||||||
|
Comet-bootstrap parser for both search and item routes.
|
||||||
|
|
||||||
**Architecture:** Keep authenticated direct HTTP fetches as the transport. Classify each Facebook response first, then parse route-specific Comet bootstrap/state candidates, and fall back to rendered-HTML extraction only when bootstrap decoding cannot produce the expected search or item shape.
|
**Architecture:** Keep authenticated direct HTTP fetches as the transport.
|
||||||
|
Classify each Facebook response first, then parse route-specific Comet bootstrap/state
|
||||||
|
candidates, and fall back to rendered-HTML extraction only when bootstrap decoding
|
||||||
|
cannot produce the expected search or item shape.
|
||||||
|
|
||||||
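The classify-then-parse flow described above can be sketched as a small dispatcher. This is a rough illustration under stated assumptions: `FacebookResponseKind` mirrors the union defined later in this plan, while `ParseStrategy`, `ParseOutcome`, and `parseByRoute` are hypothetical names, and treating a route mismatch as an error is an assumption rather than confirmed behavior.

```ts
type FacebookResponseKind = "search" | "item" | "auth_gated" | "unavailable" | "unknown";

// Hypothetical: a pair of parsers for one route, tried in order.
interface ParseStrategy<T> {
  fromBootstrap: (html: string) => T | null;
  fromRenderedHtml: (html: string) => T | null;
}

type ParseOutcome<T> =
  | { status: "ok"; data: T; source: "bootstrap" | "html_fallback" }
  | { status: "error"; reason: string };

/** Classify first, prefer bootstrap data, fall back to rendered HTML only when needed. */
function parseByRoute<T>(
  kind: FacebookResponseKind,
  expected: "search" | "item",
  html: string,
  strategy: ParseStrategy<T>,
): ParseOutcome<T> {
  // Auth walls and unavailable pages are reported as-is; no parsing is attempted.
  if (kind === "auth_gated" || kind === "unavailable") {
    return { status: "error", reason: kind };
  }
  // Assumption: a response for the wrong route is treated as a failure.
  if (kind !== expected) {
    return { status: "error", reason: `unexpected_route:${kind}` };
  }
  const fromBootstrap = strategy.fromBootstrap(html);
  if (fromBootstrap !== null) {
    return { status: "ok", data: fromBootstrap, source: "bootstrap" };
  }
  const fromHtml = strategy.fromRenderedHtml(html);
  if (fromHtml !== null) {
    return { status: "ok", data: fromHtml, source: "html_fallback" };
  }
  return { status: "error", reason: "no_parseable_data" };
}
```

Recording `source` on success makes it visible in tests and logs how often the HTML fallback is carrying the result, which is a useful signal that the bootstrap shape has drifted.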
**Tech Stack:** Bun, TypeScript, `bun:test`, `linkedom`, existing shared cookie/http helpers
|
**Tech Stack:** Bun, TypeScript, `bun:test`, `linkedom`, existing shared cookie/http
|
||||||
|
helpers
|
||||||
|
|
||||||
---
|
* * *
|
||||||
|
|
||||||
## File Structure
|
## File Structure
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts`
|
- Modify: `packages/core/src/scrapers/facebook.ts`
|
||||||
- Owns Facebook fetch flow, response classification, bootstrap candidate extraction, search parsing, item parsing, and HTML fallbacks.
|
- Owns Facebook fetch flow, response classification, bootstrap candidate extraction,
|
||||||
|
search parsing, item parsing, and HTML fallbacks.
|
||||||
- Modify: `packages/core/test/facebook-core.test.ts`
|
- Modify: `packages/core/test/facebook-core.test.ts`
|
||||||
- Owns unit coverage for response classification, bootstrap parsing, fallback parsing, and route-aware item/search extraction behavior.
|
- Owns unit coverage for response classification, bootstrap parsing, fallback parsing,
|
||||||
|
and route-aware item/search extraction behavior.
|
||||||
- Modify: `packages/core/test/facebook-integration.test.ts`
|
- Modify: `packages/core/test/facebook-integration.test.ts`
|
||||||
- Owns higher-level fetch flow tests, auth/degradation behavior, and result shaping for search/item entrypoints.
|
- Owns higher-level fetch flow tests, auth/degradation behavior, and result shaping
|
||||||
|
for search/item entrypoints.
|
||||||
|
|
||||||
### Task 1: Add Route Classification Coverage
|
### Task 1: Add Route Classification Coverage
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/test/facebook-core.test.ts`
|
- Modify: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts`
|
- Modify: `packages/core/src/scrapers/facebook.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/facebook-core.test.ts`
|
- Test: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing tests**
|
- [ ] **Step 1: Write the failing tests**
|
||||||
|
|
||||||
Add these tests near the Facebook parser tests in `packages/core/test/facebook-core.test.ts`:
|
Add these tests near the Facebook parser tests in
|
||||||
|
`packages/core/test/facebook-core.test.ts`:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
test("classifies Comet search responses", () => {
|
test("classifies Comet search responses", () => {
|
||||||
@@ -89,12 +104,14 @@ test("classifies unavailable item responses", () => {
|
|||||||
|
|
||||||
- [ ] **Step 2: Run test to verify it fails**
|
- [ ] **Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts --test-name-pattern "classifies"`
|
Run:
|
||||||
|
`bun test packages/core/test/facebook-core.test.ts --test-name-pattern "classifies"`
|
||||||
Expected: FAIL because `classifyFacebookResponse` does not exist yet.
|
Expected: FAIL because `classifyFacebookResponse` does not exist yet.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
Add this type and function near the parsing section in `packages/core/src/scrapers/facebook.ts`:
|
Add this type and function near the parsing section in
|
||||||
|
`packages/core/src/scrapers/facebook.ts`:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
type FacebookResponseKind = "search" | "item" | "auth_gated" | "unavailable" | "unknown";
|
type FacebookResponseKind = "search" | "item" | "auth_gated" | "unavailable" | "unknown";
|
||||||
@@ -128,7 +145,8 @@ export function classifyFacebookResponse(htmlString: HTMLString, responseUrl: st
|
|||||||
|
|
||||||
- [ ] **Step 4: Run test to verify it passes**
|
- [ ] **Step 4: Run test to verify it passes**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts --test-name-pattern "classifies"`
|
Run:
|
||||||
|
`bun test packages/core/test/facebook-core.test.ts --test-name-pattern "classifies"`
|
||||||
Expected: PASS
|
Expected: PASS
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
@@ -141,8 +159,11 @@ git commit -m "refactor: add facebook response classification"
|
|||||||
### Task 2: Add Bootstrap Candidate Extraction
|
### Task 2: Add Bootstrap Candidate Extraction
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/test/facebook-core.test.ts`
|
- Modify: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts`
|
- Modify: `packages/core/src/scrapers/facebook.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/facebook-core.test.ts`
|
- Test: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing tests**
|
- [ ] **Step 1: Write the failing tests**
|
||||||
@@ -185,7 +206,8 @@ test("keeps candidate order stable for later scoring", () => {
|
|||||||
|
|
||||||
- [ ] **Step 2: Run test to verify it fails**
|
- [ ] **Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts --test-name-pattern "bootstrap candidates"`
|
Run:
|
||||||
|
`bun test packages/core/test/facebook-core.test.ts --test-name-pattern "bootstrap candidates"`
|
||||||
Expected: FAIL because `extractFacebookBootstrapCandidates` does not exist.
|
Expected: FAIL because `extractFacebookBootstrapCandidates` does not exist.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
@@ -218,7 +240,8 @@ export function extractFacebookBootstrapCandidates(htmlString: HTMLString): Reco
|
|||||||
|
|
||||||
- [ ] **Step 4: Run test to verify it passes**
|
- [ ] **Step 4: Run test to verify it passes**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts --test-name-pattern "bootstrap candidates"`
|
Run:
|
||||||
|
`bun test packages/core/test/facebook-core.test.ts --test-name-pattern "bootstrap candidates"`
|
||||||
Expected: PASS
|
Expected: PASS
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
@@ -231,10 +254,15 @@ git commit -m "refactor: add facebook bootstrap candidate extraction"
|
|||||||
### Task 3: Replace Search Parsing With Candidate Scoring
|
### Task 3: Replace Search Parsing With Candidate Scoring
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/test/facebook-core.test.ts`
|
- Modify: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/test/facebook-integration.test.ts`
|
- Modify: `packages/core/test/facebook-integration.test.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts`
|
- Modify: `packages/core/src/scrapers/facebook.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/facebook-core.test.ts`
|
- Test: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/facebook-integration.test.ts`
|
- Test: `packages/core/test/facebook-integration.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing tests**
|
- [ ] **Step 1: Write the failing tests**
|
||||||
@@ -323,12 +351,15 @@ const mockSearchHtml = `
|
|||||||
|
|
||||||
- [ ] **Step 2: Run test to verify it fails**
|
- [ ] **Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts --test-name-pattern "Comet bootstrap candidates"`
|
Run:
|
||||||
Expected: FAIL because the current search extractor only understands legacy `marketplace_search` shapes.
|
`bun test packages/core/test/facebook-core.test.ts --test-name-pattern "Comet bootstrap candidates"`
|
||||||
|
Expected: FAIL because the current search extractor only understands legacy
|
||||||
|
`marketplace_search` shapes.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
Replace the search extraction internals in `extractFacebookMarketplaceData()` with candidate scoring like this:
|
Replace the search extraction internals in `extractFacebookMarketplaceData()` with
|
||||||
|
candidate scoring like this:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
function findSearchEdges(candidate: unknown): FacebookEdge[] | null {
|
function findSearchEdges(candidate: unknown): FacebookEdge[] | null {
|
||||||
@@ -383,7 +414,8 @@ export function extractFacebookMarketplaceData(htmlString: HTMLString): Facebook
|
|||||||
|
|
||||||
- [ ] **Step 4: Run test to verify it passes**
|
- [ ] **Step 4: Run test to verify it passes**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts packages/core/test/facebook-integration.test.ts`
|
Run:
|
||||||
|
`bun test packages/core/test/facebook-core.test.ts packages/core/test/facebook-integration.test.ts`
|
||||||
Expected: PASS for the rewritten search fixtures and existing unaffected tests.
|
Expected: PASS for the rewritten search fixtures and existing unaffected tests.
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
@@ -396,8 +428,11 @@ git commit -m "refactor: rewrite facebook search parser for comet bootstrap"
|
|||||||
### Task 4: Replace Item Parsing With Candidate Scoring
|
### Task 4: Replace Item Parsing With Candidate Scoring
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/test/facebook-core.test.ts`
|
- Modify: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts`
|
- Modify: `packages/core/src/scrapers/facebook.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/facebook-core.test.ts`
|
- Test: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing tests**
|
- [ ] **Step 1: Write the failing tests**
|
||||||
@@ -438,7 +473,8 @@ test("extracts item details from Comet permalink bootstrap candidates", () => {
|
|||||||
|
|
||||||
- [ ] **Step 2: Run test to verify it fails**
|
- [ ] **Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts --test-name-pattern "Comet permalink bootstrap"`
|
Run:
|
||||||
|
`bun test packages/core/test/facebook-core.test.ts --test-name-pattern "Comet permalink bootstrap"`
|
||||||
Expected: FAIL because the current item extractor depends on legacy permalink markers.
|
Expected: FAIL because the current item extractor depends on legacy permalink markers.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
@@ -491,8 +527,8 @@ export function extractFacebookItemData(htmlString: HTMLString): FacebookMarketp
|
|||||||
|
|
||||||
- [ ] **Step 4: Run test to verify it passes**
|
- [ ] **Step 4: Run test to verify it passes**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts`
|
Run: `bun test packages/core/test/facebook-core.test.ts` Expected: PASS for
|
||||||
Expected: PASS for current-shape item tests and remaining parser tests.
|
current-shape item tests and remaining parser tests.
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
@@ -504,8 +540,11 @@ git commit -m "refactor: rewrite facebook item parser for comet bootstrap"
|
|||||||
### Task 5: Add HTML Fallback Extraction
|
### Task 5: Add HTML Fallback Extraction
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/test/facebook-core.test.ts`
|
- Modify: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts`
|
- Modify: `packages/core/src/scrapers/facebook.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/facebook-core.test.ts`
|
- Test: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing tests**
|
- [ ] **Step 1: Write the failing tests**
|
||||||
@@ -549,8 +588,10 @@ test("falls back to rendered item HTML when bootstrap payloads are undecodable",
|
|||||||
|
|
||||||
- [ ] **Step 2: Run test to verify it fails**
|
- [ ] **Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts --test-name-pattern "falls back"`
|
Run:
|
||||||
Expected: FAIL because the extractor currently returns `null` without a structured candidate.
|
`bun test packages/core/test/facebook-core.test.ts --test-name-pattern "falls back"`
|
||||||
|
Expected: FAIL because the extractor currently returns `null` without a structured
|
||||||
|
candidate.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
@@ -607,11 +648,13 @@ function extractItemFallback(htmlString: HTMLString): FacebookMarketplaceItem |
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Then call these helpers as the last fallback inside `extractFacebookMarketplaceData()` and `extractFacebookItemData()`.
|
Then call these helpers as the last fallback inside `extractFacebookMarketplaceData()`
|
||||||
|
and `extractFacebookItemData()`.
|
||||||
|
|
||||||
- [ ] **Step 4: Run test to verify it passes**
|
- [ ] **Step 4: Run test to verify it passes**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts --test-name-pattern "falls back"`
|
Run:
|
||||||
|
`bun test packages/core/test/facebook-core.test.ts --test-name-pattern "falls back"`
|
||||||
Expected: PASS
|
Expected: PASS
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
@@ -624,8 +667,11 @@ git commit -m "refactor: add facebook html fallbacks"
|
|||||||
### Task 6: Wire Route-Aware Failures Into Entry Points
|
### Task 6: Wire Route-Aware Failures Into Entry Points
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/test/facebook-integration.test.ts`
|
- Modify: `packages/core/test/facebook-integration.test.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts`
|
- Modify: `packages/core/src/scrapers/facebook.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/facebook-integration.test.ts`
|
- Test: `packages/core/test/facebook-integration.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing tests**
|
- [ ] **Step 1: Write the failing tests**
|
||||||
@@ -664,8 +710,10 @@ test("returns null for unavailable item responses", async () => {
|
|||||||
|
|
||||||
- [ ] **Step 2: Run test to verify it fails**
|
- [ ] **Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-integration.test.ts --test-name-pattern "auth-gated|unavailable"`
|
Run:
|
||||||
Expected: FAIL because the entrypoints do not yet classify successful HTML responses by route/auth state.
|
`bun test packages/core/test/facebook-integration.test.ts --test-name-pattern "auth-gated|unavailable"`
|
||||||
|
Expected: FAIL because the entrypoints do not yet classify successful HTML responses by
|
||||||
|
route/auth state.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
@@ -690,12 +738,13 @@ if (itemResponseClass.kind === "unavailable") {
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Use the actual response URL from `fetchHtml` plumbing if that helper is extended to return both HTML and final URL; otherwise start by threading final URL support through the fetch helper in the same task.
|
Use the actual response URL from `fetchHtml` plumbing if that helper is extended to
|
||||||
|
return both HTML and final URL; otherwise start by threading final URL support through
|
||||||
|
the fetch helper in the same task.
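
As a rough illustration of the "final URL" plumbing mentioned above, the sketch below returns both the HTML body and the post-redirect URL using the standard Fetch API (`Response.url`). The helper name `fetchHtmlWithFinalUrl`, the `HtmlFetchResult` shape, and the injectable `fetchImpl` parameter are assumptions made for this example; the project's real `fetchHtml` signature is not shown in this diff and may differ.

```ts
// Hypothetical result shape; not taken from the repository.
interface HtmlFetchResult {
  html: string;
  finalUrl: string; // URL after redirects, or the requested URL if unknown
  status: number;
}

// Minimal fetch signature so a stub can be injected in tests.
type FetchLike = (input: string, init?: RequestInit) => Promise<Response>;

async function fetchHtmlWithFinalUrl(
  url: string,
  init: RequestInit = {},
  fetchImpl: FetchLike = fetch,
): Promise<HtmlFetchResult> {
  // Follow redirects so that response.url reflects the route actually served.
  const response = await fetchImpl(url, { redirect: "follow", ...init });
  const html = await response.text();
  // Response.url is an empty string for manually constructed responses,
  // so fall back to the requested URL in that case.
  return { html, finalUrl: response.url || url, status: response.status };
}
```

A caller could then pass `finalUrl` alongside `html` into whatever classification step decides between auth-gated, unavailable, and normal responses.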
|
||||||
|
|
||||||
- [ ] **Step 4: Run test to verify it passes**
|
- [ ] **Step 4: Run test to verify it passes**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-integration.test.ts`
|
Run: `bun test packages/core/test/facebook-integration.test.ts` Expected: PASS
|
||||||
Expected: PASS
|
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
@@ -707,19 +756,22 @@ git commit -m "refactor: handle facebook route-aware failure states"
|
|||||||
### Task 7: Run Full Verification And Live Probe
|
### Task 7: Run Full Verification And Live Probe
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts` if small cleanup is required
|
- Modify: `packages/core/src/scrapers/facebook.ts` if small cleanup is required
|
||||||
|
|
||||||
- Modify: `packages/core/test/facebook-core.test.ts` if small cleanup is required
|
- Modify: `packages/core/test/facebook-core.test.ts` if small cleanup is required
|
||||||
|
|
||||||
- Modify: `packages/core/test/facebook-integration.test.ts` if small cleanup is required
|
- Modify: `packages/core/test/facebook-integration.test.ts` if small cleanup is required
|
||||||
|
|
||||||
- [ ] **Step 1: Run focused Facebook tests**
|
- [ ] **Step 1: Run focused Facebook tests**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts packages/core/test/facebook-integration.test.ts`
|
Run:
|
||||||
|
`bun test packages/core/test/facebook-core.test.ts packages/core/test/facebook-integration.test.ts`
|
||||||
Expected: PASS
|
Expected: PASS
|
||||||
|
|
||||||
- [ ] **Step 2: Run broader core tests**
|
- [ ] **Step 2: Run broader core tests**
|
||||||
|
|
||||||
Run: `bun test packages/core/test`
|
Run: `bun test packages/core/test` Expected: PASS
|
||||||
Expected: PASS
|
|
||||||
|
|
||||||
- [ ] **Step 3: Run live authenticated Facebook probe**
|
- [ ] **Step 3: Run live authenticated Facebook probe**
|
||||||
|
|
||||||
@@ -742,11 +794,14 @@ if (results[0]?.url) {
|
|||||||
Expected:
|
Expected:
|
||||||
|
|
||||||
- search returns at least one result
|
- search returns at least one result
|
||||||
- item fetch returns non-null for the first live result when the route is not stale/unavailable
|
|
||||||
|
- item fetch returns non-null for the first live result when the route is not
|
||||||
|
stale/unavailable
|
||||||
|
|
||||||
- [ ] **Step 4: Make any minimal cleanup needed to keep tests and live probe green**
|
- [ ] **Step 4: Make any minimal cleanup needed to keep tests and live probe green**
|
||||||
|
|
||||||
If cleanup is needed, keep it limited to naming, dead-code removal caused by the rewrite, or small parser corrections directly exposed by the verification commands.
|
If cleanup is needed, keep it limited to naming, dead-code removal caused by the
|
||||||
|
rewrite, or small parser corrections directly exposed by the verification commands.
|
||||||
|
|
||||||
- [ ] **Step 5: Re-run verification**
|
- [ ] **Step 5: Re-run verification**
|
||||||
|
|
||||||
@@ -767,6 +822,11 @@ git commit -m "refactor: complete facebook comet scraper rewrite"
|
|||||||
|
|
||||||
## Self-Review
|
## Self-Review
|
||||||
|
|
||||||
- Spec coverage: the plan covers classification, route-aware search parsing, route-aware item parsing, HTML fallbacks, explicit failure-state handling, test replacement, and live verification.
|
- Spec coverage: the plan covers classification, route-aware search parsing, route-aware
|
||||||
- Placeholder scan: no `TODO`, `TBD`, or unspecified “handle appropriately” steps remain.
|
item parsing, HTML fallbacks, explicit failure-state handling, test replacement, and
|
||||||
- Type consistency: all planned functions and types use the same names across tasks: `classifyFacebookResponse`, `extractFacebookBootstrapCandidates`, `extractFacebookMarketplaceData`, and `extractFacebookItemData`.
|
live verification.
|
||||||
|
- Placeholder scan: no `TODO`, `TBD`, or unspecified “handle appropriately” steps
|
||||||
|
remain.
|
||||||
|
- Type consistency: all planned functions and types use the same names across tasks:
|
||||||
|
`classifyFacebookResponse`, `extractFacebookBootstrapCandidates`,
|
||||||
|
`extractFacebookMarketplaceData`, and `extractFacebookItemData`.
|
||||||
|
|||||||
@@ -1,63 +1,75 @@
|
|||||||
# Unstable Listing Mode Implementation Plan
|
# Unstable Listing Mode Implementation Plan
|
||||||
|
|
||||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
> **For agentic workers:** REQUIRED SUB-SKILL: Use
|
||||||
|
> superpowers:subagent-driven-development (recommended) or superpowers:executing-plans
|
||||||
|
> to implement this plan task-by-task.
|
||||||
|
> Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||||
|
|
||||||
**Goal:** Add an optional shared mode across Facebook, eBay, and Kijiji that moves listings priced below 80% of the median into `unstableResults`, while preserving current default response shapes.
|
**Goal:** Add an optional shared mode across Facebook, eBay, and Kijiji that moves
|
||||||
|
listings priced below 80% of the median into `unstableResults`, while preserving current
|
||||||
|
default response shapes.
|
||||||
|
|
||||||
**Architecture:** Introduce a shared generic classifier in `packages/core` that splits any listing array into `results` and `unstableResults` using the same median-based rule. Then thread one opt-in flag through the scraper entrypoints, API routes, and MCP tool definitions so all surfaces expose the same behavior without changing existing defaults.
|
**Architecture:** Introduce a shared generic classifier in `packages/core` that splits
|
||||||
|
any listing array into `results` and `unstableResults` using the same median-based rule.
|
||||||
|
Then thread one opt-in flag through the scraper entrypoints, API routes, and MCP tool
|
||||||
|
definitions so all surfaces expose the same behavior without changing existing defaults.
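
The median-based rule described in the Goal and Architecture paragraphs can be sketched roughly as follows. This is an illustration only, not the plan's `classifyUnstableListings` implementation (its body is elided in this diff). The `PricedListing` shape, the function name `splitByMedianCutoff`, the strict "below cutoff" comparison, and the choice to keep unpriced listings in `results` are all assumptions.

```ts
// Hypothetical listing shape; the real shared types are not shown in this diff.
interface PricedListing {
  title: string;
  price: number | null; // assumed: numeric price, null when unknown
}

interface BucketedListings<T> {
  results: T[];
  unstableResults: T[];
}

// Median of a non-empty array (mean of the two middle values for even lengths).
function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
}

function isUsablePrice(price: number | null): price is number {
  return typeof price === "number" && Number.isFinite(price);
}

// Moves listings priced below `ratio` times the median into `unstableResults`.
function splitByMedianCutoff<T extends PricedListing>(
  listings: T[],
  ratio = 0.8,
): BucketedListings<T> {
  const prices = listings.map((l) => l.price).filter(isUsablePrice);
  // With no usable prices there is no median, so nothing is flagged.
  if (prices.length === 0) return { results: [...listings], unstableResults: [] };

  const cutoff = median(prices) * ratio;
  const results: T[] = [];
  const unstableResults: T[] = [];
  for (const listing of listings) {
    if (isUsablePrice(listing.price) && listing.price < cutoff) {
      unstableResults.push(listing);
    } else {
      results.push(listing);
    }
  }
  return { results, unstableResults };
}
```

Keeping the split generic over `T` means the same function could serve Facebook, eBay, and Kijiji listings, provided each exposes a comparable price field.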
|
||||||
|
|
||||||
**Tech Stack:** Bun, TypeScript, Bun test, workspace packages, JSON-RPC MCP server
|
**Tech Stack:** Bun, TypeScript, Bun test, workspace packages, JSON-RPC MCP server
|
||||||
|
|
||||||
---
|
* * *
|
||||||
|
|
||||||
## File Map
|
## File Map
|
||||||
|
|
||||||
- Create: `packages/core/src/utils/unstable.ts`
|
- Create: `packages/core/src/utils/unstable.ts` Purpose: shared generic median/cutoff
|
||||||
Purpose: shared generic median/cutoff classifier for listing arrays.
|
classifier for listing arrays.
|
||||||
- Modify: `packages/core/src/types/common.ts`
|
- Modify: `packages/core/src/types/common.ts` Purpose: add shared mode types used by
|
||||||
Purpose: add shared mode types used by scrapers and adapters.
|
scrapers and adapters.
|
||||||
- Modify: `packages/core/src/index.ts`
|
- Modify: `packages/core/src/index.ts` Purpose: export the new shared classifier/types.
|
||||||
Purpose: export the new shared classifier/types.
|
- Modify: `packages/core/src/scrapers/facebook.ts` Purpose: add the optional mode flag
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts`
|
and return bucketed results when enabled.
|
||||||
Purpose: add the optional mode flag and return bucketed results when enabled.
|
- Modify: `packages/core/src/scrapers/ebay.ts` Purpose: add the optional mode flag and
|
||||||
- Modify: `packages/core/src/scrapers/ebay.ts`
|
return bucketed results when enabled.
|
||||||
Purpose: add the optional mode flag and return bucketed results when enabled.
|
- Modify: `packages/core/src/scrapers/kijiji.ts` Purpose: add the optional mode flag and
|
||||||
- Modify: `packages/core/src/scrapers/kijiji.ts`
|
return bucketed results when enabled.
|
||||||
Purpose: add the optional mode flag and return bucketed results when enabled.
|
- Create: `packages/core/test/unstable-listing-mode.test.ts` Purpose: lock the shared
|
||||||
- Create: `packages/core/test/unstable-listing-mode.test.ts`
|
classifier behavior with direct unit tests.
|
||||||
Purpose: lock the shared classifier behavior with direct unit tests.
|
- Modify: `packages/core/test/facebook-core.test.ts` Purpose: prove Facebook preserves
|
||||||
- Modify: `packages/core/test/facebook-core.test.ts`
|
default arrays and returns buckets when enabled.
|
||||||
Purpose: prove Facebook preserves default arrays and returns buckets when enabled.
|
- Modify: `packages/core/test/ebay-core.test.ts` Purpose: prove eBay preserves default
|
||||||
- Modify: `packages/core/test/ebay-core.test.ts`
|
arrays and returns buckets when enabled.
|
||||||
Purpose: prove eBay preserves default arrays and returns buckets when enabled.
|
- Modify: `packages/core/test/kijiji-core.test.ts` Purpose: prove Kijiji preserves
|
||||||
- Modify: `packages/core/test/kijiji-core.test.ts`
|
default arrays and returns buckets when enabled.
|
||||||
Purpose: prove Kijiji preserves default arrays and returns buckets when enabled.
|
- Modify: `packages/api-server/src/routes/facebook.ts` Purpose: expose a shared opt-in
|
||||||
- Modify: `packages/api-server/src/routes/facebook.ts`
|
query parameter and preserve default response shape.
|
||||||
Purpose: expose a shared opt-in query parameter and preserve default response shape.
|
- Modify: `packages/api-server/src/routes/ebay.ts` Purpose: expose the same query
|
||||||
- Modify: `packages/api-server/src/routes/ebay.ts`
|
parameter and preserve default response shape.
|
||||||
Purpose: expose the same query parameter and preserve default response shape.
|
- Modify: `packages/api-server/src/routes/kijiji.ts` Purpose: expose the same query
|
||||||
- Modify: `packages/api-server/src/routes/kijiji.ts`
|
parameter and preserve default response shape.
|
||||||
Purpose: expose the same query parameter and preserve default response shape.
|
- Modify: `packages/api-server/test/routes.test.ts` Purpose: verify route forwarding and
|
||||||
- Modify: `packages/api-server/test/routes.test.ts`
|
route response-shape switching.
|
||||||
Purpose: verify route forwarding and route response-shape switching.
|
- Modify: `packages/mcp-server/src/protocol/tools.ts` Purpose: document the optional
|
||||||
- Modify: `packages/mcp-server/src/protocol/tools.ts`
|
unstable mode in all search tools.
|
||||||
Purpose: document the optional unstable mode in all search tools.
|
- Modify: `packages/mcp-server/src/protocol/handler.ts` Purpose: forward the optional
|
||||||
- Modify: `packages/mcp-server/src/protocol/handler.ts`
|
mode to API routes for all search tools.
|
||||||
Purpose: forward the optional mode to API routes for all search tools.
|
- Modify: `packages/mcp-server/test/protocol.test.ts` Purpose: verify MCP tool metadata
|
||||||
- Modify: `packages/mcp-server/test/protocol.test.ts`
|
and forwarded URLs include the new option.
|
||||||
Purpose: verify MCP tool metadata and forwarded URLs include the new option.
|
|
||||||
|
|
||||||
### Task 1: Add the shared unstable-listing classifier
|
### Task 1: Add the shared unstable-listing classifier
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Create: `packages/core/src/utils/unstable.ts`
|
- Create: `packages/core/src/utils/unstable.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/src/types/common.ts`
|
- Modify: `packages/core/src/types/common.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/src/index.ts`
|
- Modify: `packages/core/src/index.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/unstable-listing-mode.test.ts`
|
- Test: `packages/core/test/unstable-listing-mode.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing test**
|
- [ ] **Step 1: Write the failing test**
|
||||||
|
|
||||||
Create `packages/core/test/unstable-listing-mode.test.ts` with focused shared-behavior coverage:
|
Create `packages/core/test/unstable-listing-mode.test.ts` with focused shared-behavior
|
||||||
|
coverage:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
import { describe, expect, test } from "bun:test";
|
import { describe, expect, test } from "bun:test";
|
||||||
@@ -127,8 +139,8 @@ describe("classifyUnstableListings", () => {
|
|||||||
|
|
||||||
- [ ] **Step 2: Run test to verify it fails**
|
- [ ] **Step 2: Run test to verify it fails**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/unstable-listing-mode.test.ts`
|
Run: `bun test packages/core/test/unstable-listing-mode.test.ts` Expected: FAIL because
|
||||||
Expected: FAIL because `classifyUnstableListings` and the shared mode types do not exist yet.
|
`classifyUnstableListings` and the shared mode types do not exist yet.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
@@ -202,8 +214,8 @@ export { classifyUnstableListings } from "./utils/unstable";
|
|||||||
|
|
||||||
- [ ] **Step 4: Run test to verify it passes**
|
- [ ] **Step 4: Run test to verify it passes**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/unstable-listing-mode.test.ts`
|
Run: `bun test packages/core/test/unstable-listing-mode.test.ts` Expected: PASS with 4
|
||||||
Expected: PASS with 4 passing tests.
|
passing tests.
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
@@ -215,16 +227,24 @@ git commit -m "feat: add shared unstable listing classifier"
|
|||||||
### Task 2: Thread the optional mode through all core scrapers
|
### Task 2: Thread the optional mode through all core scrapers
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts`
|
- Modify: `packages/core/src/scrapers/facebook.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/ebay.ts`
|
- Modify: `packages/core/src/scrapers/ebay.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/kijiji.ts`
|
- Modify: `packages/core/src/scrapers/kijiji.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/test/facebook-core.test.ts`
|
- Modify: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/test/ebay-core.test.ts`
|
- Modify: `packages/core/test/ebay-core.test.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/test/kijiji-core.test.ts`
|
- Modify: `packages/core/test/kijiji-core.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing tests**
|
- [ ] **Step 1: Write the failing tests**
|
||||||
|
|
||||||
Add one focused opt-in test per scraper. Use the new shared classifier through the public scraper entrypoints instead of testing internal helpers.
|
Add one focused opt-in test per scraper.
|
||||||
|
Use the new shared classifier through the public scraper entrypoints instead of testing
|
||||||
|
internal helpers.
|
||||||
|
|
||||||
In `packages/core/test/facebook-core.test.ts`, add:
|
In `packages/core/test/facebook-core.test.ts`, add:
|
||||||
|
|
||||||
@@ -286,7 +306,8 @@ test("fetchKijijiItems returns stable and unstable buckets when unstable mode is
|
|||||||
});
|
});
|
||||||
```
|
```
|
||||||
|
|
||||||
Also add one default-mode assertion in one existing scraper test file, for example in `packages/core/test/facebook-core.test.ts`:
|
Also add one default-mode assertion in one existing scraper test file, for example in
|
||||||
|
`packages/core/test/facebook-core.test.ts`:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
test("fetchFacebookItems keeps returning an array by default", async () => {
|
test("fetchFacebookItems keeps returning an array by default", async () => {
|
||||||
@@ -307,8 +328,10 @@ test("fetchFacebookItems keeps returning an array by default", async () => {
|
|||||||
|
|
||||||
- [ ] **Step 2: Run tests to verify they fail**
|
- [ ] **Step 2: Run tests to verify they fail**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/facebook-core.test.ts packages/core/test/ebay-core.test.ts packages/core/test/kijiji-core.test.ts`
|
Run:
|
||||||
Expected: FAIL because the scraper signatures do not yet accept the new option and still always return arrays.
|
`bun test packages/core/test/facebook-core.test.ts packages/core/test/ebay-core.test.ts packages/core/test/kijiji-core.test.ts`
|
||||||
|
Expected: FAIL because the scraper signatures do not yet accept the new option and still
|
||||||
|
always return arrays.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
@@ -322,7 +345,8 @@ import {
|
|||||||
} from "../index";
|
} from "../index";
|
||||||
```
|
```
|
||||||
|
|
||||||
In `packages/core/src/scrapers/facebook.ts`, extend the default export signature and branch at the end:
|
In `packages/core/src/scrapers/facebook.ts`, extend the default export signature and
|
||||||
|
branch at the end:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
export default async function fetchFacebookItems(
|
export default async function fetchFacebookItems(
|
||||||
@@ -371,7 +395,8 @@ export default async function fetchEbayItems(
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
In `packages/core/src/scrapers/kijiji.ts`, add the same final argument after `listingOptions`:
|
In `packages/core/src/scrapers/kijiji.ts`, add the same final argument after
|
||||||
|
`listingOptions`:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
export default async function fetchKijijiItems(
|
export default async function fetchKijijiItems(
|
||||||
@@ -392,12 +417,15 @@ export default async function fetchKijijiItems(
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
Keep the default branch untouched in all three files so existing callers still receive arrays.
|
Keep the default branch untouched in all three files so existing callers still receive
|
||||||
|
arrays.
|
||||||
|
|
||||||
- [ ] **Step 4: Run tests to verify they pass**
|
- [ ] **Step 4: Run tests to verify they pass**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/unstable-listing-mode.test.ts packages/core/test/facebook-core.test.ts packages/core/test/ebay-core.test.ts packages/core/test/kijiji-core.test.ts`
|
Run:
|
||||||
Expected: PASS, including the new opt-in bucket assertions and the default-array regression assertion.
|
`bun test packages/core/test/unstable-listing-mode.test.ts packages/core/test/facebook-core.test.ts packages/core/test/ebay-core.test.ts packages/core/test/kijiji-core.test.ts`
|
||||||
|
Expected: PASS, including the new opt-in bucket assertions and the default-array
|
||||||
|
regression assertion.
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
@@ -409,14 +437,19 @@ git commit -m "feat: add unstable mode to scraper results"
|
|||||||
### Task 3: Expose unstable mode in API routes
|
### Task 3: Expose unstable mode in API routes
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/api-server/src/routes/facebook.ts`
|
- Modify: `packages/api-server/src/routes/facebook.ts`
|
||||||
|
|
||||||
- Modify: `packages/api-server/src/routes/ebay.ts`
|
- Modify: `packages/api-server/src/routes/ebay.ts`
|
||||||
|
|
||||||
- Modify: `packages/api-server/src/routes/kijiji.ts`
|
- Modify: `packages/api-server/src/routes/kijiji.ts`
|
||||||
|
|
||||||
- Modify: `packages/api-server/test/routes.test.ts`
|
- Modify: `packages/api-server/test/routes.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing tests**
|
- [ ] **Step 1: Write the failing tests**
|
||||||
|
|
||||||
Extend `packages/api-server/test/routes.test.ts` with route-forwarding coverage for the new query parameter:
|
Extend `packages/api-server/test/routes.test.ts` with route-forwarding coverage for the
|
||||||
|
new query parameter:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
test("facebookRoute forwards unstableFilter=true to core", async () => {
|
test("facebookRoute forwards unstableFilter=true to core", async () => {
|
||||||
@@ -480,8 +513,8 @@ test("kijijiRoute forwards unstableFilter=true to core", async () => {
|
|||||||
|
|
||||||
- [ ] **Step 2: Run tests to verify they fail**
|
- [ ] **Step 2: Run tests to verify they fail**
|
||||||
|
|
||||||
Run: `bun test packages/api-server/test/routes.test.ts`
|
Run: `bun test packages/api-server/test/routes.test.ts` Expected: FAIL because the
|
||||||
Expected: FAIL because the routes do not yet parse or forward `unstableFilter`.
|
routes do not yet parse or forward `unstableFilter`.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
@@ -533,12 +566,14 @@ const items = await fetchKijijiItems(
|
|||||||
);
|
);
|
||||||
```
|
```
|
||||||
|
|
||||||
Do not add any response wrapper logic in the routes; simply return whatever the core scraper returns so the default array path remains unchanged.
|
Do not add any response wrapper logic in the routes; simply return whatever the core
|
||||||
|
scraper returns so the default array path remains unchanged.
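
One way a route might read the opt-in query parameter is sketched below, using the standard `URL` and `URLSearchParams` APIs. The helper name `parseUnstableFilter` and the rule that only the literal string `true` enables the mode are assumptions; the actual route code is elided in this diff.

```ts
// Hypothetical helper: returns true only for an explicit ?unstableFilter=true.
// Any other value, or a missing parameter, keeps the default array response.
function parseUnstableFilter(requestUrl: string): boolean {
  const value = new URL(requestUrl).searchParams.get("unstableFilter");
  return value === "true";
}
```

The resulting boolean would then be passed to the core scraper, and the route would return the scraper's output unchanged, which is what keeps the default response shape intact.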
|
||||||
|
|
||||||
- [ ] **Step 4: Run tests to verify they pass**
|
- [ ] **Step 4: Run tests to verify they pass**
|
||||||
|
|
||||||
Run: `bun test packages/api-server/test/routes.test.ts`
|
Run: `bun test packages/api-server/test/routes.test.ts` Expected: PASS, including
|
||||||
Expected: PASS, including existing cookie-parameter regression tests and the new unstable-mode forwarding assertions.
|
existing cookie-parameter regression tests and the new unstable-mode forwarding
|
||||||
|
assertions.
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
@@ -550,13 +585,17 @@ git commit -m "feat: expose unstable mode in api routes"
|
|||||||
### Task 4: Document and forward unstable mode in MCP tools
|
### Task 4: Document and forward unstable mode in MCP tools
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/mcp-server/src/protocol/tools.ts`
|
- Modify: `packages/mcp-server/src/protocol/tools.ts`
|
||||||
|
|
||||||
- Modify: `packages/mcp-server/src/protocol/handler.ts`
|
- Modify: `packages/mcp-server/src/protocol/handler.ts`
|
||||||
|
|
||||||
- Modify: `packages/mcp-server/test/protocol.test.ts`
|
- Modify: `packages/mcp-server/test/protocol.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write the failing tests**
|
- [ ] **Step 1: Write the failing tests**
|
||||||
|
|
||||||
Extend `packages/mcp-server/test/protocol.test.ts` with metadata and forwarding coverage:
|
Extend `packages/mcp-server/test/protocol.test.ts` with metadata and forwarding
|
||||||
|
coverage:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
test("search tools document unstable listing mode", () => {
|
test("search tools document unstable listing mode", () => {
|
||||||
@@ -601,12 +640,14 @@ Mirror the forwarding assertion for `search_kijiji` and `search_ebay` in the sam
|
|||||||
|
|
||||||
- [ ] **Step 2: Run tests to verify they fail**
|
- [ ] **Step 2: Run tests to verify they fail**
|
||||||
|
|
||||||
Run: `bun test packages/mcp-server/test/protocol.test.ts`
|
Run: `bun test packages/mcp-server/test/protocol.test.ts` Expected: FAIL because the
|
||||||
Expected: FAIL because the tools do not yet describe `unstableFilter` and the handler does not append it to API URLs.
|
tools do not yet describe `unstableFilter` and the handler does not append it to API
|
||||||
|
URLs.
|
||||||
|
|
||||||
- [ ] **Step 3: Write minimal implementation**
|
- [ ] **Step 3: Write minimal implementation**
|
||||||
|
|
||||||
In `packages/mcp-server/src/protocol/tools.ts`, add the same optional property to all three tools:
|
In `packages/mcp-server/src/protocol/tools.ts`, add the same optional property to all
|
||||||
|
three tools:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
unstableFilter: {
|
unstableFilter: {
|
||||||
@@ -617,7 +658,8 @@ unstableFilter: {
|
|||||||
},
|
},
|
||||||
```
|
```
|
||||||
|
|
||||||
In `packages/mcp-server/src/protocol/handler.ts`, append the shared flag in each search branch:
|
In `packages/mcp-server/src/protocol/handler.ts`, append the shared flag in each search
|
||||||
|
branch:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
if (args.unstableFilter !== undefined) {
|
if (args.unstableFilter !== undefined) {
|
||||||
@@ -629,8 +671,8 @@ Add that snippet to the `search_kijiji`, `search_facebook`, and `search_ebay` br
|
|||||||
|
|
||||||
- [ ] **Step 4: Run tests to verify they pass**
|
- [ ] **Step 4: Run tests to verify they pass**
|
||||||
|
|
||||||
Run: `bun test packages/mcp-server/test/protocol.test.ts`
|
Run: `bun test packages/mcp-server/test/protocol.test.ts` Expected: PASS, including the
|
||||||
Expected: PASS, including the new tool-schema assertions and URL-forwarding assertions.
|
new tool-schema assertions and URL-forwarding assertions.
|
||||||
|
|
||||||
- [ ] **Step 5: Commit**
|
- [ ] **Step 5: Commit**
|
||||||
|
|
||||||
@@ -642,21 +684,23 @@ git commit -m "docs: expose unstable mode in mcp tools"
|
|||||||
### Task 5: Verify the full cross-package feature end to end
|
### Task 5: Verify the full cross-package feature end to end
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- No code changes expected.
|
- No code changes expected.
|
||||||
|
|
||||||
- [ ] **Step 1: Run the focused package tests**
|
- [ ] **Step 1: Run the focused package tests**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/unstable-listing-mode.test.ts packages/core/test/facebook-core.test.ts packages/core/test/ebay-core.test.ts packages/core/test/kijiji-core.test.ts packages/api-server/test/routes.test.ts packages/mcp-server/test/protocol.test.ts`
|
Run:
|
||||||
|
`bun test packages/core/test/unstable-listing-mode.test.ts packages/core/test/facebook-core.test.ts packages/core/test/ebay-core.test.ts packages/core/test/kijiji-core.test.ts packages/api-server/test/routes.test.ts packages/mcp-server/test/protocol.test.ts`
|
||||||
Expected: PASS with zero failing tests.
|
Expected: PASS with zero failing tests.
|
||||||
|
|
||||||
- [ ] **Step 2: Run the broader workspace verification**
|
- [ ] **Step 2: Run the broader workspace verification**
|
||||||
|
|
||||||
Run: `bun run ci`
|
Run: `bun run ci` Expected: PASS with clean workspace validation.
|
||||||
Expected: PASS with clean workspace validation.
|
|
||||||
|
|
||||||
- [ ] **Step 3: Commit verification-only follow-ups if needed**
|
- [ ] **Step 3: Commit verification-only follow-ups if needed**
|
||||||
|
|
||||||
If verification forced any tiny fixes, commit them immediately after the fix with a focused message, for example:
|
If verification forced any tiny fixes, commit them immediately after the fix with a
|
||||||
|
focused message, for example:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
git add <exact files changed>
|
git add <exact files changed>
|
||||||
@@ -667,6 +711,8 @@ If no files changed during verification, skip this commit step.
|
|||||||
|
|
||||||
## Self-Review
|
## Self-Review
|
||||||
|
|
||||||
- Spec coverage: shared classifier, all three scrapers, API exposure, MCP documentation, and tests are each mapped to a task.
|
- Spec coverage: shared classifier, all three scrapers, API exposure, MCP documentation,
|
||||||
- Placeholder scan: no `TODO`, `TBD`, or "write tests later" placeholders remain.
|
and tests are each mapped to a task.
|
||||||
- Type consistency: the plan uses one shared flag name, `unstableFilter`, and one shared core option, `hideUnstableResults`, across all tasks.
|
- Placeholder scan: no `TODO`, `TBD`, or “write tests later” placeholders remain.
|
||||||
|
- Type consistency: the plan uses one shared flag name, `unstableFilter`, and one shared
|
||||||
|
core option, `hideUnstableResults`, across all tasks.
|
||||||
|
|||||||
@@ -1,14 +1,22 @@
|
|||||||
# Code Smell Cleanup Implementation Plan
|
# Code Smell Cleanup Implementation Plan
|
||||||
|
|
||||||
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
|
> **For agentic workers:** REQUIRED SUB-SKILL: Use
|
||||||
|
> superpowers:subagent-driven-development (recommended) or superpowers:executing-plans
|
||||||
|
> to implement this plan task-by-task.
|
||||||
|
> Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||||
|
|
||||||
**Goal:** Fix concrete code smells found in repo review without changing marketplace behavior or relaxing lint/type rules.
|
**Goal:** Fix concrete code smells found in repo review without changing marketplace
|
||||||
|
behavior or relaxing lint/type rules.
|
||||||
|
|
||||||
**Architecture:** Start with correctness bugs at transport boundaries, then remove secret-leaking query/log paths, then reduce duplicate parsing and HTTP code. Keep marketplace behavior inside `packages/core`, API routes thin, and MCP as JSON-RPC transport only.
|
**Architecture:** Start with correctness bugs at transport boundaries, then remove
|
||||||
|
secret-leaking query/log paths, then reduce duplicate parsing and HTTP code.
|
||||||
|
Keep marketplace behavior inside `packages/core`, API routes thin, and MCP as JSON-RPC
|
||||||
|
transport only.
|
||||||
|
|
||||||
**Tech Stack:** Bun `1.3.13`, TypeScript strict mode, `bun:test`, Biome, framework-free `Bun.serve` adapters.
|
**Tech Stack:** Bun `1.3.13`, TypeScript strict mode, `bun:test`, Biome, framework-free
|
||||||
|
`Bun.serve` adapters.
|
||||||
|
|
||||||
---
|
* * *
|
||||||
|
|
||||||
## File Structure
|
## File Structure
|
||||||
|
|
||||||
@@ -18,7 +26,8 @@
|
|||||||
- Extract shared API call/query-param helpers.
|
- Extract shared API call/query-param helpers.
|
||||||
- Stop logging full URLs with cookie-bearing params.
|
- Stop logging full URLs with cookie-bearing params.
|
||||||
- Modify: `packages/mcp-server/src/protocol/tools.ts`
|
- Modify: `packages/mcp-server/src/protocol/tools.ts`
|
||||||
- Remove `cookies` from Kijiji MCP schema or mark it as unsupported after API route no longer accepts it.
|
- Remove `cookies` from Kijiji MCP schema or mark it as unsupported after API route no
|
||||||
|
longer accepts it.
|
||||||
- Modify: `packages/mcp-server/test/protocol.test.ts`
|
- Modify: `packages/mcp-server/test/protocol.test.ts`
|
||||||
- Add coverage for `id: 0`.
|
- Add coverage for `id: 0`.
|
||||||
- Add coverage for zero-valued numeric args.
|
- Add coverage for zero-valued numeric args.
|
||||||
@@ -53,12 +62,15 @@
|
|||||||
- Replace `console.error` with repo logger.
|
- Replace `console.error` with repo logger.
|
||||||
- Modify: `packages/core/test/setup.ts`
|
- Modify: `packages/core/test/setup.ts`
|
||||||
- Remove redundant comments and make fetch-mock policy explicit.
|
- Remove redundant comments and make fetch-mock policy explicit.
|
||||||
- Test: existing package tests under `packages/core/test`, `packages/api-server/test`, `packages/mcp-server/test`.
|
- Test: existing package tests under `packages/core/test`, `packages/api-server/test`,
|
||||||
|
`packages/mcp-server/test`.
|
||||||
|
|
||||||
## Task 1: Fix MCP JSON-RPC `id: 0` Handling
|
## Task 1: Fix MCP JSON-RPC `id: 0` Handling
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/mcp-server/src/protocol/handler.ts:61-74`
|
- Modify: `packages/mcp-server/src/protocol/handler.ts:61-74`
|
||||||
|
|
||||||
- Test: `packages/mcp-server/test/protocol.test.ts`
|
- Test: `packages/mcp-server/test/protocol.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write failing test for `id: 0`**
|
- [ ] **Step 1: Write failing test for `id: 0`**
|
||||||
@@ -137,7 +149,9 @@ git commit -m "fix: preserve zero json-rpc ids"
|
|||||||
## Task 2: Preserve Zero Numeric MCP Arguments
|
## Task 2: Preserve Zero Numeric MCP Arguments
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/mcp-server/src/protocol/handler.ts:107-216`
|
- Modify: `packages/mcp-server/src/protocol/handler.ts:107-216`
|
||||||
|
|
||||||
- Test: `packages/mcp-server/test/protocol.test.ts`
|
- Test: `packages/mcp-server/test/protocol.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write failing tests for zero-valued params**
|
- [ ] **Step 1: Write failing tests for zero-valued params**
|
||||||
@@ -288,10 +302,15 @@ git commit -m "fix: forward zero-valued mcp params"
|
|||||||
## Task 3: Remove Cookie Query Path From MCP and API
|
## Task 3: Remove Cookie Query Path From MCP and API
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/mcp-server/src/protocol/tools.ts:55-59`
|
- Modify: `packages/mcp-server/src/protocol/tools.ts:55-59`
|
||||||
|
|
||||||
- Modify: `packages/mcp-server/src/protocol/handler.ts:119`
|
- Modify: `packages/mcp-server/src/protocol/handler.ts:119`
|
||||||
|
|
||||||
- Modify: `packages/api-server/src/routes/kijiji.ts:65`
|
- Modify: `packages/api-server/src/routes/kijiji.ts:65`
|
||||||
|
|
||||||
- Test: `packages/mcp-server/test/protocol.test.ts`
|
- Test: `packages/mcp-server/test/protocol.test.ts`
|
||||||
|
|
||||||
- Test: `packages/api-server/test/routes.test.ts`
|
- Test: `packages/api-server/test/routes.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Update MCP tests for no cookie exposure**
|
- [ ] **Step 1: Update MCP tests for no cookie exposure**
|
||||||
@@ -341,7 +360,8 @@ test("search_kijiji should not forward cookies query parameters", async () => {
|
|||||||
|
|
||||||
- [ ] **Step 2: Update API test expectation**
|
- [ ] **Step 2: Update API test expectation**
|
||||||
|
|
||||||
In `packages/api-server/test/routes.test.ts`, replace `kijijiRoute passes cookies query parameter` test with:
|
In `packages/api-server/test/routes.test.ts`, replace
|
||||||
|
`kijijiRoute passes cookies query parameter` test with:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
test("kijijiRoute ignores cookies query parameter", async () => {
|
test("kijijiRoute ignores cookies query parameter", async () => {
|
||||||
@@ -374,13 +394,15 @@ test("kijijiRoute ignores cookies query parameter", async () => {
|
|||||||
|
|
||||||
- [ ] **Step 3: Run tests to verify failure**
|
- [ ] **Step 3: Run tests to verify failure**
|
||||||
|
|
||||||
Run: `bun test packages/mcp-server/test/protocol.test.ts packages/api-server/test/routes.test.ts`
|
Run:
|
||||||
|
`bun test packages/mcp-server/test/protocol.test.ts packages/api-server/test/routes.test.ts`
|
||||||
|
|
||||||
Expected: FAIL because Kijiji cookie query is still exposed/forwarded.
|
Expected: FAIL because Kijiji cookie query is still exposed/forwarded.
|
||||||
|
|
||||||
- [ ] **Step 4: Remove Kijiji cookie schema and forwarding**
|
- [ ] **Step 4: Remove Kijiji cookie schema and forwarding**
|
||||||
|
|
||||||
Delete `cookies` property from `search_kijiji` in `packages/mcp-server/src/protocol/tools.ts`.
|
Delete `cookies` property from `search_kijiji` in
|
||||||
|
`packages/mcp-server/src/protocol/tools.ts`.
|
||||||
|
|
||||||
Delete this line from `packages/mcp-server/src/protocol/handler.ts`:
|
Delete this line from `packages/mcp-server/src/protocol/handler.ts`:
|
||||||
|
|
||||||
@@ -396,7 +418,8 @@ cookies: reqUrl.searchParams.get("cookies") || undefined,
|
|||||||
|
|
||||||
- [ ] **Step 5: Run tests**
|
- [ ] **Step 5: Run tests**
|
||||||
|
|
||||||
Run: `bun test packages/mcp-server/test/protocol.test.ts packages/api-server/test/routes.test.ts`
|
Run:
|
||||||
|
`bun test packages/mcp-server/test/protocol.test.ts packages/api-server/test/routes.test.ts`
|
||||||
|
|
||||||
Expected: PASS.
|
Expected: PASS.
|
||||||
|
|
||||||
@@ -410,10 +433,15 @@ git commit -m "fix: remove cookie query forwarding"
|
|||||||
## Task 4: Add Strict API Integer Parsing
|
## Task 4: Add Strict API Integer Parsing
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Create: `packages/api-server/src/routes/helpers.ts`
|
- Create: `packages/api-server/src/routes/helpers.ts`
|
||||||
|
|
||||||
- Modify: `packages/api-server/src/routes/facebook.ts`
|
- Modify: `packages/api-server/src/routes/facebook.ts`
|
||||||
|
|
||||||
- Modify: `packages/api-server/src/routes/ebay.ts`
|
- Modify: `packages/api-server/src/routes/ebay.ts`
|
||||||
|
|
||||||
- Modify: `packages/api-server/src/routes/kijiji.ts`
|
- Modify: `packages/api-server/src/routes/kijiji.ts`
|
||||||
|
|
||||||
- Test: `packages/api-server/test/routes.test.ts`
|
- Test: `packages/api-server/test/routes.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Write failing API validation tests**
|
- [ ] **Step 1: Write failing API validation tests**
|
||||||
@@ -560,7 +588,9 @@ git commit -m "fix: strictly parse route integers"
|
|||||||
## Task 5: De-Duplicate MCP API Calls
|
## Task 5: De-Duplicate MCP API Calls
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/mcp-server/src/protocol/handler.ts`
|
- Modify: `packages/mcp-server/src/protocol/handler.ts`
|
||||||
|
|
||||||
- Test: `packages/mcp-server/test/protocol.test.ts`
|
- Test: `packages/mcp-server/test/protocol.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Add regression test for successful tool result after helper extraction**
|
- [ ] **Step 1: Add regression test for successful tool result after helper extraction**
|
||||||
@@ -645,7 +675,8 @@ Use `"facebook"` and `"ebay"` in their branches.
|
|||||||
|
|
||||||
- [ ] **Step 4: Run MCP tests and build**
|
- [ ] **Step 4: Run MCP tests and build**
|
||||||
|
|
||||||
Run: `bun test packages/mcp-server/test/protocol.test.ts && bun run --cwd packages/mcp-server build`
|
Run:
|
||||||
|
`bun test packages/mcp-server/test/protocol.test.ts && bun run --cwd packages/mcp-server build`
|
||||||
|
|
||||||
Expected: PASS.
|
Expected: PASS.
|
||||||
|
|
||||||
@@ -659,11 +690,17 @@ git commit -m "refactor: share mcp api calls"
|
|||||||
## Task 6: Consolidate Core HTTP Fetching
|
## Task 6: Consolidate Core HTTP Fetching
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/src/utils/http.ts`
|
- Modify: `packages/core/src/utils/http.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/facebook.ts`
|
- Modify: `packages/core/src/scrapers/facebook.ts`
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/ebay.ts`
|
- Modify: `packages/core/src/scrapers/ebay.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/http.test.ts`
|
- Test: `packages/core/test/http.test.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/facebook-core.test.ts`
|
- Test: `packages/core/test/facebook-core.test.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/ebay-core.test.ts`
|
- Test: `packages/core/test/ebay-core.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Add shared HTTP test for response URL and deterministic jitter**
|
- [ ] **Step 1: Add shared HTTP test for response URL and deterministic jitter**
|
||||||
@@ -695,7 +732,8 @@ test("fetchHtml can return response URL", async () => {
|
|||||||
});
|
});
|
||||||
```
|
```
|
||||||
|
|
||||||
If current `Response.url` cannot be set in Bun tests, use a mocked object cast to `Response` instead:
|
If current `Response.url` cannot be set in Bun tests, use a mocked object cast to
|
||||||
|
`Response` instead:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
global.fetch = mock(() =>
|
global.fetch = mock(() =>
|
||||||
@@ -827,7 +865,8 @@ Update error property reads from `err.status` to `err.statusCode`.
|
|||||||
|
|
||||||
- [ ] **Step 5: Replace eBay direct fetch with shared helper**
|
- [ ] **Step 5: Replace eBay direct fetch with shared helper**
|
||||||
|
|
||||||
In `packages/core/src/scrapers/ebay.ts`, import `fetchHtml` and `HttpError` from `../utils/http`.
|
In `packages/core/src/scrapers/ebay.ts`, import `fetchHtml` and `HttpError` from
|
||||||
|
`../utils/http`.
|
||||||
|
|
||||||
Replace direct `fetch` block with:
|
Replace direct `fetch` block with:
|
||||||
|
|
||||||
@@ -845,7 +884,8 @@ logger.error(`Failed to fetch eBay search (${err.statusCode}): ${err.message}`);
|
|||||||
|
|
||||||
- [ ] **Step 6: Run core tests**
|
- [ ] **Step 6: Run core tests**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/http.test.ts packages/core/test/facebook-core.test.ts packages/core/test/ebay-core.test.ts`
|
Run:
|
||||||
|
`bun test packages/core/test/http.test.ts packages/core/test/facebook-core.test.ts packages/core/test/ebay-core.test.ts`
|
||||||
|
|
||||||
Expected: PASS.
|
Expected: PASS.
|
||||||
|
|
||||||
@@ -865,15 +905,20 @@ git commit -m "refactor: share scraper http fetching"
|
|||||||
## Task 7: Clean Kijiji Dead Code and Logging
|
## Task 7: Clean Kijiji Dead Code and Logging
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/src/scrapers/kijiji.ts`
|
- Modify: `packages/core/src/scrapers/kijiji.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/kijiji-core.test.ts`
|
- Test: `packages/core/test/kijiji-core.test.ts`
|
||||||
|
|
||||||
- Test: `packages/core/test/kijiji-integration.test.ts`
|
- Test: `packages/core/test/kijiji-integration.test.ts`
|
||||||
|
|
||||||
- [ ] **Step 1: Verify `_parseListing` has no callers**
|
- [ ] **Step 1: Verify `_parseListing` has no callers**
|
||||||
|
|
||||||
Run: `rg "_parseListing|parseListing" packages/core packages/api-server packages/mcp-server`
|
Run:
|
||||||
|
`rg "_parseListing|parseListing" packages/core packages/api-server packages/mcp-server`
|
||||||
|
|
||||||
Expected: only `_parseListing` definition appears. If any caller appears, stop and update this task to preserve behavior.
|
Expected: only `_parseListing` definition appears.
|
||||||
|
If any caller appears, stop and update this task to preserve behavior.
|
||||||
|
|
||||||
- [ ] **Step 2: Delete dead function**
|
- [ ] **Step 2: Delete dead function**
|
||||||
|
|
||||||
@@ -911,7 +956,8 @@ Replace `console.error(...)` calls with `logger.error(...)` preserving message t
|
|||||||
|
|
||||||
- [ ] **Step 4: Run Kijiji tests**
|
- [ ] **Step 4: Run Kijiji tests**
|
||||||
|
|
||||||
Run: `bun test packages/core/test/kijiji-core.test.ts packages/core/test/kijiji-integration.test.ts`
|
Run:
|
||||||
|
`bun test packages/core/test/kijiji-core.test.ts packages/core/test/kijiji-integration.test.ts`
|
||||||
|
|
||||||
Expected: PASS.
|
Expected: PASS.
|
||||||
|
|
||||||
@@ -925,7 +971,9 @@ git commit -m "refactor: clean kijiji scraper internals"
|
|||||||
## Task 8: Clean Test Setup Comments and Enforce Fetch Mocking
|
## Task 8: Clean Test Setup Comments and Enforce Fetch Mocking
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Modify: `packages/core/test/setup.ts`
|
- Modify: `packages/core/test/setup.ts`
|
||||||
|
|
||||||
- Test: core test suite
|
- Test: core test suite
|
||||||
|
|
||||||
- [ ] **Step 1: Update setup file**
|
- [ ] **Step 1: Update setup file**
|
||||||
@@ -942,7 +990,8 @@ global.fetch = (() => {
|
|||||||
|
|
||||||
Run: `bun test packages/core/test`
|
Run: `bun test packages/core/test`
|
||||||
|
|
||||||
Expected: PASS. If failures occur, fix individual tests by mocking `global.fetch` in `beforeEach` and restoring in `afterEach`.
|
Expected: PASS. If failures occur, fix individual tests by mocking `global.fetch` in
|
||||||
|
`beforeEach` and restoring in `afterEach`.
|
||||||
|
|
||||||
- [ ] **Step 3: Commit**
|
- [ ] **Step 3: Commit**
|
||||||
|
|
||||||
@@ -954,6 +1003,7 @@ git commit -m "test: require explicit fetch mocks"
|
|||||||
## Task 9: Final Verification
|
## Task 9: Final Verification
|
||||||
|
|
||||||
**Files:**
|
**Files:**
|
||||||
|
|
||||||
- Verify all touched packages.
|
- Verify all touched packages.
|
||||||
|
|
||||||
- [ ] **Step 1: Run full deterministic tests**
|
- [ ] **Step 1: Run full deterministic tests**
|
||||||
@@ -991,9 +1041,13 @@ git commit -m "chore: finish code smell cleanup"
|
|||||||
|
|
||||||
## Self-Review
|
## Self-Review
|
||||||
|
|
||||||
- Spec coverage: all reviewed smells are covered: JSON-RPC id bug, zero args, cookie query leak, strict integer parsing, duplicate route/MCP helper code, duplicate HTTP clients, dead Kijiji function, direct timers/logging, stale setup comments.
|
- Spec coverage: all reviewed smells are covered: JSON-RPC id bug, zero args, cookie
|
||||||
- Placeholder scan: no TBD/TODO/fill-in placeholders remain. Each task has target files, code snippets, commands, and expected results.
|
query leak, strict integer parsing, duplicate route/MCP helper code, duplicate HTTP
|
||||||
- Type consistency: route helper names, MCP helper names, and shared HTTP option names are used consistently across tasks.
|
clients, dead Kijiji function, direct timers/logging, stale setup comments.
|
||||||
|
- Placeholder scan: no TBD/TODO/fill-in placeholders remain.
|
||||||
|
Each task has target files, code snippets, commands, and expected results.
|
||||||
|
- Type consistency: route helper names, MCP helper names, and shared HTTP option names
|
||||||
|
are used consistently across tasks.
|
||||||
|
|
||||||
## Execution Handoff
|
## Execution Handoff
|
||||||
|
|
||||||
@@ -1001,5 +1055,7 @@ Plan complete and saved to `docs/superpowers/plans/2026-04-28-code-smell-cleanup
|
|||||||
|
|
||||||
Two execution options:
|
Two execution options:
|
||||||
|
|
||||||
1. Subagent-Driven (recommended) - dispatch fresh subagent per task, review between tasks, fast iteration.
|
1. Subagent-Driven (recommended) - dispatch fresh subagent per task, review between
|
||||||
2. Inline Execution - execute tasks in this session using executing-plans, batch execution with checkpoints.
|
tasks, fast iteration.
|
||||||
|
2. Inline Execution - execute tasks in this session using executing-plans, batch
|
||||||
|
execution with checkpoints.
|
||||||
|
|||||||
110
docs/superpowers/plans/2026-04-30-ebay-dollar-price-inputs.md
Normal file
@@ -0,0 +1,110 @@
|
|||||||
|
# Marketplace Dollar Price Inputs Implementation Plan
|
||||||
|
|
||||||
|
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to
|
||||||
|
> implement this plan task-by-task.
|
||||||
|
> Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||||
|
|
||||||
|
**Goal:** Make public marketplace price inputs use dollars while preserving core scraper
|
||||||
|
cent-based filtering.
|
||||||
|
|
||||||
|
**Architecture:** API server owns HTTP query parsing and converts dollar amounts to
|
||||||
|
cents before calling core.
|
||||||
|
MCP server keeps forwarding numeric dollar values as query params.
|
||||||
|
Core scraper internals remain unchanged because parsed listing prices already use cents.
|
||||||
|
This applies to eBay `minPrice`/`maxPrice` and Kijiji `priceMin`/`priceMax`; Facebook
|
||||||
|
exposes no price filter inputs.
|
||||||
|
|
||||||
|
**Tech Stack:** Bun, TypeScript, `bun:test`, MCP JSON-RPC adapter, framework-free Bun
|
||||||
|
HTTP routes.
|
||||||
|
|
||||||
|
* * *
|
||||||
|
|
||||||
|
### Task 1: API Dollar Parsing
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
|
||||||
|
- Modify: `packages/api-server/src/routes/helpers.ts`
|
||||||
|
|
||||||
|
- Modify: `packages/api-server/src/routes/ebay.ts`
|
||||||
|
|
||||||
|
- Modify: `packages/api-server/src/routes/kijiji.ts`
|
||||||
|
|
||||||
|
- Test: `packages/api-server/test/routes.test.ts`
|
||||||
|
|
||||||
|
- [ ] **Step 1: Add failing API route tests**
|
||||||
|
|
||||||
|
Add tests proving eBay `minPrice=999.99` / `maxPrice=1000` and Kijiji `priceMin=999.99`
|
||||||
|
/ `priceMax=1000` are forwarded to core as `99999` and `100000` cents.
|
||||||
|
Add validation tests for empty, whitespace, negative, hex, mixed text, and malformed
|
||||||
|
decimal price values.
|
||||||
|
|
||||||
|
Run: `bun test packages/api-server/test/routes.test.ts`
|
||||||
|
|
||||||
|
Expected: new forwarding tests fail because route currently rejects decimals and
|
||||||
|
forwards integer dollars unchanged.
|
||||||
|
|
||||||
|
- [ ] **Step 2: Implement dollar parser helper**
|
||||||
|
|
||||||
|
Add `parseDollarPriceParam(searchParams, name)` in
|
||||||
|
`packages/api-server/src/routes/helpers.ts`. Accept `0`, `1000`, `999.99`, and `0.99`.
|
||||||
|
Reject values that do not match `^\d+(?:\.\d{1,2})?$`. Convert to cents with
|
||||||
|
`Math.round(Number(rawValue) * 100)`.
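
A minimal sketch of the helper described in this step. The plan fixes only the function name, its two parameters, the accepted pattern, and the cents conversion; the `PriceParseResult` type, the error message, and treating an absent parameter as `undefined` are illustrative assumptions, not the repo's actual contract.

```ts
// Assumed result shape; the real helper may throw or return a Response instead.
type PriceParseResult =
  | { ok: true; cents: number | undefined }
  | { ok: false; error: string };

// Whole dollars with an optional one- or two-digit decimal part.
const DOLLAR_PATTERN = /^\d+(?:\.\d{1,2})?$/;

function parseDollarPriceParam(
  searchParams: URLSearchParams,
  name: string,
): PriceParseResult {
  const rawValue = searchParams.get(name);

  // Assumption: a missing parameter means "no filter", not an error.
  if (rawValue === null) {
    return { ok: true, cents: undefined };
  }

  // Rejects empty, whitespace, negative, hex, mixed-text, and malformed decimals.
  if (!DOLLAR_PATTERN.test(rawValue)) {
    return { ok: false, error: `Invalid ${name}: expected a dollar amount` };
  }

  // Round after multiplying so float error cannot drop a cent.
  return { ok: true, cents: Math.round(Number(rawValue) * 100) };
}
```

The `Math.round` call matters because a product such as `999.99 * 100` is not always exact in binary floating point; rounding keeps the result at the intended whole number of cents.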
|
||||||
|
|
||||||
|
- [ ] **Step 3: Use dollar parser in eBay and Kijiji routes**
|
||||||
|
|
||||||
|
Replace `parseNonNegativeIntegerParam` calls for eBay `minPrice`/`maxPrice` and Kijiji
|
||||||
|
`priceMin`/`priceMax` with `parseDollarPriceParam`. Keep pagination/count params on
|
||||||
|
integer parsing.
|
||||||
|
|
||||||
|
- [ ] **Step 4: Verify API tests**
|
||||||
|
|
||||||
|
Run: `bun test packages/api-server/test/routes.test.ts`
|
||||||
|
|
||||||
|
Expected: all API route tests pass.
|
||||||
|
|
||||||
|
### Task 2: MCP Schema Contract
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
|
||||||
|
- Modify: `packages/mcp-server/src/protocol/tools.ts`
|
||||||
|
|
||||||
|
- Test: `packages/mcp-server/test/protocol.test.ts`
|
||||||
|
|
||||||
|
- [ ] **Step 1: Add MCP schema/forwarding tests**
|
||||||
|
|
||||||
|
Add tests that `search_ebay` describes `minPrice` and `maxPrice` as dollar filters and
|
||||||
|
forwards numeric dollar values unchanged in API query params.
|
||||||
|
|
||||||
|
Run: `bun test packages/mcp-server/test/protocol.test.ts`
|
||||||
|
|
||||||
|
Expected: description test fails until schema text changes; forwarding behavior should
|
||||||
|
already pass or reveal mapping gaps.
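
The forwarding behavior under test can be sketched as follows. `buildEbayQuery`, the `EbaySearchArgs` type, and the `q` parameter name are hypothetical, since the real handler is not shown in this plan. The sketch only illustrates that dollar values pass through unchanged, with an explicit `undefined` check so that `0` is still forwarded.

```ts
// Hypothetical argument shape for the search_ebay tool.
interface EbaySearchArgs {
  query: string;
  minPrice?: number; // dollars
  maxPrice?: number; // dollars
}

// Hypothetical helper: builds the API query string from MCP tool arguments.
function buildEbayQuery(args: EbaySearchArgs): URLSearchParams {
  const params = new URLSearchParams({ q: args.query });

  // Compare against undefined rather than truthiness so a zero value survives.
  if (args.minPrice !== undefined) {
    params.set("minPrice", String(args.minPrice));
  }
  if (args.maxPrice !== undefined) {
    params.set("maxPrice", String(args.maxPrice));
  }

  return params;
}
```

Under these assumptions the MCP layer does no unit conversion; the API server remains the only place where dollars become cents.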
|
||||||
|
|
||||||
|
- [ ] **Step 2: Update tool descriptions**
|
||||||
|
|
||||||
|
Change eBay `minPrice` and Kijiji `priceMin` descriptions to `Minimum price in dollars`.
|
||||||
|
Change eBay `maxPrice` and Kijiji `priceMax` descriptions to `Maximum price in dollars`.
|
||||||
|
|
||||||
|
- [ ] **Step 3: Verify MCP tests**
|
||||||
|
|
||||||
|
Run: `bun test packages/mcp-server/test/protocol.test.ts`
|
||||||
|
|
||||||
|
Expected: all MCP protocol tests pass.
|
||||||
|
|
||||||
|
### Task 3: Cross-Package Verification
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
|
||||||
|
- No additional edits expected.
|
||||||
|
|
||||||
|
- [ ] **Step 1: Run relevant package tests**
|
||||||
|
|
||||||
|
Run: `bun test packages/api-server/test packages/mcp-server/test`
|
||||||
|
|
||||||
|
Expected: all tests pass.
|
||||||
|
|
||||||
|
- [ ] **Step 2: Run CI**
|
||||||
|
|
||||||
|
Run: `bun run ci`
|
||||||
|
|
||||||
|
Expected: typecheck and Biome pass without changing lint config.
|
||||||
187
docs/superpowers/plans/2026-04-30-live-parser-tests.md
Normal file
@@ -0,0 +1,187 @@
|
|||||||
|
# Live Parser Tests Implementation Plan
|
||||||
|
|
||||||
|
> **For agentic workers:** REQUIRED SUB-SKILL: Use
|
||||||
|
> superpowers:subagent-driven-development (recommended) or superpowers:executing-plans
|
||||||
|
> to implement this plan task-by-task.
|
||||||
|
> Steps use checkbox (`- [ ]`) syntax for tracking.
|
||||||
|
|
||||||
|
**Goal:** Add explicit live endpoint test suites for each core marketplace scraper,
|
||||||
|
excluded from default tests and runnable through one script.
|
||||||
|
|
||||||
|
**Architecture:** Live tests live under `packages/core/test/live/` and import public
|
||||||
|
scraper entry points directly.
|
||||||
|
Normal package tests remain offline because the new files are outside current explicit
|
||||||
|
test commands and run only through `bun run test:live`.
|
||||||
|
|
||||||
|
**Tech Stack:** Bun `1.3.13`, `bun:test`, TypeScript, existing core scraper APIs.
|
||||||
|
|
||||||
|
* * *
|
||||||
|
|
||||||
|
## File Structure
|
||||||
|
|
||||||
|
- Create `packages/core/test/live/ebay.live.test.ts`: live eBay search smoke test
|
||||||
|
against `fetchEbayItems`.
|
||||||
|
- Create `packages/core/test/live/kijiji.live.test.ts`: live Kijiji search smoke test
|
||||||
|
against `fetchKijijiItems`.
|
||||||
|
- Create `packages/core/test/live/facebook.live.test.ts`: strict live Facebook search
|
||||||
|
smoke test against `fetchFacebookItems` and `FACEBOOK_COOKIE`.
|
||||||
|
- Modify `package.json`: add root script `test:live` running all files under
|
||||||
|
`packages/core/test/live`.
|
||||||
|
|
||||||
|
### Task 1: Add eBay Live Suite
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
|
||||||
|
- Create: `packages/core/test/live/ebay.live.test.ts`
|
||||||
|
|
||||||
|
- [ ] **Step 1: Write the live test file**
|
||||||
|
|
||||||
|
```ts
|
||||||
|
import { describe, expect, test } from "bun:test";
|
||||||
|
import fetchEbayItems from "../../src/scrapers/ebay";
|
||||||
|
|
||||||
|
describe("eBay live parser", () => {
|
||||||
|
test("scrapes live search results into listing details", async () => {
|
||||||
|
const results = await fetchEbayItems("iphone", 1, { maxItems: 3 });
|
||||||
|
|
||||||
|
expect(results.length).toBeGreaterThan(0);
|
||||||
|
for (const listing of results) {
|
||||||
|
expect(listing.url).toStartWith("https://");
|
||||||
|
expect(listing.title.length).toBeGreaterThan(0);
|
||||||
|
expect(listing.listingPrice.cents).toBeGreaterThanOrEqual(0);
|
||||||
|
expect(listing.listingPrice.currency.length).toBeGreaterThan(0);
|
||||||
|
}
|
||||||
|
});
|
||||||
|
});
|
||||||
|
```
|
||||||
|
|
||||||
|
- [ ] **Step 2: Run eBay live test**
|
||||||
|
|
||||||
|
Run: `bun test packages/core/test/live/ebay.live.test.ts`

Expected: PASS when eBay returns parseable search results; FAIL on
endpoint/rate-limit/parser breakage.
|
||||||
|
|
||||||
|
### Task 2: Add Kijiji Live Suite
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
|
||||||
|
- Create: `packages/core/test/live/kijiji.live.test.ts`
|
||||||
|
|
||||||
|
- [ ] **Step 1: Write the live test file**
|
||||||
|
|
||||||
|
```ts
|
||||||
|
import { describe, expect, test } from "bun:test";
|
||||||
|
import fetchKijijiItems from "../../src/scrapers/kijiji";
|
||||||
|
|
||||||
|
describe("Kijiji live parser", () => {
|
||||||
|
test("scrapes live search results into detailed listings", async () => {
|
||||||
|
const results = await fetchKijijiItems(
|
||||||
|
"iphone",
|
||||||
|
1,
|
||||||
|
"https://www.kijiji.ca",
|
||||||
|
{ maxPages: 1 },
|
||||||
|
{ includeImages: false, sellerDataDepth: "basic" },
|
||||||
|
);
|
||||||
|
|
||||||
|
expect(results.length).toBeGreaterThan(0);
|
||||||
|
for (const listing of results) {
|
||||||
|
expect(listing.url).toStartWith("https://www.kijiji.ca/");
|
||||||
|
expect(listing.title.length).toBeGreaterThan(0);
|
||||||
|
expect(listing.listingPrice.cents).toBeGreaterThanOrEqual(0);
|
||||||
|
expect(listing.listingPrice.currency.length).toBeGreaterThan(0);
|
||||||
|
}
|
||||||
|
});
|
||||||
|
});
|
||||||
|
```
|
||||||
|
|
||||||
|
- [ ] **Step 2: Run Kijiji live test**
|
||||||
|
|
||||||
|
Run: `bun test packages/core/test/live/kijiji.live.test.ts`

Expected: PASS when Kijiji returns parseable search and detail pages; FAIL on
endpoint/parser breakage.
|
||||||
|
|
||||||
|
### Task 3: Add Facebook Live Suite
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
|
||||||
|
- Create: `packages/core/test/live/facebook.live.test.ts`
|
||||||
|
|
||||||
|
- [ ] **Step 1: Write the live test file**
|
||||||
|
|
||||||
|
```ts
|
||||||
|
import { describe, expect, test } from "bun:test";
|
||||||
|
import fetchFacebookItems from "../../src/scrapers/facebook";
|
||||||
|
|
||||||
|
describe("Facebook live parser", () => {
|
||||||
|
test("requires FACEBOOK_COOKIE for strict live testing", () => {
|
||||||
|
expect(process.env.FACEBOOK_COOKIE?.trim().length ?? 0).toBeGreaterThan(0);
|
||||||
|
});
|
||||||
|
|
||||||
|
test("scrapes live marketplace search results into listing details", async () => {
|
||||||
|
const results = await fetchFacebookItems("iphone", 1, "toronto", 3);
|
||||||
|
|
||||||
|
expect(results.length).toBeGreaterThan(0);
|
||||||
|
for (const listing of results) {
|
||||||
|
expect(listing.url).toStartWith("https://www.facebook.com/marketplace/item/");
|
||||||
|
expect(listing.title.length).toBeGreaterThan(0);
|
||||||
|
expect(listing.listingPrice.cents).toBeGreaterThanOrEqual(0);
|
||||||
|
expect(listing.listingPrice.currency.length).toBeGreaterThan(0);
|
||||||
|
}
|
||||||
|
});
|
||||||
|
});
|
||||||
|
```
|
||||||
|
|
||||||
|
- [ ] **Step 2: Run Facebook live test**
|
||||||
|
|
||||||
|
Run: `bun test packages/core/test/live/facebook.live.test.ts`

Expected: PASS with valid `FACEBOOK_COOKIE`; FAIL when `FACEBOOK_COOKIE` is missing,
expired, or parser output is empty.
|
||||||
|
|
||||||
|
### Task 4: Add Root Live Test Script
|
||||||
|
|
||||||
|
**Files:**
|
||||||
|
|
||||||
|
- Modify: `package.json`
|
||||||
|
|
||||||
|
- [ ] **Step 1: Add script**
|
||||||
|
|
||||||
|
Change root `scripts` to include:
|
||||||
|
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"test:live": "bun test packages/core/test/live"
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
- [ ] **Step 2: Run all live tests through script**
|
||||||
|
|
||||||
|
Run: `bun run test:live`

Expected: runs eBay, Kijiji, and Facebook live suites.
Facebook fails if `FACEBOOK_COOKIE` is unset.
|
||||||
|
|
||||||
|
### Task 5: Verify Default Suite Exclusion

**Files:**

- No code files modified.

- [ ] **Step 1: Run existing core tests**

Run: `bun test packages/core/test`
Expected: existing mocked tests run.
If Bun discovers `packages/core/test/live`, change normal verification command to
explicit glob `bun test packages/core/test/*.test.ts` and document that in final notes.

- [ ] **Step 2: Run static checks**

Run: `bun run ci`
Expected: typecheck and Biome pass.
Fix code issues without changing lint or TypeScript rules.

## Commit Note

Do not commit during execution unless user explicitly requests a commit.
This repo session policy overrides generic plan commit steps.

## Self-Review

- Spec coverage: eBay, Kijiji, Facebook live suites; explicit script; strict Facebook
  auth; excluded from default flow.
- Placeholder scan: no `TBD`, `TODO`, or underspecified implementation steps.
- Type consistency: tests use current exported scraper signatures and shared listing
  fields from `ListingDetails`.
@@ -1,12 +1,13 @@
|
|||||||
# Design: Adopt opencode Monorepo Config
|
# Design: Adopt opencode Monorepo Config
|
||||||
|
|
||||||
**Date:** 2025-07-14
|
**Date:** 2025-07-14\
|
||||||
**Status:** Approved
|
**Status:** Approved\
|
||||||
**Approach:** Full adoption (A)
|
**Approach:** Full adoption (A)
|
||||||
|
|
||||||
## Context
|
## Context
|
||||||
|
|
||||||
Current repo (`marketplace-scrapers-monorepo`) has basic bun workspaces with 3 packages (`core`, `api-server`, `mcp-server`). Reference: `anomalyco/opencode` monorepo patterns.
|
Current repo (`marketplace-scrapers-monorepo`) has basic bun workspaces with 3 packages
|
||||||
|
(`core`, `api-server`, `mcp-server`). Reference: `anomalyco/opencode` monorepo patterns.
|
||||||
|
|
||||||
**Gaps vs opencode:**
|
**Gaps vs opencode:**
|
||||||
- No Turbo (task orchestration, caching, dep graph)
|
- No Turbo (task orchestration, caching, dep graph)
|
||||||
@@ -20,7 +21,8 @@ Current repo (`marketplace-scrapers-monorepo`) has basic bun workspaces with 3 p
|
|||||||
### 1. Root `package.json`
|
### 1. Root `package.json`
|
||||||
|
|
||||||
- Add `workspaces.catalog` block with shared deps:
|
- Add `workspaces.catalog` block with shared deps:
|
||||||
- `@typescript/native-preview`, `@types/bun`, `@types/unidecode`, `@types/cli-progress`
|
- `@typescript/native-preview`, `@types/bun`, `@types/unidecode`,
|
||||||
|
`@types/cli-progress`
|
||||||
- Add `turbo` to `devDependencies`
|
- Add `turbo` to `devDependencies`
|
||||||
- Add `@tsconfig/bun` to `devDependencies` + catalog
|
- Add `@tsconfig/bun` to `devDependencies` + catalog
|
||||||
- Update root scripts: `typecheck` and `build` delegate to `turbo run`
|
- Update root scripts: `typecheck` and `build` delegate to `turbo run`
|
||||||
@@ -93,7 +95,8 @@ exact = true
|
|||||||
root = "./do-not-run-tests-from-root"
|
root = "./do-not-run-tests-from-root"
|
||||||
```
|
```
|
||||||
|
|
||||||
Exact installs = reproducible. Root test guard prevents accidental root-level test runs.
|
Exact installs = reproducible.
|
||||||
|
Root test guard prevents accidental root-level test runs.
|
||||||
|
|
||||||
### 6. Package `exports` field
|
### 6. Package `exports` field
|
||||||
|
|
||||||
@@ -102,7 +105,8 @@ Replace `main`/`module` with `exports` in all 3 packages:
|
|||||||
"exports": { ".": "./src/index.ts" }
|
"exports": { ".": "./src/index.ts" }
|
||||||
```
|
```
|
||||||
|
|
||||||
Remove `main` and `module` fields. Bun resolves `.ts` directly.
|
Remove `main` and `module` fields.
|
||||||
|
Bun resolves `.ts` directly.
|
||||||
|
|
||||||
### 7. Catalog references in per-package `package.json`
|
### 7. Catalog references in per-package `package.json`
|
||||||
|
|
||||||
@@ -115,7 +119,7 @@ Replace pinned versions with `"catalog:"` for shared deps:
|
|||||||
## Files Changed
|
## Files Changed
|
||||||
|
|
||||||
| File | Action |
|
| File | Action |
|
||||||
|---|---|
|
| --- | --- |
|
||||||
| `package.json` | Update (catalog, turbo dep, scripts) |
|
| `package.json` | Update (catalog, turbo dep, scripts) |
|
||||||
| `turbo.json` | Create |
|
| `turbo.json` | Create |
|
||||||
| `tsconfig.json` | Create |
|
| `tsconfig.json` | Create |
|
||||||
|
|||||||
@@ -3,7 +3,9 @@
|
|||||||
## Summary
|
## Summary
|
||||||
|
|
||||||
Remove all file-based and request-provided cookie inputs across the repo.
|
Remove all file-based and request-provided cookie inputs across the repo.
|
||||||
The only supported authentication input becomes a raw `Cookie` header string supplied through scraper-specific environment variables such as `FACEBOOK_COOKIE` and `EBAY_COOKIE`.
|
The only supported authentication input becomes a raw `Cookie` header string supplied
|
||||||
|
through scraper-specific environment variables such as `FACEBOOK_COOKIE` and
|
||||||
|
`EBAY_COOKIE`.
|
||||||
|
|
||||||
## Goals
|
## Goals
|
||||||
|
|
||||||
@@ -17,7 +19,8 @@ The only supported authentication input becomes a raw `Cookie` header string sup
|
|||||||
|
|
||||||
- Changing scraper behavior unrelated to authentication input.
|
- Changing scraper behavior unrelated to authentication input.
|
||||||
- Adding new cookie formats or migration helpers.
|
- Adding new cookie formats or migration helpers.
|
||||||
- Preserving backward compatibility for cookie files, JSON cookie arrays, or request overrides.
|
- Preserving backward compatibility for cookie files, JSON cookie arrays, or request
|
||||||
|
overrides.
|
||||||
|
|
||||||
## Current State
|
## Current State
|
||||||
|
|
||||||
@@ -27,27 +30,33 @@ The current shared cookie utilities support three sources in priority order:
|
|||||||
2. Environment variable
|
2. Environment variable
|
||||||
3. Cookie file
|
3. Cookie file
|
||||||
|
|
||||||
`packages/core/src/utils/cookies.ts` includes file loading, JSON array parsing, and auto-detection between JSON and header-string formats.
|
`packages/core/src/utils/cookies.ts` includes file loading, JSON array parsing, and
|
||||||
Facebook also exposes deprecated `cookiePath` arguments that still reach shared loading logic.
|
auto-detection between JSON and header-string formats.
|
||||||
Docs in `cookies/AGENTS.md` still describe file-based setup and request-level overrides.
|
Facebook also exposes deprecated `cookiePath` arguments that still reach shared loading
|
||||||
|
logic. Docs in `cookies/AGENTS.md` still describe file-based setup and request-level
|
||||||
|
overrides.
|
||||||
|
|
||||||
## Chosen Approach
|
## Chosen Approach
|
||||||
|
|
||||||
Use the hard-reset approach.
|
Use the hard-reset approach.
|
||||||
Delete the shared multi-source cookie-loading model and reduce the cookie surface to env-header parsing only.
|
Delete the shared multi-source cookie-loading model and reduce the cookie surface to
|
||||||
This is a larger diff than a surgical removal, but it avoids leaving behind abstractions that imply unsupported inputs still exist.
|
env-header parsing only.
|
||||||
|
This is a larger diff than a surgical removal, but it avoids leaving behind abstractions
|
||||||
|
that imply unsupported inputs still exist.
|
||||||
|
|
||||||
## Design
|
## Design
|
||||||
|
|
||||||
### Shared Cookie Utilities
|
### Shared Cookie Utilities
|
||||||
|
|
||||||
`packages/core/src/utils/cookies.ts` will keep only the pieces needed for env-header-based auth:
|
`packages/core/src/utils/cookies.ts` will keep only the pieces needed for
|
||||||
|
env-header-based auth:
|
||||||
|
|
||||||
- `Cookie` type
|
- `Cookie` type
|
||||||
- A reduced cookie config shape containing only `name`, `domain`, and `envVar`
|
- A reduced cookie config shape containing only `name`, `domain`, and `envVar`
|
||||||
- `parseCookieString()` for raw `Cookie` header strings
|
- `parseCookieString()` for raw `Cookie` header strings
|
||||||
- `formatCookiesForHeader()` for domain filtering and request formatting
|
- `formatCookiesForHeader()` for domain filtering and request formatting
|
||||||
- An env-only loader that reads `process.env[config.envVar]`, parses it, and throws a targeted error when missing or invalid
|
- An env-only loader that reads `process.env[config.envVar]`, parses it, and throws a
|
||||||
|
targeted error when missing or invalid
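
The reduced surface described in the list above could look roughly like the following sketch. The field names on `Cookie` and `CookieConfig`, the function signatures, and the error wording are assumptions for illustration; only the names `parseCookieString`, `envVar`, `name`, and `domain` come from this design. The environment is passed in explicitly here so the sketch stays testable; real code would pass `process.env`.

```typescript
// Hypothetical shapes: field names beyond name/domain/envVar are assumptions.
interface Cookie {
  name: string;
  value: string;
  domain: string;
}

interface CookieConfig {
  name: string; // scraper name, used in error messages
  domain: string;
  envVar: string;
}

// Parse a raw `Cookie` header string ("a=1; b=2") into cookie records.
function parseCookieString(header: string, domain: string): Cookie[] {
  const cookies: Cookie[] = [];
  for (const part of header.split(";")) {
    const pair = part.trim();
    const eq = pair.indexOf("=");
    if (eq <= 0) continue; // skip empty segments and segments without a name
    cookies.push({
      name: pair.slice(0, eq).trim(),
      value: pair.slice(eq + 1).trim(), // keeps any later "=" inside the value
      domain,
    });
  }
  return cookies;
}

// Env-only loader: no file, JSON-array, or request fallbacks.
function loadCookiesFromEnv(
  config: CookieConfig,
  env: Record<string, string | undefined>,
): Cookie[] {
  const raw = env[config.envVar]?.trim();
  if (!raw) {
    throw new Error(`${config.name}: ${config.envVar} is not set`);
  }
  const cookies = parseCookieString(raw, config.domain);
  if (cookies.length === 0) {
    throw new Error(`${config.name}: ${config.envVar} is not a valid Cookie header string`);
  }
  return cookies;
}
```

A caller would invoke this as `loadCookiesFromEnv(facebookConfig, process.env)`. Because there is only one input source, a failure always names the environment variable that needs fixing.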
|
||||||
|
|
||||||
The following shared utilities will be removed:
|
The following shared utilities will be removed:
|
||||||
|
|
||||||
@@ -68,15 +77,18 @@ For Facebook this means:
|
|||||||
|
|
||||||
For eBay this means:
|
For eBay this means:
|
||||||
|
|
||||||
- Remove any remaining fallback/file-oriented behavior from shared calls and error strings
|
- Remove any remaining fallback/file-oriented behavior from shared calls and error
|
||||||
|
strings
|
||||||
- Keep the existing env-var auth path, but make it the only path
|
- Keep the existing env-var auth path, but make it the only path
|
||||||
|
|
||||||
### Public API Surface
|
### Public API Surface
|
||||||
|
|
||||||
Exports from `packages/core/src/index.ts` should reflect the new contract.
|
Exports from `packages/core/src/index.ts` should reflect the new contract.
|
||||||
If exported functions currently advertise cookie-source or cookie-path arguments, their signatures will be tightened so callers cannot pass unsupported inputs.
|
If exported functions currently advertise cookie-source or cookie-path arguments, their
|
||||||
|
signatures will be tightened so callers cannot pass unsupported inputs.
|
||||||
|
|
||||||
Downstream adapter packages should continue calling core through the simplified signatures without adding their own cookie-loading behavior.
|
Downstream adapter packages should continue calling core through the simplified
|
||||||
|
signatures without adding their own cookie-loading behavior.
|
||||||
|
|
||||||
### Error Handling
|
### Error Handling
|
||||||
|
|
||||||
@@ -93,8 +105,8 @@ Errors should be blunt and specific:
|
|||||||
|
|
||||||
### Testing Strategy
|
### Testing Strategy
|
||||||
|
|
||||||
Follow TDD.
|
Follow TDD. Start by changing or adding core tests so the old file/request behavior is
|
||||||
Start by changing or adding core tests so the old file/request behavior is no longer accepted.
|
no longer accepted.
|
||||||
|
|
||||||
Coverage targets:
|
Coverage targets:
|
||||||
|
|
||||||
@@ -102,7 +114,8 @@ Coverage targets:
|
|||||||
2. Missing env vars fail with the new env-only error.
|
2. Missing env vars fail with the new env-only error.
|
||||||
3. Invalid env strings fail without falling back to files or request data.
|
3. Invalid env strings fail without falling back to files or request data.
|
||||||
4. Facebook APIs no longer expose or honor cookie-path/request-cookie behavior.
|
4. Facebook APIs no longer expose or honor cookie-path/request-cookie behavior.
|
||||||
5. Existing tests that depended on missing files or JSON cookie arrays are rewritten to the env-only contract.
|
5. Existing tests that depended on missing files or JSON cookie arrays are rewritten to
|
||||||
|
the env-only contract.
|
||||||
|
|
||||||
Verification target after implementation:
|
Verification target after implementation:
|
||||||
|
|
||||||
@@ -121,11 +134,15 @@ Update cookie-related docs to match the new contract:
|
|||||||
|
|
||||||
## Risks
|
## Risks
|
||||||
|
|
||||||
- External callers using request cookie overrides will break at compile time or runtime, depending on how they consume the package.
|
- External callers using request cookie overrides will break at compile time or runtime,
|
||||||
- Recent work added support for custom Facebook cookie paths, so removing that path intentionally reverses a newly introduced behavior.
|
depending on how they consume the package.
|
||||||
- Tests that currently model missing-file behavior must be rewritten rather than preserved.
|
- Recent work added support for custom Facebook cookie paths, so removing that path
|
||||||
|
intentionally reverses a newly introduced behavior.
|
||||||
|
- Tests that currently model missing-file behavior must be rewritten rather than
|
||||||
|
preserved.
|
||||||
|
|
||||||
## Rollout Notes
|
## Rollout Notes
|
||||||
|
|
||||||
This is an intentional contract break.
|
This is an intentional contract break.
|
||||||
The code, tests, and docs should all land together so there is no mixed messaging about supported cookie sources.
|
The code, tests, and docs should all land together so there is no mixed messaging about
|
||||||
|
supported cookie sources.
|
||||||
|
|||||||
@@ -2,35 +2,46 @@
|
|||||||
|
|
||||||
## Summary
|
## Summary
|
||||||
|
|
||||||
Replace the legacy Facebook Marketplace scraper with a route-aware implementation built around current Comet bootstrap markers and route-specific extraction.
|
Replace the legacy Facebook Marketplace scraper with a route-aware implementation built
|
||||||
The new scraper will keep authenticated direct HTTP fetches as the primary transport, but it will stop treating legacy `require`, `__bbox`, and `marketplace_product_details_page` structures as the main parsing contract.
|
around current Comet bootstrap markers and route-specific extraction.
|
||||||
|
The new scraper will keep authenticated direct HTTP fetches as the primary transport,
|
||||||
|
but it will stop treating legacy `require`, `__bbox`, and
|
||||||
|
`marketplace_product_details_page` structures as the main parsing contract.
|
||||||
|
|
||||||
## Goals
|
## Goals
|
||||||
|
|
||||||
- Replace both Facebook search and item-detail extraction with a current-shape parser.
|
- Replace both Facebook search and item-detail extraction with a current-shape parser.
|
||||||
- Keep authenticated direct HTTP requests as the primary fetch strategy.
|
- Keep authenticated direct HTTP requests as the primary fetch strategy.
|
||||||
- Parse route-specific Comet bootstrap/state payloads before falling back to rendered-HTML extraction.
|
- Parse route-specific Comet bootstrap/state payloads before falling back to
|
||||||
|
rendered-HTML extraction.
|
||||||
- Detect auth-gated, unavailable, and unknown responses explicitly.
|
- Detect auth-gated, unavailable, and unknown responses explicitly.
|
||||||
- Update tests so they model current route markers and failure modes instead of legacy page objects.
|
- Update tests so they model current route markers and failure modes instead of legacy
|
||||||
|
page objects.
|
||||||
|
|
||||||
## Non-Goals
|
## Non-Goals
|
||||||
|
|
||||||
- Reworking non-Facebook scrapers.
|
- Reworking non-Facebook scrapers.
|
||||||
- Converting the scraper to browser-only automation.
|
- Converting the scraper to browser-only automation.
|
||||||
- Preserving old parser behavior for `marketplace_product_details_page` or `__bbox`-driven item extraction.
|
- Preserving old parser behavior for `marketplace_product_details_page` or
|
||||||
- Reverse-engineering every internal Facebook bootstrap payload shape exhaustively before implementation.
|
`__bbox`-driven item extraction.
|
||||||
|
- Reverse-engineering every internal Facebook bootstrap payload shape exhaustively
|
||||||
|
before implementation.
|
||||||
|
|
||||||
## Current State
|
## Current State
|
||||||
|
|
||||||
The current implementation in `packages/core/src/scrapers/facebook.ts` still uses authenticated HTTP requests, which remains correct.
|
The current implementation in `packages/core/src/scrapers/facebook.ts` still uses
|
||||||
The search path parses embedded script JSON and looks for `marketplace_search.feed_units.edges`.
|
authenticated HTTP requests, which remains correct.
|
||||||
The item-detail path is centered on legacy extraction paths such as:
|
The search path parses embedded script JSON and looks for
|
||||||
|
`marketplace_search.feed_units.edges`. The item-detail path is centered on legacy
|
||||||
|
extraction paths such as:
|
||||||
|
|
||||||
- `parsed.require[0][3].__bbox.result.data.viewer.marketplace_product_details_page.target`
|
- `parsed.require[0][3].__bbox.result.data.viewer.marketplace_product_details_page.target`
|
||||||
- nested `__bbox.require[...]` variations
|
- nested `__bbox.require[...]` variations
|
||||||
- recursive search through `parsed.require`
|
- recursive search through `parsed.require`
|
||||||
|
|
||||||
Live evidence gathered earlier in this session and by the isolated research subagent shows that current Facebook Marketplace pages are Comet route-driven and expose markers such as:
|
Live evidence gathered earlier in this session and by the isolated research subagent
|
||||||
|
shows that current Facebook Marketplace pages are Comet route-driven and expose markers
|
||||||
|
such as:
|
||||||
|
|
||||||
- `XCometMarketplaceSearchController`
|
- `XCometMarketplaceSearchController`
|
||||||
- `XCometMarketplacePermalinkController`
|
- `XCometMarketplacePermalinkController`
|
||||||
@@ -41,7 +52,9 @@ Live evidence gathered earlier in this session and by the isolated research suba
|
|||||||
- `data-sjs`
|
- `data-sjs`
|
||||||
- `data-btmanifest`
|
- `data-btmanifest`
|
||||||
|
|
||||||
The same live investigation also showed that authenticated item pages no longer expose the old `marketplace_product_details_page` marker reliably, while live search still returns usable results.
|
The same live investigation also showed that authenticated item pages no longer expose
|
||||||
|
the old `marketplace_product_details_page` marker reliably, while live search still
|
||||||
|
returns usable results.
|
||||||
|
|
||||||
## Chosen Approach
|
## Chosen Approach
|
||||||
|
|
||||||
@@ -52,9 +65,11 @@ The scraper will:
|
|||||||
1. Fetch authenticated HTML directly.
|
1. Fetch authenticated HTML directly.
|
||||||
2. Classify the response using current route and auth markers.
|
2. Classify the response using current route and auth markers.
|
||||||
3. Parse inline bootstrap/state payloads using route-specific probes.
|
3. Parse inline bootstrap/state payloads using route-specific probes.
|
||||||
4. Fall back to rendered-HTML extraction only when bootstrap markers are present but the payload cannot be decoded into the expected search or item shape.
|
4. Fall back to rendered-HTML extraction only when bootstrap markers are present but the
|
||||||
|
payload cannot be decoded into the expected search or item shape.
|
||||||
|
|
||||||
This keeps the cheaper direct-HTTP transport while shifting the parser contract from legacy page-object names to current Comet route structure.
|
This keeps the cheaper direct-HTTP transport while shifting the parser contract from
|
||||||
|
legacy page-object names to current Comet route structure.
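
As a rough illustration of step 2 (classifying the response before any parsing), a minimal sketch might look like this. The two controller names come from the live evidence listed earlier; the auth-gate and unavailable markers are placeholders assumed for the example and have not been verified against live pages, and `classifyFacebookResponse` is a hypothetical name.

```typescript
type FacebookResponseClass = "search" | "item" | "auth-gated" | "unavailable" | "unknown";

// Route markers named in this design.
const SEARCH_MARKER = "XCometMarketplaceSearchController";
const ITEM_MARKER = "XCometMarketplacePermalinkController";

// Placeholder markers: assumptions for illustration only.
const AUTH_GATE_MARKERS = ["login_form"];
const UNAVAILABLE_MARKERS = ["This listing is no longer available"];

function classifyFacebookResponse(html: string): FacebookResponseClass {
  // Failure classes are checked first so a gated page is never parsed as data.
  if (AUTH_GATE_MARKERS.some((marker) => html.includes(marker))) return "auth-gated";
  if (UNAVAILABLE_MARKERS.some((marker) => html.includes(marker))) return "unavailable";
  if (html.includes(ITEM_MARKER)) return "item";
  if (html.includes(SEARCH_MARKER)) return "search";
  return "unknown";
}
```

If a real page turns out to carry both controller names, the precedence between `item` and `search` would need to be decided from the requested URL rather than from marker order alone.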
|
||||||
|
|
||||||
## Design
|
## Design
|
||||||
|
|
||||||
@@ -88,7 +103,8 @@ Primary behavior:
|
|||||||
- fetch the Marketplace search HTML with auth cookies
|
- fetch the Marketplace search HTML with auth cookies
|
||||||
- confirm the response class is `search`
|
- confirm the response class is `search`
|
||||||
- extract inline bootstrap/state blobs from script tags and page attributes
|
- extract inline bootstrap/state blobs from script tags and page attributes
|
||||||
- probe for route-specific search payloads associated with `XCometMarketplaceSearchController`
|
- probe for route-specific search payloads associated with
|
||||||
|
`XCometMarketplaceSearchController`
|
||||||
- map decoded search results into summary listing records
|
- map decoded search results into summary listing records
|
||||||
|
|
||||||
Search summary fields should remain aligned with the current public output shape:
|
Search summary fields should remain aligned with the current public output shape:
|
||||||
@@ -102,7 +118,8 @@ Search summary fields should remain aligned with the current public output shape
|
|||||||
|
|
||||||
Fallback behavior:
|
Fallback behavior:
|
||||||
|
|
||||||
- if search route markers are present but structured payload decoding fails, extract listing summaries from rendered HTML anchors and text patterns
|
- if search route markers are present but structured payload decoding fails, extract
|
||||||
|
listing summaries from rendered HTML anchors and text patterns
|
||||||
- use item links matching `/marketplace/item/<id>` as the anchor for fallback extraction
|
- use item links matching `/marketplace/item/<id>` as the anchor for fallback extraction
|
||||||
- treat fallback results as summary-only data, not rich detail data
|
- treat fallback results as summary-only data, not rich detail data
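
The anchor-based fallback above can be sketched as a link scan over the rendered HTML. This is a minimal sketch under assumptions: `extractFallbackItemUrls` is a hypothetical helper, listing IDs are assumed to be numeric, and the URL prefix is taken from the live test expectation rather than from any documented contract.

```typescript
// Collect unique listing URLs from `/marketplace/item/<id>` links, in first-seen order.
function extractFallbackItemUrls(html: string): string[] {
  const pattern = /\/marketplace\/item\/(\d+)/g; // assumes numeric listing IDs
  const seen = new Set<string>();
  const urls: string[] = [];
  let match: RegExpExecArray | null = pattern.exec(html);
  while (match !== null) {
    const id = match[1];
    if (id !== undefined && !seen.has(id)) {
      seen.add(id);
      urls.push(`https://www.facebook.com/marketplace/item/${id}`);
    }
    match = pattern.exec(html);
  }
  return urls;
}
```

Title, price, and location text would then be read from the markup around each link, which is why fallback records stay summary-only.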
|
||||||
|
|
||||||
@@ -132,9 +149,12 @@ Priority item fields:
|
|||||||
|
|
||||||
Fallback behavior:
|
Fallback behavior:
|
||||||
|
|
||||||
- if permalink route markers are present but no stable payload object is decodable, extract data from rendered HTML text structure
|
- if permalink route markers are present but no stable payload object is decodable,
|
||||||
- prioritize title, price, condition, description, location text, and seller module content
|
extract data from rendered HTML text structure
|
||||||
- return partial item data when core user-facing fields are present rather than failing solely because deeper commerce metadata is missing
|
- prioritize title, price, condition, description, location text, and seller module
|
||||||
|
content
|
||||||
|
- return partial item data when core user-facing fields are present rather than failing
|
||||||
|
solely because deeper commerce metadata is missing
|
||||||
|
|
||||||
### Bootstrap Parsing Strategy
|
### Bootstrap Parsing Strategy
|
||||||
|
|
||||||
@@ -151,11 +171,14 @@ Candidate discovery inputs:
|
|||||||
- `ServerJS` / `Bootloader` inline blobs
|
- `ServerJS` / `Bootloader` inline blobs
|
||||||
- route controller names
|
- route controller names
|
||||||
|
|
||||||
Candidate scoring for search should favor objects that contain repeated result-card semantics, item IDs, listing links, titles, prices, or location summaries.
|
Candidate scoring for search should favor objects that contain repeated result-card
|
||||||
Candidate scoring for item pages should favor objects that contain singular listing semantics, title, price, condition, description, location, seller, or permalink context.
|
semantics, item IDs, listing links, titles, prices, or location summaries.
|
||||||
|
Candidate scoring for item pages should favor objects that contain singular listing
|
||||||
|
semantics, title, price, condition, description, location, seller, or permalink context.
|
||||||
|
|
||||||
The parser should not depend on one hard-coded object name surviving forever.
|
The parser should not depend on one hard-coded object name surviving forever.
|
||||||
Instead, it should look for route-specific semantic clusters and choose the strongest candidate.
|
Instead, it should look for route-specific semantic clusters and choose the strongest
|
||||||
|
candidate.
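
One way to picture "choose the strongest candidate" is a walk over the decoded payload that scores each object by how many semantic keys it carries. The sketch below is illustrative only: the key lists, the substring matching, and the helper names are assumptions, and a real scorer would likely also weigh value types and repetition for search results.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };
type JsonObject = { [key: string]: Json };

interface Candidate {
  score: number;
  value: JsonObject;
}

// Hypothetical semantic key hints for item pages.
const ITEM_KEY_HINTS = ["title", "price", "condition", "description", "location", "seller"];

// Count how many of the object's own keys contain one of the hints.
function scoreObject(obj: JsonObject, hints: string[]): number {
  let score = 0;
  for (const key of Object.keys(obj)) {
    const lower = key.toLowerCase();
    if (hints.some((hint) => lower.includes(hint))) score += 1;
  }
  return score;
}

// Walk the whole payload and keep the highest-scoring object, if any scores above zero.
function findBestCandidate(root: Json, hints: string[]): Candidate | null {
  let best: Candidate | null = null;
  const stack: Json[] = [root];
  while (stack.length > 0) {
    const node = stack.pop();
    if (node === undefined || node === null || typeof node !== "object") continue;
    if (Array.isArray(node)) {
      for (const child of node) stack.push(child);
      continue;
    }
    const score = scoreObject(node, hints);
    if (score > 0 && (best === null || score > best.score)) {
      best = { score, value: node };
    }
    for (const child of Object.values(node)) stack.push(child);
  }
  return best;
}
```

Because nothing here depends on a fixed path such as `require[0][3]`, a renamed wrapper object does not break discovery as long as the listing fields keep recognizable names.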
|
||||||
|
|
||||||
### Legacy Removal
|
### Legacy Removal
|
||||||
|
|
||||||
@@ -166,7 +189,9 @@ Specifically:
|
|||||||
- delete legacy-first `require` / `__bbox` navigation tables
|
- delete legacy-first `require` / `__bbox` navigation tables
|
||||||
- delete tests whose only purpose is to preserve those legacy paths
|
- delete tests whose only purpose is to preserve those legacy paths
|
||||||
|
|
||||||
If a minimal legacy compatibility branch remains, it must be a last-resort fallback behind the new route-aware parser and should not shape test fixtures or design decisions.
|
If a minimal legacy compatibility branch remains, it must be a last-resort fallback
|
||||||
|
behind the new route-aware parser and should not shape test fixtures or design
|
||||||
|
decisions.
|
||||||
|
|
||||||
### Error Handling
|
### Error Handling
|
||||||
|
|
||||||
@@ -178,7 +203,8 @@ Facebook responses should now fail with explicit route-aware outcomes:
|
|||||||
4. Search or item route detected, but no decodable data found.
|
4. Search or item route detected, but no decodable data found.
|
||||||
5. Unknown response shape.
|
5. Unknown response shape.
|
||||||
|
|
||||||
Error messages should name the actual class of failure instead of implying that every parse miss is caused by expired cookies.
|
Error messages should name the actual class of failure instead of implying that every
|
||||||
|
parse miss is caused by expired cookies.
|
||||||
|
|
||||||
### Testing Strategy
|
### Testing Strategy
|
||||||
|
|
||||||
@@ -190,11 +216,15 @@ Coverage targets:
|
|||||||
1. Search responses classify correctly from current Comet controller markers.
|
1. Search responses classify correctly from current Comet controller markers.
|
||||||
2. Item responses classify correctly from current Comet controller markers.
|
2. Item responses classify correctly from current Comet controller markers.
|
||||||
3. Login-gated and unavailable responses are detected before parsing.
|
3. Login-gated and unavailable responses are detected before parsing.
|
||||||
4. Search bootstrap parsing produces summary listing results from current-shape fixtures.
|
4. Search bootstrap parsing produces summary listing results from current-shape
|
||||||
|
fixtures.
|
||||||
5. Item bootstrap parsing produces rich listing details from current-shape fixtures.
|
5. Item bootstrap parsing produces rich listing details from current-shape fixtures.
|
||||||
6. Search fallback extraction works when route markers exist but structured payload decoding fails.
|
6. Search fallback extraction works when route markers exist but structured payload
|
||||||
7. Item fallback extraction works when route markers exist but structured payload decoding fails.
|
decoding fails.
|
||||||
8. Old legacy-only item fixtures are removed or rewritten so they no longer define the contract.
|
7. Item fallback extraction works when route markers exist but structured payload
|
||||||
|
decoding fails.
|
||||||
|
8. Old legacy-only item fixtures are removed or rewritten so they no longer define the
|
||||||
|
contract.
|
||||||
|
|
||||||
Verification target after implementation:
|
Verification target after implementation:
|
||||||
|
|
||||||
@@ -204,23 +234,30 @@ Verification target after implementation:
|
|||||||
|
|
||||||
## Public API Surface
|
## Public API Surface
|
||||||
|
|
||||||
Keep the current public function names unless the rewrite proves that a signature change is required:
|
Keep the current public function names unless the rewrite proves that a signature change
|
||||||
|
is required:
|
||||||
|
|
||||||
- `fetchFacebookItems(...)`
|
- `fetchFacebookItems(...)`
|
||||||
- `fetchFacebookItem(...)`
|
- `fetchFacebookItem(...)`
|
||||||
- `extractFacebookMarketplaceData(...)`
|
- `extractFacebookMarketplaceData(...)`
|
||||||
- `extractFacebookItemData(...)`
|
- `extractFacebookItemData(...)`
|
||||||
|
|
||||||
The internals should change substantially, but callers should not need a new integration surface for this rewrite.
|
The internals should change substantially, but callers should not need a new integration
|
||||||
|
surface for this rewrite.
|
||||||
|
|
||||||
## Risks
|
## Risks
|
||||||
|
|
||||||
- Facebook may change bootstrap payload naming again, so route/controller markers are more stable than exact nested object paths but still not guaranteed.
|
- Facebook may change bootstrap payload naming again, so route/controller markers are
|
||||||
- Search and item pages may each contain multiple partial payloads, making candidate ranking important.
|
more stable than exact nested object paths but still not guaranteed.
|
||||||
- Fallback rendered-HTML extraction may be noisier than bootstrap decoding and needs clear precedence rules.
|
- Search and item pages may each contain multiple partial payloads, making candidate
|
||||||
- Live fixtures can drift from production quickly, so tests must model route semantics rather than exact one-off payloads where possible.
|
ranking important.
|
||||||
|
- Fallback rendered-HTML extraction may be noisier than bootstrap decoding and needs
|
||||||
|
clear precedence rules.
|
||||||
|
- Live fixtures can drift from production quickly, so tests must model route semantics
|
||||||
|
rather than exact one-off payloads where possible.
|
||||||
|
|
||||||
## Rollout Notes
|
## Rollout Notes
|
||||||
|
|
||||||
The code, fixtures, and tests should change together.
|
The code, fixtures, and tests should change together.
|
||||||
There should be no mixed state where the implementation is Comet-aware but the tests still encode `marketplace_product_details_page` as the primary contract.
|
There should be no mixed state where the implementation is Comet-aware but the tests
|
||||||
|
still encode `marketplace_product_details_page` as the primary contract.
|
||||||
|
|||||||
@@ -2,15 +2,18 @@
|
|||||||
|
|
||||||
## Summary
|
## Summary
|
||||||
|
|
||||||
Add an optional shared result mode across Facebook, eBay, and Kijiji that moves suspiciously cheap listings out of the main results into a separate `unstableResults` bucket.
|
Add an optional shared result mode across Facebook, eBay, and Kijiji that moves
|
||||||
Listings are considered unstable when their price is more than 20% below the median price of the scraper's priced search results.
|
suspiciously cheap listings out of the main results into a separate `unstableResults`
|
||||||
|
bucket. Listings are considered unstable when their price is more than 20% below the
|
||||||
|
median price of the scraper’s priced search results.
|
||||||
|
|
||||||
## Goals
|
## Goals
|
||||||
|
|
||||||
- Support the same optional unstable-listing mode across all scrapers.
|
- Support the same optional unstable-listing mode across all scrapers.
|
||||||
- Keep current default scraper and route behavior unchanged unless the mode is enabled.
|
- Keep current default scraper and route behavior unchanged unless the mode is enabled.
|
||||||
- Hide unstable listings from the main results while still returning them separately.
|
- Hide unstable listings from the main results while still returning them separately.
|
||||||
- Implement the rule once in shared core code instead of duplicating marketplace-specific logic.
|
- Implement the rule once in shared core code instead of duplicating
|
||||||
|
marketplace-specific logic.
|
||||||
- Document the option in MCP tool descriptions so callers can discover it.
|
- Document the option in MCP tool descriptions so callers can discover it.
|
||||||
|
|
||||||
## Non-Goals
|
## Non-Goals
|
||||||
@@ -24,7 +27,8 @@ Listings are considered unstable when their price is more than 20% below the med
|
|||||||
|
|
||||||
`packages/core` currently returns plain arrays from scraper search functions.
|
`packages/core` currently returns plain arrays from scraper search functions.
|
||||||
`packages/api-server` forwards those scraper results directly from marketplace routes.
|
`packages/api-server` forwards those scraper results directly from marketplace routes.
|
||||||
`packages/mcp-server` documents search tools per marketplace, but does not expose or describe any result-stability mode.
|
`packages/mcp-server` documents search tools per marketplace, but does not expose or
|
||||||
|
describe any result-stability mode.
|
||||||
|
|
||||||
There is no shared result-classification utility today.
|
There is no shared result-classification utility today.
|
||||||
Price filtering exists in some scrapers, but not a cross-marketplace median-based split.
|
Price filtering exists in some scrapers, but not a cross-marketplace median-based split.
|
||||||
@@ -33,11 +37,14 @@ Price filtering exists in some scrapers, but not a cross-marketplace median-base
|
|||||||
|
|
||||||
Use a shared core utility plus per-route and per-tool opt-in.
|
Use a shared core utility plus per-route and per-tool opt-in.
|
||||||
|
|
||||||
The shared utility will accept parsed listings, compute the median from valid positive prices, and split the data into `results` and `unstableResults`.
|
The shared utility will accept parsed listings, compute the median from valid positive
|
||||||
Each scraper will opt into that utility when the caller enables unstable-listing mode.
|
prices, and split the data into `results` and `unstableResults`. Each scraper will opt
|
||||||
API routes and MCP tools will expose the same optional mode so the feature is consistently available everywhere scraper search is surfaced.
|
into that utility when the caller enables unstable-listing mode.
|
||||||
|
API routes and MCP tools will expose the same optional mode so the feature is
|
||||||
|
consistently available everywhere scraper search is surfaced.
|
||||||
|
|
||||||
This keeps the heuristic centralized, minimizes duplicated logic, and preserves existing consumers by leaving the default path unchanged.
|
This keeps the heuristic centralized, minimizes duplicated logic, and preserves existing
|
||||||
|
consumers by leaving the default path unchanged.
|
||||||
|
|
||||||
## Design
|
## Design
|
||||||
|
|
||||||
@@ -48,14 +55,16 @@ Add a shared utility in `packages/core` for listing stability classification.
|
|||||||
Responsibilities:
|
Responsibilities:
|
||||||
|
|
||||||
- accept parsed listing arrays with `listingPrice.cents`
|
- accept parsed listing arrays with `listingPrice.cents`
|
||||||
- ignore listings whose price is missing, non-numeric, or non-positive when computing the median
|
- ignore listings whose price is missing, non-numeric, or non-positive when computing
|
||||||
|
the median
|
||||||
- compute the median price from valid priced listings
|
- compute the median price from valid priced listings
|
||||||
- classify listings as unstable when `listingPrice.cents < median * 0.8`
|
- classify listings as unstable when `listingPrice.cents < median * 0.8`
|
||||||
- return an object with:
|
- return an object with:
|
||||||
- `results`: listings that remain in the main bucket
|
- `results`: listings that remain in the main bucket
|
||||||
- `unstableResults`: listings moved out of the main bucket
|
- `unstableResults`: listings moved out of the main bucket
|
||||||
|
|
||||||
Listings excluded from median computation because their price is missing or non-positive remain in `results` unchanged.
|
Listings excluded from median computation because their price is missing or non-positive
|
||||||
|
remain in `results` unchanged.
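
The responsibilities above can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not the shipped utility: the names `PricedListing`, `validCents`, and `splitByMedian` are hypothetical, and only the `listingPrice.cents` field and the `results` / `unstableResults` keys come from this design.

```ts
// Hypothetical minimal listing shape; real scrapers use richer subtypes.
interface PricedListing {
  listingPrice?: { cents?: unknown };
}

interface StabilitySplit<T> {
  results: T[];
  unstableResults: T[];
}

// Returns the price in cents when it is a finite, positive number, else null.
function validCents(listing: PricedListing): number | null {
  const cents = listing.listingPrice?.cents;
  return typeof cents === "number" && Number.isFinite(cents) && cents > 0
    ? cents
    : null;
}

// Median of an ascending, non-empty array (mean of the two middle values when even).
function median(sorted: number[]): number {
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical name: splits listings using the strict `cents < median * 0.8` rule.
function splitByMedian<T extends PricedListing>(listings: T[]): StabilitySplit<T> {
  const prices = listings
    .map(validCents)
    .filter((c): c is number => c !== null)
    .sort((a, b) => a - b);

  // No valid positive prices: everything stays in the main bucket.
  if (prices.length === 0) {
    return { results: [...listings], unstableResults: [] };
  }

  const cutoff = median(prices) * 0.8;
  const results: T[] = [];
  const unstableResults: T[] = [];
  for (const listing of listings) {
    const cents = validCents(listing);
    // Unpriced or non-positive listings are never classified as unstable.
    if (cents !== null && cents < cutoff) unstableResults.push(listing);
    else results.push(listing);
  }
  return { results, unstableResults };
}
```

With a single valid price the median equals that price, so the strict comparison can never move it out of `results`.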
|
||||||
|
|
||||||
### Scraper Integration
|
### Scraper Integration
|
||||||
|
|
||||||
@@ -68,7 +77,8 @@ Default behavior:
|
|||||||
Opt-in behavior:
|
Opt-in behavior:
|
||||||
|
|
||||||
- run the shared classification utility after parsing search results
|
- run the shared classification utility after parsing search results
|
||||||
- classify before final result limiting so unstable items do not consume main-result slots
|
- classify before final result limiting so unstable items do not consume main-result
|
||||||
|
slots
|
||||||
- return an object shaped like:
|
- return an object shaped like:
|
||||||
|
|
||||||
```ts
|
```ts
|
||||||
@@ -82,7 +92,8 @@ Each scraper will use its existing concrete listing subtype for these arrays.
|
|||||||
|
|
||||||
### API Surface
|
### API Surface
|
||||||
|
|
||||||
Marketplace API routes will expose an optional query parameter for unstable-listing mode.
|
Marketplace API routes will expose an optional query parameter for unstable-listing
|
||||||
|
mode.
|
||||||
|
|
||||||
Requirements:
|
Requirements:
|
||||||
|
|
||||||
@@ -90,7 +101,8 @@ Requirements:
|
|||||||
- when enabled, return the object payload with `results` and `unstableResults`
|
- when enabled, return the object payload with `results` and `unstableResults`
|
||||||
- use the same semantics across Facebook, eBay, and Kijiji routes
|
- use the same semantics across Facebook, eBay, and Kijiji routes
|
||||||
|
|
||||||
The exact parameter name should be consistent across routes and intentionally describe the behavior, for example `unstableFilter=true`.
|
The exact parameter name should be consistent across routes and intentionally describe
|
||||||
|
the behavior, for example `unstableFilter=true`.
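
As a rough sketch of the opt-in switch, a route could read the flag once and pick the payload shape from it. The helper names `isUnstableFilterEnabled` and `shapePayload` are hypothetical, and treating only the literal string `true` as enabled is an assumption rather than something this design fixes.

```ts
// Response shape: plain array by default, object payload when the mode is enabled.
type SearchPayload<T> = T[] | { results: T[]; unstableResults: T[] };

// Hypothetical helper: only the exact value "true" turns the mode on.
function isUnstableFilterEnabled(searchParams: URLSearchParams): boolean {
  return searchParams.get("unstableFilter") === "true";
}

// Hypothetical helper: keeps the default path unchanged for existing consumers.
function shapePayload<T>(
  enabled: boolean,
  results: T[],
  unstableResults: T[],
): SearchPayload<T> {
  return enabled ? { results, unstableResults } : results;
}
```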
|
||||||
|
|
||||||
### MCP Surface
|
### MCP Surface
|
||||||
|
|
||||||
@@ -100,34 +112,43 @@ Tool descriptions should explicitly document:
|
|||||||
|
|
||||||
- that the option is optional
|
- that the option is optional
|
||||||
- that it moves listings priced more than 20% below the median into `unstableResults`
|
- that it moves listings priced more than 20% below the median into `unstableResults`
|
||||||
- that enabling it changes the response shape from a plain list to an object with `results` and `unstableResults`
|
- that enabling it changes the response shape from a plain list to an object with
|
||||||
|
`results` and `unstableResults`
|
||||||
- that the behavior is available for Facebook, eBay, and Kijiji search tools
|
- that the behavior is available for Facebook, eBay, and Kijiji search tools
|
||||||
|
|
||||||
The wording should be aligned across all three tools so the feature reads as one shared capability.
|
The wording should be aligned across all three tools so the feature reads as one shared
|
||||||
|
capability.
|
||||||
|
|
||||||
### Error Handling
|
### Error Handling
|
||||||
|
|
||||||
The unstable-listing mode should be best-effort and non-failing.
|
The unstable-listing mode should be best-effort and non-failing.
|
||||||
|
|
||||||
- If there are no valid positive prices, return all listings in `results` and an empty `unstableResults` array.
|
- If there are no valid positive prices, return all listings in `results` and an empty
|
||||||
|
`unstableResults` array.
|
||||||
- If there is only one valid priced listing, do not classify it as unstable.
|
- If there is only one valid priced listing, do not classify it as unstable.
|
||||||
- Parsing failures remain governed by existing scraper behavior; the classification layer should not introduce new scraper-specific errors.
|
- Parsing failures remain governed by existing scraper behavior; the classification
|
||||||
|
layer should not introduce new scraper-specific errors.
|
||||||
|
|
||||||
### Testing Strategy
|
### Testing Strategy
|
||||||
|
|
||||||
Follow TDD.
|
Follow TDD. Start with shared utility tests, then wire the option through scraper and
|
||||||
Start with shared utility tests, then wire the option through scraper and route tests.
|
route tests.
|
||||||
|
|
||||||
Coverage targets:
|
Coverage targets:
|
||||||
|
|
||||||
1. Median calculation for odd-sized valid price sets.
|
1. Median calculation for odd-sized valid price sets.
|
||||||
2. Median calculation for even-sized valid price sets.
|
2. Median calculation for even-sized valid price sets.
|
||||||
3. Strict cutoff behavior where only listings with `price < median * 0.8` move to `unstableResults`.
|
3. Strict cutoff behavior where only listings with `price < median * 0.8` move to
|
||||||
4. Missing, invalid, zero, or negative prices are excluded from median computation and remain in `results`.
|
`unstableResults`.
|
||||||
|
4. Missing, invalid, zero, or negative prices are excluded from median computation and
|
||||||
|
remain in `results`.
|
||||||
5. Default scraper behavior still returns plain arrays when the option is disabled.
|
5. Default scraper behavior still returns plain arrays when the option is disabled.
|
||||||
6. Enabled scraper behavior returns `{ results, unstableResults }` for Facebook, eBay, and Kijiji.
|
6. Enabled scraper behavior returns `{ results, unstableResults }` for Facebook, eBay,
|
||||||
7. API routes preserve existing response shapes by default and switch to the object payload only when enabled.
|
and Kijiji.
|
||||||
8. MCP tool metadata documents the new optional mode for all three marketplace search tools.
|
7. API routes preserve existing response shapes by default and switch to the object
|
||||||
|
payload only when enabled.
|
||||||
|
8. MCP tool metadata documents the new optional mode for all three marketplace search
|
||||||
|
tools.
|
||||||
|
|
||||||
Verification target after implementation:
|
Verification target after implementation:
|
||||||
|
|
||||||
@@ -138,11 +159,15 @@ Verification target after implementation:
|
|||||||
|
|
||||||
## Risks
|
## Risks
|
||||||
|
|
||||||
- The optional mode introduces a union return shape for scraper callers, which can ripple into downstream TypeScript signatures.
|
- The optional mode introduces a union return shape for scraper callers, which can
|
||||||
- Applying classification before final limiting changes which items appear in the main bucket compared with a naive post-limit split.
|
ripple into downstream TypeScript signatures.
|
||||||
- Kijiji and eBay may have different mixes of priced and unpriced results, so excluding non-positive prices from the median must remain explicit and tested.
|
- Applying classification before final limiting changes which items appear in the main
|
||||||
|
bucket compared with a naive post-limit split.
|
||||||
|
- Kijiji and eBay may have different mixes of priced and unpriced results, so excluding
|
||||||
|
non-positive prices from the median must remain explicit and tested.
|
||||||
|
|
||||||
## Rollout Notes
|
## Rollout Notes
|
||||||
|
|
||||||
Land the shared classifier, scraper wiring, route wiring, tests, and MCP description updates together.
|
Land the shared classifier, scraper wiring, route wiring, tests, and MCP description
|
||||||
That avoids a partial rollout where the feature exists in one surface but is undocumented or inconsistent elsewhere.
|
updates together. That avoids a partial rollout where the feature exists in one surface
|
||||||
|
but is undocumented or inconsistent elsewhere.
|
||||||
|
|||||||
@@ -0,0 +1,44 @@
|
|||||||
|
# Live Parser Tests Design
|
||||||
|
|
||||||
|
## Summary
|
||||||
|
|
||||||
|
Add explicit live endpoint tests for each core scraper parser path.
|
||||||
|
These tests are excluded from normal deterministic test commands and run only through a
|
||||||
|
dedicated package script.
|
||||||
|
|
||||||
|
## Scope
|
||||||
|
|
||||||
|
- Add one live suite per parser: eBay, Kijiji, Facebook.
|
||||||
|
- Place suites under `packages/core/test/live/` so normal
|
||||||
|
`bun test packages/core/test/*.test.ts` patterns do not include them accidentally.
|
||||||
|
- Add a root `test:live` script that runs all live suites together.
|
||||||
|
- Keep existing mocked tests unchanged.
|
||||||
|
|
||||||
|
## Behavior
|
||||||
|
|
||||||
|
- Each suite calls the public scraper entry point for that marketplace with a narrow
|
||||||
|
query and low max item count.
|
||||||
|
- Assertions verify scrape output shape and parser viability, not exact listing
|
||||||
|
identity.
|
||||||
|
- eBay and Kijiji require live network access and fail on endpoint/parser breakage.
|
||||||
|
- Facebook is strict: missing or expired `FACEBOOK_COOKIE` fails the live suite instead
|
||||||
|
of skipping.
|
||||||
|
|
||||||
|
## Test Data
|
||||||
|
|
||||||
|
- Use stable broad Canadian queries such as `iphone` or `laptop` to reduce empty-result
|
||||||
|
risk.
|
||||||
|
- Use low limits to avoid unnecessary load and rate-limit pressure.
|
||||||
|
- Avoid exact prices, titles, listing IDs, or ordering assumptions.
|
||||||
|
|
||||||
|
## Failure Meaning
|
||||||
|
|
||||||
|
- Empty result arrays fail because live parser logic did not produce usable listings.
|
||||||
|
- Missing required fields fail because adapter contracts depend on those fields.
|
||||||
|
- Authentication failures fail the Facebook suite because the selected scope is strict.
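
The failure rules above could be collected by a small shape checker that each live suite calls on the scraper output. This is only a sketch: `findLiveFailures` is a hypothetical helper, and the required field names `title` and `url` are assumptions standing in for whatever the adapter contracts actually require.

```ts
// Assumed required fields; the real adapter contracts may list different ones.
const REQUIRED_FIELDS = ["title", "url"] as const;

// Hypothetical helper: returns human-readable failures, empty when the output is usable.
function findLiveFailures(listings: unknown): string[] {
  if (!Array.isArray(listings)) return ["scraper did not return an array"];
  if (listings.length === 0) return ["empty result array"];

  const failures: string[] = [];
  listings.forEach((item, index) => {
    const record = (item ?? {}) as Record<string, unknown>;
    for (const field of REQUIRED_FIELDS) {
      const value = record[field];
      // Only presence and type are checked, never exact titles, prices, or ordering.
      if (typeof value !== "string" || value.trim() === "") {
        failures.push(`listing ${index}: missing ${field}`);
      }
    }
  });
  return failures;
}
```

A live suite would then assert that the returned array is empty, which keeps the assertions independent of listing identity.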
|
||||||
|
|
||||||
|
## Verification
|
||||||
|
|
||||||
|
- Normal suite remains offline: `bun test packages/core/test`.
|
||||||
|
- Live suite runs by explicit script: `bun run test:live`.
|
||||||
|
- Full static checks remain via `bun run ci`.
|
||||||
@@ -0,0 +1,173 @@
|
|||||||
|
# Facebook Marketplace Anti-Bot Challenge Solver Design
|
||||||
|
|
||||||
|
## Summary
|
||||||
|
|
||||||
|
Add a challenge-detection and challenge-solving layer to the Facebook Marketplace
|
||||||
|
scraper so it can handle anti-bot gates (checkpoint pages, token rotation, cookie
|
||||||
|
requirements) programmatically.
|
||||||
|
Build the solver in pure Bun — no browser automation in production.
|
||||||
|
Use `agent-browser` only for one-time debug reconnaissance.
|
||||||
|
|
||||||
|
## Goals
|
||||||
|
|
||||||
|
- Identify which anti-bot challenge(s) Facebook Marketplace triggers against
|
||||||
|
programmatic HTTP requests.
|
||||||
|
- Implement detection + solving for each discovered challenge type.
|
||||||
|
- Wire the solver into `fetchFacebookItems` and `fetchFacebookItem` so challenges are
|
||||||
|
handled transparently.
|
||||||
|
- Follow the same pattern as the existing `ebay-challenge.ts` (detect → solve → retry
|
||||||
|
with clearance).
|
||||||
|
- Zero browser automation at runtime. Pure `fetch` + `Bun` APIs + npm packages only.
|
||||||
|
|
||||||
|
## Non-Goals
|
||||||
|
|
||||||
|
- Solving login/auth-wall challenges (those require fresh cookies — not solvable
|
||||||
|
programmatically).
|
||||||
|
- Full account login automation (cookies must be provided by the user).
|
||||||
|
- Browser-based scraping or Puppeteer/Playwright integration.
|
||||||
|
- Solving challenges for non-Marketplace Facebook endpoints.
|
||||||
|
|
||||||
|
## Current State
|
||||||
|
|
||||||
|
The Facebook scraper (`packages/core/src/scrapers/facebook.ts`) fetches Marketplace
|
||||||
|
search and item pages via authenticated `fetch` with cookies from the `FACEBOOK_COOKIE` env
|
||||||
|
var. It:
|
||||||
|
|
||||||
|
- Sends a browser-like header set (`sec-ch-ua`, `user-agent`, etc.)
|
||||||
|
- Parses SSR HTML for embedded JSON in script tags
|
||||||
|
- Has no challenge detection — if Facebook returns a challenge page, the scraper
|
||||||
|
silently fails (no listings parsed, classifies as “unknown”)
|
||||||
|
- Depends entirely on cookie freshness
|
||||||
|
|
||||||
|
The eBay scraper already follows the challenge-solver pattern in this codebase:
|
||||||
|
`ebay.ts` uses `warmEbaySession()`, `isChallengeRedirect()`, `isChallengeHtml()`, and
|
||||||
|
`solveEbayChallenge()` from `ebay-challenge.ts`.
|
||||||
|
|
||||||
|
## Chosen Approach
|
||||||
|
|
||||||
|
**Reconnaissance-first development:**
|
||||||
|
|
||||||
|
1. Use `agent-browser` (debug only) to capture a real Facebook Marketplace browsing
|
||||||
|
session via HAR.
|
||||||
|
2. Probe programmatic `fetch` to see what Facebook returns without a browser.
|
||||||
|
3. Diff the two to identify the gap (missing headers? missing cookies? missing JS
|
||||||
|
execution?).
|
||||||
|
4. Build a modular solver in `packages/core/src/utils/facebook-challenge.ts` that
|
||||||
|
detects each challenge type and applies the appropriate fix.
|
||||||
|
5. Wire it into `facebook.ts` following the eBay pattern.
|
||||||
|
|
||||||
|
## Design
|
||||||
|
|
||||||
|
### File Plan
|
||||||
|
|
||||||
|
| File | Purpose |
|
||||||
|
| --- | --- |
|
||||||
|
| `packages/core/src/utils/facebook-challenge.ts` | Challenge detection, solving, and cookie/session utilities |
|
||||||
|
| `packages/core/src/scrapers/facebook.ts` | Modified: warmup, challenge detection before parsing, retry loop |
|
||||||
|
| `packages/core/test/facebook-challenge.test.ts` | Unit tests with mock challenge HTML fixtures |
|
||||||
|
|
||||||
|
### Flow
|
||||||
|
|
||||||
|
```
|
||||||
|
fetchFacebookItems(searchUrl)
|
||||||
|
├── warmFacebookSession() → GET facebook.com/ (collect datr + Akamai cookies)
|
||||||
|
├── fetchHtml(searchUrl) → receives response
|
||||||
|
├── detectFacebookChallenge(response)
|
||||||
|
│ ├── checkpoint/challenge HTML → solveCheckpointChallenge()
|
||||||
|
│ ├── redirect to /login → fail (cookies expired)
|
||||||
|
│ ├── missing required cookies → regenerate session
|
||||||
|
│ ├── 429 rate limit → backoff + retry (existing http.ts handles this)
|
||||||
|
│ └── no challenge → proceed to parsing
|
||||||
|
├── if solveCheckpointChallenge succeeds → retry fetchHtml with clearance cookie
|
||||||
|
└── parse results
|
||||||
|
```
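
The flow above can be sketched with injected dependencies so the control path is testable without network access. Everything here is illustrative: `FlowDeps`, `PageResponse`, and `fetchWithChallengeRetry` are hypothetical names, the two challenge labels are placeholders, and the real wiring in `facebook.ts` may differ once reconnaissance is done.

```ts
type Cookies = Record<string, string>;

interface PageResponse {
  status: number;
  url: string;
  html: string;
}

// Placeholder labels; the real challenge types are deferred to reconnaissance.
type Challenge = "checkpoint" | "login" | null;

// Hypothetical seams standing in for warmup, fetch, detection, and solving.
interface FlowDeps {
  warm(): Promise<Cookies>;
  fetchPage(url: string, cookies: Cookies): Promise<PageResponse>;
  detect(res: PageResponse): Challenge;
  solve(
    res: PageResponse,
    cookies: Cookies,
  ): Promise<{ solved: boolean; cookies?: Cookies }>;
}

// Returns the page HTML to parse, or null when the challenge could not be cleared.
async function fetchWithChallengeRetry(
  url: string,
  deps: FlowDeps,
): Promise<string | null> {
  let cookies = await deps.warm();
  let res = await deps.fetchPage(url, cookies);

  const challenge = deps.detect(res);
  if (challenge === null) return res.html; // no challenge: proceed to parsing
  if (challenge === "login") return null; // expired cookies are not solvable here

  const outcome = await deps.solve(res, cookies);
  if (!outcome.solved) return null;

  // Retry once with the clearance cookies merged into the jar.
  cookies = { ...cookies, ...outcome.cookies };
  res = await deps.fetchPage(url, cookies);
  return deps.detect(res) === null ? res.html : null;
}
```

Returning `null` instead of throwing matches the non-throwing fallback described under Error Handling.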
|
||||||
|
|
||||||
|
### Challenge Types (to be confirmed by reconnaissance)
|
||||||
|
|
||||||
|
| Type | Expected Signal | Solving Strategy |
|
||||||
|
| --- | --- | --- |
|
||||||
|
| Login wall | Redirect to `/login` or HTML `"You must log in"` | Fail — user must provide fresh cookies |
|
||||||
|
| Checkpoint page | HTML contains `checkpoint` or `challenge` path | Parse hidden form fields, compute proof-of-work if present, submit to the answer endpoint |
|
||||||
|
| `datr` cookie missing | No `datr` in cookie jar → request fails | Fetch homepage first to obtain `datr` (session warmup) |
|
||||||
|
| DTSG token needed | Form submissions fail with CSRF error | Extract `fb_dtsg` from page HTML, include in request body |
|
||||||
|
| GraphQL header check | Request blocked without internal headers | Extract `x-fb-friendly-name` from browser HAR, replicate |
|
||||||
|
| Akamai/bot-manager | Redirect loops or blank pages without Akamai cookies | Homepage warmup to collect `bm_sv`, `bm_mi`, etc. |
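
A detector covering a few of the signals in the table might look like the sketch below. It is a guess pending reconnaissance: `detectChallengeSketch` is a hypothetical stand-in for `detectFacebookChallenge`, it omits the `headers` argument, and the string and path checks simply mirror the "Expected Signal" column rather than confirmed Facebook behavior.

```ts
// Subset of the challenge types from the table; labels are illustrative.
type ChallengeType = "login_wall" | "checkpoint" | "rate_limited";

// Extracts the path from an absolute URL, falling back to the raw string.
function pathOf(url: string): string {
  try {
    return new URL(url).pathname;
  } catch {
    return url;
  }
}

// Hypothetical detector: returns null when no known signal is present.
function detectChallengeSketch(
  html: string,
  status: number,
  url: string,
): ChallengeType | null {
  if (status === 429) return "rate_limited";

  const path = pathOf(url);
  if (path.startsWith("/login") || html.includes("You must log in")) {
    return "login_wall";
  }

  // Assumed signal: the final URL or the markup points at a checkpoint/challenge path.
  const challengePath = /\/(checkpoint|challenge)\//;
  if (challengePath.test(path) || challengePath.test(html)) {
    return "checkpoint";
  }
  return null;
}
```

Matching on markup is prone to false positives, which is why the test plan includes a normal search page that must return `null`.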
|
||||||
|
|
||||||
|
### Key Modules
|
||||||
|
|
||||||
|
**`facebook-challenge.ts`:**
|
||||||
|
|
||||||
|
```
|
||||||
|
// Session warmup — fetch homepage to prime cookies
|
||||||
|
warmFacebookSession(): Promise<Record<string, string>>
|
||||||
|
|
||||||
|
// Challenge detection
|
||||||
|
detectFacebookChallenge(html, status, url, headers): ChallengeType | null
|
||||||
|
|
||||||
|
// Checkpoint solver
|
||||||
|
solveCheckpointChallenge(html, cookies): Promise<ChallengeResult>
|
||||||
|
|
||||||
|
// DTSG token extraction
|
||||||
|
extractDtsg(html): string | null
|
||||||
|
|
||||||
|
// Cookie jar management (shared with ebay.ts pattern)
|
||||||
|
mergeCookies(...): Record<string, string>
|
||||||
|
```
|
||||||
|
|
||||||
|
**`ChallengeResult` type:**
|
||||||
|
```ts
|
||||||
|
interface ChallengeResult {
|
||||||
|
solved: boolean;
|
||||||
|
cookies?: Record<string, string>; // clearance cookies to replay
|
||||||
|
token?: string; // challenge response token
|
||||||
|
error?: string; // why it failed
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Error Handling
|
||||||
|
|
||||||
|
- Solver failure → return `ChallengeResult { solved: false, error: "..." }`, scraper
|
||||||
|
logs warning and returns empty results (never throws).
|
||||||
|
- Unrecognized challenge → log the response URL and HTML snippet for future analysis.
|
||||||
|
- Rate limits → handled by existing `http.ts` exponential backoff (no change needed).
|
||||||
|
- Solver timeout → 30s cap on any challenge computation, fall back to `solved: false`.
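
One way to enforce the 30s cap and the never-throws rule is to race the solver against a timer. The wrapper name `withSolverTimeout` is hypothetical, and the `ChallengeResult` interface is repeated only so the sketch stands alone.

```ts
interface ChallengeResult {
  solved: boolean;
  cookies?: Record<string, string>;
  token?: string;
  error?: string;
}

// Hypothetical wrapper: resolves to `solved: false` on timeout or on any solver error.
async function withSolverTimeout(
  solve: () => Promise<ChallengeResult>,
  timeoutMs = 30_000,
): Promise<ChallengeResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<ChallengeResult>((resolve) => {
    timer = setTimeout(
      () => resolve({ solved: false, error: `solver timed out after ${timeoutMs}ms` }),
      timeoutMs,
    );
  });

  try {
    // Whichever settles first wins; a slow solver is simply abandoned.
    return await Promise.race([solve(), timeout]);
  } catch (err) {
    return { solved: false, error: err instanceof Error ? err.message : String(err) };
  } finally {
    clearTimeout(timer);
  }
}
```

Note that `Promise.race` does not cancel the losing solver; a real implementation might also pass an `AbortSignal` into any in-flight requests.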
|
||||||
|
|
||||||
|
### Testing
|
||||||
|
|
||||||
|
| Test | What It Verifies |
|
||||||
|
| --- | --- |
|
||||||
|
| `detectFacebookChallenge` with sample checkpoint HTML | Correctly identifies checkpoint challenge |
|
||||||
|
| `detectFacebookChallenge` with normal search HTML | Returns null (no false positives) |
|
||||||
|
| `detectFacebookChallenge` with login redirect | Identifies the login wall (auth-gated) case |
|
||||||
|
| `solveCheckpointChallenge` with known PoW params | Produces correct answer |
|
||||||
|
| `warmFacebookSession` with mocked fetch | Collects expected cookies |
|
||||||
|
| `extractDtsg` with sample page HTML | Extracts the DTSG token |
|
||||||
|
| Integration: fetch → challenge → solve → retry → results | End-to-end mock flow |
|
||||||
|
| Solver throws → scraper returns empty, no crash | Graceful fallback |
|
||||||
|
| Unrecognized challenge → logs warning, returns empty | No unhandled challenge crashes |
|
||||||
|
|
||||||
|
Test data will use anonymized HTML fixtures (no real user data).
|
||||||
|
|
||||||
|
## Reconnaissance Steps (debug-only, one-time)
|
||||||
|
|
||||||
|
1. **Probe programmatically:** `fetch` Marketplace search with/without cookies, record
|
||||||
|
status code and HTML.
|
||||||
|
2. **Browser session:** `agent-browser` → log into Facebook → navigate Marketplace →
|
||||||
|
record HAR.
|
||||||
|
3. **Diff analysis:** Compare browser request headers vs. our programmatic headers.
|
||||||
|
4. **Cookie inventory:** List all cookies from browser session, identify which are
|
||||||
|
essential.
|
||||||
|
5. **Challenge trigger:** Identify what change in request signature triggers a
|
||||||
|
challenge.
|
||||||
|
6. **Replay test:** Replay browser’s exact request via `fetch` to confirm
|
||||||
|
headers/cookies are the differentiator.
|
||||||
|
|
||||||
|
All reconnaissance artifacts are saved under `docs/facebook-challenge/`.
|
||||||
|
|
||||||
|
## Decisions Deferred to Post-Reconnaissance
|
||||||
|
|
||||||
|
- Exact challenge types and solving strategies (depends on what Facebook actually uses).
|
||||||
|
- Whether a PoW solver, CAPTCHA solver, or token-extraction approach is needed.
|
||||||
|
- npm package dependencies (only add what the reconnaissance proves necessary).
|
||||||
@@ -12,6 +12,7 @@
|
|||||||
"build:mcp": "bun build ./packages/mcp-server/src/index.ts --target=bun --outdir=./dist/mcp --minify",
|
"build:mcp": "bun build ./packages/mcp-server/src/index.ts --target=bun --outdir=./dist/mcp --minify",
|
||||||
"build:all": "bun run build:api && bun run build:mcp",
|
"build:all": "bun run build:api && bun run build:mcp",
|
||||||
"ci": "bun run typecheck && biome check --write",
|
"ci": "bun run typecheck && biome check --write",
|
||||||
|
"test:live": "bun test --cwd packages/core test/live",
|
||||||
"clean": "rm -rf dist",
|
"clean": "rm -rf dist",
|
||||||
"start": "./scripts/start.sh"
|
"start": "./scripts/start.sh"
|
||||||
},
|
},
|
||||||
|
|||||||
@@ -3,6 +3,7 @@ import { logger } from "../logger";
|
|||||||
import {
|
import {
|
||||||
emptySearchResponse,
|
emptySearchResponse,
|
||||||
getRequiredSearchQuery,
|
getRequiredSearchQuery,
|
||||||
|
parseDollarPriceParam,
|
||||||
parseNonNegativeIntegerParam,
|
parseNonNegativeIntegerParam,
|
||||||
} from "./helpers";
|
} from "./helpers";
|
||||||
|
|
||||||
@@ -18,17 +19,11 @@ export async function ebayRoute(req: Request): Promise<Response> {
|
|||||||
return SEARCH_QUERY;
|
return SEARCH_QUERY;
|
||||||
}
|
}
|
||||||
|
|
||||||
const minPrice = parseNonNegativeIntegerParam(
|
const minPrice = parseDollarPriceParam(reqUrl.searchParams, "minPrice");
|
||||||
reqUrl.searchParams,
|
|
||||||
"minPrice",
|
|
||||||
);
|
|
||||||
if (minPrice instanceof Response) {
|
if (minPrice instanceof Response) {
|
||||||
return minPrice;
|
return minPrice;
|
||||||
}
|
}
|
||||||
const maxPrice = parseNonNegativeIntegerParam(
|
const maxPrice = parseDollarPriceParam(reqUrl.searchParams, "maxPrice");
|
||||||
reqUrl.searchParams,
|
|
||||||
"maxPrice",
|
|
||||||
);
|
|
||||||
if (maxPrice instanceof Response) {
|
if (maxPrice instanceof Response) {
|
||||||
return maxPrice;
|
return maxPrice;
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -39,6 +39,23 @@ export function parseNonNegativeIntegerParam(
|
|||||||
return Number(rawValue);
|
return Number(rawValue);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
export function parseDollarPriceParam(
|
||||||
|
searchParams: URLSearchParams,
|
||||||
|
name: string,
|
||||||
|
): number | undefined | Response {
|
||||||
|
const rawValue = searchParams.get(name);
|
||||||
|
if (rawValue === null) {
|
||||||
|
return undefined;
|
||||||
|
}
|
||||||
|
if (!/^\d+(?:\.\d{1,2})?$/.test(rawValue)) {
|
||||||
|
return Response.json(
|
||||||
|
{ message: `Invalid ${name} parameter` },
|
||||||
|
{ status: 400 },
|
||||||
|
);
|
||||||
|
}
|
||||||
|
return Math.round(Number(rawValue) * 100);
|
||||||
|
}
|
||||||
|
|
||||||
export function emptySearchResponse(hint?: string): Response {
|
export function emptySearchResponse(hint?: string): Response {
|
||||||
const message = hint
|
const message = hint
|
||||||
? `Search didn't return any results! ${hint}`
|
? `Search didn't return any results! ${hint}`
|
||||||
|
|||||||
@@ -3,6 +3,7 @@ import { logger } from "../logger";
|
|||||||
import {
|
import {
|
||||||
emptySearchResponse,
|
emptySearchResponse,
|
||||||
getRequiredSearchQuery,
|
getRequiredSearchQuery,
|
||||||
|
parseDollarPriceParam,
|
||||||
parseNonNegativeIntegerParam,
|
parseNonNegativeIntegerParam,
|
||||||
} from "./helpers";
|
} from "./helpers";
|
||||||
|
|
||||||
@@ -26,17 +27,11 @@ export async function kijijiRoute(req: Request): Promise<Response> {
|
|||||||
if (maxPages instanceof Response) {
|
if (maxPages instanceof Response) {
|
||||||
return maxPages;
|
return maxPages;
|
||||||
}
|
}
|
||||||
const priceMin = parseNonNegativeIntegerParam(
|
const priceMin = parseDollarPriceParam(reqUrl.searchParams, "priceMin");
|
||||||
reqUrl.searchParams,
|
|
||||||
"priceMin",
|
|
||||||
);
|
|
||||||
if (priceMin instanceof Response) {
|
if (priceMin instanceof Response) {
|
||||||
return priceMin;
|
return priceMin;
|
||||||
}
|
}
|
||||||
const priceMax = parseNonNegativeIntegerParam(
|
const priceMax = parseDollarPriceParam(reqUrl.searchParams, "priceMax");
|
||||||
reqUrl.searchParams,
|
|
||||||
"priceMax",
|
|
||||||
);
|
|
||||||
if (priceMax instanceof Response) {
|
if (priceMax instanceof Response) {
|
||||||
return priceMax;
|
return priceMax;
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -282,6 +282,24 @@ describe("API routes", () => {
|
|||||||
);
|
);
|
||||||
});
|
});
|
||||||
|
|
||||||
|
test("kijijiRoute forwards dollar price filters to core as cents", async () => {
|
||||||
|
const { kijijiRoute } = await import("../src/routes/kijiji");
|
||||||
|
|
||||||
|
await kijijiRoute(
|
||||||
|
new Request(
|
||||||
|
"http://localhost/api/kijiji?q=laptop&priceMin=999.99&priceMax=1000",
|
||||||
|
),
|
||||||
|
);
|
||||||
|
|
||||||
|
expect(fetchKijijiItems).toHaveBeenCalledWith(
|
||||||
|
"laptop",
|
||||||
|
4,
|
||||||
|
"https://www.kijiji.ca",
|
||||||
|
expect.objectContaining({ priceMin: 99_999, priceMax: 100_000 }),
|
||||||
|
{},
|
||||||
|
);
|
||||||
|
});
|
||||||
|
|
||||||
test("kijijiRoute does not forward unstableFilter when false", async () => {
|
test("kijijiRoute does not forward unstableFilter when false", async () => {
|
||||||
const { kijijiRoute } = await import("../src/routes/kijiji");
|
const { kijijiRoute } = await import("../src/routes/kijiji");
|
||||||
|
|
||||||
@@ -414,6 +432,24 @@ describe("API routes", () => {
|
|||||||
);
|
);
|
||||||
});
|
});
|
||||||
|
|
||||||
|
test("ebayRoute forwards dollar price filters to core as cents", async () => {
|
||||||
|
const { ebayRoute } = await import("../src/routes/ebay");
|
||||||
|
|
||||||
|
fetchEbayItems.mockImplementation(() => Promise.resolve([{ title: "a" }]));
|
||||||
|
|
||||||
|
await ebayRoute(
|
||||||
|
new Request(
|
||||||
|
"http://localhost/api/ebay?q=macbook&minPrice=999.99&maxPrice=1000",
|
||||||
|
),
|
||||||
|
);
|
||||||
|
|
||||||
|
expect(fetchEbayItems).toHaveBeenCalledWith(
|
||||||
|
"macbook",
|
||||||
|
1,
|
||||||
|
expect.objectContaining({ minPrice: 99_999, maxPrice: 100_000 }),
|
||||||
|
);
|
||||||
|
});
|
||||||
|
|
||||||
test("ebayRoute passes through scraper payload unchanged in unstable mode", async () => {
|
test("ebayRoute passes through scraper payload unchanged in unstable mode", async () => {
|
||||||
const { ebayRoute } = await import("../src/routes/ebay");
|
const { ebayRoute } = await import("../src/routes/ebay");
|
||||||
|
|
||||||
@@ -730,16 +766,18 @@ describe("API routes", () => {
|
|||||||
expect(body.message).toBe("Invalid minPrice parameter");
|
expect(body.message).toBe("Invalid minPrice parameter");
|
||||||
});
|
});
|
||||||
|
|
||||||
test("ebayRoute returns 400 for decimal minPrice", async () => {
|
test("ebayRoute accepts decimal minPrice", async () => {
|
||||||
const { ebayRoute } = await import("../src/routes/ebay");
|
const { ebayRoute } = await import("../src/routes/ebay");
|
||||||
|
|
||||||
const response = await ebayRoute(
|
await ebayRoute(
|
||||||
new Request("http://localhost/api/ebay?q=laptop&minPrice=1.5"),
|
new Request("http://localhost/api/ebay?q=laptop&minPrice=1.5"),
|
||||||
);
|
);
|
||||||
|
|
||||||
expect(response.status).toBe(400);
|
expect(fetchEbayItems).toHaveBeenCalledWith(
|
||||||
const body = await response.json();
|
"laptop",
|
||||||
expect(body.message).toBe("Invalid minPrice parameter");
|
1,
|
||||||
|
expect.objectContaining({ minPrice: 150 }),
|
||||||
|
);
|
||||||
});
|
});
|
||||||
|
|
||||||
test("ebayRoute returns 400 for non-integer maxPrice", async () => {
|
test("ebayRoute returns 400 for non-integer maxPrice", async () => {
|
||||||
@@ -766,16 +804,18 @@ describe("API routes", () => {
|
|||||||
expect(body.message).toBe("Invalid maxPrice parameter");
|
expect(body.message).toBe("Invalid maxPrice parameter");
|
||||||
});
|
});
|
||||||
|
|
||||||
test("ebayRoute returns 400 for decimal maxPrice", async () => {
|
test("ebayRoute accepts decimal maxPrice", async () => {
|
||||||
const { ebayRoute } = await import("../src/routes/ebay");
|
const { ebayRoute } = await import("../src/routes/ebay");
|
||||||
|
|
||||||
const response = await ebayRoute(
|
await ebayRoute(
|
||||||
new Request("http://localhost/api/ebay?q=laptop&maxPrice=1.5"),
|
new Request("http://localhost/api/ebay?q=laptop&maxPrice=1.5"),
|
||||||
);
|
);
|
||||||
|
|
||||||
expect(response.status).toBe(400);
|
expect(fetchEbayItems).toHaveBeenCalledWith(
|
||||||
const body = await response.json();
|
"laptop",
|
||||||
expect(body.message).toBe("Invalid maxPrice parameter");
|
1,
|
||||||
|
expect.objectContaining({ maxPrice: 150 }),
|
||||||
|
);
|
||||||
});
|
});
|
||||||
|
|
||||||
test("kijijiRoute returns 400 for decimal maxPages", async () => {
|
test("kijijiRoute returns 400 for decimal maxPages", async () => {
|
||||||
@@ -862,16 +902,20 @@ describe("API routes", () => {
|
|||||||
expect(body.message).toBe("Invalid priceMin parameter");
|
expect(body.message).toBe("Invalid priceMin parameter");
|
||||||
});
|
});
|
||||||
|
|
||||||
test("kijijiRoute returns 400 for decimal priceMin", async () => {
|
test("kijijiRoute accepts decimal priceMin", async () => {
|
||||||
const { kijijiRoute } = await import("../src/routes/kijiji");
|
const { kijijiRoute } = await import("../src/routes/kijiji");
|
||||||
|
|
||||||
const response = await kijijiRoute(
|
await kijijiRoute(
|
||||||
new Request("http://localhost/api/kijiji?q=laptop&priceMin=1.5"),
|
new Request("http://localhost/api/kijiji?q=laptop&priceMin=1.5"),
|
||||||
);
|
);
|
||||||
|
|
||||||
expect(response.status).toBe(400);
|
expect(fetchKijijiItems).toHaveBeenCalledWith(
|
||||||
const body = await response.json();
|
"laptop",
|
||||||
expect(body.message).toBe("Invalid priceMin parameter");
|
4,
|
||||||
|
"https://www.kijiji.ca",
|
||||||
|
expect.objectContaining({ priceMin: 150 }),
|
||||||
|
{},
|
||||||
|
);
|
||||||
});
|
});
|
||||||
|
|
||||||
test("kijijiRoute returns 400 for non-integer priceMin", async () => {
|
test("kijijiRoute returns 400 for non-integer priceMin", async () => {
|
||||||
@@ -934,16 +978,20 @@ describe("API routes", () => {
|
|||||||
expect(body.message).toBe("Invalid priceMax parameter");
|
expect(body.message).toBe("Invalid priceMax parameter");
|
||||||
});
|
});
|
||||||
|
|
||||||
test("kijijiRoute returns 400 for decimal priceMax", async () => {
|
test("kijijiRoute accepts decimal priceMax", async () => {
|
||||||
const { kijijiRoute } = await import("../src/routes/kijiji");
|
const { kijijiRoute } = await import("../src/routes/kijiji");
|
||||||
|
|
||||||
const response = await kijijiRoute(
|
await kijijiRoute(
|
||||||
new Request("http://localhost/api/kijiji?q=laptop&priceMax=1.5"),
|
new Request("http://localhost/api/kijiji?q=laptop&priceMax=1.5"),
|
||||||
);
|
);
|
||||||
|
|
||||||
expect(response.status).toBe(400);
|
expect(fetchKijijiItems).toHaveBeenCalledWith(
|
||||||
const body = await response.json();
|
"laptop",
|
||||||
expect(body.message).toBe("Invalid priceMax parameter");
|
4,
|
||||||
|
"https://www.kijiji.ca",
|
||||||
|
expect.objectContaining({ priceMax: 150 }),
|
||||||
|
{},
|
||||||
|
);
|
||||||
});
|
});
|
||||||
|
|
||||||
test("kijijiRoute returns 400 for non-integer priceMax", async () => {
|
test("kijijiRoute returns 400 for non-integer priceMax", async () => {
|
||||||
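Note on the decimal price tests above: they expect `priceMin=1.5` and `priceMax=1.5` to reach `fetchKijijiItems` as `150`, which implies a dollars-to-cents conversion in the route. The route code itself is not part of this hunk, so the following is only a minimal sketch of such a conversion. `dollarsToCents` is a hypothetical name, and rejecting negative or non-numeric input is an assumption.

```typescript
// Hypothetical helper, not the route's actual implementation.
// Converts a dollar-denominated query value into integer cents.
// Returns null when the value is empty, non-numeric, or negative,
// so the caller can answer with a 400.
function dollarsToCents(raw: string): number | null {
  if (raw.trim() === "") return null;
  const dollars = Number(raw);
  if (!Number.isFinite(dollars) || dollars < 0) return null;
  // Math.round absorbs binary floating-point error in dollars * 100.
  return Math.round(dollars * 100);
}

const priceMinCents = dollarsToCents("1.5");
const invalidPrice = dollarsToCents("abc");
```

A route would call this only when the parameter is present, and treat a `null` result as an invalid parameter.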
|
|||||||
@@ -10,8 +10,14 @@ import {
|
|||||||
type CookieConfig,
|
type CookieConfig,
|
||||||
ensureCookies,
|
ensureCookies,
|
||||||
formatCookiesForHeader,
|
formatCookiesForHeader,
|
||||||
|
loadCookiesOptional,
|
||||||
parseCookieString,
|
parseCookieString,
|
||||||
} from "../utils/cookies";
|
} from "../utils/cookies";
|
||||||
|
import {
|
||||||
|
buildFacebookHeaders,
|
||||||
|
detectFacebookChallenge,
|
||||||
|
warmFacebookSession,
|
||||||
|
} from "../utils/facebook-challenge";
|
||||||
import { formatCentsToCurrency } from "../utils/format";
|
import { formatCentsToCurrency } from "../utils/format";
|
||||||
import { fetchHtml, HttpError, isRecord, RateLimitError } from "../utils/http";
|
import { fetchHtml, HttpError, isRecord, RateLimitError } from "../utils/http";
|
||||||
import { logger } from "../utils/logger";
|
import { logger } from "../utils/logger";
|
||||||
@@ -20,9 +26,10 @@ import { classifyUnstableListings } from "../utils/unstable";
|
|||||||
/**
|
/**
|
||||||
* Facebook Marketplace Scraper
|
* Facebook Marketplace Scraper
|
||||||
*
|
*
|
||||||
* Note: Facebook Marketplace requires authentication cookies for full access.
|
* Facebook Marketplace returns search results without authentication when
|
||||||
* This implementation will return limited or no results without proper authentication.
|
* proper browser headers are sent. Prices and seller details are hidden on
|
||||||
* This is by design to respect Facebook's authentication requirements.
|
* search results but are available on individual item pages even without
|
||||||
|
* auth cookies. For full-price search results, provide FACEBOOK_COOKIE.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
// Facebook cookie configuration
|
// Facebook cookie configuration
|
||||||
@@ -263,20 +270,14 @@ function logExtractionMetrics(success: boolean, itemId?: string) {
|
|||||||
// ----------------------------- HTTP Client -----------------------------
|
// ----------------------------- HTTP Client -----------------------------
|
||||||
|
|
||||||
function createFacebookHeaders(cookies: string): Record<string, string> {
|
function createFacebookHeaders(cookies: string): Record<string, string> {
|
||||||
return {
|
const jar: Record<string, string> = {};
|
||||||
accept:
|
if (cookies) {
|
||||||
"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
|
for (const pair of cookies.split(";")) {
|
||||||
"accept-language": "en-GB,en-US;q=0.9,en;q=0.8",
|
const [name, ...rest] = pair.trim().split("=");
|
||||||
"cache-control": "no-cache",
|
if (name && rest.length > 0) jar[name.trim()] = rest.join("=").trim();
|
||||||
"upgrade-insecure-requests": "1",
|
}
|
||||||
"sec-fetch-dest": "document",
|
}
|
||||||
"sec-fetch-mode": "navigate",
|
return buildFacebookHeaders(jar);
|
||||||
"sec-fetch-site": "none",
|
|
||||||
"sec-fetch-user": "?1",
|
|
||||||
"user-agent":
|
|
||||||
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
|
|
||||||
cookie: cookies,
|
|
||||||
};
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// ----------------------------- Parsing -----------------------------
|
// ----------------------------- Parsing -----------------------------
|
||||||
@@ -286,13 +287,29 @@ export type FacebookResponseKind =
|
|||||||
| "item"
|
| "item"
|
||||||
| "auth_gated"
|
| "auth_gated"
|
||||||
| "unavailable"
|
| "unavailable"
|
||||||
|
| "checkpoint"
|
||||||
| "unknown";
|
| "unknown";
|
||||||
|
|
||||||
export function classifyFacebookResponse(
|
export function classifyFacebookResponse(
|
||||||
htmlString: HTMLString,
|
htmlString: HTMLString,
|
||||||
responseUrl: string,
|
responseUrl: string,
|
||||||
|
status = 200,
|
||||||
) {
|
) {
|
||||||
|
const challengeType = detectFacebookChallenge(
|
||||||
|
status,
|
||||||
|
htmlString,
|
||||||
|
responseUrl,
|
||||||
|
);
|
||||||
|
if (challengeType === "checkpoint") {
|
||||||
|
return {
|
||||||
|
kind: "checkpoint" as const,
|
||||||
|
authGated: false,
|
||||||
|
unavailable: false,
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
const authGated =
|
const authGated =
|
||||||
|
challengeType === "login_wall" ||
|
||||||
responseUrl.includes("/login/") ||
|
responseUrl.includes("/login/") ||
|
||||||
htmlString.includes("You must log in") ||
|
htmlString.includes("You must log in") ||
|
||||||
htmlString.includes("log in to continue");
|
htmlString.includes("log in to continue");
|
||||||
@@ -764,6 +781,22 @@ export function extractFacebookItemData(
|
|||||||
return bestMatch.item;
|
return bestMatch.item;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Try marketplace_product_details_page.target path (current item page structure)
|
||||||
|
for (const candidate of candidates) {
|
||||||
|
const detailsPage = findKeyInObject(
|
||||||
|
candidate,
|
||||||
|
"marketplace_product_details_page",
|
||||||
|
) as Record<string, unknown> | undefined;
|
||||||
|
const target = detailsPage?.target as Record<string, unknown> | undefined;
|
||||||
|
if (
|
||||||
|
target &&
|
||||||
|
typeof target.id === "string" &&
|
||||||
|
typeof target.marketplace_listing_title === "string"
|
||||||
|
) {
|
||||||
|
return target as unknown as FacebookMarketplaceItem;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
if (htmlString.includes("XCometMarketplacePermalinkController")) {
|
if (htmlString.includes("XCometMarketplacePermalinkController")) {
|
||||||
return extractFacebookItemHtmlFallback(htmlString);
|
return extractFacebookItemHtmlFallback(htmlString);
|
||||||
}
|
}
|
||||||
@@ -771,6 +804,25 @@ export function extractFacebookItemData(
|
|||||||
return null;
|
return null;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function findKeyInObject(obj: unknown, targetKey: string): unknown {
|
||||||
|
if (obj == null) return undefined;
|
||||||
|
if (Array.isArray(obj)) {
|
||||||
|
for (const item of obj) {
|
||||||
|
const found = findKeyInObject(item, targetKey);
|
||||||
|
if (found !== undefined) return found;
|
||||||
|
}
|
||||||
|
return undefined;
|
||||||
|
}
|
||||||
|
if (typeof obj !== "object") return undefined;
|
||||||
|
const record = obj as Record<string, unknown>;
|
||||||
|
if (targetKey in record) return record[targetKey];
|
||||||
|
for (const [, value] of Object.entries(record)) {
|
||||||
|
const found = findKeyInObject(value, targetKey);
|
||||||
|
if (found !== undefined) return found;
|
||||||
|
}
|
||||||
|
return undefined;
|
||||||
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
Parse Facebook marketplace search results into ListingDetails[]
|
Parse Facebook marketplace search results into ListingDetails[]
|
||||||
*/
|
*/
|
||||||
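Note on `findKeyInObject` above: it searches depth-first and returns the first value stored under the key, checking an object's own keys before descending into its values. The sketch below repeats the function from the diff so that it runs on its own. The `candidate` payload is a made-up, simplified shape for illustration, not a captured Facebook response.

```typescript
// Copied from the diff above so the example is self-contained.
function findKeyInObject(obj: unknown, targetKey: string): unknown {
  if (obj == null) return undefined;
  if (Array.isArray(obj)) {
    for (const item of obj) {
      const found = findKeyInObject(item, targetKey);
      if (found !== undefined) return found;
    }
    return undefined;
  }
  if (typeof obj !== "object") return undefined;
  const record = obj as Record<string, unknown>;
  if (targetKey in record) return record[targetKey];
  for (const [, value] of Object.entries(record)) {
    const found = findKeyInObject(value, targetKey);
    if (found !== undefined) return found;
  }
  return undefined;
}

// Hypothetical payload: the nesting is illustrative only.
const candidate = {
  require: [
    [
      "SomeModule",
      {
        data: {
          marketplace_product_details_page: {
            target: { id: "123", marketplace_listing_title: "Desk" },
          },
        },
      },
    ],
  ],
};

const detailsPage = findKeyInObject(
  candidate,
  "marketplace_product_details_page",
) as Record<string, unknown> | undefined;
const target = detailsPage?.target as Record<string, unknown> | undefined;
```

Because the search stops at the first match, a key that appears at several depths resolves to whichever occurrence the traversal reaches first.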
@@ -1027,16 +1079,18 @@ export default async function fetchFacebookItems(
|
|||||||
};
|
};
|
||||||
};
|
};
|
||||||
|
|
||||||
const cookies = await ensureFacebookCookies();
|
const warmupCookies = await warmFacebookSession();
|
||||||
|
const warmupHeader = Object.entries(warmupCookies)
|
||||||
|
.map(([k, v]) => `${k}=${v}`)
|
||||||
|
.join("; ");
|
||||||
|
|
||||||
|
const userCookies = await loadCookiesOptional(FACEBOOK_COOKIE_CONFIG);
|
||||||
|
|
||||||
// Format cookies for HTTP header
|
|
||||||
const domain = "www.facebook.com";
|
const domain = "www.facebook.com";
|
||||||
const cookiesHeader = formatCookiesForHeader(cookies, domain);
|
const userCookiesHeader = formatCookiesForHeader(userCookies, domain);
|
||||||
if (!cookiesHeader) {
|
const cookiesHeader = [warmupHeader, userCookiesHeader]
|
||||||
throw new Error(
|
.filter(Boolean)
|
||||||
"No valid Facebook cookies found. Please check that cookies are not expired and apply to facebook.com domain.",
|
.join("; ");
|
||||||
);
|
|
||||||
}
|
|
||||||
|
|
||||||
const DELAY_MS = Math.max(1, Math.floor(1000 / requestsPerSecond));
|
const DELAY_MS = Math.max(1, Math.floor(1000 / requestsPerSecond));
|
||||||
|
|
||||||
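Note on the `cookiesHeader` construction above: joining `warmupHeader` and `userCookiesHeader` with `"; "` keeps both values when the two sources set the same cookie name. How the server treats a repeated name is not shown in this diff. If de-duplication were wanted, one possible sketch is below. `mergeCookieHeaders` is a hypothetical helper, and letting later headers win is an assumption.

```typescript
// Hypothetical helper: merges Cookie header strings, later headers
// overriding earlier ones when a cookie name repeats.
function mergeCookieHeaders(...headers: string[]): string {
  const jar: Record<string, string> = {};
  for (const header of headers) {
    for (const pair of header.split(";")) {
      const trimmed = pair.trim();
      const eqIdx = trimmed.indexOf("=");
      // Skip empty segments and segments without a name.
      if (eqIdx <= 0) continue;
      jar[trimmed.slice(0, eqIdx).trim()] = trimmed.slice(eqIdx + 1).trim();
    }
  }
  return Object.entries(jar)
    .map(([name, value]) => `${name}=${value}`)
    .join("; ");
}

// Made-up cookie values for illustration.
const merged = mergeCookieHeaders("datr=warm; sb=1", "datr=user; c_user=42");
```

Here `merged` is `"datr=user; sb=1; c_user=42"`: the user-supplied `datr` replaces the warmup value while keeping its original position.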
@@ -1047,7 +1101,9 @@ export default async function fetchFacebookItems(
|
|||||||
const searchUrl = `https://www.facebook.com/marketplace/${LOCATION}/search?query=${encodedQuery}&sortBy=creation_time_descend&exact=false`;
|
const searchUrl = `https://www.facebook.com/marketplace/${LOCATION}/search?query=${encodedQuery}&sortBy=creation_time_descend&exact=false`;
|
||||||
|
|
||||||
logger.log(`Fetching Facebook marketplace: ${searchUrl}`);
|
logger.log(`Fetching Facebook marketplace: ${searchUrl}`);
|
||||||
logger.log(`Using ${cookies.length} cookies for authentication`);
|
if (userCookies.length > 0) {
|
||||||
|
logger.log(`Using ${userCookies.length} cookies for authentication`);
|
||||||
|
}
|
||||||
|
|
||||||
let searchHtml: string;
|
let searchHtml: string;
|
||||||
let searchResponseUrl = searchUrl;
|
let searchResponseUrl = searchUrl;
|
||||||
@@ -1100,6 +1156,13 @@ export default async function fetchFacebookItems(
|
|||||||
return finalizeResults([]);
|
return finalizeResults([]);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
if (classification.kind === "checkpoint") {
|
||||||
|
logger.warn(
|
||||||
|
"Facebook marketplace returned a checkpoint challenge. This may require manual verification.",
|
||||||
|
);
|
||||||
|
return finalizeResults([]);
|
||||||
|
}
|
||||||
|
|
||||||
if (classification.unavailable) {
|
if (classification.unavailable) {
|
||||||
logger.warn("Facebook marketplace search returned an unavailable route.");
|
logger.warn("Facebook marketplace search returned an unavailable route.");
|
||||||
return finalizeResults([]);
|
return finalizeResults([]);
|
||||||
@@ -1149,15 +1212,8 @@ export default async function fetchFacebookItems(
|
|||||||
export async function fetchFacebookItem(
|
export async function fetchFacebookItem(
|
||||||
itemId: string,
|
itemId: string,
|
||||||
): Promise<FacebookListingDetails | null> {
|
): Promise<FacebookListingDetails | null> {
|
||||||
const cookies = await ensureFacebookCookies();
|
const userCookies = await loadCookiesOptional(FACEBOOK_COOKIE_CONFIG);
|
||||||
|
const cookiesHeader = formatCookiesForHeader(userCookies, "www.facebook.com");
|
||||||
// Format cookies for HTTP header
|
|
||||||
const cookiesHeader = formatCookiesForHeader(cookies, "www.facebook.com");
|
|
||||||
if (!cookiesHeader) {
|
|
||||||
throw new Error(
|
|
||||||
"No valid Facebook cookies found. Please check that cookies are not expired and apply to facebook.com domain.",
|
|
||||||
);
|
|
||||||
}
|
|
||||||
|
|
||||||
const itemUrl = `https://www.facebook.com/marketplace/item/${itemId}/`;
|
const itemUrl = `https://www.facebook.com/marketplace/item/${itemId}/`;
|
||||||
|
|
||||||
@@ -1230,6 +1286,14 @@ export async function fetchFacebookItem(
|
|||||||
|
|
||||||
const classification = classifyFacebookResponse(itemHtml, itemResponseUrl);
|
const classification = classifyFacebookResponse(itemHtml, itemResponseUrl);
|
||||||
|
|
||||||
|
if (classification.kind === "checkpoint") {
|
||||||
|
logExtractionMetrics(false, itemId);
|
||||||
|
logger.warn(
|
||||||
|
`Checkpoint challenge detected for item ${itemId}. Facebook may be limiting access.`,
|
||||||
|
);
|
||||||
|
return null;
|
||||||
|
}
|
||||||
|
|
||||||
if (classification.authGated) {
|
if (classification.authGated) {
|
||||||
logExtractionMetrics(false, itemId);
|
logExtractionMetrics(false, itemId);
|
||||||
logger.warn(
|
logger.warn(
|
||||||
|
|||||||
128
packages/core/src/utils/facebook-challenge.ts
Normal file
@@ -0,0 +1,128 @@
|
|||||||
|
// Facebook Marketplace session & challenge utilities
|
||||||
|
|
||||||
|
// ------------------ Types ------------------
|
||||||
|
|
||||||
|
export type ChallengeType =
|
||||||
|
| "login_wall"
|
||||||
|
| "checkpoint"
|
||||||
|
| "bad_headers"
|
||||||
|
| "rate_limited"
|
||||||
|
| "none";
|
||||||
|
|
||||||
|
// ------------------ Constants ------------------
|
||||||
|
|
||||||
|
const FACEBOOK_BROWSER_HEADERS: Record<string, string> = {
|
||||||
|
accept:
|
||||||
|
"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
|
||||||
|
"accept-language": "en-GB,en-US;q=0.9,en;q=0.8",
|
||||||
|
"cache-control": "no-cache",
|
||||||
|
"upgrade-insecure-requests": "1",
|
||||||
|
"sec-fetch-dest": "document",
|
||||||
|
"sec-fetch-mode": "navigate",
|
||||||
|
"sec-fetch-site": "none",
|
||||||
|
"sec-fetch-user": "?1",
|
||||||
|
"sec-ch-ua":
|
||||||
|
'"Google Chrome";v="131", "Chromium";v="131", "Not_A Brand";v="24"',
|
||||||
|
"sec-ch-ua-mobile": "?0",
|
||||||
|
"sec-ch-ua-platform": '"Linux"',
|
||||||
|
"user-agent":
|
||||||
|
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36",
|
||||||
|
};
|
||||||
|
|
||||||
|
// ------------------ Cookie Management ------------------
|
||||||
|
|
||||||
|
function parseSetCookies(setCookieHeaders: string[]): Record<string, string> {
|
||||||
|
const cookies: Record<string, string> = {};
|
||||||
|
for (const header of setCookieHeaders) {
|
||||||
|
const parts = header.split(";");
|
||||||
|
const firstPart = parts[0]?.trim();
|
||||||
|
if (!firstPart) continue;
|
||||||
|
const eqIdx = firstPart.indexOf("=");
|
||||||
|
if (eqIdx === -1) continue;
|
||||||
|
const name = firstPart.slice(0, eqIdx).trim();
|
||||||
|
const value = firstPart.slice(eqIdx + 1).trim();
|
||||||
|
if (name && value) {
|
||||||
|
cookies[name] = value;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return cookies;
|
||||||
|
}
|
||||||
|
|
||||||
|
function cookiesToHeader(cookies: Record<string, string>): string {
|
||||||
|
return Object.entries(cookies)
|
||||||
|
.map(([name, value]) => `${name}=${value}`)
|
||||||
|
.join("; ");
|
||||||
|
}
|
||||||
|
|
||||||
|
// ------------------ Session Warmup ------------------
|
||||||
|
|
||||||
|
export async function warmFacebookSession(): Promise<Record<string, string>> {
|
||||||
|
try {
|
||||||
|
const res = await fetch("https://www.facebook.com/", {
|
||||||
|
method: "GET",
|
||||||
|
headers: FACEBOOK_BROWSER_HEADERS,
|
||||||
|
redirect: "manual",
|
||||||
|
signal: AbortSignal.timeout(10000),
|
||||||
|
});
|
||||||
|
|
||||||
|
const setCookies = res.headers.getSetCookie?.() ?? [];
|
||||||
|
return parseSetCookies(setCookies);
|
||||||
|
} catch {
|
||||||
|
return {};
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// ------------------ Challenge Detection ------------------
|
||||||
|
|
||||||
|
export function detectFacebookChallenge(
|
||||||
|
status: number,
|
||||||
|
html: string,
|
||||||
|
responseUrl: string,
|
||||||
|
): ChallengeType {
|
||||||
|
if (status === 400) {
|
||||||
|
return "bad_headers";
|
||||||
|
}
|
||||||
|
|
||||||
|
if (status === 429) {
|
||||||
|
return "rate_limited";
|
||||||
|
}
|
||||||
|
|
||||||
|
if (responseUrl.includes("/login/")) {
|
||||||
|
return "login_wall";
|
||||||
|
}
|
||||||
|
|
||||||
|
if (html.includes("You must log in") || html.includes("log in to continue")) {
|
||||||
|
return "login_wall";
|
||||||
|
}
|
||||||
|
|
||||||
|
if (
|
||||||
|
responseUrl.includes("/checkpoint/") ||
|
||||||
|
(html.includes("checkpoint") && html.includes("challenge"))
|
||||||
|
) {
|
||||||
|
return "checkpoint";
|
||||||
|
}
|
||||||
|
|
||||||
|
return "none";
|
||||||
|
}
|
||||||
|
|
||||||
|
// ------------------ Header Construction ------------------
|
||||||
|
|
||||||
|
export function buildFacebookHeaders(
|
||||||
|
cookieJar: Record<string, string>,
|
||||||
|
extraHeaders?: Record<string, string>,
|
||||||
|
): Record<string, string> {
|
||||||
|
const headers: Record<string, string> = {
|
||||||
|
...FACEBOOK_BROWSER_HEADERS,
|
||||||
|
};
|
||||||
|
|
||||||
|
const cookieString = cookiesToHeader(cookieJar);
|
||||||
|
if (cookieString) {
|
||||||
|
headers.cookie = cookieString;
|
||||||
|
}
|
||||||
|
|
||||||
|
if (extraHeaders) {
|
||||||
|
Object.assign(headers, extraHeaders);
|
||||||
|
}
|
||||||
|
|
||||||
|
return headers;
|
||||||
|
}
|
||||||
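Note on the cookie helpers in this file: `parseSetCookies` keeps only the leading `name=value` pair of each `Set-Cookie` header and drops attributes, empty values, and entries without `=`. The sketch below repeats both helpers from the diff so that it runs on its own. The header strings are made-up samples, not real Facebook cookies.

```typescript
// Copied from the diff above so the example is self-contained.
function parseSetCookies(setCookieHeaders: string[]): Record<string, string> {
  const cookies: Record<string, string> = {};
  for (const header of setCookieHeaders) {
    const parts = header.split(";");
    const firstPart = parts[0]?.trim();
    if (!firstPart) continue;
    const eqIdx = firstPart.indexOf("=");
    if (eqIdx === -1) continue;
    const name = firstPart.slice(0, eqIdx).trim();
    const value = firstPart.slice(eqIdx + 1).trim();
    if (name && value) {
      cookies[name] = value;
    }
  }
  return cookies;
}

function cookiesToHeader(cookies: Record<string, string>): string {
  return Object.entries(cookies)
    .map(([name, value]) => `${name}=${value}`)
    .join("; ");
}

// Made-up Set-Cookie values for illustration.
const jar = parseSetCookies([
  "datr=abc123; Path=/; Secure",
  "sb=x=y; HttpOnly",
  "empty=; Max-Age=0",
  "noequals",
]);
const cookieHeader = cookiesToHeader(jar);
```

Only `datr` and `sb` survive, so `cookieHeader` is `"datr=abc123; sb=x=y"`. The value of `sb` keeps its embedded `=` because the split happens at the first `=` only.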
35
packages/core/test/live/ebay.live.test.ts
Normal file
@@ -0,0 +1,35 @@
|
|||||||
|
import { describe, expect, test } from "bun:test";
|
||||||
|
import fetchEbayItems from "../../src/scrapers/ebay";
|
||||||
|
|
||||||
|
const LIVE_RESULT_LIMIT = 3;
|
||||||
|
const LIVE_TEST_TIMEOUT_MS = 30_000;
|
||||||
|
|
||||||
|
describe("eBay live parser", () => {
|
||||||
|
test(
|
||||||
|
"scrapes live search results into listing details",
|
||||||
|
async () => {
|
||||||
|
const results = await fetchEbayItems("iphone", 1, {
|
||||||
|
maxItems: LIVE_RESULT_LIMIT,
|
||||||
|
});
|
||||||
|
|
||||||
|
expect(results.length).toBeGreaterThan(0);
|
||||||
|
for (const listing of results) {
|
||||||
|
if (!listing.listingPrice) {
|
||||||
|
throw new Error(`Expected listing price for ${listing.url}`);
|
||||||
|
}
|
||||||
|
if (typeof listing.listingPrice.cents !== "number") {
|
||||||
|
throw new Error(`Expected listing cents for ${listing.url}`);
|
||||||
|
}
|
||||||
|
if (!listing.listingPrice.currency) {
|
||||||
|
throw new Error(`Expected listing currency for ${listing.url}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
expect(listing.url).toStartWith("https://");
|
||||||
|
expect(listing.title.length).toBeGreaterThan(0);
|
||||||
|
expect(listing.listingPrice.cents).toBeGreaterThanOrEqual(0);
|
||||||
|
expect(listing.listingPrice.currency.length).toBeGreaterThan(0);
|
||||||
|
}
|
||||||
|
},
|
||||||
|
LIVE_TEST_TIMEOUT_MS,
|
||||||
|
);
|
||||||
|
});
|
||||||
44
packages/core/test/live/facebook.live.test.ts
Normal file
@@ -0,0 +1,44 @@
|
|||||||
|
import { describe, expect, test } from "bun:test";
|
||||||
|
import fetchFacebookItems from "../../src/scrapers/facebook";
|
||||||
|
|
||||||
|
const LIVE_RESULT_LIMIT = 3;
|
||||||
|
const LIVE_TEST_TIMEOUT_MS = 30_000;
|
||||||
|
|
||||||
|
describe("Facebook live parser", () => {
|
||||||
|
test(
|
||||||
|
"scrapes live marketplace search results into listing details",
|
||||||
|
async () => {
|
||||||
|
if (!process.env.FACEBOOK_COOKIE?.trim()) {
|
||||||
|
throw new Error("FACEBOOK_COOKIE is required for Facebook live tests");
|
||||||
|
}
|
||||||
|
|
||||||
|
const results = await fetchFacebookItems(
|
||||||
|
"iphone",
|
||||||
|
1,
|
||||||
|
"toronto",
|
||||||
|
LIVE_RESULT_LIMIT,
|
||||||
|
);
|
||||||
|
|
||||||
|
expect(results.length).toBeGreaterThan(0);
|
||||||
|
for (const listing of results) {
|
||||||
|
if (!listing.listingPrice) {
|
||||||
|
throw new Error(`Expected listing price for ${listing.url}`);
|
||||||
|
}
|
||||||
|
if (typeof listing.listingPrice.cents !== "number") {
|
||||||
|
throw new Error(`Expected listing cents for ${listing.url}`);
|
||||||
|
}
|
||||||
|
if (!listing.listingPrice.currency) {
|
||||||
|
throw new Error(`Expected listing currency for ${listing.url}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
expect(listing.url).toStartWith(
|
||||||
|
"https://www.facebook.com/marketplace/item/",
|
||||||
|
);
|
||||||
|
expect(listing.title.length).toBeGreaterThan(0);
|
||||||
|
expect(listing.listingPrice.cents).toBeGreaterThanOrEqual(0);
|
||||||
|
expect(listing.listingPrice.currency.length).toBeGreaterThan(0);
|
||||||
|
}
|
||||||
|
},
|
||||||
|
LIVE_TEST_TIMEOUT_MS,
|
||||||
|
);
|
||||||
|
});
|
||||||
38
packages/core/test/live/kijiji.live.test.ts
Normal file
@@ -0,0 +1,38 @@
|
|||||||
|
import { describe, expect, test } from "bun:test";
|
||||||
|
import fetchKijijiItems from "../../src/scrapers/kijiji";
|
||||||
|
|
||||||
|
const LIVE_TEST_TIMEOUT_MS = 30_000;
|
||||||
|
|
||||||
|
describe("Kijiji live parser", () => {
|
||||||
|
test(
|
||||||
|
"scrapes live search results into detailed listings",
|
||||||
|
async () => {
|
||||||
|
const results = await fetchKijijiItems(
|
||||||
|
"iphone",
|
||||||
|
1,
|
||||||
|
"https://www.kijiji.ca",
|
||||||
|
{ maxPages: 1 },
|
||||||
|
{ includeImages: false, sellerDataDepth: "basic" },
|
||||||
|
);
|
||||||
|
|
||||||
|
expect(results.length).toBeGreaterThan(0);
|
||||||
|
for (const listing of results) {
|
||||||
|
if (!listing.listingPrice) {
|
||||||
|
throw new Error(`Expected listing price for ${listing.url}`);
|
||||||
|
}
|
||||||
|
if (typeof listing.listingPrice.cents !== "number") {
|
||||||
|
throw new Error(`Expected listing cents for ${listing.url}`);
|
||||||
|
}
|
||||||
|
if (!listing.listingPrice.currency) {
|
||||||
|
throw new Error(`Expected listing currency for ${listing.url}`);
|
||||||
|
}
|
||||||
|
|
||||||
|
expect(listing.url).toStartWith("https://www.kijiji.ca/");
|
||||||
|
expect(listing.title.length).toBeGreaterThan(0);
|
||||||
|
expect(listing.listingPrice.cents).toBeGreaterThanOrEqual(0);
|
||||||
|
expect(listing.listingPrice.currency.length).toBeGreaterThan(0);
|
||||||
|
}
|
||||||
|
},
|
||||||
|
LIVE_TEST_TIMEOUT_MS,
|
||||||
|
);
|
||||||
|
});
|
||||||
@@ -50,11 +50,11 @@ export const tools = [
|
|||||||
},
|
},
|
||||||
priceMin: {
|
priceMin: {
|
||||||
type: "number",
|
type: "number",
|
||||||
description: "Minimum price in cents",
|
description: "Minimum price in dollars",
|
||||||
},
|
},
|
||||||
priceMax: {
|
priceMax: {
|
||||||
type: "number",
|
type: "number",
|
||||||
description: "Maximum price in cents",
|
description: "Maximum price in dollars",
|
||||||
},
|
},
|
||||||
unstableFilter: {
|
unstableFilter: {
|
||||||
type: "boolean",
|
type: "boolean",
|
||||||
@@ -107,11 +107,11 @@ export const tools = [
|
|||||||
},
|
},
|
||||||
minPrice: {
|
minPrice: {
|
||||||
type: "number",
|
type: "number",
|
||||||
description: "Minimum price filter",
|
description: "Minimum price in dollars",
|
||||||
},
|
},
|
||||||
maxPrice: {
|
maxPrice: {
|
||||||
type: "number",
|
type: "number",
|
||||||
description: "Maximum price filter",
|
description: "Maximum price in dollars",
|
||||||
},
|
},
|
||||||
strictMode: {
|
strictMode: {
|
||||||
type: "boolean",
|
type: "boolean",
|
||||||
|
|||||||
@@ -128,6 +128,46 @@ describe("MCP protocol unstableFilter", () => {
|
|||||||
expect(String(calledUrl)).toContain("unstableFilter=true");
|
expect(String(calledUrl)).toContain("unstableFilter=true");
|
||||||
});
|
});
|
||||||
|
|
||||||
|
test("search_kijiji should document price filters as dollars", () => {
|
||||||
|
const tool = tools.find((candidate) => candidate.name === "search_kijiji");
|
||||||
|
|
||||||
|
const priceMin = tool?.inputSchema.properties.priceMin as {
|
||||||
|
description: string;
|
||||||
|
};
|
||||||
|
const priceMax = tool?.inputSchema.properties.priceMax as {
|
||||||
|
description: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
expect(priceMin.description).toContain("dollars");
|
||||||
|
expect(priceMax.description).toContain("dollars");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("handler should forward Kijiji dollar price filters to API", async () => {
|
||||||
|
await handleMcpRequest(
|
||||||
|
new Request("http://localhost", {
|
||||||
|
method: "POST",
|
||||||
|
body: JSON.stringify({
|
||||||
|
jsonrpc: "2.0",
|
||||||
|
id: 1,
|
||||||
|
method: "tools/call",
|
||||||
|
params: {
|
||||||
|
name: "search_kijiji",
|
||||||
|
arguments: {
|
||||||
|
query: "macbook",
|
||||||
|
priceMin: 999.99,
|
||||||
|
priceMax: 1000,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}),
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
const calledUrl = (global.fetch as unknown as ReturnType<typeof mock>).mock
|
||||||
|
.calls[0]?.[0];
|
||||||
|
expect(String(calledUrl)).toContain("priceMin=999.99");
|
||||||
|
expect(String(calledUrl)).toContain("priceMax=1000");
|
||||||
|
});
|
||||||
|
|
||||||
test("handler should forward unstableFilter=true for search_facebook", async () => {
|
test("handler should forward unstableFilter=true for search_facebook", async () => {
|
||||||
await handleMcpRequest(
|
await handleMcpRequest(
|
||||||
new Request("http://localhost", {
|
new Request("http://localhost", {
|
||||||
@@ -204,4 +244,44 @@ describe("MCP protocol unstableFilter", () => {
|
|||||||
.calls[0]?.[0];
|
.calls[0]?.[0];
|
||||||
expect(String(calledUrl)).toContain("unstableFilter=true");
|
expect(String(calledUrl)).toContain("unstableFilter=true");
|
||||||
});
|
});
|
||||||
|
|
||||||
|
test("search_ebay should document price filters as dollars", () => {
|
||||||
|
const tool = tools.find((candidate) => candidate.name === "search_ebay");
|
||||||
|
|
||||||
|
const minPrice = tool?.inputSchema.properties.minPrice as {
|
||||||
|
description: string;
|
||||||
|
};
|
||||||
|
const maxPrice = tool?.inputSchema.properties.maxPrice as {
|
||||||
|
description: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
expect(minPrice.description).toContain("dollars");
|
||||||
|
expect(maxPrice.description).toContain("dollars");
|
||||||
|
});
|
||||||
|
|
||||||
|
test("handler should forward eBay dollar price filters to API", async () => {
|
||||||
|
await handleMcpRequest(
|
||||||
|
new Request("http://localhost", {
|
||||||
|
method: "POST",
|
||||||
|
body: JSON.stringify({
|
||||||
|
jsonrpc: "2.0",
|
||||||
|
id: 1,
|
||||||
|
method: "tools/call",
|
||||||
|
params: {
|
||||||
|
name: "search_ebay",
|
||||||
|
arguments: {
|
||||||
|
query: "macbook",
|
||||||
|
minPrice: 999.99,
|
||||||
|
maxPrice: 1000,
|
||||||
|
},
|
||||||
|
},
|
||||||
|
}),
|
||||||
|
}),
|
||||||
|
);
|
||||||
|
|
||||||
|
const calledUrl = (global.fetch as unknown as ReturnType<typeof mock>).mock
|
||||||
|
.calls[0]?.[0];
|
||||||
|
expect(String(calledUrl)).toContain("minPrice=999.99");
|
||||||
|
expect(String(calledUrl)).toContain("maxPrice=1000");
|
||||||
|
});
|
||||||
});
|
});
|
||||||
|
|||||||