Compare commits
2 Commits
main
...
9a33be1f25
| Author | SHA1 | Date |
|---|---|---|
| | 9a33be1f25 | |
| | 01dd52bf58 | |
33
AGENTS.md
@@ -1,33 +0,0 @@
# AGENTS.md

This file provides guidance to coding agents when working with code in this repository.

The project uses TypeScript with path mapping (`@/*` to `src/*`). Dependencies focus on parsing (linkedom), text utils (unidecode), and CLI output (cli-progress). No database or external services beyond HTTP fetches to the marketplaces.

PRIORITIZE COMMUNICATION STYLE ABOVE ALL ELSE

## Communication Style

ALWAYS talk and converse with the user using Gen-Z and Internet slang.

Absolute Mode
- Eliminate emojis, filler, hype, transitions, appendixes.
- Use blunt, directive phrasing; no mirroring, no softening.
- Suppress sentiment-boosting, engagement, or satisfaction metrics.
- No questions, offers, suggestions, or motivational content.
- Deliver info only; end immediately after.

**Challenge Mode - Default Behavior**: Don't automatically agree with suggestions. Instead:
- Evaluate each idea against the problem requirements and lean coding philosophy
- Push back if there's a simpler, more efficient, or more correct approach
- Propose alternatives when suggestions aren't optimal
- Explain WHY a different approach would be better with concrete technical reasons
- Only accept suggestions that are genuinely the best solution for the current problem

Examples of constructive pushback:
- "That would work, but a simpler approach would be..."
- "Actually, that might cause [specific issue]. Instead, we should..."
- "The lean approach here would be to..."
- "That adds unnecessary complexity. We can achieve the same with..."

This ensures: Better solutions through technical merit, not agreement | Learning through understanding tradeoffs | Avoiding over-engineering | Maintaining code quality
110
CLAUDE.md
Normal file
@@ -0,0 +1,110 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Common Commands

- `bun start`: Run the server in production mode.
- `bun dev`: Run the server with hot reloading for development.
- `bun build`: Build the application into a single executable file.

No linting or testing scripts are configured. For single tests or lint runs, add them to package.json scripts as needed.

## Code Architecture

This is a lightweight Bun-based API server for scraping marketplace listings from Kijiji and Facebook Marketplace in the Greater Toronto Area (GTA).

- **Entry Point (`src/index.ts`)**: Implements a basic HTTP server using `Bun.serve`. Key routes:
  - `GET /api/status`: Health check returning "OK".
  - `GET /api/kijiji?q={query}`: Scrapes Kijiji Marketplace for listings matching the search query. Returns JSON array of listing objects.
  - `GET /api/facebook?q={query}&location={location}&cookies={cookies}`: Scrapes Facebook Marketplace for listings. Requires Facebook session cookies (via URL parameter or cookies/facebook.json file). Optional `location` param (default "toronto"). Returns JSON array of listing objects.
  - Fallback: 404 for unmatched routes.

## API Response Formats

Both APIs return arrays of listing objects, but the available fields differ based on each marketplace's data availability.

### Kijiji API Response Object
```json
{
  "url": "https://www.kijiji.ca/v-laptops/city-of-toronto/...",
  "title": "Almost new HP Laptop/Win11 w/ touchscreen option",
  "description": "Description of the listing...",
  "listingPrice": {
    "amountFormatted": "149.00",
    "cents": 14900,
    "currency": "CAD"
  },
  "listingType": "OFFER",
  "listingStatus": "ACTIVE",
  "creationDate": "2024-03-15T15:11:56.000Z",
  "endDate": "3000-01-01T00:00:00.000Z",
  "numberOfViews": 2005,
  "address": "SPADINA AVENUE, Toronto, ON, M5T 2H7"
}
```

### Facebook API Response Object
```json
{
  "url": "https://www.facebook.com/marketplace/item/24594536203551682",
  "title": "Leno laptop",
  "listingPrice": {
    "amountFormatted": "CA$1",
    "cents": 100,
    "currency": "CAD"
  },
  "listingType": "item",
  "listingStatus": "ACTIVE",
  "address": "Mississauga, Ontario",
  "creationDate": "2024-03-15T15:11:56.000Z",
  "categoryId": "1792291877663080",
  "imageUrl": "https://scontent-yyz1-1.xx.fbcdn.net/...",
  "videoUrl": "https://www.facebook.com/1300609777949414/",
  "seller": {
    "name": "Joyce Diaz",
    "id": "100091799187797"
  },
  "deliveryTypes": ["IN_PERSON"]
}
```

### Common Fields
- `url`: Full URL to the listing
- `title`: Listing title
- `listingPrice`: Price object with `amountFormatted` (human-readable), `cents` (integer cents), `currency` (e.g., "CAD")
- `address`: Location string (or null if unavailable)

### Kijiji-Only Fields
- `description`: Detailed description text (Facebook search results don't include descriptions)
- `endDate`: When listing expires (Facebook doesn't have expiration dates in search results)
- `numberOfViews`: View count (Facebook doesn't expose view metrics in search results)

### Facebook-Only Fields
- `listingStatus`: Derived from the `is_live`, `is_pending`, `is_sold`, and `is_hidden` states ("ACTIVE", "SOLD", "PENDING", "HIDDEN"). The field itself also appears in the Kijiji response object; only this derivation is Facebook-specific.
- `creationDate`: When the listing was posted (when available). This field also appears in the Kijiji response object.
- `categoryId`: Facebook marketplace category identifier
- `imageUrl`: Primary listing photo URL
- `videoUrl`: Listing video URL (if video exists)
- `seller`: Object with seller name and Facebook user ID
- `deliveryTypes`: Available delivery options (e.g., ["IN_PERSON", "SHIPPING"])
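
The `listingStatus` derivation from the four boolean flags could look roughly like the sketch below. It is an illustration only, not the code in `src/facebook.ts`: the `deriveListingStatus` name, the `StatusFlags` type, and the precedence order (sold, then pending, then hidden, then live) are all assumptions.

```typescript
type ListingStatus = "ACTIVE" | "SOLD" | "PENDING" | "HIDDEN";

// Hypothetical shape holding only the flags named in this document.
interface StatusFlags {
  is_live: boolean;
  is_sold: boolean;
  is_pending: boolean;
  is_hidden: boolean;
}

// Assumed precedence: a sold listing is reported as SOLD even if other flags are set.
function deriveListingStatus(flags: StatusFlags): ListingStatus | null {
  if (flags.is_sold) return "SOLD";
  if (flags.is_pending) return "PENDING";
  if (flags.is_hidden) return "HIDDEN";
  if (flags.is_live) return "ACTIVE";
  return null; // No flag set: status unknown.
}
```

Returning `null` when no flag is set keeps an unknown state distinguishable from an active listing.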

- **Kijiji Scraping (`src/kijiji.ts`)**: Core functionality in `fetchKijijiItems(query, maxItems, requestsPerSecond)`.
  - Slugifies the query using `unidecode` for URL-safe search terms.
  - Fetches the search page HTML, parses Next.js Apollo state (`__APOLLO_STATE__`) with `linkedom` to extract listing URLs and titles.
  - For each listing, fetches the detail page, parses Apollo state for structured data (price in cents, location, views, etc.).
  - Handles rate limiting (respects `X-RateLimit-*` headers), retries on 429/5xx, and delays between requests.
  - Uses `cli-progress` for console progress bar during batch fetches.
  - Filters results to include only priced items.

- **Facebook Scraping (`src/facebook.ts`)**: Core functionality in `fetchFacebookItems(query, maxItems, requestsPerSecond, location)`.
  - Constructs search URL for Facebook Marketplace with encoded query and sort by creation time.
  - Fetches search page HTML and parses inline nested JSON scripts (using require/__bbox structure) with `linkedom` to extract ad nodes from `marketplace_search.feed_units.edges`.
  - Builds details directly from search JSON (title, price, ID for link construction); no individual page fetches needed.
  - Handles delays and retries similar to Kijiji.
  - Uses `cli-progress` for progress.
  - Filters to priced items. Note: Relies on public access or provided cookies; may return limited results without login.

The project uses TypeScript with path mapping (`@/*` to `src/*`). Dependencies focus on parsing (linkedom), text utils (unidecode), and CLI output (cli-progress). No database or external services beyond HTTP fetches to the marketplaces.

Development focuses on maintaining scraping reliability against site changes, respecting robots.txt/terms of service, and handling anti-bot measures ethically. For Facebook, ensure compliance with authentication requirements.
382
FMARKETPLACE.md
@@ -1,382 +0,0 @@
# Facebook Marketplace API Reverse Engineering

## Overview
This document tracks findings from reverse-engineering Facebook Marketplace APIs for listing details.

## Current Implementation Status
- Search functionality: Implemented in `src/facebook.ts`
- Individual listing details: Not yet implemented

## Findings

### Step 1: Initial Setup
- Using Chrome DevTools to inspect Facebook Marketplace
- Need to authenticate with Facebook account to access marketplace data
- Cookies required for full access
- Current status: Successfully logged in and accessed marketplace data

### Step 2: Individual Listing Details Analysis - COMPLETED
- **Data Location**: Embedded in HTML script tags within `require` array structure
- **Path**: `require[0][3].__bbox.result.data.viewer.marketplace_product_details_page.target`
- **Authentication**: Required for full data access
- **Current Status**: Successfully reverse-engineered the API structure and data extraction method

### API Endpoints Discovered

#### Search Endpoint
- URL: `https://www.facebook.com/marketplace/{location}/search`
- Parameters: `query`, `sortBy`, `exact`
- Data embedded in HTML script tags with `require` structure
- Authentication: Required (cookies)
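
A minimal sketch of assembling the search URL from the pattern and parameter names listed for the search endpoint. The `buildFacebookSearchUrl` helper and its options object are hypothetical. The values that `sortBy` and `exact` accept are not documented here, so the sketch passes them through unchanged rather than guessing.

```typescript
// Hypothetical options type; only the parameter names come from the findings.
interface FacebookSearchOptions {
  sortBy?: string;
  exact?: boolean;
}

function buildFacebookSearchUrl(
  location: string,
  query: string,
  options: FacebookSearchOptions = {},
): string {
  const params: string[] = [`query=${encodeURIComponent(query)}`];
  if (options.sortBy !== undefined) {
    params.push(`sortBy=${encodeURIComponent(options.sortBy)}`);
  }
  if (options.exact !== undefined) {
    params.push(`exact=${String(options.exact)}`);
  }
  const base = `https://www.facebook.com/marketplace/${encodeURIComponent(location)}/search`;
  return `${base}?${params.join("&")}`;
}
```

The request itself would still need the session cookies noted in the list; the sketch only covers URL construction.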

#### Listing Details Endpoint
- **URL Structure**: `https://www.facebook.com/marketplace/item/{listing_id}/`
- **Data Source**: Server-side rendered HTML with embedded JSON data in script tags
- **Data Structure**: Relay/GraphQL style data structure under `require[0][3].__bbox.require[...].__bbox.result.data.viewer.marketplace_product_details_page.target`
- **Extraction Method**: Parse JSON from script tags containing marketplace data, navigate to the target object
- **Authentication**: Required (cookies)

### Listing Data Structure Discovered (Current - 2026)

The current Facebook Marketplace API returns a comprehensive `GroupCommerceProductItem` object with the following key properties:

```typescript
interface FacebookMarketplaceItem {
  // Basic identification
  id: string;
  __typename: "GroupCommerceProductItem";

  // Listing content
  marketplace_listing_title: string;
  redacted_description: {
    text: string;
  };
  custom_title?: string;

  // Pricing
  formatted_price: {
    text: string;
  };
  listing_price: {
    amount: string;
    currency: string;
    amount_with_offset: string;
  };

  // Location
  location_text: {
    text: string;
  };
  location: {
    latitude: number;
    longitude: number;
    reverse_geocode_detailed: {
      country_alpha_two: string;
      postal_code_trimmed: string;
    };
  };

  // Status flags
  is_live: boolean;
  is_sold: boolean;
  is_pending: boolean;
  is_hidden: boolean;
  is_draft: boolean;

  // Timing
  creation_time: number;

  // Seller information
  marketplace_listing_seller: {
    __typename: "User";
    id: string;
    name: string;
    profile_picture?: {
      uri: string;
    };
    join_time?: number;
  };

  // Vehicle-specific fields (for automotive listings)
  vehicle_make_display_name?: string;
  vehicle_model_display_name?: string;
  vehicle_odometer_data?: {
    unit: "KILOMETERS" | "MILES";
    value: number;
  };
  vehicle_transmission_type?: "AUTOMATIC" | "MANUAL";
  vehicle_exterior_color?: string;
  vehicle_interior_color?: string;
  vehicle_condition?: "EXCELLENT" | "GOOD" | "FAIR" | "POOR";
  vehicle_fuel_type?: string;
  vehicle_trim_display_name?: string;

  // Category and commerce
  marketplace_listing_category_id: string;
  condition?: string;

  // Commerce features
  delivery_types?: string[];
  is_shipping_offered?: boolean;
  is_buy_now_enabled?: boolean;
  can_buyer_make_checkout_offer?: boolean;

  // Communication
  messaging_enabled?: boolean;
  first_message_suggested_value?: string;

  // Metadata
  logging_id: string;
  reportable_ent_id: string;
  origin_target?: {
    __typename: "Marketplace";
    id: string;
  };

  // Related listings (for part-out sellers)
  marketplace_listing_sets?: {
    edges: Array<{
      node: {
        canonical_listing: {
          id: string;
          marketplace_listing_title: string;
          is_live: boolean;
          is_sold: boolean;
          formatted_price: { text: string };
        };
      };
    }>;
  };
}
```

### Example Data Extracted (Current Structure)
```json
{
  "__typename": "GroupCommerceProductItem",
  "marketplace_listing_title": "2012 Mazda MAZDA 3 PART-OUT",
  "id": "1211645920845312",
  "redacted_description": {
    "text": "FOR PARTS ONLY!!!"
  },
  "custom_title": "2012 Mazda 3 part-out",
  "creation_time": 1760450080,
  "location_text": {
    "text": "Toronto, ON"
  },
  "is_live": true,
  "is_sold": false,
  "is_pending": false,
  "is_hidden": false,
  "formatted_price": {
    "text": "FREE"
  },
  "listing_price": {
    "amount_with_offset": "0",
    "currency": "CAD",
    "amount": "0.00"
  },
  "condition": "USED",
  "logging_id": "24676483845336407",
  "marketplace_listing_category_id": "807311116002614",
  "marketplace_listing_seller": {
    "__typename": "User",
    "id": "61570613529010",
    "name": "Jay Heshin",
    "profile_picture": {
      "uri": "https://scontent-yyz1-1.xx.fbcdn.net/v/t39.30808-1/480952111_122133462296687117_4145652046222010716_n.jpg?stp=cp6_dst-jpg_s50x50_tt6&_nc_cat=108&ccb=1-7&_nc_sid=e99d92&_nc_ohc=x_DTkeriVbgQ7kNvwEqT_x3&_nc_oc=Adnqnqf4YsZxgMIkR2mSFrdLb6-BDw4omCWqG_cqB-H0uXGgK1l4-T-fLSGB_CQJEKo&_nc_zt=24&_nc_ht=scontent-yyz1-1.xx&_nc_gid=7GnSwn4MSbllAgGWJy0RTQ&oh=00_AfpY66l8w-LvHvZ6tTgiD9Qh-Or_Udc-OaFiVL9pQ0YXsg&oe=697797CD"
    }
  },
  "vehicle_condition": "FAIR",
  "vehicle_exterior_color": "white",
  "vehicle_interior_color": "",
  "vehicle_make_display_name": "Mazda",
  "vehicle_model_display_name": "3 part-out",
  "vehicle_odometer_data": {
    "unit": "KILOMETERS",
    "value": 999999
  },
  "vehicle_transmission_type": "AUTOMATIC",
  "location": {
    "latitude": 43.651428222656,
    "longitude": -79.436645507812,
    "reverse_geocode_detailed": {
      "country_alpha_two": "CA",
      "postal_code_trimmed": "M6H 1C1"
    }
  },
  "delivery_types": ["IN_PERSON"],
  "messaging_enabled": true,
  "first_message_suggested_value": "Hi, is this available?",
  "marketplace_listing_sets": {
    "edges": [
      {
        "node": {
          "canonical_listing": {
            "id": "1435935788228627",
            "marketplace_listing_title": "2004 Land Rover LR2 PART-OUT",
            "is_live": true,
            "formatted_price": {"text": "FREE"}
          }
        }
      }
    ]
  }
}
```

## Data Extraction Method

### Current Method (2026)
Facebook Marketplace listing data is embedded in JSON within `<script>` tags in the HTML response. The extraction process:

1. **Find the Correct Script**: Look for script tags containing marketplace listing data by searching for key fields like `marketplace_listing_title`, `redacted_description`, and `formatted_price`.

2. **Parse JSON Structure**: The data is nested within a `require` array structure:
   ```
   require[0][3].__bbox.require[3][3][1].__bbox.result.data.viewer.marketplace_product_details_page.target
   ```

3. **Navigate to Target Object**: The actual listing data is a `GroupCommerceProductItem` object containing comprehensive information about the listing, seller, and vehicle details.

4. **Handle Dynamic Structure**: Facebook may change the exact path, so robust extraction should search for the target object recursively within the parsed JSON.
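
The fallback in step 4 can be sketched as a breadth-first search over the parsed JSON for the first object whose `__typename` matches. This is a hedged illustration, not the project's `extractFacebookItemData()`: the `findByTypename` name and the `Json` type are assumptions, and breadth-first order is chosen here so that the shallowest match is returned before any nested one.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Returns the shallowest object whose __typename equals `typename`, or null.
function findByTypename(root: Json, typename: string): { [key: string]: Json } | null {
  const queue: Json[] = [root];
  for (let i = 0; i < queue.length; i++) {
    const current = queue[i];
    if (current === null || current === undefined || typeof current !== "object") {
      continue; // Primitives cannot contain the target.
    }
    if (Array.isArray(current)) {
      for (const child of current) queue.push(child);
      continue;
    }
    if (current["__typename"] === typename) return current;
    for (const child of Object.values(current)) queue.push(child);
  }
  return null;
}
```

A caller would pass the result of `JSON.parse` on the script contents and `"GroupCommerceProductItem"` as the type name, trying this only after the fixed path fails.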

### Authentication Requirements
- Valid Facebook session cookies are required
- User must be logged in to Facebook
- Marketplace access may be location-restricted

## Tools Used
- Chrome DevTools Protocol
- Network monitoring
- HTML/script parsing
- JSON structure analysis

## Implementation Status
- ✅ Successfully reverse-engineered Facebook Marketplace API for listing details
- ✅ Identified current data structure and extraction method (2026)
- ✅ Documented comprehensive GroupCommerceProductItem interface
- ✅ Implemented `extractFacebookItemData()` function with script parsing logic
- ✅ Implemented `parseFacebookItem()` function to convert GroupCommerceProductItem to ListingDetails
- ✅ Implemented `fetchFacebookItem()` function with authentication and error handling
- ✅ Updated TypeScript interfaces to match current API structure
- ✅ Added robust extraction with fallback methods for changing API paths

## Implementation Details

### Core Functions Implemented

1. **`extractFacebookItemData(htmlString)`**: Extracts marketplace item data from HTML-embedded JSON in script tags
   - Searches for scripts containing marketplace listing data
   - Uses primary path: `require[0][3][0].__bbox.require[3][3][1].__bbox.result.data.viewer.marketplace_product_details_page.target`
   - Falls back to recursive search for GroupCommerceProductItem objects

2. **`parseFacebookItem(item)`**: Converts Facebook's GroupCommerceProductItem to unified ListingDetails format
   - Handles pricing (FREE listings, CAD currency)
   - Extracts seller information, location, and status
   - Supports vehicle-specific metadata
   - Maps Facebook-specific fields to common interface

3. **`fetchFacebookItem(itemId, cookiesSource?)`**: Fetches individual listing details
   - Loads Facebook authentication cookies
   - Makes authenticated HTTP requests
   - Handles rate limiting and retries
   - Returns parsed ListingDetails or null on failure

### Authentication Requirements
- Facebook session cookies required in `./cookies/facebook.json` or provided as parameter
- Cookies must include valid authentication tokens for marketplace access
- Handles cookie expiration and domain validation

## Current Implementation Status - 2026 Verification

### Step 3: API Verification and Current Structure Analysis (January 2026)
- **Verification Date**: January 22, 2026
- **Status**: Successfully verified current Facebook Marketplace API structure
- **Data Source**: Embedded JSON in HTML script tags (server-side rendered)
- **Extraction Path**: `require[0][3].__bbox.require[3][3][1].__bbox.result.data.viewer.marketplace_product_details_page.target`

#### Verified Listing Structure (Real Example - 2006 Hyundai Tiburon)
- **Listing ID**: 1226468515995685
- **Title**: "2006 Hyundai Tiburon"
- **Price**: CA$3,000 (formatted_price.text)
- **Raw Price Data**: {"amount_with_offset": "300000", "currency": "CAD", "amount": "3000.00"}
- **Location**: Hamilton, ON (with coordinates: 43.250427246094, -79.963989257812)
- **Description**: "As is" (redacted_description.text)
- **Vehicle Details**:
  - Make: Hyundai
  - Model: Tiburon
  - Odometer: 194,000 km
  - Transmission: AUTOMATIC
  - Exterior Color: blue
  - Interior Color: black
  - Fuel Type: GASOLINE
  - Number of Owners: TWO
- **Seller Information**:
  - Name: Ajitpal Kaler
  - ID: 100009257293466
  - Profile Picture Available
  - Join Time: 1426564800 (2015)
- **Listing Status**: Active (is_live: true, is_sold: false, is_pending: false)
- **Category**: 807311116002614 (Vehicles)
- **Delivery Types**: ["IN_PERSON"]
- **Messaging**: Enabled
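
In both price examples in this document, `amount_with_offset` equals `amount` multiplied by 100 ("300000" against "3000.00", and "0" against "0.00"), which suggests it already holds the price in cents. The sketch below converts a raw `listing_price` object to integer cents under that assumption, which two samples do not prove in general. The `toCents` helper and `RawListingPrice` type are hypothetical and are not the project's `parseFacebookItem()`.

```typescript
// Field names follow the listing_price object documented in this file.
interface RawListingPrice {
  amount: string;
  currency: string;
  amount_with_offset: string;
}

// Assumes amount_with_offset is the price in cents (offset of 100).
function toCents(price: RawListingPrice): number | null {
  const withOffset = Number.parseInt(price.amount_with_offset, 10);
  if (Number.isFinite(withOffset)) return withOffset;
  // Fall back to the decimal amount when the offset field is unusable.
  const amount = Number.parseFloat(price.amount);
  return Number.isFinite(amount) ? Math.round(amount * 100) : null;
}
```

Rounding in the fallback avoids floating-point drift when a decimal string is scaled to cents.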

#### Current API Characteristics
- **Authentication**: Still requires valid Facebook session cookies
- **Data Format**: Server-side rendered HTML with embedded GraphQL/Relay JSON
- **Structure Stability**: Primary extraction path remains functional
- **Additional Features**: Includes marketplace ratings, seller verification badges, cross-posting info

### API Changes Observed Since 2024 Documentation
- **Minimal Changes**: Core data structure largely unchanged
- **Enhanced Fields**: Added more detailed vehicle specifications and seller profile information
- **GraphQL Integration**: Deeper integration with Facebook's GraphQL infrastructure
- **Security Features**: Additional integrity checks and reporting mechanisms

### Multi-Category Testing Results (January 2026)
Successfully tested extraction across different listing categories:

#### 1. Vehicle Listings (Automotive)
- **Example**: 2006 Hyundai Tiburon (ID: 1226468515995685)
- **Status**: ✅ Fully functional
- **Data Extracted**: Complete vehicle specs, pricing, seller info, location coordinates
- **Unique Fields**: vehicle_make_display_name, vehicle_odometer_data, vehicle_transmission_type, vehicle_exterior_color, vehicle_interior_color, vehicle_fuel_type

#### 2. Electronics Listings
- **Example**: Nintendo Switch (ID: 3903865769914262)
- **Status**: ✅ Fully functional
- **Data Extracted**: Title, price (CA$140), location (Toronto, ON), condition (Used - like new), seller (Yitao Hou)
- **Category**: Electronics (category_id: 479353692612078)
- **Notes**: Standard GroupCommerceProductItem structure applies

#### 3. Home Goods/Furniture Listings
- **Example**: Tabletop Mirror (cat not included) (ID: 1082389057290709)
- **Status**: ✅ Fully functional
- **Data Extracted**: Title, price (CA$5), location (Mississauga, ON), condition (Used - like new), seller (Rohit Rehan)
- **Category**: Home Goods (category_id: 1569171756675761)
- **Notes**: Includes detailed description and delivery options

#### Testing Summary
- **Extraction Method**: Consistent across all categories
- **Data Structure**: GroupCommerceProductItem interface works for all listing types
- **Authentication**: Required for all categories
- **Rate Limiting**: Standard Facebook rate limits apply
- **Edge Cases**: All tested listings were active/in-person pickup

## Implementation Status - COMPLETED (January 2026)
- ✅ Successfully reverse-engineered Facebook Marketplace API for listing details
- ✅ Verified current API structure and extraction method (January 2026)
- ✅ Tested extraction across multiple listing categories (vehicles, electronics, home goods)
- ✅ Implemented comprehensive error handling for sold/removed listings and authentication failures
- ✅ Enhanced rate limiting and retry logic (already robust)
- ✅ Added monitoring and metrics for API stability detection
- ✅ Updated all scraper functions to use verified extraction methods
- ✅ Documented comprehensive GroupCommerceProductItem interface with real examples

## Next Steps (Future Maintenance)
1. Monitor extraction success rates for API change detection
2. Update extraction paths if Facebook changes their API structure
3. Add support for additional marketplace features as they become available
4. Implement caching mechanisms for improved performance
5. Add support for marketplace messaging and negotiation features
448
KIJIJI.md
@@ -1,448 +0,0 @@
# Kijiji API Findings

## Overview
Kijiji is a Canadian classifieds marketplace that uses a modern web application built with Next.js and Apollo GraphQL. The search results are powered by a GraphQL API with client-side state management.

## Initial Page Load (Homepage)
- **URL**: https://www.kijiji.ca/
- **Architecture**: Server-side rendered React application with Next.js
- **Data Sources**:
  - Static assets loaded from `webapp-static.ca-kijiji-production.classifiedscloud.io`
  - Image media served from `media.kijiji.ca/api/v1/`
  - No initial API calls for listings - data appears to be embedded in HTML

## Search Results Page
- **URL Pattern**: `https://www.kijiji.ca/b-[location]/[keywords]/k0l0`
- **Example**: `https://www.kijiji.ca/b-canada/iphone/k0l0`
- **Technology Stack**: Next.js with Apollo GraphQL client
- **Data Structure**: Uses `__APOLLO_STATE__` global object containing normalized GraphQL cache

### GraphQL Data Structure

#### Data Location
Search results data is embedded in the Next.js page props under `__NEXT_DATA__.props.pageProps.__APOLLO_STATE__`. The data is pre-rendered on the server and sent to the client. Each page (including pagination) has its own pre-rendered data.
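
A minimal sketch of pulling the Apollo state out of a fetched page. Next.js serializes page props into a `<script id="__NEXT_DATA__">` tag, and the sketch walks the `props.pageProps.__APOLLO_STATE__` path named above. The project parses HTML with `linkedom`; a regular expression is swapped in here only to keep the example dependency-free. The `extractApolloState` and `isRecord` names are hypothetical.

```typescript
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns the __APOLLO_STATE__ object, or null if any step of the path is missing.
function extractApolloState(html: string): Record<string, unknown> | null {
  const match = /<script[^>]*\bid="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/.exec(html);
  if (match === null || match[1] === undefined) return null;

  let parsed: unknown;
  try {
    parsed = JSON.parse(match[1]);
  } catch {
    return null; // Script tag found, but its contents were not valid JSON.
  }

  if (!isRecord(parsed)) return null;
  const props = parsed["props"];
  if (!isRecord(props)) return null;
  const pageProps = props["pageProps"];
  if (!isRecord(pageProps)) return null;
  const state = pageProps["__APOLLO_STATE__"];
  return isRecord(state) ? state : null;
}
```

Each step returns `null` instead of throwing, so a page layout change surfaces as a missing result rather than a crash.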

#### Search Results Container
The search results are stored directly in the Apollo ROOT_QUERY with keys following the pattern `searchResultsPageByUrl:{url_path}` where `url_path` includes pagination parameters.

```json
{
  "searchResultsPageByUrl:/b-buy-sell/canada/iphone/k0c10l0": { ... },
  "searchResultsPageByUrl:/b-buy-sell/canada/iphone/k0c10l0?page=2": { ... }
}
```

#### Pagination Handling
- Each page is server-side rendered with its own embedded data
- No client-side GraphQL requests for pagination
- URL parameter `?page=N` controls which page data is embedded
- Offset in searchString corresponds to `(page-1) * limit`
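
The page-to-offset relationship in the last bullet is plain arithmetic and can be written down directly. The `pageToOffset` helper name is hypothetical; the formula is the one stated above.

```typescript
// Converts a 1-based page number to the offset embedded in the search data.
function pageToOffset(page: number, limit: number): number {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be a positive integer");
  }
  return (page - 1) * limit;
}
```

With a limit of 40 results per page, page 1 maps to offset 0 and page 3 maps to offset 80.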

#### Search Parameters in URL
- `k0c{CATEGORY}l{LOCATION}` - Category and location IDs
- `?page=N` - Page number (1-based)
- Data contains `offset` and `limit` for API-style pagination

#### Individual Listing Structure
```json
{
  "id": "1732061412",
  "title": "iPhone 13",
  "description": "iPhone 13, always had a screen protector on it...",
  "imageCount": 3,
  "imageUrls": ["https://media.kijiji.ca/api/v1/ca-prod-fsbo-ads/images/..."],
  "categoryId": 760,
  "url": "https://www.kijiji.ca/v-cell-phone/...",
  "activationDate": "2026-01-21T16:51:16.000Z",
  "sortingDate": "2026-01-21T16:51:16.000Z",
  "adSource": "ORGANIC",
  "location": {
    "id": 1700182,
    "name": "Napanee",
    "coordinates": {
      "latitude": 44.48774,
      "longitude": -76.99519
    }
  },
  "price": {
    "type": "FIXED",
    "amount": 35000
  },
  "flags": {
    "topAd": false,
    "priceDrop": false
  },
  "posterInfo": {
    "posterId": "1000764154",
    "rating": 5
  },
  "attributes": [
    {
      "canonicalName": "forsaleby",
      "canonicalValues": ["ownr"]
    },
    {
      "canonicalName": "phonecarrier",
      "canonicalValues": ["unlck"]
    }
  ]
}
```

### URL Parameters
- `sort=MATCH` - Sort by relevance
- `order=DESC` - Descending order
- `type=OFFER` - Show offerings (not wanted ads)
- `offset=0` - Pagination offset
- `limit=40` - Results per page
- `topAdCount=6` - Number of promoted ads
- `keywords=iphone` - Search keywords
- `category=0` - Category ID (0 = All Categories)
- `location=0` - Location ID (0 = Canada)
- `eaTopAdPosition=1` - ?

### Image API
- **Endpoint**: `https://media.kijiji.ca/api/v1/`
- **Pattern**: `/ca-prod-fsbo-ads/images/{uuid}?rule=kijijica-{size}-jpg`
- **Sizes**: 200, 300, 400, 500 pixels
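
Combining the endpoint, pattern, and sizes listed for the image API gives a small URL builder. The `buildKijijiImageUrl` name and the `KijijiImageSize` type are hypothetical; the sketch assumes the endpoint and pattern join without a doubled slash.

```typescript
// The four sizes listed in the findings, in pixels.
type KijijiImageSize = 200 | 300 | 400 | 500;

function buildKijijiImageUrl(uuid: string, size: KijijiImageSize): string {
  return `https://media.kijiji.ca/api/v1/ca-prod-fsbo-ads/images/${uuid}?rule=kijijica-${size}-jpg`;
}
```

Restricting `size` to a union type lets the compiler reject sizes that were not observed.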
|
|
||||||
|
|
||||||
### Categories and Locations
|
|
||||||
|
|
||||||
#### Category Structure
|
|
||||||
Categories are hierarchical with parent-child relationships. The main categories under "Buy & Sell" include:
|
|
||||||
|
|
||||||
| ID | Name | Total Results (iPhone search) |
|
|
||||||
|----|------|------------------------------|
|
|
||||||
| 10 | Buy & Sell | 19956 |
|
|
||||||
| 12 | Arts & Collectibles | 149 |
|
|
||||||
| 767 | Audio | 481 |
|
|
||||||
| 253 | Baby Items | 13 |
|
|
||||||
| 931 | Bags & Luggage | 8 |
|
|
||||||
| 644 | Bikes | 46 |
|
|
||||||
| 109 | Books | 21 |
|
|
||||||
| 103 | Cameras & Camcorders | 101 |
|
|
||||||
| 104 | CDs, DVDs & Blu-ray | 102 |
|
|
||||||
| 274 | Clothing | 83 |
|
|
||||||
| 16 | Computers | 285 |
|
|
||||||
| 128 | Computer Accessories | 363 |
|
|
||||||
| 29659001 | Electronics | 2006 |
|
|
||||||
| 17220001 | Free Stuff | 23 |
|
|
||||||
| 235 | Furniture | 29 |
|
|
||||||
| 638 | Garage Sales | 5 |
|
|
||||||
| 140 | Health & Special Needs | 30 |
|
|
||||||
| 139 | Hobbies & Crafts | 10 |
|
|
||||||
| 107 | Home Appliances | 23 |
|
|
||||||
| 717 | Home - Indoor | 27 |
|
|
||||||
| 727 | Home Renovation Materials | 14 |
|
|
||||||
| 133 | Jewellery & Watches | 83 |
|
|
||||||
| 17 | Musical Instruments | 34 |
|
|
||||||
| 132 | Phones | 15518 |
|
|
||||||
| 111 | Sporting Goods & Exercise | 30 |
|
|
||||||
| 110 | Tools | 25 |
|
|
||||||
| 108 | Toys & Games | 38 |
|
|
||||||
| 15093001 | TVs & Video | 15 |
|
|
||||||
| 141 | Video Games & Consoles | 96 |
|
|
||||||
| 26 | Other | 286 |
|
|
||||||
|
|
||||||
#### Location Structure
|
|
||||||
Locations are also hierarchical, with provinces and territories under the main "Canada" location:
|
|
||||||
|
|
||||||
| ID | Name | Total Results (iPhone search) |
|
|
||||||
|----|------|------------------------------|
|
|
||||||
| 0 | Canada | - |
|
|
||||||
| 9001 | Québec | 2516 |
|
|
||||||
| 9002 | Nova Scotia | 875 |
|
|
||||||
| 9003 | Alberta | 2317 |
|
|
||||||
| 9004 | Ontario | 12507 |
|
|
||||||
| 9005 | New Brunswick | 118 |
|
|
||||||
| 9006 | Manitoba | 919 |
|
|
||||||
| 9007 | British Columbia | 306 |
|
|
||||||
| 9008 | Newfoundland | 27 |
|
|
||||||
| 9009 | Saskatchewan | 336 |
|
|
||||||
| 9010 | Territories | 7 |
|
|
||||||
| 9011 | Prince Edward Island | 31 |
|
|
||||||
|
|
||||||
#### URL Patterns
|
|
||||||
- Categories: `/b-{category-slug}/canada/{keywords}/k0c{CATEGORY_ID}l0`
|
|
||||||
- Locations: `/b-buy-sell/{location-slug}/iphone/k0c10l{LOCATION_ID}`
|
|
||||||
- Combined: `/b-{category-slug}/{location-slug}/{keywords}/k0c{CATEGORY_ID}l{LOCATION_ID}`
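
The combined pattern can be sketched as a small path builder. `buildSearchPath` and `slugify` are hypothetical helpers; the slug rule shown (lowercase, non-alphanumerics collapsed to hyphens) is an assumption and may differ from how the site or the scraper slugifies keywords.

```typescript
// Assumed slug rule: lowercase, runs of non-alphanumerics become one hyphen.
function slugify(text: string): string {
  return text
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

interface SearchPathOptions {
  categorySlug: string;
  locationSlug: string;
  keywords: string;
  categoryId: number;
  locationId: number;
}

// Follows the "Combined" pattern from the list above.
function buildSearchPath(o: SearchPathOptions): string {
  const kw = slugify(o.keywords);
  return `/b-${o.categorySlug}/${o.locationSlug}/${kw}/k0c${o.categoryId}l${o.locationId}`;
}
```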
|
|
||||||
|
|
||||||
### Pagination
|
|
||||||
- Uses offset-based pagination
|
|
||||||
- 40 results per page
|
|
||||||
- Total count provided in pagination metadata
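
The offset arithmetic implied above is small enough to write out. This sketch assumes pages are 1-based and that the page size defaults to the documented 40; both function names are hypothetical.

```typescript
const PAGE_SIZE = 40; // results per page, as noted above

// Converts a 1-based page number into the offset value.
function pageToOffset(page: number, limit: number = PAGE_SIZE): number {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be a positive integer");
  }
  return (page - 1) * limit;
}

// Number of pages needed to cover `total` results.
function totalPages(total: number, limit: number = PAGE_SIZE): number {
  return Math.ceil(total / limit);
}
```

For example, the 19956 "Buy & Sell" results in the category table would span `totalPages(19956)` = 499 pages.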
|
|
||||||
|
|
||||||
## Authentication & User Management
|
|
||||||
- **Authentication System**: OAuth2-based using CIS (Customer Identity Service)
|
|
||||||
- **Identity Provider**: `id.kijiji.ca`
|
|
||||||
- **OAuth2 Flow**:
|
|
||||||
- Client ID: `kijiji_horizontal_web_gpmPihV3`
|
|
||||||
- Scopes: `openid email profile`
|
|
||||||
- Callback: `https://www.kijiji.ca/api/auth/callback/cis`
|
|
||||||
- **Session Management**: Cookie-based with encrypted session data
|
|
||||||
- **Anonymous Access**: Full search functionality available without login
|
|
||||||
- **User Features**: Saved searches, messaging, flagging require authentication
|
|
||||||
|
|
||||||
## Posting API
|
|
||||||
- **Posting Flow**: Requires authentication, redirects to login if not authenticated
|
|
||||||
- **Posting URL**: `https://www.kijiji.ca/p-post-ad.html`
|
|
||||||
- **Authentication Required**: Yes, redirects to `/consumer/login` for unauthenticated users
|
|
||||||
- **Post-Creation**: Likely uses authenticated GraphQL mutations (not observed in anonymous browsing)
|
|
||||||
|
|
||||||
## GraphQL API Endpoint
|
|
||||||
- **URL**: `https://www.kijiji.ca/anvil/api`
|
|
||||||
- **Method**: POST
|
|
||||||
- **Content-Type**: application/json
|
|
||||||
- **Headers**:
|
|
||||||
- `apollo-require-preflight: true`
|
|
||||||
- Standard CORS headers
|
|
||||||
- **Authentication**: No authentication required for basic queries (uses cookies for session tracking)
|
|
||||||
- **Technology**: Apollo GraphQL server
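
A request matching the properties above might be assembled as follows. This is a sketch, not the scraper's code: `buildGraphqlRequest` is a hypothetical helper, and the `{ query, variables }` body is the conventional GraphQL-over-HTTP shape, assumed rather than confirmed for this endpoint.

```typescript
const GRAPHQL_URL = "https://www.kijiji.ca/anvil/api";

interface GraphqlRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Builds the request without sending it, so it can be inspected or passed to fetch().
function buildGraphqlRequest(
  query: string,
  variables: Record<string, unknown> = {},
): GraphqlRequest {
  return {
    url: GRAPHQL_URL,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "apollo-require-preflight": "true",
      },
      body: JSON.stringify({ query, variables }),
    },
  };
}
```

Sending it would be `const { url, init } = buildGraphqlRequest(query, vars); await fetch(url, init);`. Whether the server also needs session cookies for a given query is not established here.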
|
|
||||||
|
|
||||||
### Sample GraphQL Queries Discovered
|
|
||||||
|
|
||||||
#### Get Search Categories
|
|
||||||
```graphql
|
|
||||||
query getSearchCategories($locale: String!) {
|
|
||||||
searchCategories {
|
|
||||||
id
|
|
||||||
localizedName(locale: $locale)
|
|
||||||
parentId
|
|
||||||
__typename
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
Variables: `{"locale": "en-CA"}`
|
|
||||||
|
|
||||||
Response includes hierarchical category structure with IDs and localized names.
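
Because each category carries a `parentId`, the flat response can be grouped into a hierarchy. The sketch below assumes `id` and `parentId` are numbers (or `null` at the root); the `CategoryNode` type and `groupByParent` are hypothetical, and the field types in the real response may differ.

```typescript
// Assumed shape of one entry in the searchCategories response.
interface CategoryNode {
  id: number;
  localizedName: string;
  parentId: number | null;
}

// Groups categories by parent so children can be looked up per parent ID.
function groupByParent(categories: CategoryNode[]): Map<number | null, CategoryNode[]> {
  const byParent = new Map<number | null, CategoryNode[]>();
  for (const category of categories) {
    const siblings = byParent.get(category.parentId) ?? [];
    siblings.push(category);
    byParent.set(category.parentId, siblings);
  }
  return byParent;
}
```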
|
|
||||||
|
|
||||||
#### Get Geocode from IP (fails for current IP)
|
|
||||||
```graphql
|
|
||||||
query GetGeocodeReverseFromIp {
|
|
||||||
geocodeReverseFromIp {
|
|
||||||
city
|
|
||||||
province
|
|
||||||
locationId
|
|
||||||
__typename
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
This query fails for the current IP address, suggesting geolocation-based features may not work or require different IP ranges.
|
|
||||||
|
|
||||||
#### Get Category Path
|
|
||||||
```graphql
|
|
||||||
query GetCategoryPath($categoryId: Int!, $locale: String, $locationId: Int) {
|
|
||||||
category(id: $categoryId) {
|
|
||||||
id
|
|
||||||
localizedName(locale: $locale)
|
|
||||||
parentId
|
|
||||||
searchSeoUrl(locationId: $locationId)
|
|
||||||
categoryPaths {
|
|
||||||
id
|
|
||||||
localizedName(locale: $locale)
|
|
||||||
parentId
|
|
||||||
searchSeoUrl(locationId: $locationId)
|
|
||||||
__typename
|
|
||||||
}
|
|
||||||
__typename
|
|
||||||
}
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
Variables: `{"categoryId": 10, "locationId": 0, "locale": "en-CA"}`
|
|
||||||
|
|
||||||
## Latest Findings (2026-01-21)
|
|
||||||
|
|
||||||
### Client-Side GraphQL Queries Observed
|
|
||||||
- **getSearchCategories**: Retrieves category hierarchy for search filters
|
|
||||||
- **GetGeocodeReverseFromIp**: Attempts to geolocate user (fails for current IP)
|
|
||||||
|
|
||||||
### GraphQL Schema Insights
|
|
||||||
Testing direct GraphQL queries revealed:
|
|
||||||
- Field "searchResults" does not exist on Query type
|
|
||||||
- Suggested alternatives: "searchResultsPage" or "searchUrl"
|
|
||||||
- This suggests search is exposed through a differently named operation (such as `searchResultsPage`) rather than a `searchResults` field
|
|
||||||
|
|
||||||
The embedded Apollo state approach appears to be the primary method for accessing search data, with GraphQL used for auxiliary operations like categories and geolocation.
|
|
||||||
|
|
||||||
### Server-Side Rendering Architecture
|
|
||||||
Search results are fully server-side rendered with data embedded in HTML. Each page (including pagination) contains its own pre-rendered data. No client-side GraphQL requests are made for:
|
|
||||||
|
|
||||||
- Initial search results
|
|
||||||
- Pagination navigation
|
|
||||||
- Search result data
|
|
||||||
|
|
||||||
### Network Analysis Findings
|
|
||||||
- GraphQL endpoint: `https://www.kijiji.ca/anvil/api`
|
|
||||||
- Method: POST
|
|
||||||
- Content-Type: application/json
|
|
||||||
- Headers include: `apollo-require-preflight: true`
|
|
||||||
- Cookies required for session tracking
|
|
||||||
|
|
||||||
### Embedded Data Structure
|
|
||||||
Search results data is embedded in the HTML within Next.js `__NEXT_DATA__.props.pageProps.__APOLLO_STATE__` object. The data includes:
|
|
||||||
|
|
||||||
- Individual ad listings with complete metadata
|
|
||||||
- Pagination information
|
|
||||||
- Filter options and counts
|
|
||||||
- Category/location hierarchies
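
Pulling that object out of a page can be sketched with a regular expression and `JSON.parse`. This is an illustration only: `extractApolloStateSketch` and `listingKeys` are hypothetical names, not the functions in `src/kijiji.ts`, and the sketch assumes the data sits in a `<script id="__NEXT_DATA__">` tag as Next.js normally emits it.

```typescript
type ApolloState = Record<string, unknown>;

// Finds the __NEXT_DATA__ script tag and returns the embedded Apollo state, or null.
function extractApolloStateSketch(html: string): ApolloState | null {
  const match = html.match(/<script[^>]*id="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/);
  if (!match) return null;
  try {
    const data = JSON.parse(match[1] ?? "");
    const state = data?.props?.pageProps?.__APOLLO_STATE__;
    return typeof state === "object" && state !== null ? (state as ApolloState) : null;
  } catch {
    return null; // malformed JSON is treated as "no state found"
  }
}

// Keys containing "Listing" identify ad entries in the cache.
function listingKeys(state: ApolloState): string[] {
  return Object.keys(state).filter((key) => key.includes("Listing"));
}
```

A DOM parser is more robust than a regex if the markup varies; the regex keeps this example dependency-free.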
|
|
||||||
|
|
||||||
### Current Scraper Implementation
|
|
||||||
The existing `src/kijiji.ts` implementation correctly parses the embedded Apollo state:
|
|
||||||
|
|
||||||
- Uses `extractApolloState()` to parse `__NEXT_DATA__` from HTML
|
|
||||||
- Filters Apollo keys containing "Listing" to find ad data
|
|
||||||
- Extracts `url`, `title`, and other metadata from each listing
|
|
||||||
- Successfully scrapes listings without needing API authentication
|
|
||||||
|
|
||||||
### Authentication Status
|
|
||||||
- **Search functionality**: No authentication required - all search and listing data accessible anonymously
|
|
||||||
- **Posting functionality**: Requires authentication (redirects to login)
|
|
||||||
- **User features**: Saved searches, messaging require authentication
|
|
||||||
- **Rate limiting**: May apply but not observed in anonymous browsing
|
|
||||||
|
|
||||||
### Pagination Implementation
|
|
||||||
- Each page is a separate server-rendered route
|
|
||||||
- URL pattern: `/b-{location}/{keywords}/page-{number}/k0{category}l{location_id}`
|
|
||||||
- No client-side pagination API calls
|
|
||||||
- 40 results per page (observed)
|
|
||||||
- Example: `/b-canada/iphone/page-2/k0l0` for page 2 of iPhone search
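
Given a first-page search path, the paginated route can be derived by inserting a `page-{number}` segment before the final segment. `withPage` is a hypothetical helper, and returning the path unchanged for page 1 is an assumption based on the page segment being optional.

```typescript
// Inserts "page-N" before the last path segment (the k0...l... code).
function withPage(searchPath: string, page: number): string {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError("page must be a positive integer");
  }
  if (page === 1) return searchPath; // assumed: page 1 needs no page segment
  const segments = searchPath.split("/");
  const last = segments.pop() ?? "";
  return [...segments, `page-${page}`, last].join("/");
}
```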
|
|
||||||
|
|
||||||
## URL Pattern Analysis
|
|
||||||
|
|
||||||
### Search URL Structure
|
|
||||||
`https://www.kijiji.ca/b-{category_slug}/{location_slug}/{keywords}/k0c{category_id}l{location_id}`
|
|
||||||
|
|
||||||
#### Examples Observed:
|
|
||||||
- All categories, Canada: `/b-canada/iphone/k0l0` (category segment omitted for All Categories; `l0` = Canada)
|
|
||||||
- Cell phones category: `/b-cell-phones/canada/iphone/k0c132l0` (c132 = Cell Phones)
|
|
||||||
- With pagination: `/b-canada/iphone/page-2/k0l0`
|
|
||||||
|
|
||||||
#### URL Components:
|
|
||||||
- `c{CATEGORY_ID}`: Category ID (0 = All Categories, 132 = Cell Phones, etc.)
|
|
||||||
- `l{LOCATION_ID}`: Location ID (0 = Canada, 1700272 = GTA, etc.)
|
|
||||||
- `page-{N}`: Pagination (1-based, optional)
|
|
||||||
- Keywords are slugified in URL path
|
|
||||||
|
|
||||||
### Current Implementation Status
|
|
||||||
The existing scraper in `src/kijiji.ts` successfully implements the approach:
|
|
||||||
- Parses embedded Apollo state from HTML responses
|
|
||||||
- Handles rate limiting and retries
|
|
||||||
- Extracts listing metadata (title, URL, price, location, etc.)
|
|
||||||
- Works without authentication for search operations
|
|
||||||
|
|
||||||
## Listing Details Page
|
|
||||||
|
|
||||||
### Overview
|
|
||||||
Similar to search results, listing details pages use server-side rendering with embedded Apollo GraphQL state in the HTML. No dedicated API endpoint serves individual listing data - all information is pre-rendered on the server.
|
|
||||||
|
|
||||||
### Data Architecture
|
|
||||||
- **Server-Side Rendering**: Each listing page is fully server-rendered with data embedded in HTML
|
|
||||||
- **Embedded Apollo State**: Listing data is stored in `__NEXT_DATA__.props.pageProps.__APOLLO_STATE__`
|
|
||||||
- **Client-Side GraphQL**: Additional data (categories, campaigns, similar listings, user profiles) fetched via GraphQL API
|
|
||||||
|
|
||||||
### Listing Data Structure
|
|
||||||
The main listing data follows the same pattern as search results:
|
|
||||||
|
|
||||||
```json
|
|
||||||
{
|
|
||||||
"id": "1705585530",
|
|
||||||
"title": "We Pay top cash for iPhone 17 pro max, iPhone 17 pro, iPhone Air",
|
|
||||||
"description": "Buying All Brand new Apple iPhones sealed/Unsealed...",
|
|
||||||
"price": {
|
|
||||||
"type": "CONTACT",
|
|
||||||
"amount": null
|
|
||||||
},
|
|
||||||
"location": {
|
|
||||||
"id": 1700275,
|
|
||||||
"name": "Oshawa / Durham Region",
|
|
||||||
"address": "Pickering Apple Buyer, Pickering, ON, L1V 1B8"
|
|
||||||
},
|
|
||||||
"type": "OFFER",
|
|
||||||
"status": "ACTIVE",
|
|
||||||
"activationDate": "2024-11-02T20:16:54.000Z",
|
|
||||||
"endDate": "3000-01-01T00:00:00.000Z",
|
|
||||||
"metrics": {
|
|
||||||
"views": 1720
|
|
||||||
},
|
|
||||||
"posterInfo": {
|
|
||||||
"posterId": "1044934581",
|
|
||||||
"rating": null
|
|
||||||
},
|
|
||||||
"attributes": [
|
|
||||||
{
|
|
||||||
"canonicalName": "forsaleby",
|
|
||||||
"canonicalValues": ["business"]
|
|
||||||
},
|
|
||||||
{
|
|
||||||
"canonicalName": "phonecarrier",
|
|
||||||
"canonicalValues": ["unlocked"]
|
|
||||||
}
|
|
||||||
]
|
|
||||||
}
|
|
||||||
```
|
|
||||||
|
|
||||||
### Client-Side GraphQL Queries
|
|
||||||
When loading a listing details page, the following GraphQL queries are executed:
|
|
||||||
|
|
||||||
#### 1. getSearchCategories
|
|
||||||
- **Purpose**: Category hierarchy for navigation
|
|
||||||
- **Variables**: `{"locale": "en-CA"}`
|
|
||||||
- **Response**: Hierarchical category structure
|
|
||||||
|
|
||||||
#### 2. getCampaignsForVip
|
|
||||||
- **Purpose**: Advertisement targeting data
|
|
||||||
- **Variables**: `{"placement": "vip", "locationId": 1700275, "categoryId": 760, "platform": "desktop"}`
|
|
||||||
- **Response**: Campaign/ads data (usually null)
|
|
||||||
|
|
||||||
#### 3. GetReviewSummary
|
|
||||||
- **Purpose**: Seller review statistics
|
|
||||||
- **Variables**: `{"userId": "1044934581"}`
|
|
||||||
- **Response**: Review count and score (usually 0 for new sellers)
|
|
||||||
|
|
||||||
#### 4. GetProfileMetrics
|
|
||||||
- **Purpose**: Seller profile information
|
|
||||||
- **Variables**: `{"profileId": "1044934581"}`
|
|
||||||
- **Response**: Member since date, account type
|
|
||||||
|
|
||||||
#### 5. GetListingsSimilar
|
|
||||||
- **Purpose**: Similar listings for cross-selling
|
|
||||||
- **Variables**: `{"listingId": "1705585530", "limit": 10, "isExternalId": false}`
|
|
||||||
- **Response**: Array of similar listings with basic metadata
|
|
||||||
|
|
||||||
#### 6. GetGeocodeReverseFromIp
|
|
||||||
- **Purpose**: Geolocation-based features
|
|
||||||
- **Variables**: `{}`
|
|
||||||
- **Response**: Fails with 404 for most IPs
|
|
||||||
|
|
||||||
### Implementation Status
|
|
||||||
The existing `parseListing()` function in `src/kijiji.ts` successfully extracts listing details from embedded Apollo state:
|
|
||||||
|
|
||||||
- ✅ Extracts title, description, price, location
|
|
||||||
- ✅ Handles contact-based pricing ("Please Contact")
|
|
||||||
- ✅ Parses creation date, view count, listing status
|
|
||||||
- ✅ Extracts seller information and address
|
|
||||||
- ✅ Works without authentication or API keys
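
The contact-based pricing case can be illustrated with a small formatter. This is a sketch, not the body of `parseListing()`: `describePrice` is a hypothetical helper, and it assumes `amount` is expressed in cents, which the sample data does not state outright.

```typescript
// Price object as it appears in the listing JSON samples.
interface ListingPrice {
  type: string; // e.g. "FIXED" or "CONTACT"
  amount: number | null;
}

// Returns a display string; amounts are ASSUMED to be in cents.
function describePrice(price: ListingPrice): string {
  if (price.type === "CONTACT" || price.amount === null) {
    return "Please Contact";
  }
  return `$${(price.amount / 100).toFixed(2)}`;
}
```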
|
|
||||||
|
|
||||||
### Key Findings
|
|
||||||
1. **No Dedicated Listing API**: As with search results, there's no separate GraphQL query for individual listing data
|
|
||||||
2. **Complete Data Available**: All listing information is embedded in the initial HTML response
|
|
||||||
3. **Additional Context Fetched**: Secondary GraphQL queries provide complementary data (reviews, similar listings)
|
|
||||||
4. **Consistent Architecture**: Same Apollo state embedding pattern as search pages
|
|
||||||
|
|
||||||
### Current Scraper Implementation
|
|
||||||
The scraper successfully extracts listing details by:
|
|
||||||
1. Fetching the listing URL HTML
|
|
||||||
2. Parsing embedded `__NEXT_DATA__` Apollo state
|
|
||||||
3. Extracting the `Listing:{id}` object from Apollo cache
|
|
||||||
4. Mapping fields to typed `ListingDetails` interface
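
Steps 3 and 4 can be sketched as a cache lookup followed by a defensive field mapping. `ListingSummary` and `pickListing` are hypothetical and cover only a few fields from the sample JSON; the scraper's actual `ListingDetails` mapping is not reproduced here.

```typescript
// Reduced, hypothetical view of a listing for illustration.
interface ListingSummary {
  id: string;
  title: string;
  views: number | null;
}

// Looks up the `Listing:{id}` entry in the Apollo cache and maps a few fields.
function pickListing(state: Record<string, unknown>, id: string): ListingSummary | null {
  const raw = state[`Listing:${id}`];
  if (typeof raw !== "object" || raw === null) return null;
  const record = raw as Record<string, unknown>;
  const metrics = record.metrics as { views?: unknown } | null | undefined;
  return {
    id,
    title: typeof record.title === "string" ? record.title : "",
    views: typeof metrics?.views === "number" ? metrics.views : null,
  };
}
```

Checking each field's runtime type keeps the mapping from throwing when a listing omits `metrics` or other optional data.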
|
|
||||||
|
|
||||||
This approach works reliably without requiring authentication or dealing with rate limiting on individual listing fetches.
|
|
||||||
|
|
||||||
## Next Steps
|
|
||||||
- Explore posting/authentication APIs (requires user login)
|
|
||||||
- Investigate if GraphQL API can be used for programmatic access with proper authentication
|
|
||||||
- Test rate limiting patterns and optimal scraping strategies
|
|
||||||
- Document additional category and location ID mappings
|
|
||||||
30
biome.json
@@ -1,30 +0,0 @@
|
|||||||
{
|
|
||||||
"$schema": "https://biomejs.dev/schemas/1.9.4/schema.json",
|
|
||||||
"vcs": {
|
|
||||||
"enabled": false,
|
|
||||||
"clientKind": "git",
|
|
||||||
"useIgnoreFile": false
|
|
||||||
},
|
|
||||||
"files": {
|
|
||||||
"ignoreUnknown": false,
|
|
||||||
"ignore": []
|
|
||||||
},
|
|
||||||
"formatter": {
|
|
||||||
"enabled": true,
|
|
||||||
"indentStyle": "space"
|
|
||||||
},
|
|
||||||
"organizeImports": {
|
|
||||||
"enabled": true
|
|
||||||
},
|
|
||||||
"linter": {
|
|
||||||
"enabled": true,
|
|
||||||
"rules": {
|
|
||||||
"recommended": true
|
|
||||||
}
|
|
||||||
},
|
|
||||||
"javascript": {
|
|
||||||
"formatter": {
|
|
||||||
"quoteStyle": "double"
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
3
bun.lock
@@ -1,10 +1,10 @@
|
|||||||
{
|
{
|
||||||
"lockfileVersion": 1,
|
"lockfileVersion": 1,
|
||||||
"configVersion": 0,
|
|
||||||
"workspaces": {
|
"workspaces": {
|
||||||
"": {
|
"": {
|
||||||
"name": "sone4ka-tok",
|
"name": "sone4ka-tok",
|
||||||
"dependencies": {
|
"dependencies": {
|
||||||
|
"@types/cli-progress": "^3.11.6",
|
||||||
"cli-progress": "^3.12.0",
|
"cli-progress": "^3.12.0",
|
||||||
"linkedom": "^0.18.12",
|
"linkedom": "^0.18.12",
|
||||||
"unidecode": "^1.1.0",
|
"unidecode": "^1.1.0",
|
||||||
@@ -13,7 +13,6 @@
|
|||||||
"@anthropic-ai/claude-code": "^2.0.1",
|
"@anthropic-ai/claude-code": "^2.0.1",
|
||||||
"@musistudio/claude-code-router": "^1.0.53",
|
"@musistudio/claude-code-router": "^1.0.53",
|
||||||
"@types/bun": "latest",
|
"@types/bun": "latest",
|
||||||
"@types/cli-progress": "^3.11.6",
|
|
||||||
"@types/unidecode": "^1.1.0",
|
"@types/unidecode": "^1.1.0",
|
||||||
},
|
},
|
||||||
"peerDependencies": {
|
"peerDependencies": {
|
||||||
|
|||||||
@@ -1,3 +0,0 @@
|
|||||||
[test]
|
|
||||||
# Test configuration
|
|
||||||
preload = ["./test/setup.ts"]
|
|
||||||
@@ -1,6 +1,5 @@
|
|||||||
services:
|
services:
|
||||||
ca-marketplace-scraper:
|
ca-marketplace-scraper:
|
||||||
container_name: ca-marketplace-scraper
|
|
||||||
build: .
|
build: .
|
||||||
ports:
|
ports:
|
||||||
- "4005:4005"
|
- "4005:4005"
|
||||||
@@ -14,8 +13,6 @@ services:
|
|||||||
retries: 3
|
retries: 3
|
||||||
start_period: 5s
|
start_period: 5s
|
||||||
restart: unless-stopped
|
restart: unless-stopped
|
||||||
networks:
|
|
||||||
- internal
|
|
||||||
networks:
|
networks:
|
||||||
internal:
|
internal:
|
||||||
driver: bridge
|
driver: bridge
|
||||||
|
|||||||
@@ -1,27 +0,0 @@
|
|||||||
{
|
|
||||||
"$schema": "https://opencode.ai/config.json",
|
|
||||||
"mcp": {
|
|
||||||
"chrome-devtools": {
|
|
||||||
"type": "local",
|
|
||||||
"command": [
|
|
||||||
"bunx",
|
|
||||||
"--bun",
|
|
||||||
"chrome-devtools-mcp@latest",
|
|
||||||
"--log-file",
|
|
||||||
"./debug.log",
|
|
||||||
"--headless=false",
|
|
||||||
"--isolated=false",
|
|
||||||
"-e",
|
|
||||||
"/nix/store/lz8ajxhnkkw2llj752bdz41wqr645h9c-google-chrome-dev-146.0.7635.0/bin/google-chrome-unstable",
|
|
||||||
"--ignore-default-chrome-arg='--disable-extensions'"
|
|
||||||
],
|
|
||||||
"enabled": false
|
|
||||||
},
|
|
||||||
"bun-docs": {
|
|
||||||
"type": "remote",
|
|
||||||
"url": "https://bun.com/docs/mcp",
|
|
||||||
"timeout": 3000,
|
|
||||||
"enabled": false
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
@@ -1,183 +0,0 @@
|
|||||||
#!/usr/bin/env bun
|
|
||||||
|
|
||||||
/**
|
|
||||||
* Facebook Cookie Parser CLI
|
|
||||||
*
|
|
||||||
* Parses Facebook cookie strings into JSON format for the marketplace scraper
|
|
||||||
*
|
|
||||||
* Usage:
|
|
||||||
* bun run scripts/parse-facebook-cookies.ts "c_user=123; xs=abc"
|
|
||||||
* bun run scripts/parse-facebook-cookies.ts --input cookies.txt
|
|
||||||
* echo "c_user=123; xs=abc" | bun run scripts/parse-facebook-cookies.ts
|
|
||||||
* bun run scripts/parse-facebook-cookies.ts "cookie_string" --output my-cookies.json
|
|
||||||
*/
|
|
||||||
|
|
||||||
import { parseFacebookCookieString } from "../src/facebook";
|
|
||||||
|
|
||||||
interface Cookie {
|
|
||||||
name: string;
|
|
||||||
value: string;
|
|
||||||
domain: string;
|
|
||||||
path: string;
|
|
||||||
secure?: boolean;
|
|
||||||
httpOnly?: boolean;
|
|
||||||
sameSite?: "strict" | "lax" | "none" | "unspecified";
|
|
||||||
expirationDate?: number;
|
|
||||||
storeId?: string;
|
|
||||||
}
|
|
||||||
|
|
||||||
function parseFacebookCookieStringCLI(cookieString: string): Cookie[] {
|
|
||||||
if (!cookieString || !cookieString.trim()) {
|
|
||||||
console.error("❌ Error: Empty or invalid cookie string provided");
|
|
||||||
process.exit(1);
|
|
||||||
}
|
|
||||||
|
|
||||||
const cookies = parseFacebookCookieString(cookieString);
|
|
||||||
|
|
||||||
if (cookies.length === 0) {
|
|
||||||
console.error("❌ Error: No valid cookies found in input string");
|
|
||||||
console.error('Expected format: "name1=value1; name2=value2;"');
|
|
||||||
process.exit(1);
|
|
||||||
}
|
|
||||||
|
|
||||||
return cookies;
|
|
||||||
}
|
|
||||||
|
|
||||||
async function main() {
|
|
||||||
const args = process.argv.slice(2);
|
|
||||||
|
|
||||||
if (args.length === 0 && process.stdin.isTTY === false) {
|
|
||||||
// Read from stdin
|
|
||||||
let input = "";
|
|
||||||
for await (const chunk of process.stdin) {
|
|
||||||
input += chunk;
|
|
||||||
}
|
|
||||||
input = input.trim();
|
|
||||||
|
|
||||||
if (!input) {
|
|
||||||
console.error("❌ Error: No input provided via stdin");
|
|
||||||
process.exit(1);
|
|
||||||
}
|
|
||||||
|
|
||||||
const cookies = parseFacebookCookieStringCLI(input);
|
|
||||||
await writeOutput(cookies, "./cookies/facebook.json");
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
let cookieString = "";
|
|
||||||
let outputPath = "./cookies/facebook.json";
|
|
||||||
let inputPath = "";
|
|
||||||
|
|
||||||
// Parse command line arguments
|
|
||||||
for (let i = 0; i < args.length; i++) {
|
|
||||||
const arg = args[i];
|
|
||||||
|
|
||||||
if (arg === "--input" || arg === "-i") {
|
|
||||||
inputPath = args[i + 1];
|
|
||||||
i++; // Skip next arg
|
|
||||||
} else if (arg === "--output" || arg === "-o") {
|
|
||||||
outputPath = args[i + 1];
|
|
||||||
i++; // Skip next arg
|
|
||||||
} else if (arg === "--help" || arg === "-h") {
|
|
||||||
showHelp();
|
|
||||||
return;
|
|
||||||
} else if (!arg.startsWith("-")) {
|
|
||||||
// Assume this is the cookie string
|
|
||||||
cookieString = arg;
|
|
||||||
} else {
|
|
||||||
console.error(`❌ Unknown option: ${arg}`);
|
|
||||||
showHelp();
|
|
||||||
process.exit(1);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Read from file if specified
|
|
||||||
if (inputPath) {
|
|
||||||
try {
|
|
||||||
const file = Bun.file(inputPath);
|
|
||||||
if (!(await file.exists())) {
|
|
||||||
console.error(`❌ Error: Input file not found: ${inputPath}`);
|
|
||||||
process.exit(1);
|
|
||||||
}
|
|
||||||
cookieString = await file.text();
|
|
||||||
} catch (error) {
|
|
||||||
console.error(`❌ Error reading input file: ${error}`);
|
|
||||||
process.exit(1);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
if (!cookieString.trim()) {
|
|
||||||
console.error("❌ Error: No cookie string provided");
|
|
||||||
console.error(
|
|
||||||
"Provide cookie string as argument, --input file, or via stdin",
|
|
||||||
);
|
|
||||||
showHelp();
|
|
||||||
process.exit(1);
|
|
||||||
}
|
|
||||||
|
|
||||||
const cookies = parseFacebookCookieStringCLI(cookieString);
|
|
||||||
await writeOutput(cookies, outputPath);
|
|
||||||
}
|
|
||||||
|
|
||||||
async function writeOutput(cookies: Cookie[], outputPath: string) {
|
|
||||||
try {
|
|
||||||
await Bun.write(outputPath, JSON.stringify(cookies, null, 2));
|
|
||||||
console.log(`✅ Successfully parsed ${cookies.length} Facebook cookies`);
|
|
||||||
console.log(`📁 Saved to: ${outputPath}`);
|
|
||||||
|
|
||||||
// Show summary of parsed cookies
|
|
||||||
console.log("\n📋 Parsed cookies:");
|
|
||||||
for (const cookie of cookies) {
|
|
||||||
console.log(
|
|
||||||
` • ${cookie.name}: ${cookie.value.substring(0, 20)}${cookie.value.length > 20 ? "..." : ""}`,
|
|
||||||
);
|
|
||||||
}
|
|
||||||
} catch (error) {
|
|
||||||
console.error(`❌ Error writing to output file: ${error}`);
|
|
||||||
process.exit(1);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
function showHelp() {
|
|
||||||
console.log(`
|
|
||||||
Facebook Cookie Parser CLI
|
|
||||||
|
|
||||||
Parses Facebook cookie strings into JSON format for the marketplace scraper.
|
|
||||||
|
|
||||||
USAGE:
|
|
||||||
bun run scripts/parse-facebook-cookies.ts [OPTIONS] [COOKIE_STRING]
|
|
||||||
|
|
||||||
EXAMPLES:
|
|
||||||
# Parse from command line argument
|
|
||||||
bun run scripts/parse-facebook-cookies.ts "c_user=123; xs=abc"
|
|
||||||
|
|
||||||
# Parse from file
|
|
||||||
bun run scripts/parse-facebook-cookies.ts --input cookies.txt
|
|
||||||
|
|
||||||
# Parse from stdin
|
|
||||||
echo "c_user=123; xs=abc" | bun run scripts/parse-facebook-cookies.ts
|
|
||||||
|
|
||||||
# Output to custom file
|
|
||||||
bun run scripts/parse-facebook-cookies.ts "cookie_string" --output my-cookies.json
|
|
||||||
|
|
||||||
OPTIONS:
|
|
||||||
-i, --input FILE Read cookie string from file
|
|
||||||
-o, --output FILE Output file path (default: ./cookies/facebook.json)
|
|
||||||
-h, --help Show this help message
|
|
||||||
|
|
||||||
COOKIE FORMAT:
|
|
||||||
Semicolon-separated name=value pairs
|
|
||||||
Example: "c_user=123456789; xs=abcdef123456; fr=xyz789"
|
|
||||||
|
|
||||||
OUTPUT:
|
|
||||||
JSON array of cookie objects saved to ./cookies/facebook.json
|
|
||||||
`);
|
|
||||||
}
|
|
||||||
|
|
||||||
// Run the CLI
|
|
||||||
if (import.meta.main) {
|
|
||||||
main().catch((error) => {
|
|
||||||
console.error(`❌ Unexpected error: ${error}`);
|
|
||||||
process.exit(1);
|
|
||||||
});
|
|
||||||
}
|
|
||||||
732
src/ebay.ts
@@ -1,103 +1,97 @@
|
|||||||
import cliProgress from "cli-progress";
|
|
||||||
/* eslint-disable @typescript-eslint/no-explicit-any */
|
/* eslint-disable @typescript-eslint/no-explicit-any */
|
||||||
import { parseHTML } from "linkedom";
|
import { parseHTML } from "linkedom";
|
||||||
|
import cliProgress from "cli-progress";
|
||||||
|
|
||||||
// ----------------------------- Types -----------------------------
|
// ----------------------------- Types -----------------------------
|
||||||
|
|
||||||
type HTMLString = string;
|
type HTMLString = string;
|
||||||
|
|
||||||
type ListingDetails = {
|
type ListingDetails = {
|
||||||
url: string;
|
url: string;
|
||||||
title: string;
|
title: string;
|
||||||
description?: string;
|
description?: string;
|
||||||
listingPrice?: {
|
listingPrice?: {
|
||||||
amountFormatted: string;
|
amountFormatted: string;
|
||||||
cents?: number;
|
cents?: number;
|
||||||
currency?: string;
|
currency?: string;
|
||||||
};
|
};
|
||||||
listingType?: string;
|
listingType?: string;
|
||||||
listingStatus?: string;
|
listingStatus?: string;
|
||||||
creationDate?: string;
|
creationDate?: string;
|
||||||
endDate?: string;
|
endDate?: string;
|
||||||
numberOfViews?: number;
|
numberOfViews?: number;
|
||||||
address?: string | null;
|
address?: string | null;
|
||||||
};
|
};
|
||||||
|
|
||||||
// ----------------------------- Utilities -----------------------------
|
// ----------------------------- Utilities -----------------------------
|
||||||
|
|
||||||
function isRecord(value: unknown): value is Record<string, unknown> {
|
function isRecord(value: unknown): value is Record<string, unknown> {
|
||||||
return typeof value === "object" && value !== null;
|
return typeof value === "object" && value !== null;
|
||||||
}
|
}
|
||||||
|
|
||||||
async function delay(ms: number): Promise<void> {
|
async function delay(ms: number): Promise<void> {
|
||||||
await new Promise((resolve) => setTimeout(resolve, ms));
|
await new Promise((resolve) => setTimeout(resolve, ms));
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Turns cents to localized currency string.
|
* Turns cents to localized currency string.
|
||||||
*/
|
*/
|
||||||
function formatCentsToCurrency(
|
function formatCentsToCurrency(
|
||||||
num: number | string | undefined,
|
num: number | string | undefined,
|
||||||
locale = "en-US",
|
locale = "en-US",
|
||||||
): string {
|
): string {
|
||||||
if (num == null) return "";
|
if (num == null) return "";
|
||||||
const cents = typeof num === "string" ? Number.parseInt(num, 10) : num;
|
const cents = typeof num === "string" ? Number.parseInt(num, 10) : num;
|
||||||
if (Number.isNaN(cents)) return "";
|
if (Number.isNaN(cents)) return "";
|
||||||
const dollars = cents / 100;
|
const dollars = cents / 100;
|
||||||
const formatter = new Intl.NumberFormat(locale, {
|
const formatter = new Intl.NumberFormat(locale, {
|
||||||
minimumFractionDigits: 2,
|
minimumFractionDigits: 2,
|
||||||
maximumFractionDigits: 2,
|
maximumFractionDigits: 2,
|
||||||
useGrouping: true,
|
useGrouping: true,
|
||||||
});
|
});
|
||||||
return formatter.format(dollars);
|
return formatter.format(dollars);
|
||||||
}
|
}
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Parse eBay currency string like "$1.50 CAD" or "CA $1.50" into cents
|
* Parse eBay currency string like "$1.50 CAD" or "CA $1.50" into cents
|
||||||
*/
|
*/
|
||||||
function parseEbayPrice(
|
function parseEbayPrice(priceText: string): { cents: number; currency: string } | null {
|
||||||
priceText: string,
|
if (!priceText || typeof priceText !== 'string') return null;
|
||||||
): { cents: number; currency: string } | null {
|
|
||||||
if (!priceText || typeof priceText !== "string") return null;
|
|
||||||
|
|
||||||
// Clean up the price text and extract currency and amount
|
// Clean up the price text and extract currency and amount
|
||||||
const cleaned = priceText.trim();
|
const cleaned = priceText.trim();
|
||||||
|
|
||||||
// Find all numbers in the string (including decimals)
|
// Find all numbers in the string (including decimals)
|
||||||
const numberMatches = cleaned.match(/[\d,]+\.?\d*/);
|
const numberMatches = cleaned.match(/[\d,]+\.?\d*/);
|
||||||
if (!numberMatches) return null;
|
if (!numberMatches) return null;
|
||||||
|
|
||||||
const amountStr = numberMatches[0].replace(/,/g, "");
|
const amountStr = numberMatches[0].replace(/,/g, '');
|
||||||
const dollars = Number.parseFloat(amountStr);
|
const dollars = parseFloat(amountStr);
|
||||||
if (Number.isNaN(dollars)) return null;
|
if (isNaN(dollars)) return null;
|
||||||
|
|
||||||
const cents = Math.round(dollars * 100);
|
const cents = Math.round(dollars * 100);
|
||||||
|
|
||||||
// Extract currency - look for common formats like "CAD", "USD", "C $", "$CA", etc.
|
// Extract currency - look for common formats like "CAD", "USD", "C $", "$CA", etc.
|
||||||
let currency = "USD"; // Default
|
let currency = 'USD'; // Default
|
||||||
|
|
||||||
if (
|
if (cleaned.toUpperCase().includes('CAD') || cleaned.includes('CA$') || cleaned.includes('C $')) {
|
||||||
cleaned.toUpperCase().includes("CAD") ||
|
currency = 'CAD';
|
||||||
cleaned.includes("CA$") ||
|
} else if (cleaned.toUpperCase().includes('USD') || cleaned.includes('$')) {
|
||||||
cleaned.includes("C $")
|
currency = 'USD';
|
||||||
) {
|
}
|
||||||
currency = "CAD";
|
|
||||||
} else if (cleaned.toUpperCase().includes("USD") || cleaned.includes("$")) {
|
|
||||||
currency = "USD";
|
|
||||||
}
|
|
||||||
|
|
||||||
return { cents, currency };
|
return { cents, currency };
|
||||||
}
|
}
|
||||||
|
|
||||||
class HttpError extends Error {
|
class HttpError extends Error {
|
||||||
constructor(
|
constructor(
|
||||||
message: string,
|
message: string,
|
||||||
public readonly status: number,
|
public readonly status: number,
|
||||||
public readonly url: string,
|
public readonly url: string,
|
||||||
) {
|
) {
|
||||||
super(message);
|
super(message);
|
||||||
this.name = "HttpError";
|
this.name = "HttpError";
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
// ----------------------------- HTTP Client -----------------------------
|
// ----------------------------- HTTP Client -----------------------------
|
||||||
@@ -108,71 +102,69 @@ class HttpError extends Error {
|
|||||||
- Respects X-RateLimit-Reset when present (seconds)
|
- Respects X-RateLimit-Reset when present (seconds)
|
||||||
*/
|
*/
|
||||||
async function fetchHtml(
|
async function fetchHtml(
|
||||||
url: string,
|
url: string,
|
||||||
DELAY_MS: number,
|
DELAY_MS: number,
|
||||||
opts?: {
|
opts?: {
|
||||||
maxRetries?: number;
|
maxRetries?: number;
|
||||||
retryBaseMs?: number;
|
retryBaseMs?: number;
|
||||||
onRateInfo?: (remaining: string | null, reset: string | null) => void;
|
onRateInfo?: (remaining: string | null, reset: string | null) => void;
|
||||||
},
|
},
|
||||||
): Promise<HTMLString> {
|
): Promise<HTMLString> {
|
||||||
const maxRetries = opts?.maxRetries ?? 3;
|
const maxRetries = opts?.maxRetries ?? 3;
|
||||||
const retryBaseMs = opts?.retryBaseMs ?? 500;
|
const retryBaseMs = opts?.retryBaseMs ?? 500;
|
||||||
|
|
||||||
for (let attempt = 0; attempt <= maxRetries; attempt++) {
|
for (let attempt = 0; attempt <= maxRetries; attempt++) {
|
||||||
try {
|
try {
|
||||||
const res = await fetch(url, {
|
const res = await fetch(url, {
|
||||||
method: "GET",
|
method: "GET",
|
||||||
headers: {
|
headers: {
|
||||||
accept:
|
accept:
|
||||||
"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
|
"text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
|
||||||
"accept-language": "en-CA,en-US;q=0.9,en;q=0.8",
|
"accept-language": "en-CA,en-US;q=0.9,en;q=0.8",
|
||||||
"cache-control": "no-cache",
|
"cache-control": "no-cache",
|
||||||
"upgrade-insecure-requests": "1",
|
"upgrade-insecure-requests": "1",
|
||||||
"user-agent":
|
"user-agent":
|
||||||
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120 Safari/537.36",
|
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120 Safari/537.36",
|
||||||
},
|
},
|
||||||
});
|
});
|
||||||
|
|
||||||
const rateLimitRemaining = res.headers.get("X-RateLimit-Remaining");
|
const rateLimitRemaining = res.headers.get("X-RateLimit-Remaining");
|
||||||
const rateLimitReset = res.headers.get("X-RateLimit-Reset");
|
const rateLimitReset = res.headers.get("X-RateLimit-Reset");
|
||||||
opts?.onRateInfo?.(rateLimitRemaining, rateLimitReset);
|
opts?.onRateInfo?.(rateLimitRemaining, rateLimitReset);
|
||||||
|
|
||||||
if (!res.ok) {
|
if (!res.ok) {
|
||||||
// Respect 429 reset if provided
|
// Respect 429 reset if provided
|
||||||
if (res.status === 429) {
|
if (res.status === 429) {
|
||||||
const resetSeconds = rateLimitReset
|
const resetSeconds = rateLimitReset ? Number(rateLimitReset) : NaN;
|
||||||
? Number(rateLimitReset)
|
const waitMs = Number.isFinite(resetSeconds)
|
||||||
: Number.NaN;
|
? Math.max(0, resetSeconds * 1000)
|
||||||
const waitMs = Number.isFinite(resetSeconds)
|
: (attempt + 1) * retryBaseMs;
|
||||||
? Math.max(0, resetSeconds * 1000)
|
await delay(waitMs);
|
||||||
: (attempt + 1) * retryBaseMs;
|
continue;
|
||||||
await delay(waitMs);
|
}
|
||||||
continue;
|
// Retry on 5xx
|
||||||
}
|
if (res.status >= 500 && res.status < 600 && attempt < maxRetries) {
|
||||||
// Retry on 5xx
|
await delay((attempt + 1) * retryBaseMs);
|
||||||
if (res.status >= 500 && res.status < 600 && attempt < maxRetries) {
|
continue;
|
||||||
await delay((attempt + 1) * retryBaseMs);
|
}
|
||||||
continue;
|
throw new HttpError(
|
||||||
}
|
`Request failed with status ${res.status}`,
|
||||||
throw new HttpError(
|
res.status,
|
||||||
`Request failed with status ${res.status}`,
|
url,
|
||||||
res.status,
|
);
|
||||||
url,
|
}
|
||||||
);
|
|
||||||
}
|
|
||||||
|
|
||||||
const html = await res.text();
|
const html = await res.text();
|
||||||
// Respect per-request delay to keep at or under REQUESTS_PER_SECOND
|
// Respect per-request delay to keep at or under REQUESTS_PER_SECOND
|
||||||
await delay(DELAY_MS);
|
await delay(DELAY_MS);
|
||||||
return html;
|
return html;
|
||||||
} catch (err) {
|
} catch (err) {
|
||||||
if (attempt >= maxRetries) throw err;
|
if (attempt >= maxRetries) throw err;
|
||||||
await delay((attempt + 1) * retryBaseMs);
|
await delay((attempt + 1) * retryBaseMs);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
throw new Error("Exhausted retries without response");
|
throw new Error("Exhausted retries without response");
|
||||||
}
|
}
|
||||||
|
|
||||||
// ----------------------------- Parsing -----------------------------
|
// ----------------------------- Parsing -----------------------------
|
||||||
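Review note on the retry logic in `fetchHtml` above: on a 429 the wait comes from `X-RateLimit-Reset` when it parses as a number, otherwise from a linear backoff of `(attempt + 1) * retryBaseMs`. A minimal sketch of just that computation follows. `computeWaitMs` is a hypothetical helper, not something the commit defines, and it assumes (as the docstring in the diff does) that the header carries a number of seconds to wait rather than an epoch timestamp.

```typescript
// Hypothetical pure helper; the diff computes the same value inline in fetchHtml.
function computeWaitMs(
  resetHeader: string | null,
  attempt: number,
  retryBaseMs: number,
): number {
  // Guard on a present header first so a missing header becomes NaN, not 0.
  const resetSeconds = resetHeader ? Number(resetHeader) : Number.NaN;
  return Number.isFinite(resetSeconds)
    ? Math.max(0, resetSeconds * 1000) // header value treated as seconds to wait
    : (attempt + 1) * retryBaseMs; // linear backoff: 1x, 2x, 3x the base delay
}
```

Some services send this header as a Unix timestamp instead; if eBay did so, the value would need to be converted to a delta before sleeping.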
@@ -181,321 +173,273 @@ async function fetchHtml(
|
|||||||
Parse eBay search page HTML and extract listings using DOM selectors
|
Parse eBay search page HTML and extract listings using DOM selectors
|
||||||
*/
|
*/
|
||||||
function parseEbayListings(
|
function parseEbayListings(
|
||||||
htmlString: HTMLString,
|
htmlString: HTMLString,
|
||||||
keywords: string[],
|
keywords: string[],
|
||||||
exclusions: string[],
|
exclusions: string[],
|
||||||
strictMode: boolean,
|
strictMode: boolean
|
||||||
): ListingDetails[] {
|
): ListingDetails[] {
|
||||||
const { document } = parseHTML(htmlString);
|
const { document } = parseHTML(htmlString);
|
||||||
const results: ListingDetails[] = [];
|
const results: ListingDetails[] = [];
|
||||||
|
|
||||||
// Find all listing links by looking for eBay item URLs (/itm/)
|
// Find all listing links by looking for eBay item URLs (/itm/)
|
||||||
const linkElements = document.querySelectorAll('a[href*="itm/"]');
|
const linkElements = document.querySelectorAll('a[href*="itm/"]');
|
||||||
|
|
||||||
for (const linkElement of linkElements) {
|
|
||||||
try {
|
|
||||||
// Get href attribute
|
|
||||||
let href = linkElement.getAttribute("href");
|
|
||||||
if (!href) continue;
|
|
||||||
|
|
||||||
// Make href absolute
|
for (const linkElement of linkElements) {
|
||||||
if (!href.startsWith("http")) {
|
try {
|
||||||
href = href.startsWith("//")
|
// Get href attribute
|
||||||
? `https:${href}`
|
let href = linkElement.getAttribute('href');
|
||||||
: `https://www.ebay.com${href}`;
|
if (!href) continue;
|
||||||
}
|
|
||||||
|
|
||||||
// Find the container - go up several levels to find the item container
|
// Make href absolute
|
||||||
// Modern eBay uses complex nested structures
|
if (!href.startsWith('http')) {
|
||||||
let container = linkElement.parentElement?.parentElement?.parentElement;
|
href = href.startsWith('//') ? `https:${href}` : `https://www.ebay.com${href}`;
|
||||||
if (!container) {
|
}
|
||||||
// Try a different level
|
|
||||||
container = linkElement.parentElement?.parentElement;
|
|
||||||
}
|
|
||||||
if (!container) continue;
|
|
||||||
|
|
||||||
// Extract title - look for heading or title-related elements near the link
|
// Find the container - go up several levels to find the item container
|
||||||
// Modern eBay often uses h3, span, or div with text content near the link
|
// Modern eBay uses complex nested structures
|
||||||
let titleElement = container.querySelector(
|
let container = linkElement.parentElement?.parentElement?.parentElement;
|
||||||
'h3, [role="heading"], .s-item__title span',
|
if (!container) {
|
||||||
);
|
// Try a different level
|
||||||
|
container = linkElement.parentElement?.parentElement;
|
||||||
|
}
|
||||||
|
if (!container) continue;
|
||||||
|
|
||||||
// If no direct title element, try finding text content around the link
|
// Extract title - look for heading or title-related elements near the link
|
||||||
if (!titleElement) {
|
// Modern eBay often uses h3, span, or div with text content near the link
|
||||||
// Look for spans or divs with text near this link
|
let titleElement = container.querySelector('h3, [role="heading"], .s-item__title span');
|
||||||
const nearbySpans = container.querySelectorAll("span, div");
|
|
||||||
for (const span of nearbySpans) {
|
|
||||||
const text = span.textContent?.trim();
|
|
||||||
if (
|
|
||||||
text &&
|
|
||||||
text.length > 10 &&
|
|
||||||
text.length < 200 &&
|
|
||||||
!text.includes("$") &&
|
|
||||||
!text.includes("item")
|
|
||||||
) {
|
|
||||||
titleElement = span;
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
let title = titleElement?.textContent?.trim();
|
// If no direct title element, try finding text content around the link
|
||||||
|
if (!titleElement) {
|
||||||
|
// Look for spans or divs with text near this link
|
||||||
|
const nearbySpans = container.querySelectorAll('span, div');
|
||||||
|
for (const span of nearbySpans) {
|
||||||
|
const text = span.textContent?.trim();
|
||||||
|
if (text && text.length > 10 && text.length < 200 && !text.includes('$') && !text.includes('item')) {
|
||||||
|
titleElement = span;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
// Clean up eBay UI strings that get included in titles
|
let title = titleElement?.textContent?.trim();
|
||||||
if (title) {
|
|
||||||
// Remove common eBay UI strings that appear at the end of titles
|
|
||||||
const uiStrings = [
|
|
||||||
"Opens in a new window",
|
|
||||||
"Opens in a new tab",
|
|
||||||
"Opens in a new window or tab",
|
|
||||||
"opens in a new window",
|
|
||||||
"opens in a new tab",
|
|
||||||
"opens in a new window or tab",
|
|
||||||
];
|
|
||||||
|
|
||||||
for (const uiString of uiStrings) {
|
// Clean up eBay UI strings that get included in titles
|
||||||
const uiIndex = title.indexOf(uiString);
|
if (title) {
|
||||||
if (uiIndex !== -1) {
|
// Remove common eBay UI strings that appear at the end of titles
|
||||||
title = title.substring(0, uiIndex).trim();
|
const uiStrings = [
|
||||||
break; // Only remove one UI string per title
|
'Opens in a new window',
|
||||||
}
|
'Opens in a new tab',
|
||||||
}
|
'Opens in a new window or tab',
|
||||||
|
'opens in a new window',
|
||||||
|
'opens in a new tab',
|
||||||
|
'opens in a new window or tab'
|
||||||
|
];
|
||||||
|
|
||||||
// If the title became empty or too short after cleaning, skip this item
|
for (const uiString of uiStrings) {
|
||||||
if (title.length < 10) {
|
const uiIndex = title.indexOf(uiString);
|
||||||
continue;
|
if (uiIndex !== -1) {
|
||||||
}
|
title = title.substring(0, uiIndex).trim();
|
||||||
}
|
break; // Only remove one UI string per title
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
if (!title) continue;
|
// If the title became empty or too short after cleaning, skip this item
|
||||||
|
if (title.length < 10) {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
// Skip irrelevant eBay ads
|
if (!title) continue;
|
||||||
if (title === "Shop on eBay" || title.length < 3) continue;
|
|
||||||
|
|
||||||
// Extract price - look for eBay's price classes, preferring sale/discount prices
|
// Skip irrelevant eBay ads
|
||||||
let priceElement = container.querySelector(
|
if (title === "Shop on eBay" || title.length < 3) continue;
|
||||||
'[class*="s-item__price"], .s-item__price, [class*="price"]',
|
|
||||||
);
|
|
||||||
|
|
||||||
// If no direct price class, look for spans containing $ (but not titles)
|
// Extract price - look for eBay's price classes, preferring sale/discount prices
|
||||||
if (!priceElement) {
|
let priceElement = container.querySelector('[class*="s-item__price"], .s-item__price, [class*="price"]');
|
||||||
const spansAndElements = container.querySelectorAll(
|
|
||||||
"span, div, b, em, strong",
|
|
||||||
);
|
|
||||||
for (const el of spansAndElements) {
|
|
||||||
const text = el.textContent?.trim();
|
|
||||||
// Must contain $, be reasonably short (price shouldn't be paragraph), and not contain product words
|
|
||||||
if (
|
|
||||||
text?.includes("$") &&
|
|
||||||
text.length < 100 &&
|
|
||||||
!text.includes("laptop") &&
|
|
||||||
!text.includes("computer") &&
|
|
||||||
!text.includes("intel") &&
|
|
||||||
!text.includes("core") &&
|
|
||||||
!text.includes("ram") &&
|
|
||||||
!text.includes("ssd") &&
|
|
||||||
!/\d{4}/.test(text) && // Avoid years like "2024"
|
|
||||||
!text.includes('"') // Avoid measurements
|
|
||||||
) {
|
|
||||||
priceElement = el;
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// For discounted items, eBay shows both original and sale price
|
// If no direct price class, look for spans containing $ (but not titles)
|
||||||
// Prefer sale/current price over original/strikethrough price
|
if (!priceElement) {
|
||||||
if (priceElement) {
|
const spansAndElements = container.querySelectorAll('span, div, b, em, strong');
|
||||||
// Check if this element or its parent contains multiple price elements
|
for (const el of spansAndElements) {
|
||||||
const priceContainer =
|
const text = el.textContent?.trim();
|
||||||
priceElement.closest('[class*="s-item__price"]') ||
|
// Must contain $, be reasonably short (price shouldn't be paragraph), and not contain product words
|
||||||
priceElement.parentElement;
|
if (text && text.includes('$') && text.length < 100 &&
|
||||||
|
!text.includes('laptop') && !text.includes('computer') && !text.includes('intel') &&
|
||||||
|
!text.includes('core') && !text.includes('ram') && !text.includes('ssd') &&
|
||||||
|
! /\d{4}/.test(text) && // Avoid years like "2024"
|
||||||
|
!text.includes('"') // Avoid measurements
|
||||||
|
) {
|
||||||
|
priceElement = el;
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
if (priceContainer) {
|
// For discounted items, eBay shows both original and sale price
|
||||||
// Look for all price elements within this container, including strikethrough prices
|
// Prefer sale/current price over original/strikethrough price
|
||||||
const allPriceElements = priceContainer.querySelectorAll(
|
if (priceElement) {
|
||||||
'[class*="s-item__price"], span, b, em, strong, s, del, strike',
|
// Check if this element or its parent contains multiple price elements
|
||||||
);
|
const priceContainer = priceElement.closest('[class*="s-item__price"]') || priceElement.parentElement;
|
||||||
|
|
||||||
// Filter to only elements that actually contain prices (not labels)
|
if (priceContainer) {
|
||||||
const actualPrices: HTMLElement[] = [];
|
// Look for all price elements within this container, including strikethrough prices
|
||||||
for (const el of allPriceElements) {
|
const allPriceElements = priceContainer.querySelectorAll('[class*="s-item__price"], span, b, em, strong, s, del, strike');
|
||||||
const text = el.textContent?.trim();
|
|
||||||
if (
|
|
||||||
text &&
|
|
||||||
/^\s*[\$£€¥]/u.test(text) &&
|
|
||||||
text.length < 50 &&
|
|
||||||
!/\d{4}/.test(text)
|
|
||||||
) {
|
|
||||||
actualPrices.push(el);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Prefer non-strikethrough prices (sale prices) over strikethrough ones (original prices)
|
// Filter to only elements that actually contain prices (not labels)
|
||||||
if (actualPrices.length > 1) {
|
const actualPrices: HTMLElement[] = [];
|
||||||
// First, look for prices that are NOT struck through
|
for (const el of allPriceElements) {
|
||||||
const nonStrikethroughPrices = actualPrices.filter((el) => {
|
const text = el.textContent?.trim();
|
||||||
const tagName = el.tagName.toLowerCase();
|
if (text && /^\s*[\$£€¥]/u.test(text) && text.length < 50 && !/\d{4}/.test(text)) {
|
||||||
const styles =
|
actualPrices.push(el);
|
||||||
el.classList.contains("s-strikethrough") ||
|
}
|
||||||
el.classList.contains("u-flStrike") ||
|
}
|
||||||
el.closest("s, del, strike");
|
|
||||||
return (
|
|
||||||
tagName !== "s" &&
|
|
||||||
tagName !== "del" &&
|
|
||||||
tagName !== "strike" &&
|
|
||||||
!styles
|
|
||||||
);
|
|
||||||
});
|
|
||||||
|
|
||||||
if (nonStrikethroughPrices.length > 0) {
|
// Prefer non-strikethrough prices (sale prices) over strikethrough ones (original prices)
|
||||||
// Use the first non-strikethrough price (sale price)
|
if (actualPrices.length > 1) {
|
||||||
priceElement = nonStrikethroughPrices[0];
|
// First, look for prices that are NOT struck through
|
||||||
} else {
|
const nonStrikethroughPrices = actualPrices.filter(el => {
|
||||||
// Fallback: use the last price (likely the most current)
|
const tagName = el.tagName.toLowerCase();
|
||||||
const lastPrice = actualPrices[actualPrices.length - 1];
|
const styles = el.classList.contains('s-strikethrough') || el.classList.contains('u-flStrike') ||
|
||||||
priceElement = lastPrice;
|
el.closest('s, del, strike');
|
||||||
}
|
return tagName !== 's' && tagName !== 'del' && tagName !== 'strike' && !styles;
|
||||||
}
|
});
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
const priceText = priceElement?.textContent?.trim();
|
if (nonStrikethroughPrices.length > 0) {
|
||||||
|
// Use the first non-strikethrough price (sale price)
|
||||||
|
priceElement = nonStrikethroughPrices[0];
|
||||||
|
} else {
|
||||||
|
// Fallback: use the last price (likely the most current)
|
||||||
|
const lastPrice = actualPrices[actualPrices.length - 1];
|
||||||
|
priceElement = lastPrice;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
if (!priceText) continue;
|
let priceText = priceElement?.textContent?.trim();
|
||||||
|
|
||||||
// Parse price into cents and currency
|
if (!priceText) continue;
|
||||||
const priceInfo = parseEbayPrice(priceText);
|
|
||||||
if (!priceInfo) continue;
|
|
||||||
|
|
||||||
// Apply exclusion filters
|
// Parse price into cents and currency
|
||||||
if (
|
const priceInfo = parseEbayPrice(priceText);
|
||||||
exclusions.some((exclusion) =>
|
if (!priceInfo) continue;
|
||||||
title.toLowerCase().includes(exclusion.toLowerCase()),
|
|
||||||
)
|
|
||||||
) {
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Apply strict mode filter (title must contain at least one keyword)
|
// Apply exclusion filters
|
||||||
if (
|
if (exclusions.some(exclusion => title.toLowerCase().includes(exclusion.toLowerCase()))) {
|
||||||
strictMode &&
|
continue;
|
||||||
!keywords.some((keyword) =>
|
}
|
||||||
title?.toLowerCase().includes(keyword.toLowerCase()),
|
|
||||||
)
|
|
||||||
) {
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
|
|
||||||
const listing: ListingDetails = {
|
// Apply strict mode filter (title must contain at least one keyword)
|
||||||
url: href,
|
if (strictMode && !keywords.some(keyword => title!.toLowerCase().includes(keyword.toLowerCase()))) {
|
||||||
title,
|
continue;
|
||||||
listingPrice: {
|
}
|
||||||
amountFormatted: priceText,
|
|
||||||
cents: priceInfo.cents,
|
|
||||||
currency: priceInfo.currency,
|
|
||||||
},
|
|
||||||
listingType: "OFFER", // eBay listings are typically offers
|
|
||||||
listingStatus: "ACTIVE",
|
|
||||||
address: null, // eBay doesn't typically show detailed addresses in search results
|
|
||||||
};
|
|
||||||
|
|
||||||
results.push(listing);
|
const listing: ListingDetails = {
|
||||||
} catch (err) {
|
url: href,
|
||||||
console.warn(`Error parsing eBay listing: ${err}`);
|
title,
|
||||||
}
|
listingPrice: {
|
||||||
}
|
amountFormatted: priceText,
|
||||||
|
cents: priceInfo.cents,
|
||||||
|
currency: priceInfo.currency,
|
||||||
|
},
|
||||||
|
listingType: "OFFER", // eBay listings are typically offers
|
||||||
|
listingStatus: "ACTIVE",
|
||||||
|
address: null, // eBay doesn't typically show detailed addresses in search results
|
||||||
|
};
|
||||||
|
|
||||||
return results;
|
results.push(listing);
|
||||||
|
} catch (err) {
|
||||||
|
console.warn(`Error parsing eBay listing: ${err}`);
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
return results;
|
||||||
}
|
}
|
||||||
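Review note on the two title filters at the end of `parseEbayListings`: exclusions reject a listing on any case-insensitive substring match, and strict mode additionally requires at least one keyword in the title. The sketch below pulls both checks into one function so they can be tested without a DOM. `passesTitleFilters` is a hypothetical name used only for this illustration.

```typescript
// Hypothetical helper; the diff applies these two checks inline.
function passesTitleFilters(
  title: string,
  keywords: string[],
  exclusions: string[],
  strictMode: boolean,
): boolean {
  const lower = title.toLowerCase();
  // Any exclusion substring rejects the listing outright.
  if (exclusions.some((e) => lower.includes(e.toLowerCase()))) return false;
  // In strict mode the title must contain at least one keyword.
  if (strictMode && !keywords.some((k) => lower.includes(k.toLowerCase()))) return false;
  return true;
}
```

One consequence of substring matching: an empty exclusion string matches every title, so a caller that splits a comma-separated parameter may want to drop empty entries first.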
|
|
||||||
// ----------------------------- Main -----------------------------
|
// ----------------------------- Main -----------------------------
|
||||||
|
|
||||||
export default async function fetchEbayItems(
|
export default async function fetchEbayItems(
|
||||||
SEARCH_QUERY: string,
|
SEARCH_QUERY: string,
|
||||||
REQUESTS_PER_SECOND = 1,
|
REQUESTS_PER_SECOND = 1,
|
||||||
opts: {
|
opts: {
|
||||||
minPrice?: number;
|
minPrice?: number;
|
||||||
maxPrice?: number;
|
maxPrice?: number;
|
||||||
strictMode?: boolean;
|
strictMode?: boolean;
|
||||||
exclusions?: string[];
|
exclusions?: string[];
|
||||||
keywords?: string[];
|
keywords?: string[];
|
||||||
} = {},
|
} = {},
|
||||||
) {
|
) {
|
||||||
const {
|
const {
|
||||||
minPrice = 0,
|
minPrice = 0,
|
||||||
maxPrice = Number.MAX_SAFE_INTEGER,
|
maxPrice = Number.MAX_SAFE_INTEGER,
|
||||||
strictMode = false,
|
strictMode = false,
|
||||||
exclusions = [],
|
exclusions = [],
|
||||||
keywords = [SEARCH_QUERY], // Default to search query if no keywords provided
|
keywords = [SEARCH_QUERY] // Default to search query if no keywords provided
|
||||||
} = opts;
|
} = opts;
|
||||||
|
|
||||||
// Build eBay search URL - use Canadian site and tracking parameters like real browser
|
// Build eBay search URL - use Canadian site and tracking parameters like real browser
|
||||||
const searchUrl = `https://www.ebay.ca/sch/i.html?_nkw=${encodeURIComponent(SEARCH_QUERY)}^&_sacat=0^&_from=R40^&_trksid=p4432023.m570.l1313`;
|
const searchUrl = `https://www.ebay.ca/sch/i.html?_nkw=${encodeURIComponent(SEARCH_QUERY)}^&_sacat=0^&_from=R40^&_trksid=p4432023.m570.l1313`;
|
||||||
|
|
||||||
const DELAY_MS = Math.max(1, Math.floor(1000 / REQUESTS_PER_SECOND));
|
const DELAY_MS = Math.max(1, Math.floor(1000 / REQUESTS_PER_SECOND));
|
||||||
|
|
||||||
console.log(`Fetching eBay search: ${searchUrl}`);
|
console.log(`Fetching eBay search: ${searchUrl}`);
|
||||||
|
|
||||||
try {
|
try {
|
||||||
// Use custom headers modeled after real browser requests to bypass bot detection
|
// Use custom headers modeled after real browser requests to bypass bot detection
|
||||||
const headers: Record<string, string> = {
|
const headers: Record<string, string> = {
|
||||||
"User-Agent":
|
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:141.0) Gecko/20100101 Firefox/141.0',
|
||||||
"Mozilla/5.0 (X11; Linux x86_64; rv:141.0) Gecko/20100101 Firefox/141.0",
|
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
|
||||||
Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
|
'Accept-Language': 'en-US,en;q=0.5',
|
||||||
"Accept-Language": "en-US,en;q=0.5",
|
'Accept-Encoding': 'gzip, deflate, br',
|
||||||
"Accept-Encoding": "gzip, deflate, br",
|
'Referer': 'https://www.ebay.ca/',
|
||||||
Referer: "https://www.ebay.ca/",
|
'Connection': 'keep-alive',
|
||||||
Connection: "keep-alive",
|
'Upgrade-Insecure-Requests': '1',
|
||||||
"Upgrade-Insecure-Requests": "1",
|
'Sec-Fetch-Dest': 'document',
|
||||||
"Sec-Fetch-Dest": "document",
|
'Sec-Fetch-Mode': 'navigate',
|
||||||
"Sec-Fetch-Mode": "navigate",
|
'Sec-Fetch-Site': 'same-origin',
|
||||||
"Sec-Fetch-Site": "same-origin",
|
'Sec-Fetch-User': '?1',
|
||||||
"Sec-Fetch-User": "?1",
|
'Priority': 'u=0, i'
|
||||||
Priority: "u=0, i",
|
};
|
||||||
};
|
|
||||||
|
|
||||||
const res = await fetch(searchUrl, {
|
const res = await fetch(searchUrl, {
|
||||||
method: "GET",
|
method: "GET",
|
||||||
headers,
|
headers,
|
||||||
});
|
});
|
||||||
|
|
||||||
if (!res.ok) {
|
if (!res.ok) {
|
||||||
throw new HttpError(
|
throw new HttpError(
|
||||||
`Request failed with status ${res.status}`,
|
`Request failed with status ${res.status}`,
|
||||||
res.status,
|
res.status,
|
||||||
searchUrl,
|
searchUrl,
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
||||||
const searchHtml = await res.text();
|
const searchHtml = await res.text();
|
||||||
// Respect per-request delay to keep at or under REQUESTS_PER_SECOND
|
// Respect per-request delay to keep at or under REQUESTS_PER_SECOND
|
||||||
await delay(DELAY_MS);
|
await delay(DELAY_MS);
|
||||||
|
|
||||||
console.log("\nParsing eBay listings...");
|
console.log(`\nParsing eBay listings...`);
|
||||||
|
|
||||||
const listings = parseEbayListings(
|
const listings = parseEbayListings(searchHtml, keywords, exclusions, strictMode);
|
||||||
searchHtml,
|
|
||||||
keywords,
|
|
||||||
exclusions,
|
|
||||||
strictMode,
|
|
||||||
);
|
|
||||||
|
|
||||||
// Filter by price range (additional safety check)
|
// Filter by price range (additional safety check)
|
||||||
const filteredListings = listings.filter((listing) => {
|
const filteredListings = listings.filter(listing => {
|
||||||
const cents = listing.listingPrice?.cents;
|
const cents = listing.listingPrice?.cents;
|
||||||
return cents && cents >= minPrice && cents <= maxPrice;
|
return cents && cents >= minPrice && cents <= maxPrice;
|
||||||
});
|
});
|
||||||
|
|
||||||
console.log(`Parsed ${filteredListings.length} eBay listings.`);
|
console.log(`Parsed ${filteredListings.length} eBay listings.`);
|
||||||
return filteredListings;
|
return filteredListings;
|
||||||
} catch (err) {
|
|
||||||
if (err instanceof HttpError) {
|
} catch (err) {
|
||||||
console.error(
|
if (err instanceof HttpError) {
|
||||||
`Failed to fetch eBay search (${err.status}): ${err.message}`,
|
console.error(
|
||||||
);
|
`Failed to fetch eBay search (${err.status}): ${err.message}`,
|
||||||
return [];
|
);
|
||||||
}
|
return [];
|
||||||
throw err;
|
}
|
||||||
}
|
throw err;
|
||||||
}
|
}
|
||||||
|
}
|
||||||
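Review note on `searchUrl` in `fetchEbayItems`: the template string joins parameters with `^&`, which looks like Windows `cmd` escaping carried over from a copied command line. As written, the `^` becomes part of the preceding parameter value. One way to avoid hand-built query strings is `URL` with `URLSearchParams`, sketched below. `buildEbaySearchUrl` is a hypothetical helper and not part of the commit; the parameter names and values are copied from the diff.

```typescript
// Hypothetical helper; the diff builds this URL with a template string instead.
function buildEbaySearchUrl(query: string): string {
  const url = new URL("https://www.ebay.ca/sch/i.html");
  // URLSearchParams handles encoding and joins pairs with a plain "&".
  url.search = new URLSearchParams({
    _nkw: query,
    _sacat: "0",
    _from: "R40",
    _trksid: "p4432023.m570.l1313",
  }).toString();
  return url.toString();
}
```

Whether eBay tolerates the stray `^` in practice is not something the diff shows, so this is a suggestion rather than a confirmed fix.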
1587
src/facebook.ts
File diff suppressed because it is too large
313
src/index.ts
@@ -1,215 +1,142 @@
|
|||||||
import fetchEbayItems from "@/ebay";
|
|
||||||
import fetchFacebookItems from "@/facebook";
|
|
||||||
import fetchKijijiItems from "@/kijiji";
|
import fetchKijijiItems from "@/kijiji";
|
||||||
|
import fetchFacebookItems from "@/facebook";
|
||||||
|
import fetchEbayItems from "@/ebay";
|
||||||
|
|
||||||
const PORT = process.env.PORT || 4005;
|
const PORT = process.env.PORT || 4005;
|
||||||
|
|
||||||
const server = Bun.serve({
|
const server = Bun.serve({
|
||||||
port: PORT,
|
port: PORT,
|
||||||
idleTimeout: 0,
|
idleTimeout: 0,
|
||||||
routes: {
|
routes: {
|
||||||
// Static routes
|
// Static routes
|
||||||
"/api/status": new Response("OK"),
|
"/api/status": new Response("OK"),
|
||||||
|
|
||||||
// Dynamic routes
|
// Dynamic routes
|
||||||
"/api/kijiji": async (req: Request) => {
|
"/api/kijiji": async (req: Request) => {
|
||||||
const reqUrl = new URL(req.url);
|
const reqUrl = new URL(req.url);
|
||||||
|
|
||||||
const SEARCH_QUERY =
|
const SEARCH_QUERY =
|
||||||
req.headers.get("query") || reqUrl.searchParams.get("q") || null;
|
req.headers.get("query") || reqUrl.searchParams.get("q") || null;
|
||||||
if (!SEARCH_QUERY)
|
if (!SEARCH_QUERY)
|
||||||
return Response.json(
|
return Response.json(
|
||||||
{
|
{
|
||||||
message:
|
message:
|
||||||
"Request didn't have 'query' header or 'q' search parameter!",
|
"Request didn't have 'query' header or 'q' search parameter!",
|
||||||
},
|
},
|
||||||
{ status: 400 },
|
{ status: 400 },
|
||||||
);
|
);
|
||||||
|
|
||||||
// Parse optional parameters with enhanced defaults
|
const items = await fetchKijijiItems(SEARCH_QUERY, 5);
|
||||||
const location = reqUrl.searchParams.get("location");
|
if (!items)
|
||||||
const category = reqUrl.searchParams.get("category");
|
return Response.json(
|
||||||
const maxPagesParam = reqUrl.searchParams.get("maxPages");
|
{ message: "Search didn't return any results!" },
|
||||||
const maxPages = maxPagesParam ? Number.parseInt(maxPagesParam, 10) : 5; // Default: 5 pages
|
{ status: 404 },
|
||||||
const sortBy = reqUrl.searchParams.get("sortBy") as
|
);
|
||||||
| "relevancy"
|
return Response.json(items, { status: 200 });
|
||||||
| "date"
|
},
|
||||||
| "price"
|
|
||||||
| "distance"
|
|
||||||
| undefined;
|
|
||||||
const sortOrder = reqUrl.searchParams.get("sortOrder") as
|
|
||||||
| "asc"
|
|
||||||
| "desc"
|
|
||||||
| undefined;
|
|
||||||
|
|
||||||
// Build search options
|
"/api/facebook": async (req: Request) => {
|
||||||
const locationValue = location
|
const reqUrl = new URL(req.url);
|
||||||
? /^\d+$/.test(location)
|
|
||||||
? Number(location)
|
|
||||||
: location
|
|
||||||
: 1700272;
|
|
||||||
const categoryValue = category
|
|
||||||
? /^\d+$/.test(category)
|
|
||||||
? Number(category)
|
|
||||||
: category
|
|
||||||
: 0;
|
|
||||||
|
|
||||||
const searchOptions: import("@/kijiji").SearchOptions = {
|
const SEARCH_QUERY =
|
||||||
location: locationValue,
|
req.headers.get("query") || reqUrl.searchParams.get("q") || null;
|
||||||
category: categoryValue,
|
if (!SEARCH_QUERY)
|
||||||
keywords: SEARCH_QUERY,
|
return Response.json(
|
||||||
sortBy: sortBy || "relevancy",
|
{
|
||||||
sortOrder: sortOrder || "desc",
|
message:
|
||||||
maxPages,
|
"Request didn't have 'query' header or 'q' search parameter!",
|
||||||
};
|
},
|
||||||
|
{ status: 400 },
|
||||||
|
);
|
||||||
|
|
||||||
// Build listing fetch options with enhanced defaults
|
const LOCATION = reqUrl.searchParams.get("location") || "toronto";
|
||||||
const listingOptions: import("@/kijiji").ListingFetchOptions = {
|
const COOKIES_SOURCE = reqUrl.searchParams.get("cookies") || undefined;
|
||||||
includeImages: true, // Always include full image arrays
|
|
||||||
sellerDataDepth: "detailed", // Default: detailed seller info
|
|
||||||
includeClientSideData: false, // GraphQL reviews disabled by default
|
|
||||||
};
|
|
||||||
|
|
||||||
try {
|
try {
|
||||||
const items = await fetchKijijiItems(
|
const items = await fetchFacebookItems(SEARCH_QUERY, 5, LOCATION, 25, COOKIES_SOURCE);
|
||||||
SEARCH_QUERY,
|
if (!items || items.length === 0)
|
||||||
1,
|
return Response.json(
|
||||||
undefined,
|
{ message: "Search didn't return any results!" },
|
||||||
searchOptions,
|
{ status: 404 },
|
||||||
listingOptions,
|
);
|
||||||
);
|
return Response.json(items, { status: 200 });
|
||||||
if (!items || items.length === 0)
|
} catch (error) {
|
||||||
return Response.json(
|
console.error("Facebook scraping error:", error);
|
||||||
{ message: "Search didn't return any results!" },
|
const errorMessage = error instanceof Error ? error.message : "Unknown error occurred";
|
||||||
{ status: 404 },
|
return Response.json(
|
||||||
);
|
{ message: errorMessage },
|
||||||
return Response.json(items, { status: 200 });
|
{ status: 400 },
|
||||||
} catch (error) {
|
);
|
||||||
console.error("Kijiji scraping error:", error);
|
}
|
||||||
const errorMessage =
|
},
|
||||||
error instanceof Error ? error.message : "Unknown error occurred";
|
|
||||||
return Response.json(
|
|
||||||
{
|
|
||||||
message: `Scraping failed: ${errorMessage}`,
|
|
||||||
query: SEARCH_QUERY,
|
|
||||||
options: { searchOptions, listingOptions },
|
|
||||||
},
|
|
||||||
{ status: 500 },
|
|
||||||
);
|
|
||||||
}
|
|
||||||
},
|
|
||||||
-    "/api/facebook": async (req: Request) => {
+    "/api/ebay": async (req: Request) => {
       const reqUrl = new URL(req.url);

       const SEARCH_QUERY =
         req.headers.get("query") || reqUrl.searchParams.get("q") || null;
       if (!SEARCH_QUERY)
         return Response.json(
           {
             message:
               "Request didn't have 'query' header or 'q' search parameter!",
           },
           { status: 400 },
         );

-      const LOCATION = reqUrl.searchParams.get("location") || "toronto";
-      const COOKIES_SOURCE = reqUrl.searchParams.get("cookies") || undefined;
+      // Parse optional parameters with defaults
+      const minPrice = reqUrl.searchParams.get("minPrice")
+        ? parseInt(reqUrl.searchParams.get("minPrice")!)
+        : undefined;
+      const maxPrice = reqUrl.searchParams.get("maxPrice")
+        ? parseInt(reqUrl.searchParams.get("maxPrice")!)
+        : undefined;
+      const strictMode = reqUrl.searchParams.get("strictMode") === "true";
+      const exclusionsParam = reqUrl.searchParams.get("exclusions");
+      const exclusions = exclusionsParam ? exclusionsParam.split(",").map(s => s.trim()) : [];
+      const keywordsParam = reqUrl.searchParams.get("keywords");
+      const keywords = keywordsParam ? keywordsParam.split(",").map(s => s.trim()) : [SEARCH_QUERY];

       try {
-        const items = await fetchFacebookItems(
-          SEARCH_QUERY,
-          5,
-          LOCATION,
-          25,
-          COOKIES_SOURCE,
-          "./cookies/facebook.json",
-        );
-        if (!items || items.length === 0)
-          return Response.json(
-            { message: "Search didn't return any results!" },
-            { status: 404 },
-          );
-        return Response.json(items, { status: 200 });
-      } catch (error) {
-        console.error("Facebook scraping error:", error);
-        const errorMessage =
-          error instanceof Error ? error.message : "Unknown error occurred";
-        return Response.json({ message: errorMessage }, { status: 400 });
-      }
-    },
-
-    "/api/ebay": async (req: Request) => {
-      const reqUrl = new URL(req.url);
-
-      const SEARCH_QUERY =
-        req.headers.get("query") || reqUrl.searchParams.get("q") || null;
-      if (!SEARCH_QUERY)
-        return Response.json(
-          {
-            message:
-              "Request didn't have 'query' header or 'q' search parameter!",
-          },
-          { status: 400 },
-        );
-
-      // Parse optional parameters with defaults
-      const minPriceParam = reqUrl.searchParams.get("minPrice");
-      const minPrice = minPriceParam
-        ? Number.parseInt(minPriceParam, 10)
-        : undefined;
-      const maxPriceParam = reqUrl.searchParams.get("maxPrice");
-      const maxPrice = maxPriceParam
-        ? Number.parseInt(maxPriceParam, 10)
-        : undefined;
-      const strictMode = reqUrl.searchParams.get("strictMode") === "true";
-      const exclusionsParam = reqUrl.searchParams.get("exclusions");
-      const exclusions = exclusionsParam
-        ? exclusionsParam.split(",").map((s) => s.trim())
-        : [];
-      const keywordsParam = reqUrl.searchParams.get("keywords");
-      const keywords = keywordsParam
-        ? keywordsParam.split(",").map((s) => s.trim())
-        : [SEARCH_QUERY];
-
-      try {
-        const items = await fetchEbayItems(SEARCH_QUERY, 5, {
-          minPrice,
-          maxPrice,
-          strictMode,
-          exclusions,
-          keywords,
-        });
-        if (!items || items.length === 0)
-          return Response.json(
-            { message: "Search didn't return any results!" },
-            { status: 404 },
-          );
-        return Response.json(items, { status: 200 });
-      } catch (error) {
-        console.error("eBay scraping error:", error);
-        const errorMessage =
-          error instanceof Error ? error.message : "Unknown error occurred";
-        return Response.json({ message: errorMessage }, { status: 400 });
-      }
-    },
-
-    // Wildcard route for all routes that start with "/api/" and aren't otherwise matched
-    "/api/*": Response.json({ message: "Not found" }, { status: 404 }),
-
-    // // Serve a file by buffering it in memory
-    // "/favicon.ico": new Response(await Bun.file("./favicon.ico").bytes(), {
-    //   headers: {
-    //     "Content-Type": "image/x-icon",
-    //   },
-    // }),
-  },
-
-  // (optional) fallback for unmatched routes:
-  // Required if Bun's version < 1.2.3
-  fetch(req: Request) {
-    return new Response("Not Found", { status: 404 });
-  },
+        const items = await fetchEbayItems(SEARCH_QUERY, 5, {
+          minPrice,
+          maxPrice,
+          strictMode,
+          exclusions,
+          keywords,
+        });
+        if (!items || items.length === 0)
+          return Response.json(
+            { message: "Search didn't return any results!" },
+            { status: 404 },
+          );
+        return Response.json(items, { status: 200 });
+      } catch (error) {
+        console.error("eBay scraping error:", error);
+        const errorMessage = error instanceof Error ? error.message : "Unknown error occurred";
+        return Response.json(
+          { message: errorMessage },
+          { status: 400 },
+        );
+      }
+    },
+
+    // Wildcard route for all routes that start with "/api/" and aren't otherwise matched
+    "/api/*": Response.json({ message: "Not found" }, { status: 404 }),
+
+    // // Serve a file by buffering it in memory
+    // "/favicon.ico": new Response(await Bun.file("./favicon.ico").bytes(), {
+    //   headers: {
+    //     "Content-Type": "image/x-icon",
+    //   },
+    // }),
+  },
+
+  // (optional) fallback for unmatched routes:
+  // Required if Bun's version < 1.2.3
+  fetch(req: Request) {
+    return new Response("Not Found", { status: 404 });
+  },
 });

 console.log(`Serving on ${server.hostname}:${server.port}`);
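Both versions of the `/api/ebay` route pass the result of `parseInt` straight into the options object, so a request such as `?minPrice=abc` would hand `NaN` to `fetchEbayItems`. The following is a minimal sketch of one way to guard against that, not code from this repository: the helper names `parseOptionalInt` and `parseCsv` and the example URL are hypothetical, and unlike the route above, `parseCsv` also drops empty entries.

```typescript
// Hypothetical helpers for illustration; they are not part of the repository.

// Returns the integer value of a query parameter, or undefined when the
// parameter is missing, blank, or not a number (instead of NaN).
function parseOptionalInt(
  params: URLSearchParams,
  key: string,
): number | undefined {
  const raw = params.get(key);
  if (raw === null || raw.trim() === "") return undefined;
  const parsed = Number.parseInt(raw, 10);
  return isNaN(parsed) ? undefined : parsed;
}

// Splits a comma-separated query parameter, trimming each entry and dropping
// empty ones; falls back to the given default when nothing usable remains.
function parseCsv(
  params: URLSearchParams,
  key: string,
  fallback: string[],
): string[] {
  const raw = params.get(key);
  if (!raw) return fallback;
  const parts = raw
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  return parts.length > 0 ? parts : fallback;
}

// Example request URL (an assumption for the sketch).
const url = new URL(
  "http://localhost/api/ebay?q=gpu&minPrice=100&maxPrice=abc&exclusions=broken,%20parts",
);
const minPrice = parseOptionalInt(url.searchParams, "minPrice"); // 100
const maxPrice = parseOptionalInt(url.searchParams, "maxPrice"); // undefined, not NaN
const exclusions = parseCsv(url.searchParams, "exclusions", []); // ["broken", "parts"]
```

Returning `undefined` for unparseable input keeps the downstream filter logic on its "no bound" path, which is arguably closer to what a caller who mistyped a price would expect than filtering against `NaN`.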
1188
src/kijiji.ts
File diff suppressed because it is too large
@@ -1,834 +0,0 @@
|
|||||||
import { afterEach, beforeEach, describe, expect, mock, test } from "bun:test";
|
|
||||||
import {
|
|
||||||
extractFacebookItemData,
|
|
||||||
extractFacebookMarketplaceData,
|
|
||||||
fetchFacebookItem,
|
|
||||||
formatCentsToCurrency,
|
|
||||||
formatCookiesForHeader,
|
|
||||||
loadFacebookCookies,
|
|
||||||
parseFacebookAds,
|
|
||||||
parseFacebookCookieString,
|
|
||||||
parseFacebookItem,
|
|
||||||
} from "../src/facebook";
|
|
||||||
|
|
||||||
// Mock fetch globally
|
|
||||||
const originalFetch = global.fetch;
|
|
||||||
|
|
||||||
describe("Facebook Marketplace Scraper Core Tests", () => {
|
|
||||||
beforeEach(() => {
|
|
||||||
global.fetch = mock(() => {
|
|
||||||
throw new Error("fetch should be mocked in individual tests");
|
|
||||||
});
|
|
||||||
});
|
|
||||||
|
|
||||||
afterEach(() => {
|
|
||||||
global.fetch = originalFetch;
|
|
||||||
});
|
|
||||||
|
|
||||||
describe("Cookie Parsing", () => {
|
|
||||||
describe("parseFacebookCookieString", () => {
|
|
||||||
test("should parse valid cookie string", () => {
|
|
||||||
const cookieString = "c_user=123456789; xs=abcdef123456; fr=xyz789";
|
|
||||||
const result = parseFacebookCookieString(cookieString);
|
|
||||||
|
|
||||||
expect(result).toHaveLength(3);
|
|
||||||
expect(result[0]).toEqual({
|
|
||||||
name: "c_user",
|
|
||||||
value: "123456789",
|
|
||||||
domain: ".facebook.com",
|
|
||||||
path: "/",
|
|
||||||
secure: true,
|
|
||||||
httpOnly: false,
|
|
||||||
sameSite: "lax",
|
|
||||||
expirationDate: undefined,
|
|
||||||
});
|
|
||||||
expect(result[1]).toEqual({
|
|
||||||
name: "xs",
|
|
||||||
value: "abcdef123456",
|
|
||||||
domain: ".facebook.com",
|
|
||||||
path: "/",
|
|
||||||
secure: true,
|
|
||||||
httpOnly: false,
|
|
||||||
sameSite: "lax",
|
|
||||||
expirationDate: undefined,
|
|
||||||
});
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle URL-encoded values", () => {
|
|
||||||
const cookieString = "c_user=123%2B456; xs=abc%3Ddef";
|
|
||||||
const result = parseFacebookCookieString(cookieString);
|
|
||||||
|
|
||||||
expect(result[0].value).toBe("123+456");
|
|
||||||
expect(result[1].value).toBe("abc=def");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should filter out malformed cookies", () => {
|
|
||||||
const cookieString = "c_user=123; invalid; xs=abc; =empty";
|
|
||||||
const result = parseFacebookCookieString(cookieString);
|
|
||||||
|
|
||||||
expect(result).toHaveLength(2);
|
|
||||||
expect(result.map((c) => c.name)).toEqual(["c_user", "xs"]);
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle empty input", () => {
|
|
||||||
expect(parseFacebookCookieString("")).toEqual([]);
|
|
||||||
expect(parseFacebookCookieString(" ")).toEqual([]);
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle extra whitespace", () => {
|
|
||||||
const cookieString = " c_user = 123 ; xs=abc ";
|
|
||||||
const result = parseFacebookCookieString(cookieString);
|
|
||||||
|
|
||||||
expect(result).toHaveLength(2);
|
|
||||||
expect(result[0].name).toBe("c_user");
|
|
||||||
expect(result[0].value).toBe("123");
|
|
||||||
expect(result[1].name).toBe("xs");
|
|
||||||
expect(result[1].value).toBe("abc");
|
|
||||||
});
|
|
||||||
});
|
|
||||||
});
|
|
||||||
|
|
||||||
describe("Facebook Item Fetching", () => {
|
|
||||||
describe("fetchFacebookItem", () => {
|
|
||||||
const mockCookies = JSON.stringify([
|
|
||||||
{ name: "c_user", value: "12345", domain: ".facebook.com" },
|
|
||||||
{ name: "xs", value: "abc123", domain: ".facebook.com" },
|
|
||||||
]);
|
|
||||||
|
|
||||||
test("should handle authentication errors", async () => {
|
|
||||||
global.fetch = mock(() =>
|
|
||||||
Promise.resolve({
|
|
||||||
ok: false,
|
|
||||||
status: 401,
|
|
||||||
text: () => Promise.resolve("Authentication required"),
|
|
||||||
headers: {
|
|
||||||
get: () => null,
|
|
||||||
},
|
|
||||||
}),
|
|
||||||
);
|
|
||||||
|
|
||||||
const result = await fetchFacebookItem("123", mockCookies);
|
|
||||||
expect(result).toBeNull();
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle item not found", async () => {
|
|
||||||
global.fetch = mock(() =>
|
|
||||||
Promise.resolve({
|
|
||||||
ok: false,
|
|
||||||
status: 404,
|
|
||||||
text: () => Promise.resolve("Not found"),
|
|
||||||
headers: {
|
|
||||||
get: () => null,
|
|
||||||
},
|
|
||||||
}),
|
|
||||||
);
|
|
||||||
|
|
||||||
const result = await fetchFacebookItem("nonexistent", mockCookies);
|
|
||||||
expect(result).toBeNull();
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle rate limiting", async () => {
|
|
||||||
let attempts = 0;
|
|
||||||
global.fetch = mock(() => {
|
|
||||||
attempts++;
|
|
||||||
if (attempts === 1) {
|
|
||||||
return Promise.resolve({
|
|
||||||
ok: false,
|
|
||||||
status: 429,
|
|
||||||
headers: {
|
|
||||||
get: (header: string) => {
|
|
||||||
if (header === "X-RateLimit-Reset") return "1";
|
|
||||||
return null;
|
|
||||||
},
|
|
||||||
},
|
|
||||||
text: () => Promise.resolve("Rate limited"),
|
|
||||||
});
|
|
||||||
}
|
|
||||||
const mockData = {
|
|
||||||
require: [
|
|
||||||
[
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
{
|
|
||||||
__bbox: {
|
|
||||||
result: {
|
|
||||||
data: {
|
|
||||||
viewer: {
|
|
||||||
marketplace_product_details_page: {
|
|
||||||
target: {
|
|
||||||
id: "123",
|
|
||||||
__typename: "GroupCommerceProductItem",
|
|
||||||
marketplace_listing_title: "Test Item",
|
|
||||||
is_live: true,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
],
|
|
||||||
],
|
|
||||||
};
|
|
||||||
return Promise.resolve({
|
|
||||||
ok: true,
|
|
||||||
text: () =>
|
|
||||||
Promise.resolve(
|
|
||||||
`<html><body><script>${JSON.stringify(mockData)}</script></body></html>`,
|
|
||||||
),
|
|
||||||
headers: {
|
|
||||||
get: () => null,
|
|
||||||
},
|
|
||||||
});
|
|
||||||
});
|
|
||||||
|
|
||||||
const result = await fetchFacebookItem("123", mockCookies);
|
|
||||||
expect(attempts).toBe(2);
|
|
||||||
// Should eventually succeed after retry
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle sold items", async () => {
|
|
||||||
const mockData = {
|
|
||||||
require: [
|
|
||||||
[
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
{
|
|
||||||
__bbox: {
|
|
||||||
result: {
|
|
||||||
data: {
|
|
||||||
viewer: {
|
|
||||||
marketplace_product_details_page: {
|
|
||||||
target: {
|
|
||||||
id: "456",
|
|
||||||
__typename: "GroupCommerceProductItem",
|
|
||||||
marketplace_listing_title: "Sold Item",
|
|
||||||
is_sold: true,
|
|
||||||
is_live: false,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
],
|
|
||||||
],
|
|
||||||
};
|
|
||||||
|
|
||||||
global.fetch = mock(() =>
|
|
||||||
Promise.resolve({
|
|
||||||
ok: true,
|
|
||||||
text: () =>
|
|
||||||
Promise.resolve(
|
|
||||||
`<html><body><script>${JSON.stringify(mockData)}</script></body></html>`,
|
|
||||||
),
|
|
||||||
headers: {
|
|
||||||
get: () => null,
|
|
||||||
},
|
|
||||||
}),
|
|
||||||
);
|
|
||||||
|
|
||||||
const result = await fetchFacebookItem("456", mockCookies);
|
|
||||||
expect(result?.listingStatus).toBe("SOLD");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle missing authentication cookies", async () => {
|
|
||||||
// Use a test-specific cookie file that doesn't exist
|
|
||||||
const testCookiePath = "./cookies/facebook-test.json";
|
|
||||||
|
|
||||||
// Test with no cookies available (test file doesn't exist)
|
|
||||||
await expect(
|
|
||||||
fetchFacebookItem("123", undefined, testCookiePath),
|
|
||||||
).rejects.toThrow("No valid Facebook cookies found");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle successful item extraction", async () => {
|
|
||||||
const mockData = {
|
|
||||||
require: [
|
|
||||||
[
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
{
|
|
||||||
__bbox: {
|
|
||||||
result: {
|
|
||||||
data: {
|
|
||||||
viewer: {
|
|
||||||
marketplace_product_details_page: {
|
|
||||||
target: {
|
|
||||||
id: "789",
|
|
||||||
__typename: "GroupCommerceProductItem",
|
|
||||||
marketplace_listing_title: "Working Item",
|
|
||||||
formatted_price: { text: "$299.00" },
|
|
||||||
listing_price: {
|
|
||||||
amount: "299.00",
|
|
||||||
currency: "CAD",
|
|
||||||
},
|
|
||||||
is_live: true,
|
|
||||||
creation_time: 1640995200,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
],
|
|
||||||
],
|
|
||||||
};
|
|
||||||
|
|
||||||
global.fetch = mock(() =>
|
|
||||||
Promise.resolve({
|
|
||||||
ok: true,
|
|
||||||
text: () =>
|
|
||||||
Promise.resolve(
|
|
||||||
`<html><body><script>${JSON.stringify(mockData)}</script></body></html>`,
|
|
||||||
),
|
|
||||||
headers: {
|
|
||||||
get: () => null,
|
|
||||||
},
|
|
||||||
}),
|
|
||||||
);
|
|
||||||
|
|
||||||
const result = await fetchFacebookItem("789", mockCookies);
|
|
||||||
expect(result).not.toBeNull();
|
|
||||||
expect(result?.title).toBe("Working Item");
|
|
||||||
expect(result?.listingPrice?.amountFormatted).toBe("$299.00");
|
|
||||||
expect(result?.listingStatus).toBe("ACTIVE");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle server errors", async () => {
|
|
||||||
global.fetch = mock(() =>
|
|
||||||
Promise.resolve({
|
|
||||||
ok: false,
|
|
||||||
status: 500,
|
|
||||||
text: () => Promise.resolve("Internal Server Error"),
|
|
||||||
headers: {
|
|
||||||
get: () => null,
|
|
||||||
},
|
|
||||||
}),
|
|
||||||
);
|
|
||||||
|
|
||||||
const result = await fetchFacebookItem("error", mockCookies);
|
|
||||||
expect(result).toBeNull();
|
|
||||||
});
|
|
||||||
});
|
|
||||||
});
|
|
||||||
|
|
||||||
describe("Data Extraction", () => {
|
|
||||||
describe("extractFacebookItemData", () => {
|
|
||||||
test("should extract item data from standard require structure", () => {
|
|
||||||
const mockItemData = {
|
|
||||||
id: "123456",
|
|
||||||
__typename: "GroupCommerceProductItem",
|
|
||||||
marketplace_listing_title: "Test Item",
|
|
||||||
formatted_price: { text: "$100.00" },
|
|
||||||
listing_price: { amount: "100.00", currency: "CAD" },
|
|
||||||
is_live: true,
|
|
||||||
};
|
|
||||||
const mockData = {
|
|
||||||
require: [
|
|
||||||
[
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
{
|
|
||||||
__bbox: {
|
|
||||||
result: {
|
|
||||||
data: {
|
|
||||||
viewer: {
|
|
||||||
marketplace_product_details_page: {
|
|
||||||
target: mockItemData,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
],
|
|
||||||
],
|
|
||||||
};
|
|
||||||
const html = `<html><body><script>${JSON.stringify(mockData)}</script></body></html>`;
|
|
||||||
|
|
||||||
const result = extractFacebookItemData(html);
|
|
||||||
expect(result).not.toBeNull();
|
|
||||||
expect(result?.id).toBe("123456");
|
|
||||||
expect(result?.marketplace_listing_title).toBe("Test Item");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle missing item data", () => {
|
|
||||||
const mockData = {
|
|
||||||
require: [
|
|
||||||
[
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
{
|
|
||||||
__bbox: {
|
|
||||||
result: {
|
|
||||||
data: {
|
|
||||||
viewer: {
|
|
||||||
marketplace_product_details_page: {},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
],
|
|
||||||
],
|
|
||||||
};
|
|
||||||
const html = `<html><body><script>${JSON.stringify(mockData)}</script></body></html>`;
|
|
||||||
|
|
||||||
const result = extractFacebookItemData(html);
|
|
||||||
expect(result).toBeNull();
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle malformed HTML", () => {
|
|
||||||
const result = extractFacebookItemData(
|
|
||||||
"<html><body>Invalid HTML</body></html>",
|
|
||||||
);
|
|
||||||
expect(result).toBeNull();
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle invalid JSON in script tags", () => {
|
|
||||||
const html =
|
|
||||||
"<html><body><script>{invalid: json}</script></body></html>";
|
|
||||||
const result = extractFacebookItemData(html);
|
|
||||||
expect(result).toBeNull();
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should extract item with vehicle data", () => {
|
|
||||||
const mockVehicleItem = {
|
|
||||||
id: "789",
|
|
||||||
__typename: "GroupCommerceProductItem",
|
|
||||||
marketplace_listing_title: "2006 Honda Civic",
|
|
||||||
formatted_price: { text: "$5,000" },
|
|
||||||
listing_price: { amount: "5000.00", currency: "CAD" },
|
|
||||||
vehicle_make_display_name: "Honda",
|
|
||||||
vehicle_model_display_name: "Civic",
|
|
||||||
vehicle_odometer_data: { unit: "KILOMETERS", value: 150000 },
|
|
||||||
vehicle_transmission_type: "AUTOMATIC",
|
|
||||||
is_live: true,
|
|
||||||
};
|
|
||||||
const mockData = {
|
|
||||||
require: [
|
|
||||||
[
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
{
|
|
||||||
__bbox: {
|
|
||||||
result: {
|
|
||||||
data: {
|
|
||||||
viewer: {
|
|
||||||
marketplace_product_details_page: {
|
|
||||||
target: mockVehicleItem,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
],
|
|
||||||
],
|
|
||||||
};
|
|
||||||
const html = `<html><body><script>${JSON.stringify(mockData)}</script></body></html>`;
|
|
||||||
|
|
||||||
const result = extractFacebookItemData(html);
|
|
||||||
expect(result).not.toBeNull();
|
|
||||||
expect(result?.vehicle_make_display_name).toBe("Honda");
|
|
||||||
expect(result?.vehicle_odometer_data?.value).toBe(150000);
|
|
||||||
});
|
|
||||||
});
|
|
||||||
|
|
||||||
describe("extractFacebookMarketplaceData", () => {
|
|
||||||
test("should extract search results from marketplace data", () => {
|
|
||||||
const mockMarketplaceData = {
|
|
||||||
feed_units: {
|
|
||||||
edges: [
|
|
||||||
{
|
|
||||||
node: {
|
|
||||||
listing: {
|
|
||||||
id: "1",
|
|
||||||
marketplace_listing_title: "Item 1",
|
|
||||||
listing_price: { amount: "10.00", currency: "CAD" },
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
node: {
|
|
||||||
listing: {
|
|
||||||
id: "2",
|
|
||||||
marketplace_listing_title: "Item 2",
|
|
||||||
listing_price: { amount: "20.00", currency: "CAD" },
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
],
|
|
||||||
},
|
|
||||||
};
|
|
||||||
const mockData = {
|
|
||||||
require: [
|
|
||||||
[
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
{
|
|
||||||
__bbox: {
|
|
||||||
result: {
|
|
||||||
data: {
|
|
||||||
marketplace_search: mockMarketplaceData,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
],
|
|
||||||
],
|
|
||||||
};
|
|
||||||
const html = `<html><body><script>${JSON.stringify(mockData)}</script></body></html>`;
|
|
||||||
|
|
||||||
const result = extractFacebookMarketplaceData(html);
|
|
||||||
expect(result).not.toBeNull();
|
|
||||||
expect(result).toHaveLength(2);
|
|
||||||
expect(result?.[0].node.listing.marketplace_listing_title).toBe(
|
|
||||||
"Item 1",
|
|
||||||
);
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle empty search results", () => {
|
|
||||||
const mockData = {
|
|
||||||
require: [
|
|
||||||
[
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
null,
|
|
||||||
{
|
|
||||||
__bbox: {
|
|
||||||
result: {
|
|
||||||
data: {
|
|
||||||
marketplace_search: {
|
|
||||||
feed_units: { edges: [] },
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
],
|
|
||||||
],
|
|
||||||
};
|
|
||||||
const html = `<html><body><script>${JSON.stringify(mockData)}</script></body></html>`;
|
|
||||||
|
|
||||||
const result = extractFacebookMarketplaceData(html);
|
|
||||||
expect(result).toBeNull();
|
|
||||||
});
|
|
||||||
});
|
|
||||||
});
|
|
||||||
|
|
||||||
describe("Data Parsing", () => {
|
|
||||||
describe("parseFacebookItem", () => {
|
|
||||||
test("should parse complete item with all fields", () => {
|
|
||||||
const item = {
|
|
||||||
id: "123456",
|
|
||||||
__typename: "GroupCommerceProductItem" as const,
|
|
||||||
marketplace_listing_title: "iPhone 13 Pro",
|
|
||||||
redacted_description: { text: "Excellent condition" },
|
|
||||||
formatted_price: { text: "$800.00" },
|
|
||||||
listing_price: { amount: "800.00", currency: "CAD" },
|
|
||||||
location_text: { text: "Toronto, ON" },
|
|
||||||
is_live: true,
|
|
||||||
creation_time: 1640995200,
|
|
||||||
marketplace_listing_seller: {
|
|
||||||
id: "seller1",
|
|
||||||
name: "John Doe",
|
|
||||||
},
|
|
||||||
delivery_types: ["IN_PERSON"],
|
|
||||||
};
|
|
||||||
|
|
||||||
const result = parseFacebookItem(item);
|
|
||||||
expect(result).not.toBeNull();
|
|
||||||
expect(result?.title).toBe("iPhone 13 Pro");
|
|
||||||
expect(result?.description).toBe("Excellent condition");
|
|
||||||
expect(result?.listingPrice?.amountFormatted).toBe("$800.00");
|
|
||||||
expect(result?.listingPrice?.cents).toBe(80000);
|
|
||||||
expect(result?.listingPrice?.currency).toBe("CAD");
|
|
||||||
expect(result?.address).toBe("Toronto, ON");
|
|
||||||
expect(result?.listingStatus).toBe("ACTIVE");
|
|
||||||
expect(result?.seller?.name).toBe("John Doe");
|
|
||||||
expect(result?.deliveryTypes).toEqual(["IN_PERSON"]);
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should parse FREE items", () => {
|
|
||||||
const item = {
|
|
||||||
id: "789",
|
|
||||||
__typename: "GroupCommerceProductItem" as const,
|
|
||||||
marketplace_listing_title: "Free Sofa",
|
|
||||||
formatted_price: { text: "FREE" },
|
|
||||||
listing_price: { amount: "0.00", currency: "CAD" },
|
|
||||||
is_live: true,
|
|
||||||
};
|
|
||||||
|
|
||||||
const result = parseFacebookItem(item);
|
|
||||||
expect(result).not.toBeNull();
|
|
||||||
expect(result?.title).toBe("Free Sofa");
|
|
||||||
expect(result?.listingPrice?.amountFormatted).toBe("FREE");
|
|
||||||
expect(result?.listingPrice?.cents).toBe(0);
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle missing optional fields", () => {
|
|
||||||
const item = {
|
|
||||||
id: "456",
|
|
||||||
__typename: "GroupCommerceProductItem" as const,
|
|
||||||
marketplace_listing_title: "Minimal Item",
|
|
||||||
};
|
|
||||||
|
|
||||||
const result = parseFacebookItem(item);
|
|
||||||
expect(result).not.toBeNull();
|
|
||||||
expect(result?.title).toBe("Minimal Item");
|
|
||||||
expect(result?.description).toBeUndefined();
|
|
||||||
expect(result?.seller).toBeUndefined();
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should identify vehicle listings", () => {
|
|
||||||
const vehicleItem = {
|
|
||||||
id: "999",
|
|
||||||
__typename: "GroupCommerceProductItem" as const,
|
|
||||||
marketplace_listing_title: "2012 Mazda 3",
|
|
||||||
formatted_price: { text: "$8,000" },
|
|
||||||
listing_price: { amount: "8000.00", currency: "CAD" },
|
|
||||||
vehicle_make_display_name: "Mazda",
|
|
||||||
vehicle_model_display_name: "3",
|
|
||||||
is_live: true,
|
|
||||||
};
|
|
||||||
|
|
||||||
const result = parseFacebookItem(vehicleItem);
|
|
||||||
expect(result?.listingType).toBe("vehicle");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle different listing statuses", () => {
|
|
||||||
const soldItem = {
|
|
||||||
id: "111",
|
|
||||||
__typename: "GroupCommerceProductItem" as const,
|
|
||||||
marketplace_listing_title: "Sold Item",
|
|
||||||
is_sold: true,
|
|
||||||
is_live: false,
|
|
||||||
};
|
|
||||||
|
|
||||||
const pendingItem = {
|
|
||||||
id: "222",
|
|
||||||
__typename: "GroupCommerceProductItem" as const,
|
|
||||||
marketplace_listing_title: "Pending Item",
|
|
||||||
is_pending: true,
|
|
||||||
is_live: true,
|
|
||||||
};
|
|
||||||
|
|
||||||
const hiddenItem = {
|
|
||||||
id: "333",
|
|
||||||
__typename: "GroupCommerceProductItem" as const,
|
|
||||||
marketplace_listing_title: "Hidden Item",
|
|
||||||
is_hidden: true,
|
|
||||||
is_live: false,
|
|
||||||
};
|
|
||||||
|
|
||||||
expect(parseFacebookItem(soldItem)?.listingStatus).toBe("SOLD");
|
|
||||||
expect(parseFacebookItem(pendingItem)?.listingStatus).toBe("PENDING");
|
|
||||||
expect(parseFacebookItem(hiddenItem)?.listingStatus).toBe("HIDDEN");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should return null for items without title", () => {
|
|
||||||
const invalidItem = {
|
|
||||||
id: "invalid",
|
|
||||||
__typename: "GroupCommerceProductItem" as const,
|
|
||||||
is_live: true,
|
|
||||||
};
|
|
||||||
|
|
||||||
const result = parseFacebookItem(invalidItem);
|
|
||||||
expect(result).toBeNull();
|
|
||||||
});
|
|
||||||
});
|
|
||||||
|
|
||||||
describe("parseFacebookAds", () => {
|
|
||||||
test("should parse search result ads", () => {
|
|
||||||
const ads = [
|
|
||||||
{
|
|
||||||
node: {
|
|
||||||
listing: {
|
|
||||||
id: "1",
|
|
||||||
marketplace_listing_title: "Ad 1",
|
|
||||||
listing_price: {
|
|
||||||
amount: "50.00",
|
|
||||||
formatted_amount: "$50.00",
|
|
||||||
currency: "CAD",
|
|
||||||
},
|
|
||||||
location: {
|
|
||||||
reverse_geocode: { city_page: { display_name: "Toronto" } },
|
|
||||||
},
|
|
||||||
creation_time: 1640995200,
|
|
||||||
is_live: true,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
node: {
|
|
||||||
listing: {
|
|
||||||
id: "2",
|
|
||||||
marketplace_listing_title: "Ad 2",
|
|
||||||
listing_price: {
|
|
||||||
amount: "75.00",
|
|
||||||
formatted_amount: "$75.00",
|
|
||||||
currency: "CAD",
|
|
||||||
},
|
|
||||||
location: {
|
|
||||||
reverse_geocode: { city_page: { display_name: "Ottawa" } },
|
|
||||||
},
|
|
||||||
creation_time: 1640995300,
|
|
||||||
is_live: true,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
];
|
|
||||||
|
|
||||||
const results = parseFacebookAds(ads);
|
|
||||||
expect(results).toHaveLength(2);
|
|
||||||
expect(results[0].title).toBe("Ad 1");
|
|
||||||
expect(results[0].listingPrice?.cents).toBe(5000);
|
|
||||||
expect(results[0].address).toBe("Toronto");
|
|
||||||
expect(results[1].title).toBe("Ad 2");
|
|
||||||
expect(results[1].address).toBe("Ottawa");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should filter out ads without price", () => {
|
|
||||||
const ads = [
|
|
||||||
{
|
|
||||||
node: {
|
|
||||||
listing: {
|
|
||||||
id: "1",
|
|
||||||
marketplace_listing_title: "With Price",
|
|
||||||
listing_price: {
|
|
||||||
amount: "100.00",
|
|
||||||
formatted_amount: "$100.00",
|
|
||||||
currency: "CAD",
|
|
||||||
},
|
|
||||||
is_live: true,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
node: {
|
|
||||||
listing: {
|
|
||||||
id: "2",
|
|
||||||
marketplace_listing_title: "No Price",
|
|
||||||
is_live: true,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
];
|
|
||||||
|
|
||||||
const results = parseFacebookAds(ads);
|
|
||||||
expect(results).toHaveLength(1);
|
|
||||||
expect(results[0].title).toBe("With Price");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle malformed ads gracefully", () => {
|
|
||||||
const ads = [
|
|
||||||
{
|
|
||||||
node: {
|
|
||||||
listing: {
|
|
||||||
id: "1",
|
|
||||||
marketplace_listing_title: "Valid Ad",
|
|
||||||
listing_price: {
|
|
||||||
amount: "50.00",
|
|
||||||
formatted_amount: "$50.00",
|
|
||||||
currency: "CAD",
|
|
||||||
},
|
|
||||||
is_live: true,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
{
|
|
||||||
node: {
|
|
||||||
// Missing listing
|
|
||||||
},
|
|
||||||
} as { node: { listing?: unknown } },
|
|
||||||
];
|
|
||||||
|
|
||||||
const results = parseFacebookAds(ads);
|
|
||||||
expect(results).toHaveLength(1);
|
|
||||||
expect(results[0].title).toBe("Valid Ad");
|
|
||||||
});
|
|
||||||
});
|
|
||||||
});
|
|
||||||
|
|
||||||
describe("Utility Functions", () => {
|
|
||||||
describe("formatCentsToCurrency", () => {
|
|
||||||
test("should format cents to currency string", () => {
|
|
||||||
expect(formatCentsToCurrency(100)).toBe("$1.00");
|
|
||||||
expect(formatCentsToCurrency(1000)).toBe("$10.00");
|
|
||||||
expect(formatCentsToCurrency(9999)).toBe("$99.99");
|
|
||||||
expect(formatCentsToCurrency(123456)).toBe("$1,234.56");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle string inputs", () => {
|
|
||||||
expect(formatCentsToCurrency("100")).toBe("$1.00");
|
|
||||||
expect(formatCentsToCurrency("1000")).toBe("$10.00");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle zero", () => {
|
|
||||||
expect(formatCentsToCurrency(0)).toBe("$0.00");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle null and undefined", () => {
|
|
||||||
expect(formatCentsToCurrency(null)).toBe("");
|
|
||||||
expect(formatCentsToCurrency(undefined)).toBe("");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle invalid inputs", () => {
|
|
||||||
expect(formatCentsToCurrency("invalid")).toBe("");
|
|
||||||
expect(formatCentsToCurrency(Number.NaN)).toBe("");
|
|
||||||
});
|
|
||||||
});
|
|
||||||
|
|
||||||
describe("formatCookiesForHeader", () => {
|
|
||||||
const mockCookies = [
|
|
||||||
{ name: "c_user", value: "123456", domain: ".facebook.com", path: "/" },
|
|
||||||
{ name: "xs", value: "abcdef", domain: ".facebook.com", path: "/" },
|
|
||||||
{ name: "session_id", value: "xyz", domain: "other.com", path: "/" },
|
|
||||||
];
|
|
||||||
|
|
||||||
test("should format cookies for header string", () => {
|
|
||||||
const result = formatCookiesForHeader(mockCookies, "www.facebook.com");
|
|
||||||
expect(result).toBe("c_user=123456; xs=abcdef");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should filter expired cookies", () => {
|
|
||||||
const cookiesWithExpiration = [
|
|
||||||
...mockCookies,
|
|
||||||
{
|
|
||||||
name: "expired",
|
|
||||||
value: "old",
|
|
||||||
domain: ".facebook.com",
|
|
||||||
path: "/",
|
|
||||||
expirationDate: Date.now() / 1000 - 1000,
|
|
||||||
},
|
|
||||||
];
|
|
||||||
const result = formatCookiesForHeader(
|
|
||||||
cookiesWithExpiration,
|
|
||||||
"www.facebook.com",
|
|
||||||
);
|
|
||||||
expect(result).not.toContain("expired");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle no matching cookies", () => {
|
|
||||||
const result = formatCookiesForHeader(mockCookies, "www.google.com");
|
|
||||||
expect(result).toBe("");
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle empty cookie array", () => {
|
|
||||||
const result = formatCookiesForHeader([], "www.facebook.com");
|
|
||||||
expect(result).toBe("");
|
|
||||||
});
|
|
||||||
});
|
|
||||||
});
|
|
||||||
});
|
|
||||||
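The "Cookie Parsing" tests in the deleted file above pin down the expected behaviour of `parseFacebookCookieString`: split on `;`, trim whitespace, URL-decode values, and skip malformed pairs. The function itself lives in `src/facebook`, which is not shown in this diff, so the following is only a sketch of a parser that would satisfy those cases. The name `parseCookieStringSketch` and the reduced `CookieSketch` shape are assumptions, and treating an empty value as malformed is a choice the tests do not cover.

```typescript
// Hypothetical sketch; this is NOT the repository's parseFacebookCookieString.
interface CookieSketch {
  name: string;
  value: string;
  domain: string;
  path: string;
}

function parseCookieStringSketch(input: string): CookieSketch[] {
  const cookies: CookieSketch[] = [];
  for (const part of input.split(";")) {
    const pair = part.trim();
    if (pair.length === 0) continue; // empty segment, e.g. from "" or "   "

    const eq = pair.indexOf("=");
    if (eq <= 0) continue; // no "=" at all, or an empty name such as "=empty"

    const name = pair.slice(0, eq).trim();
    const rawValue = pair.slice(eq + 1).trim();
    if (name.length === 0 || rawValue.length === 0) continue;

    // Decode percent-escapes; keep the raw value if it is not valid encoding.
    let value = rawValue;
    try {
      value = decodeURIComponent(rawValue);
    } catch (_err) {
      value = rawValue;
    }

    // Domain and path defaults mirror the values the tests expect.
    cookies.push({ name, value, domain: ".facebook.com", path: "/" });
  }
  return cookies;
}

const parsedCookies = parseCookieStringSketch(
  " c_user = 123%2B456 ; invalid; xs=abc%3Ddef; =empty",
);
// parsedCookies holds two entries: c_user -> "123+456" and xs -> "abc=def"
```

Splitting on the first `=` only, rather than on every `=`, is what lets a decoded or raw value contain `=` characters without being truncated.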
@@ -1,712 +0,0 @@
import { afterEach, beforeEach, describe, expect, mock, test } from "bun:test";
import fetchFacebookItems, { fetchFacebookItem } from "../src/facebook";

// Mock fetch globally
const originalFetch = global.fetch;

describe("Facebook Marketplace Scraper Integration Tests", () => {
  beforeEach(() => {
    global.fetch = mock(() => {
      throw new Error("fetch should be mocked in individual tests");
    });
  });

  afterEach(() => {
    global.fetch = originalFetch;
  });

  describe("Main Search Function", () => {
    const mockCookies = JSON.stringify([
      { name: "c_user", value: "12345", domain: ".facebook.com", path: "/" },
      { name: "xs", value: "abc123", domain: ".facebook.com", path: "/" },
    ]);
    test("should successfully fetch search results", async () => {
      const mockSearchData = {
        require: [
          [
            null,
            null,
            null,
            {
              __bbox: {
                result: {
                  data: {
                    marketplace_search: {
                      feed_units: {
                        edges: [
                          {
                            node: {
                              listing: {
                                id: "1",
                                marketplace_listing_title: "iPhone 13 Pro",
                                listing_price: {
                                  amount: "800.00",
                                  formatted_amount: "$800.00",
                                  currency: "CAD",
                                },
                                location: {
                                  reverse_geocode: {
                                    city_page: { display_name: "Toronto" },
                                  },
                                },
                                creation_time: 1640995200,
                                is_live: true,
                              },
                            },
                          },
                          {
                            node: {
                              listing: {
                                id: "2",
                                marketplace_listing_title: "Samsung Galaxy",
                                listing_price: {
                                  amount: "600.00",
                                  formatted_amount: "$600.00",
                                  currency: "CAD",
                                },
                                location: {
                                  reverse_geocode: {
                                    city_page: { display_name: "Mississauga" },
                                  },
                                },
                                creation_time: 1640995300,
                                is_live: true,
                              },
                            },
                          },
                        ],
                      },
                    },
                  },
                },
              },
            },
          ],
        ],
      };

      global.fetch = mock(() =>
        Promise.resolve({
          ok: true,
          text: () =>
            Promise.resolve(
              `<html><body><script>${JSON.stringify(mockSearchData)}</script></body></html>`,
            ),
          headers: {
            get: () => null,
          },
        }),
      );

      const results = await fetchFacebookItems(
        "iPhone",
        1,
        "toronto",
        25,
        mockCookies,
      );
      expect(results).toHaveLength(2);
      expect(results[0].title).toBe("iPhone 13 Pro");
      expect(results[1].title).toBe("Samsung Galaxy");
    });
    test("should filter out items without price", async () => {
      const mockSearchData = {
        require: [
          [
            null,
            null,
            null,
            {
              __bbox: {
                result: {
                  data: {
                    marketplace_search: {
                      feed_units: {
                        edges: [
                          {
                            node: {
                              listing: {
                                id: "1",
                                marketplace_listing_title: "With Price",
                                listing_price: {
                                  amount: "100.00",
                                  formatted_amount: "$100.00",
                                  currency: "CAD",
                                },
                                is_live: true,
                              },
                            },
                          },
                          {
                            node: {
                              listing: {
                                id: "2",
                                marketplace_listing_title: "No Price",
                                is_live: true,
                              },
                            },
                          },
                        ],
                      },
                    },
                  },
                },
              },
            },
          ],
        ],
      };

      global.fetch = mock(() =>
        Promise.resolve({
          ok: true,
          text: () =>
            Promise.resolve(
              `<html><body><script>${JSON.stringify(mockSearchData)}</script></body></html>`,
            ),
          headers: {
            get: () => null,
          },
        }),
      );

      const results = await fetchFacebookItems(
        "test",
        1,
        "toronto",
        25,
        mockCookies,
      );
      expect(results).toHaveLength(1);
      expect(results[0].title).toBe("With Price");
    });
    test("should respect MAX_ITEMS parameter", async () => {
      const mockSearchData = {
        require: [
          [
            null,
            null,
            null,
            {
              __bbox: {
                result: {
                  data: {
                    marketplace_search: {
                      feed_units: {
                        edges: Array.from({ length: 10 }, (_, i) => ({
                          node: {
                            listing: {
                              id: String(i),
                              marketplace_listing_title: `Item ${i}`,
                              listing_price: {
                                amount: `${(i + 1) * 10}.00`,
                                formatted_amount: `$${(i + 1) * 10}.00`,
                                currency: "CAD",
                              },
                              is_live: true,
                            },
                          },
                        })),
                      },
                    },
                  },
                },
              },
            },
          ],
        ],
      };

      global.fetch = mock(() =>
        Promise.resolve({
          ok: true,
          text: () =>
            Promise.resolve(
              `<html><body><script>${JSON.stringify(mockSearchData)}</script></body></html>`,
            ),
          headers: {
            get: () => null,
          },
        }),
      );

      const results = await fetchFacebookItems(
        "test",
        1,
        "toronto",
        5,
        mockCookies,
      );
      expect(results).toHaveLength(5);
    });
    test("should return empty array for no results", async () => {
      const mockSearchData = {
        require: [
          [
            null,
            null,
            null,
            {
              __bbox: {
                result: {
                  data: {
                    marketplace_search: {
                      feed_units: {
                        edges: [],
                      },
                    },
                  },
                },
              },
            },
          ],
        ],
      };

      global.fetch = mock(() =>
        Promise.resolve({
          ok: true,
          text: () =>
            Promise.resolve(
              `<html><body><script>${JSON.stringify(mockSearchData)}</script></body></html>`,
            ),
          headers: {
            get: () => null,
          },
        }),
      );

      const results = await fetchFacebookItems(
        "nonexistent query",
        1,
        "toronto",
        25,
        mockCookies,
      );
      expect(results).toEqual([]);
    });
    test("should handle authentication errors gracefully", async () => {
      global.fetch = mock(() =>
        Promise.resolve({
          ok: false,
          status: 401,
          text: () => Promise.resolve("Unauthorized"),
          headers: {
            get: () => null,
          },
        }),
      );

      const results = await fetchFacebookItems(
        "test",
        1,
        "toronto",
        25,
        mockCookies,
      );
      expect(results).toEqual([]);
    });

    test("should handle network errors", async () => {
      global.fetch = mock(() => Promise.reject(new Error("Network error")));

      await expect(
        fetchFacebookItems("test", 1, "toronto", 25, mockCookies),
      ).rejects.toThrow("Network error");
    });
    test("should handle rate limiting with retry", async () => {
      let attempts = 0;
      global.fetch = mock(() => {
        attempts++;
        if (attempts === 1) {
          return Promise.resolve({
            ok: false,
            status: 429,
            headers: {
              get: (header: string) => {
                if (header === "X-RateLimit-Reset") return "1";
                return null;
              },
            },
            text: () => Promise.resolve("Rate limited"),
          });
        }
        const mockSearchData = {
          require: [
            [
              null,
              null,
              null,
              {
                __bbox: {
                  result: {
                    data: {
                      marketplace_search: {
                        feed_units: {
                          edges: [
                            {
                              node: {
                                listing: {
                                  id: "1",
                                  marketplace_listing_title: "Item 1",
                                  listing_price: {
                                    amount: "100.00",
                                    formatted_amount: "$100.00",
                                    currency: "CAD",
                                  },
                                  is_live: true,
                                },
                              },
                            },
                          ],
                        },
                      },
                    },
                  },
                },
              },
            ],
          ],
        };
        return Promise.resolve({
          ok: true,
          text: () =>
            Promise.resolve(
              `<html><body><script>${JSON.stringify(mockSearchData)}</script></body></html>`,
            ),
          headers: {
            get: () => null,
          },
        });
      });

      const results = await fetchFacebookItems(
        "test",
        1,
        "toronto",
        25,
        mockCookies,
      );
      expect(attempts).toBe(2);
      expect(results).toHaveLength(1);
    });
  });
  describe("Vehicle Listing Integration", () => {
    const mockCookies = JSON.stringify([
      { name: "c_user", value: "12345", domain: ".facebook.com", path: "/" },
      { name: "xs", value: "abc123", domain: ".facebook.com", path: "/" },
    ]);

    test("should correctly identify and parse vehicle listings", async () => {
      const mockSearchData = {
        require: [
          [
            null,
            null,
            null,
            {
              __bbox: {
                result: {
                  data: {
                    marketplace_search: {
                      feed_units: {
                        edges: [
                          {
                            node: {
                              listing: {
                                id: "1",
                                marketplace_listing_title: "2006 Honda Civic",
                                listing_price: {
                                  amount: "8000.00",
                                  formatted_amount: "$8,000.00",
                                  currency: "CAD",
                                },
                                is_live: true,
                              },
                            },
                          },
                          {
                            node: {
                              listing: {
                                id: "2",
                                marketplace_listing_title: "iPhone 13",
                                listing_price: {
                                  amount: "800.00",
                                  formatted_amount: "$800.00",
                                  currency: "CAD",
                                },
                                is_live: true,
                              },
                            },
                          },
                        ],
                      },
                    },
                  },
                },
              },
            },
          ],
        ],
      };

      global.fetch = mock(() =>
        Promise.resolve({
          ok: true,
          text: () =>
            Promise.resolve(
              `<html><body><script>${JSON.stringify(mockSearchData)}</script></body></html>`,
            ),
          headers: {
            get: () => null,
          },
        }),
      );

      const results = await fetchFacebookItems(
        "cars",
        1,
        "toronto",
        25,
        mockCookies,
      );
      expect(results).toHaveLength(2);
      // Both should be classified as "item" type in search results (vehicle detection is for item details)
      expect(results[0].title).toBe("2006 Honda Civic");
      expect(results[1].title).toBe("iPhone 13");
    });
  });
  describe("Different Categories", () => {
    const mockCookies = JSON.stringify([
      { name: "c_user", value: "12345", domain: ".facebook.com", path: "/" },
      { name: "xs", value: "abc123", domain: ".facebook.com", path: "/" },
    ]);

    test("should handle electronics listings", async () => {
      const mockSearchData = {
        require: [
          [
            null,
            null,
            null,
            {
              __bbox: {
                result: {
                  data: {
                    marketplace_search: {
                      feed_units: {
                        edges: [
                          {
                            node: {
                              listing: {
                                id: "1",
                                marketplace_listing_title: "Nintendo Switch",
                                listing_price: {
                                  amount: "250.00",
                                  formatted_amount: "$250.00",
                                  currency: "CAD",
                                },
                                location: {
                                  reverse_geocode: {
                                    city_page: { display_name: "Toronto" },
                                  },
                                },
                                marketplace_listing_category_id:
                                  "479353692612078",
                                condition: "USED",
                                is_live: true,
                              },
                            },
                          },
                        ],
                      },
                    },
                  },
                },
              },
            },
          ],
        ],
      };

      global.fetch = mock(() =>
        Promise.resolve({
          ok: true,
          text: () =>
            Promise.resolve(
              `<html><body><script>${JSON.stringify(mockSearchData)}</script></body></html>`,
            ),
          headers: {
            get: () => null,
          },
        }),
      );

      const results = await fetchFacebookItems(
        "nintendo switch",
        1,
        "toronto",
        25,
        mockCookies,
      );
      expect(results).toHaveLength(1);
      expect(results[0].title).toBe("Nintendo Switch");
      expect(results[0].categoryId).toBe("479353692612078");
    });
    test("should handle home goods/furniture listings", async () => {
      const mockSearchData = {
        require: [
          [
            null,
            null,
            null,
            {
              __bbox: {
                result: {
                  data: {
                    marketplace_search: {
                      feed_units: {
                        edges: [
                          {
                            node: {
                              listing: {
                                id: "1",
                                marketplace_listing_title: "Dining Table",
                                listing_price: {
                                  amount: "150.00",
                                  formatted_amount: "$150.00",
                                  currency: "CAD",
                                },
                                location: {
                                  reverse_geocode: {
                                    city_page: { display_name: "Mississauga" },
                                  },
                                },
                                marketplace_listing_category_id:
                                  "1569171756675761",
                                condition: "USED",
                                is_live: true,
                              },
                            },
                          },
                        ],
                      },
                    },
                  },
                },
              },
            },
          ],
        ],
      };

      global.fetch = mock(() =>
        Promise.resolve({
          ok: true,
          text: () =>
            Promise.resolve(
              `<html><body><script>${JSON.stringify(mockSearchData)}</script></body></html>`,
            ),
          headers: {
            get: () => null,
          },
        }),
      );

      const results = await fetchFacebookItems(
        "table",
        1,
        "toronto",
        25,
        mockCookies,
      );
      expect(results).toHaveLength(1);
      expect(results[0].title).toBe("Dining Table");
      expect(results[0].categoryId).toBe("1569171756675761");
    });
  });
  describe("Error Scenarios", () => {
    const mockCookies = JSON.stringify([
      { name: "c_user", value: "12345", domain: ".facebook.com", path: "/" },
      { name: "xs", value: "abc123", domain: ".facebook.com", path: "/" },
    ]);

    test("should handle malformed HTML responses", async () => {
      global.fetch = mock(() =>
        Promise.resolve({
          ok: true,
          text: () =>
            Promise.resolve(
              "<html><body>Invalid HTML without JSON data</body></html>",
            ),
          headers: {
            get: () => null,
          },
        }),
      );

      const results = await fetchFacebookItems(
        "test",
        1,
        "toronto",
        25,
        mockCookies,
      );
      expect(results).toEqual([]);
    });

    test("should handle 404 errors gracefully", async () => {
      global.fetch = mock(() =>
        Promise.resolve({
          ok: false,
          status: 404,
          text: () => Promise.resolve("Not found"),
          headers: {
            get: () => null,
          },
        }),
      );

      const results = await fetchFacebookItems(
        "test",
        1,
        "toronto",
        25,
        mockCookies,
      );
      expect(results).toEqual([]);
    });

    test("should handle 500 errors gracefully", async () => {
      global.fetch = mock(() =>
        Promise.resolve({
          ok: false,
          status: 500,
          text: () => Promise.resolve("Internal Server Error"),
          headers: {
            get: () => null,
          },
        }),
      );

      const results = await fetchFacebookItems(
        "test",
        1,
        "toronto",
        25,
        mockCookies,
      );
      expect(results).toEqual([]);
    });
  });
});
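Nearly every test in the file above rebuilds the same nested `require` / `__bbox` envelope and the same mocked response object by hand. As a sketch only, the fixtures could be produced by two small helpers. `MockListing`, `buildSearchPayload`, and `buildHtmlResponse` are hypothetical names introduced here for illustration; they do not appear in the repository, and the envelope shape is copied from the fixtures above rather than from any Facebook documentation.

```typescript
// Hypothetical listing shape: only the fields the fixtures above populate,
// plus an index signature for extras such as `location` or `condition`.
interface MockListing {
  id: string;
  marketplace_listing_title: string;
  listing_price?: {
    amount: string;
    formatted_amount: string;
    currency: string;
  };
  is_live: boolean;
  [extra: string]: unknown;
}

// Wraps listings in the nested envelope used by the fixtures:
// require[0][3].__bbox.result.data.marketplace_search.feed_units.edges
function buildSearchPayload(listings: MockListing[]) {
  return {
    require: [
      [
        null,
        null,
        null,
        {
          __bbox: {
            result: {
              data: {
                marketplace_search: {
                  feed_units: {
                    edges: listings.map((listing) => ({ node: { listing } })),
                  },
                },
              },
            },
          },
        },
      ],
    ],
  };
}

// Builds the minimal fetch-like response object the tests hand to `mock`:
// an `ok` flag, a `text()` promise with the payload embedded in a script
// tag, and a `headers.get` that always returns null.
function buildHtmlResponse(payload: unknown) {
  return {
    ok: true,
    text: () =>
      Promise.resolve(
        `<html><body><script>${JSON.stringify(payload)}</script></body></html>`,
      ),
    headers: { get: (_name: string) => null },
  };
}
```

With helpers like these, a test body would reduce to something like `global.fetch = mock(() => Promise.resolve(buildHtmlResponse(buildSearchPayload([...]))))`, keeping each test focused on the listings it cares about.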
@@ -1,166 +0,0 @@
import { describe, expect, test } from "bun:test";
import {
  HttpError,
  NetworkError,
  ParseError,
  RateLimitError,
  ValidationError,
  buildSearchUrl,
  resolveCategoryId,
  resolveLocationId,
} from "../src/kijiji";

describe("Location and Category Resolution", () => {
  describe("resolveLocationId", () => {
    test("should return numeric IDs as-is", () => {
      expect(resolveLocationId(1700272)).toBe(1700272);
      expect(resolveLocationId(0)).toBe(0);
    });

    test("should resolve string location names", () => {
      expect(resolveLocationId("canada")).toBe(0);
      expect(resolveLocationId("ontario")).toBe(9004);
      expect(resolveLocationId("toronto")).toBe(1700273);
      expect(resolveLocationId("gta")).toBe(1700272);
    });

    test("should handle case insensitive matching", () => {
      expect(resolveLocationId("Canada")).toBe(0);
      expect(resolveLocationId("ONTARIO")).toBe(9004);
    });

    test("should default to Canada for unknown locations", () => {
      expect(resolveLocationId("unknown")).toBe(0);
      expect(resolveLocationId("")).toBe(0);
    });

    test("should handle undefined input", () => {
      expect(resolveLocationId(undefined)).toBe(0);
    });
  });

  describe("resolveCategoryId", () => {
    test("should return numeric IDs as-is", () => {
      expect(resolveCategoryId(132)).toBe(132);
      expect(resolveCategoryId(0)).toBe(0);
    });

    test("should resolve string category names", () => {
      expect(resolveCategoryId("all")).toBe(0);
      expect(resolveCategoryId("phones")).toBe(132);
      expect(resolveCategoryId("electronics")).toBe(29659001);
      expect(resolveCategoryId("buy-sell")).toBe(10);
    });

    test("should handle case insensitive matching", () => {
      expect(resolveCategoryId("All")).toBe(0);
      expect(resolveCategoryId("PHONES")).toBe(132);
    });

    test("should default to all categories for unknown categories", () => {
      expect(resolveCategoryId("unknown")).toBe(0);
      expect(resolveCategoryId("")).toBe(0);
    });

    test("should handle undefined input", () => {
      expect(resolveCategoryId(undefined)).toBe(0);
    });
  });
});
describe("URL Construction", () => {
  describe("buildSearchUrl", () => {
    test("should build basic search URL", () => {
      const url = buildSearchUrl("iphone", {
        location: 1700272,
        category: 132,
        sortBy: "relevancy",
        sortOrder: "desc",
      });

      expect(url).toContain("b-buy-sell/canada/iphone/k0c132l1700272");
      expect(url).toContain("sort=relevancyDesc");
      expect(url).toContain("order=DESC");
    });

    test("should handle pagination", () => {
      const url = buildSearchUrl("iphone", {
        location: 1700272,
        category: 132,
        page: 2,
      });

      expect(url).toContain("&page=2");
    });

    test("should handle different sort options", () => {
      const dateUrl = buildSearchUrl("iphone", {
        sortBy: "date",
        sortOrder: "asc",
      });
      expect(dateUrl).toContain("sort=DATE");
      expect(dateUrl).toContain("order=ASC");

      const priceUrl = buildSearchUrl("iphone", {
        sortBy: "price",
        sortOrder: "desc",
      });
      expect(priceUrl).toContain("sort=PRICE");
      expect(priceUrl).toContain("order=DESC");
    });

    test("should handle string location/category inputs", () => {
      const url = buildSearchUrl("iphone", {
        location: "toronto",
        category: "phones",
      });

      expect(url).toContain("k0c132l1700273"); // phones + toronto
    });
  });
});
describe("Error Classes", () => {
  test("HttpError should store status and URL", () => {
    const error = new HttpError("Not found", 404, "https://example.com");
    expect(error.message).toBe("Not found");
    expect(error.status).toBe(404);
    expect(error.url).toBe("https://example.com");
    expect(error.name).toBe("HttpError");
  });

  test("NetworkError should store URL and cause", () => {
    const cause = new Error("Connection failed");
    const error = new NetworkError(
      "Network error",
      "https://example.com",
      cause,
    );
    expect(error.message).toBe("Network error");
    expect(error.url).toBe("https://example.com");
    expect(error.cause).toBe(cause);
    expect(error.name).toBe("NetworkError");
  });

  test("ParseError should store data", () => {
    const data = { invalid: "json" };
    const error = new ParseError("Invalid JSON", data);
    expect(error.message).toBe("Invalid JSON");
    expect(error.data).toBe(data);
    expect(error.name).toBe("ParseError");
  });

  test("RateLimitError should store URL and reset time", () => {
    const error = new RateLimitError("Rate limited", "https://example.com", 60);
    expect(error.message).toBe("Rate limited");
    expect(error.url).toBe("https://example.com");
    expect(error.resetTime).toBe(60);
    expect(error.name).toBe("RateLimitError");
  });

  test("ValidationError should work without field", () => {
    const error = new ValidationError("Invalid value");
    expect(error.message).toBe("Invalid value");
    expect(error.name).toBe("ValidationError");
  });
});
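The resolution and URL tests above imply a simple contract: numeric IDs pass through unchanged, names are looked up case-insensitively, anything unknown falls back to `0`, and the resolved IDs are combined into a `k0c<category>l<location>` path segment. The sketch below illustrates that contract under stated assumptions. `sketchResolveId`, `sketchSearchSlug`, and the two lookup tables are hypothetical, they contain only the values asserted in the tests, and they are not the actual implementations of `resolveLocationId`, `resolveCategoryId`, or `buildSearchUrl`.

```typescript
// Hypothetical lookup tables holding only the names asserted in the tests.
const LOCATION_IDS = new Map<string, number>([
  ["canada", 0],
  ["ontario", 9004],
  ["toronto", 1700273],
  ["gta", 1700272],
]);

const CATEGORY_IDS = new Map<string, number>([
  ["all", 0],
  ["phones", 132],
  ["electronics", 29659001],
  ["buy-sell", 10],
]);

// Numbers pass through; strings are lower-cased and looked up; undefined,
// empty, or unknown names resolve to the fallback (0 in the tests).
function sketchResolveId(
  input: string | number | undefined,
  table: Map<string, number>,
  fallback = 0,
): number {
  if (typeof input === "number") return input;
  if (input === undefined) return fallback;
  return table.get(input.toLowerCase()) ?? fallback;
}

// Combines the resolved IDs into the path segment the URL tests look for.
function sketchSearchSlug(
  category: string | number | undefined,
  location: string | number | undefined,
): string {
  const categoryId = sketchResolveId(category, CATEGORY_IDS);
  const locationId = sketchResolveId(location, LOCATION_IDS);
  return `k0c${categoryId}l${locationId}`;
}
```

A `Map` is used instead of a plain object so that inputs such as `"constructor"` cannot accidentally match an inherited property.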
@@ -1,363 +0,0 @@
import { afterEach, beforeEach, describe, expect, mock, test } from "bun:test";
import {
  extractApolloState,
  parseDetailedListing,
  parseSearch,
} from "../src/kijiji";

// Mock fetch globally
const originalFetch = global.fetch;

describe("HTML Parsing Integration", () => {
  beforeEach(() => {
    // Mock fetch for all tests
    global.fetch = mock(() => {
      throw new Error("fetch should be mocked in individual tests");
    });
  });

  afterEach(() => {
    global.fetch = originalFetch;
  });

  describe("extractApolloState", () => {
    test("should extract Apollo state from valid HTML", () => {
      const mockHtml =
        '<html><head><script id="__NEXT_DATA__" type="application/json">{"props":{"pageProps":{"__APOLLO_STATE__":{"ROOT_QUERY":{"test":"value"}}}}}</script></head></html>';

      const result = extractApolloState(mockHtml);
      expect(result).toEqual({
        ROOT_QUERY: { test: "value" },
      });
    });

    test("should return null for HTML without Apollo state", () => {
      const mockHtml = "<html><body>No data here</body></html>";
      const result = extractApolloState(mockHtml);
      expect(result).toBeNull();
    });

    test("should return null for malformed JSON", () => {
      const mockHtml =
        '<html><script id="__NEXT_DATA__" type="application/json">{"invalid": json}</script></html>';

      const result = extractApolloState(mockHtml);
      expect(result).toBeNull();
    });

    test("should handle missing __NEXT_DATA__ element", () => {
      const mockHtml = "<html><body><div>Content</div></body></html>";
      const result = extractApolloState(mockHtml);
      expect(result).toBeNull();
    });
  });
  describe("parseSearch", () => {
    test("should parse search results from HTML", () => {
      const mockHtml = `
        <html>
          <script id="__NEXT_DATA__" type="application/json">
            ${JSON.stringify({
              props: {
                pageProps: {
                  __APOLLO_STATE__: {
                    "Listing:123": {
                      url: "/v-iphone/k0l0",
                      title: "iPhone 13 Pro",
                    },
                    "Listing:456": {
                      url: "/v-samsung/k0l0",
                      title: "Samsung Galaxy",
                    },
                    ROOT_QUERY: { test: "value" },
                  },
                },
              },
            })}
          </script>
        </html>
      `;

      const results = parseSearch(mockHtml, "https://www.kijiji.ca");
      expect(results).toHaveLength(2);
      expect(results[0]).toEqual({
        name: "iPhone 13 Pro",
        listingLink: "https://www.kijiji.ca/v-iphone/k0l0",
      });
      expect(results[1]).toEqual({
        name: "Samsung Galaxy",
        listingLink: "https://www.kijiji.ca/v-samsung/k0l0",
      });
    });

    test("should handle absolute URLs", () => {
      const mockHtml = `
        <html>
          <script id="__NEXT_DATA__" type="application/json">
            ${JSON.stringify({
              props: {
                pageProps: {
                  __APOLLO_STATE__: {
                    "Listing:123": {
                      url: "https://www.kijiji.ca/v-iphone/k0l0",
                      title: "iPhone 13 Pro",
                    },
                  },
                },
              },
            })}
          </script>
        </html>
      `;

      const results = parseSearch(mockHtml, "https://www.kijiji.ca");
      expect(results[0].listingLink).toBe(
        "https://www.kijiji.ca/v-iphone/k0l0",
      );
    });

    test("should filter out invalid listings", () => {
      const mockHtml = `
        <html>
          <script id="__NEXT_DATA__" type="application/json">
            ${JSON.stringify({
              props: {
                pageProps: {
                  __APOLLO_STATE__: {
                    "Listing:123": {
                      url: "/v-iphone/k0l0",
                      title: "iPhone 13 Pro",
                    },
                    "Listing:456": {
                      url: "/v-samsung/k0l0",
                      // Missing title
                    },
                    "Other:789": {
                      url: "/v-other/k0l0",
                      title: "Other Item",
                    },
                  },
                },
              },
            })}
          </script>
        </html>
      `;

      const results = parseSearch(mockHtml, "https://www.kijiji.ca");
      expect(results).toHaveLength(1);
      expect(results[0].name).toBe("iPhone 13 Pro");
    });

    test("should return empty array for invalid HTML", () => {
      const results = parseSearch(
        "<html><body>Invalid</body></html>",
        "https://www.kijiji.ca",
      );
      expect(results).toEqual([]);
    });
  });
  describe("parseDetailedListing", () => {
    test("should parse detailed listing with all fields", async () => {
      const mockHtml = `
        <html>
          <script id="__NEXT_DATA__" type="application/json">
            ${JSON.stringify({
              props: {
                pageProps: {
                  __APOLLO_STATE__: {
                    "Listing:123": {
                      url: "/v-iphone-13-pro/k0l0",
                      title: "iPhone 13 Pro 256GB",
                      description: "Excellent condition iPhone 13 Pro",
                      price: {
                        amount: 80000,
                        currency: "CAD",
                        type: "FIXED",
                      },
                      type: "OFFER",
                      status: "ACTIVE",
                      activationDate: "2024-01-15T10:00:00.000Z",
                      endDate: "2025-01-15T10:00:00.000Z",
                      metrics: { views: 150 },
                      location: {
                        address: "Toronto, ON",
                        id: 1700273,
                        name: "Toronto",
                        coordinates: {
                          latitude: 43.6532,
                          longitude: -79.3832,
                        },
                      },
                      imageUrls: [
                        "https://media.kijiji.ca/api/v1/image1.jpg",
                        "https://media.kijiji.ca/api/v1/image2.jpg",
                      ],
                      imageCount: 2,
                      categoryId: 132,
                      adSource: "ORGANIC",
                      flags: {
                        topAd: false,
                        priceDrop: true,
                      },
                      posterInfo: {
                        posterId: "user123",
                        rating: 4.8,
                      },
                      attributes: [
                        {
                          canonicalName: "forsaleby",
                          canonicalValues: ["ownr"],
                        },
                        {
                          canonicalName: "phonecarrier",
                          canonicalValues: ["unlocked"],
                        },
                      ],
                    },
                  },
                },
              },
            })}
          </script>
        </html>
      `;

      const result = await parseDetailedListing(
        mockHtml,
        "https://www.kijiji.ca",
      );
      expect(result).toEqual({
        url: "https://www.kijiji.ca/v-iphone-13-pro/k0l0",
        title: "iPhone 13 Pro 256GB",
        description: "Excellent condition iPhone 13 Pro",
        listingPrice: {
          amountFormatted: "$800.00",
          cents: 80000,
          currency: "CAD",
        },
        listingType: "OFFER",
        listingStatus: "ACTIVE",
        creationDate: "2024-01-15T10:00:00.000Z",
        endDate: "2025-01-15T10:00:00.000Z",
        numberOfViews: 150,
        address: "Toronto, ON",
        images: [
          "https://media.kijiji.ca/api/v1/image1.jpg",
          "https://media.kijiji.ca/api/v1/image2.jpg",
        ],
        categoryId: 132,
        adSource: "ORGANIC",
        flags: {
          topAd: false,
          priceDrop: true,
        },
        attributes: {
          forsaleby: ["ownr"],
          phonecarrier: ["unlocked"],
        },
        location: {
          id: 1700273,
          name: "Toronto",
          coordinates: {
            latitude: 43.6532,
            longitude: -79.3832,
          },
        },
        sellerInfo: {
          posterId: "user123",
          rating: 4.8,
        },
      });
    });
test("should return null for contact-based pricing", async () => {
|
|
||||||
const mockHtml = `
|
|
||||||
<html>
|
|
||||||
<script id="__NEXT_DATA__" type="application/json">
|
|
||||||
${JSON.stringify({
|
|
||||||
props: {
|
|
||||||
pageProps: {
|
|
||||||
__APOLLO_STATE__: {
|
|
||||||
"Listing:123": {
|
|
||||||
url: "/v-iphone/k0l0",
|
|
||||||
title: "iPhone for Sale",
|
|
||||||
price: {
|
|
||||||
type: "CONTACT",
|
|
||||||
amount: null,
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
})}
|
|
||||||
</script>
|
|
||||||
</html>
|
|
||||||
`;
|
|
||||||
|
|
||||||
const result = await parseDetailedListing(
|
|
||||||
mockHtml,
|
|
||||||
"https://www.kijiji.ca",
|
|
||||||
);
|
|
||||||
expect(result).toBeNull();
|
|
||||||
});
|
|
||||||
|
|
||||||
test("should handle missing optional fields", async () => {
|
|
||||||
const mockHtml = `
|
|
||||||
<html>
|
|
||||||
<script id="__NEXT_DATA__" type="application/json">
|
|
||||||
${JSON.stringify({
|
|
||||||
props: {
|
|
||||||
pageProps: {
|
|
||||||
__APOLLO_STATE__: {
|
|
||||||
"Listing:123": {
|
|
||||||
url: "/v-iphone/k0l0",
|
|
||||||
title: "iPhone 13",
|
|
||||||
price: { amount: 50000 },
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
},
|
|
||||||
})}
|
|
||||||
</script>
|
|
||||||
</html>
|
|
||||||
`;
|
|
||||||
|
|
||||||
const result = await parseDetailedListing(
|
|
||||||
mockHtml,
|
|
||||||
"https://www.kijiji.ca",
|
|
||||||
);
|
|
||||||
expect(result).toEqual({
|
|
||||||
url: "https://www.kijiji.ca/v-iphone/k0l0",
|
|
||||||
title: "iPhone 13",
|
|
||||||
description: undefined,
|
|
||||||
listingPrice: {
|
|
||||||
amountFormatted: "$500.00",
|
|
||||||
cents: 50000,
|
|
||||||
currency: undefined,
|
|
||||||
},
|
|
||||||
listingType: undefined,
|
|
||||||
listingStatus: undefined,
|
|
||||||
creationDate: undefined,
|
|
||||||
endDate: undefined,
|
|
||||||
numberOfViews: undefined,
|
|
||||||
address: null,
|
|
||||||
images: [],
|
|
||||||
categoryId: 0,
|
|
||||||
adSource: "UNKNOWN",
|
|
||||||
flags: {
|
|
||||||
topAd: false,
|
|
||||||
priceDrop: false,
|
|
||||||
},
|
|
||||||
attributes: {},
|
|
||||||
location: {
|
|
||||||
id: 0,
|
|
||||||
name: "Unknown",
|
|
||||||
coordinates: undefined,
|
|
||||||
},
|
|
||||||
sellerInfo: undefined,
|
|
||||||
});
|
|
||||||
});
|
|
||||||
});
|
|
||||||
});
|
|
@@ -1,54 +0,0 @@
import { afterEach, beforeEach, describe, expect, test } from "bun:test";
import { formatCentsToCurrency, slugify } from "../src/kijiji";

describe("Utility Functions", () => {
  describe("slugify", () => {
    test("should convert basic strings to slugs", () => {
      expect(slugify("Hello World")).toBe("hello-world");
      expect(slugify("iPhone 13 Pro")).toBe("iphone-13-pro");
    });

    test("should handle special characters", () => {
      expect(slugify("Café & Restaurant")).toBe("cafe-restaurant");
      expect(slugify("100% New")).toBe("100-new");
    });

    test("should handle empty and edge cases", () => {
      expect(slugify("")).toBe("");
      expect(slugify(" ")).toBe("-");
      expect(slugify("---")).toBe("-");
    });

    test("should preserve numbers and valid characters", () => {
      expect(slugify("iPhone 13")).toBe("iphone-13");
      expect(slugify("item123")).toBe("item123");
    });
  });

  describe("formatCentsToCurrency", () => {
    test("should format valid cent values", () => {
      expect(formatCentsToCurrency(100)).toBe("$1.00");
      expect(formatCentsToCurrency(1999)).toBe("$19.99");
      expect(formatCentsToCurrency(0)).toBe("$0.00");
    });

    test("should handle string inputs", () => {
      expect(formatCentsToCurrency("100")).toBe("$1.00");
      expect(formatCentsToCurrency("1999")).toBe("$19.99");
    });

    test("should handle null/undefined inputs", () => {
      expect(formatCentsToCurrency(null)).toBe("");
      expect(formatCentsToCurrency(undefined)).toBe("");
    });

    test("should handle invalid inputs", () => {
      expect(formatCentsToCurrency("invalid")).toBe("");
      expect(formatCentsToCurrency(Number.NaN)).toBe("");
    });

    test("should use en-US locale formatting", () => {
      expect(formatCentsToCurrency(123456)).toBe("$1,234.56");
    });
  });
});
@@ -1,14 +0,0 @@
// Test setup for Bun test runner
import { expect } from "bun:test";

// Global test setup
// This file is loaded before any tests run due to bunfig.toml preload

// Mock fetch globally for tests
global.fetch =
  global.fetch ||
  (() => {
    throw new Error("fetch is not available in test environment");
  });

// Add any global test utilities here
@@ -7,21 +7,25 @@
    "moduleDetection": "force",
    "jsx": "react-jsx",
    "allowJs": true,

    // Bundler mode
    "moduleResolution": "bundler",
    "allowImportingTsExtensions": true,
    "verbatimModuleSyntax": true,
    "noEmit": true,

    // Best practices
    "strict": true,
    "skipLibCheck": true,
    "noFallthroughCasesInSwitch": true,
    "noUncheckedIndexedAccess": true,
    "noImplicitAny": true,

    // Some stricter flags (disabled by default)
    "noUnusedLocals": false,
    "noUnusedParameters": false,
    "noPropertyAccessFromIndexSignature": false,

    "paths": {
      "@/*": ["./src/*"]
    }