For crawled page content, call GET /api/crawl/:id/pages first; it returns paginated crawl pages with the underlying scrape result resolved into each page. Use History when you need to inspect an individual underlying request by its scrapeRefId.
List history
Query parameters
- page: Page number to fetch (1-indexed).
- limit: Entries per page.
- service: Filter by service. One of "scrape", "extract", "search", "monitor", "crawl", "schema".
Example request
Example response
| Field | Description |
|---|---|
| data[] | Ordered list of history entries (newest first). See Entry shape. |
| pagination.page / pagination.limit | Echo of the request’s page and limit. |
| pagination.total | Total entry count matching the filter (across all pages). |
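The pagination fields above are enough to walk the full history. The following is a minimal sketch, not an official client: `fetch_page` is a hypothetical callable you supply that performs the GET /history request with the given query parameters and returns the decoded JSON body, and the default `limit` of 20 is an arbitrary choice rather than a documented value.

```python
from typing import Any, Callable, Dict, Iterator, Optional

# Hypothetical transport: takes the query parameters, returns the decoded
# JSON body of GET /history (a dict with "data" and "pagination").
FetchPage = Callable[[Dict[str, Any]], Dict[str, Any]]


def iter_history(fetch_page: FetchPage, service: Optional[str] = None,
                 limit: int = 20) -> Iterator[Dict[str, Any]]:
    """Yield every history entry, newest first, one page at a time."""
    page = 1  # pages are 1-indexed
    seen = 0
    while True:
        params: Dict[str, Any] = {"page": page, "limit": limit}
        if service is not None:
            params["service"] = service
        body = fetch_page(params)
        entries = body["data"]
        yield from entries
        seen += len(entries)
        # Stop once pagination.total is reached; the empty-page check is a
        # guard against looping forever if the total changes mid-walk.
        if not entries or seen >= body["pagination"]["total"]:
            return
        page += 1
```

Injecting the transport keeps the loop independent of any particular HTTP library, base URL, or authentication scheme, none of which are specified in this section.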
Get one entry
Returns a single entry, including its full result payload (markdown, HTML, JSON extraction, screenshots, etc.).
Path parameters
- id: The UUID of a request. This is the same UUID returned by the originating endpoint:
  - From POST /api/scrape → top-level id
  - From POST /api/extract → top-level id
  - From POST /api/search → top-level id
  - From GET /api/crawl/:id → each pages[].scrapeRefId
  - From GET /api/monitor/:cronId/activity → each ticks[].id
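The mapping above can be captured in a small helper. This is an illustrative sketch only: the function name and the `source` labels ("crawl", "monitor_activity", and so on) are made up for the example and are not API values; only the response field names come from the list above.

```python
from typing import Any, Dict, List


def history_ids(source: str, response: Dict[str, Any]) -> List[str]:
    """Return the UUIDs in `response` that can be passed to GET /history/:id.

    `source` is a label chosen for this sketch to say which endpoint
    produced `response`.
    """
    if source in ("scrape", "extract", "search"):
        return [response["id"]]  # top-level id
    if source == "crawl":
        # GET /api/crawl/:id -> each pages[].scrapeRefId
        return [page["scrapeRefId"] for page in response["pages"]]
    if source == "monitor_activity":
        # GET /api/monitor/:cronId/activity -> each ticks[].id
        return [tick["id"] for tick in response["ticks"]]
    raise ValueError(f"unknown source: {source!r}")
```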
Example request
Example response
Entry shape
Every entry, both in GET /history and GET /history/:id, has the same shape:
| Field | Description |
|---|---|
| id | Entry UUID. Same UUID as the originating endpoint returned. |
| userId | The account that issued the request. |
| service | One of "scrape", "extract", "search", "monitor", "crawl", "schema". |
| status | Lifecycle: "running", "completed", or "failed". |
| params | The request body that produced this entry (URL, prompt, formats, etc.). |
| result | The full response payload, shaped per the originating endpoint. null while running, populated on completion. |
| error | Error object if status === "failed", otherwise null. |
| elapsedMs | How long the request took, in milliseconds. |
| requestParentId | If this entry was created as a child of another (e.g. a scrape run by a crawl), the parent’s UUID. null for top-level requests. |
| createdAt | ISO-8601 timestamp. |
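Because result and error depend on status, callers typically branch on the lifecycle before reading either field. A minimal sketch of that branching, assuming the entry has already been decoded into a dict; the helper and exception names are hypothetical:

```python
from typing import Any, Dict, Optional


class HistoryEntryFailed(Exception):
    """Raised by this sketch when an entry's status is "failed"."""


def result_or_none(entry: Dict[str, Any]) -> Optional[Any]:
    """Return the entry's result, or None if the request is still running."""
    status = entry["status"]
    if status == "running":
        return None  # result is null until the request completes
    if status == "failed":
        # error holds the error object for failed entries
        raise HistoryEntryFailed(entry["error"])
    if status == "completed":
        return entry["result"]
    raise ValueError(f"unexpected status: {status!r}")
```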
Fetching crawled page content
The canonical pattern: start a crawl, poll until it is completed, then fetch the scrape result for each page.
requestParentId on each child scrape entry equals the parent crawl’s id, so you can also list every page produced by a single crawl by filtering on that field.
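The pattern can be sketched roughly as follows. Several things here are assumptions rather than documented behaviour: `get_json` is a hypothetical callable you supply that issues an authenticated GET for a path and returns the decoded JSON; the crawl response is assumed to expose a status field with the same lifecycle values as history entries; and the polling interval and retry cap are arbitrary. Paths are written as they appear in this page, so any base URL or prefix is left to `get_json`.

```python
import time
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical transport: takes a request path, returns the decoded JSON body.
GetJson = Callable[[str], Dict[str, Any]]


def fetch_crawl_results(get_json: GetJson, crawl_id: str,
                        poll_seconds: float = 2.0,
                        max_polls: int = 150) -> List[Dict[str, Any]]:
    """Poll a crawl until it completes, then fetch each page's history entry."""
    for _ in range(max_polls):
        crawl = get_json(f"/api/crawl/{crawl_id}")
        # Assumption: the crawl response carries a "status" field.
        if crawl["status"] == "completed":
            break
        if crawl["status"] == "failed":
            raise RuntimeError(f"crawl {crawl_id} failed")
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"crawl {crawl_id} not completed after {max_polls} polls")
    # One history lookup per crawled page, keyed by its scrapeRefId.
    return [get_json(f"/history/{page['scrapeRefId']}") for page in crawl["pages"]]


def children_of(entries: Iterable[Dict[str, Any]],
                crawl_id: str) -> List[Dict[str, Any]]:
    """Keep only the entries whose requestParentId is the given crawl's id."""
    return [e for e in entries if e.get("requestParentId") == crawl_id]
```

`children_of` filters client-side over entries you have already listed; it does not assume any server-side filter beyond the documented service parameter.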
Errors
| HTTP | error.type | When |
|---|---|---|
| 400 | validation | Malformed id (must be a UUID), or invalid service filter value. |
| 404 | not_found | The id is well-formed but no matching entry exists for this account. |
| 403 | auth_invalid_key | The API key is invalid or revoked. |
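One way to surface these cases in client code is to map error.type to distinct exceptions. The sketch below assumes the error body is JSON shaped like {"error": {"type": ...}}, which is inferred from the error.type column header rather than stated in this page; all class and function names are hypothetical.

```python
from typing import Any, Dict


class HistoryApiError(Exception):
    """Base error for this sketch; carries the HTTP status and error.type."""

    def __init__(self, status_code: int, error_type: str, body: Dict[str, Any]):
        super().__init__(f"HTTP {status_code}: {error_type}")
        self.status_code = status_code
        self.error_type = error_type
        self.body = body


class ValidationFailed(HistoryApiError):
    """400 validation: malformed id or invalid service filter value."""


class EntryNotFound(HistoryApiError):
    """404 not_found: no matching entry for this account."""


class InvalidApiKey(HistoryApiError):
    """403 auth_invalid_key: the API key is invalid or revoked."""


_BY_TYPE = {
    "validation": ValidationFailed,
    "not_found": EntryNotFound,
    "auth_invalid_key": InvalidApiKey,
}


def raise_for_history_error(status_code: int, body: Dict[str, Any]) -> None:
    """Raise a typed exception for error responses; do nothing otherwise."""
    if status_code < 400:
        return
    # Assumption: the error type lives at body["error"]["type"].
    error_type = (body.get("error") or {}).get("type", "unknown")
    exc_class = _BY_TYPE.get(error_type, HistoryApiError)
    raise exc_class(status_code, error_type, body)
```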
Related
- Crawl jobs that produce scrapeRefIds: Get crawl status
- Originating endpoints whose id you can pass to GET /history/:id: Scrape, Extract, Search
- SDK wrappers: sgai.history.list() and sgai.history.get(id); see JavaScript SDK and Python SDK