Overview
@scrapegraph-ai/ai-sdk exposes ScrapeGraphAI endpoints as Vercel AI SDK tools. Add the tools to generateText or streamText, set stopWhen, and the model can scrape, extract, search, crawl, and monitor web data during the run.
Vercel AI SDK docs: Official Vercel AI SDK documentation
Tool calling: How AI SDK Core tools are executed
Installation
Install the ScrapeGraphAI tool package, the AI SDK, and the model provider you use:
npm i @scrapegraph-ai/ai-sdk ai @ai-sdk/openai
pnpm add @scrapegraph-ai/ai-sdk ai @ai-sdk/openai
yarn add @scrapegraph-ai/ai-sdk ai @ai-sdk/openai
bun add @scrapegraph-ai/ai-sdk ai @ai-sdk/openai
Set your keys:
export SGAI_API_KEY="your-scrapegraph-key"
export OPENAI_API_KEY="your-openai-key"
The tools read SGAI_API_KEY from the environment by default. You can also pass { apiKey: process.env.SGAI_API_KEY } to any tool factory.
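Because the tools read SGAI_API_KEY from the environment by default, a missing key is easier to diagnose at startup than partway through an agent run. The sketch below is one way to check for it. `missingKeys` is a hypothetical helper, not part of @scrapegraph-ai/ai-sdk, and it takes the environment as a plain object so it does not depend on a particular runtime.

```typescript
// Hypothetical helper (not part of @scrapegraph-ai/ai-sdk): returns the names
// of required variables that are unset or blank in the given environment.
function missingKeys(
  env: Record<string, string | undefined>,
  required: string[],
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example with an in-memory environment; in Node.js you would pass process.env.
const missing = missingKeys(
  { SGAI_API_KEY: "your-scrapegraph-key", OPENAI_API_KEY: "" },
  ["SGAI_API_KEY", "OPENAI_API_KEY"],
);
console.log(missing); // only OPENAI_API_KEY is reported, because it is blank
```

An app could throw at startup when the returned list is non-empty, before any call to generateText.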
Quickstart
Give the model a scrape tool and allow multiple steps so it can call the tool, receive the result, then write the final answer.
import { openai } from "@ai-sdk/openai";
import { generateText, stepCountIs } from "ai";
import { scrapeTool } from "@scrapegraph-ai/ai-sdk";

const { text } = await generateText({
  model: openai("gpt-5-nano"),
  prompt:
    "Scrape Hacker News and write a short, concise summary of what people are talking about today.",
  tools: {
    scrape: scrapeTool(),
  },
  stopWhen: stepCountIs(3),
});

console.log(text);
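The overview also mentions streamText. As far as the AI SDK documents it, the streamText result exposes a `textStream` that can be consumed with `for await`, so the same tools and stopWhen settings can drive incremental output. The sketch below shows only the consuming side. `forwardTextStream` is a hypothetical helper, and the async generator is a stand-in for `textStream` so the example runs without network access.

```typescript
// Hypothetical helper: forwards each chunk as it arrives and returns the full text.
// Accepts any AsyncIterable<string>; streamText's `textStream` is assumed to fit this shape.
async function forwardTextStream(
  textStream: AsyncIterable<string>,
  onChunk: (chunk: string) => void,
): Promise<string> {
  let full = "";
  for await (const chunk of textStream) {
    onChunk(chunk);
    full += chunk;
  }
  return full;
}

// Stand-in for the text stream of a streamText call (made-up content).
async function* fakeStream(): AsyncGenerator<string> {
  yield "Top stories: ";
  yield "AI tooling, ";
  yield "databases.";
}

const chunks: string[] = [];
const pending = forwardTextStream(fakeStream(), (chunk) => {
  chunks.push(chunk);
});
pending.then((full) => console.log(full)); // logs the three chunks joined together
```

In a real run, the first argument would be the `textStream` of a streamText call that uses the same model, prompt, tools, and stopWhen as the quickstart.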
| Factory | What it gives the model |
| --- | --- |
| scrapeTool() | Scrape a page as markdown, HTML, JSON, links, images, summary, branding, or screenshot |
| extractTool() | Extract structured JSON from a URL, HTML, or markdown with a prompt |
| searchTool() | Search the web and optionally extract structured data from results |
| crawlTools() | Start, poll, page through, stop, resume, and delete crawl jobs |
| monitorTools() | Create, list, update, pause, resume, delete, and inspect monitor activity |
Use a narrow tool set when the task is specific. Use all tools when the agent needs to decide the workflow:
import { openai } from "@ai-sdk/openai";
import { generateText, stepCountIs } from "ai";
import {
  crawlTools,
  extractTool,
  monitorTools,
  scrapeTool,
  searchTool,
} from "@scrapegraph-ai/ai-sdk";

const { text } = await generateText({
  model: openai("gpt-5-nano"),
  prompt: "Search for ScrapeGraphAI docs, scrape the best page, and summarize it.",
  tools: {
    scrape: scrapeTool(),
    extract: extractTool(),
    search: searchTool(),
    ...crawlTools(),
    ...monitorTools(),
  },
  stopWhen: stepCountIs(10),
});

console.log(text);
Scrape example
This is the smallest useful agent: one scrape tool, a concrete target, and enough steps for the model to call the tool before answering.
import { openai } from "@ai-sdk/openai";
import { generateText, stepCountIs } from "ai";
import { scrapeTool } from "@scrapegraph-ai/ai-sdk";

const result = await generateText({
  model: openai("gpt-5-nano"),
  prompt: "Find the main headline on https://example.com",
  tools: {
    scrape: scrapeTool(),
  },
  stopWhen: stepCountIs(5),
});

console.log(result.text);
Pass an API key explicitly when your runtime does not expose environment variables:
const tools = {
  scrape: scrapeTool({ apiKey: process.env.SGAI_API_KEY }),
};
Crawl example
crawlTools() gives the model the full async crawl loop: start the job, poll status with getCrawl, then retrieve paginated pages with getCrawlPages.
import { openai } from "@ai-sdk/openai";
import { generateText, stepCountIs } from "ai";
import { crawlTools } from "@scrapegraph-ai/ai-sdk";

const { text, steps } = await generateText({
  model: openai("gpt-5-nano"),
  prompt:
    "Find 10 https://scrapegraphai.com/ blog posts. Start a crawl, poll its status, fetch crawled pages with getCrawlPages, then summarize what you found.",
  tools: {
    ...crawlTools(),
  },
  stopWhen: stepCountIs(20),
});

for (const step of steps) {
  for (const toolCall of step.toolCalls) {
    console.log(`[tool] ${toolCall.toolName}`);
    console.log(JSON.stringify(toolCall.input, null, 2));
  }
}

console.log(text);
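The logging loop above can also be turned into a quick check that the model followed the crawl loop. A minimal sketch follows. `countToolCalls` is a hypothetical helper that assumes only what the loop above already relies on: each step has a `toolCalls` array whose entries carry a `toolName`. The sample data is made up for illustration.

```typescript
// Minimal structural type: only the fields the logging loop above relies on.
type StepLike = { toolCalls: { toolName: string }[] };

// Hypothetical helper: counts how many times each tool was called across all steps.
function countToolCalls(steps: StepLike[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const step of steps) {
    for (const call of step.toolCalls) {
      counts[call.toolName] = (counts[call.toolName] ?? 0) + 1;
    }
  }
  return counts;
}

// Illustrative data shaped like `steps` from generateText (not real output).
const sampleSteps: StepLike[] = [
  { toolCalls: [{ toolName: "startCrawl" }] },
  { toolCalls: [{ toolName: "getCrawl" }] },
  { toolCalls: [{ toolName: "getCrawl" }] },
  { toolCalls: [{ toolName: "getCrawlPages" }] },
  { toolCalls: [] }, // final step: the model writes its answer
];

const counts = countToolCalls(sampleSteps);
console.log(counts); // startCrawl: 1, getCrawl: 2, getCrawlPages: 1
```

A count of zero for getCrawlPages, for example, would suggest the model summarized without ever fetching the crawled pages.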
For longer crawls, keep the same tools but add your app’s own timeout, cancellation, and persistence around the AI SDK call.
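One way to add the timeout and cancellation part is an AbortController wrapped around the call. The sketch below is an assumption-laden example, not something the package ships: `withTimeout` is a hypothetical helper, and it relies on the AI SDK's `abortSignal` option to generateText, which should be checked against the AI SDK version in use. Persistence is not covered here.

```typescript
// Hypothetical helper: runs a task with an AbortSignal that fires after `ms` milliseconds.
// The task itself must honour the signal; this wrapper only triggers it.
async function withTimeout<T>(
  ms: number,
  task: (signal: AbortSignal) => Promise<T>,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await task(controller.signal);
  } finally {
    clearTimeout(timer); // always release the timer, even when the task throws
  }
}

// Stand-in task that resolves with "aborted" as soon as the signal fires.
function waitForAbort(signal: AbortSignal): Promise<string> {
  return new Promise((resolve) => {
    signal.addEventListener("abort", () => resolve("aborted"));
  });
}

const outcome = withTimeout(10, waitForAbort);
outcome.then((value) => console.log(value)); // logs "aborted" after roughly 10 ms
```

With the crawl example, the task would be something like `(abortSignal) => generateText({ ...options, abortSignal })`, where `options` holds the model, prompt, tools, and stopWhen shown above.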
Scrape
import { scrapeTool } from "@scrapegraph-ai/ai-sdk";

const tools = {
  scrape: scrapeTool(),
};
Extract
import { extractTool } from "@scrapegraph-ai/ai-sdk";

const tools = {
  extract: extractTool(),
};
Search
import { searchTool } from "@scrapegraph-ai/ai-sdk";

const tools = {
  search: searchTool(),
};
Crawl
import { crawlTools } from "@scrapegraph-ai/ai-sdk";

const tools = {
  ...crawlTools(),
};
crawlTools() registers startCrawl, getCrawl, getCrawlPages, stopCrawl, resumeCrawl, and deleteCrawl.
Monitor
import { monitorTools } from "@scrapegraph-ai/ai-sdk";

const tools = {
  ...monitorTools(),
};
monitorTools() registers createMonitor, listMonitors, getMonitor, updateMonitor, deleteMonitor, pauseMonitor, resumeMonitor, and getMonitorActivity.
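Both crawlTools() and monitorTools() register destructive operations (deleteCrawl, deleteMonitor) alongside read-only ones. If the model should not be able to delete anything, the tool object can be filtered before it is passed to generateText. The sketch below assumes the factories return plain objects keyed by tool name, as the spread syntax above suggests. `omitTools` is a hypothetical helper, and the placeholder values stand in for real tool definitions.

```typescript
// Hypothetical helper: returns a copy of a tool set without the named tools.
function omitTools<V>(
  tools: Record<string, V>,
  names: string[],
): Record<string, V> {
  const blocked = new Set(names);
  const kept: Record<string, V> = {};
  for (const name of Object.keys(tools)) {
    if (!blocked.has(name)) {
      kept[name] = tools[name];
    }
  }
  return kept;
}

// Stand-in for { ...crawlTools(), ...monitorTools() }; the values are placeholders.
const allTools: Record<string, object> = {
  startCrawl: {},
  getCrawl: {},
  getCrawlPages: {},
  deleteCrawl: {},
  listMonitors: {},
  deleteMonitor: {},
};

const safeTools = omitTools(allTools, ["deleteCrawl", "deleteMonitor"]);
console.log(Object.keys(safeTools)); // the four non-destructive tool names remain
```

The original object is left unchanged, so the full set stays available for code paths where deletion is intended.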
Support
GitHub Issues: Report bugs and request features
Discord Community: Get help from our community