Developer's Guide

How to Use the Automate Endpoint

Overview

Automating complex, multi-step browser workflows has traditionally been brittle and time-consuming. The TABS API /v1/automate endpoint solves this by letting you describe your goal in natural language. An AI agent interprets your task, navigates websites, interacts with elements, and extracts data, all while streaming real-time progress updates.

This approach moves beyond simple, static scraping to handle dynamic, interactive tasks. It's a powerful tool for:

  • Complex web scraping requiring clicks, navigation, and stateful interaction.
  • Automated form submission and multi-step workflows.
  • Multi-page data collection, including pagination and "load more" buttons.
  • Dynamic content extraction from SPAs (Single Page Applications).
  • Price monitoring and competitive research that requires site interaction.
  • Testing and validating web application workflows.

Key Capabilities:

  • Natural Language Control: Execute tasks using plain-English instructions.
  • AI-Powered Interaction: The agent can click, type, scroll, and navigate.
  • Real-Time Streaming: Get live feedback via Server-Sent Events (SSE).
  • Data & Context: Pass in JSON data for form-filling.
  • Safety Guardrails: Define "do-not-do" rules for safe execution.

Prerequisites

Before you can use the /v1/automate endpoint, you'll need:

  1. A valid TABS API key: Sign up at https://tabstack.ai to get your key.
  2. Authentication: The API uses Bearer token authentication.
  3. SSE-capable client: Your HTTP client must be able to handle a text/event-stream response.
  4. A clear task: A specific, natural language description of your goal.

We recommend storing your API key in an environment variable for security.

Set Your API Key

This command makes your API key available in your shell session.

export TABS_API_KEY="your-api-key-here"

The export command is a shell built-in that sets an environment variable for the current session and for any programs started from it. We're setting the variable name TABS_API_KEY, which is what the code examples look for throughout this guide. Replace "your-api-key-here" with the actual API key you received from TABS.

How to Run:

  1. Open your terminal.
  2. Paste the command, replacing the placeholder with your key.
  3. Press Enter. This variable will be set for your current terminal session.

Basic Usage

Endpoint Details

  • URL: https://api.tabstack.ai/v1/automate
  • Method: POST
  • Authentication: Bearer <your-api-key> (Required)
  • Content-Type: application/json
  • Response Type: text/event-stream (Server-Sent Events)

Streaming Response

Unlike other TABS API endpoints, /v1/automate always streams its response. You must use a client that can process Server-Sent Events (SSE) to receive real-time updates and the final result.

Minimal Request Example

Let's start with a simple task: extracting the top 3 trending repositories from GitHub. We'll send a POST request with our task and url in the JSON body.

curl -X POST https://api.tabstack.ai/v1/automate \
  -H "Authorization: Bearer $TABS_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "task": "Find the top 3 trending repositories and extract their names",
    "url": "https://github.com/trending"
  }'

Code Explanations

Here’s a breakdown of what each example is doing:

curl Breakdown:

This sends a POST request with authentication and a JSON payload containing your task and URL. The key flag here is -N, which disables buffering—without it, you won't see the streaming events in real-time. The response streams back as Server-Sent Events showing the agent's progress.

JavaScript Breakdown:

The code makes an authenticated POST request, then processes the streaming response using getReader(). The while loop reads chunks as they arrive, and since chunks may contain multiple events or partial events, the code splits by newlines and processes complete lines. The try/catch handles incomplete JSON that might arrive mid-chunk. This streaming approach lets you show real-time updates as the agent works.
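
A minimal sketch of the kind of JavaScript this breakdown describes, assuming Node.js 18 or later for the built-in fetch. The endpoint, headers, and body fields come from the curl command above; the helper names parseChunk and runMinimalExample are hypothetical.

```javascript
// Parse one decoded chunk: split on newlines and JSON-parse each "data: " line.
// A line holding incomplete JSON (the chunk ended mid-event) is skipped.
function parseChunk(chunk) {
  const events = [];
  for (const line of chunk.split('\n')) {
    if (!line.startsWith('data: ')) continue;
    try {
      events.push(JSON.parse(line.slice(6)));
    } catch (e) {
      // Incomplete JSON: ignored in this minimal version
    }
  }
  return events;
}

// Send the request and log every event payload as it arrives.
async function runMinimalExample() {
  const response = await fetch('https://api.tabstack.ai/v1/automate', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.TABS_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      task: 'Find the top 3 trending repositories and extract their names',
      url: 'https://github.com/trending'
    })
  });

  const reader = response.body.getReader();
  const decoder = new TextDecoder();

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    for (const event of parseChunk(decoder.decode(value, { stream: true }))) {
      console.log(event);
    }
  }
}

// Uncomment to call the API (requires TABS_API_KEY to be set):
// runMinimalExample();
```

Because this version drops any event that straddles a chunk boundary, it is suited to logging only; the buffered version later in this guide avoids that loss.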

Python Breakdown:

The Python version is more straightforward thanks to the requests library. Setting stream=True enables streaming, and iter_lines() handles the line-by-line processing automatically. The code decodes each line, checks for the SSE data: prefix, and parses the JSON. Python's requests library makes streaming simpler than the JavaScript version.

How to Run These Examples

  1. Environment: Ensure you've set your TABS_API_KEY environment variable (see Prerequisites).
  2. curl: Paste the command directly into your terminal and run it. You will see events stream directly to your console.
  3. JavaScript: Save the code as automate.mjs. Node.js 18 and later include a global fetch; on older versions, install node-fetch (npm install node-fetch) and import it. Run it with Node.js: node automate.mjs. (Note: process.env is available only in Node.js, not in browsers.)
  4. Python: Save the code as automate.py. Install requests (pip install requests). Run it from your terminal: python automate.py.

Understanding Server-Sent Events (SSE)

Because automation tasks can take time, the endpoint streams its progress. You won't get a single JSON response; you'll get a series of events. Here’s an example of the event flow you'll receive.

event: task:started
data: {"task": "Find the top 3 trending repositories", "url": "https://github.com/trending"}

event: agent:processing
data: {"operation": "Creating task plan", "hasScreenshot": false}

event: browser:navigated
data: {"title": "Trending - GitHub", "url": "https://github.com/trending"}

event: agent:action
data: {"action": "extract", "target": "repository list"}

event: agent:extracted
data: {"extractedData": "[{\"name\": \"awesome-ai\"}, {\"name\": \"web-framework\"}, {\"name\": \"data-viz\"}]"}

event: task:completed
data: {"success": true, "finalAnswer": "Top 3 repos: awesome-ai, web-framework, data-viz"}

event: complete
data: {"success": true, "result": {"finalAnswer": "Top 3 repos: awesome-ai, web-framework, data-viz"}}

event: done
data: {}

Stream Explanation:

Each message in the stream consists of an event: type (like agent:processing or task:completed) and a data: payload containing a JSON object with details. The stream begins with event: task:started, which confirms the task has begun. As the AI works, you'll see event: agent:processing when the AI is thinking or planning its next steps. When the agent navigates to a new page, event: browser:navigated is emitted. The event: agent:extracted indicates the agent has successfully extracted data from the page. Near the end, both event: task:completed and event: complete signal the task has finished successfully and contain the final answer. Finally, event: done signals that the stream is closing.

This real-time flow lets you build responsive UIs that show status messages such as "Navigating...", "Extracting data...", and "Task complete!"
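
One way to turn this flow into status text is to pair each event: line with the data: line that follows it. The sketch below is illustrative only: parseSseText and statusLabel are hypothetical helper names, the label strings are placeholders, and it assumes every message is one event: line followed by one data: line, as in the sample stream above.

```javascript
// Pair each "event:" line with the "data:" line that follows it.
function parseSseText(text) {
  const messages = [];
  let type = null;
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line.startsWith('event:')) {
      type = line.slice('event:'.length).trim();
    } else if (line.startsWith('data:') && type !== null) {
      messages.push({ type, data: JSON.parse(line.slice('data:'.length).trim()) });
      type = null;
    }
  }
  return messages;
}

// Map an event type to a short status message (labels are placeholders).
function statusLabel(type) {
  switch (type) {
    case 'task:started': return 'Starting...';
    case 'agent:processing': return 'Thinking...';
    case 'browser:navigated': return 'Navigating...';
    case 'agent:extracted': return 'Extracting data...';
    case 'task:completed': return 'Task complete!';
    default: return null; // No status change for other events
  }
}

const sample = [
  'event: browser:navigated',
  'data: {"title": "Trending - GitHub", "url": "https://github.com/trending"}',
  '',
  'event: task:completed',
  'data: {"success": true, "finalAnswer": "Top 3 repos: awesome-ai, web-framework, data-viz"}',
  ''
].join('\n');

for (const { type, data } of parseSseText(sample)) {
  console.log(statusLabel(type), data);
}
```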


Request Parameters

You can customize the agent's behavior by passing these parameters in the JSON body of your POST request.

task (required)

  • Type: string
  • Description: The natural language description of your objective. Be as specific as possible.

Examples:

// Good for data extraction
{
  "task": "Find the top 5 products in the 'Electronics' category and extract their names, prices, and ratings."
}

// Good for form filling
{
  "task": "Fill out the contact form with the provided information and submit it. Then, verify that the 'Thank You' message appears."
}

// Good for multi-step workflows
{
  "task": "Go to the search bar, search for 'running shoes', filter by 'size 10' and 'brand: Nike', and extract the names of the first 3 results."
}

Tips for writing good tasks:

  • Be specific: "Extract top 3" is better than "extract products."
  • Mention quantities: "first 5," "all results on the page," etc.
  • Describe interactions: "Click the 'Next' button," "filter by size," "submit the form."
  • State the output: "extract their names and prices," "return the final confirmation text."

url (optional)

  • Type: string (URI format)
  • Description: The starting URL for the task.

Example:

{
  "task": "Extract the latest blog post titles",
  "url": "https://blog.example.com"
}

  • When to include: Use this when you know the exact starting page for the task.
  • When to omit: Omit this if your task is a general web search (e.g., "Find the weather in Boston"). The agent will determine the best starting point (e.g., Google).

data (optional)

  • Type: object
  • Description: A JSON object providing context or data for the task, typically for form-filling.

Example:

{
  "task": "Submit the registration form with my information",
  "url": "https://example.com/register",
  "data": {
    "firstName": "Alex",
    "lastName": "Johnson",
    "email": "alex.johnson@example.com",
    "credentials": {
      "username": "alexj2025",
      "password": "super-secret-password!"
    }
  }
}

The agent will intelligently map the keys in this object (like firstName) to the fields on the web page.

guardrails (optional)

  • Type: string
  • Description: Safety constraints describing what the agent should NOT do. This is critical for preventing unintended actions.

Examples:

// Read-only extraction
{
  "task": "Extract product information",
  "guardrails": "Browse and extract only. Do not click buttons, submit forms, or add anything to a cart."
}

// Limited interaction
{
  "task": "Search for items and view details",
  "guardrails": "Only use search, filter, and pagination. Do not perform any checkout actions or create an account."
}

// Domain restrictions
{
  "task": "Research company information",
  "guardrails": "Stay on the company's official website (example.com). Do not navigate to any external links, social media, or partner sites."
}

maxIterations (optional)

  • Type: number
  • Default: 50
  • Range: 1-100
  • Description: The maximum number of steps (page loads, clicks, extractions) the agent can take before terminating. This prevents infinite loops.

When to adjust:

  • Lower (10-20): For simple, single-page tasks.
  • Default (50): Sufficient for most tasks with moderate complexity.
  • Higher (75-100): For complex, multi-page workflows or deep pagination.

maxValidationAttempts (optional)

  • Type: number
  • Default: 3
  • Range: 1-10
  • Description: The maximum number of times the agent will try to validate that its final step was successful (e.g., re-checking for a "Thank You" message after form submission).
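
The parameters above can be checked on the client before a request is sent. This is a sketch rather than part of the API: buildAutomateBody is a hypothetical helper, its checks mirror the documented types and ranges instead of the server's actual validation, and requiring whole numbers for the two limits is an assumption.

```javascript
// Build a request body for /v1/automate, checking the documented constraints.
function buildAutomateBody({ task, url, data, guardrails, maxIterations, maxValidationAttempts }) {
  if (typeof task !== 'string' || task.trim() === '') {
    throw new Error('task is required');
  }
  const body = { task };

  if (url !== undefined) {
    new URL(url); // Throws a TypeError if the URL is malformed
    body.url = url;
  }
  if (data !== undefined) body.data = data;
  if (guardrails !== undefined) body.guardrails = guardrails;

  if (maxIterations !== undefined) {
    // Assumption: the limit must be a whole number
    if (!Number.isInteger(maxIterations) || maxIterations < 1 || maxIterations > 100) {
      throw new Error('maxIterations must be between 1 and 100');
    }
    body.maxIterations = maxIterations;
  }
  if (maxValidationAttempts !== undefined) {
    if (!Number.isInteger(maxValidationAttempts) || maxValidationAttempts < 1 || maxValidationAttempts > 10) {
      throw new Error('maxValidationAttempts must be between 1 and 10');
    }
    body.maxValidationAttempts = maxValidationAttempts;
  }
  return body;
}

const exampleBody = buildAutomateBody({
  task: 'Extract the latest blog post titles',
  url: 'https://blog.example.com',
  guardrails: 'Browse and extract only.',
  maxIterations: 20
});
console.log(JSON.stringify(exampleBody, null, 2));
```

Omitted optional parameters are left out of the body, so the server-side defaults (50 iterations, 3 validation attempts) still apply.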

Response Structure and Event Types

You'll interact with a stream of events. Here are the most common ones and their purpose.

Task Events (High-Level Task Status)

  • task:started: Fired once. Signals the task has been received and started.
  • task:completed: Fired on success. The data payload contains the finalAnswer.
  • task:aborted: Fired on failure. The data payload contains a reason (e.g., "Exceeded maximum iterations").
  • task:validated: Fired when a validation step succeeds.

Agent Events (The AI's "Thoughts" and Actions)

  • agent:processing: The agent is planning or thinking.
  • agent:status: A human-readable message about the current plan (e.g., "plan": "1. Navigate to page...").
  • agent:action: The agent is performing an action (e.g., {"action": "click", "target": "Search button"}).
  • agent:reasoned: A human-readable thought process (e.g., "reasoning": "Search results loaded. Now extracting data.").
  • agent:extracted: Fired when a chunk of data is extracted. The data payload contains extractedData.
  • agent:waiting: The agent is waiting for a page to load or an element to appear.

Browser Events (Low-Level Browser State)

  • browser:navigated: A new page has loaded. The payload includes the new title and url.
  • browser:action_started: A browser interaction has begun (e.g., {"action": "typing"}).
  • browser:action_completed: The interaction has finished.

Stream Control Events

  • complete: The final event in a successful stream. The data payload contains the full result.
  • done: Signals the stream is closing (after complete or error).
  • error: Fired if an unrecoverable error occurs. The data payload contains an error message and code.
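
Every event name listed above either carries its category as a prefix or is one of the three stream control names, so a client can route events with a small classifier. A sketch, with hypothetical helper names eventCategory and isFailure:

```javascript
const STREAM_CONTROL_EVENTS = new Set(['complete', 'done', 'error']);
const PREFIXED_CATEGORIES = ['task', 'agent', 'browser'];

// Return 'task', 'agent', 'browser', or 'control' for a listed event name,
// and 'unknown' for anything else.
function eventCategory(type) {
  if (STREAM_CONTROL_EVENTS.has(type)) return 'control';
  const separator = type.indexOf(':');
  if (separator < 0) return 'unknown';
  const prefix = type.slice(0, separator);
  return PREFIXED_CATEGORIES.includes(prefix) ? prefix : 'unknown';
}

// The two events this guide describes as signalling failure.
function isFailure(type) {
  return type === 'error' || type === 'task:aborted';
}

console.log(eventCategory('agent:action'));   // agent
console.log(eventCategory('done'));           // control
console.log(isFailure('task:aborted'));       // true
```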

Working with Streaming Responses

The "Basic Usage" example is good for logging, but in a real application, you'll want more robust logic.

Processing Events in Real-Time

To build a responsive UI, you should process events as they arrive. These examples show a more robust way to handle the stream, including buffering incomplete lines.

async function processAutomationEvents() {
  const response = await fetch('https://api.tabstack.ai/v1/automate', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.TABS_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      task: 'Extract the top 5 products with their prices',
      url: 'https://shop.example.com/products'
    })
  });

  if (!response.ok) {
    throw new Error(`HTTP error! status: ${response.status}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  let currentEvent = ''; // Type from the most recent "event:" line

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;

    // Add the new chunk to the buffer
    buffer += decoder.decode(value, { stream: true });

    // Process all complete lines in the buffer
    let eolIndex;
    while ((eolIndex = buffer.indexOf('\n')) >= 0) {
      const line = buffer.substring(0, eolIndex).trim();
      buffer = buffer.substring(eolIndex + 1); // Remove processed line from buffer

      if (line.startsWith('event: ')) {
        currentEvent = line.slice(7);
      } else if (line.startsWith('data: ')) {
        try {
          const event = JSON.parse(line.slice(6));

          // Handle different event types
          if (event.finalAnswer) {
            console.log('✓ Task completed:', event.finalAnswer);
          } else if (event.message) {
            console.log('Status:', event.message);
          } else if (currentEvent === 'browser:navigated') {
            // task:started also carries a url, so check the event type here
            console.log('→ Navigated to:', event.url);
          } else if (event.extractedData) {
            console.log('📦 Extracted:', event.extractedData);
          }
        } catch (e) {
          // Skip data lines that are not valid JSON
        }
      }
    }
  }
  // After the loop, `buffer` may contain a final partial chunk.
  // Add `decoder.decode(undefined, { stream: false })` to flush if needed.
}

processAutomationEvents();

Code Explanations

JavaScript Breakdown:

This version adds error checking and proper buffering. The key improvement is the buffer management—since SSE messages can arrive split across chunks, the code maintains a buffer and only processes complete lines (ending in \n). Any incomplete data stays in the buffer until the next chunk arrives. This prevents parsing errors from partial JSON.
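
The buffering step can be isolated and exercised without a network call. The sketch below uses a hypothetical createLineBuffer helper and feeds it one message split across two chunks, showing that nothing is returned until a line is complete.

```javascript
// Accumulates decoded text and returns only complete lines (those ending in \n).
function createLineBuffer() {
  let buffer = '';
  return {
    push(chunk) {
      buffer += chunk;
      const lines = [];
      let eolIndex;
      while ((eolIndex = buffer.indexOf('\n')) >= 0) {
        lines.push(buffer.substring(0, eolIndex).trim());
        buffer = buffer.substring(eolIndex + 1);
      }
      return lines;
    },
    // Text still waiting for its terminating newline
    pending() {
      return buffer;
    }
  };
}

const lineBuffer = createLineBuffer();
const firstLines = lineBuffer.push('data: {"operation": "Creating ta');
const secondLines = lineBuffer.push('sk plan"}\n\n');

console.log(firstLines);  // no complete line yet
console.log(secondLines); // the full data line, then the blank separator line
```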

Python Breakdown:

Python's requests library handles buffering automatically with iter_lines(), making the code simpler. The raise_for_status() call catches HTTP errors early. You don't need manual buffer management—the library takes care of splitting the stream into complete lines.

How to Run

Follow the same instructions as the "Basic Usage" section. These examples are drop-in replacements.

Collecting Final Results

Often, you'll want to process real-time events and capture the final complete event. This pattern wraps the logic in a function that returns the final payload.

async function runAutomationTask(task, url) {
  const response = await fetch('https://api.tabstack.ai/v1/automate', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.TABS_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ task, url })
  });

  if (!response.ok) {
    throw new Error(`HTTP error! status: ${response.status}`);
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  let finalResult = null;
  const events = [];

  // The loop ends when the server closes the stream (after the `done` event)
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    let eolIndex;
    while ((eolIndex = buffer.indexOf('\n')) >= 0) {
      const line = buffer.substring(0, eolIndex).trim();
      buffer = buffer.substring(eolIndex + 1);

      if (line.startsWith('data: ')) {
        try {
          const event = JSON.parse(line.slice(6));
          events.push(event);

          // Capture the final 'complete' event's result
          if (event.success !== undefined && event.result) {
            finalResult = event.result;
          }
        } catch (e) {
          // Skip data lines that are not valid JSON
        }
      }
    }
  }

  return {
    success: finalResult !== null,
    result: finalResult,
    eventCount: events.length,
    events: events // Full log of all events
  };
}

// Usage
runAutomationTask(
  'Find the top 3 articles and their publication dates',
  'https://news.example.com'
).then(result => {
  console.log('Task completed:', result.success);
  console.log('Result:', result.result);
  console.log('Total events:', result.eventCount);
});

Code Explanations

These functions build on the previous examples.

JavaScript & Python:

These functions collect all events while watching for the final complete event. Storing all events helps with debugging—you can see exactly what the agent did if something goes wrong. The code detects the complete event by checking for both success and result properties, then returns both the final result and the full event log.
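
The same detection can be applied to a saved event log after the fact. In this sketch, findFinalResult is a hypothetical helper; it relies on the payload shapes in the sample stream earlier in this guide, where only the complete event carries both success and result.

```javascript
// Return the `result` of the complete event, or null if the log has none.
function findFinalResult(events) {
  // Scan from the end: the complete event arrives just before `done`
  for (let i = events.length - 1; i >= 0; i--) {
    const event = events[i];
    if (event.success !== undefined && event.result) {
      return event.result;
    }
  }
  return null;
}

// Payloads copied from the sample stream (event names shown as comments)
const eventLog = [
  { operation: 'Creating task plan', hasScreenshot: false }, // agent:processing
  { success: true, finalAnswer: 'Top 3 repos: awesome-ai, web-framework, data-viz' }, // task:completed
  { success: true, result: { finalAnswer: 'Top 3 repos: awesome-ai, web-framework, data-viz' } }, // complete
  {} // done
];

console.log(findFinalResult(eventLog));
```

Note that the task:completed payload is skipped because it has no result key, which is why the check needs both properties.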

How to Run

These examples are self-contained. Save the code and run the file (e.g., python run_task.py or node run_task.mjs). The "Usage" block at the end will execute the task and print the final result.


Error Handling

A robust implementation must handle two types of errors:

  1. HTTP Errors: Connection-level failures (e.g., 401 Unauthorized, 503 Service Unavailable).
  2. Stream Errors: Task-level failures that occur after the connection is established (e.g., the agent can't find an element).

These examples show how to catch both.

Common Error Status Codes

Status Code | Error                                   | Description
400         | task is required                        | Missing required task parameter.
400         | invalid URL format                      | Malformed URL if provided.
400         | maxIterations must be between 1 and 100 | Invalid iteration limit.
401         | Unauthorized - Invalid token            | Missing or invalid Bearer token.
500         | failed to call automate server          | Internal server error.
503         | automate service not available          | Service not configured or unavailable.

async function automateWithErrorHandling(task, url) {
  try {
    const response = await fetch('https://api.tabstack.ai/v1/automate', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.TABS_API_KEY}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({ task, url })
    });

    // 1. Handle HTTP errors
    if (!response.ok) {
      const errorData = await response.json().catch(() => ({})); // Handle non-JSON errors
      throw new Error(`HTTP ${response.status}: ${errorData.error || response.statusText}`);
    }

    const reader = response.body.getReader();
    const decoder = new TextDecoder();
    let buffer = '';
    let currentEvent = ''; // Type from the most recent "event:" line
    let hasError = false;

    while (true) {
      const { value, done } = await reader.read();
      if (done) break;

      buffer += decoder.decode(value, { stream: true });
      let eolIndex;
      while ((eolIndex = buffer.indexOf('\n')) >= 0) {
        const line = buffer.substring(0, eolIndex).trim();
        buffer = buffer.substring(eolIndex + 1);

        // 2. Handle stream-level errors
        if (line.startsWith('event: ')) {
          // Remember the type; its payload arrives on the next data line
          currentEvent = line.slice(7);
        } else if (line.startsWith('data: ')) {
          try {
            const event = JSON.parse(line.slice(6));
            if (currentEvent === 'error' || currentEvent === 'task:aborted' || event.error) {
              console.error('Task error:', event.error || event.reason || event);
              hasError = true;
            }
          } catch (e) {
            // Skip data lines that are not valid JSON
          }
        }
      }
    }

    if (hasError) {
      throw new Error('Task failed during execution');
    }

    return { success: true };

  } catch (error) {
    console.error('Automation error:', error.message);

    // Handle specific error types
    if (error.message.includes('401')) {
      console.error('Authentication failed. Check your API key.');
    } else if (error.message.includes('503')) {
      console.error('Automate service is unavailable. Try again later.');
    }
    throw error;
  }
}

// Usage
automateWithErrorHandling(
  'Extract product information',
  'https://example.com/products'
).catch(err => console.error('Failed:', err.message));

Code Explanations

JavaScript Breakdown:

Handle errors at two levels: HTTP errors (wrong API key, bad request) are caught before streaming starts via response.ok. Stream errors (agent can't complete the task) are detected by watching for event: error messages in the stream itself. This two-level approach ensures you catch failures whether they happen during connection or during task execution.

Python Breakdown:

Python follows the same two-level pattern. The raise_for_status() method handles HTTP errors by automatically raising exceptions. Stream-level errors are caught by checking for error events in the stream. Separating these error types makes debugging easier—you know whether the problem is with your request or with task execution.
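
The status codes in the table above can be mapped to a next step for the caller. This sketch reflects one way a client might react, not documented API behavior: describeHttpError and backoffDelayMs are hypothetical helpers, and treating 500 and 503 as retryable with exponential backoff is a common convention rather than a guarantee from the service.

```javascript
// Suggest how to react to an HTTP status returned by /v1/automate.
function describeHttpError(status) {
  switch (status) {
    case 400:
      return { retryable: false, hint: 'Fix the request body (task, url, or maxIterations).' };
    case 401:
      return { retryable: false, hint: 'Check your API key.' };
    case 500:
    case 503:
      return { retryable: true, hint: 'Try again later.' };
    default:
      return { retryable: false, hint: `Unexpected status ${status}.` };
  }
}

// Delay before retry attempt n (0-based): 1s, 2s, 4s, ... capped at 30s.
function backoffDelayMs(attempt) {
  return Math.min(1000 * 2 ** attempt, 30000);
}

console.log(describeHttpError(503)); // retryable
console.log(backoffDelayMs(2));      // 4000
```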

How to Run

Save and run these functions. To test the error handling, try running them with an invalid API key (to trigger a 401) or a task that is designed to fail.


Advanced Usage Patterns

Form Submission with Data

One of the most powerful features is the ability to fill and submit forms. By passing a data object, you can provide the information the AI agent needs.

async function submitForm(formUrl, formData) {
  const response = await fetch('https://api.tabstack.ai/v1/automate', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.TABS_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      task: 'Fill out and submit the contact form with the provided information',
      url: formUrl,
      data: formData,
      guardrails: 'Only fill and submit the form, do not navigate away or click other links'
    })
  });

  // (Add full streaming and error handling logic here)
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  let submissionSuccess = false;

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;

    // Buffer chunks so that only complete lines are inspected
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop(); // Keep the trailing partial line for the next chunk

    for (const line of lines) {
      // Look for confirmation in the agent's reasoning or final answer
      if (line.includes('form submitted') || line.includes('submission successful')) {
        submissionSuccess = true;
      }

      if (line.startsWith('data: ')) {
        try {
          const event = JSON.parse(line.slice(6));
          if (event.finalAnswer) {
            console.log('Result:', event.finalAnswer);
            submissionSuccess = true;
          }
        } catch (e) {
          // Skip data lines that are not valid JSON
        }
      }
    }
  }
  return { success: submissionSuccess };
}

// Usage
const contactData = {
  name: 'Alex Johnson',
  email: 'alex.johnson@example.com',
  company: 'Acme Inc',
  message: 'Interested in learning more about your products'
};

submitForm('https://company.com/contact', contactData)
  .then(result => console.log('Form submitted:', result.success));

Code Explanations

The key here is the data parameter—the agent intelligently maps your object keys to form fields on the page. The guardrails prevent unintended actions like clicking ads. The code watches the stream for success indicators (either in the agent's reasoning or the final answer), letting you confirm the form was actually submitted.

How to Run

  1. Define a contactData object or dictionary with the information you want to submit.
  2. Call the submitForm function, passing the target URL and your data.
  3. Replace https://company.com/contact with a real form URL to test this.

Multi-Page Data Collection

The agent isn't limited to a single page. You can instruct it to perform complex, multi-page workflows, like pagination or following search results.

async function collectMultiPageData() {
  const response = await fetch('https://api.tabstack.ai/v1/automate', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.TABS_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      task: 'Search for "wireless headphones", go through the first 3 pages of results, and extract product names and prices from each page',
      url: 'https://shop.example.com',
      maxIterations: 75, // Increase iterations for multi-page tasks
      guardrails: 'Browse and extract only, do not add items to cart or checkout'
    })
  });

  // (Add full streaming and error handling logic here)
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  const extractedData = [];
  let buffer = '';

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    let eolIndex;
    while ((eolIndex = buffer.indexOf('\n')) >= 0) {
      const line = buffer.substring(0, eolIndex).trim();
      buffer = buffer.substring(eolIndex + 1);

      if (line.startsWith('data: ')) {
        try {
          const event = JSON.parse(line.slice(6));

          // Collect data as it's extracted
          if (event.extractedData) {
            extractedData.push(event.extractedData);
            console.log('Extracted data from page:', extractedData.length);
          }
          if (event.finalAnswer) {
            console.log('Final result:', event.finalAnswer);
          }
          if (event.url) {
            console.log('Currently on:', event.url);
          }
        } catch (e) {
          // Continue
        }
      }
    }
  }
  return extractedData;
}

// Usage
collectMultiPageData()
  .then(data => {
    console.log(`Collected ${data.length} data points across multiple pages`);
    console.log('Data:', data);
  });

Code Explanations

Multi-page tasks need more iterations—here we set maxIterations: 75 to give the agent enough steps for pagination. The code collects extractedData events as they arrive, building up results from each page. Your task description should explicitly mention pagination ("go through the first 3 pages") to guide the agent.
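
In the sample stream shown earlier, extractedData arrives as a JSON-encoded string rather than as an array. Assuming that format holds for every agent:extracted event (an assumption, since this guide shows only one sample), the collected chunks can be flattened into a single list. mergeExtracted is a hypothetical helper, and the product data below is made up for illustration.

```javascript
// Flatten a list of extractedData payloads into one array of items.
function mergeExtracted(chunks) {
  const items = [];
  for (const chunk of chunks) {
    let parsed;
    try {
      parsed = typeof chunk === 'string' ? JSON.parse(chunk) : chunk;
    } catch (e) {
      parsed = chunk; // Not JSON: keep the raw string as a single item
    }
    if (Array.isArray(parsed)) {
      items.push(...parsed);
    } else {
      items.push(parsed);
    }
  }
  return items;
}

// One payload per results page (made-up sample data)
const pagePayloads = [
  '[{"name": "Headphones A", "price": "$59.99"}, {"name": "Headphones B", "price": "$89.00"}]',
  '[{"name": "Headphones C", "price": "$120.00"}]'
];

const products = mergeExtracted(pagePayloads);
console.log(products.length); // 3
console.log(products.map(p => p.name));
```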

How to Run

Run the file directly. The function will execute, and you'll see "Extracted data from page: 1", "Currently on: ...page=2", "Extracted data from page: 2", etc., logged to the console.

Price Monitoring and Alerts

This final example combines concepts to build a practical price monitor. It runs a task, parses the stream for specific data, and checks it against a target.

async function monitorPrice(productUrl, targetPrice) {
  const response = await fetch('https://api.tabstack.ai/v1/automate', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.TABS_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      task: `Navigate to the product page and extract the current price. Also check if the product is in stock.`,
      url: productUrl,
      maxIterations: 20, // Simple task
      guardrails: 'Browse and extract only, do not make purchases'
    })
  });

  // (Add full streaming and error handling logic here)
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  let currentEvent = ''; // Type from the most recent "event:" line
  let currentPrice = null;
  let inStock = false;

  while (true) {
    const { value, done } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    let eolIndex;
    while ((eolIndex = buffer.indexOf('\n')) >= 0) {
      const line = buffer.substring(0, eolIndex).trim();
      buffer = buffer.substring(eolIndex + 1);

      if (line.startsWith('event: ')) {
        currentEvent = line.slice(7);
        continue;
      }
      // Skip task:started: its payload echoes the task text, which itself
      // contains the words "in stock"
      if (!line.startsWith('data: ') || currentEvent === 'task:started') continue;

      // Parse the stream for clues, even before the final answer (heuristic)
      const text = line.slice(6);
      if (text.toLowerCase().includes('in stock')) {
        inStock = true;
      }
      // Require a "$" so that digits in URLs or IDs are not read as prices
      const priceMatch = text.match(/\$(\d[\d,]*(?:\.\d{1,2})?)/);
      if (priceMatch) {
        currentPrice = parseFloat(priceMatch[1].replace(/,/g, ''));
      }
    }
  }

  const result = {
    currentPrice,
    inStock,
    belowTarget: currentPrice !== null && currentPrice <= targetPrice
  };

  if (result.belowTarget && result.inStock) {
    console.log(`🎉 ALERT: Price dropped to $${currentPrice}! (Target: $${targetPrice})`);
  }

  return result;
}

// Usage: Monitor price every hour
const productUrl = 'https://shop.example.com/product/12345';
const targetPrice = 299.99;

// Run it once to test
monitorPrice(productUrl, targetPrice)
  .then(result => {
    console.log(`Price: $${result.currentPrice}, In Stock: ${result.inStock}`);
  });

// Use this to run it on a schedule
// setInterval(() => {
//   monitorPrice(productUrl, targetPrice)
//     .then(result => {
//       console.log(`Price: $${result.currentPrice}, In Stock: ${result.inStock}`);
//     });
// }, 3600000); // Check every hour

Code Explanations

This price monitor demonstrates parsing the stream for specific patterns. Instead of waiting for the final result, the code extracts prices as they appear in the stream using regex. This can be faster for simple data. The function compares the found price against your target and alerts if it's dropped. Wrap it in a scheduler to check prices periodically.

How to Run

Set a productUrl and targetPrice. Run the script. It will perform one check. You can uncomment the setInterval (JS) or while True (Python) loop to run it continuously.


Best Practices

1. Write Clear, Specific Task Descriptions

The quality of your task description is the single most important factor for success.

// Vague - may not get what you want
task: "Get some products"

// Better - more specific
task: "Extract the top 5 products with their names and prices"

// Best - very detailed
task: `Navigate to the laptops category, filter by price (under $1000),
sort by rating (highest first), and extract the top 5 results including:
product name, price, rating, and number of reviews`

2. Use Guardrails for Safety

Always specify what the agent should NOT do, especially when dealing with e-commerce or sensitive sites.

{
  "task": "Research competitor pricing",
  "guardrails": "Browse and view pages only. Do not submit any forms, create accounts, or make purchases. Stay on the main website, do not follow external links."
}

3. Set Appropriate Iteration Limits

Match maxIterations to your task's complexity to prevent runaways.

  • Simple (10-20): Single page extraction, basic form fill.
  • Medium (30-50): Multi-page search, simple workflows.
  • Complex (60-100): Deep pagination, multi-step form-filling with validation.

4. Provide Context with the data Parameter

When filling forms, structure your data object to match the conceptual groups on the page.

{
  "task": "Complete the registration form",
  "data": {
    "personalInfo": {
      "firstName": "Alex",
      "lastName": "Johnson"
    },
    "contact": {
      "email": "alex.johnson@example.com",
      "phone": "555-1234"
    },
    "preferences": {
      "newsletter": true
    }
  }
}

5. Handle Streaming Events Progressively

Don't wait for the complete event. Process events as they arrive to build a real-time experience and collect data incrementally.

// Good: Process events immediately
while (streaming) {
  const event = await readEvent();
  if (event.extractedData) {
    processData(event.extractedData); // Process immediately
  }
}

// Bad: Collect all then process
const allEvents = [];
while (streaming) {
  allEvents.push(await readEvent());
}
processAllAtOnce(allEvents); // Misses real-time benefits
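
One way to get the progressive style shown above is an async generator that yields each event as soon as its data line is complete. This is a sketch under assumptions: sseEvents is a hypothetical helper that expects an async iterable of Uint8Array chunks. In Node.js 18 and later the response.body returned by fetch can be iterated this way; browser support for iterating a ReadableStream varies. The demo uses a fake stream, so it runs without calling the API.

```javascript
// Yield { type, data } for each SSE message found in a stream of byte chunks.
async function* sseEvents(chunks) {
  const decoder = new TextDecoder();
  let buffer = '';
  let type = 'message'; // Default SSE event type when no "event:" line is sent

  for await (const chunk of chunks) {
    buffer += decoder.decode(chunk, { stream: true });
    let eolIndex;
    while ((eolIndex = buffer.indexOf('\n')) >= 0) {
      const line = buffer.substring(0, eolIndex).trim();
      buffer = buffer.substring(eolIndex + 1);

      if (line.startsWith('event: ')) {
        type = line.slice(7);
      } else if (line.startsWith('data: ')) {
        yield { type, data: JSON.parse(line.slice(6)) };
      } else if (line === '') {
        type = 'message'; // A blank line ends the current message
      }
    }
  }
}

// Fake stream for the demo; the first event is split across two chunks.
async function* fakeStream() {
  const encoder = new TextEncoder();
  yield encoder.encode('event: agent:extracted\ndata: {"extracted');
  yield encoder.encode('Data": "[]"}\n\nevent: done\ndata: {}\n\n');
}

async function demo() {
  const seen = [];
  for await (const event of sseEvents(fakeStream())) {
    seen.push(event.type); // Each event is handled as soon as it is parsed
  }
  return seen;
}

demo().then(seen => console.log(seen));
```

With a real request, the loop would read `for await (const event of sseEvents(response.body))`, keeping the per-event handling in one place.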

6. Monitor Event Streams for Errors

Watch for event: error or task:aborted events. These are your signal that a task has failed, and you should stop processing.

// eventType holds the value of the most recent "event:" line
if (eventType === 'error' || eventType === 'task:aborted') {
  console.error('Task failed:', event);
  // Handle error appropriately
  break;
}

7. Test with Simple Tasks First

Start with a simple task (like "Navigate to the products page") to verify your connection. Then, build up to more complex workflows ("Navigate and extract the first product"), adding filters and interactions one by one.