How to Generate JSON Data with AI

Learn how to use the Tabstack API `/v1/generate/json` endpoint to generate JSON data with AI.

Often, extracting existing data from a web page isn’t enough. You need to transform that content — summarize it, categorize it, translate it, or restructure it into a new format. This is where /v1/generate/json comes in.

Unlike /v1/extract/json, which pulls existing data, /v1/generate/json uses AI to generate new content based on your instructions.

This process is driven by two key inputs:

json_schema: The “what.” A precise blueprint defining the shape of the JSON you want.
instructions: The “how.” Natural language instructions telling the AI how to process the source content to populate your schema.

By combining a target URL, a schema, and clear instructions, you can build workflows for content summarization, sentiment analysis, data categorization, competitive intelligence, and more.

Prerequisites

You need a Tabstack API key. Get one at tabstack.ai, then set it as an environment variable:

export TABSTACK_API_KEY="your-api-key-here"

Install the SDK:

npm install @tabstack/sdk

pip install tabstack
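
Before creating a client, it can be useful to fail fast when the key is missing. The check below is a minimal sketch: `require_api_key` is a hypothetical helper, not part of the SDK, and it relies only on the `TABSTACK_API_KEY` variable name used above.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper for illustration; not part of the Tabstack SDK.
    """
    key = env.get("TABSTACK_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "TABSTACK_API_KEY is not set. Export it before creating a client."
        )
    return key


# Pass an explicit mapping so the check can be exercised without touching
# the real environment.
print(require_api_key({"TABSTACK_API_KEY": "your-api-key-here"}))
```

Calling `require_api_key()` with no argument reads the real environment, which is the form you would likely use at application start-up.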

Step-by-Step: From URL to Generated JSON

The goal: Analyze the Hacker News homepage (https://news.ycombinator.com). For each story, have the AI generate a category (e.g., “tech,” “business,” “science”) and write a new one-sentence summary.

Step 1: Define the Schema

Define the shape of your output using json_schema. We want an object containing a summaries array, where each item has title, category, and summary string properties.

{
  "json_schema": {
    "type": "object",
    "properties": {
      "summaries": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "title": { "type": "string" },
            "category": { "type": "string" },
            "summary": { "type": "string" }
          }
        }
      }
    }
  }
}

Step 2: Write the Instructions

Tell the AI how to populate the schema:

"For each story on the page, find its title. Then, categorize it as tech/business/science/other and write a one-sentence summary in simple terms."

Step 3: Assemble and Send the Request

import Tabstack from '@tabstack/sdk'

const client = new Tabstack()

const result = await client.generate.json({
  url: 'https://news.ycombinator.com',
  json_schema: {
    type: 'object',
    properties: {
      summaries: {
        type: 'array',
        items: {
          type: 'object',
          properties: {
            title: { type: 'string' },
            category: { type: 'string' },
            summary: { type: 'string' },
          },
        },
      },
    },
  },
  instructions:
    'For each story, categorize it as tech/business/science/other and write a one-sentence summary in simple terms',
})

console.log(JSON.stringify(result, null, 2))

import json
from tabstack import Tabstack

client = Tabstack()

result = client.generate.json(
    url='https://news.ycombinator.com',
    json_schema={
        'type': 'object',
        'properties': {
            'summaries': {
                'type': 'array',
                'items': {
                    'type': 'object',
                    'properties': {
                        'title': {'type': 'string'},
                        'category': {'type': 'string'},
                        'summary': {'type': 'string'},
                    },
                },
            },
        },
    },
    instructions='For each story, categorize it as tech/business/science/other and write a one-sentence summary in simple terms',
)

print(json.dumps(result, indent=2))

curl -X POST https://api.tabstack.ai/v1/generate/json \
  -H "Authorization: Bearer $TABSTACK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://news.ycombinator.com",
    "json_schema": {
      "type": "object",
      "properties": {
        "summaries": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "title": {"type": "string"},
              "category": {"type": "string"},
              "summary": {"type": "string"}
            }
          }
        }
      }
    },
    "instructions": "For each story, categorize it as tech/business/science/other and write a one-sentence summary in simple terms"
  }'

Note: The examples throughout this guide use placeholder URLs like https://competitor.example.com/pricing, https://jobs.example.com/senior-backend-engineer, and https://example.com. Replace them with real URLs of pages you want to process.

Step 4: Analyze the Response

A successful request returns a 200 OK with JSON matching your schema exactly:

{
  "summaries": [
    {
      "title": "New AI Model Released",
      "category": "tech",
      "summary": "A research lab announced a new language model that performs better on reasoning tasks."
    },
    {
      "title": "Database Performance Tips",
      "category": "tech",
      "summary": "An engineer shares techniques that reduced database query times by 90%."
    },
    {
      "title": "Climate Tech Startup Raises Funding",
      "category": "business",
      "summary": "A carbon capture company secured $50M in Series B funding."
    }
  ]
}

The AI generated the category and summary fields based on your instructions. The title was extracted, but the other fields are newly created, AI-derived content.
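
Once the response is parsed, it behaves like any other dictionary. As a small sketch of downstream processing, the snippet below groups story titles by the generated category. The `response` value is the sample output shown above, and `group_by_category` is a hypothetical helper name.

```python
from collections import defaultdict

# Sample response copied from the example output above. In practice this
# would be the value returned by the generate call.
response = {
    "summaries": [
        {
            "title": "New AI Model Released",
            "category": "tech",
            "summary": "A research lab announced a new language model that performs better on reasoning tasks.",
        },
        {
            "title": "Database Performance Tips",
            "category": "tech",
            "summary": "An engineer shares techniques that reduced database query times by 90%.",
        },
        {
            "title": "Climate Tech Startup Raises Funding",
            "category": "business",
            "summary": "A carbon capture company secured $50M in Series B funding.",
        },
    ]
}


def group_by_category(result: dict) -> dict:
    """Map each generated category to the list of story titles in it."""
    grouped = defaultdict(list)
    for story in result.get("summaries", []):
        # Fall back to "other" if a story has no category.
        grouped[story.get("category", "other")].append(story.get("title", ""))
    return dict(grouped)


grouped = group_by_category(response)
for category, titles in grouped.items():
    print(f"{category}: {', '.join(titles)}")
```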

Request Parameters

url (required)

Type: string
Description: The fully qualified URL of the web page to process.
Must be a valid, publicly accessible URL. Cannot be localhost or a private IP.
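
A client-side pre-check can catch obviously unusable URLs before a request is sent. The sketch below uses only the Python standard library. `looks_publicly_addressable` is a hypothetical helper, it does not resolve DNS, and the API's own validation rules may differ from this approximation.

```python
import ipaddress
from urllib.parse import urlparse


def looks_publicly_addressable(url: str) -> bool:
    """Rough pre-flight check: http(s) scheme, not localhost, not a private IP.

    Illustrative only. Hostnames are accepted without DNS resolution, so a
    name that resolves to a private address will still pass this check.
    """
    try:
        parsed = urlparse(url)
    except ValueError:
        return False
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False

    host = parsed.hostname.lower()
    if host == "localhost" or host.endswith(".localhost"):
        return False

    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat it as a regular hostname.
        return True
    return not (ip.is_private or ip.is_loopback or ip.is_link_local)


for candidate in ("https://news.ycombinator.com", "http://localhost:3000"):
    print(candidate, looks_publicly_addressable(candidate))
```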

json_schema (required)

Type: object
Description: A valid JSON Schema object defining the exact structure, types, and constraints for your output. The AI strictly adheres to this schema.
Tips:
- Use string, number, boolean, array, and object types.
- Add description fields to properties — the AI uses them as hints.
- Use enum to constrain a field to a specific set of values.

{
  "json_schema": {
    "type": "object",
    "properties": {
      "summary": {
        "type": "string",
        "description": "Overall summary of the content"
      },
      "sentiment": {
        "type": "string",
        "enum": ["positive", "negative", "neutral"],
        "description": "Overall sentiment of the article"
      }
    },
    "required": ["summary"]
  }
}
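
If you want a defensive check on the returned data, the `required` and `enum` constraints are easy to verify by hand. The following is a minimal sketch that covers only those two keywords on top-level properties. It is not a full JSON Schema validator, and `check_against_schema` is a hypothetical helper name.

```python
def check_against_schema(data: dict, schema: dict) -> list:
    """Return a list of problems found in `data`.

    Only checks `required` and `enum` on top-level properties.
    """
    problems = []
    for name in schema.get("required", []):
        if name not in data:
            problems.append(f"missing required property: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name in data and "enum" in spec and data[name] not in spec["enum"]:
            problems.append(f"{name}={data[name]!r} is not one of {spec['enum']}")
    return problems


# The schema from the example above (the value of "json_schema").
schema = {
    "type": "object",
    "properties": {
        "summary": {"type": "string", "description": "Overall summary of the content"},
        "sentiment": {
            "type": "string",
            "enum": ["positive", "negative", "neutral"],
            "description": "Overall sentiment of the article",
        },
    },
    "required": ["summary"],
}

print(check_against_schema({"summary": "ok", "sentiment": "positive"}, schema))
print(check_against_schema({"sentiment": "mixed"}, schema))
```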

instructions (required)

Type: string (max 20,000 characters)
Description: Natural language instructions telling the AI how to generate data to fit your schema.
Tips:
- Be specific: “Write a 3-sentence summary” beats “Summarize this.”
- Reference schema properties by name.
- Define edge cases: “If no author is found, set author to null.” The schema must then allow null for that property, e.g. "type": ["string", "null"].

effort (optional)

Type: 'min' | 'standard' | 'max'
Default: standard
Controls the speed vs. capability tradeoff. Use max for JS-heavy SPAs or complex pages.

nocache (optional)

Type: boolean
Default: false
Set to true to bypass cache and force a fresh fetch. Use for real-time content or when testing with different instructions on the same URL.

geo_target (optional)

Type: { country: string } — ISO 3166-1 alpha-2 code (e.g. 'US', 'GB')
Fetches the URL from a specific geographic location.
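
The parameters above can be assembled into a request body with a few local checks. This is a sketch under stated assumptions: `build_generate_payload` is a hypothetical helper, the character limit is compared against Python's `len()`, and the country check only tests for two uppercase letters rather than a real ISO 3166-1 lookup.

```python
import re
from typing import Optional

VALID_EFFORT = {"min", "standard", "max"}
MAX_INSTRUCTION_CHARS = 20_000  # limit stated for `instructions` above


def build_generate_payload(
    url: str,
    json_schema: dict,
    instructions: str,
    effort: Optional[str] = None,
    nocache: Optional[bool] = None,
    country: Optional[str] = None,
) -> dict:
    """Build a request body, including optional fields only when provided."""
    if len(instructions) > MAX_INSTRUCTION_CHARS:
        raise ValueError("instructions exceed 20,000 characters")

    payload = {"url": url, "json_schema": json_schema, "instructions": instructions}

    if effort is not None:
        if effort not in VALID_EFFORT:
            raise ValueError(f"effort must be one of {sorted(VALID_EFFORT)}")
        payload["effort"] = effort
    if nocache is not None:
        payload["nocache"] = bool(nocache)
    if country is not None:
        # Shape check only: two uppercase letters, e.g. 'US' or 'GB'.
        if not re.fullmatch(r"[A-Z]{2}", country):
            raise ValueError("country must be a two-letter uppercase code")
        payload["geo_target"] = {"country": country}
    return payload


payload = build_generate_payload(
    "https://example.com",
    {"type": "object", "properties": {"summary": {"type": "string"}}},
    "Write a 1-sentence summary of the page.",
    effort="max",
    nocache=True,
    country="GB",
)
print(payload)
```

Leaving the optional arguments unset keeps them out of the body, so the documented defaults (`standard` effort, caching enabled) apply.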

Real-World Examples

Example 1: Content Summarization and Audience Analysis

Goal: Generate a “TL;DR,” a longer summary, and identify the target audience for a blog post.

Request payload:

{
  "url": "https://techblog.example.com/long-article",
  "json_schema": {
    "type": "object",
    "properties": {
      "title": { "type": "string" },
      "tldr": { "type": "string", "description": "One-sentence summary" },
      "summary": { "type": "string", "description": "3-4 sentence summary" },
      "targetAudience": { "type": "string", "description": "Who this is for" }
    }
  },
  "instructions": "Create a TLDR (one sentence) and a longer summary (3-4 sentences) of this article. Also identify who the target audience is (e.g., developers, business leaders, general audience)."
}

Example response:

{
  "title": "Understanding Microservices Architecture",
  "tldr": "Microservices break large applications into small, independent services that communicate via APIs, offering better scalability and flexibility.",
  "summary": "This article explains microservices architecture, where applications are built as a collection of small, loosely coupled services. Each service handles a specific business function and can be developed, deployed, and scaled independently. The approach offers better fault isolation, technology flexibility, and team autonomy, though it introduces complexity in service coordination and data consistency.",
  "targetAudience": "Software developers and architects"
}

Example 2: Content Categorization and Tagging

Goal: For each post on a blog homepage, generate a category, a list of tags, and a difficulty level.

Request payload:

{
  "url": "https://blog.example.com",
  "json_schema": {
    "type": "object",
    "properties": {
      "posts": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "title": { "type": "string" },
            "category": {
              "type": "string",
              "description": "One of: tutorial, opinion, news, case-study"
            },
            "tags": { "type": "array", "items": { "type": "string" } },
            "difficultyLevel": {
              "type": "string",
              "enum": ["beginner", "intermediate", "advanced"]
            }
          }
        }
      }
    }
  },
  "instructions": "For each blog post, assign it to a category (tutorial/opinion/news/case-study), generate 3-5 relevant tags, and determine the technical difficulty level (beginner/intermediate/advanced)."
}

Example response:

{
  "posts": [
    {
      "title": "Getting Started with React Hooks",
      "category": "tutorial",
      "tags": ["react", "hooks", "javascript", "frontend", "beginner-friendly"],
      "difficultyLevel": "beginner"
    },
    {
      "title": "Optimizing Database Queries at Scale",
      "category": "tutorial",
      "tags": ["database", "performance", "optimization", "sql", "indexing"],
      "difficultyLevel": "advanced"
    },
    {
      "title": "Why We Migrated to Kubernetes",
      "category": "case-study",
      "tags": ["kubernetes", "devops", "infrastructure", "migration", "containers"],
      "difficultyLevel": "intermediate"
    }
  ]
}

Example 3: Sentiment and Trend Analysis from Reviews

Goal: Analyze a product review page to find overall sentiment, common themes, and a final recommendation.

Request payload:

{
  "url": "https://reviews.example.com/product/xyz",
  "json_schema": {
    "type": "object",
    "properties": {
      "overallSentiment": {
        "type": "string",
        "enum": ["very positive", "positive", "neutral", "negative", "very negative"]
      },
      "rating": {
        "type": "number",
        "description": "Estimated average rating out of 5"
      },
      "commonPraises": { "type": "array", "items": { "type": "string" } },
      "commonComplaints": { "type": "array", "items": { "type": "string" } },
      "recommendation": { "type": "string" }
    }
  },
  "instructions": "Analyze all customer reviews on this page and determine: the overall sentiment, an estimated rating (1-5), the top 3 most common praises, the top 3 most common complaints, and write a final recommendation (yes/no with a brief reason)."
}

Example response:

{
  "overallSentiment": "positive",
  "rating": 4.2,
  "commonPraises": [
    "Excellent build quality and durability",
    "Great battery life lasting 2-3 days",
    "Intuitive and easy-to-use interface"
  ],
  "commonComplaints": [
    "Price is higher than competitors",
    "Limited color options available",
    "Charging cable is too short"
  ],
  "recommendation": "Yes - the product excels in quality and performance, making it worth the premium price for users who prioritize reliability over cost."
}

Agent use cases

These examples show generate.json as an intelligence call inside an agent — fetching a URL, transforming its content, and returning structured data the agent can act on.

Competitive signal extraction

Fetch a competitor’s pricing page and transform it into structured competitive intelligence.

import Tabstack from '@tabstack/sdk'

const client = new Tabstack()

const competitiveSignals = await client.generate.json({
  url: 'https://competitor.example.com/pricing',
  json_schema: {
    type: 'object',
    properties: {
      tiers: {
        type: 'array',
        items: {
          type: 'object',
          properties: {
            name: { type: 'string', description: 'Plan or tier name' },
            price_monthly: {
              type: ['number', 'null'],
              description: 'Monthly price in USD, null if custom or contact-us',
            },
            price_annual: {
              type: ['number', 'null'],
              description: 'Annual price in USD, null if custom or contact-us',
            },
            key_features: {
              type: 'array',
              items: { type: 'string' },
              description: 'Top 3-5 features highlighted for this tier',
            },
            target_customer: {
              type: 'string',
              description: 'Who this tier appears to be aimed at',
            },
            value_rating: {
              type: 'string',
              enum: ['good', 'fair', 'poor'],
              description:
                "Heuristic judgement of this tier's value based on features vs price. 'good' = rich feature set relative to price; 'poor' = thin feature set for the cost.",
            },
          },
        },
      },
      pricing_model: {
        type: 'string',
        enum: ['per-seat', 'usage-based', 'flat-rate', 'hybrid', 'unknown'],
        description: 'Overall pricing structure',
      },
      free_tier_available: { type: 'boolean' },
    },
  },
  instructions:
    'Extract each pricing tier. For each tier, capture the name, monthly and annual prices (null if not shown), the top 3-5 highlighted features, and who it seems targeted at. Then judge the tier\'s value_rating (good/fair/poor) by weighing its feature set against its price. Also identify the overall pricing model and whether a free tier exists.',
  nocache: true,
})

console.log(JSON.stringify(competitiveSignals, null, 2))

import json
from tabstack import Tabstack

client = Tabstack()

competitive_signals = client.generate.json(
    url='https://competitor.example.com/pricing',
    json_schema={
        'type': 'object',
        'properties': {
            'tiers': {
                'type': 'array',
                'items': {
                    'type': 'object',
                    'properties': {
                        'name': {'type': 'string', 'description': 'Plan or tier name'},
                        'price_monthly': {
                            'type': ['number', 'null'],
                            'description': 'Monthly price in USD, null if custom or contact-us',
                        },
                        'price_annual': {
                            'type': ['number', 'null'],
                            'description': 'Annual price in USD, null if custom or contact-us',
                        },
                        'key_features': {
                            'type': 'array',
                            'items': {'type': 'string'},
                            'description': 'Top 3-5 features highlighted for this tier',
                        },
                        'target_customer': {
                            'type': 'string',
                            'description': 'Who this tier appears to be aimed at',
                        },
                        'value_rating': {
                            'type': 'string',
                            'enum': ['good', 'fair', 'poor'],
                            'description': (
                                "Heuristic judgement of this tier's value based on features vs price. "
                                "'good' = rich feature set relative to price; 'poor' = thin feature set for the cost."
                            ),
                        },
                    },
                },
            },
            'pricing_model': {
                'type': 'string',
                'enum': ['per-seat', 'usage-based', 'flat-rate', 'hybrid', 'unknown'],
                'description': 'Overall pricing structure',
            },
            'free_tier_available': {'type': 'boolean'},
        },
    },
    instructions=(
        'Extract each pricing tier. For each tier, capture the name, monthly and annual prices '
        '(null if not shown), the top 3-5 highlighted features, and who it seems targeted at. '
        "Then judge the tier's value_rating (good/fair/poor) by weighing its feature set against its price. "
        'Also identify the overall pricing model and whether a free tier exists.'
    ),
    nocache=True,
)

print(json.dumps(competitive_signals, indent=2))

curl -X POST https://api.tabstack.ai/v1/generate/json \
  -H "Authorization: Bearer $TABSTACK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://competitor.example.com/pricing",
    "json_schema": {
      "type": "object",
      "properties": {
        "tiers": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "name": {"type": "string"},
              "price_monthly": {"type": ["number", "null"]},
              "price_annual": {"type": ["number", "null"]},
              "key_features": {"type": "array", "items": {"type": "string"}},
              "target_customer": {"type": "string"},
              "value_rating": {
                "type": "string",
                "enum": ["good", "fair", "poor"]
              }
            }
          }
        },
        "pricing_model": {
          "type": "string",
          "enum": ["per-seat", "usage-based", "flat-rate", "hybrid", "unknown"]
        },
        "free_tier_available": {"type": "boolean"}
      }
    },
    "instructions": "Extract each pricing tier with name, monthly/annual prices, top 3-5 features, and target customer. Then judge each tier'\''s value_rating (good/fair/poor) by weighing its feature set against its price. Also identify the pricing model and whether a free tier exists.",
    "nocache": true
  }'
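
An agent would typically act on this output rather than just print it. As one possible follow-up step, the sketch below orders tiers by `value_rating` and then by monthly price, placing tiers with a null price last within each rating. The `signals` data is made up for illustration and `rank_tiers` is a hypothetical helper.

```python
def rank_tiers(signals: dict) -> list:
    """Sort tiers by value_rating (good, fair, poor), then by monthly price.

    Tiers whose price_monthly is None (custom or contact-us pricing) are
    placed after priced tiers with the same rating.
    """
    order = {"good": 0, "fair": 1, "poor": 2}

    def sort_key(tier: dict):
        price = tier.get("price_monthly")
        rating_rank = order.get(tier.get("value_rating"), len(order))
        return (rating_rank, price is None, price if price is not None else 0.0)

    return sorted(signals.get("tiers", []), key=sort_key)


# Invented sample in the shape of the schema above, not real output.
signals = {
    "tiers": [
        {"name": "Starter", "price_monthly": 19, "value_rating": "fair"},
        {"name": "Enterprise", "price_monthly": None, "value_rating": "good"},
        {"name": "Team", "price_monthly": 49, "value_rating": "good"},
    ],
    "pricing_model": "per-seat",
    "free_tier_available": False,
}

ranked_names = [tier["name"] for tier in rank_tiers(signals)]
print(ranked_names)
```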

Hiring intent data

Convert a job listing into structured hiring intent signals — useful for sales intelligence, market research, or agent-driven lead qualification.

import Tabstack from '@tabstack/sdk'

const client = new Tabstack()

const hiringIntent = await client.generate.json({
  url: 'https://jobs.example.com/senior-backend-engineer',
  json_schema: {
    type: 'object',
    properties: {
      role: { type: 'string', description: 'Job title as listed' },
      seniority: {
        type: 'string',
        enum: ['intern', 'junior', 'mid', 'senior', 'staff', 'principal', 'manager', 'director'],
      },
      tech_stack: {
        type: 'array',
        items: { type: 'string' },
        description: 'Technologies, frameworks, and tools mentioned in requirements',
      },
      team_size_signals: {
        type: 'string',
        description:
          'Any indicators of team or company size (headcount, growth stage, etc.)',
      },
      pain_points: {
        type: 'array',
        items: { type: 'string' },
        description:
          'Problems or challenges implied by the job description (e.g., "scaling infrastructure", "improving developer experience")',
      },
      remote_policy: {
        type: 'string',
        enum: ['remote', 'hybrid', 'on-site', 'unknown'],
      },
      urgency_level: {
        type: 'string',
        enum: ['low', 'medium', 'high'],
        description:
          "Rate the hiring urgency. 'high' = the listing implies backfill or a launched-but-incomplete project; 'low' = speculative or long-term planning.",
      },
      company_name: { type: 'string' },
    },
  },
  instructions:
    'Parse this job listing and extract: the exact role title, seniority level, all technologies and tools mentioned in requirements or responsibilities, any signals about team or company size, implied pain points or challenges the hire is meant to solve, the remote work policy, and the company name. Then judge the urgency_level (low/medium/high) from language cues about timelines, backfill, shipping pressure, or speculative planning.',
})

console.log(JSON.stringify(hiringIntent, null, 2))

import json
from tabstack import Tabstack

client = Tabstack()

hiring_intent = client.generate.json(
    url='https://jobs.example.com/senior-backend-engineer',
    json_schema={
        'type': 'object',
        'properties': {
            'role': {'type': 'string', 'description': 'Job title as listed'},
            'seniority': {
                'type': 'string',
                'enum': ['intern', 'junior', 'mid', 'senior', 'staff', 'principal', 'manager', 'director'],
            },
            'tech_stack': {
                'type': 'array',
                'items': {'type': 'string'},
                'description': 'Technologies, frameworks, and tools mentioned in requirements',
            },
            'team_size_signals': {
                'type': 'string',
                'description': 'Any indicators of team or company size',
            },
            'pain_points': {
                'type': 'array',
                'items': {'type': 'string'},
                'description': 'Problems implied by the job description',
            },
            'remote_policy': {
                'type': 'string',
                'enum': ['remote', 'hybrid', 'on-site', 'unknown'],
            },
            'urgency_level': {
                'type': 'string',
                'enum': ['low', 'medium', 'high'],
                'description': (
                    "Rate the hiring urgency. 'high' = the listing implies backfill or a "
                    "launched-but-incomplete project; 'low' = speculative or long-term planning."
                ),
            },
            'company_name': {'type': 'string'},
        },
    },
    instructions=(
        'Parse this job listing and extract: the exact role title, seniority level, '
        'all technologies and tools mentioned, any signals about team or company size, '
        'implied pain points the hire is meant to solve, the remote work policy, '
        'and the company name. Then judge the urgency_level (low/medium/high) from '
        'language cues about timelines, backfill, shipping pressure, or speculative planning.'
    ),
)

print(json.dumps(hiring_intent, indent=2))

curl -X POST https://api.tabstack.ai/v1/generate/json \
  -H "Authorization: Bearer $TABSTACK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://jobs.example.com/senior-backend-engineer",
    "json_schema": {
      "type": "object",
      "properties": {
        "role": {"type": "string"},
        "seniority": {
          "type": "string",
          "enum": ["intern", "junior", "mid", "senior", "staff", "principal", "manager", "director"]
        },
        "tech_stack": {"type": "array", "items": {"type": "string"}},
        "team_size_signals": {"type": "string"},
        "pain_points": {"type": "array", "items": {"type": "string"}},
        "remote_policy": {
          "type": "string",
          "enum": ["remote", "hybrid", "on-site", "unknown"]
        },
        "urgency_level": {
          "type": "string",
          "enum": ["low", "medium", "high"]
        },
        "company_name": {"type": "string"}
      }
    },
    "instructions": "Parse this job listing: extract the role title, seniority level, all mentioned technologies, team/company size signals, implied pain points, remote work policy, and company name. Then judge the urgency_level (low/medium/high) from language cues about timelines, backfill, shipping pressure, or speculative planning."
  }'
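
For lead qualification, the structured result can feed a simple score. The sketch below is one possible heuristic, not a recommended model: it counts overlap between the listing's `tech_stack` and a set of technologies you care about, then adds points for `urgency_level`. The `intent` data, the weights, and the `score_lead` name are all assumptions made for illustration.

```python
def score_lead(intent: dict, our_stack: set) -> int:
    """Score a listing: one point per matching technology plus urgency points."""
    urgency_points = {"low": 0, "medium": 1, "high": 2}
    mentioned = {tech.lower() for tech in intent.get("tech_stack", [])}
    overlap = len(mentioned & {tech.lower() for tech in our_stack})
    return overlap + urgency_points.get(intent.get("urgency_level"), 0)


# Invented sample in the shape of the schema above, not real output.
intent = {
    "role": "Senior Backend Engineer",
    "seniority": "senior",
    "tech_stack": ["Go", "PostgreSQL", "Kubernetes"],
    "remote_policy": "hybrid",
    "urgency_level": "high",
}

score = score_lead(intent, {"postgresql", "kubernetes", "redis"})
print(score)
```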

Working with Responses

This example fetches an article, asks the AI to generate key insights and action items, then processes the structured output.

import Tabstack from '@tabstack/sdk'

const client = new Tabstack()

async function analyzeAndProcess(url: string) {
  const data = await client.generate.json({
    url,
    json_schema: {
      type: 'object',
      properties: {
        mainTopic: { type: 'string' },
        keyInsights: {
          type: 'array',
          items: { type: 'string' },
        },
        actionItems: {
          type: 'array',
          items: { type: 'string' },
        },
      },
    },
    instructions:
      'Identify the main topic, extract 3-5 key insights, and suggest 2-3 action items for someone reading this.',
  })

  console.log(`Main Topic: ${data.mainTopic}\n`)

  console.log('Key Insights:')
  ;(data.keyInsights as string[]).forEach((insight, i) => {
    console.log(`  ${i + 1}. ${insight}`)
  })

  console.log('\nRecommended Actions:')
  ;(data.actionItems as string[]).forEach((action, i) => {
    console.log(`  ${i + 1}. ${action}`)
  })

  return data
}

analyzeAndProcess('https://blog.example.com/article').catch((error) => {
  console.error(`Analysis failed: ${(error as Error).message}`)
})

import json
from tabstack import Tabstack

client = Tabstack()

def analyze_and_process(url: str):
    data = client.generate.json(
        url=url,
        json_schema={
            'type': 'object',
            'properties': {
                'mainTopic': {'type': 'string'},
                'keyInsights': {
                    'type': 'array',
                    'items': {'type': 'string'},
                },
                'actionItems': {
                    'type': 'array',
                    'items': {'type': 'string'},
                },
            },
        },
        instructions='Identify the main topic, extract 3-5 key insights, and suggest 2-3 action items for someone reading this.',
    )

    print(f"Main Topic: {data['mainTopic']}\n")

    print('Key Insights:')
    for i, insight in enumerate(data['keyInsights'], 1):
        print(f"  {i}. {insight}")

    print('\nRecommended Actions:')
    for i, action in enumerate(data['actionItems'], 1):
        print(f"  {i}. {action}")

    return data

if __name__ == "__main__":
    analyze_and_process('https://blog.example.com/article')

curl -X POST https://api.tabstack.ai/v1/generate/json \
  -H "Authorization: Bearer $TABSTACK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://blog.example.com/article",
    "json_schema": {
      "type": "object",
      "properties": {
        "mainTopic": {"type": "string"},
        "keyInsights": {"type": "array", "items": {"type": "string"}},
        "actionItems": {"type": "array", "items": {"type": "string"}}
      }
    },
    "instructions": "Identify the main topic, extract 3-5 key insights, and suggest 2-3 action items for someone reading this."
  }'

Error Handling

The SDK throws typed errors you can catch by class.

Error status codes

Status	SDK error type	Description
400	`BadRequestError`	Missing or malformed `url`, `json_schema`, or `instructions`.
401	`AuthenticationError`	API key missing, invalid, or expired.
403	`PermissionDeniedError`	Your key doesn’t have access to this resource.
422	`UnprocessableEntityError`	The URL is malformed or points to an inaccessible resource.
429	`RateLimitError`	Too many requests. The SDK retries automatically with backoff.
500	`InternalServerError`	Server-side failure — page too large, fetch failed, or AI transformation failed.

The SDK automatically retries 408, 409, 429, and 500+ errors twice with exponential backoff.
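
If you call the endpoint over raw HTTP instead of the SDK, you would need to implement comparable retry behavior yourself. The sketch below retries the same status codes up to two times with doubling delays. The 0.5 second base delay is an assumption (the SDK's actual timings are not documented here), and `send` stands in for whatever function performs your HTTP request and returns a status code and body.

```python
import time
from typing import Callable, Tuple

RETRYABLE_STATUSES = {408, 409, 429}


def is_retryable(status: int) -> bool:
    """True for 408, 409, 429, and any 5xx status."""
    return status in RETRYABLE_STATUSES or status >= 500


def call_with_retries(
    send: Callable[[], Tuple[int, dict]],
    max_retries: int = 2,
    base_delay: float = 0.5,  # assumed value, not taken from the SDK
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call `send`, retrying retryable statuses with exponential backoff."""
    attempt = 0
    while True:
        status, body = send()
        if not is_retryable(status) or attempt == max_retries:
            return status, body
        sleep(base_delay * (2 ** attempt))  # 0.5s, then 1.0s by default
        attempt += 1


# Demonstration with a fake sender and a no-op sleep, so nothing is sent
# over the network and the example finishes immediately.
fake_responses = iter([(429, {}), (500, {}), (200, {"summary": "ok"})])
status, body = call_with_retries(lambda: next(fake_responses), sleep=lambda _: None)
print(status, body)
```

Injecting `sleep` keeps the function easy to test. In production you might also add random jitter to the delay so that many clients do not retry at the same moment.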

Error handling example

import Tabstack, {
  AuthenticationError,
  BadRequestError,
  RateLimitError,
  UnprocessableEntityError,
  InternalServerError,
} from '@tabstack/sdk'

const client = new Tabstack()

async function generateWithErrorHandling(
  url: string,
  schema: unknown,
  instructions: string
) {
  try {
    const data = await client.generate.json({
      url,
      json_schema: schema,
      instructions,
    })
    return data
  } catch (error) {
    if (error instanceof AuthenticationError) {
      throw new Error('Authentication failed. Check your TABSTACK_API_KEY.')
    }
    if (error instanceof BadRequestError) {
      throw new Error(`Bad request: ${error.message}`)
    }
    if (error instanceof UnprocessableEntityError) {
      throw new Error(`Invalid URL: ${error.message}`)
    }
    if (error instanceof RateLimitError) {
      throw new Error('Rate limit exceeded. Retries exhausted.')
    }
    if (error instanceof InternalServerError) {
      throw new Error(
        `Server error: ${error.message}. Try simplifying your instructions or using a different URL.`
      )
    }
    throw error
  }
}

// Usage
const schema = {
  type: 'object',
  properties: {
    summary: { type: 'string' },
  },
}

generateWithErrorHandling('https://example.com', schema, 'Create a brief summary')
  .then((data) => console.log(data))
  .catch((error) => console.error(`Failed to generate: ${(error as Error).message}`))

import json
from tabstack import Tabstack
from tabstack import AuthenticationError, BadRequestError, RateLimitError
from tabstack import UnprocessableEntityError, InternalServerError

client = Tabstack()

def generate_with_error_handling(url: str, schema: dict, instructions: str):
    try:
        return client.generate.json(
            url=url,
            json_schema=schema,
            instructions=instructions,
        )
    except AuthenticationError as e:
        # Chain with "from e" so the original SDK error stays in the traceback.
        raise PermissionError('Authentication failed. Check your TABSTACK_API_KEY.') from e
    except BadRequestError as e:
        raise ValueError(f'Bad request: {e}') from e
    except UnprocessableEntityError as e:
        raise ValueError(f'Invalid URL: {e}') from e
    except RateLimitError as e:
        raise RuntimeError('Rate limit exceeded. Retries exhausted.') from e
    except InternalServerError as e:
        raise RuntimeError(
            f'Server error: {e}. Try simplifying your instructions or using a different URL.'
        ) from e

# Usage
schema = {
    'type': 'object',
    'properties': {
        'summary': {'type': 'string'},
    },
}

try:
    data = generate_with_error_handling('https://example.com', schema, 'Create a brief summary')
    print(json.dumps(data, indent=2))
except Exception as e:
    print(f'Failed to generate: {e}')

# Check the HTTP status code and error field in the response
curl -s -w "\n%{http_code}" -X POST https://api.tabstack.ai/v1/generate/json \
  -H "Authorization: Bearer $TABSTACK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://example.com",
    "json_schema": {"type": "object", "properties": {"summary": {"type": "string"}}},
    "instructions": "Create a brief summary"
  }'

Best Practices

1. Write clear, specific instructions

Output quality depends heavily on how specific your instructions are. Compare three versions of the same request:

Vague: "instructions": "Summarize this"

Better: "instructions": "Write a 2-sentence summary of this article"

Best: "instructions": "Create a concise summary for the 'summary' field. The summary should be 2-3 sentences, written in a professional but accessible style, and focus on the main argument and key findings."

2. Use schema descriptions effectively

The AI reads the description fields in your json_schema. Use them to provide context and constraints for each property.

Without descriptions (less reliable):

{
  "type": "object",
  "properties": {
    "technicalLevel": {
      "type": "string",
      "enum": ["beginner", "intermediate", "advanced"]
    }
  }
}

With descriptions (more reliable):

{
  "type": "object",
  "properties": {
    "technicalLevel": {
      "type": "string",
      "enum": ["beginner", "intermediate", "advanced"],
      "description": "Assess the technical complexity of the content. 'Beginner' means no prior knowledge needed. 'Advanced' means deep technical expertise is required."
    }
  }
}
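
To apply this tip consistently, you could lint a schema for properties that lack a description before sending it. The sketch below walks nested `properties` and array `items`. `properties_missing_description` is a hypothetical helper, and it does not handle other schema keywords such as `anyOf`.

```python
def properties_missing_description(schema: dict, path: str = "") -> list:
    """Return dotted paths of properties that have no `description`."""
    missing = []
    for name, spec in schema.get("properties", {}).items():
        here = f"{path}.{name}" if path else name
        if not spec.get("description"):
            missing.append(here)
        # Recurse into nested objects.
        missing.extend(properties_missing_description(spec, here))
        # Recurse into array item definitions, marking the path with [].
        items = spec.get("items")
        if isinstance(items, dict):
            missing.extend(properties_missing_description(items, here + "[]"))
    return missing


schema = {
    "type": "object",
    "properties": {
        "posts": {
            "type": "array",
            "description": "All posts on the page",
            "items": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "category": {
                        "type": "string",
                        "description": "One of: tutorial, opinion, news, case-study",
                    },
                },
            },
        }
    },
}

print(properties_missing_description(schema))
```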

3. Test instructions iteratively

Start with a simple schema and simple instructions. Get that working, then add complexity.

Start simple: "Summarize this article in 2 sentences."
Add a field: "Summarize... and identify the target audience."
Add more: "Summarize... identify the target audience... list 3 key takeaways... and rate the technical difficulty from 1-5."

This is easier to debug than writing a complex prompt from scratch.
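
The progression above can be sketched as a list of cumulative stages, each adding one field and one instruction to the previous stage. This is an illustrative sketch under stated assumptions: `build_stages`, `run_stages`, and `fake_generate` are hypothetical names, and `fake_generate` is a stand-in so the example runs without network access. In real use you would pass a callable that accepts `url`, `json_schema`, and `instructions` keyword arguments, such as the SDK's `client.generate.json`.

```python
import copy
from typing import Callable

# One entry per field: (property name, property schema, instruction sentence).
FIELDS = [
    ("summary", {"type": "string"}, "Summarize this article in 2 sentences."),
    ("audience", {"type": "string"}, "Identify the target audience."),
    (
        "takeaways",
        {"type": "array", "items": {"type": "string"}},
        "List 3 key takeaways.",
    ),
]


def build_stages(fields: list) -> list:
    """Build cumulative stages: stage N contains the first N fields."""
    stages = []
    properties = {}
    instructions = []
    for name, prop_schema, instruction in fields:
        properties[name] = prop_schema
        instructions.append(instruction)
        stages.append({
            # Copy so later stages do not mutate earlier ones.
            "json_schema": {
                "type": "object",
                "properties": copy.deepcopy(properties),
            },
            "instructions": " ".join(instructions),
        })
    return stages


def run_stages(generate: Callable, url: str, stages: list) -> list:
    """Run stages in order and report which stage fails first."""
    results = []
    for index, stage in enumerate(stages, start=1):
        try:
            results.append(generate(url=url, **stage))
        except Exception as error:
            raise RuntimeError(f"Stage {index} failed: {error}") from error
    return results


def fake_generate(url: str, json_schema: dict, instructions: str) -> dict:
    """Stand-in for a real API call: returns one empty value per field."""
    return {name: None for name in json_schema["properties"]}


stages = build_stages(FIELDS)
results = run_stages(fake_generate, "https://example.com", stages)
for stage, result in zip(stages, results):
    print(stage["instructions"], "->", sorted(result))
```

Because each stage differs from the previous one by a single field, a failure or a drop in output quality points at the most recently added field or instruction.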

4. Monitor and log generations

For production systems, log your requests and responses. This lets you monitor output quality, identify edge cases, and refine your instructions over time.

import Tabstack from '@tabstack/sdk'

const client = new Tabstack()

async function generateWithLogging(
  url: string,
  schema: unknown,
  instructions: string
) {
  const startTime = Date.now()

  try {
    const data = await client.generate.json({ url, json_schema: schema, instructions })
    const duration = Date.now() - startTime

    console.log(
      JSON.stringify({
        level: 'info',
        timestamp: new Date().toISOString(),
        url,
        status: 'success',
        durationMs: duration,
        instructionsLength: instructions.length,
        responseSize: JSON.stringify(data).length,
      })
    )

    return data
  } catch (error) {
    const duration = Date.now() - startTime

    console.error(
      JSON.stringify({
        level: 'error',
        timestamp: new Date().toISOString(),
        url,
        status: 'error',
        durationMs: duration,
        errorMessage: (error as Error).message,
      })
    )
    throw error
  }
}

await generateWithLogging(
  'https://example.com',
  { type: 'object', properties: { summary: { type: 'string' } } },
  'Write a 1-sentence summary of the page.',
)

import time
import json
import logging
from tabstack import Tabstack

client = Tabstack()

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(message)s')

def generate_with_logging(url: str, schema: dict, instructions: str):
    # perf_counter is monotonic, so durations are not skewed by system clock changes
    start_time = time.perf_counter()

    try:
        data = client.generate.json(url=url, json_schema=schema, instructions=instructions)
        duration_ms = round((time.perf_counter() - start_time) * 1000, 1)

        logging.info(json.dumps({
            'url': url,
            'status': 'success',
            'duration_ms': duration_ms,
            'instructions_length': len(instructions),
            'response_size': len(json.dumps(data)),
        }))

        return data
    except Exception as e:
        duration_ms = round((time.perf_counter() - start_time) * 1000, 1)

        logging.error(json.dumps({
            'url': url,
            'status': 'error',
            'duration_ms': duration_ms,
            'error_message': str(e),
        }))
        raise

generate_with_logging(
    'https://example.com',
    {'type': 'object', 'properties': {'summary': {'type': 'string'}}},
    'Write a 1-sentence summary of the page.',
)

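# Log the response, status code, and request duration together.
# Note: %N (nanoseconds) requires GNU date; BSD/macOS date does not support it.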
START=$(date +%s%N)
RESPONSE=$(curl -s -w "\n%{http_code}" -X POST https://api.tabstack.ai/v1/generate/json \
  -H "Authorization: Bearer $TABSTACK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "json_schema": {"type": "object", "properties": {"summary": {"type": "string"}}, "required": ["summary"]}, "instructions": "Write a 1-sentence summary of the page in the '\''summary'\'' field."}')
END=$(date +%s%N)
echo "Duration: $(( (END - START) / 1000000 ))ms"
echo "$RESPONSE"
