---
title: How to Generate JSON Data with AI | Tabstack
description: Learn how to use the Tabstack API `/v1/generate/json` endpoint to generate JSON data with AI.
---

Often, simply extracting existing data from a web page isn’t enough. You need to **transform** that content: summarize it, categorize it, translate it, or restructure it into a new format.

This is where the Tabstack API `/v1/generate/json` endpoint shines. Unlike the `/v1/extract/json` endpoint, which pulls existing data, the `/v1/generate/json` endpoint uses AI to **generate new content** based on your needs.

This process is driven by two key inputs you provide:

1. **`json_schema`**: The “what.” This is a precise blueprint defining the **shape** of the final JSON you want.
2. **`instructions`**: The “how.” These are natural language instructions that tell the AI *how* to process the source content to populate your schema.

By combining a target URL, a schema, and clear instructions, you can build workflows for content summarization, sentiment analysis, data categorization, and much more.

This guide walks you through the entire process, from your first request to advanced, production-ready patterns.

---

## Prerequisites

Before you can use the JSON generation endpoint, you’ll need a valid Tabstack API key.

1. Sign up to get your API key.
2. The API uses Bearer Token authentication, so you’ll pass your key in an `Authorization` header.

We recommend storing your API key as an environment variable for security and convenience. This command shows how to set an environment variable in a Bash-compatible shell.

```bash
export TABSTACK_API_KEY="your-api-key-here"
```

The `export` command makes the variable available to any processes or scripts you run from this shell session. `TABSTACK_API_KEY` is the name of the environment variable that our code examples look for. Replace `"your-api-key-here"` with your actual, secret API key.

**How to Run:**

1. Copy this command.
2. Paste it into your terminal.
3. Replace the placeholder with your key and press Enter.
4. The `curl`, JavaScript (Node.js), and Python examples in this guide will now work by reading this variable.

---

## Step-by-Step: From URL to Generated JSON

Let’s walk through a complete, practical example.

**The Goal:** We want to analyze the Hacker News homepage (`https://news.ycombinator.com`). We don’t just want to scrape data; we want the AI to analyze each story, **generate a category** (e.g., “tech,” “business,” “science”), and **write a new one-sentence summary** for it.

### Step 1: Define the Schema

First, we define the *shape* of our desired output using `json_schema`. We want a top-level object containing a key called `summaries`, which should be an array. Each object in that array should have three string properties: `title`, `category`, and `summary`.

```json
{
  "json_schema": {
    "type": "object",
    "properties": {
      "summaries": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "title": { "type": "string" },
            "category": { "type": "string" },
            "summary": { "type": "string" }
          }
        }
      }
    }
  }
}
```

### Step 2: Write the Instructions

Next, we write the `instructions` to tell the AI *how* to populate this schema.

```json
"instructions": "For each story on the page, find its title. Then, categorize it as tech/business/science/other and write a one-sentence summary in simple terms."
```

This instruction is critical: it explicitly tells the AI to generate the new `category` and `summary` data.

### Step 3: Assemble and Run the Request

Now, let’s combine the target `url`, our `json_schema`, and our `instructions` into a single `POST` request to the `https://api.tabstack.ai/v1/generate/json` endpoint.
**curl:**

```bash
curl -X POST https://api.tabstack.ai/v1/generate/json \
  -H "Authorization: Bearer $TABSTACK_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://news.ycombinator.com",
    "json_schema": {
      "type": "object",
      "properties": {
        "summaries": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "title": {"type": "string"},
              "category": {"type": "string"},
              "summary": {"type": "string"}
            }
          }
        }
      }
    },
    "instructions": "For each story, categorize it as tech/business/science/other and write a one-sentence summary in simple terms"
  }'
```

**JavaScript:**

```javascript
async function generateJson() {
  const response = await fetch("https://api.tabstack.ai/v1/generate/json", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.TABSTACK_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      url: "https://news.ycombinator.com",
      json_schema: {
        type: "object",
        properties: {
          summaries: {
            type: "array",
            items: {
              type: "object",
              properties: {
                title: { type: "string" },
                category: { type: "string" },
                summary: { type: "string" },
              },
            },
          },
        },
      },
      instructions:
        "For each story, categorize it as tech/business/science/other and write a one-sentence summary in simple terms",
    }),
  });

  const data = await response.json();
  console.log(JSON.stringify(data, null, 2));
  return data;
}

// Report failures instead of leaving the promise rejection unhandled
generateJson().catch((error) => console.error(error.message));
```

**Python:**

```python
import json
import os

import requests

response = requests.post(
    'https://api.tabstack.ai/v1/generate/json',
    headers={
        'Authorization': f'Bearer {os.environ["TABSTACK_API_KEY"]}',
        'Content-Type': 'application/json'
    },
    json={
        'url': 'https://news.ycombinator.com',
        'json_schema': {
            'type': 'object',
            'properties': {
                'summaries': {
                    'type': 'array',
                    'items': {
                        'type': 'object',
                        'properties': {
                            'title': {'type': 'string'},
                            'category': {'type': 'string'},
                            'summary': {'type': 'string'}
                        }
                    }
                }
            }
        },
        'instructions': 'For each story, categorize it as tech/business/science/other and write a one-sentence summary in simple terms'
    }
)

data = response.json()
print(json.dumps(data, indent=2))
```

**Code Explanation (Step-by-Step):**

We send a `POST` request to the endpoint `https://api.tabstack.ai/v1/generate/json`. For authentication, we include the `Authorization: Bearer $TABSTACK_API_KEY` header, which uses the environment variable we set earlier. The `Content-Type: application/json` header tells the server that we are sending a JSON payload.

The request body contains three key parameters: `"url"` specifies the target web page to fetch and analyze, `"json_schema"` provides the structured blueprint for the output, and `"instructions"` provides the natural language logic for the AI to follow.

**How to Run:**

- **curl:** Run this command directly in your terminal, assuming you’ve set the `TABSTACK_API_KEY` variable.
- **JavaScript:** Save the code as `generate.js` and run it from your terminal using `node generate.js`. The example relies on the global `fetch` function, which is available in Node.js 18 and later.
- **Python:** Save the code as `generate.py`. Ensure you have the `requests` library installed (`pip install requests`). Run it using `python generate.py`.

### Step 4: Analyze the Response

A successful request returns a `200 OK` status and a JSON payload that matches your schema. For example:

```json
{
  "summaries": [
    {
      "title": "New AI Model Released",
      "category": "tech",
      "summary": "A research lab announced a new language model that performs better on reasoning tasks."
    },
    {
      "title": "Database Performance Tips",
      "category": "tech",
      "summary": "An engineer shares techniques that reduced database query times by 90%."
    },
    {
      "title": "Climate Tech Startup Raises Funding",
      "category": "business",
      "summary": "A carbon capture company secured $50M in Series B funding."
    }
  ]
}
```

**Response Explanation:**

The key difference from extraction: the AI *generated* the `category` and `summary` fields based on your instructions. The title was extracted, but the other fields are newly created, AI-derived content.
This is what makes the generate endpoint powerful: it creates data, not just finds it.

---

## Request Parameters

Here is a detailed breakdown of the `POST` request body parameters.

### url (required)

- **Type:** `string`
- **Description:** The fully qualified URL of the web page you want to process.
- **Validation:**
  - Must be a valid, publicly accessible URL.
  - Cannot be a `localhost` address or a private/internal IP.

```json
{
  "url": "https://techblog.example.com/article"
}
```

### json\_schema (required)

- **Type:** `object`
- **Description:** A valid JSON Schema object defining the exact structure, data types, and constraints for your desired output. The AI will strictly adhere to this schema.
- **Tips for Effective Schemas:**
  - **Be Specific:** Use `string`, `number`, `boolean`, `array`, and `object` types.
  - **Use Descriptions:** Add `description` fields to properties. The AI uses these as hints to generate better, more relevant data.
  - **Use Enums:** To constrain a field to a specific list of values (e.g., for categories), use the `enum` keyword.

```json
{
  "json_schema": {
    "type": "object",
    "properties": {
      "summary": {
        "type": "string",
        "description": "Overall summary of the content"
      },
      "sentiment": {
        "type": "string",
        "enum": ["positive", "negative", "neutral"],
        "description": "Overall sentiment of the article"
      }
    },
    "required": ["summary"]
  }
}
```

### instructions (required)

- **Type:** `string`
- **Description:** Natural language instructions that tell the AI *how* to generate the data to fit your schema. This is where you define the transformation logic.
- **Tips for Effective Instructions:**
  - **Be Clear and Specific:** “Write a 3-sentence summary” is better than “Summarize this.”
  - **Reference Schema Properties:** Mention property names (e.g., “For the `sentiment` field, determine if the tone is…”).
  - **Define Edge Cases:** “If no author is found, set the `author` field to `null`.”
  - **Specify Format:** “Extract the key points as a list of bullet points.”

```json
{
  "instructions": "Read the article and create a concise summary (2-3 sentences). Determine if the overall tone is positive, negative, or neutral and assign it to the 'sentiment' field."
}
```

### nocache (optional)

- **Type:** `boolean`
- **Default:** `false`
- **Description:** When set to `true`, this forces the API to bypass any cached version of the URL and re-fetch and re-process the content.
- **When to Use:**
  - Analyzing real-time or frequently updated content (e.g., news homepages, stock tickers).
  - Debugging a request with different instructions on the same URL.

```json
{
  "url": "https://news.example.com",
  "json_schema": { ... },
  "instructions": "Summarize the news",
  "nocache": true
}
```

---

## Real-World Examples

These examples show the request *payload* and the corresponding *response*. The key is to see how the `instructions` and `json_schema` work together.

### Example 1: Content Summarization and Audience Analysis

**Goal:** Generate a “TL;DR,” a longer summary, and identify the target audience for a blog post.

**Request Payload:**

```json
{
  "url": "https://techblog.example.com/long-article",
  "json_schema": {
    "type": "object",
    "properties": {
      "title": { "type": "string" },
      "tldr": { "type": "string", "description": "One-sentence summary" },
      "summary": { "type": "string", "description": "3-4 sentence summary" },
      "targetAudience": { "type": "string", "description": "Who this is for" }
    }
  },
  "instructions": "Create a TLDR (one sentence) and a longer summary (3-4 sentences) of this article. Also identify who the target audience is (e.g., developers, business leaders, general audience)."
}
```

**Example Response:**

```json
{
  "title": "Understanding Microservices Architecture",
  "tldr": "Microservices break large applications into small, independent services that communicate via APIs, offering better scalability and flexibility.",
  "summary": "This article explains microservices architecture, where applications are built as a collection of small, loosely coupled services. Each service handles a specific business function and can be developed, deployed, and scaled independently. The approach offers better fault isolation, technology flexibility, and team autonomy, though it introduces complexity in service coordination and data consistency.",
  "targetAudience": "Software developers and architects"
}
```

### Example 2: Content Categorization and Tagging

**Goal:** Scrape a blog homepage, and for each post, generate a category, a list of tags, and a difficulty level.

**Request Payload:**

```json
{
  "url": "https://blog.example.com",
  "json_schema": {
    "type": "object",
    "properties": {
      "posts": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "title": { "type": "string" },
            "category": {
              "type": "string",
              "description": "One of: tutorial, opinion, news, case-study"
            },
            "tags": {
              "type": "array",
              "items": { "type": "string" }
            },
            "difficultyLevel": {
              "type": "string",
              "enum": ["beginner", "intermediate", "advanced"]
            }
          }
        }
      }
    }
  },
  "instructions": "For each blog post, assign it to a category (tutorial/opinion/news/case-study), generate 3-5 relevant tags, and determine the technical difficulty level (beginner/intermediate/advanced)."
}
```

**Example Response:**

```json
{
  "posts": [
    {
      "title": "Getting Started with React Hooks",
      "category": "tutorial",
      "tags": ["react", "hooks", "javascript", "frontend", "beginner-friendly"],
      "difficultyLevel": "beginner"
    },
    {
      "title": "Optimizing Database Queries at Scale",
      "category": "tutorial",
      "tags": ["database", "performance", "optimization", "sql", "indexing"],
      "difficultyLevel": "advanced"
    },
    {
      "title": "Why We Migrated to Kubernetes",
      "category": "case-study",
      "tags": ["kubernetes", "devops", "infrastructure", "migration", "containers"],
      "difficultyLevel": "intermediate"
    }
  ]
}
```

### Example 3: Sentiment and Trend Analysis from Reviews

**Goal:** Analyze a page of product reviews to find the overall sentiment, common themes, and a final recommendation.

**Request Payload:**

```json
{
  "url": "https://reviews.example.com/product/xyz",
  "json_schema": {
    "type": "object",
    "properties": {
      "overallSentiment": {
        "type": "string",
        "enum": ["very positive", "positive", "neutral", "negative", "very negative"]
      },
      "rating": {
        "type": "number",
        "description": "Estimated average rating out of 5"
      },
      "commonPraises": {
        "type": "array",
        "items": { "type": "string" }
      },
      "commonComplaints": {
        "type": "array",
        "items": { "type": "string" }
      },
      "recommendation": { "type": "string" }
    }
  },
  "instructions": "Analyze all customer reviews on this page and determine: the overall sentiment, an estimated rating (1-5), the top 3 most common praises, the top 3 most common complaints, and write a final recommendation (yes/no with a brief reason)."
}
```

**Example Response:**

```json
{
  "overallSentiment": "positive",
  "rating": 4.2,
  "commonPraises": [
    "Excellent build quality and durability",
    "Great battery life lasting 2-3 days",
    "Intuitive and easy-to-use interface"
  ],
  "commonComplaints": [
    "Price is higher than competitors",
    "Limited color options available",
    "Charging cable is too short"
  ],
  "recommendation": "Yes - the product excels in quality and performance, making it worth the premium price for users who prioritize reliability over cost."
}
```

---

## Working with Responses

Here are complete, runnable examples showing how to call the API and then process the AI-generated data in your application.

### Example: Processing Key Insights

This script fetches an article, asks the AI to generate key insights and action items, and then prints them in a formatted way.

**JavaScript:**

```javascript
async function analyzeAndProcess(url) {
  const payload = {
    url,
    json_schema: {
      type: "object",
      properties: {
        mainTopic: { type: "string" },
        keyInsights: {
          type: "array",
          items: { type: "string" },
        },
        actionItems: {
          type: "array",
          items: { type: "string" },
        },
      },
    },
    instructions:
      "Identify the main topic, extract 3-5 key insights, and suggest 2-3 action items for someone reading this.",
  };

  const response = await fetch("https://api.tabstack.ai/v1/generate/json", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.TABSTACK_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  });

  if (!response.ok) {
    throw new Error(`API request failed: ${response.statusText}`);
  }

  const data = await response.json();

  // Process the AI-generated content
  console.log(`Main Topic: ${data.mainTopic}\n`);

  console.log("Key Insights:");
  data.keyInsights.forEach((insight, i) => {
    console.log(`  ${i + 1}. ${insight}`);
  });

  console.log("\nRecommended Actions:");
  data.actionItems.forEach((action, i) => {
    console.log(`  ${i + 1}. ${action}`);
  });

  return data;
}

// Call the function and report any failure
analyzeAndProcess("https://blog.example.com/article").catch((error) =>
  console.error(error.message)
);
```

**Python:**

```python
import os

import requests


def analyze_and_process(url):
    payload = {
        'url': url,
        'json_schema': {
            'type': 'object',
            'properties': {
                'mainTopic': {'type': 'string'},
                'keyInsights': {
                    'type': 'array',
                    'items': {'type': 'string'}
                },
                'actionItems': {
                    'type': 'array',
                    'items': {'type': 'string'}
                }
            }
        },
        'instructions': 'Identify the main topic, extract 3-5 key insights, and suggest 2-3 action items for someone reading this.'
    }

    response = requests.post(
        'https://api.tabstack.ai/v1/generate/json',
        headers={
            'Authorization': f'Bearer {os.environ["TABSTACK_API_KEY"]}',
            'Content-Type': 'application/json'
        },
        json=payload
    )
    response.raise_for_status()  # Raises an HTTPError for bad responses

    data = response.json()

    # Process the AI-generated content
    print(f"Main Topic: {data['mainTopic']}\n")

    print('Key Insights:')
    for i, insight in enumerate(data['keyInsights'], 1):
        print(f"  {i}. {insight}")

    print('\nRecommended Actions:')
    for i, action in enumerate(data['actionItems'], 1):
        print(f"  {i}. {action}")

    return data


# Call the function
if __name__ == "__main__":
    analyze_and_process('https://blog.example.com/article')
```

**Code Explanation (JavaScript):**

This reusable function takes a URL, sends the generation request with your schema and instructions, checks for a failed response, and processes the structured result. The AI returns data matching your schema, which you can then display, store, or pass to other parts of your application.

**How to Run:**

- **JavaScript:** Save as `analyze.js`. Run with `node analyze.js`.
- **Python:** Save as `analyze.py`. Run with `python analyze.py`. (Remember to `pip install requests`.)

---

## Error Handling

Well-structured error handling is essential for a robust application. The API returns standard HTTP status codes.
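Before looking at the specific codes, it can help to decide how your application will group them. The sketch below is an illustration under stated assumptions: the function name `classify_status` and its category labels are hypothetical and not part of the Tabstack API; the grouping relies only on the standard HTTP status classes.

```python
def classify_status(status_code: int) -> str:
    """Group an HTTP status code into a coarse category.

    The category labels are hypothetical names chosen for this sketch.
    """
    if 200 <= status_code < 300:
        return "success"
    if status_code in (401, 403):
        # Credentials problem: resending the same request will not help
        return "auth_error"
    if 400 <= status_code < 500:
        # The request itself needs to be fixed before retrying
        return "client_error"
    if 500 <= status_code < 600:
        # The failure happened on the server side
        return "server_error"
    return "unexpected"


if __name__ == "__main__":
    for code in (200, 400, 401, 422, 500):
        print(code, classify_status(code))
```

A caller could branch on the returned label, for example surfacing `client_error` to the developer immediately while treating `server_error` as a candidate for a later retry.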
### Common Error Status Codes

| Status Code | Error Message | Description |
| ----------- | ------------- | ----------- |
| 400 | `url is required` | The `url` parameter was missing from your request body. |
| 400 | `json schema is required` | The `json_schema` parameter was missing. |
| 400 | `instructions are required` | The `instructions` parameter was missing. |
| 400 | `json schema must be a valid object` | The `json_schema` you provided was malformed or not valid. |
| 401 | `Unauthorized - Invalid token` | Your API key is missing, invalid, or expired. |
| 422 | `url is invalid` | The `url` was malformed or pointed to an inaccessible resource. |
| 500 | `failed to fetch URL` | The server had a problem fetching the provided URL. |
| 500 | `web page is too large` | The target page’s content exceeded the maximum processing size. |
| 500 | `failed to transform data` | The AI failed to generate data. This can be due to overly complex instructions, a schema mismatch, or bad source content. |

All error responses return a JSON object with an `error` field.

```json
{
  "error": "instructions are required"
}
```

### Error Handling Example Code

This example expands our previous function with `try...catch` blocks (`try...except` in Python) and status-specific error messages.
**JavaScript:**

```javascript
async function generateWithErrorHandling(url, schema, instructions) {
  try {
    const response = await fetch("https://api.tabstack.ai/v1/generate/json", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${process.env.TABSTACK_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        url,
        json_schema: schema,
        instructions,
      }),
    });

    const data = await response.json();

    if (!response.ok) {
      // Fall back to a generic message if the body has no `error` field
      const message = data.error ?? "Unknown error";

      // Handle API-level errors
      switch (response.status) {
        case 400:
          throw new Error(`Bad request: ${message}`);
        case 401:
          throw new Error("Authentication failed. Check your API key.");
        case 422:
          throw new Error(`Invalid URL: ${message}`);
        case 500:
          if (message.includes("too large")) {
            throw new Error("Page is too large. Try a different URL.");
          } else if (message.includes("transform")) {
            throw new Error(
              "AI transformation failed. Try simplifying your instructions."
            );
          }
          throw new Error(`Server error: ${message}`);
        default:
          throw new Error(`Request failed (${response.status}): ${message}`);
      }
    }

    return data;
  } catch (error) {
    // Handle network errors or thrown exceptions
    console.error("Error generating JSON:", error.message);
    throw error;
  }
}

// Usage
const schema = {
  type: "object",
  properties: {
    summary: { type: "string" },
  },
};

generateWithErrorHandling("https://example.com", schema, "Create a brief summary")
  .then((data) => console.log(data))
  .catch(() => {
    // The error was already logged above; signal failure to the shell
    process.exitCode = 1;
  });
```

**Python:**

```python
import json
import os

import requests


def generate_with_error_handling(url, schema, instructions):
    try:
        response = requests.post(
            'https://api.tabstack.ai/v1/generate/json',
            headers={
                'Authorization': f'Bearer {os.environ["TABSTACK_API_KEY"]}',
                'Content-Type': 'application/json'
            },
            json={
                'url': url,
                'json_schema': schema,
                'instructions': instructions
            },
            timeout=30
        )

        data = response.json()

        if not response.ok:
            error_msg = data.get('error', 'Unknown error')

            if response.status_code == 400:
                raise ValueError(f'Bad request: {error_msg}')
            elif response.status_code == 401:
                raise PermissionError('Authentication failed. Check your API key.')
            elif response.status_code == 422:
                raise ValueError(f'Invalid URL: {error_msg}')
            elif response.status_code == 500:
                if 'too large' in error_msg:
                    raise RuntimeError('Page is too large. Try a different URL.')
                elif 'transform' in error_msg:
                    raise RuntimeError('AI transformation failed. Try simplifying instructions.')
                raise RuntimeError(f'Server error: {error_msg}')
            else:
                response.raise_for_status()

        return data

    except requests.exceptions.Timeout:
        raise TimeoutError('The request timed out.')
    except requests.exceptions.HTTPError as e:
        print(f'HTTP Error: {e}')
        raise
    except requests.exceptions.RequestException as e:
        print(f'Network error: {e}')
        raise


# Usage
schema = {
    'type': 'object',
    'properties': {
        'summary': {'type': 'string'}
    }
}

try:
    data = generate_with_error_handling(
        'https://example.com',
        schema,
        'Create a brief summary'
    )
    print(json.dumps(data, indent=2))
except Exception as e:
    print(f'Failed to generate: {e}')
```

**Code Explanation (JavaScript):**

This version adds status-specific error handling. It parses the response body first so that error messages are accessible, then checks `response.ok`. A `switch` statement maps each status code to a specific error message. This makes debugging easier: you know immediately whether the issue is authentication, a bad URL, or an AI generation failure.

**How to Run:**

- This code is designed to be part of a larger application. You can test it by running the file. To trigger an error, try passing an invalid URL (e.g., `http://invalid-url-123.xyz`) or an empty `instructions` string.

---

## Best Practices

Follow these practices to get the most reliable and accurate results from the AI.

### 1. Write Clear, Specific Instructions

The quality of your output depends heavily on the quality of your instructions. Vague instructions lead to vague results.
> **Vague:** `"instructions": "Summarize this"`
>
> **Better:** `"instructions": "Write a 2-sentence summary of this article"`
>
> **Best:** `"instructions": "Create a concise summary of this article for the 'summary' field. The summary should be 2-3 sentences, written in a professional but accessible style, and focus on the main argument and key findings."`

### 2. Use Schema Descriptions Effectively

The AI reads the `description` fields in your `json_schema`. Use them to provide context and hints for each property.

**Schema without descriptions (less reliable):**

```json
{
  "type": "object",
  "properties": {
    "technicalLevel": {
      "type": "string",
      "enum": ["beginner", "intermediate", "advanced"]
    }
  }
}
```

**Schema with descriptions (more reliable):**

```json
{
  "type": "object",
  "properties": {
    "technicalLevel": {
      "type": "string",
      "enum": ["beginner", "intermediate", "advanced"],
      "description": "Assess the technical complexity of the content. 'Beginner' means no prior knowledge needed. 'Advanced' means deep technical expertise is required."
    }
  }
}
```

### 3. Test Instructions Iteratively

Start with a simple schema and simple instructions. Get that working, and then gradually add complexity.

1. **Start Simple:** `"instructions": "Summarize this article in 2 sentences."`
2. **Test and Refine:** Does it work? Good. Now add more.
3. **Add Complexity:** `"instructions": "Summarize this article in 2 sentences and identify the target audience."`
4. **Extend and Test Again:** `"instructions": "Summarize... identify the target audience... list 3 key takeaways... and rate the technical difficulty from 1-5."`

This iterative process is much easier to debug than writing a highly complex prompt from scratch.

### 4. Monitor and Log Generations

For production systems, log your requests and the AI’s (successful) responses. This allows you to monitor for quality, identify edge cases where the AI struggles, and build a dataset for refining your instructions.
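One way to surface the edge cases mentioned above is to scan each logged response for fields the AI left empty. The sketch below is a minimal illustration, not part of the Tabstack API: the helper name `find_empty_fields` and the `$`-rooted path format are assumptions made for this example, and the sample data mirrors the Hacker News response shape used earlier in this guide.

```python
def find_empty_fields(value, path="$"):
    """Return paths of values that are None, blank strings, or empty containers."""
    if value is None:
        return [path]
    if isinstance(value, str):
        # Whitespace-only strings count as empty
        return [] if value.strip() else [path]
    if isinstance(value, dict):
        if not value:
            return [path]
        found = []
        for key, item in value.items():
            found.extend(find_empty_fields(item, f"{path}.{key}"))
        return found
    if isinstance(value, list):
        if not value:
            return [path]
        found = []
        for index, item in enumerate(value):
            found.extend(find_empty_fields(item, f"{path}[{index}]"))
        return found
    # Numbers and booleans are never treated as empty
    return []


# Hypothetical response in which two fields were left unfilled
sample = {
    "summaries": [
        {"title": "New AI Model Released", "category": "tech", "summary": ""},
        {"title": "Database Performance Tips", "category": None, "summary": "ok"},
    ]
}

print(find_empty_fields(sample))
# → ['$.summaries[0].summary', '$.summaries[1].category']
```

The returned paths can be written to the same log entry as the request metadata, which makes it easier to spot instructions that regularly produce blank fields.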
The function below logs key metadata about each request.

**JavaScript:**

```javascript
// Assumes generateWithErrorHandling from the previous example is in scope
async function generateWithLogging(url, schema, instructions) {
  const startTime = Date.now();

  try {
    const data = await generateWithErrorHandling(url, schema, instructions); // Re-use our error handler
    const duration = Date.now() - startTime;

    console.log(
      JSON.stringify({
        level: "info",
        timestamp: new Date().toISOString(),
        url,
        status: "success",
        durationMs: duration,
        instructionsLength: instructions.length,
        responseSize: JSON.stringify(data).length,
      })
    );

    return data;
  } catch (error) {
    const duration = Date.now() - startTime;

    console.error(
      JSON.stringify({
        level: "error",
        timestamp: new Date().toISOString(),
        url,
        status: "error",
        durationMs: duration,
        errorMessage: error.message,
      })
    );

    throw error;
  }
}
```

**Python:**

```python
import json
import logging
import time

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(message)s')


# Assumes generate_with_error_handling from the previous example is defined in this module
def generate_with_logging(url, schema, instructions):
    start_time = time.time()

    try:
        data = generate_with_error_handling(url, schema, instructions)  # Re-use error handler
        duration_ms = (time.time() - start_time) * 1000

        logging.info(json.dumps({
            'url': url,
            'status': 'success',
            'duration_ms': duration_ms,
            'instructions_length': len(instructions),
            'response_size': len(json.dumps(data))
        }))

        return data

    except Exception as e:
        duration_ms = (time.time() - start_time) * 1000

        logging.error(json.dumps({
            'url': url,
            'status': 'error',
            'duration_ms': duration_ms,
            'error_message': str(e)
        }))

        raise
```

**Code Explanation (JavaScript):**

We record the time before the request using `const startTime = Date.now()`. Inside the `try` block, we call the `generateWithErrorHandling` function from the previous section to make the API request. After the request completes, we calculate the request duration. We then log a structured JSON object using `console.log(JSON.stringify(...))`.
In a real application, you would send this to a logging service like Datadog, Sentry, or your own database. The `catch (error)` block also logs a structured error message on failure, ensuring that both successful and failed requests are tracked.

**How to Run:**

- Use this `generateWithLogging` function as your new primary entry point for making API calls.

---

## Related Resources

- [API Reference: Generate JSON Endpoint](/api/generate-json-v-1)
- [How to Extract JSON Data](/guides/how-to-extract-json)
- [Quick Start Guide](/getting-started/quick-start)