Recipe Aggregator
Complexity: Beginner
Estimated time: 30-45 minutes
Key TABS APIs: /v1/extract/json/schema, /v1/extract/json, /v1/extract/markdown
Build your personal recipe collection by importing recipes from any cooking website. This example shows you how to extract, store, and search recipes from across the web in one centralized location.
The Story Behind This Project
You've got recipes bookmarked across dozens of websites—some on food blogs, some on NYT Cooking, some on AllRecipes. Each site has different formatting, ads, and paywalls. You want all your favorite recipes in one place, in a clean format, searchable and organized. This recipe aggregator does exactly that.
What This Aggregator Does
Unlike browser bookmarks or Pinterest boards, our recipe aggregator will:
- Extract recipes from any cooking website using AI
- Convert messy recipe pages into clean, structured data
- Store recipes in a searchable local database
- Support filtering by ingredients, cuisine, cooking time
- Export recipes to clean markdown format
How It Works: The Recipe Import Pipeline
Enter Recipe URL → Discover Schema → Extract Recipe Data → Save to Database → Search & View
- Schema Discovery: Use /v1/extract/json/schema to understand the recipe page structure
- Data Extraction: Use /v1/extract/json to extract structured recipe data
- Content Cleaning: Use /v1/extract/markdown for clean recipe text
- Storage: Save to a local JSON database
- Search: Find recipes by name, ingredient, or cuisine
Project Setup
Let's build a simple but powerful recipe aggregator.
Prerequisites
- Node.js 18+ installed
- A TABS API key (get one here)
- Basic command-line knowledge
Step 1: Initialize Your Project
# Create project directory
mkdir recipe-aggregator
cd recipe-aggregator
# Initialize Node.js project
npm init -y
# Install dependencies
npm install node-fetch dotenv
Step 2: Create Project Structure
mkdir -p src data
touch .env src/extractor.js src/database.js src/search.js src/cli.js
Your project structure:
recipe-aggregator/
├── src/
│ ├── extractor.js # Extract recipes from URLs
│ ├── database.js # Simple JSON storage
│ ├── search.js # Search and filter recipes
│ └── cli.js # Command-line interface
├── data/
│ └── recipes.json # Recipe database
├── .env # API key
└── package.json
Step 3: Configure Environment
Create your .env file:
TABS_API_KEY=your_api_key_here
TABS_API_URL=https://api.tabstack.ai
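Because every request depends on these two values, it can help to fail early when the .env file is missing or still contains the placeholder. The following is a minimal sketch of that idea. The helper name loadConfig is hypothetical (it is not part of the TABS API), and it takes the environment as a parameter so it can be checked without touching process.env.

```javascript
// Hypothetical helper: validate environment settings before making requests.
// The variable names match the .env file above; the function name is an assumption.
function loadConfig(env) {
  const apiKey = (env.TABS_API_KEY || '').trim();
  if (!apiKey || apiKey === 'your_api_key_here') {
    throw new Error('TABS_API_KEY is not set. Add it to your .env file.');
  }
  // Fall back to the default base URL and strip any trailing slashes
  const baseUrl = (env.TABS_API_URL || 'https://api.tabstack.ai').replace(/\/+$/, '');
  return { apiKey, baseUrl };
}
```

One way to use it would be to call loadConfig(process.env) right after dotenv.config() and build request headers from the returned apiKey.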
Update your package.json:
{
...
"type": "module",
"scripts": {
"add": "node src/cli.js add",
"list": "node src/cli.js list",
"search": "node src/cli.js search"
},
...
}
Important: The "type": "module" line is required to use ES6 imports in Node.js.
Building Your Recipe Aggregator
Step 1: Recipe Extractor - The Recipe Parser
This module extracts structured recipe data from any cooking website using the TABS API.
Create src/extractor.js:
// src/extractor.js
import fetch from 'node-fetch';
import dotenv from 'dotenv';
dotenv.config();
const API_KEY = process.env.TABS_API_KEY;
const BASE_URL = process.env.TABS_API_URL || 'https://api.tabstack.ai';
export class RecipeExtractor {
constructor() {
this.headers = {
'Authorization': `Bearer ${API_KEY}`,
'Content-Type': 'application/json'
};
}
/**
* Discover the recipe schema for a given URL
*/
async discoverRecipeSchema(url) {
console.log('🔍 Analyzing recipe page structure...');
const instructions = `Analyze this recipe page and create a schema to extract:
- Recipe title/name
- Description or summary
- Preparation time (in minutes)
- Cooking time (in minutes)
- Total time (in minutes)
- Number of servings
- Difficulty level (if available)
- Cuisine type (Italian, Mexican, etc.)
- Dietary tags (vegetarian, vegan, gluten-free, etc.)
- List of ingredients with quantities
- Step-by-step cooking instructions
- Nutritional information (if available)
- Author or source`;
const response = await fetch(
`${BASE_URL}/v1/extract/json/schema`,
{
method: 'POST',
headers: this.headers,
body: JSON.stringify({
url: url,
instructions: instructions
})
}
);
if (!response.ok) {
throw new Error(`Schema discovery failed: ${response.statusText}`);
}
const schema = await response.json();
console.log('✅ Recipe structure discovered');
return schema;
}
/**
* Extract recipe data using a schema
*/
async extractRecipe(url, schema = null) {
console.log('📖 Extracting recipe data...');
// Use provided schema or the default recipe schema
const recipeSchema = schema || this.getDefaultRecipeSchema();
const response = await fetch(`${BASE_URL}/v1/extract/json`, {
method: 'POST',
headers: this.headers,
body: JSON.stringify({
url: url,
json_schema: recipeSchema
})
});
if (!response.ok) {
throw new Error(`Recipe extraction failed: ${response.statusText}`);
}
const recipeData = await response.json();
console.log('✅ Recipe extracted successfully');
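// Parse time and servings to numbers if possible
const parseTime = (timeStr) => {
if (!timeStr) return null;
const text = String(timeStr);
// Convert "1 hr 30 min" style strings to total minutes
const hours = text.match(/(\d+(?:\.\d+)?)\s*h/i);
const minutes = text.match(/(\d+(?:\.\d+)?)\s*m/i);
if (hours || minutes) {
return Math.round((hours ? parseFloat(hours[1]) * 60 : 0) + (minutes ? parseFloat(minutes[1]) : 0));
}
// No unit found: treat a bare number as minutes
const match = text.match(/(\d+)/);
return match ? parseInt(match[1], 10) : null;
};
const parseServings = (servingsStr) => {
if (!servingsStr) return null;
const match = String(servingsStr).match(/(\d+)/);
return match ? parseInt(match[1], 10) : null;
};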
// Add metadata and parse fields
return {
...recipeData,
total_time: parseTime(recipeData.total_time),
servings: parseServings(recipeData.servings),
source_url: url,
imported_at: new Date().toISOString(),
id: this.generateRecipeId(recipeData.title || 'Untitled Recipe')
};
}
/**
* Get clean markdown version of the recipe page
*/
async getRecipeMarkdown(url) {
console.log('📄 Getting clean recipe text...');
const response = await fetch(
`${BASE_URL}/v1/extract/markdown`,
{
method: 'POST',
headers: this.headers,
body: JSON.stringify({
url: url,
metadata: true
})
}
);
if (!response.ok) {
throw new Error(`Markdown conversion failed: ${response.statusText}`);
}
const data = await response.json();
console.log('✅ Markdown generated');
return data.content;
}
/**
* Default recipe schema for common recipe sites
*/
getDefaultRecipeSchema() {
return {
type: "object",
properties: {
title: {
type: "string",
description: "Recipe title or name"
},
description: {
type: "string",
description: "Recipe description or summary"
},
total_time: {
type: "string",
description: "Total time to make the recipe"
},
servings: {
type: "string",
description: "Number of servings or yield"
},
ingredients: {
type: "array",
items: { type: "string" },
maxItems: 50,
description: "List of ingredients with quantities"
},
instructions: {
type: "array",
items: { type: "string" },
maxItems: 30,
description: "Step-by-step cooking instructions"
}
},
required: ["title", "description", "total_time", "servings", "ingredients", "instructions"],
additionalProperties: false
};
}
/**
* Generate a unique ID for a recipe
*/
generateRecipeId(title) {
const slug = title
.toLowerCase()
.replace(/[^a-z0-9]+/g, '-')
.replace(/^-|-$/g, '');
return `${slug}-${Date.now()}`;
}
}
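The pipeline described earlier starts with schema discovery, while extractRecipe falls back to the default schema when none is passed. A minimal sketch of chaining the two steps is shown here. The helper name importRecipe is hypothetical, and it accepts any object that exposes the discoverRecipeSchema and extractRecipe methods defined above.

```javascript
// Sketch: try a page-specific schema first, then fall back to the default.
// `extractor` is any object with discoverRecipeSchema and extractRecipe methods;
// the helper name importRecipe is an assumption for illustration.
async function importRecipe(extractor, url) {
  let schema = null;
  try {
    schema = await extractor.discoverRecipeSchema(url);
  } catch (error) {
    // Discovery is optional: extractRecipe uses its default schema when given null
    console.warn(`Schema discovery failed, using default schema: ${error.message}`);
  }
  return extractor.extractRecipe(url, schema);
}
```

Note that a discovered schema may name its fields differently from the default one, in which case total_time and servings could come back as null after parsing.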
Step 2: Database - Simple JSON Storage
This module handles storing and retrieving recipes from a JSON file.
Create src/database.js:
// src/database.js
import { promises as fs } from 'fs';
import path from 'path';
export class RecipeDatabase {
constructor(dbPath = './data/recipes.json') {
this.dbPath = dbPath;
}
/**
* Load all recipes from the database
*/
async loadRecipes() {
try {
const data = await fs.readFile(this.dbPath, 'utf8');
return JSON.parse(data);
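} catch (error) {
// Return empty database only if the file doesn't exist yet
if (error.code === 'ENOENT') return { recipes: [] };
// Rethrow anything else (e.g. invalid JSON) so a later save can't overwrite existing data
throw error;
}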
}
/**
* Save recipes to the database
*/
async saveRecipes(data) {
await fs.mkdir(path.dirname(this.dbPath), { recursive: true });
await fs.writeFile(
this.dbPath,
JSON.stringify(data, null, 2),
'utf8'
);
}
/**
* Add a new recipe to the database
*/
async addRecipe(recipe) {
const db = await this.loadRecipes();
// Check for duplicates based on URL
const exists = db.recipes.some(r => r.source_url === recipe.source_url);
if (exists) {
console.log('⚠️ Recipe from this URL already exists in database');
return false;
}
db.recipes.push(recipe);
await this.saveRecipes(db);
console.log(`✅ Recipe "${recipe.title}" added to database`);
return true;
}
/**
* Get a recipe by ID
*/
async getRecipe(id) {
const db = await this.loadRecipes();
return db.recipes.find(r => r.id === id);
}
/**
* Get all recipes
*/
async getAllRecipes() {
const db = await this.loadRecipes();
return db.recipes;
}
/**
* Delete a recipe by ID
*/
async deleteRecipe(id) {
const db = await this.loadRecipes();
const initialLength = db.recipes.length;
db.recipes = db.recipes.filter(r => r.id !== id);
if (db.recipes.length < initialLength) {
await this.saveRecipes(db);
console.log('✅ Recipe deleted');
return true;
}
console.log('⚠️ Recipe not found');
return false;
}
/**
* Get database statistics
*/
async getStats() {
const db = await this.loadRecipes();
const recipes = db.recipes;
const totalWithTime = recipes.filter(r => r.total_time).length;
const avgTime = totalWithTime > 0
? recipes.reduce((sum, r) => sum + (r.total_time || 0), 0) / totalWithTime
: 0;
return {
total: recipes.length,
avgTotalTime: Math.round(avgTime),
withTimeInfo: totalWithTime
};
}
}
Step 3: Search - Find Your Recipes
This module provides search and filtering functionality.
Create src/search.js:
// src/search.js
import { RecipeDatabase } from './database.js';
export class RecipeSearch {
constructor() {
this.db = new RecipeDatabase();
}
/**
* Search recipes by title or description
*/
async searchByText(query) {
const recipes = await this.db.getAllRecipes();
const lowerQuery = query.toLowerCase();
return recipes.filter(recipe => {
const title = (recipe.title || '').toLowerCase();
const description = (recipe.description || '').toLowerCase();
return title.includes(lowerQuery) || description.includes(lowerQuery);
});
}
/**
* Search recipes by ingredient
*/
async searchByIngredient(ingredient) {
const recipes = await this.db.getAllRecipes();
const lowerIngredient = ingredient.toLowerCase();
return recipes.filter(recipe => {
if (!recipe.ingredients) return false;
return recipe.ingredients.some(ing =>
ing.toLowerCase().includes(lowerIngredient)
);
});
}
/**
* Filter recipes by cooking time
*/
async filterByTime(maxMinutes) {
const recipes = await this.db.getAllRecipes();
return recipes.filter(recipe => {
return recipe.total_time && recipe.total_time <= maxMinutes;
});
}
/**
* Get quick recipes (under 30 min)
*/
async getQuickRecipes() {
const recipes = await this.db.getAllRecipes();
return recipes.filter(recipe => {
return recipe.total_time && recipe.total_time < 30;
});
}
/**
* Format recipe for display
*/
formatRecipe(recipe, detailed = false) {
let output = `\n${'='.repeat(60)}\n`;
output += `📖 ${recipe.title}\n`;
output += `${'='.repeat(60)}\n\n`;
if (recipe.description) {
output += `${recipe.description}\n\n`;
}
// Metadata
const metadata = [];
if (recipe.servings) metadata.push(`👥 ${recipe.servings} servings`);
if (recipe.total_time) metadata.push(`⏱️ ${recipe.total_time} min`);
if (metadata.length > 0) {
output += metadata.join(' | ') + '\n\n';
}
if (detailed) {
// Ingredients
if (recipe.ingredients && recipe.ingredients.length > 0) {
output += `📝 INGREDIENTS:\n`;
recipe.ingredients.forEach((ing, i) => {
output += ` ${i + 1}. ${ing}\n`;
});
output += `\n`;
}
// Instructions
if (recipe.instructions && recipe.instructions.length > 0) {
output += `👨‍🍳 INSTRUCTIONS:\n`;
recipe.instructions.forEach((step, i) => {
output += ` ${i + 1}. ${step}\n`;
});
output += `\n`;
}
}
// Source
output += `🔗 Source: ${recipe.source_url}\n`;
output += `📅 Added: ${new Date(recipe.imported_at).toLocaleDateString()}\n`;
return output;
}
}
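The search methods above each apply one criterion. To combine them (for example, quick recipes that use a given ingredient), a single pass over the recipe list is one option. The sketch below assumes recipes shaped like the ones stored by RecipeDatabase. The function name filterRecipes and the criteria object are hypothetical.

```javascript
// Sketch: apply several optional filters in one pass over an in-memory list.
// The function name and the shape of the criteria object are assumptions.
function filterRecipes(recipes, { text, ingredient, maxMinutes } = {}) {
  const lower = (s) => (s || '').toLowerCase();
  return recipes.filter((recipe) => {
    if (text) {
      const haystack = `${lower(recipe.title)} ${lower(recipe.description)}`;
      if (!haystack.includes(lower(text))) return false;
    }
    if (ingredient) {
      const list = recipe.ingredients || [];
      if (!list.some((ing) => lower(ing).includes(lower(ingredient)))) return false;
    }
    if (maxMinutes != null) {
      // Recipes without a known time are excluded, matching filterByTime above
      if (!recipe.total_time || recipe.total_time > maxMinutes) return false;
    }
    return true;
  });
}
```

A call such as filterRecipes(await db.getAllRecipes(), { ingredient: 'chicken', maxMinutes: 30 }) would then return only recipes that satisfy both conditions.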
Step 4: CLI - Command-Line Interface
This is the main interface for interacting with your recipe aggregator.
Create src/cli.js:
// src/cli.js
import { RecipeExtractor } from './extractor.js';
import { RecipeDatabase } from './database.js';
import { RecipeSearch } from './search.js';
class RecipeCLI {
constructor() {
this.extractor = new RecipeExtractor();
this.db = new RecipeDatabase();
this.search = new RecipeSearch();
}
/**
* Add a recipe from a URL
*/
async addRecipe(url) {
console.log(`\n🍳 Recipe Aggregator - Add Recipe\n`);
console.log(`URL: ${url}\n`);
try {
// Extract recipe data
const recipe = await this.extractor.extractRecipe(url);
// Save to database
const added = await this.db.addRecipe(recipe);
if (added) {
console.log(`\n${this.search.formatRecipe(recipe, true)}`);
}
} catch (error) {
console.error(`\n❌ Error: ${error.message}`);
process.exit(1);
}
}
/**
* List all recipes
*/
async listRecipes() {
console.log(`\n📚 Recipe Collection\n`);
try {
const recipes = await this.db.getAllRecipes();
if (recipes.length === 0) {
console.log('No recipes yet. Add one with: npm run add <url>');
return;
}
console.log(`Total recipes: ${recipes.length}\n`);
console.log(`${'='.repeat(80)}\n`);
recipes.forEach((recipe, i) => {
const servings = recipe.servings ? ` | ${recipe.servings} servings` : '';
console.log(`${i + 1}. ${recipe.title}`);
console.log(` ⏱️ ${recipe.total_time || '?'} min${servings}`);
console.log(` 🔗 ${recipe.source_url}`);
console.log('');
});
// Show stats
const stats = await this.db.getStats();
console.log(`${'='.repeat(80)}\n`);
console.log(`📊 Statistics:`);
console.log(` Total recipes: ${stats.total}`);
console.log(` Average time: ${stats.avgTotalTime} min`);
console.log(` Recipes with time info: ${stats.withTimeInfo}`);
} catch (error) {
console.error(`\n❌ Error: ${error.message}`);
process.exit(1);
}
}
/**
* Search recipes
*/
async searchRecipes(query, type = 'text') {
console.log(`\n🔍 Searching recipes...\n`);
try {
let results = [];
switch (type) {
case 'ingredient':
results = await this.search.searchByIngredient(query);
break;
case 'quick':
results = await this.search.getQuickRecipes();
break;
default:
results = await this.search.searchByText(query);
}
if (results.length === 0) {
console.log('No recipes found matching your criteria.');
return;
}
console.log(`Found ${results.length} recipe(s):\n`);
results.forEach((recipe, i) => {
console.log(`${i + 1}. ${recipe.title}`);
const servings = recipe.servings ? ` | ${recipe.servings} servings` : '';
console.log(` ⏱️ ${recipe.total_time || '?'} min${servings}`);
console.log(` 🔗 ${recipe.source_url}\n`);
});
} catch (error) {
console.error(`\n❌ Error: ${error.message}`);
process.exit(1);
}
}
}
// Parse command-line arguments
const [,, command, ...args] = process.argv;
const cli = new RecipeCLI();
switch (command) {
case 'add':
if (!args[0]) {
console.error('Usage: npm run add <recipe-url>');
process.exit(1);
}
cli.addRecipe(args[0]);
break;
case 'list':
cli.listRecipes();
break;
case 'search':
// Allow an empty query string so that `npm run search "" quick` works
if (args[0] === undefined) {
console.error('Usage: npm run search <query> [type]');
console.error('Types: text (default), ingredient, quick');
process.exit(1);
}
cli.searchRecipes(args[0], args[1]);
break;
default:
console.log(`
Recipe Aggregator CLI
Commands:
npm run add <url> Add a recipe from a URL
npm run list List all recipes
npm run search <query> [type] Search recipes
Search types:
text Search by title/description (default)
ingredient Search by ingredient
quick Show quick recipes (under 30 min)
Examples:
npm run add https://www.allrecipes.com/recipe/12345/
npm run search "pasta"
npm run search "chicken" ingredient
npm run search "" quick
`);
}
Running Your Recipe Aggregator
Adding Your First Recipe
Let's add a recipe from any cooking website:
npm run add "https://www.seriouseats.com/basic-tomato-sauce-recipe"
Expected Output:
🍳 Recipe Aggregator - Add Recipe
URL: https://www.seriouseats.com/basic-tomato-sauce-recipe
📖 Extracting recipe data...
✅ Recipe extracted successfully
✅ Recipe "Basic Tomato Sauce" added to database
============================================================
📖 Basic Tomato Sauce
============================================================
A simple, classic tomato sauce that's perfect for pasta, pizza, or
as a base for other Italian dishes.
👥 4 servings | ⏱️ 45 min
📝 INGREDIENTS:
1. 2 tablespoons extra-virgin olive oil
2. 4 cloves garlic, minced
3. 1 (28-ounce) can whole peeled tomatoes
4. 1 teaspoon dried oregano
5. Kosher salt and black pepper
6. 2 tablespoons fresh basil, torn
👨‍🍳 INSTRUCTIONS:
1. Heat oil in a large saucepan over medium heat
2. Add garlic and cook until fragrant, about 1 minute
3. Add tomatoes, crushing them by hand as you add them
4. Add oregano and season with salt and pepper
5. Simmer for 30 minutes, stirring occasionally
6. Stir in fresh basil before serving
🔗 Source: https://www.seriouseats.com/basic-tomato-sauce-recipe
📅 Added: 1/15/2024
Listing All Recipes
View your recipe collection:
npm run list
Expected Output:
📚 Recipe Collection
Total recipes: 5
================================================================================
1. Basic Tomato Sauce
⏱️ 45 min | 4 servings
🔗 https://www.seriouseats.com/basic-tomato-sauce-recipe
2. Thai Green Curry
⏱️ 35 min | 4 servings
🔗 https://www.bonappetit.com/recipe/thai-green-curry
3. Chocolate Chip Cookies
⏱️ 25 min | 24 servings
🔗 https://www.allrecipes.com/recipe/10813/best-chocolate-chip-cookies/
4. Greek Salad
⏱️ 15 min | 6 servings
🔗 https://www.foodnetwork.com/recipes/greek-salad
5. Beef Tacos
⏱️ 30 min | 4 servings
🔗 https://www.mexicanplease.com/beef-tacos/
================================================================================
📊 Statistics:
Total recipes: 5
Average time: 30 min
Recipes with time info: 5
Searching Recipes
Search by title or description:
npm run search "tomato"
Search by ingredient:
npm run search "chicken" ingredient
Find quick recipes (under 30 min):
npm run search "" quick
Example Search Output:
🔍 Searching recipes...
Found 2 recipe(s):
1. Basic Tomato Sauce
⏱️ 45 min | 4 servings
🔗 https://www.seriouseats.com/basic-tomato-sauce-recipe
2. Tomato Basil Soup
⏱️ 30 min | 6 servings
🔗 https://www.allrecipes.com/recipe/39544/garden-fresh-tomato-soup/
Understanding the TABS API Endpoints
This recipe aggregator uses three TABS endpoints:
/v1/extract/json/schema Endpoint
Automatically discovers the structure of recipe pages:
POST https://api.tabstack.ai/v1/extract/json/schema
{
"url": "https://example.com/recipe",
"instructions": "extract recipe data"
}
// Returns a schema like:
{
"type": "object",
"properties": {
"title": { "type": "string" },
"ingredients": { "type": "array", "items": { "type": "string" } },
"instructions": { "type": "array", "items": { "type": "string" } }
}
}
/v1/extract/json Endpoint
Extracts structured recipe data:
POST https://api.tabstack.ai/v1/extract/json
{
"url": "https://example.com/recipe",
"json_schema": { ... }
}
// Returns:
{
"title": "Chocolate Chip Cookies",
"ingredients": ["2 cups flour", "1 cup butter", ...],
"instructions": ["Preheat oven to 350°F", ...]
}
/v1/extract/markdown Endpoint
Gets clean recipe text:
POST https://api.tabstack.ai/v1/extract/markdown
{
"url": "https://example.com/recipe"
}
// Returns clean markdown without ads or clutter
Extending Your Recipe Aggregator
Here are some ideas to make your aggregator even better:
1. Recipe Ratings
Add a rating system:
async rateRecipe(id, rating) {
// Load once so the change is made on the same object that gets saved
const data = await this.db.loadRecipes();
const recipe = data.recipes.find(r => r.id === id);
if (!recipe) return false;
recipe.rating = rating;
recipe.rated_at = new Date().toISOString();
await this.db.saveRecipes(data);
return true;
}
2. Shopping List Generator
Generate a shopping list from selected recipes:
async generateShoppingList(recipeIds) {
const recipes = await Promise.all(
recipeIds.map(id => this.db.getRecipe(id))
);
const ingredients = new Set();
recipes.forEach(recipe => {
// Skip IDs that did not match a stored recipe
recipe?.ingredients?.forEach(ing => ingredients.add(ing));
});
return Array.from(ingredients).sort();
}
3. Meal Planning
Plan your week:
class MealPlanner {
async createWeekPlan(recipeIds) {
const days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday'];
const plan = {};
days.forEach((day, i) => {
plan[day] = recipeIds[i] || null;
});
return plan;
}
}
4. Recipe Scaling
Adjust serving sizes:
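function scaleRecipe(recipe, newServings) {
// Without a known serving count there is nothing to scale against
if (!recipe.servings) return recipe;
const scale = newServings / recipe.servings;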
return {
...recipe,
servings: newServings,
ingredients: recipe.ingredients.map(ing => {
// Parse and scale ingredient quantities
return scaleIngredient(ing, scale);
})
};
}
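The scaleRecipe function above relies on a scaleIngredient helper that is not defined in this tutorial. One possible sketch follows. It assumes the quantity is the leading number of the ingredient string (a whole number, a decimal, a fraction like 1/2, or a mixed number like 1 1/2) and leaves strings without a leading number unchanged. Unit conversion and ranges such as "2-3" are out of scope here.

```javascript
// Sketch of a scaleIngredient helper (an assumption, not part of the original code).
// Scales only the leading quantity; text without a leading number is returned as-is.
function scaleIngredient(ingredient, scale) {
  const match = ingredient.match(/^(\d+\s+\d+\/\d+|\d+\/\d+|\d+(?:\.\d+)?)(.*)$/);
  if (!match) return ingredient;
  const [, quantityText, rest] = match;
  // Sum the parts so that "1 1/2" becomes 1.5
  const quantity = quantityText.split(/\s+/).reduce((sum, part) => {
    if (part.includes('/')) {
      const [numerator, denominator] = part.split('/').map(Number);
      return sum + numerator / denominator;
    }
    return sum + Number(part);
  }, 0);
  if (!Number.isFinite(quantity)) return ingredient;
  // Round to two decimals and drop trailing zeros
  const scaled = Number((quantity * scale).toFixed(2));
  return `${scaled}${rest}`;
}
```

For example, scaleIngredient('1 1/2 cups flour', 2) yields '3 cups flour', while 'Kosher salt and black pepper' passes through untouched.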
5. Export Recipes
Export to different formats:
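async exportRecipe(id, format = 'markdown') {
const recipe = await this.db.getRecipe(id);
// Nothing to export if the ID is unknown
if (!recipe) return null;
switch (format) {
case 'markdown':
return this.toMarkdown(recipe);
case 'json':
return JSON.stringify(recipe, null, 2);
case 'html':
return this.toHTML(recipe);
default:
throw new Error(`Unsupported export format: ${format}`);
}
}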
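The exportRecipe method calls a toMarkdown helper that is left undefined. A minimal sketch of what it could look like is given here, written as a standalone function for clarity. The layout (headings, bullet style) is an assumption, and the field names follow the default recipe schema.

```javascript
// Sketch of a toMarkdown helper; the output layout is an assumption.
function toMarkdown(recipe) {
  const lines = [`# ${recipe.title || 'Untitled Recipe'}`, ''];
  if (recipe.description) lines.push(recipe.description, '');
  const meta = [];
  if (recipe.servings) meta.push(`Servings: ${recipe.servings}`);
  if (recipe.total_time) meta.push(`Total time: ${recipe.total_time} min`);
  if (meta.length > 0) lines.push(meta.join(' | '), '');
  if (recipe.ingredients?.length) {
    lines.push('## Ingredients', '');
    recipe.ingredients.forEach((ing) => lines.push(`- ${ing}`));
    lines.push('');
  }
  if (recipe.instructions?.length) {
    lines.push('## Instructions', '');
    recipe.instructions.forEach((step, i) => lines.push(`${i + 1}. ${step}`));
    lines.push('');
  }
  if (recipe.source_url) lines.push(`Source: ${recipe.source_url}`);
  return lines.join('\n');
}
```

To use it from exportRecipe as written, it could be added as a method on the same class, or the call could be changed to toMarkdown(recipe).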
6. Recipe Tags
Add custom tags:
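async addTag(recipeId, tag) {
// Load once so the change is made on the same object that gets saved
const data = await this.db.loadRecipes();
const recipe = data.recipes.find(r => r.id === recipeId);
if (!recipe) return false;
if (!recipe.custom_tags) {
recipe.custom_tags = [];
}
if (!recipe.custom_tags.includes(tag)) {
recipe.custom_tags.push(tag);
await this.db.saveRecipes(data);
}
return true;
}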
7. Duplicate Detection
Catch near-duplicates by comparing titles, in addition to the URL check already done in addRecipe:
async findSimilarRecipes(title) {
const recipes = await this.db.getAllRecipes();
return recipes.filter(recipe => {
const similarity = this.calculateSimilarity(
title.toLowerCase(),
(recipe.title || '').toLowerCase()
);
return similarity > 0.8;
});
}
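The calculateSimilarity method used above is not defined in this tutorial. One common choice for short strings such as titles is the Sørensen-Dice coefficient over character bigrams, which returns a score between 0 and 1. The sketch below is one possible implementation under that assumption. Other measures, such as an edit distance, would work with the same 0.8 threshold idea but give different scores.

```javascript
// Sketch: Sørensen-Dice coefficient over character bigrams (result in 0..1).
// This is one possible calculateSimilarity; the original does not specify the measure.
function calculateSimilarity(a, b) {
  if (a === b) return 1;
  if (a.length < 2 || b.length < 2) return 0;
  // Count each two-character window in a string
  const bigrams = (s) => {
    const counts = new Map();
    for (let i = 0; i < s.length - 1; i++) {
      const pair = s.slice(i, i + 2);
      counts.set(pair, (counts.get(pair) || 0) + 1);
    }
    return counts;
  };
  const countsA = bigrams(a);
  const countsB = bigrams(b);
  let overlap = 0;
  for (const [pair, count] of countsA) {
    overlap += Math.min(count, countsB.get(pair) || 0);
  }
  return (2 * overlap) / (a.length - 1 + (b.length - 1));
}
```

With this measure, titles that differ by a trailing letter or a small typo score well above 0.8, while unrelated titles score close to 0.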
Troubleshooting
"Schema discovery failed"
Some recipe sites may be harder to analyze:
- Try using the default schema instead
- Simplify the extraction instructions
- Check if the site requires login
"No ingredients extracted"
The page structure may be unusual:
- Check the source URL in a browser
- Look for recipe structured data (JSON-LD)
- Try the /markdown endpoint to see the clean content
Incomplete recipe data
Not all sites include all fields:
- The schema will extract what's available
- Missing fields will be undefined/null
- Consider adding manual fields after import
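To see which fields need manual attention after an import, a small check against the default schema's field list can help. The sketch below assumes the six fields from getDefaultRecipeSchema. The names EXPECTED_FIELDS and findMissingFields are hypothetical.

```javascript
// Sketch: list the default-schema fields that came back empty for a recipe.
// Both names below are assumptions for illustration.
const EXPECTED_FIELDS = ['title', 'description', 'total_time', 'servings', 'ingredients', 'instructions'];

function findMissingFields(recipe) {
  return EXPECTED_FIELDS.filter((field) => {
    const value = recipe[field];
    // An empty list counts as missing, just like null or an empty string
    if (Array.isArray(value)) return value.length === 0;
    return value === undefined || value === null || value === '';
  });
}
```

Printing the result after each import would show at a glance which recipes are incomplete.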
Next Steps
You now have a working recipe aggregator! Here's what to explore next:
- Import your favorite recipes from different cooking sites
- Build a web interface with Express and React
- Add recipe photos by extracting images with /json
- Create collections (weeknight dinners, desserts, etc.)
- Share recipes with friends by exporting to markdown
For more examples, check out the Price Monitor tutorial.