Extract

POST /extract/json

POST /extract/markdown

ExtractJsonResponse = map[unknown]

ExtractMarkdownResponse object { content, url, metadata }

content: string

The markdown content (includes metadata as YAML frontmatter by default)

url: string

The URL that was converted to markdown

format: uri

metadata: optional object { author, created_at, creator, 13 more }

Extracted metadata from the page (only included when metadata parameter is true)

author: optional string

Author information from HTML metadata

created_at: optional string

Document creation date (ISO 8601)

creator: optional string

Creator application (e.g., “Microsoft Word”)

description: optional string

Page description from Open Graph or HTML

image: optional string

Featured image URL from Open Graph

format: uri

keywords: optional array of string

PDF keywords as array

modified_at: optional string

Document modification date (ISO 8601)

page_count: optional number

Number of pages (PDF documents)

pdf_version: optional string

PDF version (e.g., “1.5”)

producer: optional string

PDF producer software (e.g., “Adobe PDF Library”)

publisher: optional string

Publisher information from Open Graph

site_name: optional string

Site name from Open Graph

subject: optional string

PDF subject or summary (PDF-specific metadata, populated for PDF documents)

title: optional string

Page title from Open Graph or HTML

type: optional string

Content type from Open Graph (e.g., article, website)

url: optional string

Canonical URL from Open Graph

format: uri
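The ExtractMarkdownResponse shape above can be mirrored on the client side. The following is a minimal sketch, not part of any official SDK: the class names `ExtractMetadata` and `ExtractMarkdownResponse` and the helpers `parse_markdown_response` and `split_frontmatter` are hypothetical, and the code works on an already-decoded JSON object rather than making a request. It also assumes the YAML frontmatter in `content` is fenced by lines containing only `---`, which the reference does not state explicitly.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class ExtractMetadata:
    # Every field is optional, mirroring the reference above.
    author: Optional[str] = None
    created_at: Optional[str] = None    # ISO 8601
    creator: Optional[str] = None
    description: Optional[str] = None
    image: Optional[str] = None         # URI
    keywords: Optional[List[str]] = None
    modified_at: Optional[str] = None   # ISO 8601
    page_count: Optional[int] = None
    pdf_version: Optional[str] = None
    producer: Optional[str] = None
    publisher: Optional[str] = None
    site_name: Optional[str] = None
    subject: Optional[str] = None
    title: Optional[str] = None
    type: Optional[str] = None
    url: Optional[str] = None           # URI


@dataclass
class ExtractMarkdownResponse:
    content: str
    url: str
    metadata: Optional[ExtractMetadata] = None


def parse_markdown_response(payload: Dict[str, Any]) -> ExtractMarkdownResponse:
    """Build a typed response from an already-decoded JSON object."""
    raw_meta = payload.get("metadata")
    metadata = None
    if raw_meta is not None:
        known = {f.name for f in fields(ExtractMetadata)}
        # Ignore keys this sketch does not know about instead of failing.
        metadata = ExtractMetadata(
            **{k: v for k, v in raw_meta.items() if k in known}
        )
    return ExtractMarkdownResponse(
        content=payload["content"], url=payload["url"], metadata=metadata
    )


def split_frontmatter(content: str) -> Tuple[str, str]:
    """Return (frontmatter, body); frontmatter is "" when none is found.

    Assumption: the frontmatter is fenced by lines containing only `---`.
    """
    lines = content.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", content
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "".join(lines[1:i]), "".join(lines[i + 1:])
    return "", content  # no closing fence: treat everything as body
```

Since `metadata` is only present when the metadata parameter is true, callers of this sketch should check for `None` before reading its fields. The frontmatter is returned as raw text so that any YAML parser can be applied to it afterwards.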