# Extract

## JSON

`client.Extract.Json(ctx, body) (*ExtractJsonResponse, error)`

**post** `/extract/json`

Fetches a URL and extracts structured data according to a provided JSON schema

### Parameters

- `body ExtractJsonParams`

  - `JsonSchema param.Field[any]`

    JSON schema definition that describes the structure of data to extract.

  - `URL param.Field[string]`

    URL to fetch and extract data from

  - `Effort param.Field[ExtractJsonParamsEffort]`

    Fetch effort level controlling speed vs. capability tradeoff. "min": fastest, no fallback (1-5s). "standard": balanced with enhanced reliability (default, 3-15s). "max": full browser rendering for JS-heavy sites (15-60s).

    - `const ExtractJsonParamsEffortMin ExtractJsonParamsEffort = "min"`

    - `const ExtractJsonParamsEffortStandard ExtractJsonParamsEffort = "standard"`

    - `const ExtractJsonParamsEffortMax ExtractJsonParamsEffort = "max"`

  - `GeoTarget param.Field[ExtractJsonParamsGeoTarget]`

    Optional geotargeting parameters for proxy requests

    - `Country string`

      Country code using ISO 3166-1 alpha-2 standard (2 letters, e.g., "US", "GB", "JP"). See: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2

  - `Nocache param.Field[bool]`

    Bypass cache and force fresh data retrieval

### Returns

- `type ExtractJsonResponse map[string]any`

### Example

```go
package main

import (
	"context"
	"fmt"

	"github.com/stainless-sdks/tabstack-go"
	"github.com/stainless-sdks/tabstack-go/option"
)

func main() {
	client := tabstack.NewClient(
		option.WithAPIKey("My API Key"),
	)
	response, err := client.Extract.Json(context.TODO(), tabstack.ExtractJsonParams{
		JsonSchema: map[string]any{
			"properties": map[string]any{
				"stories": map[string]any{
					"items": map[string]any{
						"properties": map[string]any{
							"author": map[string]any{
								"description": "Author username",
								"type":        "string",
							},
							"points": map[string]any{
								"description": "Story points",
								"type":        "number",
							},
							"title": map[string]any{
								"description": "Story title",
								"type":        "string",
							},
						},
						"type": "object",
					},
					"type": "array",
				},
			},
			"type": "object",
		},
		URL: "https://news.ycombinator.com",
	})
	if err != nil {
		panic(err.Error())
	}
	fmt.Printf("%+v\n", response)
}
```

#### Response

```json
{
  "foo": "bar"
}
```

## Markdown

`client.Extract.Markdown(ctx, body) (*ExtractMarkdownResponse, error)`

**post** `/extract/markdown`

Fetches a URL and converts its HTML content to clean Markdown format with optional metadata extraction

### Parameters

- `body ExtractMarkdownParams`

  - `URL param.Field[string]`

    URL to fetch and convert to markdown

  - `Effort param.Field[ExtractMarkdownParamsEffort]`

    Fetch effort level controlling speed vs. capability tradeoff. "min": fastest, no fallback (1-5s). "standard": balanced with enhanced reliability (default, 3-15s). "max": full browser rendering for JS-heavy sites (15-60s).

    - `const ExtractMarkdownParamsEffortMin ExtractMarkdownParamsEffort = "min"`

    - `const ExtractMarkdownParamsEffortStandard ExtractMarkdownParamsEffort = "standard"`

    - `const ExtractMarkdownParamsEffortMax ExtractMarkdownParamsEffort = "max"`

  - `GeoTarget param.Field[ExtractMarkdownParamsGeoTarget]`

    Optional geotargeting parameters for proxy requests

    - `Country string`

      Country code using ISO 3166-1 alpha-2 standard (2 letters, e.g., "US", "GB", "JP"). See: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2

  - `Metadata param.Field[bool]`

    Include extracted metadata (Open Graph and HTML metadata) as a separate field in the response

  - `Nocache param.Field[bool]`

    Bypass cache and force fresh data retrieval

### Returns

- `type ExtractMarkdownResponse struct{…}`

  - `Content string`

    The markdown content (includes metadata as YAML frontmatter by default)

  - `URL string`

    The URL that was converted to markdown

  - `Metadata ExtractMarkdownResponseMetadata`

    Extracted metadata from the page (only included when metadata parameter is true)

    - `Author string`

      Author information from HTML metadata

    - `CreatedAt string`

      Document creation date (ISO 8601)

    - `Creator string`

      Creator application (e.g., "Microsoft Word")

    - `Description string`

      Page description from Open Graph or HTML

    - `Image string`

      Featured image URL from Open Graph

    - `Keywords []string`

      PDF keywords as array

    - `ModifiedAt string`

      Document modification date (ISO 8601)

    - `PageCount int64`

      Number of pages (PDF documents)

    - `PdfVersion string`

      PDF version (e.g., "1.5")

    - `Producer string`

      PDF producer software (e.g., "Adobe PDF Library")

    - `Publisher string`

      Publisher information from Open Graph

    - `SiteName string`

      Site name from Open Graph

    - `Subject string`

      PDF subject or summary (PDF-specific metadata fields are populated for PDF documents)

    - `Title string`

      Page title from Open Graph or HTML

    - `Type string`

      Content type from Open Graph (e.g., article, website)

    - `URL string`

      Canonical URL from Open Graph

### Example

```go
package main

import (
	"context"
	"fmt"

	"github.com/stainless-sdks/tabstack-go"
	"github.com/stainless-sdks/tabstack-go/option"
)

func main() {
	client := tabstack.NewClient(
		option.WithAPIKey("My API Key"),
	)
	response, err := client.Extract.Markdown(context.TODO(), tabstack.ExtractMarkdownParams{
		URL: "https://example.com/blog/article",
	})
	if err != nil {
		panic(err.Error())
	}
	fmt.Printf("%+v\n", response.Content)
}
```

#### Response

```json
{
  "content": "# Example Article Title\n\nThis is the article content converted to markdown...",
  "metadata": {
    "author": "Example Author",
    "description": "This is an example article description",
    "image": "https://example.com/images/article.jpg",
    "publisher": "Example Publisher",
    "site_name": "Example Blog",
    "title": "Example Article Title",
    "type": "article",
    "url": "https://example.com/blog/article"
  },
  "url": "https://example.com/blog/article"
}
```
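
The `/extract/json` endpoint returns an untyped map, so callers usually want to convert it into concrete Go types that match the schema they sent. The following is a minimal sketch of one way to do that with only the standard library, using the `stories` schema from the JSON example. The `Story` and `StoriesResult` types and the `decodeStories` helper are hypothetical names introduced here for illustration and are not part of the SDK; the sample map in `main` is made-up data standing in for a real response.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Story mirrors one item of the "stories" array in the example schema.
// Hypothetical type, not provided by the SDK.
type Story struct {
	Title  string  `json:"title"`
	Author string  `json:"author"`
	Points float64 `json:"points"`
}

// StoriesResult mirrors the top-level object of the example schema.
// Hypothetical type, not provided by the SDK.
type StoriesResult struct {
	Stories []Story `json:"stories"`
}

// decodeStories converts an untyped response map into a StoriesResult by
// re-encoding the map as JSON and unmarshalling it into the typed struct.
func decodeStories(resp map[string]any) (StoriesResult, error) {
	var out StoriesResult
	raw, err := json.Marshal(resp)
	if err != nil {
		return out, fmt.Errorf("re-encode response: %w", err)
	}
	if err := json.Unmarshal(raw, &out); err != nil {
		return out, fmt.Errorf("decode stories: %w", err)
	}
	return out, nil
}

func main() {
	// Made-up sample data with the shape the example schema describes.
	resp := map[string]any{
		"stories": []any{
			map[string]any{"title": "Example story", "author": "alice", "points": 42.0},
		},
	}

	result, err := decodeStories(resp)
	if err != nil {
		panic(err)
	}
	for _, s := range result.Stories {
		fmt.Printf("%s by %s (%.0f points)\n", s.Title, s.Author, s.Points)
	}
}
```

The JSON round trip costs an extra encode and decode, but it reuses the struct tags for field mapping and reports a type mismatch (for example, a string where the schema promised a number) as an error instead of a failed type assertion.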