# Extract

## JSON

`client.Extract.Json(ctx, body) (*ExtractJsonResponse, error)`

**post** `/extract/json`

Fetches a URL and extracts structured data according to a provided JSON schema

### Parameters

- `body ExtractJsonParams`

  - `JsonSchema param.Field[any]`

    JSON schema definition that describes the structure of data to extract.

  - `URL param.Field[string]`

    URL to fetch and extract data from

  - `Effort param.Field[ExtractJsonParamsEffort]`

    Fetch effort level controlling speed vs. capability tradeoff. "min": fastest, no fallback (1-5s). "standard": balanced with enhanced reliability (default, 3-15s). "max": full browser rendering for JS-heavy sites (15-60s).

    - `const ExtractJsonParamsEffortMin ExtractJsonParamsEffort = "min"`

    - `const ExtractJsonParamsEffortStandard ExtractJsonParamsEffort = "standard"`

    - `const ExtractJsonParamsEffortMax ExtractJsonParamsEffort = "max"`

  - `GeoTarget param.Field[ExtractJsonParamsGeoTarget]`

    Optional geotargeting parameters for proxy requests

    - `Country string`

      Country code using ISO 3166-1 alpha-2 standard (2 letters, e.g., "US", "GB", "JP").
      See: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2

  - `Nocache param.Field[bool]`

    Bypass cache and force fresh data retrieval

### Returns

- `type ExtractJsonResponse map[string, any]`

### Example

```go
package main

import (
  "context"
  "fmt"

  "github.com/stainless-sdks/tabstack-go"
  "github.com/stainless-sdks/tabstack-go/option"
)

func main() {
  client := tabstack.NewClient(
    option.WithAPIKey("My API Key"),
  )
  response, err := client.Extract.Json(context.TODO(), tabstack.ExtractJsonParams{
    JsonSchema: map[string]any{
    "properties": map[string]any{
    "stories": map[string]any{
    "items": map[string]any{
    "properties": map[string]any{
    "author": map[string]any{
    "description": "Author username",
    "type": "string",
    },
    "points": map[string]any{
    "description": "Story points",
    "type": "number",
    },
    "title": map[string]any{
    "description": "Story title",
    "type": "string",
    },
    },
    "type": "object",
    },
    "type": "array",
    },
    },
    "type": "object",
    },
    URL: "https://news.ycombinator.com",
  })
  if err != nil {
    panic(err.Error())
  }
  fmt.Printf("%+v\n", response)
}
```

#### Response

```json
{
  "foo": "bar"
}
```

## Markdown

`client.Extract.Markdown(ctx, body) (*ExtractMarkdownResponse, error)`

**post** `/extract/markdown`

Fetches a URL and converts its HTML content to clean Markdown format with optional metadata extraction

### Parameters

- `body ExtractMarkdownParams`

  - `URL param.Field[string]`

    URL to fetch and convert to markdown

  - `Effort param.Field[ExtractMarkdownParamsEffort]`

    Fetch effort level controlling speed vs. capability tradeoff. "min": fastest, no fallback (1-5s). "standard": balanced with enhanced reliability (default, 3-15s). "max": full browser rendering for JS-heavy sites (15-60s).

    - `const ExtractMarkdownParamsEffortMin ExtractMarkdownParamsEffort = "min"`

    - `const ExtractMarkdownParamsEffortStandard ExtractMarkdownParamsEffort = "standard"`

    - `const ExtractMarkdownParamsEffortMax ExtractMarkdownParamsEffort = "max"`

  - `GeoTarget param.Field[ExtractMarkdownParamsGeoTarget]`

    Optional geotargeting parameters for proxy requests

    - `Country string`

      Country code using ISO 3166-1 alpha-2 standard (2 letters, e.g., "US", "GB", "JP").
      See: https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2

  - `Metadata param.Field[bool]`

    Include extracted metadata (Open Graph and HTML metadata) as a separate field in the response

  - `Nocache param.Field[bool]`

    Bypass cache and force fresh data retrieval

### Returns

- `type ExtractMarkdownResponse struct{…}`

  - `Content string`

    The markdown content (includes metadata as YAML frontmatter by default)

  - `URL string`

    The URL that was converted to markdown

  - `Metadata ExtractMarkdownResponseMetadata`

    Extracted metadata from the page (only included when metadata parameter is true)

    - `Author string`

      Author information from HTML metadata

    - `CreatedAt string`

      Document creation date (ISO 8601)

    - `Creator string`

      Creator application (e.g., "Microsoft Word")

    - `Description string`

      Page description from Open Graph or HTML

    - `Image string`

      Featured image URL from Open Graph

    - `Keywords []string`

      PDF keywords as array

    - `ModifiedAt string`

      Document modification date (ISO 8601)

    - `PageCount int64`

      Number of pages (PDF documents)

    - `PdfVersion string`

      PDF version (e.g., "1.5")

    - `Producer string`

      PDF producer software (e.g., "Adobe PDF Library")

    - `Publisher string`

      Publisher information from Open Graph

    - `SiteName string`

      Site name from Open Graph

    - `Subject string`

      PDF-specific metadata fields (populated for PDF documents)
      PDF subject or summary

    - `Title string`

      Page title from Open Graph or HTML

    - `Type string`

      Content type from Open Graph (e.g., article, website)

    - `URL string`

      Canonical URL from Open Graph

### Example

```go
package main

import (
  "context"
  "fmt"

  "github.com/stainless-sdks/tabstack-go"
  "github.com/stainless-sdks/tabstack-go/option"
)

func main() {
  client := tabstack.NewClient(
    option.WithAPIKey("My API Key"),
  )
  response, err := client.Extract.Markdown(context.TODO(), tabstack.ExtractMarkdownParams{
    URL: "https://example.com/blog/article",
  })
  if err != nil {
    panic(err.Error())
  }
  fmt.Printf("%+v\n", response.Content)
}
```

#### Response

```json
{
  "content": "# Example Article Title\n\nThis is the article content converted to markdown...",
  "metadata": {
    "author": "Example Author",
    "description": "This is an example article description",
    "image": "https://example.com/images/article.jpg",
    "publisher": "Example Publisher",
    "site_name": "Example Blog",
    "title": "Example Article Title",
    "type": "article",
    "url": "https://example.com/blog/article"
  },
  "url": "https://example.com/blog/article"
}
```