Raw OpenAPI specification (YAML - inlined for the scrapers that need it)
openapi: 3.1.0
info:
title: Riveter API
description: |
## Overview
The Riveter API lets you **build datasets** and **run enrichments** programmatically.
- **A dataset** is a collection of rows — companies, people, URLs, or anything else you want to work with. You can build one from a natural-language prompt or a structured spec, and Riveter will generate the rows for you.
- **An enrichment** takes rows of input data and fills in new columns using AI, web scraping, and other tools. For example, given a list of companies, an enrichment can look up each company's revenue, employee count, and CEO.
## Four ways to use the API
### 1. Build a new dataset and enrich it (easiest)
Describe a dataset and Riveter will generate the dataset and enrich it in one step. Just provide a prompt (e.g. "top 50 SaaS companies"), the attributes you want to find (e.g. "CEO", "revenue"), and set `auto_run_enrichment: true`.
This is the fastest way to go from idea to enriched data — no setup required.
**Key endpoints:**
- [/build_dataset](#tag/dataset-builder/post/build_dataset) — generate rows and auto-run enrichment with `auto_run_enrichment: true`
- [/run_status](#tag/runs/get/run_status) — check progress (or use `webhook_url`)
- [/run_data](#tag/runs/get/run_data) — get the enriched results
### 2. Run a new enrichment with your own data
Already have input data? Define everything in a single API request — no UI setup required.
**Option A: Prompt + attributes (recommended)** — provide your input data, a natural-language prompt, and output column names. The AI generates the full configuration automatically.
**Option B: Full specification** — define exact prompts, tools, and formatting for each output column when you need precise control.
**Key endpoints:**
- [/run_new_enrichment](#tag/enrichments/post/run_new_enrichment) — execute with prompt + attributes or full configuration
- [/run_status](#tag/runs/get/run_status) — check progress (or use `webhook_url`)
- [/run_data](#tag/runs/get/run_data) — get the enriched results
### 3. Run an existing enrichment
Build and test your enrichment in the [Riveter UI](https://app.riveterhq.com/enrichments), then run it via the API by passing new input data. The enrichment already stores your prompts, tool settings, and output format — you just supply new rows.
This is ideal when you've fine-tuned an enrichment in the UI and want to deploy it to production.
**Key endpoints:**
- [/run_enrichment](#tag/enrichments/post/run_enrichment) — execute your enrichment with input data
- [/run_status](#tag/runs/get/run_status) — check progress (or use `webhook_url`)
- [/run_data](#tag/runs/get/run_data) — get the enriched results
### 4. Build a dataset for an existing enrichment
Have an enrichment but need new input data? Use `/build_dataset_from_enrichment` to generate rows that match your enrichment's expected input columns. The dataset builder derives identifiers from your enrichment's source-data columns automatically.
Optionally set `auto_run_enrichment: true` to run the enrichment automatically when the dataset completes.
**Key endpoints:**
- [/build_dataset_from_enrichment](#tag/dataset-builder/post/build_dataset_from_enrichment) — generate rows matching your enrichment's structure
- [/dataset_build_status](#tag/dataset-builder/get/dataset_build_status) — check progress (or use `dataset_webhook_url`)
- [/run_enrichment_from_dataset](#tag/dataset-builder/post/run_enrichment_from_dataset) — run the enrichment on the generated data
## Webhooks
Instead of polling, pass a `webhook_url` in the JSON body when starting a run and we'll POST the results to your URL when it finishes.
1. Include `webhook_url` in the JSON body of your `/run_new_enrichment` or `/run_enrichment` request
2. Your run executes normally
3. When complete, we POST the full results (same format as `/run_data`) to your webhook URL
```bash
curl -X POST "https://api.riveterhq.com/v1/run_enrichment?enrichment_uuid=xxx" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"webhook_url": "https://your-server.com/webhook",
"input": {"Company": ["Apple", "Google"]}
}'
```
**Webhook payload:**
```json
{
"event": "run.completed",
"run_key": "abc-123",
"status": "success",
"enrichment_uuid": "...",
"enrichment_name": "My Enrichment",
"credits_used": 2.0,
"completed_at": "2024-01-15T12:00:00Z",
"formatted_data": {
"Company": [{"value": "Apple"}, {"value": "Google"}],
"Revenue": [{"value": "383000000000"}, {"value": "307000000000"}]
}
}
```
**Events:** `run.completed` (success), `run.stopped` (manually stopped)
**Retries:** Failed deliveries are retried up to 2 times. Your endpoint should return a 2xx status code.
Dataset builds also support webhooks — pass `dataset_webhook_url` to `/build_dataset`.
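A receiving endpoint mostly needs to parse the JSON body, branch on `event`, and answer with a 2xx status. The sketch below is a framework-free illustration in Python, not an official client: `handle_webhook` and `columns_to_rows` are hypothetical helper names, and it assumes the payload shape shown above (columnar `formatted_data` whose cells are `{"value": ...}` objects of equal length per column).

```python
import json


def columns_to_rows(formatted_data):
    """Turn columnar data {"Col": [{"value": v}, ...]} into a list of row dicts."""
    headers = list(formatted_data)
    if not headers:
        return []
    n_rows = len(formatted_data[headers[0]])
    return [
        {h: formatted_data[h][i].get("value") for h in headers}
        for i in range(n_rows)
    ]


def handle_webhook(raw_body):
    """Return (http_status, rows) for one webhook delivery.

    Assumption: we answer 200 for any well-formed delivery (so it is not
    retried) and 400 for bodies that cannot be parsed as a JSON object.
    """
    try:
        payload = json.loads(raw_body)
    except (ValueError, TypeError):
        return 400, []
    if not isinstance(payload, dict):
        return 400, []
    if payload.get("event") not in ("run.completed", "run.stopped"):
        # Unknown event: acknowledge it, but do nothing with it.
        return 200, []
    # A stopped run may carry no data, so fall back to an empty mapping.
    return 200, columns_to_rows(payload.get("formatted_data") or {})
```

Plug `handle_webhook` into whatever HTTP framework you already use and return the first element of the tuple as the response status.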
## Authentication
All endpoints require an API key via the Authorization header:
```
Authorization: Bearer YOUR_API_KEY
```
[Get an API key here](https://app.riveterhq.com/settings/api)
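As a rough illustration of the header in practice, the Python sketch below builds an authenticated request with the standard library only. `build_request` is a hypothetical helper name rather than part of any Riveter SDK; the base URL is the production server listed in this specification.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.riveterhq.com/v1"  # production server from this spec


def build_request(method, path, api_key, params=None, body=None):
    """Build (but do not send) an authenticated request."""
    url = BASE_URL + path
    if params:
        url += "?" + urllib.parse.urlencode(params)
    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if body is not None:
        # JSON bodies need an explicit content type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Sending the request is then, for example:
#   with urllib.request.urlopen(req) as resp:
#       result = json.load(resp)
```

Keeping request construction separate from sending makes it easy to log or unit-test what will go over the wire.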
## Rate limiting
Default: 30 requests per minute. You can send up to 1,000 rows per request.
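If you have more than 1,000 rows, one way to stay within the per-request limit is to split the columnar input into batches before sending. This is a minimal sketch; `chunk_input` is a hypothetical helper, and it assumes the columnar `{"Column": [values...]}` input shape used by the enrichment endpoints.

```python
def chunk_input(columns, max_rows=1000):
    """Split columnar input into batches of at most max_rows rows each."""
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same number of rows")
    total = lengths.pop() if lengths else 0
    return [
        {name: values[start:start + max_rows] for name, values in columns.items()}
        for start in range(0, total, max_rows)
    ]
```

At the default limit of 30 requests per minute, spacing batch submissions about 2 seconds apart keeps you under the cap on average.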
## Response format
All responses include a `request_status` field (`success` or `error`).
## MCP server
Use Riveter from Claude, Cursor, or any MCP-compatible AI assistant. [Get an API key](https://app.riveterhq.com/settings/api), then:
**Claude Code:**
```bash
claude mcp add riveter --env RIVETER_API_KEY=YOUR_API_KEY -- npx -y riveter-mcp-server
```
**Codex:**
```bash
codex mcp add riveter --env RIVETER_API_KEY=YOUR_API_KEY -- npx -y riveter-mcp-server
```
**Cursor / Windsurf / Claude Desktop** — paste into your MCP config:
```json
{
"mcpServers": {
"riveter": {
"command": "npx",
"args": ["-y", "riveter-mcp-server"],
"env": {
"RIVETER_API_KEY": "YOUR_API_KEY"
}
}
}
}
```
The MCP server dynamically exposes all API endpoints as tools, with full descriptions and typed parameters. No setup beyond the API key.
version: 1.0.4
contact:
name: Riveter Support
url: https://riveterhq.com
servers:
- url: https://api.riveterhq.com/v1
description: Production server
security:
- ApiKeyAuth: []
paths:
/run_enrichment:
post:
summary: run_enrichment
description: |
Run an existing enrichment with input data. The enrichment must be API-enabled (you can turn this on from your enrichment view).
**Recommended:** Pass a `webhook_url` in the JSON body to receive results when the run completes — this is more efficient than polling. If you must poll, use `/run_status` with the returned `run_key` (suggested interval: 5–10 seconds). Grab data with `/run_data`.
## Quick Example
```bash
curl -X POST "https://api.riveterhq.com/v1/run_enrichment?enrichment_uuid=YOUR_ENRICHMENT_UUID" \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"input": {"Company Name": ["Acme Corp", "Tech Solutions Inc"]}}'
```
The enrichment UUID comes from the enrichment's URL (e.g. app.riveterhq.com/enrichments/YOUR_ENRICHMENT_UUID).
## After running
Use the [/run_data](#tag/runs/get/run_data) endpoint to get results as they become available, or pass `webhook_url` in the body to receive them automatically.
## Input Format
Input data should be a JSON object where:
- Keys are column headers from your enrichment's sheet
- Values are arrays of strings (all arrays must be the same length)
- Only source data columns are required (columns marked as "source data" in your enrichment)
- Maximum 1,000 rows per request
Optionally pass a `run_key`; if you omit it, one is generated and returned. Use it later to poll for the [status](#tag/runs/get/run_status) of your enrichment run or to fetch the final [data](#tag/runs/get/run_data).
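The input rules above can be checked locally before a request is sent. The following Python sketch is one possible pre-flight check, not the server's own validation: `validate_input` is a hypothetical helper, and treating an empty object as invalid is an assumption.

```python
def validate_input(columns, required_columns=(), max_rows=1000):
    """Return a list of problems found in columnar input; empty means it looks valid."""
    if not isinstance(columns, dict) or not columns:
        # Assumption: an empty input object is never useful.
        return ["input must be a non-empty JSON object"]
    problems = []
    for name in required_columns:
        if name not in columns:
            problems.append(f"missing source column: {name}")
    lengths = set()
    for name, values in columns.items():
        if not isinstance(values, list) or not all(isinstance(v, str) for v in values):
            problems.append(f"column {name!r} must be an array of strings")
            continue
        lengths.add(len(values))
    if len(lengths) > 1:
        problems.append("all arrays must be the same length")
    if lengths and max(lengths) > max_rows:
        problems.append(f"at most {max_rows} rows per request")
    return problems
```

Pass your enrichment's source data column names as `required_columns` to catch missing columns early.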
operationId: runExistingEnrichment
tags:
- Enrichments
parameters:
- name: enrichment_uuid
in: query
required: true
description: UUID of the enrichment to run (project_uuid is also accepted)
schema:
type: string
format: uuid
- name: run_key
in: query
required: false
description: Custom identifier for this run (optional, will be generated if not provided)
schema:
type: string
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/RunExistingEnrichmentRequest"
examples:
basic_enrichment:
summary: Basic company enrichment
value:
input:
"Company Name": ["Acme Corp", "Tech Solutions Inc"]
"Website": ["acme.com", "techsolutions.com"]
with_webhook:
summary: With webhook delivery
value:
input:
"Company Name": ["Acme Corp", "Tech Solutions Inc"]
"Website": ["acme.com", "techsolutions.com"]
webhook_url: "https://your-server.com/webhook"
responses:
"200":
description: Enrichment run initiated successfully
content:
application/json:
schema:
$ref: "#/components/schemas/EnrichmentRunResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
"409":
$ref: "#/components/responses/Conflict"
"422":
$ref: "#/components/responses/UnprocessableEntity"
/enrichment:
get:
summary: get_enrichment
description: |
Retrieve the structure of an existing enrichment: input column names and the full output specification
(prompts, contexts, tools, formats, etc.) keyed by column header. Use this to inspect configuration before
updating it with PATCH, or to round-trip config in your automation.
Does not return row data — only the enrichment configuration.
operationId: getEnrichment
tags:
- Enrichments
parameters:
- name: enrichment_uuid
in: query
required: true
description: UUID of the enrichment (project_uuid is also accepted)
schema:
type: string
format: uuid
responses:
"200":
description: Enrichment structure retrieved successfully
content:
application/json:
schema:
$ref: "#/components/schemas/EnrichmentStructureResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
patch:
summary: update_enrichment
description: |
Partially update an existing enrichment's output columns in place. The same `enrichment_uuid` is preserved.
The request body is keyed by **column header** (output column display name).
- **Update** an existing output column: include only the keys you want to change. When a key is present, its value **fully replaces** the previous value for that key (e.g. sending `contexts` replaces all contexts).
- **Add** a new output column: use a header name that does not exist yet and include the full column configuration (same required fields as [/run_new_enrichment](#tag/enrichments/post/run_new_enrichment) output columns — `prompt` and `contexts` for agent mode, or `tool` plus its parameters for tool-only mode).
- **Delete** an output column: send `{ "delete": true }` for that column header. You cannot delete input columns or the last remaining output column.
Supported keys per column match the full output specification from [/run_new_enrichment](#tag/enrichments/post/run_new_enrichment):
`prompt`, `contexts`, `tools`, `format`, `format_details`, `run_when`, `run_when_config`, and
tool-only fields (`tool` plus its parameters). Expand **CEO** or **Industry** in the request schema for the full field list.
You may send the column map at the top level of the body, or wrap it in an `output` property (see below).
This endpoint updates configuration only — it does not run the enrichment. Use [/run_enrichment](#tag/enrichments/post/run_enrichment) afterward.
**Update one column's prompt and contexts:**
```json
{
"CEO": {
"prompt": "Find the current CEO of this company using recent news and filings",
"contexts": ["Company Name", "Website"]
}
}
```
**Update another column's format and tools:**
```json
{
"Industry": {
"format": "tag",
"format_details": {
"options": ["SaaS", "Fintech", "Healthcare", "Other"],
"allow_multiple": false
},
"tools": ["web_search", "scrape"]
}
}
```
**Update multiple columns in one request:**
```json
{
"CEO": {
"prompt": "Find the current CEO of this company using recent news and filings",
"contexts": ["Company Name", "Website"]
},
"Industry": {
"prompt": "What industry is this company in?",
"contexts": ["Company Name"],
"format": "tag",
"format_details": {
"options": ["SaaS", "Fintech", "Healthcare", "Other"]
},
"tools": ["web_search", "scrape"]
}
}
```
**Same payloads with an `output` wrapper** (optional):
```json
{
"output": {
"CEO": {
"prompt": "Find the current CEO of this company using recent news and filings",
"contexts": ["Company Name", "Website"]
},
"Industry": {
"format": "tag",
"format_details": {
"options": ["SaaS", "Fintech", "Healthcare", "Other"]
},
"tools": ["web_search", "scrape"]
}
}
}
```
**Add a new output column:**
```json
{
"Annual Revenue": {
"prompt": "Find the latest annual revenue for this company",
"contexts": ["Company Name", "Website"]
}
}
```
**Delete an output column:**
```json
{
"CEO": {
"delete": true
}
}
```
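To make the update, add, and delete rules concrete, here is a small local model of them in Python. It only mirrors the semantics described above and is not the server implementation; `apply_patch` is a hypothetical name, the `output` wrapper is not handled, and silently ignoring a delete for an unknown column is an assumption.

```python
import copy


def apply_patch(output_columns, patch):
    """Apply a column-keyed update map to a copy of an enrichment's output columns."""
    result = copy.deepcopy(output_columns)
    for header, changes in patch.items():
        if changes.get("delete") is True:
            if header in result:
                if len(result) == 1:
                    raise ValueError("cannot delete the last remaining output column")
                del result[header]
        elif header in result:
            # Keys that are present fully replace the old value;
            # keys that are absent stay untouched.
            result[header].update(copy.deepcopy(changes))
        else:
            # Unknown header: treated as a new output column.
            result[header] = copy.deepcopy(changes)
    return result
```

Running a planned body through a model like this against the output of `GET /enrichment` can show what the configuration should look like after the PATCH.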
operationId: updateEnrichment
tags:
- Enrichments
parameters:
- name: enrichment_uuid
in: query
required: true
description: UUID of the enrichment to update (project_uuid is also accepted)
schema:
type: string
format: uuid
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/UpdateEnrichmentRequestBody"
example:
"CEO":
prompt: "Find the current CEO of this company using recent news and filings"
contexts: ["Company Name", "Website"]
"Industry":
prompt: "What industry is this company in?"
contexts: ["Company Name"]
format: "tag"
format_details:
options: ["SaaS", "Fintech", "Healthcare", "Other"]
tools: ["web_search", "scrape"]
responses:
"200":
description: Enrichment updated successfully
content:
application/json:
schema:
$ref: "#/components/schemas/EnrichmentStructureResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
"422":
$ref: "#/components/responses/UnprocessableEntity"
/run_new_enrichment:
post:
summary: run_new_enrichment
description: |
Create and run a new enrichment in a single request. There are two ways to define your output columns:
### Option 1: Prompt + attributes (recommended)
Provide a natural-language `prompt` describing what you want and an `attributes` array listing the output column names. The AI will automatically generate the full output configuration (prompts, contexts, tools, formats) for each attribute. This is the easiest option and is often the best choice — just describe what you need and let the AI handle the rest.
### Option 2: Full output specification (not recommended)
Define the exact structure of each output column (prompts, contexts, tools, formats). Use this when you need precise control over how each column is enriched.
Provide exactly one of the two: `prompt` together with `attributes`, **or** `output`. Sending both, or neither, is an error.
**Recommended:** Pass a `webhook_url` in the JSON body to receive results when the run completes — this is more efficient than polling. If you must poll, use `/run_status` with the returned `run_key` (suggested interval: 5–10 seconds). Grab data with `/run_data`.
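The either/or rule is easy to check on the client before sending. A minimal Python sketch, assuming the field names above; `check_output_mode` is a hypothetical helper and the error messages are illustrative, not the API's own.

```python
def check_output_mode(body):
    """Return "prompt_attributes" or "output"; raise ValueError for invalid combinations."""
    has_prompt = bool(body.get("prompt"))
    has_attributes = bool(body.get("attributes"))
    has_output = bool(body.get("output"))
    if has_output and (has_prompt or has_attributes):
        raise ValueError("send either prompt + attributes or output, not both")
    if has_output:
        return "output"
    if has_prompt and has_attributes:
        return "prompt_attributes"
    raise ValueError("send both prompt and attributes, or send output")
```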
operationId: runNewEnrichment
tags:
- Enrichments
parameters:
- name: run_key
in: query
required: false
description: Custom identifier for this run (optional, will be generated if not provided)
schema:
type: string
requestBody:
required: true
content:
application/json:
schema:
$ref: "#/components/schemas/RunNewEnrichmentRequest"
examples:
prompt_and_attributes:
summary: "Option 1: Prompt + attributes (recommended)"
value:
input:
"Company Name": ["Acme Corp", "Tech Solutions Inc"]
"Website": ["acme.com", "techsolutions.com"]
prompt: "Find key business information about these companies"
attributes:
["Employee Count", "Industry", "Annual Revenue", "CEO"]
prompt_and_attributes_with_webhook:
summary: "Option 1 + webhook delivery"
value:
input:
"Company Name": ["Acme Corp", "Tech Solutions Inc"]
"Website": ["acme.com", "techsolutions.com"]
prompt: "Find key business information about these companies"
attributes:
["Employee Count", "Industry", "Annual Revenue", "CEO"]
webhook_url: "https://your-server.com/webhook"
full_output_specification:
summary: "Option 2: Full output specification"
value:
input:
"Company Name": ["Acme Corp", "Tech Solutions Inc"]
"Website": ["acme.com", "techsolutions.com"]
output:
"Employee Count":
prompt: "Find the number of employees at this company"
contexts: ["Company Name", "Website"]
format: "number"
"Industry":
prompt: "What industry is this company in?"
contexts: ["Company Name"]
format: "text"
tool_only_code:
summary: "Tool-only: JavaScript code execution"
value:
input:
"First Name": ["Jane", "John"]
"Last Name": ["Doe", "Smith"]
"Revenue": ["1500000", "3200000"]
"Employees": ["50", "120"]
output:
"Full Name":
tool: "code"
code: "return `${args.first} ${args.last}`"
args:
first: "First Name"
last: "Last Name"
"Revenue Per Employee":
tool: "code"
code: "const r = parseFloat(args.revenue) || 0; const e = parseInt(args.employees) || 1; return (r / e).toFixed(2)"
args:
revenue: "Revenue"
employees: "Employees"
format: "number"
responses:
"200":
description: Run initiated successfully
content:
application/json:
schema:
$ref: "#/components/schemas/EnrichmentRunResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"409":
$ref: "#/components/responses/Conflict"
"422":
$ref: "#/components/responses/UnprocessableEntity"
/monitor_enrichment:
post:
summary: monitor_enrichment
description: |
Create a monitor for an enrichment. Monitors run your enrichment on a schedule and can send webhook notifications with results.
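The schedule fields in this endpoint's request schema depend on the cadence: `day_of_week` is required for weekly monitors and `day_of_month` for monthly ones. The Python sketch below assembles a request body while enforcing the ranges given in the schema. `build_monitor_body` is a hypothetical helper, and the checks are a client-side convenience, not a description of server behavior.

```python
def build_monitor_body(cadence, hour, minute, timezone,
                       day_of_week=None, day_of_month=None, **optional):
    """Assemble a /monitor_enrichment body, checking the documented ranges."""
    if cadence not in ("daily", "weekly", "monthly"):
        raise ValueError("cadence must be daily, weekly, or monthly")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    body = {"cadence": cadence, "hour": hour, "minute": minute, "timezone": timezone}
    if cadence == "weekly":
        if day_of_week is None or not 0 <= day_of_week <= 6:
            raise ValueError("weekly monitors need day_of_week between 0 (Sunday) and 6")
        body["day_of_week"] = day_of_week
    if cadence == "monthly":
        if day_of_month is None or not 1 <= day_of_month <= 28:
            raise ValueError("monthly monitors need day_of_month between 1 and 28")
        body["day_of_month"] = day_of_month
    # Optional fields such as webhook_url or alert_rule pass through unchanged.
    body.update(optional)
    return body
```

For example, `build_monitor_body("weekly", 9, 30, "America/New_York", day_of_week=1)` describes a monitor that runs every Monday at 09:30 New York time.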
operationId: monitorEnrichment
tags:
- Monitors
parameters:
- name: enrichment_uuid
in: query
required: true
description: UUID of the enrichment to monitor (project_uuid is also accepted)
schema:
type: string
format: uuid
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
cadence:
type: string
enum: [daily, weekly, monthly]
description: How often the monitor runs
minute:
type: integer
minimum: 0
maximum: 59
description: Minute of the hour to run
hour:
type: integer
minimum: 0
maximum: 23
description: Hour of the day to run
day_of_week:
type: integer
minimum: 0
maximum: 6
description: Day of the week (0=Sunday, required for weekly)
day_of_month:
type: integer
minimum: 1
maximum: 28
description: Day of the month (required for monthly)
timezone:
type: string
description: "Timezone (e.g. 'UTC', 'America/New_York')"
webhook_url:
type: string
format: uri
description: URL to receive webhook notifications
alert_rule:
type: string
enum: [each_run, only_on_change]
description: When to send alerts (default each_run)
output_format:
type: string
enum: [current_only, current_and_previous]
description: Output format (default current_only)
run_immediately:
type: boolean
description: Whether to run the monitor immediately after creation
input:
$ref: "#/components/schemas/EnrichmentInputData"
description: Optional input data for the monitor
required:
- cadence
- minute
- hour
- timezone
responses:
"201":
description: Monitor created successfully
content:
application/json:
schema:
type: object
properties:
request_status:
type: string
enum: [success]
message:
type: string
monitor:
type: object
properties:
uuid:
type: string
name:
type: string
cadence:
type: string
enabled:
type: boolean
project_uuid:
type: string
description: UUID of the enrichment (also available as enrichment_uuid)
project_name:
type: string
description: Name of the enrichment (also available as enrichment_name)
enrichment_uuid:
type: string
description: UUID of the enrichment
enrichment_name:
type: string
description: Name of the enrichment
next_run_at:
type: string
schedule_summary:
type: string
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
"422":
$ref: "#/components/responses/UnprocessableEntity"
/run_status:
get:
summary: run_status
description: |
Check the current status of an enrichment run.
**Tip:** Webhooks are more efficient than polling. Pass `webhook_url` when starting a run to receive results automatically.
**Polling interval:** If you must poll, we recommend 5–10 seconds between requests. Early time estimates may be unreliable — actual completion is often faster than initial projections.
## Quick Example
```bash
curl -X GET "https://api.riveterhq.com/v1/run_status?run_key=YOUR_RUN_KEY" \
-H "Authorization: Bearer YOUR_API_KEY"
```
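The run key is returned by the endpoint that started the run, such as [/run_enrichment](#tag/enrichments/post/run_enrichment) or [/run_new_enrichment](#tag/enrichments/post/run_new_enrichment).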
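If you do poll, a loop along these lines keeps to the suggested interval and gives up after a bounded number of checks. It is a sketch with injected callables so it stays independent of any HTTP client: `poll_until_done`, `fetch_status`, and `is_done` are hypothetical names, and how you detect completion depends on the status response schema, which this example does not assume.

```python
import time


def poll_until_done(fetch_status, is_done, interval=5.0, max_attempts=120,
                    sleep=time.sleep):
    """Call fetch_status() until is_done(response) is true or attempts run out."""
    for attempt in range(max_attempts):
        response = fetch_status()
        if is_done(response):
            return response
        if attempt < max_attempts - 1:
            # Wait between checks; 5 to 10 seconds is the suggested range.
            sleep(interval)
    raise TimeoutError(f"run not finished after {max_attempts} checks")
```

Here `fetch_status` would wrap a `GET /run_status?run_key=...` call, and `is_done` would inspect the parsed response for a finished state.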
operationId: getRunStatus
tags:
- Runs
parameters:
- name: run_key
in: query
required: true
description: The run key (UUID) of the run to check
schema:
type: string
responses:
"200":
description: Status retrieved successfully
content:
application/json:
schema:
$ref: "#/components/schemas/EnrichmentStatusResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
/run_data:
get:
summary: run_data
description: |
Retrieve the processed data from a completed run.
## Quick Example
```bash
curl -X GET "https://api.riveterhq.com/v1/run_data?run_key=YOUR_RUN_KEY" \
-H "Authorization: Bearer YOUR_API_KEY"
```
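The run key is returned by the endpoint that started the run, such as [/run_enrichment](#tag/enrichments/post/run_enrichment) or [/run_new_enrichment](#tag/enrichments/post/run_new_enrichment).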
operationId: getRunData
tags:
- Runs
parameters:
- name: run_key
in: query
required: true
description: The run key (UUID) of the run to retrieve data for
schema:
type: string
responses:
"200":
description: Run data retrieved successfully
content:
application/json:
schema:
$ref: "#/components/schemas/EnrichmentDataResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
"422":
$ref: "#/components/responses/UnprocessableEntity"
/stop_run:
post:
summary: stop_run
description: |
Stop a currently running enrichment. This will halt all processing and mark the run as stopped.
## Quick Example
```bash
curl -X POST "https://api.riveterhq.com/v1/stop_run?run_key=YOUR_RUN_KEY" \
-H "Authorization: Bearer YOUR_API_KEY"
```
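The run key is returned by the endpoint that started the run, such as [/run_enrichment](#tag/enrichments/post/run_enrichment) or [/run_new_enrichment](#tag/enrichments/post/run_new_enrichment).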
## Behavior
- If the run has already stopped or succeeded, returns success with the current status
- If the run is in progress, stops all pending cells and marks the run as stopped
- Stopped runs cannot be resumed
operationId: stopRun
tags:
- Runs
parameters:
- name: run_key
in: query
required: true
description: The run key (UUID) of the run to stop
schema:
type: string
responses:
"200":
description: Run stopped successfully
content:
application/json:
schema:
$ref: "#/components/schemas/EnrichmentStopResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
"422":
$ref: "#/components/responses/UnprocessableEntity"
/scrape:
post:
summary: scrape
description: |
Scrape a webpage and return the text content. This endpoint allows you to extract text content from any public webpage.
## Quick Example
```bash
curl -X POST https://api.riveterhq.com/v1/scrape \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com"}'
```
## Credit Costs
- **With proxy**: 1/5 credit (0.20 credits)
- **Without proxy**: 1/20 credit (0.05 credits)
- **From cache**: Free (0 credits)
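For budgeting, the per-scrape prices above can be turned into a quick estimate. This Python sketch uses exact fractions to avoid rounding drift; `estimate_scrape_credits` is a hypothetical helper, and the split between proxied, direct, and cached scrapes is something you have to guess up front.

```python
from fractions import Fraction

PROXY_COST = Fraction(1, 5)    # 0.20 credits per scrape with a proxy
DIRECT_COST = Fraction(1, 20)  # 0.05 credits per scrape without a proxy


def estimate_scrape_credits(with_proxy=0, without_proxy=0, from_cache=0):
    """Estimate total credits for a mix of scrapes; cache hits are free."""
    if min(with_proxy, without_proxy, from_cache) < 0:
        raise ValueError("counts must not be negative")
    total = with_proxy * PROXY_COST + without_proxy * DIRECT_COST
    return float(total)
```

For instance, 10 proxied scrapes, 20 direct scrapes, and 5 cache hits come to 10 × 0.20 + 20 × 0.05 = 3 credits.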
## Proxy Usage
Scraping without a proxy is not guaranteed to succeed, because some websites block requests or only respond to specific geographic locations. In those cases a proxy may be necessary to get results.
To use a proxy, include the `proxy_country_code` parameter with a two-character country code (e.g., 'us', 'gb', 'de').
## Caching
By default, recently scraped pages are cached to save credits. If you hit a recently cached webpage, the scrape is free. To always fetch fresh content, set `skip_cache` to true.
## Response
The response includes:
- `text`: The extracted text content from the webpage
- `url`: The URL that was scraped
- `base_url_for_links`: The base URL for resolving relative links
- `status_code`: The HTTP status code returned by the server (e.g., 200, 404, 500)
- `possibly_blocked`: (optional) Boolean flag if the page may be blocked by anti-scraping measures
- `credit_used`: The number of credits consumed
- `riveter_app_link`: Direct link to view this scrape in the Riveter application
operationId: scrape
tags:
- Tools
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
url:
type: string
format: uri
description: The URL to scrape
proxy_country_code:
type: string
description: Optional two-character country code for proxy (e.g., 'us', 'gb', 'de')
pattern: "^[a-z]{2}$"
skip_cache:
type: boolean
description: Set to true to bypass cache and always fetch fresh content
default: false
required:
- url
examples:
basic_scrape:
summary: Basic webpage scrape
value:
url: "https://example.com"
scrape_with_proxy:
summary: Scrape with proxy
value:
url: "https://example.com"
proxy_country_code: "us"
scrape_skip_cache:
summary: Scrape bypassing cache
value:
url: "https://example.com"
skip_cache: true
responses:
"200":
description: Webpage scraped successfully
content:
application/json:
schema:
$ref: "#/components/schemas/ScrapeResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"422":
$ref: "#/components/responses/UnprocessableEntity"
/web_search:
post:
summary: web_search
description: |
Run one or many web searches in a single request. Each search is a `{ query, date_start?, date_end? }`
object. This is a convenience wrapper around a tool-only `web_search` enrichment — you don't need to
create an enrichment first.
Handles a single search or up to **100,000** searches per request (larger batches fan out across the
big-run pipeline).
Optionally filter each search to a date range with `date_start` / `date_end` (format `YYYY-MM-DD`).
Dates are optional and may be set per-search — searches without dates are unfiltered. If only
`date_start` is given, `date_end` defaults to today.
This is **async**: poll [/run_status](#tag/runs/get/run_status) with the returned `run_key`, then fetch
results with [/run_data](#tag/runs/get/run_data). Or pass a `webhook_url` to receive results when the
run completes. Results come back under the `search_results` column alongside the `query` (and dates).
## Quick Example (inline)
```bash
curl -X POST https://api.riveterhq.com/v1/web_search \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"searches": [
{ "query": "OpenAI GPT-4o mini", "date_start": "2024-07-01", "date_end": "2024-07-31" },
{ "query": "Anthropic Claude 3.5 Sonnet" }
]
}'
```
## Quick Example (from a file)
For larger batches, put the body in a file and pass it with curl's `@` syntax:
`queries.json`:
```json
{
"searches": [
{ "query": "OpenAI GPT-4o mini", "date_start": "2024-07-01", "date_end": "2024-07-31" },
{ "query": "Anthropic Claude 3.5 Sonnet", "date_start": "2024-06-01", "date_end": "2024-06-30" },
{ "query": "Google Gemini 2.0" }
],
"webhook_url": "https://your-server.com/webhook"
}
```
```bash
curl -X POST https://api.riveterhq.com/v1/web_search \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d @queries.json
```
## Credit Costs
- **0.03 credits** per search.
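When generating large batches programmatically, it can help to validate the searches and estimate the cost before submitting. The Python sketch below does both under the limits stated in this description. `build_search_body` is a hypothetical helper, and the date check is a client-side convenience rather than the API's own validation.

```python
import datetime
import re
from fractions import Fraction

MAX_SEARCHES = 100_000
CREDITS_PER_SEARCH = Fraction(3, 100)  # 0.03 credits per search
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def build_search_body(searches, webhook_url=None):
    """Validate searches and return (request_body, estimated_credits)."""
    if not 1 <= len(searches) <= MAX_SEARCHES:
        raise ValueError(f"send between 1 and {MAX_SEARCHES} searches")
    cleaned = []
    for search in searches:
        if not search.get("query"):
            raise ValueError("every search needs a query")
        item = {"query": search["query"]}
        for key in ("date_start", "date_end"):
            value = search.get(key)
            if value:
                if not DATE_PATTERN.fullmatch(value):
                    raise ValueError(f"{key} must use the format YYYY-MM-DD")
                # Rejects impossible dates such as 2024-02-30.
                datetime.date.fromisoformat(value)
                item[key] = value
        cleaned.append(item)
    body = {"searches": cleaned}
    if webhook_url:
        body["webhook_url"] = webhook_url
    return body, float(len(cleaned) * CREDITS_PER_SEARCH)
```

The returned body can be written to a file such as `queries.json` with `json.dump` and sent with curl's `@` syntax as shown above.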
operationId: webSearch
tags:
- Tools
parameters:
- name: run_key
in: query
required: false
description: Custom identifier for this run (optional, will be generated if not provided)
schema:
type: string
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
searches:
type: array
description: One or more searches to run (up to 100,000).
items:
type: object
properties:
query:
type: string
description: The search query (required).
date_start:
type: string
description: "Optional start date filter, format YYYY-MM-DD."
date_end:
type: string
description: "Optional end date filter, format YYYY-MM-DD. Defaults to today if date_start is set."
required:
- query
webhook_url:
type: string
format: uri
description: Optional URL to POST results to when the run completes.
required:
- searches
examples:
single_search:
summary: A single search
value:
searches:
- query: "latest OpenAI news"
single_search_with_dates:
summary: A single date-bounded search
value:
searches:
- query: "OpenAI GPT-4o mini release"
date_start: "2024-07-01"
date_end: "2024-07-31"
many_searches:
summary: Many searches with per-search date ranges (e.g. from @queries.json)
value:
searches:
- query: "OpenAI GPT-4o mini"
date_start: "2024-07-01"
date_end: "2024-07-31"
- query: "Anthropic Claude 3.5 Sonnet"
date_start: "2024-06-01"
date_end: "2024-06-30"
- query: "Google Gemini 2.0"
webhook_url: "https://your-server.com/webhook"
responses:
"200":
description: Search run initiated successfully
content:
application/json:
schema:
$ref: "#/components/schemas/EnrichmentRunResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"409":
$ref: "#/components/responses/Conflict"
"422":
$ref: "#/components/responses/UnprocessableEntity"
/pause_monitor:
post:
summary: pause_monitor
description: |
Pause an active monitor. If the monitor is already paused, this is a no-op and returns success.
operationId: pauseMonitor
tags:
- Monitors
parameters:
- name: monitor_uuid
in: query
required: true
description: UUID of the monitor to pause
schema:
type: string
format: uuid
responses:
"200":
description: Monitor paused successfully
content:
application/json:
schema:
$ref: "#/components/schemas/MonitorResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
/monitor_status:
get:
summary: monitor_status
description: |
Retrieve the current status and configuration of a monitor.
operationId: getMonitorStatus
tags:
- Monitors
parameters:
- name: monitor_uuid
in: query
required: true
description: UUID of the monitor to check
schema:
type: string
format: uuid
responses:
"200":
description: Monitor status retrieved successfully
content:
application/json:
schema:
$ref: "#/components/schemas/MonitorResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
/monitor_recent_run_data:
get:
summary: monitor_recent_run_data
description: |
Retrieve the data from the most recent run of a monitor. Returns the formatted output data along with run status details.
operationId: getMonitorRecentRunData
tags:
- Monitors
parameters:
- name: monitor_uuid
in: query
required: true
description: UUID of the monitor
schema:
type: string
format: uuid
responses:
"200":
description: Recent run data retrieved successfully
content:
application/json:
schema:
type: object
properties:
request_status:
type: string
enum: [success]
message:
type: string
run_key:
type: string
formatted_data:
$ref: "#/components/schemas/EnrichmentFormattedData"
description: The processed data from the most recent run in columnar format
status:
$ref: "#/components/schemas/EnrichmentRunStatusDetails"
monitor_uuid:
type: string
format: uuid
required:
- request_status
- message
- run_key
- formatted_data
- status
- monitor_uuid
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
"422":
$ref: "#/components/responses/UnprocessableEntity"
/api_stats:
get:
summary: api_stats
description: |
Retrieve API usage statistics for the current account. Returns counts of API runs grouped by status.
Pass `detailed=true` to include run-level detail (run_key and app URL) for active runs (pending, enqueued, processing).
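A minimal Python sketch of reading a `detailed=true` response, assuming the `stats` object follows the `ApiStatsCounts` shape. The helper name `active_run_keys` and the sample payload are illustrative and not part of the API.

```python
ACTIVE_STATUSES = ("pending", "enqueued", "processing")


def active_run_keys(stats: dict) -> dict:
    """Map each active status to the run_keys listed under its `detail` array."""
    result = {}
    for status in ACTIVE_STATUSES:
        entry = stats.get(status) or {}
        # `detail` is only expected when detailed=true was requested.
        result[status] = [item["run_key"] for item in entry.get("detail", [])]
    return result


# Hypothetical response fragment, for illustration only.
sample_stats = {
    "pending": {"count": 1, "detail": [{"run_key": "abc", "url": "https://example.com/r/abc"}]},
    "processing": {"count": 0, "detail": []},
    "success": {"count": 12},
}
print(active_run_keys(sample_stats))
```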
operationId: getApiStats
tags:
- Account
parameters:
- name: detailed
in: query
required: false
description: Set to true to include run-level detail for active statuses
schema:
type: boolean
default: false
responses:
"200":
description: API stats retrieved successfully
content:
application/json:
schema:
$ref: "#/components/schemas/ApiStatsResponse"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
/build_dataset:
post:
summary: build_dataset
description: |
Build a dataset by providing a natural-language prompt, a structured spec (identifiers, qualifiers, attributes), or both.
**Recommended:** Pass a `dataset_webhook_url` to receive results when the build completes — this is more efficient than polling. If you must poll, use `/dataset_build_status` (suggested interval: 5–10 seconds).
## Input options
**Prompt only** — describe what you want in plain English:
```json
{ "prompt": "Top 50 SaaS companies by revenue", "max_items": 50 }
```
**Structured spec** — define the shape explicitly:
```json
{
"identifiers": ["Company Name"],
"qualifiers": ["B2B SaaS", "revenue > $10M"],
"attributes": ["CEO", "Headquarters"],
"max_items": 50
}
```
**Both** — the prompt provides intent while the spec constrains the output.
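As a rough client-side sketch in Python, the input options above could be assembled and checked before sending. The limits used here are the ones stated in this endpoint's request schema; the function name `build_dataset_body` is hypothetical.

```python
def build_dataset_body(prompt=None, identifiers=None, qualifiers=None,
                       attributes=None, max_items=100):
    """Assemble a /build_dataset JSON body, enforcing the documented limits."""
    if not prompt and not identifiers:
        raise ValueError("at least one of prompt or identifiers is required")
    if max_items < 1:
        raise ValueError("max_items must be at least 1")

    body = {"max_items": max_items}
    if prompt:
        body["prompt"] = prompt

    # (values, maximum number of entries) per array field
    limited_fields = {
        "identifiers": (identifiers, 3),
        "qualifiers": (qualifiers, 10),
        "attributes": (attributes, 10),
    }
    for name, (values, limit) in limited_fields.items():
        if values:
            if len(values) > limit:
                raise ValueError(f"{name} allows at most {limit} entries")
            body[name] = list(values)
    return body


print(build_dataset_body(prompt="Top 50 SaaS companies by revenue", max_items=50))
```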
operationId: buildDataset
tags:
- Dataset Builder
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
prompt:
type: string
description: Natural-language description of the dataset to build. Can be used alone or combined with identifiers/qualifiers/attributes. At least one of prompt or identifiers is required.
identifiers:
type: array
items:
type: string
maxItems: 3
description: Column names that uniquely identify each row (max 3). Can be used alone or combined with prompt. At least one of prompt or identifiers is required.
qualifiers:
type: array
items:
type: string
maxItems: 10
description: Filters or constraints on the dataset (max 10). Used with identifiers and/or prompt.
attributes:
type: array
items:
type: string
maxItems: 10
description: Additional columns to include in the output (max 10). Used with identifiers and/or prompt.
max_items:
type: integer
minimum: 1
default: 100
description: Maximum number of rows to generate
dataset_webhook_url:
type: string
format: uri
description: |
URL to receive a POST when the dataset build completes.
Always pass URLs containing query strings (e.g. Power Automate / Azure Logic Apps SAS URLs) here in the body — never as a query parameter — so the `&` characters in the SAS token are preserved.
auto_run_enrichment:
type: boolean
description: Automatically create and run an enrichment from the dataset when the build completes. When true, the response immediately includes an `auto_run_enrichment_run_key` that can be used to poll `/run_status` and `/run_data`.
default: false
auto_run_enrichment_webhook_url:
type: string
format: uri
description: Webhook URL for the auto-run enrichment (requires `auto_run_enrichment` to be true). As with `dataset_webhook_url`, pass URLs that contain query strings here in the body, never as a query parameter.
examples:
prompt_only:
summary: Prompt only
value:
prompt: "Top 50 SaaS companies by revenue"
max_items: 50
structured_spec:
summary: Structured spec only
value:
identifiers: ["Company Name"]
qualifiers: ["B2B SaaS", "revenue > $10M"]
attributes: ["CEO", "Headquarters"]
max_items: 50
prompt_and_spec:
summary: Prompt + structured spec (combined)
value:
prompt: "Top 50 B2B SaaS companies by revenue"
identifiers: ["Company Name"]
qualifiers: ["B2B SaaS", "revenue > $10M"]
attributes: ["CEO", "Headquarters", "Founded Year"]
max_items: 50
prompt_with_webhook:
summary: Prompt + dataset webhook delivery
value:
prompt: "Top 50 SaaS companies by revenue"
max_items: 50
dataset_webhook_url: "https://your-server.com/dataset-webhook"
responses:
"200":
description: Dataset build started
content:
application/json:
schema:
$ref: "#/components/schemas/DatasetBuildResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"422":
$ref: "#/components/responses/UnprocessableEntity"
/dataset_build_status:
get:
summary: dataset_build_status
description: |
Check the current status of a dataset build.
**Tip:** Webhooks are more efficient than polling. Pass `dataset_webhook_url` when starting a build to receive results automatically.
**Polling interval:** If you must poll, we recommend 30 seconds between requests. Early time estimates may be unreliable — actual completion is often faster than initial projections.
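If polling is unavoidable, a loop along these lines is one way to structure it in Python. This is a sketch under assumptions: the values of `state` that mean "finished" are not listed here, so the caller supplies an `is_finished` predicate, and `fetch_status` stands in for whatever HTTP call you use.

```python
import time


def poll_build_status(fetch_status, is_finished, interval_seconds=30.0,
                      max_attempts=20, sleep=time.sleep):
    """Call fetch_status() until is_finished(response) is true or attempts run out."""
    for attempt in range(max_attempts):
        response = fetch_status()
        if is_finished(response):
            return response
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError("dataset build did not finish within the polling budget")


# Stand-in for real GET /dataset_build_status calls; these state names are made up.
fake_responses = iter([{"state": "working"}, {"state": "working"}, {"state": "finished"}])
final = poll_build_status(
    fetch_status=lambda: next(fake_responses),
    is_finished=lambda response: response["state"] == "finished",
    sleep=lambda seconds: None,  # skip real waiting in this demo
)
print(final)
```

Injecting `sleep` keeps the loop easy to test without real delays.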
operationId: getDatasetBuildStatus
tags:
- Dataset Builder
parameters:
- name: run_key
in: query
required: true
description: The run_key returned by /build_dataset
schema:
type: string
responses:
"200":
description: Status retrieved successfully
content:
application/json:
schema:
$ref: "#/components/schemas/DatasetStatusResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
/dataset_build_data:
get:
summary: dataset_build_data
description: |
Retrieve the generated data from a completed dataset build. Returns the data in columnar format along with status details.
operationId: getDatasetBuildData
tags:
- Dataset Builder
parameters:
- name: run_key
in: query
required: true
description: The run_key returned by /build_dataset
schema:
type: string
responses:
"200":
description: Dataset data retrieved successfully
content:
application/json:
schema:
$ref: "#/components/schemas/DatasetDataResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
/stop_dataset_build:
post:
summary: stop_dataset_build
description: |
Stop a dataset build that is currently in progress. Only builds in an active state can be stopped.
operationId: stopDatasetBuild
tags:
- Dataset Builder
parameters:
- name: run_key
in: query
required: true
description: The run_key returned by /build_dataset
schema:
type: string
requestBody:
required: false
content:
application/json:
schema:
type: object
properties:
dataset_webhook_url:
type: string
format: uri
description: |
Optionally update the webhook URL before stopping. Always pass URLs containing query strings (e.g. Power Automate / Azure Logic Apps SAS URLs) here in the body — never as a query parameter.
responses:
"200":
description: Stop signal sent
content:
application/json:
schema:
$ref: "#/components/schemas/DatasetBuildResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
/create_enrichment_from_dataset:
post:
summary: create_enrichment_from_dataset
description: |
Create an enrichment from a completed dataset build. The dataset must have finished building and have result data available.
After creating the enrichment, use `/run_enrichment_from_dataset` or `/run_enrichment` to run it.
operationId: createEnrichmentFromDataset
tags:
- Dataset Builder
parameters:
- name: run_key
in: query
required: true
description: The run_key returned by /build_dataset
schema:
type: string
responses:
"200":
description: Enrichment created from dataset
content:
application/json:
schema:
type: object
properties:
request_status:
type: string
enum: [success]
message:
type: string
run_key:
type: string
enrichment_uuid:
type: string
format: uuid
description: UUID of the newly created enrichment
required:
- request_status
- message
- run_key
- enrichment_uuid
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
"500":
description: Internal server error
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
/run_enrichment_from_dataset:
post:
summary: run_enrichment_from_dataset
description: |
Run an enrichment using a dataset build's result data as input. If an enrichment hasn't been created yet, one is automatically created from the dataset with the attributes of the dataset's columns.
**Recommended:** Pass a `webhook_url` in the JSON body to receive results when the run completes; this is more efficient than polling. If you must poll, use `/run_status` with the returned `run_key` (suggested interval: 5–10 seconds), then retrieve the results with `/run_data`.
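The split between query string and JSON body can be sketched in Python as follows. `BASE_URL` is a placeholder rather than the real API host, and `prepare_run_request` is a hypothetical helper. Only `run_key` goes in the query string, while a webhook URL that itself contains `&` travels in the body.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder host; substitute the real API base URL.
BASE_URL = "https://api.example.com"


def prepare_run_request(run_key, webhook_url=None):
    """Return (url, body_bytes) for POST /run_enrichment_from_dataset."""
    url = f"{BASE_URL}/run_enrichment_from_dataset?{urlencode({'run_key': run_key})}"
    payload = {"webhook_url": webhook_url} if webhook_url else {}
    return url, json.dumps(payload).encode("utf-8")


url, body = prepare_run_request(
    "example-run-key",
    webhook_url="https://your-server.com/webhook?sig=abc&sv=1.0",
)
print(url)
print(body.decode("utf-8"))
```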
operationId: runEnrichmentFromDataset
tags:
- Dataset Builder
parameters:
- name: run_key
in: query
required: true
description: The run_key returned by /build_dataset
schema:
type: string
requestBody:
required: false
content:
application/json:
schema:
type: object
properties:
webhook_url:
type: string
format: uri
description: |
URL to receive a POST when the enrichment run completes. The webhook payload includes the full results (same as /run_data). See the Webhooks section above for payload format and details.
Always pass URLs containing query strings (e.g. Power Automate / Azure Logic Apps SAS URLs) here in the body — passing them as query parameters will truncate them at the first `&`.
examples:
with_webhook:
summary: Run with webhook delivery
value:
webhook_url: "https://your-server.com/webhook"
responses:
"200":
description: Enrichment run initiated
content:
application/json:
schema:
$ref: "#/components/schemas/EnrichmentStatusResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
"409":
$ref: "#/components/responses/Conflict"
"422":
$ref: "#/components/responses/UnprocessableEntity"
/build_dataset_from_enrichment:
post:
summary: build_dataset_from_enrichment
description: |
Build a dataset using an existing enrichment's column structure. Instead of specifying identifiers and attributes manually, the endpoint derives them from the enrichment:
- **Source-data columns** become the dataset's **identifiers** (max 3)
- **Agent / tool-only columns** become the dataset's **attributes**
You provide a prompt describing what entities to find, qualifiers to filter them, and an optional max_items limit.
**Recommended:** Pass `dataset_webhook_url` and/or `auto_run_enrichment_webhook_url` to receive results when complete — this is more efficient than polling. If you must poll, use `/dataset_build_status` and `/run_status` (suggested interval: 5–10 seconds).
After the dataset is built, you can either:
- Use `/run_enrichment_from_dataset` to manually run the enrichment on the generated data, or
- Pass `auto_run_enrichment: true` to automatically run the enrichment when the dataset completes. An `auto_run_enrichment_run_key` is returned immediately that can be used with `/run_status` and `/run_data`.
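The two keys in the response are polled against different endpoints. A small Python sketch, assuming the response follows `DatasetBuildResponse` (the helper name and sample values are illustrative):

```python
def polling_targets(build_response: dict) -> dict:
    """Map each status endpoint to the key it should be polled with."""
    targets = {"/dataset_build_status": build_response["run_key"]}
    enrichment_key = build_response.get("auto_run_enrichment_run_key")
    if enrichment_key:
        # Only present when auto_run_enrichment was true in the request.
        targets["/run_status"] = enrichment_key
    return targets


# Hypothetical response values, for illustration only.
sample_response = {
    "request_status": "success",
    "message": "Dataset build started",
    "run_key": "dataset-key",
    "auto_run_enrichment_run_key": "enrichment-key",
}
print(polling_targets(sample_response))
```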
operationId: buildDatasetFromEnrichment
tags:
- Dataset Builder
parameters:
- name: enrichment_uuid
in: query
required: true
description: UUID of the enrichment whose column structure to use
schema:
type: string
format: uuid
requestBody:
required: true
content:
application/json:
schema:
type: object
properties:
prompt:
type: string
description: Natural-language description of the dataset to build. Describes what entities to find for the enrichment's source-data columns.
qualifiers:
type: array
items:
type: string
maxItems: 10
description: Filters or constraints on the dataset (max 10)
max_items:
type: integer
minimum: 1
default: 100
description: Maximum number of rows to generate
dataset_webhook_url:
type: string
format: uri
description: |
URL to receive a POST when the dataset build completes.
Always pass URLs containing query strings (e.g. Power Automate / Azure Logic Apps SAS URLs) here in the body — never as a query parameter — so the `&` characters in the SAS token are preserved.
auto_run_enrichment:
type: boolean
description: Automatically run the enrichment on the generated dataset when the build completes. When true, the response immediately includes an `auto_run_enrichment_run_key` that can be used to poll `/run_status` and `/run_data`.
default: false
auto_run_enrichment_webhook_url:
type: string
format: uri
description: Webhook URL for the auto-run enrichment (requires `auto_run_enrichment` to be true). As with `dataset_webhook_url`, pass URLs that contain query strings here in the body, never as a query parameter.
required:
- prompt
- qualifiers
examples:
basic:
summary: Build a dataset of companies for an existing enrichment
value:
prompt: "SaaS companies in the United States"
qualifiers: ["B2B", "revenue > $10M", "founded after 2010"]
max_items: 50
auto_run:
summary: Build dataset and auto-run the enrichment
value:
prompt: "SaaS companies in the United States"
qualifiers: ["B2B", "revenue > $10M"]
max_items: 25
auto_run_enrichment: true
auto_run_enrichment_webhook_url: "https://example.com/webhook"
responses:
"200":
description: Dataset build started
content:
application/json:
schema:
$ref: "#/components/schemas/DatasetBuildResponse"
"400":
$ref: "#/components/responses/BadRequest"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"404":
$ref: "#/components/responses/NotFound"
"422":
$ref: "#/components/responses/UnprocessableEntity"
/account:
get:
summary: account
description: Retrieve information about the current account associated with the API key
operationId: getAccount
tags:
- Account
responses:
"200":
description: Account information retrieved successfully
content:
application/json:
schema:
type: object
properties:
account:
$ref: "#/components/schemas/Account"
api_key_info:
$ref: "#/components/schemas/ApiKeyInfo"
"401":
$ref: "#/components/responses/Unauthorized"
"403":
$ref: "#/components/responses/Forbidden"
"503":
$ref: "#/components/responses/ServiceUnavailable"
components:
securitySchemes:
ApiKeyAuth:
type: http
scheme: bearer
bearerFormat: API_KEY
description: API key authentication. Use 'Bearer YOUR_API_KEY' in the Authorization header.
x-scalar-secret-token: YOUR_API_KEY
schemas:
Account:
type: object
properties:
uuid:
type: string
format: uuid
description: Unique identifier for the account
name:
type: string
description: Account name
plan:
type: string
enum: [free, starter, advanced, pro, enterprise]
description: Current billing plan
credit:
$ref: "#/components/schemas/Credit"
required:
- uuid
- name
- plan
- credit
Credit:
type: object
properties:
count:
type: integer
description: Current credit count
max:
type: integer
description: Maximum credits available
balance:
type: integer
description: Remaining credit balance
required:
- count
- max
- balance
ApiKeyInfo:
type: object
properties:
name:
type: string
description: Name of the API key
last_used_at:
type: [string, "null"]
format: date-time
description: When the API key was last used
created_by:
$ref: "#/components/schemas/User"
required:
- name
- last_used_at
- created_by
User:
type: object
properties:
uuid:
type: string
format: uuid
description: User's unique identifier
name:
type: string
description: User's full name
email:
type: string
format: email
description: User's email address
required:
- uuid
- name
- email
EnrichmentInputData:
type: object
description: |
Keys are your source-data column headers. Values are arrays of strings (one per row).
You may include any column header name; the examples below are illustrative.
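If your data is row-oriented, a conversion like this Python sketch produces the columnar shape described above. The helper name `rows_to_input` is hypothetical; the 1000-row limit comes from the column definitions in this schema.

```python
MAX_ROWS = 1000  # per-column array limit stated in this schema


def rows_to_input(rows: list) -> dict:
    """Convert row dicts into the columnar `input` object (one string array per column)."""
    if len(rows) > MAX_ROWS:
        raise ValueError(f"input supports at most {MAX_ROWS} rows")
    headers = list(rows[0]) if rows else []
    columns = {header: [] for header in headers}
    for index, row in enumerate(rows):
        # Every row must supply every column so all arrays end up the same length.
        if set(row) != set(headers):
            raise ValueError(f"row {index} does not have the same columns as row 0")
        for header in headers:
            columns[header].append(str(row[header]))
    return columns


sample_rows = [
    {"Company Name": "Acme", "Website": "acme.com"},
    {"Company Name": "Globex", "Website": "globex.com"},
]
print(rows_to_input(sample_rows))
```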
properties:
"Company Name":
type: array
description: One string per row (all input columns must have the same array length, max 1000).
items:
type: string
maxItems: 1000
"Website":
type: array
description: One string per row (all input columns must have the same array length, max 1000).
items:
type: string
maxItems: 1000
"Domain":
type: array
description: One string per row (all input columns must have the same array length, max 1000).
items:
type: string
maxItems: 1000
FormatDetails:
type: object
description: |
Format-specific options. Valid keys depend on the column `format` (see run_new_enrichment examples).
Only include keys that apply to your chosen format.
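A few of the constraints noted in this schema's field descriptions can be checked client-side. The Python sketch below covers only those stated rules (tag requires `options`, `currency_code` versus `percentage`, `iso_8601` versus the date-part keys). The function name is hypothetical and the server may enforce more.

```python
DATE_PART_KEYS = {"month", "day", "year", "delimiter"}


def format_details_problems(column_format: str, details: dict) -> list:
    """Return human-readable conflicts; an empty list means none were found."""
    problems = []
    if column_format == "tag" and not details.get("options"):
        problems.append("tag format requires options")
    if column_format == "number" and details.get("currency_code") and details.get("percentage"):
        problems.append("currency_code and percentage are mutually exclusive")
    if column_format == "date" and details.get("iso_8601") and DATE_PART_KEYS.intersection(details):
        problems.append("iso_8601 cannot be combined with month/day/year/delimiter")
    return problems


print(format_details_problems("number", {"currency_code": "USD", "percentage": True}))
```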
properties:
options:
type: array
items:
type: string
description: tag — allowed tag values (required for tag format)
allow_multiple:
type: boolean
description: tag — allow selecting more than one tag
descriptions:
type: object
description: tag — optional map from tag value to description (keys must be in `options`)
properties:
"Enterprise":
type: string
description: Example tag option description
decimal_places:
type: integer
minimum: 0
maximum: 9
description: number — decimal places
currency_code:
type: string
description: number — 3-letter currency code (mutually exclusive with percentage)
commas:
type: boolean
description: number — display thousands separators
percentage:
type: boolean
description: number — format as percentage (mutually exclusive with currency_code)
description:
type: string
description: json — natural-language schema description (use with or instead of `schema`)
schema:
type: object
description: json — JSON Schema object for structured output
properties:
type:
type: string
example: object
properties:
type: object
description: JSON Schema properties map
properties:
example_field:
type: object
description: Schema for one field (example)
iso_8601:
type: boolean
description: date — output ISO 8601 (cannot combine with month/day/year/delimiter)
month:
type: string
enum: [M, MM, MMM, MMMM]
description: date — month format token
day:
type: string
enum: [D, DD, Do]
description: date — day format token
year:
type: string
enum: [YYYY, YY]
description: date — year format token
delimiter:
type: string
description: date — single-character delimiter between date parts
true_value:
type: string
description: boolean — display string for true
false_value:
type: string
description: boolean — display string for false
EnrichmentOutputColumnConfig:
type: object
description: |
Per-column enrichment config. **Agent mode** (default): include `prompt` and `contexts`.
**Tool-only mode**: set `tool` and its parameters (do not use `prompt`/`contexts`).
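The two modes can be told apart with a check like this Python sketch. It assumes a complete column config (not a partial update), and `column_mode` is an illustrative name rather than anything the API provides.

```python
def column_mode(config: dict) -> str:
    """Classify a column config as 'agent' or 'tool_only', rejecting mixed configs."""
    agent_keys = {"prompt", "contexts"}.intersection(config)
    if "tool" in config:
        if agent_keys:
            raise ValueError(f"tool-only columns must not set {sorted(agent_keys)}")
        return "tool_only"
    if "prompt" not in config:
        raise ValueError("agent-mode columns need a prompt")
    return "agent"


print(column_mode({"prompt": "Find the current CEO", "contexts": ["Company Name"]}))
print(column_mode({"tool": "scrape", "url": "Website"}))
```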
properties:
prompt:
type: string
description: Agent instructions for this column (agent mode)
contexts:
type: array
description: Column headers used as input context (agent mode)
items:
type: string
tools:
type: array
description: "Agent tools: web_search, scrape, pdf, image, etc."
items:
type: string
format:
type: string
enum: [text, number, url, email, tag, date, json, boolean]
format_details:
$ref: "#/components/schemas/FormatDetails"
run_when:
type: string
enum: [always, any_filled, all_filled, dynamic]
run_when_config:
type: object
properties:
match_mode:
type: string
enum: [all, any]
rules:
type: array
items:
type: object
properties:
column:
type: string
condition:
type: string
value:
type: string
tool:
type: string
description: "Tool-only mode: scrape, web_search, pdf, image, code, LinkedIn tools, etc."
url:
type: string
description: Column header or static URL (tool-only)
query:
type: string
description: Column header or static query (tool-only, web_search)
date_start:
type: string
description: "Optional start date for filtering search results. Format: YYYY-MM-DD (tool-only, web_search)"
date_end:
type: string
description: "Optional end date for filtering search results. Format: YYYY-MM-DD. Defaults to today if date_start is provided. (tool-only, web_search)"
code:
type: string
description: JavaScript source (tool-only, code tool)
args:
type: object
description: |
Named arguments for the code tool (tool-only). Keys are names referenced in your JavaScript
(e.g. `args.first`). Values are column headers (dynamic per row) or static strings.
You may include any argument name; the examples below are illustrative.
properties:
first:
type: string
description: Column header name or static value
last:
type: string
description: Column header name or static value
revenue:
type: string
description: Column header name or static value (example)
proxy_country_code:
type: string
wait_longer:
type: boolean
skip_cache:
type: boolean
EnrichmentOutputSpec:
type: object
description: |
Keys are output column headers. Values are per-column configuration objects.
Any output column name is allowed; expand the example columns below to see all supported fields.
properties:
"Employee Count":
$ref: "#/components/schemas/EnrichmentOutputColumnConfig"
"Industry":
$ref: "#/components/schemas/EnrichmentOutputColumnConfig"
"CEO":
$ref: "#/components/schemas/EnrichmentOutputColumnConfig"
"Website":
$ref: "#/components/schemas/EnrichmentOutputColumnConfig"
UpdateEnrichmentRequestBody:
type: object
description: |
Output changes keyed by column header. Existing columns can be partially updated; new column names must include
a full output configuration; set `"delete": true` on a column to remove it.
Each column uses the same fields as [/run_new_enrichment](#tag/enrichments/post/run_new_enrichment) output columns
(expand **CEO** or **Industry** below). Any other output column name is allowed.
properties:
"Column Header Name":
$ref: "#/components/schemas/EnrichmentOutputColumnConfig"
example:
"CEO":
prompt: "Find the current CEO of this company using recent news and filings"
contexts: ["Company Name", "Website"]
"Industry":
prompt: "What industry is this company in?"
contexts: ["Company Name"]
format: "tag"
format_details:
options: ["SaaS", "Fintech", "Healthcare", "Other"]
tools: ["web_search", "scrape"]
RunNewEnrichmentRequest:
type: object
required: [input]
properties:
input:
$ref: "#/components/schemas/EnrichmentInputData"
prompt:
type: string
description: "Option 1 (recommended): Natural-language description of the enrichment. Must be provided with attributes. Cannot be combined with output."
attributes:
type: array
items:
type: string
maxItems: 10
description: "Option 1 (recommended): Output column names (max 10). Must be provided with prompt. Cannot be combined with output."
output:
$ref: "#/components/schemas/EnrichmentOutputSpec"
webhook_url:
type: string
format: uri
description: URL to POST results when the run completes (same payload as /run_data)
RunExistingEnrichmentRequest:
type: object
required: [input]
properties:
input:
$ref: "#/components/schemas/EnrichmentInputData"
webhook_url:
type: string
format: uri
description: URL to POST results when the run completes (same payload as /run_data)
EnrichmentCellValue:
type: object
properties:
value:
type: string
description: Cell result text
required: [value]
EnrichmentFormattedData:
type: object
description: |
Columnar results. Keys are column headers; values are arrays of `{ "value": "..." }` objects per row.
Expand the example columns below; additional column headers use the same array shape.
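To work with results row by row, the columnar object can be pivoted as in this Python sketch. The helper name `formatted_data_to_rows` and the sample values are illustrative.

```python
def formatted_data_to_rows(formatted_data: dict) -> list:
    """Turn columnar `formatted_data` into a list of {header: value} dicts."""
    headers = list(formatted_data)
    lengths = {len(cells) for cells in formatted_data.values()}
    if len(lengths) > 1:
        raise ValueError("columns have different lengths")
    row_count = lengths.pop() if lengths else 0
    return [
        {header: formatted_data[header][i]["value"] for header in headers}
        for i in range(row_count)
    ]


# Hypothetical result fragment, for illustration only.
sample_data = {
    "Company Name": [{"value": "Acme"}, {"value": "Globex"}],
    "CEO": [{"value": "Jane Doe"}, {"value": "John Roe"}],
}
print(formatted_data_to_rows(sample_data))
```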
properties:
"Company Name":
type: array
description: One cell per row
items:
type: object
required: [value]
properties:
value:
type: string
description: Cell result text
"CEO":
type: array
description: One cell per row
items:
type: object
required: [value]
properties:
value:
type: string
description: Cell result text
"Website":
type: array
description: One cell per row
items:
type: object
required: [value]
properties:
value:
type: string
description: Cell result text
ApiStatsCounts:
type: object
description: Run counts grouped by status. Additional status keys use the same shape as the examples below.
properties:
pending:
type: object
properties:
count:
type: integer
detail:
type: array
items:
type: object
properties:
run_key:
type: string
url:
type: string
format: uri
enqueued:
type: object
properties:
count:
type: integer
processing:
type: object
properties:
count:
type: integer
detail:
type: array
items:
type: object
properties:
run_key:
type: string
url:
type: string
format: uri
success:
type: object
properties:
count:
type: integer
stopped:
type: object
properties:
count:
type: integer
failed:
type: object
description: Example of another status key
properties:
count:
type: integer
EnrichmentRunStatusDetails:
type: object
description: Status information for the run
properties:
status:
type: string
enum: [pending, enqueued, processing, success, stopped]
description: Current status of the run
credits_used:
type: number
description: Number of credits consumed by this run
total_cells_expected:
type: integer
description: Total number of cells that need to be processed (rows × columns)
completed_cells:
type: integer
description: Number of cells successfully completed
not_found_cells:
type: integer
description: Number of cells with 'not found' results
project_name:
type: string
description: Name of the enrichment (also available as enrichment_name)
project_uuid:
type: string
format: uuid
description: UUID of the enrichment (also available as enrichment_uuid)
enrichment_name:
type: string
description: Name of the enrichment
enrichment_uuid:
type: string
format: uuid
description: UUID of the enrichment
error_message:
type: [string, "null"]
description: Error message if an error occurred
riveter_app_link:
type: string
description: Direct link to view this run in the Riveter application
required:
- status
- credits_used
- total_cells_expected
- completed_cells
- project_name
- project_uuid
- riveter_app_link
EnrichmentStructureResponse:
type: object
properties:
request_status:
type: string
enum: [success, error]
message:
type: string
enrichment_uuid:
type: string
format: uuid
enrichment_name:
type: string
name:
type: string
uuid:
type: string
format: uuid
app_url:
type: string
status:
type: string
input:
type: array
items:
type: string
description: Names of input (source data) columns
output:
$ref: "#/components/schemas/EnrichmentOutputSpec"
required:
- request_status
- message
EnrichmentRunResponse:
type: object
properties:
request_status:
type: string
enum: [success, error]
description: Status of the request
message:
type: string
description: Human-readable message
run_key:
type: string
description: Unique identifier for this run
status:
type: string
enum: [pending, enqueued, processing, success, stopped]
description: Current status of the run
credits_used:
type: number
description: Number of credits consumed by this run
total_cells_expected:
type: integer
description: Total number of cells that need to be processed (rows × columns)
completed_cells:
type: integer
description: Number of cells successfully completed
not_found_cells:
type: integer
description: Number of cells with 'not found' results
project_name:
type: string
description: Name of the enrichment (also available as enrichment_name)
project_uuid:
type: string
format: uuid
description: UUID of the enrichment (also available as enrichment_uuid)
enrichment_name:
type: string
description: Name of the enrichment
enrichment_uuid:
type: string
format: uuid
description: UUID of the enrichment
error_message:
type: [string, "null"]
description: Error message if an error occurred
riveter_app_link:
type: string
description: Direct link to view this run in the Riveter application
required:
- request_status
- message
- run_key
- status
- credits_used
- total_cells_expected
- completed_cells
- not_found_cells
- project_name
- project_uuid
- enrichment_name
- enrichment_uuid
- riveter_app_link
EnrichmentStatusResponse:
type: object
properties:
request_status:
type: string
enum: [success, error]
description: Status of the request
message:
type: string
description: Human-readable message
run_key:
type: string
description: Unique identifier for this run
status:
type: string
enum: [pending, enqueued, processing, success, stopped]
description: Current status of the run
credits_used:
type: number
description: Number of credits consumed by this run
total_cells_expected:
type: integer
description: Total number of cells that need to be processed (rows × columns)
completed_cells:
type: integer
description: Number of cells successfully completed
not_found_cells:
type: integer
description: Number of cells with 'not found' results
project_name:
type: string
description: Name of the enrichment (also available as enrichment_name)
project_uuid:
type: string
format: uuid
description: UUID of the enrichment (also available as enrichment_uuid)
enrichment_name:
type: string
description: Name of the enrichment
enrichment_uuid:
type: string
format: uuid
description: UUID of the enrichment
error_message:
type: [string, "null"]
description: Error message if an error occurred
riveter_app_link:
type: string
description: Direct link to view this run in the Riveter application
required:
- request_status
- message
- run_key
- status
- credits_used
- total_cells_expected
- completed_cells
- not_found_cells
- project_name
- project_uuid
- enrichment_name
- enrichment_uuid
- riveter_app_link
EnrichmentDataResponse:
type: object
properties:
request_status:
type: string
enum: [success, error]
description: Status of the request
message:
type: string
description: Human-readable message
run_key:
type: string
description: Unique identifier for this run
formatted_data:
$ref: "#/components/schemas/EnrichmentFormattedData"
status:
$ref: "#/components/schemas/EnrichmentRunStatusDetails"
required:
- request_status
- message
- run_key
- formatted_data
- status
EnrichmentStopResponse:
type: object
properties:
request_status:
type: string
enum: [success, error]
description: Status of the request
message:
type: string
description: Human-readable message about the stop operation
run_key:
type: string
description: Unique identifier for this run
status:
type: string
enum: [stopped, success]
description: Current status of the run after stop attempt
stopped_at:
type: [string, "null"]
format: date-time
description: When the run was stopped (if stopped)
finished_at:
type: [string, "null"]
format: date-time
description: When the run finished (if already completed)
stopped_cells_count:
type: integer
description: Number of cells that were stopped (only present if run was actively stopped)
project_name:
type: string
description: Name of the enrichment (also available as enrichment_name)
project_uuid:
type: string
format: uuid
description: UUID of the enrichment (also available as enrichment_uuid)
enrichment_name:
type: string
description: Name of the enrichment
enrichment_uuid:
type: string
format: uuid
description: UUID of the enrichment
required:
- request_status
- message
- run_key
- status
- project_name
- project_uuid
- enrichment_name
- enrichment_uuid
ScrapeResponse:
type: object
properties:
request_status:
type: string
enum: [success, error]
description: Status of the request
message:
type: string
description: Human-readable message
run_key:
type: string
description: Unique identifier for this scrape run
data:
type: object
properties:
url:
type: string
format: uri
description: The URL that was scraped
text:
type: string
description: The extracted text content from the webpage
base_url_for_links:
type: string
format: uri
description: The base URL for resolving relative links
status_code:
type: integer
description: The HTTP status code returned by the server (e.g., 200, 404, 500)
example: 200
possibly_blocked:
type: boolean
description: Optional flag indicating if the page may be blocked by anti-scraping measures (captcha, access denied, etc.)
credit_used:
type: number
description: Number of credits consumed (0 for cache hit, 0.05 without proxy, 0.20 with proxy)
riveter_app_link:
type: string
format: uri
description: Direct link to view this scrape in the Riveter application
required:
- url
- text
- base_url_for_links
- credit_used
- riveter_app_link
required:
- request_status
- message
- run_key
- data
MonitorResponse:
type: object
properties:
request_status:
type: string
enum: [success]
message:
type: string
monitor:
type: object
properties:
uuid:
type: string
format: uuid
name:
type: string
cadence:
type: string
enum: [daily, weekly, monthly]
minute:
type: integer
hour:
type: integer
day_of_week:
type: [integer, "null"]
day_of_month:
type: [integer, "null"]
timezone:
type: string
enabled:
type: boolean
webhook_url:
type: [string, "null"]
format: uri
alert_rule:
type: string
enum: [each_run, only_on_change]
output_format:
type: string
enum: [current_only, current_and_previous]
next_run_at:
type: [string, "null"]
format: date-time
schedule_summary:
type: string
project_uuid:
type: string
format: uuid
description: UUID of the enrichment (also available as enrichment_uuid)
project_name:
type: string
description: Name of the enrichment (also available as enrichment_name)
enrichment_uuid:
type: string
format: uuid
description: UUID of the enrichment
enrichment_name:
type: string
description: Name of the enrichment
created_at:
type: string
format: date-time
has_input:
type: boolean
required:
- uuid
- cadence
- enabled
- project_uuid
- project_name
- enrichment_uuid
- enrichment_name
required:
- request_status
- message
- monitor
ApiStatsResponse:
type: object
properties:
request_status:
type: string
enum: [success]
stats:
$ref: "#/components/schemas/ApiStatsCounts"
required:
- request_status
- stats
DatasetBuildResponse:
type: object
properties:
request_status:
type: string
enum: [success]
message:
type: string
run_key:
type: string
description: Unique identifier for this dataset build (use with /dataset_build_status and /dataset_build_data)
max_items:
type: integer
app_url:
type: string
format: uri
auto_run_enrichment_run_key:
type: string
description: Returned when auto_run_enrichment is true. Use with /run_status to poll the enrichment run and /run_data to retrieve its results.
required:
- request_status
- message
- run_key
DatasetStatusResponse:
type: object
properties:
request_status:
type: string
enum: [success]
message:
type: string
run_key:
type: string
state:
type: string
description: Current build state
prompt:
type: string
identifiers:
type: [array, "null"]
items:
type: string
qualifiers:
type: [array, "null"]
items:
type: string
attributes:
type: [array, "null"]
items:
type: string
max_items:
type: integer
has_result:
type: boolean
error:
type: [string, "null"]
started_at:
type: [string, "null"]
format: date-time
completed_at:
type: [string, "null"]
format: date-time
created_at:
type: string
format: date-time
credits_charged:
type: [number, "null"]
credits_refunded:
type: [number, "null"]
app_url:
type: string
format: uri
enrichment_uuid:
type: string
format: uuid
description: Present only if an enrichment has been created from this dataset
required:
- request_status
- message
- run_key
- state
- has_result
- created_at
DatasetDataResponse:
type: object
properties:
request_status:
type: string
enum: [success]
message:
type: string
run_key:
type: string
state:
type: string
has_result:
type: boolean
formatted_data:
description: The generated data in columnar format (keys are column names, values are arrays of {value} objects). Omitted or null if no result yet.
$ref: "#/components/schemas/EnrichmentFormattedData"
row_count:
type: integer
description: Number of rows in the result
credits_charged:
type: [number, "null"]
credits_refunded:
type: [number, "null"]
app_url:
type: string
format: uri
enrichment_uuid:
type: string
format: uuid
description: Present only if an enrichment has been created from this dataset
required:
- request_status
- message
- run_key
- state
- has_result
- row_count
Error:
type: object
properties:
request_status:
type: string
enum: [error]
description: Status of the request
message:
type: string
description: Human-readable error message
errors:
type: array
items:
type: string
description: List of specific error details
error_type:
type: string
enum:
[
bad_request,
validation,
not_found,
unauthorized,
forbidden,
conflict,
server_error,
service_unavailable,
]
description: Type of error that occurred
required:
- request_status
- message
- error_type
responses:
BadRequest:
description: Bad request - invalid input or missing required parameters
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
example:
request_status: error
message: "Input validation failed - data format does not meet requirements"
errors:
[
"All arrays must be the same length. Found different lengths: Company Name: 2, Website: 1",
]
error_type: validation
Unauthorized:
description: Unauthorized - invalid or missing API key
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
example:
request_status: error
message: "Invalid API key"
error_type: unauthorized
Forbidden:
description: Forbidden - API access not enabled for account
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
example:
request_status: error
message: "API access not enabled for this account"
error_type: forbidden
NotFound:
description: Resource not found
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
examples:
enrichment_not_found:
value:
request_status: error
message: "Enrichment not found"
error_type: not_found
run_not_found:
value:
request_status: error
message: "API run not found"
run_key: "some-run-key"
error_type: not_found
Conflict:
description: Conflict - resource already exists or conflicting state
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
example:
request_status: error
message: "Run key already exists for this enrichment"
error_type: conflict
UnprocessableEntity:
description: Unprocessable entity - validation failed
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
example:
request_status: error
message: "Enrichment validation failed"
errors: ["Enrichment must have at least one source data column"]
error_type: validation
ServiceUnavailable:
description: Service unavailable - API access is disabled
content:
application/json:
schema:
$ref: "#/components/schemas/Error"
example:
request_status: error
message: "API access is currently disabled"
error_type: service_unavailable
tags:
- name: Enrichments
description: Create, configure, update, and run enrichments via the API
- name: Runs
description: Check status, retrieve data, and manage in-progress API runs
- name: Monitors
description: Create and manage monitors that run your enrichments on a schedule
- name: Dataset Builder
description: Build datasets from natural-language prompts or structured specs, then optionally create and run enrichments from the results
- name: Tools
description: Standalone tools for web scraping and data extraction
- name: Account
description: Account information and management