ClipAgent API

Turn long videos into viral short-form clips programmatically. Upload a video file, get AI-analyzed clip candidates with virality scores, then render them with captions and branding.

Base URL: https://clip-agent.com/v1

All requests use POST with a JSON body containing an action field.

Authentication

Every request requires an API key in the Authorization header:

Authorization: Bearer ca_live_xxxxxxxxxxxxxxxxx

Generate API keys from your Dashboard.
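A minimal Python sketch of an authenticated call, using only the standard library. The helper names build_request and call_api are hypothetical, and the Content-Type: application/json header is an assumption (this page only says the body is JSON).

```python
import json
import urllib.request

BASE_URL = "https://clip-agent.com/v1"  # base URL listed on this page


def build_request(api_key: str, action: str, **fields) -> urllib.request.Request:
    """Build (but do not send) a POST request with a JSON body and Bearer auth."""
    body = json.dumps({"action": action, **fields}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API expects the usual JSON content type.
            "Content-Type": "application/json",
        },
    )


def call_api(api_key: str, action: str, **fields) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(build_request(api_key, action, **fields)) as resp:
        return json.load(resp)
```

For example, call_api("ca_live_...", "status", job_id="a1b2c3d4-...") would send the status request described later on this page.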

Credits

Each video analysis costs 1 credit ($2). Rendering is included — no extra charge.

Buy credits from the Dashboard. Your balance is shown there along with usage history.

Claude (MCP connector)

ClipAgent ships an MCP server at https://mcp.clip-agent.com so you can use the entire API inside any Claude client without writing a single line of code. Claude calls the tools, ClipAgent does the work, and the rendered MP4s land back in the chat.

The connector implements the Model Context Protocol (2025-06-18) streamable-HTTP transport with full OAuth 2.1 authentication. Same pricing as the REST API: $2 per video, no subscription, no per-clip surcharge.

Endpoint: https://mcp.clip-agent.com
Auth: OAuth 2.1 (one-click sign-in) or Bearer API key
Tools: analyze_video, get_job_status, render_clips, get_render_status

Add to Claude.ai (web)

Open claude.ai/settings/connectors (requires Claude Pro or Max).
Click Add custom connector.
Name: ClipAgent
URL: https://mcp.clip-agent.com
Click Connect. A ClipAgent sign-in page opens — sign in with your account and click Authorize.
ClipAgent now appears in the tools menu of every new chat.

Add to Claude Desktop (Mac & Windows)

Open Claude Desktop → Settings → Connectors.
Click Add custom connector.
Name: ClipAgent
URL: https://mcp.clip-agent.com
Click Connect. Your browser opens the ClipAgent sign-in screen. Sign in and click Authorize; the popup closes and you're back in Claude.

Add to Claude Code (CLI)

Two options. OAuth is best for interactive use, API key for automation/CI.

OAuth (interactive):

claude mcp add --transport http clipagent \
  https://mcp.clip-agent.com

The OAuth flow opens in your browser, then writes the token to ~/.claude. No further setup.

API key (headless):

# Generate a key at https://clip-agent.com/dashboard, then:
claude mcp add --transport http clipagent \
  https://mcp.clip-agent.com \
  --header "Authorization: Bearer ca_live_your_api_key"

Tools reference

Four tools. Claude chains them automatically — you describe what you want in plain English and Claude figures out the order.

`analyze_video`

Submit a video for AI viral clip detection. Pass a publicly downloadable URL (Drive share link, S3 URL, Loom export, etc.); the MCP server fetches it server-side and uploads it to the pipeline on your behalf. Costs 1 credit ($2). Returns a job_id immediately; transcription and AI ranking then take 1–3 minutes for a typical 10-minute video.

Field	Type	Required	Description
video_url	string	yes	Direct download URL of the source video (max 3 GB)
filename	string	no	Display name for the project (defaults to URL basename)

Returns: { job_id, status: "processing", credits_remaining }

`get_job_status`

Poll until status is "completed", then read the ranked clips array. Each clip has a clip_id, virality_score (1–10), title_hook, start_ms/end_ms, and why_it_works reasoning.

Field	Type	Required	Description
job_id	string	yes	The job_id returned by analyze_video
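If you process the clips array yourself instead of letting Claude pick, choosing what to render is a sort on virality_score. A small sketch; top_clip_ids and its min_score parameter are hypothetical names, and only the clip_id and virality_score fields described above are used.

```python
def top_clip_ids(clips: list[dict], n: int, min_score: float = 0.0) -> list[str]:
    """Return the clip_ids of the n highest-scoring clips at or above min_score."""
    ranked = sorted(clips, key=lambda c: c["virality_score"], reverse=True)
    return [c["clip_id"] for c in ranked if c["virality_score"] >= min_score][:n]


clips = [
    {"clip_id": "clip-1", "virality_score": 9.2},
    {"clip_id": "clip-2", "virality_score": 6.0},
    {"clip_id": "clip-3", "virality_score": 8.1},
]
selected = top_clip_ids(clips, n=2)  # pass this list as clip_ids to render_clips
```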

`render_clips`

Render selected clips as 9:16 vertical MP4s with burned-in captions and optional branding. No extra credit cost — included in the analyze charge. Returns a render_id for polling.

Field	Type	Required	Description
job_id	string	yes	The original analyze_video job_id
clip_ids	string[]	yes	Array of clip_id strings from get_job_status
caption_style	string	no	One of 15 styles. See caption styles. Default: `boldPop`.
caption_position	string	no	`top`, `middle`, or `bottom`. Default: `bottom`.
words_per_line	int	no	Override how many words appear per chunk (1–10).
smart_crop	bool	no	AI face/speaker tracking + multi-speaker auto-cuts. Default: `true`.
brand_mode	string	no	`none`, `text`, or `logo`.
brand_text	string	no	Handle text when brand_mode is `text`.

`get_render_status`

Poll until status is "completed", then read the clips array — each entry has a download_url (signed for 24 hours) for the rendered MP4.

Example prompts

Drop any of these into a Claude chat after connecting. Claude will chain the tools automatically.

Single-speaker podcast → top 5 viral clips, hormozi style:

Use ClipAgent to make 5 viral clips of this video.
Use the hormozi caption style with captions in the middle.
https://drive.google.com/uc?id=VIDEO_ID&export=download

Two-host interview → render top 3 with speaker tracking:

Clip the 3 most viral moments from this interview using ClipAgent.
Use the viral caption style — it color-codes captions per speaker
and the smart crop should auto-cut between hosts when they switch.
https://my-bucket.s3.amazonaws.com/interview.mp4

Tutorial → highlight reel with your @handle:

Pull the 3 most quotable bits of this talk via ClipAgent.
Render them with the boldPop style and overlay the handle @yourname.
https://my-cdn.com/talks/yc-demo-day.mp4

Bulk repurposing — multiple URLs in one message:

Here are 5 video URLs. For each, use ClipAgent to find the top
moment and render it with the beast caption style. Give me all the
download links when done.

https://my-bucket.s3.amazonaws.com/video1.mp4
https://my-bucket.s3.amazonaws.com/video2.mp4
https://my-bucket.s3.amazonaws.com/video3.mp4
https://my-bucket.s3.amazonaws.com/video4.mp4
https://my-bucket.s3.amazonaws.com/video5.mp4

Custom workflow — pick clips by your own criteria:

Analyze this video with ClipAgent and show me all the candidates
with their virality scores and reasoning. I'll pick which ones to render.
https://drive.google.com/uc?id=VIDEO_ID&export=download

OAuth flow (under the hood)

When a user clicks Connect in Claude.ai or Claude Desktop, here's what happens:

Claude POSTs to https://mcp.clip-agent.com without a token. The server returns 401 with WWW-Authenticate: Bearer realm="ClipAgent", resource_metadata=....
Claude reads /.well-known/oauth-protected-resource to find the authorization server.
Claude reads /.well-known/oauth-authorization-server to find the authorize, token, and registration endpoints.
Claude POSTs to /oauth/register (RFC 7591 Dynamic Client Registration) and gets back a client_id.
Claude opens /oauth/authorize?client_id=... in a popup. The user signs in with their ClipAgent account and clicks Authorize.
ClipAgent issues a one-time authorization code and redirects back to Claude with ?code=....
Claude POSTs the code to /oauth/token with a PKCE code_verifier. The server returns an access_token (1h) and refresh_token (30d).
Claude calls every subsequent tool with Authorization: Bearer ca_at_.... The MCP server resolves the OAuth token to an internal API key under the user's account and proxies to /v1.
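The PKCE code_verifier in the token step is paired with a code_challenge sent earlier in the authorize request. Below is a sketch of how a client could derive that pair, assuming the S256 method from RFC 7636 (this page does not say which challenge method the server expects). Claude does this for you; the function names are illustrative only.

```python
import base64
import hashlib
import secrets


def _b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def challenge_for(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(ASCII(verifier)))."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) from 32 random bytes."""
    verifier = _b64url(secrets.token_bytes(32))  # 43 URL-safe characters
    return verifier, challenge_for(verifier)
```

The client keeps the verifier private, sends the challenge with the authorize request, and reveals the verifier only when exchanging the code at /oauth/token.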

To revoke Claude's access, visit your dashboard and revoke the API key named "Claude (via OAuth)".

Upload a video

POST /functions/v1/api

Request a short-lived signed upload URL, then PUT your file directly to Supabase Storage. The returned path is what you pass to analyze. Max 3 GB.

{
  "action": "create_upload",
  "filename": "talk.mp4"
}

Response

{
  "upload_url": "https://...supabase.co/storage/v1/object/...",
  "path": "<user_id>/<uuid>.mp4",
  "expires_in": 900
}

PUT your file to upload_url with the correct Content-Type (e.g., video/mp4), then call analyze with the path below.
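A sketch of the PUT step with Python's standard library. build_upload_request is a hypothetical helper; it only constructs the request so you can inspect it before sending. Passing an open binary file rather than bytes lets urllib stream large uploads instead of loading them into memory.

```python
import urllib.request
from typing import BinaryIO, Union


def build_upload_request(
    upload_url: str,
    data: Union[bytes, BinaryIO],
    size: int,
    content_type: str = "video/mp4",
) -> urllib.request.Request:
    """Build a PUT request for the signed upload_url returned by create_upload."""
    return urllib.request.Request(
        upload_url,
        data=data,
        method="PUT",
        headers={
            "Content-Type": content_type,
            # An explicit length lets urllib stream file objects as-is.
            "Content-Length": str(size),
        },
    )


# Usage (not executed here):
#   size = os.path.getsize("talk.mp4")
#   with open("talk.mp4", "rb") as f:
#       urllib.request.urlopen(build_upload_request(upload_url, f, size))
```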

Analyze a video

POST /functions/v1/api

Submit an uploaded video for analysis. Costs 1 credit. Returns a job ID for polling.

{
  "action": "analyze",
  "video_path": "<path from create_upload>",
  "filename": "talk.mp4"
}

Response (202)

{
  "job_id": "a1b2c3d4-...",
  "status": "processing"
}

Check job status

POST /functions/v1/api

Poll until status is "completed" or "failed". Recommended interval: 10 seconds.

{
  "action": "status",
  "job_id": "a1b2c3d4-..."
}

Response when completed

{
  "status": "completed",
  "video_title": "The Future of AI",
  "duration_seconds": 1847,
  "clips": [
    {
      "clip_id": "clip-1",
      "start_ms": 12400,
      "end_ms": 72800,
      "title_hook": "This changes everything",
      "virality_score": 9.2,
      "why_it_works": "Strong contrarian opinion in first 3 seconds...",
      "audience_type": "tech enthusiasts",
      "tags": ["hook", "controversial_take"],
      "duration_seconds": 60
    }
  ]
}
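The polling loop can be sketched as below. wait_for_job is a hypothetical helper that takes any zero-argument function returning the parsed status response, so it works equally for status and render_status. The 30-minute timeout is an assumption, not a documented limit.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval: float = 10.0,   # recommended polling interval from this page
    timeout: float = 1800.0,  # assumption: give up after 30 minutes
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll until status is "completed" or "failed", then return the response."""
    deadline = clock() + timeout
    while True:
        result = fetch_status()
        if result.get("status") in ("completed", "failed"):
            return result
        if clock() + interval > deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)
```

Check the returned status yourself: a "failed" result is returned rather than raised, so the caller decides how to report it.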

Upload brand logo

POST /functions/v1/api

Upload a logo image (PNG or JPG, max 2 MB). One active logo per account — uploading replaces the previous one.

curl -X POST https://clip-agent.com/v1 \
  -H "Authorization: Bearer ca_live_xxx" \
  -F "file=@logo.png"

Response

{
  "logo_url": "https://...supabase.co/storage/v1/object/public/brand-logos/..."
}

Render clips

POST /functions/v1/api

Render selected clips with caption style and branding. No extra credit cost.

{
  "action": "render",
  "job_id": "a1b2c3d4-...",
  "clip_ids": ["clip-1", "clip-3"],
  "options": {
    "caption_style": "hormozi",          // see "Caption styles"
    "caption_position": "middle",        // top | middle | bottom
    "words_per_line": 3,                 // optional override for karaoke styles
    "smart_crop": true,                  // AI face/speaker tracking (default)
    "video_bitrate": "8M",               // ffmpeg -b:v
    "audio_bitrate": "192k",             // ffmpeg -b:a
    "brand": {
      "mode": "logo",
      "logo_url": "https://example.com/my-logo.png",
      "position": { "x": 0.5, "y": 0.08 },
      "opacity": 0.5,
      "scale": 0.15
    }
  }
}

Brand settings are per-render — you can use a different logo, handle, or style for every render request. This is useful if you manage multiple channels or clients.

Response (202)

{
  "render_id": "r1e2n3d4-...",
  "status": "rendering"
}
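Validating options client-side catches typos before they cost you a request. A sketch of a payload builder for the render action; build_render_payload is a hypothetical helper, and the boldPop and bottom defaults are taken from the MCP tool table on the assumption that the REST API uses the same ones.

```python
CAPTION_STYLES = frozenset({
    "hormozi", "viral", "beast", "ember", "podcast",          # karaoke
    "boldPop", "minimal", "impact", "boxed", "neonGlow",      # static
    "outline", "retro", "gradient", "typewriter", "cinematic",
})
CAPTION_POSITIONS = frozenset({"top", "middle", "bottom"})


def build_render_payload(job_id, clip_ids, *, caption_style="boldPop",
                         caption_position="bottom", words_per_line=None,
                         smart_crop=True, brand=None) -> dict:
    """Build the JSON body for a render request, rejecting invalid options."""
    if not clip_ids:
        raise ValueError("clip_ids must not be empty")
    if caption_style not in CAPTION_STYLES:
        raise ValueError(f"unknown caption_style: {caption_style!r}")
    if caption_position not in CAPTION_POSITIONS:
        raise ValueError(f"unknown caption_position: {caption_position!r}")
    options = {
        "caption_style": caption_style,
        "caption_position": caption_position,
        "smart_crop": smart_crop,
    }
    if words_per_line is not None:
        if not 1 <= words_per_line <= 10:
            raise ValueError("words_per_line must be between 1 and 10")
        options["words_per_line"] = words_per_line
    if brand is not None:
        options["brand"] = brand  # passed through unchanged
    return {"action": "render", "job_id": job_id,
            "clip_ids": list(clip_ids), "options": options}
```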

Render status

POST /functions/v1/api

{
  "action": "render_status",
  "render_id": "r1e2n3d4-..."
}

Response when completed

{
  "status": "completed",
  "clips": [
    {
      "clip_id": "clip-1",
      "download_url": "https://...supabase.co/storage/v1/...",
      "expires_at": "2026-04-02T12:00:00Z"
    }
  ]
}

Download URLs expire after 24 hours.
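If you store download links, check expires_at before handing them out. A sketch, assuming expires_at is always an ISO 8601 UTC timestamp shaped like the example above; the 5-minute safety margin and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_expiry(expires_at: str) -> datetime:
    """Parse a timestamp like 2026-04-02T12:00:00Z into an aware datetime."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    return datetime.fromisoformat(expires_at.replace("Z", "+00:00"))


def is_expired(expires_at: str, now: Optional[datetime] = None,
               margin: timedelta = timedelta(minutes=5)) -> bool:
    """True if the link has expired or will expire within the margin."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= parse_expiry(expires_at) - margin
```

When a link has expired, calling render_status again may return a fresh one, but this page does not confirm that, so treat it as something to verify.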

Caption styles

Pass one of these 15 values as options.caption_style. The first 5 use word-by-word karaoke highlighting (best for short-form virality); the last 10 are static (whole chunk on screen at once).

Karaoke styles (per-word highlight)

Value	Look	Words/chunk	Multi-speaker colors
hormozi	Yellow Arial Black, UPPERCASE, classic Alex Hormozi style	3	yellow / cyan / pink
viral	Hot pink Helvetica Neue with fade-in	4	pink / lime / orange
beast	Green Arial Black on semi-box, UPPERCASE, MrBeast vibe	3	green / yellow / cyan
ember	Orange Helvetica Neue with subtle dark outline	4	orange / purple / gold
podcast	Blue Helvetica Neue, calm/clean	5	blue / teal / coral

When the source has multiple speakers and you use a karaoke style, ClipAgent automatically color-codes captions per speaker.

Static styles (whole chunk visible)

Value	Look
boldPop	White bold Helvetica Neue on semi-transparent black box
minimal	Small light Helvetica Neue, gentle subtle look
impact	Yellow Arial Black, heavy weight, UPPERCASE, slam effect
boxed	White text on opaque black box
neonGlow	White with blue/cyan glow outline
outline	White with thick white outline (no fill stroke)
retro	Red Futura on dark box, UPPERCASE, retro/grindhouse
gradient	Cream Avenir Next with orange edge — soft gradient look
typewriter	Green monospace Menlo on dark box, terminal aesthetic
cinematic	Serif Georgia, soft fade-in, gray outline — film-like

Caption position

Pass options.caption_position with one of:

Value	Description
bottom	Default. Captions sit at the bottom of the 9:16 frame.
middle	Centered vertically. Common for face-on-camera podcasts.
top	Top of frame.

Words per line

Override the per-style chunking with options.words_per_line (integer, 1–10). Only affects karaoke styles — static styles always show the whole sentence.

Defaults per preset:

hormozi, beast — 3 words
viral, ember — 4 words
podcast — 5 words
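To preview how a transcript line splits under each preset, here is a naive fixed-size chunker. It is only an illustration of the defaults above: ClipAgent's real chunking may also consider punctuation or word timing, which this sketch ignores. chunk_words is a hypothetical name.

```python
DEFAULT_WORDS_PER_LINE = {
    "hormozi": 3, "beast": 3,
    "viral": 4, "ember": 4,
    "podcast": 5,
}


def chunk_words(text: str, style: str, words_per_line=None) -> list[str]:
    """Split text into caption chunks of at most N words each."""
    n = words_per_line if words_per_line is not None else DEFAULT_WORDS_PER_LINE[style]
    if not 1 <= n <= 10:
        raise ValueError("words_per_line must be between 1 and 10")
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(0, len(words), n)]


line = "this changes everything about how we build"
hormozi_chunks = chunk_words(line, "hormozi")
# ["this changes everything", "about how we", "build"]
```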

Smart crop

ClipAgent uses MediaPipe face detection at 2 fps + audio voice activity detection to keep speakers centered in the 9:16 frame. For multi-speaker videos (interviews, podcasts), it also auto-cuts between speakers based on who is talking (mouth-aperture variance + audio gating).

Default: on. To disable, pass options.smart_crop: false — the renderer falls back to a simple center crop.

The pipeline includes:

Face detection at 2 fps via MediaPipe Face Mesh
Active-speaker detection for multi-face frames using inner-lip aperture variance + audio VAD gating
Smoothing — 7-sample mode vote on speaker ID, 2-second minimum hold before switching, deadband-based position lock to prevent jitter
Hard-cut transitions on speaker change (no slow pan)
Blur+fit fallback for no-face segments (text, diagrams) — full source frame fitted on a blurred background
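The smoothing step can be pictured with the sketch below: a 7-sample mode vote over raw per-frame speaker IDs plus a 2-second minimum hold (4 samples at 2 fps). This is not ClipAgent's implementation. It assumes a trailing window and leaves out the deadband position lock; smooth_speakers is a hypothetical name.

```python
from collections import Counter, deque


def smooth_speakers(raw_ids, fps=2.0, window=7, min_hold_s=2.0):
    """Return a stabilized speaker ID per sample.

    A switch happens only when the majority of the last `window` samples
    names a different speaker AND the current speaker has been held for at
    least `min_hold_s` seconds.
    """
    min_hold = int(min_hold_s * fps)  # minimum hold expressed in samples
    recent = deque(maxlen=window)     # trailing window (assumption)
    current, held, out = None, 0, []
    for speaker in raw_ids:
        recent.append(speaker)
        voted = Counter(recent).most_common(1)[0][0]  # mode of the window
        if current is None:
            current, held = voted, 0
        elif voted != current and held >= min_hold:
            current, held = voted, 0  # hard cut to the new speaker
        out.append(current)
        held += 1
    return out


flicker = smooth_speakers([0, 0, 0, 0, 1, 0, 0, 0])  # one stray detection
switch = smooth_speakers([0] * 4 + [1] * 8)          # a real speaker change
```

In the first call the single stray sample never wins the vote, so the output stays on speaker 0. In the second, the cut lands once speaker 1 holds a majority of the window.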

Branding options

The options.brand object controls watermark/branding on rendered clips.

Handle mode

Field	Type	Default	Description
mode	string	—	`"handle"`
handle	string	—	Your @handle text
position.x	float	0.5	Horizontal position (0.0–1.0)
position.y	float	0.08	Vertical position (0.0–1.0)
opacity	float	0.5	Transparency (0.0–1.0)
font_size	int	18	Font size in points

Logo mode

Pass logo_url directly in the render request. You can use any publicly accessible image URL. Alternatively, use upload_logo to host one on our servers — if logo_url is omitted, the saved profile logo is used as fallback.

Field	Type	Default	Description
mode	string	—	`"logo"`
logo_url	string	profile logo	URL to a PNG/JPG logo image
position.x	float	0.5	Horizontal position (0.0–1.0)
position.y	float	0.08	Vertical position (0.0–1.0)
opacity	float	0.5	Transparency (0.0–1.0)
scale	float	0.15	Logo width relative to video (0.05–0.4)
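A sketch of a helper that builds the options.brand object and enforces the ranges from the two tables above. make_brand is a hypothetical name; the field names, defaults, and limits are the documented ones.

```python
def make_brand(mode, *, handle=None, logo_url=None, x=0.5, y=0.08,
               opacity=0.5, scale=0.15, font_size=18) -> dict:
    """Build an options.brand object for "handle" or "logo" mode."""
    if mode not in ("handle", "logo"):
        raise ValueError('mode must be "handle" or "logo"')
    for name, value in (("position.x", x), ("position.y", y), ("opacity", opacity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0.0 and 1.0")
    brand = {"mode": mode, "position": {"x": x, "y": y}, "opacity": opacity}
    if mode == "handle":
        if not handle:
            raise ValueError("handle mode needs handle text")
        brand["handle"] = handle
        brand["font_size"] = font_size
    else:
        if not 0.05 <= scale <= 0.4:
            raise ValueError("scale must be between 0.05 and 0.4")
        brand["scale"] = scale
        if logo_url is not None:
            brand["logo_url"] = logo_url  # omitted: saved profile logo is used
    return brand
```

Because brand settings are per-render, you can call this once per channel or client and pass the result as options.brand.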

Error codes

Status	Meaning
400	Bad request — missing or invalid parameters
401	Unauthorized — invalid or missing API key
402	Payment required — insufficient credits
404	Not found — job or render does not exist
405	Method not allowed — use POST
429	Rate limited — slow down
500	Server error
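One way to surface these codes in client code is a single exception type that carries the status. A sketch; ClipAgentError and check_status are hypothetical names, and treating 429 and 500 as retryable is an assumption rather than documented behavior.

```python
ERROR_MEANINGS = {
    400: "Bad request: missing or invalid parameters",
    401: "Unauthorized: invalid or missing API key",
    402: "Payment required: insufficient credits",
    404: "Not found: job or render does not exist",
    405: "Method not allowed: use POST",
    429: "Rate limited: slow down",
    500: "Server error",
}
RETRYABLE = frozenset({429, 500})  # assumption: only these are worth retrying


class ClipAgentError(Exception):
    """Raised for any HTTP status of 400 or above."""

    def __init__(self, status: int):
        self.status = status
        self.retryable = status in RETRYABLE
        super().__init__(f"{status}: {ERROR_MEANINGS.get(status, 'Unexpected status')}")


def check_status(status: int) -> None:
    """Raise ClipAgentError for error statuses; do nothing otherwise."""
    if status >= 400:
        raise ClipAgentError(status)
```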

Rate limits

10 requests per minute per API key. Polling status and render_status count toward this limit — use 10-second intervals.
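To stay under the limit when several jobs poll at once, you can gate every call through a client-side sliding-window limiter. A sketch, with RateLimiter as a hypothetical class; the clock and sleep functions are injectable so it can be tested without waiting.

```python
import time
from collections import deque


class RateLimiter:
    """Allow at most max_calls within any rolling window of period seconds."""

    def __init__(self, max_calls=10, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self._max_calls = max_calls
        self._period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of recent calls, oldest first

    def acquire(self) -> float:
        """Block until a call is allowed; return the seconds waited."""
        now = self._clock()
        # Forget calls that have left the window.
        while self._calls and now - self._calls[0] >= self._period:
            self._calls.popleft()
        waited = 0.0
        if len(self._calls) >= self._max_calls:
            # Wait until the oldest call falls out of the window.
            waited = self._period - (now - self._calls[0])
            self._sleep(waited)
            self._calls.popleft()
            now = self._clock()
        self._calls.append(now)
        return waited
```

Create one limiter per API key and call limiter.acquire() before every request, including status and render_status polls.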