ClipAgent API

Turn long videos into viral short-form clips programmatically. Upload a video file, get AI-analyzed clip candidates with virality scores, then render them with captions and branding.

Base URL: https://clip-agent.com/v1

All requests use POST with a JSON body containing an action field.

Authentication

Every request requires an API key in the Authorization header:

Authorization: Bearer ca_live_xxxxxxxxxxxxxxxxx

Generate API keys from your Dashboard.
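
As a rough illustration of the two rules above (POST with a JSON action body, Bearer key in the Authorization header), a request could be assembled as follows. This is a minimal sketch: the helper name build_request is hypothetical, and it assumes actions are sent to the Base URL listed above, as the curl example in this document does.

```python
import json
import urllib.request

# Assumption: every action is POSTed to the documented Base URL.
API_URL = "https://clip-agent.com/v1"


def build_request(api_key: str, action: str, **fields) -> urllib.request.Request:
    """Build (but do not send) a POST request for one API action."""
    body = json.dumps({"action": action, **fields}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Nothing is sent here; the request object is only constructed.
req = build_request("ca_live_your_api_key", "status", job_id="a1b2c3d4-...")
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would perform the call.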

Credits

Each video analysis costs 1 credit ($2). Rendering is included — no extra charge.

Buy credits from the Dashboard. Your balance is shown there along with usage history.

Claude (MCP connector)

ClipAgent ships an MCP server at https://mcp.clip-agent.com so you can use the entire API inside any Claude client without writing code. Claude calls the tools, ClipAgent does the work, and the rendered MP4s land back in the chat.

The connector implements the Model Context Protocol (2025-06-18) streamable-HTTP transport with full OAuth 2.1 authentication. Same pricing as the REST API: $2 per video, no subscription, no per-clip surcharge.

Endpoint: https://mcp.clip-agent.com
Auth: OAuth 2.1 (one-click sign-in) or Bearer API key
Tools: analyze_video, get_job_status, render_clips, get_render_status

Add to Claude.ai (web)

  1. Open claude.ai/settings/connectors (requires Claude Pro or Max).
  2. Click Add custom connector.
  3. Name: ClipAgent
    URL: https://mcp.clip-agent.com
  4. Click Connect. A ClipAgent sign-in page opens — sign in with your account and click Authorize.
  5. ClipAgent now appears in the tools menu of every new chat.

Add to Claude Desktop (Mac & Windows)

  1. Open Claude Desktop → Settings → Connectors.
  2. Click Add custom connector.
  3. Name: ClipAgent
    URL: https://mcp.clip-agent.com
  4. Click Connect. Your browser opens the ClipAgent sign-in screen. Sign in and click Authorize; the popup closes and you are back in Claude.

Add to Claude Code (CLI)

Two options. OAuth is best for interactive use, API key for automation/CI.

OAuth (interactive):

claude mcp add --transport http clipagent \
  https://mcp.clip-agent.com

The OAuth flow opens in your browser, then writes the token to ~/.claude. No further setup.

API key (headless):

# Generate a key at https://clip-agent.com/dashboard, then:
claude mcp add --transport http clipagent \
  https://mcp.clip-agent.com \
  --header "Authorization: Bearer ca_live_your_api_key"

Tools reference

Four tools. Claude chains them automatically — you describe what you want in plain English and Claude figures out the order.

analyze_video

Submit a video for AI viral clip detection. Pass a publicly downloadable URL (Drive share link, S3 URL, Loom export, etc.). The MCP server fetches it server-side and uploads it to the pipeline on your behalf. Costs 1 credit ($2). Returns a job_id immediately; transcription and AI ranking then take 1–3 minutes for a typical 10-minute video.

Field | Type | Required | Description
video_url | string | yes | Direct download URL of the source video (max 3 GB)
filename | string | no | Display name for the project (defaults to URL basename)

Returns: { job_id, status: "processing", credits_remaining }
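
Claude builds these tool calls for you, but a custom MCP client would send them as JSON-RPC 2.0 tools/call requests, the generic shape the Model Context Protocol defines for invoking a tool. The sketch below only serializes such a request; the helper name tool_call is hypothetical, and session initialization and authentication are left out.

```python
import json


def tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Arguments follow the analyze_video field table above.
payload = tool_call(1, "analyze_video", {
    "video_url": "https://my-bucket.s3.amazonaws.com/interview.mp4",
    "filename": "interview.mp4",
})
print(payload)
```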

get_job_status

Poll until status is "completed", then read the ranked clips array. Each clip has a clip_id, virality_score (1–10), title_hook, start_ms/end_ms, and why_it_works reasoning.

Field | Type | Required | Description
job_id | string | yes | The job_id returned by analyze_video

render_clips

Render selected clips as 9:16 vertical MP4s with burned-in captions and optional branding. No extra credit cost — included in the analyze charge. Returns a render_id for polling.

Field | Type | Required | Description
job_id | string | yes | The original analyze_video job_id
clip_ids | string[] | yes | Array of clip_id strings from get_job_status
caption_style | string | no | One of 15 styles. See caption styles. Default: boldPop.
caption_position | string | no | top, middle, or bottom. Default: bottom.
words_per_line | int | no | Override how many words appear per chunk (1–10).
smart_crop | bool | no | AI face/speaker tracking + multi-speaker auto-cuts. Default: true.
brand_mode | string | no | none, text, or logo.
brand_text | string | no | Handle text when brand_mode is text.

get_render_status

Poll until status is "completed", then read the clips array — each entry has a download_url (signed for 24 hours) for the rendered MP4.

Example prompts

Drop any of these into a Claude chat after connecting. Claude will chain the tools automatically.

Single-speaker podcast → top 5 viral clips, hormozi style:

Use ClipAgent to make 5 viral clips of this video.
Use the hormozi caption style with captions in the middle.
https://drive.google.com/uc?id=VIDEO_ID&export=download

Two-host interview → render top 3 with speaker tracking:

Clip the 3 most viral moments from this interview using ClipAgent.
Use the viral caption style — it color-codes captions per speaker
and the smart crop should auto-cut between hosts when they switch.
https://my-bucket.s3.amazonaws.com/interview.mp4

Tutorial → highlight reel with your @handle:

Pull the 3 most quotable bits of this talk via ClipAgent.
Render them with the boldPop style and overlay the handle @yourname.
https://my-cdn.com/talks/yc-demo-day.mp4

Bulk repurposing — multiple URLs in one message:

Here are 5 video URLs. For each, use ClipAgent to find the top
moment and render it with the beast caption style. Give me all the
download links when done.

https://my-bucket.s3.amazonaws.com/video1.mp4
https://my-bucket.s3.amazonaws.com/video2.mp4
https://my-bucket.s3.amazonaws.com/video3.mp4
https://my-bucket.s3.amazonaws.com/video4.mp4
https://my-bucket.s3.amazonaws.com/video5.mp4

Custom workflow — pick clips by your own criteria:

Analyze this video with ClipAgent and show me all the candidates
with their virality scores and reasoning. I'll pick which ones to render.
https://drive.google.com/uc?id=VIDEO_ID&export=download

OAuth flow (under the hood)

When a user clicks Connect in Claude.ai or Claude Desktop, here's what happens:

  1. Claude POSTs to https://mcp.clip-agent.com without a token. The server returns 401 with WWW-Authenticate: Bearer realm="ClipAgent", resource_metadata=....
  2. Claude reads /.well-known/oauth-protected-resource to find the authorization server.
  3. Claude reads /.well-known/oauth-authorization-server to find the authorize, token, and registration endpoints.
  4. Claude POSTs to /oauth/register (RFC 7591 Dynamic Client Registration) and gets back a client_id.
  5. Claude opens /oauth/authorize?client_id=... in a popup. The user signs in with their ClipAgent account and clicks Authorize.
  6. ClipAgent issues a one-time authorization code and redirects back to Claude with ?code=....
  7. Claude POSTs the code to /oauth/token with a PKCE code_verifier. The server returns an access_token (1h) and refresh_token (30d).
  8. Claude calls every subsequent tool with Authorization: Bearer ca_at_.... The MCP server resolves the OAuth token to an internal API key under the user's account and proxies to /v1.
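
Step 7 relies on PKCE. For readers writing their own client, the verifier and challenge pair can be derived as sketched below, following RFC 7636. The S256 challenge method is an assumption here, since the flow above does not name the method the server expects, and the function names are illustrative.

```python
import base64
import hashlib
import secrets


def b64url(raw: bytes) -> str:
    """Base64url-encode without '=' padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def new_code_verifier() -> str:
    """Random 43-character verifier derived from 32 bytes of entropy."""
    return b64url(secrets.token_bytes(32))


def code_challenge(verifier: str) -> str:
    """S256 challenge: BASE64URL(SHA256(ASCII(verifier)))."""
    return b64url(hashlib.sha256(verifier.encode("ascii")).digest())


verifier = new_code_verifier()
print(len(verifier), code_challenge(verifier))
```

The client sends the challenge with the authorize request and keeps the verifier private until the token exchange.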

To revoke Claude's access, visit your dashboard and revoke the API key named "Claude (via OAuth)".

Upload a video

POST /functions/v1/api

Request a short-lived signed upload URL, then PUT your file directly to Supabase Storage. The returned path is what you pass to analyze. Max 3 GB.

{
  "action": "create_upload",
  "filename": "talk.mp4"
}

Response

{
  "upload_url": "https://...supabase.co/storage/v1/object/...",
  "path": "<user_id>/<uuid>.mp4",
  "expires_in": 900
}

PUT your file to upload_url with the correct Content-Type (e.g., video/mp4), then call analyze (next section) with the returned path.
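
The PUT step might look like the sketch below. It assumes the signed URL accepts a plain PUT with Content-Type and Content-Length headers and needs no further authentication; the helper name build_upload_put is hypothetical. The file is streamed from disk so that a multi-gigabyte video is not read into memory.

```python
import mimetypes
import os
import urllib.request


def build_upload_put(upload_url: str, file_path: str):
    """Return (request, file_handle) for PUTting a local file to upload_url."""
    content_type = mimetypes.guess_type(file_path)[0] or "application/octet-stream"
    handle = open(file_path, "rb")  # caller is responsible for closing this
    request = urllib.request.Request(
        upload_url,
        data=handle,  # file objects are sent in blocks, not loaded whole
        method="PUT",
        headers={
            "Content-Type": content_type,
            "Content-Length": str(os.path.getsize(file_path)),
        },
    )
    return request, handle


# Usage (performs network I/O, so it is left commented out).
# `upload` stands for the parsed create_upload response:
# request, handle = build_upload_put(upload["upload_url"], "talk.mp4")
# with handle, urllib.request.urlopen(request) as response:
#     print(response.status)
```

Because upload_url expires after expires_in seconds (900 in the sample response), request it immediately before uploading.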

Analyze a video

POST /functions/v1/api

Submit an uploaded video for analysis. Costs 1 credit. Returns a job ID for polling.

{
  "action": "analyze",
  "video_path": "<path from create_upload>",
  "filename": "talk.mp4"
}

Response (202)

{
  "job_id": "a1b2c3d4-...",
  "status": "processing"
}

Check job status

POST /functions/v1/api

Poll until status is "completed" or "failed". Recommended interval: 10 seconds.

{
  "action": "status",
  "job_id": "a1b2c3d4-..."
}

Response when completed

{
  "status": "completed",
  "video_title": "The Future of AI",
  "duration_seconds": 1847,
  "clips": [
    {
      "clip_id": "clip-1",
      "start_ms": 12400,
      "end_ms": 72800,
      "title_hook": "This changes everything",
      "virality_score": 9.2,
      "why_it_works": "Strong contrarian opinion in first 3 seconds...",
      "audience_type": "tech enthusiasts",
      "tags": ["hook", "controversial_take"],
      "duration_seconds": 60
    }
  ]
}
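
A polling loop for this action could be sketched as follows. The status fetcher is passed in as a function so the loop itself stays independent of any HTTP library. The 10-second interval comes from the recommendation above; the 30-minute timeout and the name wait_for_job are assumptions for illustration.

```python
import time
from typing import Callable


def wait_for_job(fetch_status: Callable[[], dict],
                 interval_s: float = 10.0,
                 timeout_s: float = 1800.0,  # assumed upper bound, not documented
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll fetch_status() until the job reports completed or failed."""
    waited = 0.0
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":
            return result
        if status == "failed":
            raise RuntimeError(f"job failed: {result}")
        if waited >= timeout_s:
            raise TimeoutError(f"job still {status!r} after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s


# Demo with canned responses and a no-op sleep instead of real requests.
responses = iter([{"status": "processing"}, {"status": "completed", "clips": []}])
done = wait_for_job(lambda: next(responses), sleep=lambda seconds: None)
print(done["status"])  # completed
```

In real use, fetch_status would POST the status action and return the parsed JSON response.
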

Upload a logo

POST /functions/v1/api

Upload a logo image (PNG or JPG, max 2 MB). Each account has one active logo; uploading a new one replaces the previous one.

curl -X POST https://clip-agent.com/v1 \
  -H "Authorization: Bearer ca_live_xxx" \
  -F "file=@logo.png"

Response

{
  "logo_url": "https://...supabase.co/storage/v1/object/public/brand-logos/..."
}

Render clips

POST /functions/v1/api

Render selected clips with caption style and branding. No extra credit cost.

{
  "action": "render",
  "job_id": "a1b2c3d4-...",
  "clip_ids": ["clip-1", "clip-3"],
  "options": {
    "caption_style": "hormozi",          // see "Caption styles"
    "caption_position": "middle",        // top | middle | bottom
    "words_per_line": 3,                 // optional override for karaoke styles
    "smart_crop": true,                  // AI face/speaker tracking (default)
    "video_bitrate": "8M",               // ffmpeg -b:v
    "audio_bitrate": "192k",             // ffmpeg -b:a
    "brand": {
      "mode": "logo",
      "logo_url": "https://example.com/my-logo.png",
      "position": { "x": 0.5, "y": 0.08 },
      "opacity": 0.5,
      "scale": 0.15
    }
  }
}

Brand settings are per-render — you can use a different logo, handle, or style for every render request. This is useful if you manage multiple channels or clients.

Response (202)

{
  "render_id": "r1e2n3d4-...",
  "status": "rendering"
}
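
To go from a completed analysis to a render request, one option is to rank the clips by virality_score and send the top few. The sketch below uses hypothetical helper names, and the scores for clip-2 and clip-3 are made-up sample data. The generated body is plain JSON without the explanatory comments shown in the request example above.

```python
import json


def top_clip_ids(clips: list, n: int) -> list:
    """Return the clip_ids of the n highest-scoring clips."""
    ranked = sorted(clips, key=lambda clip: clip["virality_score"], reverse=True)
    return [clip["clip_id"] for clip in ranked[:n]]


def render_payload(job_id: str, clip_ids: list, **options) -> str:
    """Serialize a render request body; options are passed through unchanged."""
    return json.dumps({
        "action": "render",
        "job_id": job_id,
        "clip_ids": clip_ids,
        "options": options,
    })


# Illustrative sample data, not real analysis output.
clips = [
    {"clip_id": "clip-1", "virality_score": 9.2},
    {"clip_id": "clip-2", "virality_score": 6.5},
    {"clip_id": "clip-3", "virality_score": 8.1},
]
body = render_payload("a1b2c3d4-...", top_clip_ids(clips, 2),
                      caption_style="hormozi", caption_position="middle")
print(body)
```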

Render status

POST /functions/v1/api

{
  "action": "render_status",
  "render_id": "r1e2n3d4-..."
}

Response when completed

{
  "status": "completed",
  "clips": [
    {
      "clip_id": "clip-1",
      "download_url": "https://...supabase.co/storage/v1/...",
      "expires_at": "2026-04-02T12:00:00Z"
    }
  ]
}

Download URLs expire after 24 hours.
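
Since the links are time-limited, a client may want to check expires_at before downloading and request render_status again if the link is stale. A minimal sketch, assuming expires_at is always an ISO 8601 UTC timestamp as in the sample response (function names are illustrative):

```python
from datetime import datetime, timezone
from typing import Optional


def parse_expires_at(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as 2026-04-02T12:00:00Z."""
    # fromisoformat() only accepts a trailing "Z" from Python 3.11 on,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def is_expired(clip: dict, now: Optional[datetime] = None) -> bool:
    """True when the clip's download_url should no longer be used."""
    now = now or datetime.now(timezone.utc)
    return now >= parse_expires_at(clip["expires_at"])


clip = {"clip_id": "clip-1", "expires_at": "2026-04-02T12:00:00Z"}
one_hour_before = datetime(2026, 4, 2, 11, 0, tzinfo=timezone.utc)
print(is_expired(clip, now=one_hour_before))  # False
```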

Caption styles

Pass one of these 15 values as options.caption_style. The first 5 use word-by-word karaoke highlighting (best for short-form virality); the last 10 are static (whole chunk on screen at once).

Karaoke styles (per-word highlight)

Value | Look | Words/chunk | Multi-speaker colors
hormozi | Yellow Arial Black, UPPERCASE, classic Alex Hormozi style | 3 | yellow / cyan / pink
viral | Hot pink Helvetica Neue with fade-in | 4 | pink / lime / orange
beast | Green Arial Black on semi-box, UPPERCASE, MrBeast vibe | 3 | green / yellow / cyan
ember | Orange Helvetica Neue with subtle dark outline | 4 | orange / purple / gold
podcast | Blue Helvetica Neue, calm/clean | 5 | blue / teal / coral

When the source has multiple speakers and you use a karaoke style, ClipAgent automatically color-codes captions per speaker.

Static styles (whole chunk visible)

Value | Look
boldPop | White bold Helvetica Neue on semi-transparent black box
minimal | Small light Helvetica Neue, gentle subtle look
impact | Yellow Arial Black, heavy weight, UPPERCASE, slam effect
boxed | White text on opaque black box
neonGlow | White with blue/cyan glow outline
outline | White with thick white outline (no fill stroke)
retro | Red Futura on dark box, UPPERCASE, retro/grindhouse
gradient | Cream Avenir Next with orange edge, soft gradient look
typewriter | Green monospace Menlo on dark box, terminal aesthetic
cinematic | Serif Georgia, soft fade-in, gray outline, film-like

Caption position

Pass options.caption_position with one of:

Value | Description
bottom | Default. Captions sit at the bottom of the 9:16 frame.
middle | Centered vertically. Common for face-on-camera podcasts.
top | Top of frame.

Words per line

Override the per-style chunking with options.words_per_line (integer, 1–10). Only affects karaoke styles — static styles always show the whole sentence.

Defaults per preset:

  • hormozi, beast — 3 words
  • viral, ember — 4 words
  • podcast — 5 words
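
To make the effect of words_per_line concrete, the sketch below splits a caption into chunks using the defaults listed above. It only illustrates the chunking rule and is not ClipAgent's renderer; the function name and the plain whitespace tokenization are assumptions.

```python
# Per-preset defaults from the list above (karaoke styles only).
KARAOKE_DEFAULTS = {"hormozi": 3, "beast": 3, "viral": 4, "ember": 4, "podcast": 5}


def chunk_caption(text: str, style: str, words_per_line=None) -> list:
    """Split caption text into on-screen chunks for the given style."""
    words = text.split()
    if style not in KARAOKE_DEFAULTS:
        # Static styles ignore words_per_line and show the text whole.
        return [" ".join(words)]
    size = KARAOKE_DEFAULTS[style] if words_per_line is None else words_per_line
    if not 1 <= size <= 10:
        raise ValueError("words_per_line must be between 1 and 10")
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


print(chunk_caption("this changes everything about how we build", "hormozi"))
# ['this changes everything', 'about how we', 'build']
```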

Smart crop

ClipAgent uses MediaPipe face detection at 2 fps + audio voice activity detection to keep speakers centered in the 9:16 frame. For multi-speaker videos (interviews, podcasts), it also auto-cuts between speakers based on who is talking (mouth-aperture variance + audio gating).

Default: on. To disable, pass options.smart_crop: false — the renderer falls back to a simple center crop.

The pipeline includes:

  • Face detection at 2 fps via MediaPipe Face Mesh
  • Active-speaker detection for multi-face frames using inner-lip aperture variance + audio VAD gating
  • Smoothing — 7-sample mode vote on speaker ID, 2-second minimum hold before switching, deadband-based position lock to prevent jitter
  • Hard-cut transitions on speaker change (no slow pan)
  • Blur+fit fallback for no-face segments (text, diagrams) — full source frame fitted on a blurred background
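
The smoothing step can be pictured with a small sketch: a 7-sample majority vote over per-frame speaker IDs, plus a 2-second hold (4 samples at 2 fps) before a switch is allowed. This is an illustrative reconstruction from the description above, not ClipAgent's implementation. The trailing window and the tie handling are assumptions, and the deadband position lock is not modeled.

```python
from collections import Counter, deque

FPS = 2                     # detection rate stated above
WINDOW = 7                  # samples in the mode vote
MIN_HOLD_SAMPLES = 2 * FPS  # 2-second minimum hold before switching


def smooth_speakers(raw_ids: list) -> list:
    """Return one smoothed speaker ID per input sample."""
    window = deque(maxlen=WINDOW)  # trailing window of raw detections
    current = None
    held = 0                       # samples the current speaker has been shown
    smoothed = []
    for speaker in raw_ids:
        window.append(speaker)
        voted = Counter(window).most_common(1)[0][0]
        if current is None:
            current, held = voted, 0
        elif voted != current and held >= MIN_HOLD_SAMPLES:
            current, held = voted, 0  # hard cut to the new speaker
        held += 1
        smoothed.append(current)
    return smoothed


raw = list("AAABAABBBBBB")  # a one-sample blip of B, then a real switch
print("".join(smooth_speakers(raw)))  # AAAAAAAABBBB
```

The lone B at the fourth sample never wins the vote, so the crop stays on A until B dominates the window.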

Branding options

The options.brand object controls watermark/branding on rendered clips.

Handle mode

Field | Type | Default | Description
mode | string | "handle" | -
handle | string | - | Your @handle text
position.x | float | 0.5 | Horizontal position (0.0–1.0)
position.y | float | 0.08 | Vertical position (0.0–1.0)
opacity | float | 0.5 | Transparency (0.0–1.0)
font_size | int | 18 | Font size in points

Logo mode

Pass logo_url directly in the render request. You can use any publicly accessible image URL. Alternatively, use upload_logo to host one on our servers — if logo_url is omitted, the saved profile logo is used as fallback.

Field | Type | Default | Description
mode | string | "logo" | -
logo_url | string | profile logo | URL to a PNG/JPG logo image
position.x | float | 0.5 | Horizontal position (0.0–1.0)
position.y | float | 0.08 | Vertical position (0.0–1.0)
opacity | float | 0.5 | Transparency (0.0–1.0)
scale | float | 0.15 | Logo width relative to video (0.05–0.4)
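
A client can catch out-of-range values before spending a request by checking the brand object against the two tables above. The sketch below fills in the documented defaults and validates the documented ranges. The function name is hypothetical, and it assumes "handle" and "logo" are the only modes the REST API accepts.

```python
def validate_brand(brand: dict) -> dict:
    """Return a copy of brand with defaults applied, or raise ValueError."""
    mode = brand.get("mode")
    if mode not in ("handle", "logo"):  # assumption: no other REST modes
        raise ValueError("mode must be 'handle' or 'logo'")
    merged = {"opacity": 0.5, **brand}
    merged["position"] = {"x": 0.5, "y": 0.08, **brand.get("position", {})}
    if mode == "handle":
        merged.setdefault("font_size", 18)
    else:
        merged.setdefault("scale", 0.15)
        if not 0.05 <= merged["scale"] <= 0.4:
            raise ValueError("scale must be within 0.05-0.4")
    checks = (
        ("position.x", merged["position"]["x"]),
        ("position.y", merged["position"]["y"]),
        ("opacity", merged["opacity"]),
    )
    for name, value in checks:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be within 0.0-1.0")
    return merged


print(validate_brand({"mode": "logo", "scale": 0.2}))
```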

Error codes

Status | Meaning
400 | Bad request: missing or invalid parameters
401 | Unauthorized: invalid or missing API key
402 | Payment required: insufficient credits
404 | Not found: job or render does not exist
405 | Method not allowed: use POST
429 | Rate limited: slow down
500 | Server error
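
In client code, the table above can be turned into a small lookup that also decides whether a retry makes sense. Treating 429 and 500 as retryable is an assumption of this sketch, not something the API documents, and the names are illustrative.

```python
ERROR_MEANINGS = {
    400: "Bad request: missing or invalid parameters",
    401: "Unauthorized: invalid or missing API key",
    402: "Payment required: insufficient credits",
    404: "Not found: job or render does not exist",
    405: "Method not allowed: use POST",
    429: "Rate limited: slow down",
    500: "Server error",
}

# Assumption: only these statuses are worth retrying unchanged.
RETRYABLE = {429, 500}


def explain(status: int) -> str:
    """Describe an error status and whether retrying is reasonable."""
    meaning = ERROR_MEANINGS.get(status, "Unexpected status")
    advice = "retry after a pause" if status in RETRYABLE else "do not retry unchanged"
    return f"{status} {meaning} ({advice})"


print(explain(402))
print(explain(429))
```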

Rate limits

10 requests per minute per API key. Polling status and render_status count toward this limit — use 10-second intervals.
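
One way to stay under the limit is to space requests evenly on the client side. The sketch below enforces a minimum gap of 6 seconds (60 s / 10 requests) between calls. Even spacing is stricter than the stated limit strictly requires, and the class name and injectable clock are illustrative choices.

```python
import time


class Throttle:
    """Space calls so that at most `limit` happen per `period` seconds."""

    def __init__(self, limit=10, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_gap = period / limit
        self.clock = clock
        self.sleep = sleep
        self.last = None  # time of the most recent allowed call

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self.clock()
        delay = 0.0
        if self.last is not None:
            delay = max(0.0, self.last + self.min_gap - now)
        if delay:
            self.sleep(delay)
        self.last = now + delay
        return delay


throttle = Throttle()
print(throttle.wait())  # 0.0, because the first call never waits
```

Call throttle.wait() before every request, including status and render_status polls, since those count toward the same limit.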