What is Twelve Labs?

Twelve Labs is a video intelligence platform that uses multimodal AI to extract structured information from raw video. Its core models, Marengo (an encoder) and Pegasus (a video‑language model), enable time‑ and space‑aware analysis of text, speech, audio, and visual elements.

The Search API locates precise moments across large libraries using natural‑language queries. The Analyze API generates descriptive text, summaries, or captions on demand. The Embed API produces multimodal vector embeddings for semantic search, recommendation, and hybrid retrieval systems.
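To make the role of embeddings in semantic search concrete, the sketch below ranks videos against a query by cosine similarity. It is a minimal illustration and not the Twelve Labs API: the three-dimensional vectors are made-up placeholders, and `cosine_similarity` and `rank_videos` are hypothetical helper names. In practice the vectors would come from an embedding model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_videos(query_vec, video_vecs):
    """Sort (video_id, score) pairs from most to least similar to the query."""
    scored = [(vid, cosine_similarity(query_vec, vec)) for vid, vec in video_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder embeddings (assumption: invented values for illustration only).
library = {
    "cooking_demo": [0.9, 0.1, 0.0],
    "soccer_match": [0.0, 0.2, 0.9],
    "baking_tips": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # stand-in for an embedded text query

ranking = rank_videos(query, library)
for video_id, score in ranking:
    print(f"{video_id}: {score:.3f}")
```

The same ranking step could plausibly back a recommendation feature by using one video's vector as the query instead of a text query's vector.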

Twelve Labs pricing

Pricing model: Freemium

Text-in-video $0.07/min

Conversation $0.008/min

Logo $0.10/min

Visual $0.033/min

Verify on the official pricing page.
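As a rough worked example of the per-minute rates listed above, the sketch below estimates a cost for a given amount of footage. It assumes, without confirmation from the listing, that the rates for the selected options simply add together; the option keys and the `estimate_cost` helper are hypothetical names chosen for illustration.

```python
from decimal import Decimal

# Per-minute rates in USD as listed above; verify before relying on them.
RATES_PER_MIN = {
    "text_in_video": Decimal("0.07"),
    "conversation": Decimal("0.008"),
    "logo": Decimal("0.10"),
    "visual": Decimal("0.033"),
}


def estimate_cost(minutes, options):
    """Estimate cost in USD, assuming the selected per-minute rates are additive."""
    per_minute = sum((RATES_PER_MIN[name] for name in options), Decimal("0"))
    return per_minute * Decimal(str(minutes))


# 120 minutes of footage with the visual and conversation options.
cost = estimate_cost(120, ["visual", "conversation"])
print(f"${cost:.2f}")  # prints "$4.92"
```

`Decimal` is used instead of floats so that small per-minute rates do not accumulate binary rounding error.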

Twelve Labs key features

Search exact moments
Generate text summaries
Produce multimodal embeddings
Temporal and spatial reasoning
Fine-tune on custom data
Deploy in the cloud or on-premises

Twelve Labs use cases

Automatically generate searchable, time‑stamped captions and key moments for user‑generated video content, enabling instant navigation and boosting discoverability without manual tagging.
Implement a semantic recommendation engine that surfaces relevant ads and related videos by embedding video content into vector space, improving ad targeting and viewer engagement.
Detect and flag security‑sensitive events in surveillance footage in real time, providing alerts and searchable logs for law enforcement and corporate security teams.
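
For the captioning use case, time‑stamped text segments can be written out in a standard caption format. The sketch below renders segments as WebVTT; the `(start, end, text)` tuple shape and the sample captions are assumptions made for illustration and do not reflect the actual output of any Twelve Labs API.

```python
def format_timestamp(seconds):
    """Convert seconds to a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(segments):
    """Render (start_seconds, end_seconds, text) tuples as a WebVTT document."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line separates cues
    return "\n".join(lines)


# Hypothetical segments, as a captioning step might produce them.
segments = [
    (0.0, 4.5, "A chef slices onions."),
    (4.5, 75.25, "The onions are added to a hot pan."),
]
vtt = to_webvtt(segments)
print(vtt)
```

Because each cue carries its own start and end time, the same segments can also be indexed for text search to jump to a specific moment.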