Introduction
LattifAI SDK - Precision Alignment, Infinite Possibilities
Advanced forced alignment and subtitle generation powered by the Lattice-1 model.
Core Capabilities
| Feature | Description | Status |
|---|---|---|
| Forced Alignment | Precise word-level and segment-level synchronization with audio | Production |
| Multi-Model Transcription | Gemini (100+ languages), Parakeet (24 languages), SenseVoice (5 languages) | Production |
| Speaker Diarization | Automatic multi-speaker identification with label preservation | Production |
| Audio Preprocessing | Multi-channel selection, device optimization (CPU/CUDA/MPS) | Production |
| Streaming Mode | Process audio up to 20 hours long with a minimal memory footprint | Production |
| Smart Text Processing | Intelligent sentence splitting and non-speech element separation | Production |
| Universal Format Support | 20+ caption/subtitle formats (SRT, VTT, ASS, TTML, NLE formats) | Production |
| Configuration System | YAML-based configs for reproducible workflows | Production |
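
To make "word-level and segment-level synchronization" more concrete, here is a minimal sketch of how aligned timings can be modeled and rendered as an SRT cue. It is an illustration only: `Word`, `Segment`, `srt_timestamp`, and `srt_cue` are hypothetical names invented for this example, not part of the LattifAI API, and the timings are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    """One aligned word with start/end times in seconds (hypothetical type)."""
    text: str
    start: float
    end: float


@dataclass
class Segment:
    """A caption line built from aligned words (hypothetical type)."""
    words: List[Word]

    @property
    def start(self) -> float:
        # Segment-level start is the start of its first word.
        return self.words[0].start

    @property
    def end(self) -> float:
        # Segment-level end is the end of its last word.
        return self.words[-1].end

    @property
    def text(self) -> str:
        return " ".join(w.text for w in self.words)


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def srt_cue(index: int, segment: Segment) -> str:
    """Render one segment as a numbered SRT cue."""
    timing = f"{srt_timestamp(segment.start)} --> {srt_timestamp(segment.end)}"
    return f"{index}\n{timing}\n{segment.text}\n"


# Illustrative timings only; real values would come from an aligner.
segment = Segment([Word("Hello", 0.42, 0.80), Word("world", 0.95, 1.37)])
print(srt_cue(1, segment))
```

In this sketch, segment boundaries are derived from the first and last word, so word-level timings are the single source of truth and segment-level timings follow from them.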
Key Highlights
- Accuracy: High-precision forced alignment with the Lattice-1 model
- Multilingual: Support for 100+ languages via multiple transcription models
- Performance: Hardware-accelerated processing with streaming support
- Flexible: CLI, Python SDK, and Web UI interfaces
- Production-Ready: Battle-tested on diverse audio/video content
Quick Example
from lattifai import LattifAI

client = LattifAI()

# Align an existing subtitle file to the audio and write the re-timed captions.
caption = client.alignment(
    input_media="audio.wav",
    input_caption="subtitle.srt",
    output_caption_path="aligned.srt",
)

Getting Started
- Quick Start: Get up and running in 5 minutes
- Installation: Install the SDK with pip or uv
- Authentication: Get your API key