LattifAI SDK

Introduction

LattifAI SDK - Precision Alignment, Infinite Possibilities

Advanced forced alignment and subtitle generation powered by the Lattice-1 model.

Core Capabilities

Feature | Description | Status
--- | --- | ---
Forced Alignment | Precise word-level and segment-level synchronization with audio | Production
Multi-Model Transcription | Gemini (100+ languages), Parakeet (24 languages), SenseVoice (5 languages) | Production
Speaker Diarization | Automatic multi-speaker identification with label preservation | Production
Audio Preprocessing | Multi-channel selection, device optimization (CPU/CUDA/MPS) | Production
Streaming Mode | Process audio up to 20 hours with minimal memory footprint | Production
Smart Text Processing | Intelligent sentence splitting and non-speech element separation | Production
Universal Format Support | 20+ caption/subtitle formats (SRT, VTT, ASS, TTML, NLE formats) | Production
Configuration System | YAML-based configs for reproducible workflows | Production

Key Highlights

  • Accuracy: High-precision forced alignment with the Lattice-1 model
  • Multilingual: Support for 100+ languages via multiple transcription models
  • Performance: Hardware-accelerated processing with streaming support
  • Flexible: CLI, Python SDK, and Web UI interfaces
  • Production-Ready: Battle-tested on diverse audio/video content

Quick Example

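from lattifai import LattifAI

# Create the SDK client (no arguments are passed in this example).
client = LattifAI()

# Align an existing caption file against the audio and write the result.
caption = client.alignment(
    input_media="audio.wav",            # media to align against
    input_caption="subtitle.srt",       # captions to be aligned
    output_caption_path="aligned.srt",  # destination for the aligned captions
)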
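
Once an aligned SRT file such as aligned.srt exists, it can be useful to inspect the cue timings. The following is a minimal sketch that uses only the Python standard library and does not call the LattifAI SDK at all. The names Cue and parse_srt are hypothetical helpers introduced for illustration, and the parser assumes plain SubRip input (an optional index line, a "start --> end" timing line, then the cue text).

```python
import re
from dataclasses import dataclass
from typing import List

# SRT timestamps look like "00:01:02,345" (some tools write "." instead of ",").
_TIMESTAMP = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


@dataclass
class Cue:
    """One caption cue (hypothetical helper type, not part of the SDK)."""
    start: float  # seconds
    end: float    # seconds
    text: str


def _to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp string to seconds."""
    match = _TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(content: str) -> List[Cue]:
    """Parse SRT text into cues; blocks without a timing line are skipped."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        timing_index = next(
            (i for i, line in enumerate(lines) if "-->" in line), None
        )
        if timing_index is None:
            continue
        start, end = lines[timing_index].split("-->")
        # Drop any positioning hints that may follow the end timestamp.
        end = end.split()[0]
        text = "\n".join(lines[timing_index + 1:])
        cues.append(Cue(_to_seconds(start), _to_seconds(end), text))
    return cues
```

For example, parse_srt(open("aligned.srt", encoding="utf-8").read()) returns a list of Cue objects whose start and end values can be compared between the original and aligned files.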
