Details
- IBM released Granite 4.0 1B Speech, a compact speech-language model with half the parameters of Granite Speech 3.3-2B; it tops the OpenASR leaderboard for English transcription accuracy.
- Part of IBM's Granite Speech collection; supports automatic speech recognition (ASR) and bidirectional speech translation across English, French, German, Spanish, Portuguese, and newly added Japanese.
- Offers faster inference via speculative decoding and keyword-list biasing for names and acronyms; released under Apache 2.0 with Transformers and vLLM support; pairs with Granite Guardian for risk detection.
- Improves on its predecessor with higher accuracy, expanded language coverage, and a hybrid Mamba-2/transformer architecture that enables 70% lower memory use and 2x faster inference on long contexts of up to 128K tokens.
- Ranks #1 on the OpenASR leaderboard, with word error rate (WER) competitive with larger models on benchmarks; belongs to the first open-source model family certified under ISO 42001 for responsible AI; suited to edge devices and enterprise workloads such as healthcare chatbots.
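
For readers unfamiliar with WER, the metric behind the leaderboard ranking, here is a minimal sketch of how it is commonly computed: the word-level edit distance (substitutions, insertions, deletions) between a reference transcript and a model's output, divided by the number of reference words. The function name and sample sentences are illustrative assumptions, and real evaluations typically normalize text (casing, punctuation) first, which this sketch skips.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    reference = "the cat sat on the mat"
    hypothesis = "the cat sat on mat"
    print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

Lower is better: a WER of 0 means the transcript matches the reference word for word, and values above 1 are possible when the output contains many extra words.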
Impact
Granite 4.0 1B Speech advances on-device AI by pairing leaderboard-topping speech accuracy with a much smaller footprint, making enterprise adoption on constrained hardware more practical. It challenges larger models from competitors such as OpenAI and Google in multilingual tasks while emphasizing safety through ISO 42001 certification. This positions IBM as a leader in compact, hybrid architectures for speech AI in RAG, agents, and edge deployments.