Jamba 1.5: The New 256K Context Benchmark in AI Speed
Discover AI21's Jamba 1.5 models: unmatched speed, 256K context length, and hybrid architecture for superior efficiency and quality. Ideal for enterprise AI applications.
AI21 has launched its new series of open models, Jamba 1.5, which includes Jamba 1.5 Mini and Jamba 1.5 Large.
They offer unmatched speed, efficiency, and quality, along with the longest context window of any open model.
Jamba uses a hybrid Transformer-Mamba mixture-of-experts (MoE) architecture, delivering high throughput and low memory usage across context lengths while matching or exceeding the quality of pure Transformer models.
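To make that layout concrete, here is a minimal PyTorch sketch of how such a hybrid stack can be assembled: mostly Mamba-style sequence mixers, an occasional attention layer, and MoE feed-forward layers interleaved with dense ones. The exact Jamba 1.5 hyperparameters are not reproduced here, and `SimpleMambaMixer` is a stand-in, not a faithful Mamba implementation.

```python
# Sketch of a hybrid Transformer-Mamba MoE block layout (illustrative only).
import torch
import torch.nn as nn


class SimpleMambaMixer(nn.Module):
    """Placeholder sequence mixer standing in for a real Mamba (SSM) layer."""
    def __init__(self, d_model: int):
        super().__init__()
        self.proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # A real Mamba layer runs a selective state-space recurrence; here we
        # only keep the interface (batch, seq, d_model) -> same shape.
        return self.proj(x)


class MoEFFN(nn.Module):
    """Top-k routed mixture-of-experts feed-forward layer."""
    def __init__(self, d_model: int, n_experts: int = 16, top_k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.router(x)                          # (batch, seq, experts)
        weights, idx = scores.topk(self.top_k, dim=-1)   # route each token
        weights = weights.softmax(dim=-1)
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out


class HybridBlock(nn.Module):
    """Mostly Mamba layers with an occasional attention layer, and MoE
    feed-forward layers interleaved with dense ones (an assumed 1:7 attention
    ratio and MoE every second layer, following the published Jamba design)."""
    def __init__(self, d_model: int = 512, n_layers: int = 8,
                 attn_every: int = 8, moe_every: int = 2):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(n_layers):
            mixer = (nn.MultiheadAttention(d_model, num_heads=8, batch_first=True)
                     if i % attn_every == attn_every - 1 else SimpleMambaMixer(d_model))
            ffn = (MoEFFN(d_model) if i % moe_every == 1 else
                   nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                 nn.Linear(4 * d_model, d_model)))
            self.layers.append(nn.ModuleDict({"mixer": mixer, "ffn": ffn,
                                              "norm1": nn.LayerNorm(d_model),
                                              "norm2": nn.LayerNorm(d_model)}))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            h = layer["norm1"](x)
            if isinstance(layer["mixer"], nn.MultiheadAttention):
                h, _ = layer["mixer"](h, h, h, need_weights=False)
            else:
                h = layer["mixer"](h)
            x = x + h
            x = x + layer["ffn"](layer["norm2"](x))
        return x


if __name__ == "__main__":
    block = HybridBlock()
    tokens = torch.randn(2, 16, 512)   # (batch, sequence, d_model)
    print(block(tokens).shape)         # torch.Size([2, 16, 512])
```

The memory advantage comes from the Mamba layers: unlike attention, they carry a fixed-size state instead of a KV cache that grows with context length.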
Both models are fine-tuned for a range of conversational and instruction-following tasks and support a 256K-token context length, the longest among open-weight models.
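For readers who want to try the models, a minimal sketch with Hugging Face transformers might look like the following. The repository ID `ai21labs/AI21-Jamba-1.5-Mini` is assumed here, so verify it on the Hub before running; the model also requires a recent transformers release with Jamba support and substantial GPU memory.

```python
# Minimal chat sketch for Jamba 1.5 Mini via Hugging Face transformers.
# The model ID below is an assumption; check the AI21 Hub page before use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/AI21-Jamba-1.5-Mini"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user",
             "content": "Summarize the Jamba 1.5 architecture in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```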
To support cost-effective inference, Jamba 1.5 introduces ExpertsInt8, a novel quantization technique that allows Jamba 1.5 Large to run on a single machine with eight 80GB GPUs while processing 256K tokens, without any loss in quality.
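The core idea behind ExpertsInt8, storing the MoE expert weights (the bulk of the parameters) in INT8 and dequantizing them just before the matmul, can be sketched as follows. This is an illustrative NumPy approximation under that assumption, not the production implementation, which reportedly performs dequantization inside a fused MoE kernel.

```python
# Illustrative sketch of int8 expert-weight quantization with per-column scales.
import numpy as np

def quantize_expert(w: np.ndarray):
    """Symmetric per-output-channel int8 quantization of an expert weight matrix."""
    scale = np.abs(w).max(axis=0) / 127.0          # one scale per output column
    w_q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return w_q, scale.astype(np.float16)

def expert_forward(x: np.ndarray, w_q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Dequantize at compute time, then apply the expert's linear projection."""
    w = w_q.astype(np.float32) * scale.astype(np.float32)   # dequantize
    return x @ w

# Example: one expert's up-projection, stored at a quarter of its fp32 footprint.
rng = np.random.default_rng(0)
w = rng.standard_normal((1024, 4096)).astype(np.float32)
x = rng.standard_normal((4, 1024)).astype(np.float32)

w_q, scale = quantize_expert(w)
err = np.abs(expert_forward(x, w_q, scale) - x @ w).mean()
print(f"int8 bytes: {w_q.nbytes}, fp32 bytes: {w.nbytes}, mean abs error: {err:.4f}")
```

Because the expert weights dominate the parameter count in an MoE model, shrinking just those tensors is what brings the 256K-token deployment down to a single 8-GPU machine.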
Jamba models excel across academic and chatbot benchmarks, delivering high throughput and outperforming other open-weight models on long-context evaluations.