Amazon-backed Anthropic has optimized its Claude AI models to run on AWS Trainium2, Amazon's most advanced AI chip, delivering efficiency gains that advance AI-assisted tasks.
Faster and more efficient
Running Claude 3.5 Haiku on Trainium2 enables latency-optimized inference in Amazon Bedrock, improving the model's speed and efficiency without compromising accuracy, according to Anthropic's blog post.
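For developers, latency-optimized inference is requested per call through Bedrock's Converse API. The sketch below is a minimal, unofficial illustration using boto3; the model ID string and region are assumptions to verify against current AWS documentation.

```python
# Hedged sketch: requesting latency-optimized inference for Claude 3.5 Haiku
# via the Amazon Bedrock Converse API. Model ID and region are assumptions.
request = {
    "modelId": "us.anthropic.claude-3-5-haiku-20241022-v1:0",  # assumed ID
    "messages": [
        {"role": "user", "content": [{"text": "Suggest a name for a caching helper."}]}
    ],
    # "optimized" opts into the latency-optimized (Trainium2-backed) serving path
    "performanceConfig": {"latency": "optimized"},
}

# With AWS credentials configured, the call would look like:
# import boto3
# client = boto3.client("bedrock-runtime", region_name="us-east-2")
# response = client.converse(**request)
# print(response["output"]["message"]["content"][0]["text"])
```

The `performanceConfig` field is what distinguishes this from a standard invocation; everything else follows the usual Converse request shape.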
Project Rainier
The two companies plan to build Project Rainier, an EC2 UltraCluster of Trn2 UltraServers containing hundreds of thousands of Trainium2 chips. It is expected to deliver more than five times the compute power used to train the current generation of leading AI models.
Boost to efficiency
Claude 3.5 Haiku has received a significant efficiency upgrade, running 60% faster at inference on Trainium2. This latency optimization makes it one of the most efficient models for use cases such as code completion, real-time content moderation, and chatbots.
Model distillation and knowledge transfer
A related offering, model distillation, enables Claude 3 Haiku, Anthropic's most cost-effective model, to approach Claude 3.5 Sonnet's accuracy on specific tasks while remaining at the same low price.
The technique transfers knowledge from a teacher model (Claude 3.5 Sonnet) to a student (Claude 3 Haiku), letting users run tasks such as retrieval-augmented generation (RAG) and data analysis at a fraction of the cost.
In sum, the partnership now offers:
- Claude 3.5 Haiku with latency optimization, powered by Trainium2, for general use cases.
- Claude 3 Haiku, distilled with frontier performance, for high-volume, repetitive use cases.