AI

AWS deploys Cerebras CS-3 chips for fastest AI inference via Bedrock

Friday, March 13, 2026

Details

  • AWS and Cerebras announced a partnership to deploy Cerebras CS-3 chips in AWS data centers, accessible through Amazon Bedrock. The service launches in the coming months, with open-source LLMs and Amazon Nova models available later in 2026.
  • The solution combines AWS Trainium processors, Cerebras CS-3 systems, and Elastic Fabric Adapter (EFA) networking to deliver inference speeds claimed to be an order of magnitude faster than current alternatives.
  • The partnership introduces an inference disaggregation architecture: Trainium handles prefill (prompt processing) while CS-3 handles decode (output generation), so each stage runs on hardware suited to its distinct computational demands. Prefill is parallel and compute-intensive, while decode is serial and memory-bandwidth-intensive.
  • Cerebras CS-3 delivers thousands of times greater memory bandwidth than GPUs and is already used by OpenAI, Cognition, and Mistral for demanding workloads, including agentic coding. The partnership marks AWS's first offering of Cerebras's disaggregated solution. Anthropic and OpenAI have committed to AWS Trainium (consuming 2 gigawatts of capacity).
  • The announcement arrives alongside Amazon's $37 billion bond sale and $200 billion 2026 capital expenditure plan focused on AI infrastructure expansion, positioning AWS to compete directly with Nvidia-dominated GPU infrastructure on both speed and cost.
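
The prefill/decode split described above can be illustrated with a toy sketch. Nothing here reflects the actual AWS or Cerebras software: the function names, the `KVCache` class, and the placeholder tokens are all hypothetical. The intensity estimate uses a common rule of thumb (about 2 FLOPs per parameter per token, 2-byte weights, every weight read once per pass) and ignores KV-cache and activation traffic.

```python
from dataclasses import dataclass


@dataclass
class KVCache:
    """Hypothetical stand-in for the attention state handed from prefill to decode."""
    tokens: list[str]


def flops_per_weight_byte(tokens_per_pass: int, bytes_per_param: int = 2) -> float:
    """Rough arithmetic intensity of one forward pass over the model weights.

    Assumes ~2 FLOPs per parameter per token and one read of every weight
    per pass; KV-cache and activation traffic are ignored.
    """
    return 2 * tokens_per_pass / bytes_per_param


def prefill(prompt: list[str]) -> KVCache:
    """Prefill stage: the whole prompt is processed in a single parallel pass."""
    return KVCache(tokens=list(prompt))


def decode(cache: KVCache, max_new_tokens: int) -> list[str]:
    """Decode stage: one token per step, each step extending the cache."""
    output = []
    for step in range(max_new_tokens):
        token = f"<tok{step}>"  # placeholder for a real model's next token
        cache.tokens.append(token)
        output.append(token)
    return output


prompt = ["word"] * 2048
cache = prefill(prompt)        # in the announced design, the compute-heavy stage
generated = decode(cache, 4)   # in the announced design, the bandwidth-heavy stage

prefill_intensity = flops_per_weight_byte(len(prompt))  # 2048 tokens share one weight read
decode_intensity = flops_per_weight_byte(1)             # one token per weight read
print(prefill_intensity, decode_intensity)
```

Under these assumptions, prefill performs roughly 2,048 FLOPs for every byte of weights it reads, while each decode step performs about 1. That gap is why decode speed tends to track memory bandwidth rather than raw compute, and why running the two stages on different hardware can make sense.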

Impact

This collaboration targets inference cost and speed, a primary bottleneck for deploying AI applications at scale. By pairing specialized silicon through disaggregation, AWS signals aggressive investment in alternatives to GPU-based infrastructure, aimed at enterprises evaluating inference solutions for chatbots, coding assistants, and AI agents. Exclusive availability through Bedrock strengthens AWS's position against rivals in managed AI services. It also validates Cerebras's technology, coming after Cerebras secured over $10 billion in commitments from OpenAI.

Rift Dispatch: practical systems & stories, weekly