We design processors for AI training and inference. We build AI systems to power, cool, and feed the processors data. We develop software to link these systems together into industry-leading supercomputers that are simple to use, even for the most complex AI work, through familiar ML frameworks like PyTorch. Customers use our supercomputers to train industry-leading models. We use these supercomputers to run inference at speeds unobtainable on alternative commercial technologies. We deliver these AI capabilities to our customers on premises and via the cloud.
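To make "familiar ML frameworks" concrete, the sketch below is an ordinary single-device PyTorch training loop, the style of code this programming model targets. It is a generic illustration, not Cerebras's actual API; the toy model, synthetic data, and hyperparameters are assumptions chosen only for demonstration.

```python
# A minimal, self-contained PyTorch training loop -- the "familiar framework"
# programming model referenced above. Everything here is standard PyTorch;
# no Cerebras-specific API is shown, and the model and data are illustrative.
import torch
import torch.nn as nn

# Toy model and synthetic data stand in for a real workload.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randn(64, 128)          # batch of 64 feature vectors
targets = torch.randint(0, 10, (64,))  # integer class labels

for step in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()   # single backward pass; no process groups,
    optimizer.step()  # device meshes, or sharding configuration
```

The contrast implied in the text is that a loop written in this single-device style is the intended user experience, whereas scaling the same workload across thousands of GPUs typically requires additional distributed-training setup (process groups, data or model sharding, and the like).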
AI compute comprises training and inference. For training, many of our customers have achieved more than 10 times faster time-to-solution compared to leading 8-way GPU systems of the same generation, and have produced their own state-of-the-art models. For inference, we deliver more than 10 times faster output generation speeds than GPU-based solutions from top CSPs, as benchmarked on leading open-source models. This speed enables real-time interactivity for AI applications and the development of smarter, more capable AI agents. It also enables faster development by eliminating the complex distributed compute work required to coordinate thousands of GPUs. Cerebras democratizes AI, enabling organizations with less in-house AI or distributed computing expertise to harness its full potential.