The focus of this new AI accelerator is inference— the production deployment of AI models in applications. Its architecture combines high compute performance with a newly designed memory system and a ...