Top AI Chip Vendors Powering the Cloud and Edge
Building on our earlier discussion of AI hardware, this article looks at the top AI chip vendors driving innovation in cloud and edge applications. As AI’s role expands from large-scale data centers to resource-constrained edge environments, the demand for specialized chipsets has never been greater. We’ll explore the major players in this market and examine their unique offerings.
Top AI Chip Vendors for the Cloud
NVIDIA
NVIDIA launched its latest AI chip, the B200 “Blackwell.” The Blackwell architecture packs 208 billion transistors and, according to NVIDIA, can run workloads such as large language model inference up to 30 times faster than its predecessor, the H100.
The chip is designed for generative AI applications, and NVIDIA claims it cuts the cost and energy consumption of running large models by up to 25 times compared to the H100.
Major customers like Amazon, Google, and Microsoft are anticipated to adopt this technology for their cloud services. The Blackwell platform was unveiled at the GTC 2024 conference in March 2024, with widespread availability expected in the fourth quarter of 2024.
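Under the hood, speedups like these come from low-precision tensor math (FP8, and new FP4 support in Blackwell’s case). As a rough, hardware-agnostic illustration of how frameworks tap that capability, here is a minimal PyTorch mixed-precision training step; the model, batch size, and loss are placeholders, and bfloat16 stands in for whatever low-precision format the target GPU supports.

```python
import torch

# Toy model and batch on the GPU; purely illustrative.
model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(32, 1024, device="cuda")

# Run the forward pass in bfloat16; gradients and the optimizer step
# remain in full precision, as is standard for mixed-precision training.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    loss = model(x).pow(2).mean()  # placeholder loss

loss.backward()
optimizer.step()
optimizer.zero_grad()
```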
Advanced Micro Devices (AMD)
AMD announced its new Ryzen AI 300 series and Ryzen 9000 series processors. The Ryzen 9000 series features up to 16 cores and is built on the latest Zen 5 architecture, promising an average 16% improvement in instructions per clock (IPC) over the previous generation.
The Ryzen AI 300 series includes models with integrated NPUs rated at 50 TOPS for on-device AI workloads. Both lines were unveiled at Computex in June 2024.
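To put the IPC figure in perspective: at equal clock speeds, single-thread throughput scales roughly with IPC, so a 16% IPC uplift translates directly into a 16% performance gain. A quick back-of-envelope check (the clock value is hypothetical, chosen purely for illustration):

```python
# Single-thread throughput ~ IPC x clock frequency.
baseline_ipc = 1.00              # previous generation (Zen 4), normalized
zen5_ipc = baseline_ipc * 1.16   # AMD's quoted average IPC uplift
clock_ghz = 5.0                  # hypothetical, assumed equal for both parts

zen4_perf = baseline_ipc * clock_ghz
zen5_perf = zen5_ipc * clock_ghz
print(f"relative gain: {zen5_perf / zen4_perf - 1:.0%}")  # -> 16%
```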
Intel
Intel launched its latest Xeon 6 processors and Gaudi 3 AI accelerators, designed to enhance performance in AI and high-performance computing (HPC) environments. The Xeon 6 processors feature performance cores (P-cores) that can double AI vision performance compared to previous generations, while the Gaudi 3 AI accelerators offer 20% more throughput for deep learning tasks.
The Xeon 6 processors are engineered for compute-intensive workloads with increased core counts and double the memory bandwidth, making them suitable for edge, data center, and cloud applications.
The Gaudi 3 AI accelerator is optimized for large-scale generative AI, equipped with 64 Tensor processor cores (TPCs) and eight matrix multiplication engines (MMEs) to accelerate deep neural network computations. It includes 128 GB of HBM2e memory for efficient training and inference.
Gaudi 3 was announced in April 2024 as part of Intel’s strategy to regain competitiveness in the AI chip market against rivals like NVIDIA and AMD. Intel is also collaborating with major OEMs to develop tailored systems that leverage these new technologies for effective AI deployments.
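For software, Gaudi presents itself to PyTorch as an additional device type. The sketch below shows the basic pattern, assuming Intel’s Gaudi software stack (the habana_frameworks package) is installed; module names and the need for an explicit mark_step() can vary between software releases.

```python
import torch
import habana_frameworks.torch.core as htcore  # part of the Gaudi stack

# Move a toy model and a batch of inputs to the Gaudi device ("hpu").
model = torch.nn.Linear(512, 512).to("hpu")
x = torch.randn(8, 512).to("hpu")

y = model(x)
htcore.mark_step()  # in lazy mode, flushes queued ops to the accelerator
print(y.shape)
```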
Qualcomm
Qualcomm introduced its new Snapdragon X Plus processor, featuring an 8-core Qualcomm Oryon CPU and a 45 TOPS NPU designed to deliver on-device AI capabilities. This chip is positioned as a more affordable option compared to the higher-end Snapdragon X Elite.
The Snapdragon X Plus boasts a maximum clock speed of 3.4 GHz and is manufactured using a 4nm process. It retains the same NPU capability as its more powerful counterparts, allowing it to handle AI-driven tasks efficiently without draining battery life.
The announcement was made at IFA 2024 in September 2024. The new processor aims to balance performance and affordability, catering to consumers looking for capable yet cost-effective computing solutions.
Qualcomm also announced enhancements to its Snapdragon mobile platforms, with upgraded Hexagon NPUs integrated into its latest mobile processors. These units are designed to accelerate on-device AI tasks and enable features like real-time translation and advanced photography enhancements.
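For developers, a common route onto Snapdragon NPUs is ONNX Runtime’s QNN execution provider. The sketch below shows the pattern; the model file, input name and shape, and the backend_path option are placeholders whose actual values depend on the model and the QNN SDK installation.

```python
import numpy as np
import onnxruntime as ort

# Prefer the NPU (QNN's HTP backend), fall back to CPU if unavailable.
session = ort.InferenceSession(
    "model.onnx",  # hypothetical model file
    providers=[
        ("QNNExecutionProvider", {"backend_path": "QnnHtp.dll"}),
        "CPUExecutionProvider",
    ],
)

x = np.zeros((1, 3, 224, 224), dtype=np.float32)  # dummy image-shaped input
outputs = session.run(None, {"input": x})
```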
Cerebras Systems
Cerebras Systems unveiled its latest AI chip, the Wafer Scale Engine 3 (WSE-3), the world’s largest semiconductor, designed specifically for generative AI applications. The WSE-3 features 4 trillion transistors and delivers 125 petaFLOPs of peak AI performance through 900,000 AI-optimized compute cores.
This chip is built on a 5nm process and is purpose-built for training the largest AI models, capable of handling up to 24 trillion parameters. The WSE-3 maintains the same power draw and price as its predecessor, the WSE-2, while doubling its performance.
The WSE-3 is integrated into the Cerebras CS-3 supercomputer, which can scale up to 2048 nodes, providing a total computing capacity of up to 256 exaFLOPs. The system also features 44 GB of on-chip SRAM, ensuring high-speed access to memory for all cores.
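Those two figures are mutually consistent, as a quick check confirms:

```python
# Cross-checking the quoted cluster-scale math from the article's numbers.
wse3_petaflops = 125     # peak AI performance per WSE-3
max_nodes = 2048         # maximum CS-3 cluster size
total_exaflops = wse3_petaflops * max_nodes / 1000
print(total_exaflops)    # -> 256.0, matching the quoted 256 exaFLOPs
```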
Cerebras aims to simplify the training workflow for AI models by requiring significantly less code than traditional GPU setups, up to 97% less. For example, a standard implementation of a GPT-3 sized model can be written in just 565 lines of code.
The WSE-3 was officially announced in March 2024 and is expected to play a crucial role in advancing AI research and applications across various sectors, including enterprise and government.
Top Edge AI Chip Vendors
Before we discuss the top Edge AI chip vendors, let’s first understand why this area is so important to embedUR. We are not just observers in this space; we have built strong relationships with most of these chip vendors.
These connections allow us to maximize the capabilities of their chipsets and help our clients get the most out of their Edge AI projects.
Now, let’s look at some of the top players in the Edge AI chip market.
Synaptics
Synaptics launched its Astra platform, which includes the SL-Series of embedded AI-native processors designed for IoT applications. The Astra platform aims to enhance AI capabilities at the edge, providing faster response times, improved privacy, and reduced reliance on cloud computing.
The SL-Series features 64-bit processors that pair quad-core Arm Cortex-A73 or Cortex-A55 CPUs with dedicated neural processing units (NPUs). These processors deliver over 240% more AI capability compared to similar embedded systems on chip (SoCs). Notable models include the SL1680, which offers 7.9 TOPS, and the SL1640, which provides 1.6 TOPS of AI performance.
The Astra platform is designed to fill the gap between low-performance microcontrollers and high-performance smartphone SoCs, making it suitable for edge inferencing and multi-stream video processing. The Astra SL-Series was announced in April 2024 as part of Synaptics’ strategy to meet the increasing demand for AI in various consumer and industrial applications.
Silicon Labs
Silicon Labs has launched its latest xG26 family of microcontrollers for ultra-low-power applications in IoT and edge AI environments. The xG26 series includes models equipped with a built-in Matrix Vector Processor (MVP) that enables efficient on-device AI processing.
These microcontrollers are optimized for battery-powered devices, supporting a variety of wireless protocols and offering significant enhancements in energy efficiency. The xG26 family is particularly well-suited for applications requiring real-time data processing, such as smart home devices and industrial automation.
The announcement was made in September 2024, with the xG26 series expected to play a crucial role in advancing the capabilities of IoT devices by enabling local AI processing without relying heavily on cloud resources.
NXP Semiconductors
NXP Semiconductors has released the i.MX 93 applications processor family, designed for efficient machine learning (ML) acceleration and advanced security in connected edge computing applications.
This family features dual-core Arm Cortex-A55 processors combined with an Arm Cortex-M33 real-time co-processor, utilizing NXP’s innovative Energy Flex architecture to maximize performance efficiency. The i.MX 93 includes an integrated Arm Ethos-U65 microNPU, which enhances machine learning capabilities, enabling more capable and energy-efficient AI applications.
Additionally, it offers advanced security through the EdgeLock secure enclave, a preconfigured, self-managed security subsystem for edge applications. This processor family is suitable for a range of use cases, including automotive, industrial, and IoT applications. The i.MX 93 was officially announced in March 2024.
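Targeting the Ethos-U65 typically means producing a fully int8-quantized TensorFlow Lite model and then compiling it with Arm’s Vela tool. The sketch below shows a standard post-training quantization flow; the Keras model and representative dataset are placeholders, and toolchain details may differ across NXP SDK versions.

```python
import numpy as np
import tensorflow as tf

# Placeholder model; in practice this would be your trained network.
model = tf.keras.Sequential([tf.keras.layers.Dense(10, input_shape=(64,))])

def representative_data():
    # Calibration samples drive the choice of int8 quantization ranges.
    for _ in range(100):
        yield [np.random.rand(1, 64).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

with open("model_int8.tflite", "wb") as f:
    f.write(converter.convert())
# Typically followed by: vela model_int8.tflite  (compiles for the Ethos-U NPU)
```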
Renesas Electronics
Renesas Electronics introduced the RZ/V2H high-end AI microprocessor (MPU), which features a proprietary dynamically reconfigurable processor AI accelerator (DRP-AI3).
The RZ/V2H combines four Arm Cortex-A55 cores running at 1.8 GHz with dual Cortex-R8 real-time processors operating at 800 MHz. It is designed to accelerate image processing and dynamic calculations, making it ideal for applications such as autonomous robots and machine vision in factory automation.
The RZ/V2H supports high-speed interfaces, including PCIe, USB 3.2, and Gigabit Ethernet, ensuring efficient data handling for advanced AI processing tasks while maintaining low power consumption. This microprocessor is particularly suited for edge AI applications, where real-time performance and energy efficiency are crucial.
Officially announced on February 29, 2024, the RZ/V2H is already in production, with availability expected for developers looking to implement sophisticated AI capabilities in their systems. The platform also includes an evaluation kit to facilitate software development and testing for various applications, including robotics and smart retail solutions.
Arm Holdings
Arm’s recent innovations in AI hardware include its Neoverse and Cortex-X series, designed to deliver high performance for both cloud and edge applications. The Neoverse N2 and V1 platforms are geared toward cloud data centers, but their low-power designs also make them attractive for edge AI applications.
The Cortex-A78 and Cortex-X1 processors represent Arm’s high-performance cores, designed for mobile and edge computing. These CPUs are optimized for running intensive AI workloads like natural language processing (NLP), object detection, and image recognition. Paired with Arm’s Mali-G78 GPU, which is also optimized for AI tasks, these processors enable real-time AI inferencing in smartphones and IoT devices.
Arm’s architecture is central to edge AI deployments thanks to its combination of high energy efficiency and performance. Many companies, including NVIDIA, Qualcomm, and Apple, rely on Arm’s designs as the backbone of their AI-enabled SoCs. This flexibility allows developers to scale AI solutions from the cloud down to the smallest IoT devices.
Arm is also driving forward with its Project Trillium initiative, which focuses on enhancing AI capabilities across a broad range of applications. This initiative includes the development of hardware IP for neural networks, computer vision, and machine learning, with the aim of delivering more efficient AI processing across different layers of edge and embedded systems.
Why Knowledge of AI Chip Varieties Is Essential
With all these advancements from top chip vendors, one might wonder why understanding the capabilities of each chip matters. The simple answer is flexibility: the ability to port pre-trained AI models across different chipsets and to adapt and optimize those models regardless of the hardware platform.
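In practice, that flexibility often comes down to exporting a model once to a portable format such as ONNX and letting the runtime choose the best backend available on the target chip. A minimal sketch follows; the model, file name, and provider list are illustrative.

```python
import torch
import onnxruntime as ort

# Export a toy model once to a portable ONNX file.
model = torch.nn.Linear(64, 10).eval()
dummy = torch.randn(1, 64)
torch.onnx.export(model, dummy, "portable.onnx",
                  input_names=["input"], output_names=["logits"])

# At load time, prefer whichever accelerator backend this machine offers.
preferred = ["QNNExecutionProvider",      # Qualcomm NPUs
             "VitisAIExecutionProvider",  # AMD Ryzen AI NPUs
             "CUDAExecutionProvider",     # NVIDIA GPUs
             "CPUExecutionProvider"]      # universal fallback
available = [p for p in preferred if p in ort.get_available_providers()]
session = ort.InferenceSession("portable.onnx", providers=available)
print("running on:", session.get_providers()[0])
```

The same exported file can then run on a Snapdragon NPU, a Ryzen AI NPU, an NVIDIA GPU, or a plain CPU without retraining.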
This ties directly into embedUR’s vision for ModelNova—a platform where pre-trained models, blueprints, and frameworks can be mixed and matched to work seamlessly across various hardware architectures. The ability to deploy AI on a range of chips is central to helping the AI community and driving forward innovation in the field.
embedUR maintains close partnerships with these chip vendors, which give us early access to their newest developments. This allows us to stay ahead of emerging technologies, share insights that benefit the broader AI and embedded systems community, and meet the growing demand for efficient AI solutions across industries.