Enfabrica Introduces ACF SuperNIC to Advance AI Networking Efficiency
MOUNTAIN VIEW, Calif. and ATLANTA, Nov. 19, 2024 -- Today at Supercomputing 2024 (SC24), Enfabrica Corporation announced the general availability of its 3.2 Terabit/sec (Tbps) Accelerated Compute Fabric (ACF) SuperNIC chip and pilot systems.
The ACF solution delivers multi-port 800 Gigabit Ethernet connectivity to GPU servers, with four times the bandwidth and multipath resiliency of any other GPU-attached network interface controller (NIC) product in the industry. The Enfabrica silicon will be available in initial quantities in calendar Q1 of 2025. This announcement solidifies Enfabrica’s position in the high-growth AI infrastructure industry and highlights its leadership role in the future of GPU compute networks.
The AI “SuperNIC” has emerged as a high-growth silicon product category that logically interconnects GPUs and accelerators across a high-performance scale-out network in an AI data center. Enfabrica is the first in the industry to build a ground-up SuperNIC chip delivering the highest performance, resiliency, and efficiency of data movement demanded by large-scale training, inference, and retrieval-augmented generation (RAG) workloads associated with frontier AI models.
“Today is a watershed moment for Enfabrica. We successfully closed a major Series C fundraise and our ACF SuperNIC silicon will be available for customer consumption and ramp in early 2025,” said Rochan Sankar, CEO of Enfabrica. “With a software and hardware co-design approach from day one, our purpose has been to build category-defining AI networking silicon that our customers love, to the delight of system architects and software engineers alike. These are the people responsible for designing, deploying and efficiently maintaining AI compute clusters at scale, and who will decide the future direction of AI infrastructure.”
ACF SuperNIC (ACF-S) Silicon Product Highlights
The Enfabrica ACF SuperNIC, with its high-radix, high-bandwidth, and concurrent PCIe/Ethernet multipathing and data mover capabilities, can uniquely scale up and scale out four to eight latest-generation GPUs per server system, bringing unprecedented levels of performance, scale, and resiliency to AI clusters. The ACF SuperNIC solution brings the benefits of full-stack operator control and programmability with software-defined networking (SDN) to the remote direct memory access (RDMA) networking widely deployed in AI data centers.
- With 800, 400, and 100 Gigabit Ethernet interfaces and a high radix of 32 network ports and 160 PCIe lanes on a single ACF-S chip, AI clusters of more than 500K GPUs can for the first time be built using a more efficient two-tier network design, enabling the highest scale-out throughput and lowest end-to-end latency across all GPUs in the cluster.
- The ACF-S software stack supports standard collective communication and RDMA networking operations through a consistent set of libraries compatible with existing interfaces. This offers substantial operational efficiency benefits to data center operators, who can now deploy a common, high-performance backend network fabric across an AI compute fleet composed of GPUs and accelerators from multiple vendors.
- Using Resilient Message Multipathing (RMM) technology, an Enfabrica innovation, the ACF-S solution boosts AI cluster resiliency, serviceability, and uptime at scale. By eliminating AI job stalls caused by network link flaps and failures, RMM improves effective training time and GPU compute efficiency without requiring changes to the AI software stack or network topology.
- Software-Defined RDMA Networking, unique to the ACF-S solution, enhances debuggability and democratizes the ability to customize and future-proof the transport layer in AI networking for optimized, cloud-scale network topologies without compromising performance.
- Using Collective Memory Zoning, the ACF-S solution delivers low-latency, zero-copy data transfers, greater host memory management efficiency and burst bandwidth, and higher system resiliency across multiple CPU, GPU, and CXL 2.0-based endpoints attached to the ACF-S chip, collectively improving the efficiency and overall Floating Point Operations per Second (FLOPS) utilization of GPU server fleets.
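The port and cluster-scale figures in the first highlight can be sanity-checked with a short back-of-the-envelope sketch. It is an illustration under stated assumptions, not Enfabrica's published sizing method: the fabric is taken to be a non-blocking two-tier leaf-spine network, each ACF-S port is assumed to attach to its own network plane (so one plane sees one port per chip), and the switch radix of 512 is a hypothetical value chosen for the example. The 3.2 Tbps total, the port speeds, and the four to eight GPUs per chip come from the announcement; the function names are invented for illustration.

```python
ACF_S_TOTAL_GBPS = 3200  # 3.2 Tbps per ACF-S chip, as stated in the announcement


def ports_at_speed(port_gbps: int) -> int:
    """Number of Ethernet ports of a given speed that fit in the chip's total bandwidth."""
    return ACF_S_TOTAL_GBPS // port_gbps


def two_tier_endpoints(switch_radix: int) -> int:
    """Endpoint ports in one non-blocking two-tier (leaf-spine) plane.

    Assumption: every leaf uses half its ports as downlinks and half as
    uplinks, and every spine port connects to a distinct leaf.
    """
    leaf_downlinks = switch_radix // 2
    max_leaves = switch_radix
    return leaf_downlinks * max_leaves


def max_gpus(switch_radix: int, gpus_per_chip: int) -> int:
    """GPUs reachable when each chip contributes one port to each plane.

    Assumption: the number of chips is capped by the endpoint ports of a
    single plane, and each chip serves `gpus_per_chip` GPUs.
    """
    return two_tier_endpoints(switch_radix) * gpus_per_chip


if __name__ == "__main__":
    for speed in (800, 400, 100):
        print(f"{speed} GbE ports per chip: {ports_at_speed(speed)}")
    # Hypothetical 512-port switches; 4 and 8 GPUs per chip per the announcement.
    for gpus in (4, 8):
        print(f"{gpus} GPUs per chip: {max_gpus(512, gpus)} GPUs")
```

Under these assumptions, 100 Gigabit Ethernet ports give the 32-port radix quoted above, and a hypothetical 512-port switch yields 131,072 chips per plane, or 524,288 GPUs at four GPUs per chip, which is in line with the "more than 500K GPUs" figure. A different switch radix or oversubscription ratio would change the result.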
Availability
The Enfabrica ACF SuperNIC silicon will be available in initial quantities in calendar Q1 of 2025. Both ACF SuperNIC chips and pilot systems are now commercially orderable from Enfabrica and select partners. For more information on Enfabrica’s ACF SuperNIC silicon, system and software, please visit enfabrica.net.
Ecosystem
Enfabrica’s ACF SuperNIC silicon is being designed into multiple server and networking OEM and ODM systems expected to be released in 2025, with first availability in GPU infrastructure-as-a-service clouds expected in the latter half of 2025. Enfabrica, already an actively contributing member of the Ultra Ethernet Consortium (UEC) and a member of its Technical Advisory Committee, is now also a Contributor-level member of the newly formed Ultra Accelerator Link (UALink) Consortium. Enfabrica is committed to developing products that deliver the industry’s most performant and efficient solutions for scale-out and scale-up networking of heterogeneous AI/GPU clusters. For more details on the company’s recent funding milestone propelling its product development and go-to-market, see Enfabrica’s funding announcement.
About Enfabrica
Enfabrica is a cutting-edge silicon and software company building disruptive networking solutions for parallel, heterogeneous, and accelerated computing infrastructure. As the inventors of the Accelerated Compute Fabric SuperNIC (ACF-S), Enfabrica’s groundbreaking chips, software stack design, and partner-enabled systems give customers the freedom to stitch the fabric of our AI-enabled future and scale GPU and accelerated compute clusters like no one has. Enfabrica is elevating networking for the age of GenAI by producing the world’s most advanced, performant and efficient solutions that interconnect compute, memory and network.
Source: Enfabrica