Meta showcases the hardware that will power recommendations for Facebook and Instagram — low-cost RISC-V cores and mainstream LPDDR5 memory are at the heart of its MTIA recommendation inference CPU

By dls

Sep 10, 2024 #Pro

Back in 2023, Meta unveiled its first-generation in-house AI inference accelerator, designed to power the ranking and recommendation models that are key components of Facebook and Instagram.

The Meta Training and Inference Accelerator (MTIA) chip, which can handle inference but not training, was updated in April, doubling the compute and memory bandwidth of the first-generation part.

At the Hot Chips symposium last month, Meta gave a presentation on its next-generation MTIA and admitted that using GPUs for recommendation engines is not without challenges. The social media giant noted that peak performance doesn’t always translate to effective performance, large deployments can be resource-intensive, and capacity constraints are exacerbated by the growing demand for Generative AI.

Mysterious memory expansion

Taking this into account, Meta’s development goals for the next generation of MTIA include improving performance per TCO and per watt compared to the previous generation, efficiently handling models across multiple Meta services, and enhancing developer efficiency to quickly achieve high-volume deployments.

Meta’s latest MTIA delivers a significant generation-on-generation performance boost: GEMM throughput increases 3.5x to 177 TFLOPS at BF16, hardware-based tensor quantization provides accuracy comparable to FP32, and optimized support for PyTorch Eager Mode enables job launch times under 1 microsecond and job replacement in less than 0.5 microseconds. Additionally, table batched embedding (TBE) optimization improves the download and prefetch times for embedding indices, achieving 2-3x faster run times compared to the previous generation.
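
To make the idea of tensor quantization concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is a generic textbook scheme and an assumption for illustration only; the presentation does not detail how Meta's hardware performs quantization, and the function names `quantize_int8` and `dequantize` are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Illustrative only; not a description of MTIA's hardware scheme.
    """
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the integers and the shared scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per element is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The small value 0.003 rounds to zero here, which shows why a quantization scheme has to choose its scale carefully if it is to stay close to FP32 accuracy.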

The MTIA chip, built on TSMC’s 5nm process, operates at 1.35 GHz with a gate count of 2.35 billion. It offers 354 TOPS (INT8) and 177 TFLOPS (FP16) of GEMM performance and uses 128GB of LPDDR5 memory with a bandwidth of 204.8GB/s, all within a 90-watt TDP.
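
As a rough worked example, the quoted figures can be combined into performance-per-watt numbers and a simple roofline estimate, which illustrates why peak performance does not always translate to effective performance. This is a back-of-the-envelope sketch, not a Meta benchmark: it assumes the roofline model applies directly to the LPDDR5 bandwidth and ignores any on-chip memory, and `attainable_tflops` is a hypothetical helper.

```python
# Figures quoted in the article for the next-generation MTIA.
INT8_TOPS = 354.0            # peak INT8 GEMM throughput, tera-ops/s
FP16_TFLOPS = 177.0          # peak FP16 GEMM throughput, tera-FLOPs/s
MEM_BANDWIDTH_GB_S = 204.8   # LPDDR5 bandwidth, GB/s
TDP_WATTS = 90.0

int8_tops_per_watt = INT8_TOPS / TDP_WATTS
fp16_tflops_per_watt = FP16_TFLOPS / TDP_WATTS

# Ridge point of a simple roofline: the arithmetic intensity (FLOPs per byte
# read from LPDDR5) above which a kernel becomes compute-bound.
ridge_flops_per_byte = (FP16_TFLOPS * 1e12) / (MEM_BANDWIDTH_GB_S * 1e9)


def attainable_tflops(flops_per_byte):
    """Roofline bound: min(peak compute, bandwidth * arithmetic intensity)."""
    bandwidth_bound = MEM_BANDWIDTH_GB_S * 1e9 * flops_per_byte / 1e12
    return min(FP16_TFLOPS, bandwidth_bound)


print(f"INT8 TOPS per watt:   {int8_tops_per_watt:.2f}")
print(f"FP16 TFLOPS per watt: {fp16_tflops_per_watt:.2f}")
print(f"Ridge point:          {ridge_flops_per_byte:.0f} FLOPs/byte")
print(f"At 10 FLOPs/byte:     {attainable_tflops(10):.3f} TFLOPS")
```

Under these assumptions the chip delivers roughly 3.9 INT8 TOPS per watt, and a kernel streaming from LPDDR5 would need about 864 FLOPs per byte to reach the 177 TFLOPS peak. A kernel at 10 FLOPs per byte would be capped near 2 TFLOPS by memory bandwidth.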

The Processing Elements are built on RISC-V cores, featuring both scalar and vector extensions, and Meta’s accelerator module includes dual CPUs. At Hot Chips 2024, ServeTheHome noticed a Memory Expansion linked to the PCIe switch and the CPUs. When asked if this was CXL, Meta rather coyly said, “it is an option to add memory in the chassis, but it is not being deployed currently.”
