21 September 21, 07:11
Quote:
It's all about the (EPYC) bandwidth.
Netflix has been serving up to 200 Gbps of TLS-encrypted video from a single server since 2020. The company now aims to double that to 400 Gbps. During his presentation at the EuroBSDCon 2021 conference (via HardwareLuxx), Andrew Gallatin, Senior Software Engineer at Netflix, detailed the challenges of pushing the bandwidth envelope on its FreeBSD-based servers.
Netflix turned to AMD's EPYC Rome processors to achieve its goal. The company equipped its server with the EPYC 7502P, which wields 32 Zen 2 cores with a 2.5 GHz base clock and 3.35 GHz boost clock. More importantly, the 32-core beast offers up to 128 PCIe 4.0 lanes, good for about 250 GBps of bandwidth or around 2 Tbps in networking units. Netflix paired the EPYC 7502P with 256GB of DDR4-3200 memory, with a total memory bandwidth of up to 150 GBps, or 1.2 Tbps in networking units.
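The "networking units" figures above are just a bytes-to-bits conversion, and the PCIe number can be roughly reproduced from the lane count. The sketch below is a back-of-the-envelope check, not anything from Gallatin's slides. It assumes the standard PCIe 4.0 signalling rate of 16 GT/s per lane with 128b/130b encoding, counts one direction only, and takes the 150 GBps memory figure from the article as given. The function names are made up for illustration.

```python
# Rough sanity check of the bandwidth figures quoted in the article.
# Assumptions (not from the article): PCIe 4.0 runs at 16 GT/s per lane
# with 128b/130b line encoding, and we count one direction only.

PCIE4_GT_PER_S = 16.0          # giga-transfers per second, per lane
PCIE4_ENCODING = 128 / 130     # usable fraction after 128b/130b encoding
BITS_PER_BYTE = 8


def pcie4_gbytes_per_s(lanes: int) -> float:
    """Approximate one-direction PCIe 4.0 bandwidth in GB/s for `lanes` lanes."""
    return lanes * PCIE4_GT_PER_S * PCIE4_ENCODING / BITS_PER_BYTE


def gbytes_to_gbits(gbytes_per_s: float) -> float:
    """Convert GB/s (storage-style units) to Gb/s (networking-style units)."""
    return gbytes_per_s * BITS_PER_BYTE


pcie_gbytes = pcie4_gbytes_per_s(128)        # all 128 lanes of the EPYC 7502P
pcie_gbits = gbytes_to_gbits(pcie_gbytes)
mem_gbits = gbytes_to_gbits(150)             # 150 GBps as quoted in the article

print(f"PCIe 4.0 x128: ~{pcie_gbytes:.0f} GB/s, ~{pcie_gbits / 1000:.1f} Tb/s")
print(f"Memory:        150 GB/s, ~{mem_gbits / 1000:.1f} Tb/s")
```

Both results land on the article's rounded numbers: roughly 250 GBps (about 2 Tbps) for PCIe and 1.2 Tbps for memory.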
For storage, Netflix's AMD-powered server utilizes 18 Western Digital WD SN720 2TB NVMe SSDs. It's also equipped with a pair of Nvidia's Mellanox ConnectX-6 Dx network adapters that communicate through a PCIe 4.0 x16 interface. Initially, Netflix was only getting 240 Gbps out of the server, primarily because of memory bandwidth limits.
Netflix experimented with different NUMA (Non-Uniform Memory Access) configurations to maximize the bandwidth. AMD's EPYC processors can be configured to expose 1, 2 or 4 NUMA nodes per socket, although the specific processor dictates which modes are available. The EPYC 7502P, which is the SKU used in Netflix's server, supports all three NUMA modes. According to Gallatin's slide, a single NUMA node configuration delivers up to 240 Gbps, while a setup with four NUMA nodes bumps the value up to 280 Gbps.
In an attempt to optimize performance and avoid hardware bottlenecks, Netflix tested offloading the TLS encryption to the Mellanox ConnectX-6 Dx adapters instead of handling it on the EPYC 7502P. With a bit of tinkering with the software and some firmware updates, Netflix managed to squeeze 190 Gbps out of each Mellanox ConnectX-6 Dx adapter, or 380 Gbps with two network adapters. Because the encryption no longer passes through the processor, the offload frees up CPU resources and cuts memory bandwidth usage in half. The results showed around 50% processor utilization with four NUMA nodes and around 60% without NUMA.
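To put the reported throughput figures side by side, here is a small sketch that works out the relative gains and how close the offload result comes to the 400 Gbps goal. It uses only the numbers quoted above. The helper names are hypothetical, and treating the 240, 280 and 380 Gbps figures as directly comparable steps is an assumption, since the article does not spell out every test condition.

```python
# Compare the throughput figures reported in the article (all in Gb/s).
# Helper names are illustrative only; the figures come from the text above.

TARGET_GBPS = 400            # Netflix's stated goal
SINGLE_NUMA_GBPS = 240       # one NUMA node, software TLS
FOUR_NUMA_GBPS = 280         # four NUMA nodes, software TLS
PER_ADAPTER_OFFLOAD = 190    # per ConnectX-6 Dx with TLS offload
ADAPTERS = 2


def percent_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100


offload_total = PER_ADAPTER_OFFLOAD * ADAPTERS
share_of_target = offload_total / TARGET_GBPS * 100

print(f"NIC offload total:    {offload_total} Gb/s")
print(f"Share of 400G target: {share_of_target:.0f}%")
print(f"Gain from 4 NUMA nodes: {percent_gain(SINGLE_NUMA_GBPS, FOUR_NUMA_GBPS):.1f}%")
print(f"Gain from TLS offload:  {percent_gain(FOUR_NUMA_GBPS, offload_total):.1f}%")
```

On these numbers, the NUMA tuning is worth roughly a sixth more throughput, while moving TLS to the adapters adds about another third and brings the server to 95% of the target.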
...