AMD Announces the Instinct MI100 GPU, CDNA Breaks 10 TFLOPS Barrier
Quote:
[Image: 3WRpESWycgppGERfcq9MiM-1024-80.jpg.webp]

The Instinct MI100 is an FP64 Monster

AMD announced its 7nm Instinct MI100 GPU today, along with a slew of design wins from the likes of Dell, HPE, and Supermicro. The Instinct MI100 marks the first iteration of AMD's compute-focused CDNA GPU architecture. The new architecture offers up to 11.5 TFLOPS of peak FP64 throughput, making the Instinct MI100 the first GPU to break 10 TFLOPS in FP64 and marking a 3X improvement over the previous-gen MI50. It also boasts a peak throughput of 23.1 TFLOPS in FP32 workloads, beating Nvidia's beastly A100 GPU in both of those categories, though it lags in other numerical formats.

As expected from a data center GPU, the PCIe 4.0 card is designed for AI and HPC workloads. It also supports AMD's second-gen Infinity Fabric, which doubles the peer-to-peer (P2P) I/O bandwidth between cards. This fabric allows the cards to share a unified memory address space with CPUs, a key advantage for AMD as it leverages its position as the only CPU vendor that is currently shipping data center-class GPUs. The cards boast up to 340 GB/s of aggregate throughput over three Infinity Fabric links and are designed to be deployed into quad-GPU hives (up to two per server), with each hive supporting up to 552 GB/s of P2P I/O bandwidth.
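The two bandwidth figures above can be reproduced with simple arithmetic. The sketch below is one possible reading, not AMD's stated method: the per-link rate of 92 GB/s and the 64 GB/s PCIe 4.0 x16 figure are assumptions that do not appear in the article, and a hive is treated as four fully connected cards.

```python
# Back-of-the-envelope check of the Infinity Fabric figures quoted above.
# LINK_GBS and PCIE4_X16_GBS are assumptions, not values from the article.

LINK_GBS = 92         # assumed peak bandwidth of one Infinity Fabric link (GB/s)
PCIE4_X16_GBS = 64    # assumed PCIe 4.0 x16 bidirectional bandwidth (GB/s)
LINKS_PER_CARD = 3    # from the article
GPUS_PER_HIVE = 4     # a hive is taken here to mean four cards

# Per-card aggregate I/O: three fabric links plus the PCIe slot.
per_card_gbs = LINKS_PER_CARD * LINK_GBS + PCIE4_X16_GBS

# A fully connected group of n cards has n * (n - 1) / 2 point-to-point links.
hive_links = GPUS_PER_HIVE * (GPUS_PER_HIVE - 1) // 2
hive_gbs = hive_links * LINK_GBS

print(f"Per-card aggregate: {per_card_gbs} GB/s")
print(f"Hive P2P ({hive_links} links): {hive_gbs} GB/s")
```

Under these assumptions the 340 GB/s figure includes the PCIe link as well as the three fabric links, and the 552 GB/s figure counts six fabric links.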

The Instinct MI100 also supports AMD's new Matrix Core technology, which boosts performance in single- and mixed-precision matrix operations such as FP32, FP16, bfloat16, INT8, and INT4. Matrix Core raises peak FP32 matrix throughput to 46.1 TFLOPS.

The cards come with 32GB of HBM2 memory spread across four stacks that provide up to 1.23 TB/s of aggregate bandwidth. AMD claims the cards offer 1.8X to 2.1X more peak performance per dollar than Nvidia's A100 GPUs.
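As a rough sanity check on the 1.23 TB/s figure, peak HBM2 bandwidth is the total bus width multiplied by the per-pin data rate. In this sketch the four stacks come from the article, while the 1024-bit interface per stack and the 2.4 Gbps pin rate are assumptions chosen to match the quoted number.

```python
# Peak memory bandwidth = stacks * bus bits per stack * per-pin rate / 8.
# Only STACKS comes from the article; the other two values are assumptions.

STACKS = 4                  # from the article
BUS_BITS_PER_STACK = 1024   # assumed HBM2 interface width per stack
PIN_RATE_GBPS = 2.4         # assumed per-pin data rate in Gbit/s


def hbm_bandwidth_gbs(stacks: int, bits_per_stack: int, pin_rate_gbps: float) -> float:
    """Return theoretical peak bandwidth in GB/s (divide by 8 for bits to bytes)."""
    return stacks * bits_per_stack * pin_rate_gbps / 8


total_gbs = hbm_bandwidth_gbs(STACKS, BUS_BITS_PER_STACK, PIN_RATE_GBPS)
print(f"Peak bandwidth: {total_gbs:.1f} GB/s ({total_gbs / 1000:.2f} TB/s)")
```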

AMD also announced that its open source ROCm 4.0 developer software now has an open source compiler and unified support for OpenMP 5.0, HIP, PyTorch, and TensorFlow.

The card has a 300W TDP and comes in the standard PCIe Add-In Card (AIC) form factor with two eight-pin connectors for power. Given the data center focus, the card lacks display outputs, and the passively-cooled card has a rear I/O shield with a large mesh for efficient airflow.

AMD dialed back the MI100's peak clock rate to 1,502 MHz, down from 1,725 MHz with the previous-gen MI50, but doubled the number of compute units to 120. The company also improved memory bandwidth to 1.23 TB/s. The net effect of the improvements to the CDNA architecture (which we'll cover below) is a 1.74X gain in peak FP64 and FP32 throughput, a whopping 3.46X improvement in matrix FP32, and a 6.97X gain in matrix FP16. The matrix gains come courtesy of AMD's new Matrix Core technology, which enhances the CUs with new Matrix Core Engines optimized for mixed data types.
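The headline throughput numbers follow from the clock rates and CU counts above. This is a minimal sketch, not AMD's published derivation: the 64 FP32 lanes per compute unit and the half-rate FP64 are assumptions not stated in the article, and the MI50's 60 CUs are inferred from the statement that the MI100 doubled the count to 120.

```python
# Theoretical peak = CUs * lanes per CU * 2 ops per FMA * clock.
# Clock rates and the MI100 CU count come from the article; LANES_PER_CU
# and the FP64 half-rate are assumptions.

LANES_PER_CU = 64     # assumed FP32 lanes (stream processors) per compute unit
FLOPS_PER_FMA = 2     # a fused multiply-add counts as two floating-point ops


def peak_tflops(compute_units: int, clock_mhz: float, rate: float = 1.0) -> float:
    """Return theoretical peak in TFLOPS; `rate` scales relative to vector FP32."""
    flops = compute_units * LANES_PER_CU * FLOPS_PER_FMA * clock_mhz * 1e6 * rate
    return flops / 1e12


mi100_fp32 = peak_tflops(120, 1502)
mi100_fp64 = peak_tflops(120, 1502, rate=0.5)   # assumed half-rate FP64
mi50_fp32 = peak_tflops(60, 1725)               # 60 CUs inferred, not quoted

print(f"MI100 FP32: {mi100_fp32:.1f} TFLOPS")
print(f"MI100 FP64: {mi100_fp64:.1f} TFLOPS")
print(f"MI100 vs MI50 (FP32): {mi100_fp32 / mi50_fp32:.2f}X")
```

With these inputs the sketch lands on the article's 23.1 and 11.5 TFLOPS figures and the 1.74X generational gain, which shows that doubling the CU count more than offsets the lower clock.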

AMD's MI100 beats the Nvidia A100 in peak FP64 and FP32 throughput by ~15%, but Nvidia's A100 still offers far superior throughput in matrix FP32, FP16, INT4/INT8, and bfloat16 workloads.

AMD touts that the MI100 rivals the 6-megawatt ASCI White, the world's fastest supercomputer in 2000, which weighed 106 tons and provided 12.3 TFLOPS of performance. In contrast, the MI100 brings power down to 300W, weighs only 2.56 pounds, and dishes out 11.5 TFLOPS of performance.
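To put that comparison in numbers, the sketch below computes performance per watt from the figures quoted above. It treats both TFLOPS values as directly comparable peaks, which is a simplifying assumption.

```python
# Performance per watt, using only the figures quoted in the article.

ASCI_WHITE_TFLOPS = 12.3
ASCI_WHITE_WATTS = 6_000_000   # 6 megawatts
MI100_TFLOPS = 11.5
MI100_WATTS = 300


def gflops_per_watt(tflops: float, watts: float) -> float:
    """Convert TFLOPS to GFLOPS and divide by power draw."""
    return tflops * 1000 / watts


asci_eff = gflops_per_watt(ASCI_WHITE_TFLOPS, ASCI_WHITE_WATTS)
mi100_eff = gflops_per_watt(MI100_TFLOPS, MI100_WATTS)
efficiency_ratio = mi100_eff / asci_eff

print(f"ASCI White: {asci_eff:.5f} GFLOPS/W")
print(f"MI100:      {mi100_eff:.1f} GFLOPS/W")
print(f"Improvement: roughly {efficiency_ratio:,.0f}x")
```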
...