ROCm 6.3 adds several new features, including a Fortran compiler and SGLang
Quote: ROCm gets more abilities for enterprise customers to take advantage of
 
AMD has announced ROCm version 6.3, which brings a number of updates to the ROCm ecosystem. The latest iteration of the open-source driver stack adds SGLang, FlashAttention-2, and a Fortran compiler, among other features.

SGLang is a new runtime in ROCm 6.3 that purportedly improves latency, throughput, and resource utilization by optimizing "cutting-edge" generative AI models on AMD's in-house Instinct GPUs. It is said to achieve up to 6x higher performance on large language model inference, and it comes with pre-configured Docker containers that use Python to accelerate AI, multimodal workflows, and scalable cloud backends.

FlashAttention-2 is the next iteration of FlashAttention, an attention algorithm that reduces memory usage and compute demands in Transformer AI models. FlashAttention-2 purportedly delivers up to 3x speedups over version one on both forward and backward passes, shortening AI model training time.
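
To make the memory saving concrete, here is a minimal pure-Python sketch of the block-wise ("online") softmax idea that the FlashAttention family is built on. It is not AMD's kernel and not the FlashAttention-2 implementation; the function names, the single-query setup, and the `block_size` parameter are assumptions chosen for illustration. The tiled version walks through the keys and values one block at a time and keeps only a running maximum, a running sum, and an unnormalized accumulator, so the full row of attention scores is never held at once.

```python
import math


def naive_attention(q, keys, values):
    """Reference: softmax(q . K^T / sqrt(d)) . V with every score held in memory."""
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim_v = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(dim_v)]


def tiled_attention(q, keys, values, block_size=2):
    """Same result, computed block by block with a running max and running sum.

    Illustrative only: assumes a single query vector and non-empty keys.
    """
    d = len(q)
    dim_v = len(values[0])
    running_max = -math.inf
    running_sum = 0.0
    acc = [0.0] * dim_v  # unnormalized weighted sum of value vectors

    for start in range(0, len(keys), block_size):
        k_block = keys[start:start + block_size]
        v_block = values[start:start + block_size]
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in k_block]

        # Rescale what has been accumulated so far to the new maximum.
        new_max = max(running_max, max(scores))
        scale = math.exp(running_max - new_max)  # 0.0 on the first block
        running_sum *= scale
        acc = [a * scale for a in acc]

        for s, v in zip(scores, v_block):
            w = math.exp(s - new_max)
            running_sum += w
            for j in range(dim_v):
                acc[j] += w * v[j]
        running_max = new_max

    return [a / running_sum for a in acc]
```

Both functions should agree to within floating-point error for any block size; the real kernels apply the same rescaling trick to tiles of a full query matrix in on-chip memory.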

AMD has added a Fortran compiler to ROCm 6.3, enabling users to run legacy Fortran-based applications on AMD's modern Instinct GPUs. The compiler offers direct GPU offloading through OpenMP for scientific workloads, backward compatibility that lets developers keep writing Fortran code for existing legacy applications, and simplified integration with HIP kernels and ROCm libraries.

Multi-node FFT support enables high-performance distributed FFT computations in ROCm 6.3. This feature purportedly simplifies multi-node scaling, reducing complexity for developers and enabling seamless scalability across massive datasets.
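
For readers wondering how an FFT can be spread across machines at all, the following is a rough single-process sketch of one textbook decomposition (a Cooley-Tukey style split). It does not use or reflect the ROCm FFT API; `dft`, `split_fft`, and `num_workers` are hypothetical names, and the "workers" are simulated in a loop rather than running on separate nodes. Each worker transforms a strided slice of the input on its own, and a final combine step, which in a real cluster would involve inter-node communication, applies twiddle factors to merge the partial results.

```python
import cmath


def dft(x):
    """Plain O(n^2) discrete Fourier transform, used as the per-worker transform."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def split_fft(x, num_workers):
    """Compute the DFT of x by splitting the work across simulated workers.

    Worker r receives samples x[r], x[r + P], x[r + 2P], ... (P = num_workers),
    transforms them locally, and the partial spectra are then combined.
    """
    n = len(x)
    if n % num_workers != 0:
        raise ValueError("input length must be divisible by num_workers")
    m = n // num_workers

    # "Scatter" and local compute: one independent length-m transform per worker.
    partials = [dft(x[r::num_workers]) for r in range(num_workers)]

    # "Gather" and combine: X[k] = sum_r W_n^(r*k) * F_r[k mod m].
    return [
        sum(cmath.exp(-2j * cmath.pi * r * k / n) * partials[r][k % m]
            for r in range(num_workers))
        for k in range(n)
    ]
```

Calling `split_fft(signal, 4)` on a length-8 signal should match `dft(signal)` up to rounding. Production libraries use much faster local transforms and carefully scheduled data exchanges, which is the complexity the multi-node support is meant to hide.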

ROCm 6.3 introduces enhancements to the computer vision libraries rocDecode, rocJPEG, and rocAL, enabling support for the AV1 codec, GPU-accelerated JPEG decoding, and better audio augmentation.

ROCm is an open-source stack of software and drivers designed to run on AMD Instinct GPUs. The platform aims to provide features that enable or improve enterprise GPU-accelerated applications such as high-performance computing (HPC), AI/Machine Learning, communication, and more.
