AMD unveils its first small language model, AMD-135M — AI performance enhanced by spe
Quote: General and coding-optimized models released.
 
As AMD flexes its muscles in the AI game, it is not only introducing new hardware but is betting on software too, trying to hit new market segments not already dominated by Nvidia. 

Thus, AMD has unveiled its first small language model, AMD-135M, which belongs to the Llama family and is aimed at private business deployments. It is unclear whether the new model has anything to do with the company's recent acquisition of Silo AI (the deal has yet to be finalized and cleared by various authorities, so probably not), but this is a clear step toward addressing the needs of specific customers with a model pre-trained by AMD and run on AMD hardware for inference.

The main reason AMD's models are fast is that they use so-called speculative decoding. Speculative decoding introduces a smaller 'draft model' that cheaply proposes several candidate tokens ahead of time. These tokens are then passed to a larger, more accurate 'target model', which verifies or corrects all of them in a single forward pass. On the one hand, this lets each target-model pass yield multiple tokens instead of one; on the other hand, it comes at the cost of power due to increased data transactions.
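The draft-and-verify loop described above can be sketched in Python. This is only a minimal greedy illustration under stated assumptions, not AMD's implementation: the names `speculative_decode`, `greedy_decode`, `toy_target` and `toy_draft` are hypothetical, the "models" are toy functions over integer tokens, and the verification step loops over positions where a real system would score all drafted positions in one batched forward pass of the target model.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]  # greedy "model": context -> next token


def greedy_decode(next_token: NextToken, prompt: List[Token], max_new_tokens: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        out.append(next_token(out))
    return out


def speculative_decode(target_next: NextToken, draft_next: NextToken,
                       prompt: List[Token], max_new_tokens: int, k: int = 4) -> List[Token]:
    """Greedy speculative decoding: draft k tokens, keep the prefix the target agrees with."""
    out = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(out) < limit:
        # 1. The small draft model proposes k tokens, one after another.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft_next(out + proposal))

        # 2. The target model checks every drafted position.
        #    (Assumption: a real implementation does this in ONE batched pass.)
        accepted: List[Token] = []
        for tok in proposal:
            expected = target_next(out + accepted)
            if tok == expected:
                accepted.append(tok)       # draft was right, keep it
            else:
                accepted.append(expected)  # draft was wrong, take the correction
                break
        else:
            # Every draft token was accepted; the target pass yields one extra token.
            accepted.append(target_next(out + accepted))

        out.extend(accepted)
    return out[:limit]


# Hypothetical toy models: the target counts modulo 10,
# the draft agrees except that it guesses wrong after a 4.
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10
```

Because every kept token is either confirmed or supplied by the target model, the greedy variant sketched here produces the same sequence as plain greedy decoding with the target alone; the gain comes from how many tokens each target pass settles.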

AMD's new release comes in two versions, AMD-Llama-135M and AMD-Llama-135M-code, each tuned for specific tasks and using speculative decoding to accelerate inference, a logical choice for an AI service built on a small language model. Both prevail in performance tests conducted by AMD itself.
  • The base model, AMD-Llama-135M, was trained from the ground up on 670 billion tokens of general data. This process took six days using four 8-way AMD Instinct MI250-based nodes (in AMD's nomenclature these are just 'four AMD MI250 nodes'). 
  • In addition, AMD-Llama-135M-code was fine-tuned with an extra 20 billion tokens specifically focused on coding, completing this task in four days using the same hardware.
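For a sense of scale, the token counts and durations quoted above can be turned into a rough average throughput. This is back-of-the-envelope arithmetic only: it assumes the quoted days are continuous wall-clock time and that the token counts are the totals processed, neither of which the article states, and the helper name `tokens_per_second` is made up for this sketch.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tokens_per_second(total_tokens: float, days: float, nodes: int = 1) -> float:
    """Average tokens processed per second, optionally divided across nodes."""
    return total_tokens / (days * SECONDS_PER_DAY * nodes)


# Figures taken from the article: 670 billion tokens in six days on four nodes,
# then 20 billion tokens in four days on the same hardware.
pretrain_total = tokens_per_second(670e9, days=6)
pretrain_per_node = tokens_per_second(670e9, days=6, nodes=4)
finetune_total = tokens_per_second(20e9, days=4)

print(f"pre-training: {pretrain_total:,.0f} tokens/s overall")
print(f"pre-training: {pretrain_per_node:,.0f} tokens/s per node")
print(f"fine-tuning:  {finetune_total:,.0f} tokens/s overall")
```

Under those assumptions, pre-training averages roughly 1.3 million tokens per second across the four nodes, while the fine-tuning run averages under 60,000 tokens per second.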
AMD believes that further optimizations can lead to even better performance. Yet, as the company shares benchmark numbers for its previous-generation GPUs, we can only imagine what its current-generation (MI300X) and next-generation (MI325X) GPUs could do.
...
Messages In This Thread
AMD unveils its first small language model, AMD-135M — AI performance enhanced by spe - by harlan4096 - 30 September 24, 14:55