This is one of the “smartest” models you can fit on a 24GB GPU now, with no offloading and very little quantization loss. It feels big and insightful, like a better (albeit dry) Llama 3.3 70B with thinking, and with more STEM world knowledge than QwQ 32B, but it comfortably fits thanks to the new exl3 quantization!
You need to use a backend that supports exl3, like (at the moment) text-gen-web-ui or (soon) TabbyAPI. Once the quant is loaded, you can hit it from scripts through the backend's OpenAI-compatible API; see the sketch below.
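A minimal sketch of querying a locally served exl3 quant over the OpenAI-compatible endpoint that both backends expose. The port, model name, and API key here are assumptions (placeholders) — use whatever your backend prints on startup.

```python
# Sketch: chat with a locally loaded exl3 quant via an OpenAI-compatible API.
# Assumptions: backend listens on localhost:5000 (adjust to your config),
# "your-exl3-quant" is the folder name of the loaded model, and the API key
# is whatever your backend issued (some local setups don't require one).
import requests

BASE_URL = "http://localhost:5000/v1"   # assumed default port; change if needed
API_KEY = "your-api-key"                # placeholder

payload = {
    "model": "your-exl3-quant",          # hypothetical model name
    "messages": [
        {"role": "user", "content": "Summarize exl3 quantization in one paragraph."}
    ],
    "max_tokens": 512,
    "temperature": 0.7,
}

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```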
AFAIK ROCm isn’t yet supported: https://github.com/turboderp-org/exllamav3
I hope the word “yet” means that it might come at some point, but for now it doesn’t seem to be developed in any form or fashion.
There’s a “What’s missing” section there that lists ROCm, so I’m pretty sure it’s planned to be added.
That, and exl2 has ROCm support.
There was always the bugaboo of uttering a prayer to get ROCm flash attention working (come on, AMD…), but exl3 has plans to switch to flashinfer, which should eliminate that issue.