Microsoft researchers build 1-bit AI LLM with 2B parameters — model small enough to run on some CPUs

@[email protected] · 11 days ago

Microsoft researchers build 1-bit AI LLM with 2B parameters — model small enough to run on some CPUs

@[email protected] · 10 days ago

It’s trinary, and I understand why they instead say “1-bit,” but it still bugs me that they call it “1-bit.”

I’d love to see how low they can push this and still get spooky results. Something with ten million parameters could fit on a Macintosh Classic II - and if it ran at any speed worth calling interactive, it’d undercut a lot of loud complaints about energy use. Training takes a zillion watts. Using the model is like running a video game.
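
For anyone wondering where "1.58-bit" comes from and whether ten million parameters would really be small: a three-valued weight carries log2(3) bits of information. Here's a rough back-of-envelope sketch. The function name is made up for illustration, and it counts ideally packed weights only, ignoring activations, embeddings, and any packing overhead, so real files would be somewhat larger.

```python
import math

# Information content of one three-valued (-1, 0, +1) weight.
BITS_PER_TERNARY_WEIGHT = math.log2(3)  # about 1.585


def ternary_size_bytes(n_params: int) -> float:
    """Idealized storage for n_params ternary weights (hypothetical helper)."""
    return n_params * BITS_PER_TERNARY_WEIGHT / 8


print(f"bits per weight: {BITS_PER_TERNARY_WEIGHT:.3f}")
for n_params in (10_000_000, 2_000_000_000):
    megabytes = ternary_size_bytes(n_params) / 1e6
    print(f"{n_params:>13,} params -> about {megabytes:,.1f} MB")
```

Under those assumptions, ten million ternary weights come to roughly 2 MB, and two billion to roughly 400 MB.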

@[email protected] · 8 days ago

Can someone tell me what’s meant by,

The repository describes bitnet.cpp as offering “a suite of optimized kernels that support fast and lossless inference of 1.58-bit models on CPU”

Does it mean you need to run your OS with a specific kernel from bitnet.cpp? Or is it a different kind of ‘kernel’?

@[email protected] · edit-2 7 days ago

It’s a different kind of “kernel.” I think they mean whatever’s handling the model’s math: the low-level routines into which you feed this inherently restricted format, written to take advantage of those limitations in order to run more efficiently. Nothing to do with your OS kernel.

Like if every number’s magnitude is 1 or 0, you don’t need to do floating-point multiplication.
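
To make that concrete, here's a minimal sketch of a dot product where every weight is -1, 0, or +1. It is not how bitnet.cpp's kernels are actually written (I'd assume real ones pack the weights tightly and use vectorized instructions), and `ternary_dot` is just a name I made up. It only shows that the multiply disappears.

```python
def ternary_dot(weights, activations):
    """Dot product with weights restricted to -1, 0, or +1.

    No multiplication is needed: each activation is added,
    subtracted, or skipped depending on its weight.
    """
    total = 0.0
    for w, x in zip(weights, activations):
        if w == 1:
            total += x
        elif w == -1:
            total -= x
        # w == 0: this activation contributes nothing, so skip it
    return total


weights = [1, 0, -1, 1]
activations = [0.5, 2.0, 1.5, -1.0]

result = ternary_dot(weights, activations)
# Ordinary multiply-and-sum version, for comparison.
reference = sum(w * x for w, x in zip(weights, activations))
print(result, reference)
```

Both versions give the same answer; the ternary one just gets there with adds and subtracts.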